=encoding utf8 =head1 NAME perl5120delta - what is new for perl v5.12.0 =head1 DESCRIPTION This document describes differences between the 5.10.0 release and the 5.12.0 release. Many of the bug fixes in 5.12.0 are already included in the 5.10.1 maintenance release. You can see the list of those changes in the 5.10.1 release notes (L<perl5101delta>). =head1 Core Enhancements =head2 New C<package NAME VERSION> syntax This new syntax allows a module author to set the $VERSION of a namespace when the namespace is declared with 'package'. It eliminates the need for C<our $VERSION = ...> and similar constructs. E.g. package Foo::Bar 1.23; # $Foo::Bar::VERSION == 1.23 There are several advantages to this: =over =item * C<$VERSION> is parsed in exactly the same way as C<use NAME VERSION> =item * C<$VERSION> is set at compile time =item * C<$VERSION> is a version object that provides proper overloading of comparison operators so comparing C<$VERSION> to decimal (1.23) or dotted-decimal (v1.2.3) version numbers works correctly. =item * Eliminates C<$VERSION = ...> and C<eval $VERSION> clutter =item * As it requires VERSION to be a numeric literal or v-string literal, it can be statically parsed by toolchain modules without C<eval> the way MM-E<gt>parse_version does for C<$VERSION = ...> =back It does not break old code with only C<package NAME>, but code that uses C<package NAME VERSION> will need to be restricted to perl 5.12.0 or newer. This is analogous to the change to C<open> from two-args to three-args. Users requiring the latest Perl will benefit, and perhaps after several years, it will become a standard practice. However, C<package NAME VERSION> requires a new, 'strict' version number format. See L</"Version number formats"> for details. =head2 The C<...> operator A new operator, C<...>, nicknamed the Yada Yada operator, has been added. It is intended to mark placeholder code that is not yet implemented.
See L<perlop/"Yada Yada Operator">. =head2 Implicit strictures Using the C<use VERSION> syntax with a version number greater or equal to 5.11.0 will lexically enable strictures just like C<use strict> would do (in addition to enabling features.) The following: use 5.12.0; means: use strict; use feature ':5.12'; =head2 Unicode improvements Perl 5.12 comes with Unicode 5.2, the latest version available to us at the time of release. This version of Unicode was released in October 2009. See L<http://www.unicode.org/versions/Unicode5.2.0> for further details about what's changed in this version of the standard. See L<perlunicode> for instructions on installing and using other versions of Unicode. Additionally, Perl's developers have significantly improved Perl's Unicode implementation. For full details, see L</Unicode overhaul> below. =head2 Y2038 compliance Perl's core time-related functions are now Y2038 compliant. (It may not mean much to you, but your kids will love it!) =head2 qr overloading It is now possible to overload the C<qr//> operator, that is, conversion to regexp, like it was already possible to overload conversion to boolean, string or number of objects. It is invoked when an object appears on the right hand side of the C<=~> operator or when it is interpolated into a regexp. See L<overload>. =head2 Pluggable keywords Extension modules can now cleanly hook into the Perl parser to define new kinds of keyword-headed expression and compound statement. The syntax following the keyword is defined entirely by the extension. This allows a completely non-Perl sublanguage to be parsed inline, with the correct ops cleanly generated. See L<perlapi/PL_keyword_plugin> for the mechanism. The Perl core source distribution also includes a new module L<XS::APItest::KeywordRPN>, which implements reverse Polish notation arithmetic via pluggable keywords. 
This module is mainly used for test purposes, and is not normally installed, but also serves as an example of how to use the new mechanism. Perl's developers consider this feature to be experimental. We may remove it or change it in a backwards-incompatible way in Perl 5.14. =head2 APIs for more internals The lowest layers of the lexer and parts of the pad system now have C APIs available to XS extensions. These are necessary to support proper use of pluggable keywords, but have other uses too. The new APIs are experimental, and only cover a small proportion of what would be necessary to take full advantage of the core's facilities in these areas. It is intended that the Perl 5.13 development cycle will see the addition of a full range of clean, supported interfaces. Perl's developers consider this feature to be experimental. We may remove it or change it in a backwards-incompatible way in Perl 5.14. =head2 Overridable function lookup Where an extension module hooks the creation of rv2cv ops to modify the subroutine lookup process, this now works correctly for bareword subroutine calls. This means that prototypes on subroutines referenced this way will be processed correctly. (Previously bareword subroutine names were initially looked up, for parsing purposes, by an unhookable mechanism, so extensions could only properly influence subroutine names that appeared with an C<&> sigil.) =head2 A proper interface for pluggable Method Resolution Orders As of Perl 5.12.0 there is a new interface for plugging and using method resolution orders other than the default linear depth first search. The C3 method resolution order added in 5.10.0 has been re-implemented as a plugin, without changing its Perl-space interface. See L<perlmroapi> for more information. =head2 C<\N> experimental regex escape Perl now supports C<\N>, a new regex escape which you can think of as the inverse of C<\n>. 
It will match any character that is not a newline, independently from the presence or absence of the single line match modifier C</s>. It is not usable within a character class. C<\N{3}> means to match 3 non-newlines; C<\N{5,}> means to match at least 5. C<\N{NAME}> still means the character or sequence named C<NAME>, but C<NAME> no longer can be things like C<3>, or C<5,>. This will break a L<custom charnames translator|charnames/CUSTOM TRANSLATORS> which allows numbers for character names, as C<\N{3}> will now mean to match 3 non-newline characters, and not the character whose name is C<3>. (No name defined by the Unicode standard is a number, so only custom translators might be affected.) Perl's developers are somewhat concerned about possible user confusion with the existing C<\N{...}> construct which matches characters by their Unicode name. Consequently, this feature is experimental. We may remove it or change it in a backwards-incompatible way in Perl 5.14. =head2 DTrace support Perl now has some support for DTrace. See "DTrace support" in F<INSTALL>. =head2 Support for C<configure_requires> in CPAN module metadata Both C<CPAN> and C<CPANPLUS> now support the C<configure_requires> keyword in the F<META.yml> metadata file included in most recent CPAN distributions. This allows distribution authors to specify configuration prerequisites that must be installed before running F<Makefile.PL> or F<Build.PL>. See the documentation for C<ExtUtils::MakeMaker> or C<Module::Build> for more on how to specify C<configure_requires> when creating a distribution for CPAN. =head2 C<each>, C<keys>, C<values> are now more flexible The C<each>, C<keys>, and C<values> functions can now operate on arrays. =head2 C<when> as a statement modifier C<when> is now allowed to be used as a statement modifier. =head2 C<$,> flexibility The variable C<$,> may now be tied.
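The array forms of C<each>, C<keys>, and C<values> mentioned above can be sketched as follows. This is only an illustration: the C<@colors> array and the other variable names are hypothetical and do not come from the release notes.

```perl
use strict;
use warnings;

# Hypothetical sample data, for illustration only.
my @colors = ('red', 'green', 'blue');

# keys on an array returns its indices; values returns its elements.
my @indices  = keys @colors;
my @elements = values @colors;

# each on an array returns (index, element) pairs in index order.
my @pairs;
while (my ($i, $color) = each @colors) {
    push @pairs, "$i=$color";
}

print "@indices\n";
print "@elements\n";
print "@pairs\n";
```

Calling C<keys> on the array also resets the iterator used by C<each>, which is why the C<while> loop here starts from index 0.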
=head2 // in when clauses // now behaves like || in when clauses. =head2 Enabling warnings from your shell environment You can now set C<-W> from the C<PERL5OPT> environment variable. =head2 C<delete local> C<delete local> now allows you to locally delete a hash entry. =head2 New support for Abstract namespace sockets Abstract namespace sockets are a Linux-specific socket type that lives in the AF_UNIX family, slightly abusing it to be able to use arbitrary character arrays as addresses: they start with a nul byte and are terminated not by a nul byte, but by the length passed to the socket() system call. =head2 32-bit limit on substr arguments removed The 32-bit limit on C<substr> arguments has now been removed. The full range of the system's signed and unsigned integers is now available for the C<pos> and C<len> arguments. =head1 Potentially Incompatible Changes =head2 Deprecations warn by default Over the years, Perl's developers have deprecated a number of language features for a variety of reasons. Perl now defaults to issuing a warning if a deprecated language feature is used. Many of the deprecations Perl now warns you about have been deprecated for many years. You can find a list of what was deprecated in a given release of Perl in the C<perl5xxdelta.pod> file for that release. To disable this feature in a given lexical scope, you should use C<no warnings 'deprecated';> For information about which language features are deprecated and explanations of various deprecation warnings, please see L<perldiag>. See L</Deprecations> below for the list of features and modules Perl's developers have deprecated as part of this release. =head2 Version number formats Acceptable version number formats have been formalized into "strict" and "lax" rules. C<package NAME VERSION> takes a strict version number. C<UNIVERSAL::VERSION> and the L<version> object constructors take lax version numbers. Providing an invalid version will result in a fatal error.
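As a rough sketch of the strict/lax distinction, the L<version> module's C<version::is_strict> and C<version::is_lax> functions can classify candidate strings. The candidate list and the C<%verdict> hash are hypothetical names chosen for this example.

```perl
use strict;
use warnings;
use version 0.82;   # the release bundled with 5.12.0, per this document

# Hypothetical candidate strings, chosen to exercise both rule sets.
my @candidates = ('1.23', 'v1.2.3', 'v1.2', '1.2.3', '1.23_01');

my %verdict;
for my $v (@candidates) {
    # Check the stricter rule first; every strict version is also lax.
    $verdict{$v} = version::is_strict($v) ? 'strict'
                 : version::is_lax($v)    ? 'lax only'
                 :                          'invalid';
    print "$v: $verdict{$v}\n";
}
```

Checking a string this way before handing it to C<package NAME VERSION> or a C<version> constructor is one way to avoid the fatal error that an invalid version would otherwise raise.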
The version argument in C<use NAME VERSION> is first parsed as a numeric literal or v-string and then passed to C<UNIVERSAL::VERSION> (and must then pass the "lax" format test). These formats are documented fully in the L<version> module. To a first approximation, a "strict" version number is a positive decimal number (integer or decimal-fraction) without exponentiation or else a dotted-decimal v-string with a leading 'v' character and at least three components. A "lax" version number allows v-strings with fewer than three components or without a leading 'v'. Under "lax" rules, both decimal and dotted-decimal versions may have a trailing "alpha" component separated by an underscore character after a fractional or dotted-decimal component. The L<version> module adds C<version::is_strict> and C<version::is_lax> functions to check a scalar against these rules. =head2 @INC reorganization In C<@INC>, C<ARCHLIB> and C<PRIVLIB> now occur after the current version's C<site_perl> and C<vendor_perl>. Modules installed into C<site_perl> and C<vendor_perl> will now be loaded in preference to those installed in C<ARCHLIB> and C<PRIVLIB>. =head2 REGEXPs are now first class Internally, Perl now treats compiled regular expressions (such as those created with C<qr//>) as first class entities. Perl modules which serialize, deserialize or otherwise have deep interaction with Perl's internal data structures need to be updated for this change. Most affected CPAN modules have already been updated as of this writing. =head2 Switch statement changes The C<given>/C<when> switch statement handles complex statements better than Perl 5.10.0 did (These enhancements are also available in 5.10.1 and subsequent 5.10 releases.) 
There are two new cases where C<when> now interprets its argument as a boolean, instead of an expression to be used in a smart match: =over =item flip-flop operators The C<..> and C<...> flip-flop operators are now evaluated in boolean context, following their usual semantics; see L<perlop/"Range Operators">. Note that, as in perl 5.10.0, C<when (1..10)> will not work to test whether a given value is an integer between 1 and 10; you should use C<when ([1..10])> instead (note the array reference). However, contrary to 5.10.0, evaluating the flip-flop operators in boolean context ensures it can now be useful in a C<when()>, notably for implementing bistable conditions, like in: when (/^=begin/ .. /^=end/) { # do something } =item defined-or operator A compound expression involving the defined-or operator, as in C<when (expr1 // expr2)>, will be treated as boolean if the first expression is boolean. (This just extends the existing rule that applies to the regular or operator, as in C<when (expr1 || expr2)>.) =back =head2 Smart match changes Since Perl 5.10.0, Perl's developers have made a number of changes to the smart match operator. These, of course, also alter the behaviour of the switch statements where smart matching is implicitly used. These changes were also made for the 5.10.1 release, and will remain in subsequent 5.10 releases. =head3 Changes to type-based dispatch The smart match operator C<~~> is no longer commutative. The behaviour of a smart match now depends primarily on the type of its right hand argument. Moreover, its semantics have been adjusted for greater consistency or usefulness in several cases. While the general backwards compatibility is maintained, several changes must be noted: =over 4 =item * Code references with an empty prototype are no longer treated specially. They are passed an argument like the other code references (even if they choose to ignore it). 
=item * C<%hash ~~ sub {}> and C<@array ~~ sub {}> now test that the subroutine returns a true value for each key of the hash (or element of the array), instead of passing the whole hash or array as a reference to the subroutine. =item * Due to the commutativity breakage, code references are no longer treated specially when appearing on the left of the C<~~> operator, but like any vulgar scalar. =item * C<undef ~~ %hash> is always false (since C<undef> can't be a key in a hash). No implicit conversion to C<""> is done (as was the case in perl 5.10.0). =item * C<$scalar ~~ @array> now always distributes the smart match across the elements of the array. It's true if one element in @array verifies C<$scalar ~~ $element>. This is a generalization of the old behaviour that tested whether the array contained the scalar. =back The full dispatch table for the smart match operator is given in L<perlsyn/"Smart matching in detail">. =head3 Smart match and overloading According to the rule of dispatch based on the rightmost argument type, when an object overloading C<~~> appears on the right side of the operator, the overload routine will always be called (with a 3rd argument set to a true value, see L<overload>.) However, when the object will appear on the left, the overload routine will be called only when the rightmost argument is a simple scalar. This way, distributivity of smart match across arrays is not broken, as well as the other behaviours with complex types (coderefs, hashes, regexes). Thus, writers of overloading routines for smart match mostly need to worry only with comparing against a scalar, and possibly with stringification overloading; the other common cases will be automatically handled consistently. C<~~> will now refuse to work on objects that do not overload it (in order to avoid relying on the object's underlying structure). 
(However, if the object overloads the stringification or the numification operators, and if overload fallback is active, it will be used instead, as usual.) =head2 Other potentially incompatible changes =over 4 =item * The definitions of a number of Unicode properties have changed to match those of the current Unicode standard. These are listed below under L</Unicode overhaul>. This change may break code that expects the old definitions. =item * The boolkeys op has moved to the group of hash ops. This breaks binary compatibility. =item * Filehandles are now always blessed into C<IO::File>. The previous behaviour was to bless Filehandles into L<FileHandle> (an empty proxy class) if it was loaded into memory and otherwise to bless them into C<IO::Handle>. =item * The semantics of C<use feature :5.10*> have changed slightly. See L</"Modules and Pragmata"> for more information. =item * Perl's developers now use git, rather than Perforce. This should be a purely internal change only relevant to people actively working on the core. However, you may see minor differences in perl as a consequence of the change, for example in some of the details of the output of C<perl -V>. See L<perlrepository> for more information. =item * As part of the C<Test::Harness> 2.x to 3.x upgrade, the experimental C<Test::Harness::Straps> module has been removed. See L</"Modules and Pragmata"> for more details. =item * As part of the C<ExtUtils::MakeMaker> upgrade, the C<ExtUtils::MakeMaker::bytes> and C<ExtUtils::MakeMaker::vmsish> modules have been removed from this distribution. =item * C<Module::CoreList> no longer contains the C<%:patchlevel> hash. =item * C<length undef> now returns undef. =item * Unsupported private C API functions are now declared "static" to prevent leakage to Perl's public API. =item * To support the bootstrapping process, F<miniperl> no longer builds with UTF-8 support in the regexp engine. This allows a build to complete with PERL_UNICODE set and a UTF-8 locale.
Without this there's a bootstrapping problem, as miniperl can't load the UTF-8 components of the regexp engine, because they're not yet built. =item * F<miniperl>'s @INC is now restricted to just C<-I...>, the split of C<$ENV{PERL5LIB}>, and "C<.>" =item * A space or a newline is now required after a C<"#line XXX"> directive. =item * Tied filehandles now have an additional method EOF which provides the EOF type. =item * To better match all other flow control statements, C<foreach> may no longer be used as an attribute. =item * Perl's command-line switch "-P", which was deprecated in version 5.10.0, has now been removed. The CPAN module C<< Filter::cpp >> can be used as an alternative. =back =head1 Deprecations From time to time, Perl's developers find it necessary to deprecate features or modules we've previously shipped as part of the core distribution. We are well aware of the pain and frustration that a backwards-incompatible change to Perl can cause for developers building or maintaining software in Perl. You can be sure that when we deprecate a functionality or syntax, it isn't a choice we make lightly. Sometimes, we choose to deprecate functionality or syntax because it was found to be poorly designed or implemented. Sometimes, this is because they're holding back other features or causing performance problems. Sometimes, the reasons are more complex. Wherever possible, we try to keep deprecated functionality available to developers in its previous form for at least one major release. So long as a deprecated feature isn't actively disrupting our ability to maintain and extend Perl, we'll try to leave it in place as long as possible. The following items are now deprecated: =over =item suidperl C<suidperl> is no longer part of Perl. It used to provide a mechanism to emulate setuid permission bits on systems that don't support it properly. 
=item Use of C<:=> to mean an empty attribute list An accident of Perl's parser meant that these constructions were all equivalent: my $pi := 4; my $pi : = 4; my $pi : = 4; with the C<:> being treated as the start of an attribute list, which ends before the C<=>. As whitespace is not significant here, all are parsed as an empty attribute list, hence all the above are equivalent to, and better written as my $pi = 4; because no attribute processing is done for an empty list. As is, this meant that C<:=> cannot be used as a new token, without silently changing the meaning of existing code. Hence that particular form is now deprecated, and will become a syntax error. If it is absolutely necessary to have empty attribute lists (for example, because of a code generator) then avoid the warning by adding a space before the C<=>. =item C<< UNIVERSAL->import() >> The method C<< UNIVERSAL->import() >> is now deprecated. Attempting to pass import arguments to a C<use UNIVERSAL> statement will result in a deprecation warning. =item Use of "goto" to jump into a construct Using C<goto> to jump from an outer scope into an inner scope is now deprecated. This rare use case was causing problems in the implementation of scopes. =item Custom character names in \N{name} that don't look like names In C<\N{I<name>}>, I<name> can be just about anything. The standard Unicode names have a very limited domain, but a custom name translator could create names that are, for example, made up entirely of punctuation symbols. It is now deprecated to make names that don't begin with an alphabetic character, and aren't alphanumeric or contain other than a very few other characters, namely spaces, dashes, parentheses and colons. Because of the added meaning of C<\N> (See L</C<\N> experimental regex escape>), names that look like curly brace -enclosed quantifiers won't work. For example, C<\N{3,4}> now means to match 3 to 4 non-newlines; before a custom name C<3,4> could have been created. 
=item Deprecated Modules The following modules will be removed from the core distribution in a future release, and should be installed from CPAN instead. Distributions on CPAN which require these should add them to their prerequisites. The core versions of these modules will issue a deprecation warning. If you ship a packaged version of Perl, either alone or as part of a larger system, then you should carefully consider the repercussions of core module deprecations. You may want to consider shipping your default build of Perl with packages for some or all deprecated modules which install into C<vendor> or C<site> perl library directories. This will inhibit the deprecation warnings. Alternatively, you may want to consider patching F<lib/deprecate.pm> to provide deprecation warnings specific to your packaging system or distribution of Perl, consistent with how your packaging system or distribution manages a staged transition from a release where the installation of a single package provides the given functionality, to a later release where the system administrator needs to know to install multiple packages to get that same functionality. You can silence these deprecation warnings by installing the modules in question from CPAN. To install the latest version of all of them, just install C<Task::Deprecations::5_12>. =over =item L<Class::ISA> =item L<Pod::Plainer> =item L<Shell> =item L<Switch> Switch is buggy and should be avoided. You may find Perl's new C<given>/C<when> feature a suitable replacement. See L<perlsyn/"Switch statements"> for more information. =back =item Assignment to $[ =item Use of the attribute :locked on subroutines =item Use of "locked" with the attributes pragma =item Use of "unique" with the attributes pragma =item Perl_pmflag C<Perl_pmflag> is no longer part of Perl's public API. Calling it now generates a deprecation warning, and it will be removed in a future release.
Although listed as part of the API, it was never documented, and only ever used in F<toke.c>, and prior to 5.10, F<regcomp.c>. In core, it has been replaced by a static function. =item Numerous Perl 4-era libraries F<termcap.pl>, F<tainted.pl>, F<stat.pl>, F<shellwords.pl>, F<pwd.pl>, F<open3.pl>, F<open2.pl>, F<newgetopt.pl>, F<look.pl>, F<find.pl>, F<finddepth.pl>, F<importenv.pl>, F<hostname.pl>, F<getopts.pl>, F<getopt.pl>, F<getcwd.pl>, F<flush.pl>, F<fastcwd.pl>, F<exceptions.pl>, F<ctime.pl>, F<complete.pl>, F<cacheout.pl>, F<bigrat.pl>, F<bigint.pl>, F<bigfloat.pl>, F<assert.pl>, F<abbrev.pl>, F<dotsh.pl>, and F<timelocal.pl> are all now deprecated. Earlier, Perl's developers intended to remove these libraries from Perl's core for the 5.14.0 release. During final testing before the release of 5.12.0, several developers discovered current production code using these ancient libraries, some inside the Perl core itself. Accordingly, the pumpking granted them a stay of execution. They will begin to warn about their deprecation in the 5.14.0 release and will be removed in the 5.16.0 release. =back =head1 Unicode overhaul Perl's developers have made a concerted effort to update Perl to be in sync with the latest Unicode standard. Changes for this include: Perl can now handle every Unicode character property. New documentation, L<perluniprops>, lists all available non-Unihan character properties. By default, perl does not expose Unihan, deprecated or Unicode-internal properties. See below for more details on these; there is also a section in the pod listing them, and explaining why they are not exposed. Perl now fully supports the Unicode compound-style of using C<=> and C<:> in writing regular expressions: C<\p{property=value}> and C<\p{property:value}> (both of which mean the same thing). Perl now fully supports the Unicode loose matching rules for text between the braces in C<\p{...}> constructs. In addition, Perl allows underscores between digits of numbers. 
Perl now accepts all the Unicode-defined synonyms for properties and property values. C<qr/\X/>, which matches a Unicode logical character, has been expanded to work better with various Asian languages. It now is defined as an I<extended grapheme cluster>. (See L<http://www.unicode.org/reports/tr29/>). Anything matched previously and that made sense will continue to be accepted. Additionally: =over =item * C<\X> will not break apart a C<S<CR LF>> sequence. =item * C<\X> will now match a sequence which includes the C<ZWJ> and C<ZWNJ> characters. =item * C<\X> will now always match at least one character, including an initial mark. Marks generally come after a base character, but it is possible in Unicode to have them in isolation, and C<\X> will now handle that case, for example at the beginning of a line, or after a C<ZWSP>. And this is the part where C<\X> doesn't match the things that it used to that don't make sense. Formerly, for example, you could have the nonsensical case of an accented LF. =item * C<\X> will now match a (Korean) Hangul syllable sequence, and the Thai and Lao exception cases. =back Otherwise, this change should be transparent for the non-affected languages. C<\p{...}> matches using the Canonical_Combining_Class property were completely broken in previous releases of Perl. They should now work correctly. Before Perl 5.12, the Unicode C<Decomposition_Type=Compat> property and a Perl extension had the same name, which led to neither matching all the correct values (with more than 100 mistakes in one, and several thousand in the other). The Perl extension has now been renamed to be C<Decomposition_Type=Noncanonical> (short: C<dt=noncanon>). It has the same meaning as was previously intended, namely the union of all the non-canonical Decomposition types, with Unicode C<Compat> being just one of those. C<\p{Decomposition_Type=Canonical}> now includes the Hangul syllables. 
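A minimal sketch of two of the changes described in this section: the extended grapheme cluster behaviour of C<\X>, and the compound C<\p{property=value}> and C<\p{property:value}> forms with loose matching. The test strings and variable names are hypothetical, chosen only for illustration.

```perl
use strict;
use warnings;

# \X treats CR LF as one extended grapheme cluster, so this
# four-character string yields three clusters.
my @clusters = "a\r\nb" =~ /(\X)/g;
my $cluster_count = scalar @clusters;

# A base letter followed by a combining mark is also a single cluster.
my $one_cluster = "e\x{301}" =~ /\A\X\z/ ? 1 : 0;

# The compound forms with "=" and ":" mean the same thing, and loose
# matching ignores case in the property name and value.
my $eq_form    = "A" =~ /\p{Script=Latin}/ ? 1 : 0;
my $colon_form = "A" =~ /\p{Script:Latin}/ ? 1 : 0;
my $loose_form = "A" =~ /\p{script=latin}/ ? 1 : 0;

print "clusters: $cluster_count\n";
print "combining: $one_cluster\n";
print "forms: $eq_form $colon_form $loose_form\n";
```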
C<\p{Uppercase}> and C<\p{Lowercase}> now work as the Unicode standard says they should. This means they each match a few more characters than they used to. C<\p{Cntrl}> now matches the same characters as C<\p{Control}>. This means it no longer will match Private Use (gc=co), Surrogates (gc=cs), nor Format (gc=cf) code points. The Format code points represent the biggest possible problem. All but 36 of them are either officially deprecated or strongly discouraged from being used. Of those 36, likely the most widely used are the soft hyphen (U+00AD), and BOM, ZWSP, ZWNJ, WJ, and similar characters, plus bidirectional controls. C<\p{Alpha}> now matches the same characters as C<\p{Alphabetic}>. Before 5.12, Perl's definition included a number of things that aren't really alpha (all marks) while omitting many that were. The definitions of C<\p{Alnum}> and C<\p{Word}> depend on Alpha's definition and have changed accordingly. C<\p{Word}> no longer incorrectly matches non-word characters such as fractions. C<\p{Print}> no longer matches the line control characters: Tab, LF, CR, FF, VT, and NEL. This brings it in line with standards and the documentation. C<\p{XDigit}> now matches the same characters as C<\p{Hex_Digit}>. This means that in addition to the characters it currently matches, C<[A-Fa-f0-9]>, it will also match the 22 fullwidth equivalents, for example U+FF10: FULLWIDTH DIGIT ZERO. The Numeric type property has been extended to include the Unihan characters. There is a new Perl extension, the 'Present_In', or simply 'In', property. This is an extension of the Unicode Age property, but C<\p{In=5.0}> matches any code point whose usage has been determined I<as of> Unicode version 5.0. The C<\p{Age=5.0}> only matches code points added in I<precisely> version 5.0. A number of properties now have the correct values for unassigned code points. 
The affected properties are Bidi_Class, East_Asian_Width, Joining_Type, Decomposition_Type, Hangul_Syllable_Type, Numeric_Type, and Line_Break. The Default_Ignorable_Code_Point, ID_Continue, and ID_Start properties are now up to date with current Unicode definitions. Earlier versions of Perl erroneously exposed certain properties that are supposed to be Unicode internal-only. Use of these in regular expressions will now generate, if enabled, a deprecation warning message. The properties are: Other_Alphabetic, Other_Default_Ignorable_Code_Point, Other_Grapheme_Extend, Other_ID_Continue, Other_ID_Start, Other_Lowercase, Other_Math, and Other_Uppercase. It is now possible to change which Unicode properties Perl understands on a per-installation basis. As mentioned above, certain properties are turned off by default. These include all the Unihan properties (which should be accessible via the CPAN module Unicode::Unihan) and any deprecated or Unicode internal-only property that Perl has never exposed. The generated files in the C<lib/unicore/To> directory are now more clearly marked as being stable, directly usable by applications. New hash entries in them give the format of the normal entries, which allows for easier machine parsing. Perl can generate files in this directory for any property, though most are suppressed. You can find instructions for changing which are written in L<perluniprops>. =head1 Modules and Pragmata =head2 New Modules and Pragmata =over 4 =item C<autodie> C<autodie> is a new lexically-scoped alternative for the C<Fatal> module. The bundled version is 2.06_01. Note that in this release, using a string eval when C<autodie> is in effect can cause the autodie behaviour to leak into the surrounding scope. See L<autodie/"BUGS"> for more details. Version 2.06_01 has been added to the Perl core. =item C<Compress::Raw::Bzip2> Version 2.024 has been added to the Perl core. 
=item C<overloading> C<overloading> allows you to lexically disable or enable overloading for some or all operations. Version 0.001 has been added to the Perl core. =item C<parent> C<parent> establishes an ISA relationship with base classes at compile time. It provides the key feature of C<base> without further unwanted behaviors. Version 0.223 has been added to the Perl core. =item C<Parse::CPAN::Meta> Version 1.40 has been added to the Perl core. =item C<VMS::DCLsym> Version 1.03 has been added to the Perl core. =item C<VMS::Stdio> Version 2.4 has been added to the Perl core. =item C<XS::APItest::KeywordRPN> Version 0.003 has been added to the Perl core. =back =head2 Updated Pragmata =over 4 =item C<base> Upgraded from version 2.13 to 2.15. =item C<bignum> Upgraded from version 0.22 to 0.23. =item C<charnames> C<charnames> now contains the Unicode F<NameAliases.txt> database file. This has the effect of adding some extra C<\N> character names that formerly wouldn't have been recognised; for example, C<"\N{LATIN CAPITAL LETTER GHA}">. Upgraded from version 1.06 to 1.07. =item C<constant> Upgraded from version 1.13 to 1.20. =item C<diagnostics> C<diagnostics> now supports %.0f formatting internally. C<diagnostics> no longer suppresses C<Use of uninitialized value in range (or flip)> warnings. [perl #71204] Upgraded from version 1.17 to 1.19. =item C<feature> In C<feature>, the meaning of the C<:5.10> and C<:5.10.X> feature bundles has changed slightly. The last component, if any (i.e. C<X>) is simply ignored. This is predicated on the assumption that new features will not, in general, be added to maintenance releases. So C<:5.10> and C<:5.10.X> have identical effect. This is a change to the behaviour documented for 5.10.0. 
C<feature> now includes the C<unicode_strings> feature: use feature "unicode_strings"; This pragma turns on Unicode semantics for the case-changing operations (C<uc>, C<lc>, C<ucfirst>, C<lcfirst>) on strings that don't have the internal UTF-8 flag set, but that contain single-byte characters between 128 and 255. Upgraded from version 1.11 to 1.16. =item C<less> C<less> now includes the C<stash_name> method to allow subclasses of C<less> to pick where in %^H to store their stash. Upgraded from version 0.02 to 0.03. =item C<lib> Upgraded from version 0.5565 to 0.62. =item C<mro> C<mro> is now implemented as an XS extension. The documented interface has not changed. Code relying on the implementation detail that some C<mro::> methods happened to be available at all times gets to "keep both pieces". Upgraded from version 1.00 to 1.02. =item C<overload> C<overload> now allows overloading of 'qr'. Upgraded from version 1.06 to 1.10. =item C<threads> Upgraded from version 1.67 to 1.75. =item C<threads::shared> Upgraded from version 1.14 to 1.32. =item C<version> C<version> now has support for L</Version number formats> as described earlier in this document and in its own documentation. Upgraded from version 0.74 to 0.82. =item C<warnings> C<warnings> has a new C<warnings::fatal_enabled()> function. It also includes a new C<illegalproto> warning category. See also L</New or Changed Diagnostics> for this change. Upgraded from version 1.06 to 1.09. =back =head2 Updated Modules =over 4 =item C<Archive::Extract> Upgraded from version 0.24 to 0.38. =item C<Archive::Tar> Upgraded from version 1.38 to 1.54. =item C<Attribute::Handlers> Upgraded from version 0.79 to 0.87. =item C<AutoLoader> Upgraded from version 5.63 to 5.70. =item C<B::Concise> Upgraded from version 0.74 to 0.78. =item C<B::Debug> Upgraded from version 1.05 to 1.12. =item C<B::Deparse> Upgraded from version 0.83 to 0.96. =item C<B::Lint> Upgraded from version 1.09 to 1.11_01.
=item C<CGI> Upgraded from version 3.29 to 3.48. =item C<Class::ISA> Upgraded from version 0.33 to 0.36. NOTE: C<Class::ISA> is deprecated and may be removed from a future version of Perl. =item C<Compress::Raw::Zlib> Upgraded from version 2.008 to 2.024. =item C<CPAN> Upgraded from version 1.9205 to 1.94_56. =item C<CPANPLUS> Upgraded from version 0.84 to 0.90. =item C<CPANPLUS::Dist::Build> Upgraded from version 0.06_02 to 0.46. =item C<Data::Dumper> Upgraded from version 2.121_14 to 2.125. =item C<DB_File> Upgraded from version 1.816_1 to 1.820. =item C<Devel::PPPort> Upgraded from version 3.13 to 3.19. =item C<Digest> Upgraded from version 1.15 to 1.16. =item C<Digest::MD5> Upgraded from version 2.36_01 to 2.39. =item C<Digest::SHA> Upgraded from version 5.45 to 5.47. =item C<Encode> Upgraded from version 2.23 to 2.39. =item C<Exporter> Upgraded from version 5.62 to 5.64_01. =item C<ExtUtils::CBuilder> Upgraded from version 0.21 to 0.27. =item C<ExtUtils::Command> Upgraded from version 1.13 to 1.16. =item C<ExtUtils::Constant> Upgraded from version 0.2 to 0.22. =item C<ExtUtils::Install> Upgraded from version 1.44 to 1.55. =item C<ExtUtils::MakeMaker> Upgraded from version 6.42 to 6.56. =item C<ExtUtils::Manifest> Upgraded from version 1.51_01 to 1.57. =item C<ExtUtils::ParseXS> Upgraded from version 2.18_02 to 2.21. =item C<File::Fetch> Upgraded from version 0.14 to 0.24. =item C<File::Path> Upgraded from version 2.04 to 2.08_01. =item C<File::Temp> Upgraded from version 0.18 to 0.22. =item C<Filter::Simple> Upgraded from version 0.82 to 0.84. =item C<Filter::Util::Call> Upgraded from version 1.07 to 1.08. =item C<Getopt::Long> Upgraded from version 2.37 to 2.38. =item C<IO> Upgraded from version 1.23_01 to 1.25_02. =item C<IO::Zlib> Upgraded from version 1.07 to 1.10. =item C<IPC::Cmd> Upgraded from version 0.40_1 to 0.54. =item C<IPC::SysV> Upgraded from version 1.05 to 2.01. =item C<Locale::Maketext> Upgraded from version 1.12 to 1.14. 
=item C<Locale::Maketext::Simple> Upgraded from version 0.18 to 0.21. =item C<Log::Message> Upgraded from version 0.01 to 0.02. =item C<Log::Message::Simple> Upgraded from version 0.04 to 0.06. =item C<Math::BigInt> Upgraded from version 1.88 to 1.89_01. =item C<Math::BigInt::FastCalc> Upgraded from version 0.16 to 0.19. =item C<Math::BigRat> Upgraded from version 0.21 to 0.24. =item C<Math::Complex> Upgraded from version 1.37 to 1.56. =item C<Memoize> Upgraded from version 1.01_02 to 1.01_03. =item C<MIME::Base64> Upgraded from version 3.07_01 to 3.08. =item C<Module::Build> Upgraded from version 0.2808_01 to 0.3603. =item C<Module::CoreList> Upgraded from version 2.12 to 2.29. =item C<Module::Load> Upgraded from version 0.12 to 0.16. =item C<Module::Load::Conditional> Upgraded from version 0.22 to 0.34. =item C<Module::Loaded> Upgraded from version 0.01 to 0.06. =item C<Module::Pluggable> Upgraded from version 3.6 to 3.9. =item C<Net::Ping> Upgraded from version 2.33 to 2.36. =item C<NEXT> Upgraded from version 0.60_01 to 0.64. =item C<Object::Accessor> Upgraded from version 0.32 to 0.36. =item C<Package::Constants> Upgraded from version 0.01 to 0.02. =item C<PerlIO> Upgraded from version 1.04 to 1.06. =item C<Pod::Parser> Upgraded from version 1.35 to 1.37. =item C<Pod::Perldoc> Upgraded from version 3.14_02 to 3.15_02. =item C<Pod::Plainer> Upgraded from version 0.01 to 1.02. NOTE: C<Pod::Plainer> is deprecated and may be removed from a future version of Perl. =item C<Pod::Simple> Upgraded from version 3.05 to 3.13. =item C<Safe> Upgraded from version 2.12 to 2.22. =item C<SelfLoader> Upgraded from version 1.11 to 1.17. =item C<Storable> Upgraded from version 2.18 to 2.22. =item C<Switch> Upgraded from version 2.13 to 2.16. NOTE: C<Switch> is deprecated and may be removed from a future version of Perl. =item C<Sys::Syslog> Upgraded from version 0.22 to 0.27. =item C<Term::ANSIColor> Upgraded from version 1.12 to 2.02. 
=item C<Term::UI> Upgraded from version 0.18 to 0.20. =item C<Test> Upgraded from version 1.25 to 1.25_02. =item C<Test::Harness> Upgraded from version 2.64 to 3.17. =item C<Test::Simple> Upgraded from version 0.72 to 0.94. =item C<Text::Balanced> Upgraded from version 2.0.0 to 2.02. =item C<Text::ParseWords> Upgraded from version 3.26 to 3.27. =item C<Text::Soundex> Upgraded from version 3.03 to 3.03_01. =item C<Thread::Queue> Upgraded from version 2.00 to 2.11. =item C<Thread::Semaphore> Upgraded from version 2.01 to 2.09. =item C<Tie::RefHash> Upgraded from version 1.37 to 1.38. =item C<Time::HiRes> Upgraded from version 1.9711 to 1.9719. =item C<Time::Local> Upgraded from version 1.18 to 1.1901_01. =item C<Time::Piece> Upgraded from version 1.12 to 1.15. =item C<Unicode::Collate> Upgraded from version 0.52 to 0.52_01. =item C<Unicode::Normalize> Upgraded from version 1.02 to 1.03. =item C<Win32> Upgraded from version 0.34 to 0.39. =item C<Win32API::File> Upgraded from version 0.1001_01 to 0.1101. =item C<XSLoader> Upgraded from version 0.08 to 0.10. =back =head2 Removed Modules and Pragmata =over 4 =item C<attrs> Removed from the Perl core. Prior version was 1.02. =item C<CPAN::API::HOWTO> Removed from the Perl core. Prior version was 'undef'. =item C<CPAN::DeferedCode> Removed from the Perl core. Prior version was 5.50. =item C<CPANPLUS::inc> Removed from the Perl core. Prior version was 'undef'. =item C<DCLsym> Removed from the Perl core. Prior version was 1.03. =item C<ExtUtils::MakeMaker::bytes> Removed from the Perl core. Prior version was 6.42. =item C<ExtUtils::MakeMaker::vmsish> Removed from the Perl core. Prior version was 6.42. =item C<Stdio> Removed from the Perl core. Prior version was 2.3. =item C<Test::Harness::Assert> Removed from the Perl core. Prior version was 0.02. =item C<Test::Harness::Iterator> Removed from the Perl core. Prior version was 0.02. =item C<Test::Harness::Point> Removed from the Perl core. Prior version was 0.01. 
=item C<Test::Harness::Results> Removed from the Perl core. Prior version was 0.01. =item C<Test::Harness::Straps> Removed from the Perl core. Prior version was 0.26_01. =item C<Test::Harness::Util> Removed from the Perl core. Prior version was 0.01. =item C<XSSymSet> Removed from the Perl core. Prior version was 1.1. =back =head2 Deprecated Modules and Pragmata See L</Deprecated Modules> above. =head1 Documentation =head2 New Documentation =over 4 =item * L<perlhaiku> contains instructions on how to build perl for the Haiku platform. =item * L<perlmroapi> describes the new interface for pluggable Method Resolution Orders. =item * L<perlperf>, by Richard Foley, provides an introduction to the use of performance and optimization techniques which can be used with particular reference to perl programs. =item * L<perlrepository> describes how to access the perl source using the I<git> version control system. =item * L<perlpolicy> extends the "Social contract about contributed modules" into the beginnings of a document on Perl porting policies. =back =head2 Changes to Existing Documentation =over =item * The various large F<Changes*> files (which listed every change made to perl over the last 18 years) have been removed, and replaced by a small file, also called F<Changes>, which just explains how that same information may be extracted from the git version control system. =item * F<Porting/patching.pod> has been deleted, as it mainly described interacting with the old Perforce-based repository, which is now obsolete. Information still relevant has been moved to L<perlrepository>. =item * The syntax C<unless (EXPR) BLOCK else BLOCK> is now documented as valid, as is the syntax C<unless (EXPR) BLOCK elsif (EXPR) BLOCK ... else BLOCK>, although actually using the latter may not be the best idea for the readability of your source code. =item * Documented -X overloading. 
=item * Documented that C<when()> treats most of the filetest operators specially. =item * Documented C<when> as a syntax modifier. =item * Eliminated "Old Perl threads tutorial", which described 5005 threads. F<pod/perlthrtut.pod> is the same material reworked for ithreads. =item * Correct previous documentation: v-strings are not deprecated. With version objects, we need them to use MODULE VERSION syntax. This patch removes the deprecation notice. =item * Security contact information is now part of L<perlsec>. =item * A significant fraction of the core documentation has been updated to clarify the behavior of Perl's Unicode handling. Much of the remaining core documentation has been reviewed and edited for clarity, consistent use of language, and to fix the spelling of Tom Christiansen's name. =item * The Pod specification (L<perlpodspec>) has been updated to bring the specification in line with modern usage already supported by most Pod systems. A parameter string may now follow the format name in a "begin/end" region. Links to URIs with a text description are now allowed. The usage of C<LE<lt>"section"E<gt>> has been marked as deprecated. =item * L<if.pm|if> has been documented in L<perlfunc/use> as a means to get conditional loading of modules despite the implicit BEGIN block around C<use>. =item * The documentation for C<$1> in perlvar.pod has been clarified. =item * C<\N{U+I<code point>}> is now documented. =back =head1 Selected Performance Enhancements =over 4 =item * A new internal cache means that C<isa()> will often be faster. =item * The implementation of C<C3> Method Resolution Order has been optimised - linearisation for classes with single inheritance is 40% faster. Performance for multiple inheritance is unchanged. =item * Under C<use locale>, the locale-relevant information is now cached on read-only values, such as the list returned by C<keys %hash>. This makes operations such as C<sort keys %hash> in the scope of C<use locale> much faster.
=item * Empty C<DESTROY> methods are no longer called. =item * C<Perl_sv_utf8_upgrade()> is now faster. =item * C<keys> on empty hash is now faster. =item * C<if (%foo)> has been optimized to be faster than C<if (keys %foo)>. =item * The string repetition operator (C<$str x $num>) is now several times faster when C<$str> has length one or C<$num> is large. =item * Reversing an array to itself (as in C<@a = reverse @a>) in void context now happens in-place and is several orders of magnitude faster than it used to be. It will also preserve non-existent elements whenever possible, i.e. for non magical arrays or tied arrays with C<EXISTS> and C<DELETE> methods. =back =head1 Installation and Configuration Improvements =over 4 =item * L<perlapi>, L<perlintern>, L<perlmodlib> and L<perltoc> are now all generated at build time, rather than being shipped as part of the release. =item * If C<vendorlib> and C<vendorarch> are the same, then they are only added to C<@INC> once. =item * C<$Config{usedevel}> and the C-level C<PERL_USE_DEVEL> are now defined if perl is built with C<-Dusedevel>. =item * F<Configure> will enable use of C<-fstack-protector>, to provide protection against stack-smashing attacks, if the compiler supports it. =item * F<Configure> will now determine the correct prototypes for re-entrant functions and for C<gconvert> if you are using a C++ compiler rather than a C compiler. =item * On Unix, if you build from a tree containing a git repository, the configuration process will note the commit hash you have checked out, for display in the output of C<perl -v> and C<perl -V>. Unpushed local commits are automatically added to the list of local patches displayed by C<perl -V>. =item * Perl now supports SystemTap's C<dtrace> compatibility layer and an issue with linking C<miniperl> has been fixed in the process. =item * perldoc now uses C<less -R> instead of C<less> for improved behaviour in the face of C<groff>'s new usage of ANSI escape codes. 
=item * C<perl -V> now reports use of the compile-time options C<USE_PERL_ATOF> and C<USE_ATTRIBUTES_FOR_PERLIO>. =item * As part of the flattening of F<ext>, all extensions on all platforms are built by F<make_ext.pl>. This replaces the Unix-specific F<ext/util/make_ext>, VMS-specific F<make_ext.com> and Win32-specific F<win32/buildext.pl>. =back =head1 Internal Changes Each release of Perl sees numerous internal changes which shouldn't affect day to day usage but may still be notable for developers working with Perl's source code. =over =item * The J.R.R. Tolkien quotes at the head of C source files have been checked and proper citations added, thanks to a patch from Tom Christiansen. =item * The internal structure of the dual-life modules traditionally found in the F<lib/> and F<ext/> directories in the perl source has changed significantly. Where possible, dual-lifed modules have been extracted from F<lib/> and F<ext/>. Dual-lifed modules maintained by Perl's developers as part of the Perl core now live in F<dist/>. Dual-lifed modules maintained primarily on CPAN now live in F<cpan/>. When reporting a bug in a module located under F<cpan/>, please send your bug report directly to the module's bug tracker or author, rather than Perl's bug tracker. =item * C<\N{...}> now compiles better, always forces UTF-8 internal representation. Perl's developers have fixed several problems with the recognition of C<\N{...}> constructs. As part of this, perl will store any scalar or regex containing C<\N{I<name>}> or C<\N{U+I<code point>}> in its definition in UTF-8 format. (This was true previously for all occurrences of C<\N{I<name>}> that did not use a custom translator, but now it's always true.) =item * Perl_magic_setmglob now knows about globs, fixing RT #71254. =item * C<SVt_RV> no longer exists. RVs are now stored in IVs. =item * C<Perl_vcroak()> now accepts a null first argument.
In addition, a full audit was made of the "not NULL" compiler annotations, and those for several other internal functions were corrected. =item * New macros C<dSAVEDERRNO>, C<dSAVE_ERRNO>, C<SAVE_ERRNO>, C<RESTORE_ERRNO> have been added to formalise the temporary saving of the C<errno> variable. =item * The function C<Perl_sv_insert_flags> has been added to augment C<Perl_sv_insert>. =item * The function C<Perl_newSV_type(type)> has been added, equivalent to C<Perl_newSV()> followed by C<Perl_sv_upgrade(type)>. =item * The function C<Perl_newSVpvn_flags()> has been added, equivalent to C<Perl_newSVpvn()> and then performing the action relevant to the flag. Two flag bits are currently supported. =over 4 =item * C<SVf_UTF8> will call C<SvUTF8_on()> for you. (Note that this does not convert a sequence of ISO 8859-1 characters to UTF-8). A wrapper, C<newSVpvn_utf8()> is available for this. =item * C<SVs_TEMP> now calls C<Perl_sv_2mortal()> on the new SV. =back There is also a wrapper that takes constant strings, C<newSVpvs_flags()>. =item * The function C<Perl_croak_xs_usage> has been added as a wrapper to C<Perl_croak>. =item * Perl now exports the functions C<PerlIO_find_layer> and C<PerlIO_list_alloc>. =item * C<PL_na> has been exterminated from the core code, replaced by local STRLEN temporaries, or C<*_nolen()> calls. Either approach is faster than C<PL_na>, which is a pointer dereference into the interpreter structure under ithreads, and a global variable otherwise. =item * C<Perl_mg_free()> used to leave freed memory accessible via C<SvMAGIC()> on the scalar. It now updates the linked list to remove each piece of magic as it is freed. =item * Under ithreads, the regex in C<PL_reg_curpm> is now reference counted. This eliminates a lot of hackish workarounds to cope with it not being reference counted. =item * C<Perl_mg_magical()> would sometimes incorrectly turn on C<SvRMAGICAL()>. This has been fixed. 
=item * The I<public> IV and NV flags are now not set if the string value has trailing "garbage". This behaviour is consistent with not setting the public IV or NV flags if the value is out of range for the type. =item * Uses of C<Nullav>, C<Nullcv>, C<Nullhv>, C<Nullop>, C<Nullsv> etc have been replaced by C<NULL> in the core code, and non-dual-life modules, as C<NULL> is clearer to those unfamiliar with the core code. =item * A macro C<MUTABLE_PTR(p)> has been added, which on (non-pedantic) gcc will not cast away C<const>, returning a C<void *>. Macros C<MUTABLE_SV(av)>, C<MUTABLE_SV(cv)> etc build on this, casting to C<AV *> etc without casting away C<const>. This allows proper compile-time auditing of C<const> correctness in the core, and helped pick up some errors (now fixed). =item * Macros C<mPUSHs()> and C<mXPUSHs()> have been added, for pushing SVs on the stack and mortalizing them. =item * Use of the private structure C<mro_meta> has changed slightly. Nothing outside the core should be accessing this directly anyway. =item * A new tool, F<Porting/expand-macro.pl> has been added, that allows you to view how a C preprocessor macro would be expanded when compiled. This is handy when trying to decode the macro hell that is the perl guts. =back =head1 Testing =head2 Testing improvements =over 4 =item Parallel tests The core distribution can now run its regression tests in parallel on Unix-like platforms. Instead of running C<make test>, set C<TEST_JOBS> in your environment to the number of tests to run in parallel, and run C<make test_harness>. On a Bourne-like shell, this can be done as TEST_JOBS=3 make test_harness # Run 3 tests in parallel An environment variable is used, rather than parallel make itself, because L<TAP::Harness> needs to be able to schedule individual non-conflicting test scripts itself, and there is no standard interface to C<make> utilities to interact with their job schedulers.
Note that currently some test scripts may fail when run in parallel (most notably C<ext/IO/t/io_dir.t>). If necessary run just the failing scripts again sequentially and see if the failures go away. =item Test harness flexibility It's now possible to override C<PERL5OPT> and friends in F<t/TEST> =item Test watchdog Several tests that have the potential to hang forever if they fail now incorporate a "watchdog" functionality that will kill them after a timeout, which helps ensure that C<make test> and C<make test_harness> run to completion automatically. =back =head2 New Tests Perl's developers have added a number of new tests to the core. In addition to the items listed below, many modules updated from CPAN incorporate new tests. =over 4 =item * Significant cleanups to core tests to ensure that language and interpreter features are not used before they're tested. =item * C<make test_porting> now runs a number of important pre-commit checks which might be of use to anyone working on the Perl core. =item * F<t/porting/podcheck.t> automatically checks the well-formedness of POD found in all .pl, .pm and .pod files in the F<MANIFEST>, other than in dual-lifed modules which are primarily maintained outside the Perl core. =item * F<t/porting/manifest.t> now tests that all files listed in MANIFEST are present. =item * F<t/op/while_readdir.t> tests that a bare readdir in while loop sets $_. =item * F<t/comp/retainedlines.t> checks that the debugger can retain source lines from C<eval>. =item * F<t/io/perlio_fail.t> checks that bad layers fail. =item * F<t/io/perlio_leaks.t> checks that PerlIO layers are not leaking. =item * F<t/io/perlio_open.t> checks that certain special forms of open work. =item * F<t/io/perlio.t> includes general PerlIO tests. =item * F<t/io/pvbm.t> checks that there is no unexpected interaction between the internal types C<PVBM> and C<PVGV>. =item * F<t/mro/package_aliases.t> checks that mro works properly in the presence of aliased packages. 
=item * F<t/op/dbm.t> tests C<dbmopen> and C<dbmclose>. =item * F<t/op/index_thr.t> tests the interaction of C<index> and threads. =item * F<t/op/pat_thr.t> tests the interaction of esoteric patterns and threads. =item * F<t/op/qr_gc.t> tests that C<qr> doesn't leak. =item * F<t/op/reg_email_thr.t> tests the interaction of regex recursion and threads. =item * F<t/op/regexp_qr_embed_thr.t> tests the interaction of patterns with embedded C<qr//> and threads. =item * F<t/op/regexp_unicode_prop.t> tests Unicode properties in regular expressions. =item * F<t/op/regexp_unicode_prop_thr.t> tests the interaction of Unicode properties and threads. =item * F<t/op/reg_nc_tie.t> tests the tied methods of C<Tie::Hash::NamedCapture>. =item * F<t/op/reg_posixcc.t> checks that POSIX character classes behave consistently. =item * F<t/op/re.t> checks that exportable C<re> functions in F<universal.c> work. =item * F<t/op/setpgrpstack.t> checks that C<setpgrp> works. =item * F<t/op/substr_thr.t> tests the interaction of C<substr> and threads. =item * F<t/op/upgrade.t> checks that upgrading and assigning scalars works. =item * F<t/uni/lex_utf8.t> checks that Unicode in the lexer works. =item * F<t/uni/tie.t> checks that Unicode and C<tie> work. =item * F<t/comp/final_line_num.t> tests whether line numbers are correct at EOF. =item * F<t/comp/form_scope.t> tests format scoping. =item * F<t/comp/line_debug.t> tests whether C<< @{"_<$file"} >> works. =item * F<t/op/filetest_t.t> tests if -t file test works. =item * F<t/op/qr.t> tests C<qr>. =item * F<t/op/utf8cache.t> tests malfunctions of the utf8 cache. =item * F<t/re/uniprops.t> tests Unicode's C<\p{}> regex constructs. =item * F<t/op/filehandle.t> tests some suitably portable filetest operators to check that they work as expected, particularly in the light of some internal changes made in how filehandles are blessed.
=item * F<t/op/time_loop.t> tests that unix times greater than C<2**63>, which can now be handed to C<gmtime> and C<localtime>, do not cause an internal overflow or an excessively long loop. =back =head1 New or Changed Diagnostics =head2 New Diagnostics =over =item * SV allocation tracing has been added to the diagnostics enabled by C<-Dm>. The tracing can alternatively output via the C<PERL_MEM_LOG> mechanism, if that was enabled when the F<perl> binary was compiled. =item * Smartmatch resolution tracing has been added as a new diagnostic. Use C<-DM> to enable it. =item * A new debugging flag C<-DB> now dumps subroutine definitions, leaving C<-Dx> for its original purpose of dumping syntax trees. =item * Perl 5.12 provides a number of new diagnostic messages to help you write better code. See L<perldiag> for details of these new messages. =over 4 =item * C<Bad plugin affecting keyword '%s'> =item * C<gmtime(%.0f) too large> =item * C<Lexing code attempted to stuff non-Latin-1 character into Latin-1 input> =item * C<Lexing code internal error (%s)> =item * C<localtime(%.0f) too large> =item * C<Overloaded dereference did not return a reference> =item * C<Overloaded qr did not return a REGEXP> =item * C<Perl_pmflag() is deprecated, and will be removed from the XS API> =item * C<lvalue attribute ignored after the subroutine has been defined> This new warning is issued when one attempts to mark a subroutine as lvalue after it has been defined. =item * Perl now warns you if C<++> or C<--> are unable to change the value because it's beyond the limit of representation. This uses a new warnings category: "imprecision". =item * C<lc>, C<uc>, C<lcfirst>, and C<ucfirst> warn when passed undef. =item * C<Show constant in "Useless use of a constant in void context"> =item * C<Prototype after '%s'> =item * C<panic: sv_chop %s> This new fatal error occurs when the C routine C<Perl_sv_chop()> was passed a position that is not within the scalar's string buffer. 
This could be caused by buggy XS code, and at this point recovery is not possible. =item * The fatal error C<Malformed UTF-8 returned by \N> is now produced if the C<charnames> handler returns malformed UTF-8. =item * If an unresolved named character or sequence was encountered when compiling a regex pattern then the fatal error C<\N{NAME} must be resolved by the lexer> is now produced. This can happen, for example, when using a single-quotish context like C<$re = '\N{SPACE}'; /$re/;>. See L<perldiag> for more examples of how the lexer can get bypassed. =item * C<Invalid hexadecimal number in \N{U+...}> is a new fatal error triggered when the character constant represented by C<...> is not a valid hexadecimal number. =item * The new meaning of C<\N> as C<[^\n]> is not valid in a bracketed character class, just like C<.> in a character class loses its special meaning, and will cause the fatal error C<\N in a character class must be a named character: \N{...}>. =item * The rules on what is legal for the C<...> in C<\N{...}> have been tightened up so that unless the C<...> begins with an alphabetic character and continues with a combination of alphanumerics, dashes, spaces, parentheses or colons then the warning C<Deprecated character(s) in \N{...} starting at '%s'> is now issued. =item * The warning C<Using just the first characters returned by \N{}> will be issued if the C<charnames> handler returns a sequence of characters which exceeds the limit of the number of characters that can be used. The message will indicate which characters were used and which were discarded. =back =back =head2 Changed Diagnostics A number of existing diagnostic messages have been improved or corrected: =over =item * A new warning category C<illegalproto> allows finer-grained control of warnings around function prototypes. 
The two warnings: =over =item C<Illegal character in prototype for %s : %s> =item C<Prototype after '%c' for %s : %s> =back have been moved from the C<syntax> top-level warnings category into a new first-level category, C<illegalproto>. These two warnings are currently the only ones emitted during parsing of an invalid/illegal prototype, so one can now use no warnings 'illegalproto'; to suppress only those, but not other syntax-related warnings. Warnings where prototypes are changed, ignored, or not met are still in the C<prototype> category as before. =item * C<Deep recursion on subroutine "%s"> It is now possible to change the depth threshold for this warning from the default of 100, by recompiling the F<perl> binary, setting the C pre-processor macro C<PERL_SUB_DEPTH_WARN> to the desired value. =item * C<Illegal character in prototype> warning is now more precise when reporting illegal characters after _ =item * mro merging error messages are now very similar to those produced by L<Algorithm::C3>. =item * Amelioration of the error message "Unrecognized character %s in column %d" Changes the error message to "Unrecognized character %s; marked by E<lt>-- HERE after %sE<lt>-- HERE near column %d". This should make it a little simpler to spot and correct the suspicious character. =item * Perl now explicitly points to C<$.> when it causes an uninitialized warning for ranges in scalar context. =item * C<split> now warns when called in void context. =item * C<printf>-style functions called with too few arguments will now issue the warning C<"Missing argument in %s"> [perl #71000] =item * Perl now properly returns a syntax error instead of segfaulting if C<each>, C<keys>, or C<values> is used without an argument. =item * C<tell()> now fails properly if called without an argument and when no previous file was read. C<tell()> now returns C<-1>, and sets errno to C<EBADF>, thus restoring the 5.8.x behaviour. 
=item * C<overload> no longer implicitly unsets fallback on repeated 'use overload' lines. =item * POSIX::strftime() can now handle Unicode characters in the format string. =item * The C<syntax> category was removed from 5 warnings that should only be in C<deprecated>. =item * Three fatal C<pack>/C<unpack> error messages have been normalized to C<panic: %s>. =item * C<Unicode character is illegal> has been rephrased to be more accurate. It now reads C<Unicode non-character is illegal in interchange> and the perldiag documentation has been expanded a bit. =item * Currently, all but the first of the several characters that the C<charnames> handler may return are discarded when used in a regular expression pattern bracketed character class. If this happens then the warning C<Using just the first character returned by \N{} in character class> will be issued. =item * The warning C<Missing right brace on \N{} or unescaped left brace after \N. Assuming the latter> will be issued if Perl encounters a C<\N{> but doesn't find a matching C<}>. In this case Perl doesn't know if it was mistakenly omitted, or if "match non-newline" followed by "match a C<{>" was desired. It assumes the latter because that is actually a valid interpretation as written, unlike the other case. If you meant the former, you need to add the matching right brace. If you did mean the latter, you can silence this warning by writing instead C<\N\{>. =item * C<gmtime> and C<localtime> called with numbers smaller than they can reliably handle will now issue the warnings C<gmtime(%.0f) too small> and C<localtime(%.0f) too small>. =back The following diagnostic messages have been removed: =over 4 =item * C<Runaway format> =item * C<Can't locate package %s for the parents of %s> In general this warning only got produced in conjunction with other warnings, and removing it allowed an ISA lookup optimisation to be added.
=item * C<v-string in use/require is non-portable> =back =head1 Utility Changes =over 4 =item * F<h2ph> now looks in C<include-fixed> too, which is a recent addition to gcc's search path. =item * F<h2xs> no longer incorrectly treats enum values like macros. It also now handles C++ style comments (C<//>) properly in enums. =item * F<perl5db.pl> now supports C<LVALUE> subroutines. Additionally, the debugger now correctly handles proxy constant subroutines, and subroutine stubs. =item * F<perlbug> now uses C<%Module::CoreList::bug_tracker> to print out upstream bug tracker URLs. If a user identifies a particular module as the topic of their bug report and we're able to divine the URL for its upstream bug tracker, perlbug now provides a message to the user explaining that the core copies the CPAN version directly, and provides the URL for reporting the bug directly to the upstream author. F<perlbug> no longer reports "Message sent" when it hasn't actually sent the message. =item * F<perlthanks> is a new utility for sending non-bug-reports to the authors and maintainers of Perl. Getting nothing but bug reports can become a bit demoralising. If Perl 5.12 works well for you, please try out F<perlthanks>. It will make the developers smile. =item * Perl's developers have fixed bugs in F<a2p> having to do with the C<match()> operator in list context. Additionally, F<a2p> no longer generates code that uses the C<$[> variable. =back =head1 Selected Bug Fixes =over 4 =item * U+0FFFF is now a legal character in regular expressions. =item * pp_qr now always returns a new regexp SV. Resolves RT #69852. Instead of returning a(nother) reference to the (pre-compiled) regexp in the optree, use reg_temp_copy() to create a copy of it, and return a reference to that.
This resolves issues about Regexp::DESTROY not being called in a timely fashion (the original bug tracked by RT #69852), as well as bugs related to blessing regexps, and of assigning to regexps, as described in correspondence added to the ticket. It transpires that we also need to undo the SvPVX() sharing when ithreads is cloning a Regexp SV, because mother_re is set to NULL, instead of a cloned copy of the mother_re. This change might fix bugs with regexps and threads in certain other situations, but as yet neither tests nor bug reports have indicated any problems, so it might not actually be an edge case that it's possible to reach. =item * Several compilation errors and segfaults when perl was built with C<-Dmad> were fixed. =item * Fixes for lexer API changes in 5.11.2 which broke NYTProf's savesrc option. =item * C<-t> should only return TRUE for file handles connected to a TTY. The Microsoft C version of C<isatty()> returns TRUE for all character mode devices, including the F</dev/null>-style "nul" device and printers like "lpt1". =item * Fixed a regression caused by commit fafafbaf which caused a panic during parameter passing [perl #70171]. =item * On systems which do in-place edits without backup files, -i'*' now works as the documentation says it does [perl #70802]. =item * Saving and restoring magic flags no longer loses the readonly flag. =item * The malformed syntax C<grep EXPR LIST> (note the missing comma) no longer causes abrupt and total failure. =item * Regular expressions compiled with C<qr{}> literals properly set C<$'> when matching again. =item * Using named subroutines with C<sort> should no longer lead to bus errors [perl #71076]. =item * Numerous bugfixes catch small issues caused by the recently-added Lexer API. =item * Smart match against C<@_> sometimes gave false negatives. [perl #71078] =item * C<$@> may now be assigned a read-only value (without error or busting the stack). 
=item * C<sort> called recursively from within an active comparison subroutine no longer causes a bus error if run multiple times. [perl #71076] =item * Tie::Hash::NamedCapture::* will not abort if passed bad input (RT #71828) =item * @_ and $_ no longer leak under threads (RT #34342 and #41138, also #70602, #70974) =item * C<-I> on shebang line now adds directories in front of @INC as documented, and as does C<-I> when specified on the command-line. =item * C<kill> is now fatal when called on non-numeric process identifiers. Previously, an C<undef> process identifier would be interpreted as a request to kill process 0, which would terminate the current process group on POSIX systems. Since process identifiers are always integers, killing a non-numeric process is now fatal. =item * 5.10.0 inadvertently disabled an optimisation, which caused a measurable performance drop in list assignment, such as is often used to assign function parameters from C<@_>. The optimisation has been re-instated, and the performance regression fixed. (This fix is also present in 5.10.1) =item * Fixed memory leak on C<while (1) { map 1, 1 }> [RT #53038]. =item * Some potential coredumps in PerlIO fixed [RT #57322,54828]. =item * The debugger now works with lvalue subroutines. =item * The debugger's C<m> command was broken on modules that defined constants [RT #61222]. =item * C<crypt> and string complement could return tainted values for untainted arguments [RT #59998]. =item * The C<-i>I<.suffix> command-line switch now recreates the file using restricted permissions, before changing its mode to match the original file. This eliminates a potential race condition [RT #60904]. =item * On some Unix systems, the value in C<$?> would not have the top bit set (C<$? & 128>) even if the child core dumped. =item * Under some circumstances, C<$^R> could incorrectly become undefined [RT #57042]. 
=item * In the XS API, various hash functions, when passed a pre-computed hash where the key is UTF-8, might result in an incorrect lookup. =item * XS code including F<XSUB.h> before F<perl.h> gave a compile-time error [RT #57176]. =item * C<< $object-E<gt>isa('Foo') >> would report false if the package C<Foo> didn't exist, even if the object's C<@ISA> contained C<Foo>. =item * Various bugs in the new-to 5.10.0 mro code, triggered by manipulating C<@ISA>, have been found and fixed. =item * Bitwise operations on references could crash the interpreter, e.g. C<$x=\$y; $x |= "foo"> [RT #54956]. =item * Patterns including alternation might be sensitive to the internal UTF-8 representation, e.g. my $byte = chr(192); my $utf8 = chr(192); utf8::upgrade($utf8); $utf8 =~ /$byte|X}/i; # failed in 5.10.0 =item * Within UTF8-encoded Perl source files (i.e. where C<use utf8> is in effect), double-quoted literal strings could be corrupted where a C<\xNN>, C<\0NNN> or C<\N{}> is followed by a literal character with ordinal value greater than 255 [RT #59908]. =item * C<B::Deparse> failed to correctly deparse various constructs: C<readpipe STRING> [RT #62428], C<CORE::require(STRING)> [RT #62488], C<sub foo(_)> [RT #62484]. =item * Using C<setpgrp> with no arguments could corrupt the perl stack. =item * The block form of C<eval> is now specifically trappable by C<Safe> and C<ops>. Previously it was erroneously treated like string C<eval>. =item * In 5.10.0, the two characters C<[~> were sometimes parsed as the smart match operator (C<~~>) [RT #63854]. =item * In 5.10.0, the C<*> quantifier in patterns was sometimes treated as C<{0,32767}> [RT #60034, #60464]. For example, this match would fail: ("ab" x 32768) =~ /^(ab)*$/ =item * C<shmget> was limited to a 32 bit segment size on a 64 bit OS [RT #63924]. 
=item * Using C<next> or C<last> to exit a C<given> block no longer produces a spurious warning like the following: Exiting given via last at foo.pl line 123 =item * Assigning a format to a glob could corrupt the format; e.g.: *bar=*foo{FORMAT}; # foo format now bad =item * Attempting to coerce a typeglob to a string or number could cause an assertion failure. The correct error message is now generated, C<Can't coerce GLOB to I<$type>>. =item * Under C<use filetest 'access'>, C<-x> was using the wrong access mode. This has been fixed [RT #49003]. =item * C<length> on a tied scalar that returned a Unicode value would not be correct the first time. This has been fixed. =item * Using an array C<tie> inside an array C<tie> could SEGV. This has been fixed. [RT #51636] =item * A race condition inside C<PerlIOStdio_close()> has been identified and fixed. This used to cause various threading issues, including SEGVs. =item * In C<unpack>, the use of C<()> groups in scalar context was internally placing a list on the interpreter's stack, which manifested in various ways, including SEGVs. This is now fixed [RT #50256]. =item * Magic was called twice in C<substr>, C<\&$x>, C<tie $x, $m> and C<chop>. These have all been fixed. =item * A 5.10.0 optimisation to clear the temporary stack within the implicit loop of C<s///ge> has been reverted, as it turned out to be the cause of obscure bugs in seemingly unrelated parts of the interpreter [commit ef0d4e17921ee3de]. =item * The line numbers for warnings inside C<elsif> are now correct. =item * The C<..> operator now works correctly with ranges whose ends are at or close to the values of the smallest and largest integers. =item * C<binmode STDIN, ':raw'> could lead to segmentation faults on some platforms. This has been fixed [RT #54828]. =item * An off-by-one error meant that C<index $str, ...> was effectively being executed as C<index "$str\0", ...>. This has been fixed [RT #53746]. 
=item * Various leaks associated with named captures in regexes have been fixed [RT #57024]. =item * A weak reference to a hash would leak. This was affecting C<DBI> [RT #56908]. =item * Using (?|) in a regex could cause a segfault [RT #59734]. =item * Use of a UTF-8 C<tr//> within a closure could cause a segfault [RT #61520]. =item * Calling C<Perl_sv_chop()> or otherwise upgrading an SV could result in an unaligned 64-bit access on the SPARC architecture [RT #60574]. =item * In the 5.10.0 release, C<inc_version_list> would incorrectly list C<5.10.*> after C<5.8.*>; this affected the C<@INC> search order [RT #67628]. =item * In 5.10.0, C<pack "a*", $tainted_value> returned a non-tainted value [RT #52552]. =item * In 5.10.0, C<printf> and C<sprintf> could produce the fatal error C<panic: utf8_mg_pos_cache_update> when printing UTF-8 strings [RT #62666]. =item * In the 5.10.0 release, a dynamically created C<AUTOLOAD> method might be missed (method cache issue) [RT #60220,60232]. =item * In the 5.10.0 release, a combination of C<use feature> and C<//ee> could cause a memory leak [RT #63110]. =item * C<-C> on the shebang (C<#!>) line is once more permitted if it is also specified on the command line. C<-C> on the shebang line used to be a silent no-op I<if> it was not also on the command line, so perl 5.10.0 disallowed it, which broke some scripts. Now perl checks whether it is also on the command line and only dies if it is not [RT #67880]. =item * In 5.10.0, certain types of re-entrant regular expression could crash, or cause the following assertion failure [RT #60508]: Assertion rx->sublen >= (s - rx->subbeg) + i failed =item * Perl now includes previously missing files from the Unicode Character Database. =item * Perl now honors C<TMPDIR> when opening an anonymous temporary file. =back =head1 Platform Specific Changes Perl is incredibly portable. In general, if a platform has a C compiler, someone has ported Perl to it (or will soon). 
We're happy to announce that Perl 5.12 includes support for several new platforms. At the same time, it's time to bid farewell to some (very) old friends. =head2 New Platforms =over =item Haiku Perl's developers have merged patches from Haiku's maintainers. Perl should now build on Haiku. =item MirOS BSD Perl should now build on MirOS BSD. =back =head2 Discontinued Platforms =over =item Domain/OS =item MiNT =item Tenon MachTen =back =head2 Updated Platforms =over 4 =item AIX =over 4 =item * Removed F<libbsd> for AIX 5L and 6.1. Only C<flock()> was used from F<libbsd>. =item * Removed F<libgdbm> for AIX 5L and 6.1 if F<libgdbm> < 1.8.3-5 is installed. The F<libgdbm> is delivered as an optional package with the AIX Toolbox. Unfortunately the versions below 1.8.3-5 are broken. =item * Hints changes mean that AIX 4.2 should work again. =back =item Cygwin =over 4 =item * Perl now supports IPv6 on Cygwin 1.7 and newer. =item * On Cygwin we now strip the last number from the DLL. This has been the behaviour in the cygwin.com build for years. The hints files have been updated. =back =item Darwin (Mac OS X) =over 4 =item * Skip testing the be_BY.CP1131 locale on Darwin 10 (Mac OS X 10.6), as it's still buggy. =item * Correct infelicities in the regexp used to identify buggy locales on Darwin 8 and 9 (Mac OS X 10.4 and 10.5, respectively). =back =item DragonFly BSD =over 4 =item * Fix thread library selection [perl #69686] =back =item FreeBSD =over 4 =item * The hints files now identify the correct threading libraries on FreeBSD 7 and later. =back =item Irix =over 4 =item * We now work around a bizarre preprocessor bug in the Irix 6.5 compiler: C<cc -E -> unfortunately goes into K&R mode, but C<cc -E file.c> doesn't. =back =item NetBSD =over 4 =item * Hints now supports versions 5.*. =back =item OpenVMS =over 4 =item * C<-UDEBUGGING> is now the default on VMS. Like it has been everywhere else for ages and ages. 
Command-line selection of -UDEBUGGING and -DDEBUGGING now also works in configure.com; previously, the only way to turn it off was by saying no in answer to the interactive question. =item * The default pipe buffer size on VMS has been updated to 8192 on 64-bit systems. =item * Reads from the in-memory temporary files of C<PerlIO::scalar> used to fail if C<$/> was set to a numeric reference (to indicate record-style reads). This is now fixed. =item * VMS now supports C<getgrgid>. =item * Many improvements and cleanups have been made to the VMS file name handling and conversion code. =item * Enabling the C<PERL_VMS_POSIX_EXIT> logical name now encodes a POSIX exit status in a VMS condition value for better interaction with GNV's bash shell and other utilities that depend on POSIX exit values. See L<perlvms/"$?"> for details. =item * C<File::Copy> now detects Unix compatibility mode on VMS. =back =item Stratus VOS =over 4 =item * Various changes from Stratus have been merged in. =back =item Symbian =over 4 =item * There is now support for Symbian S60 3.2 SDK and S60 5.0 SDK. =back =item Windows =over 4 =item * Perl 5.12 supports Windows 2000 and later. The supporting code for legacy versions of Windows is still included, but will be removed during the next development cycle. =item * Initial support for building Perl with MinGW-w64 is now available. =item * F<perl.exe> now includes a manifest resource to specify the C<trustInfo> settings for Windows Vista and later. Without this setting Windows would treat F<perl.exe> as a legacy application and apply various heuristics like redirecting access to protected file system areas (like the "Program Files" folder) to the user's "VirtualStore" instead of generating a proper "permission denied" error. The manifest resource also requests the Microsoft Common-Controls version 6.0 (themed controls introduced in Windows XP). 
Check out the Win32::VisualStyles module on CPAN to switch back to old style unthemed controls for legacy applications. =item * The C<-t> filetest operator now only returns true if the filehandle is connected to a console window. In previous versions of Perl it would return true for all character mode devices, including F<NUL> and F<LPT1>. =item * The C<-p> filetest operator now works correctly, and the Fcntl::S_IFIFO constant is defined when Perl is compiled with Microsoft Visual C. In previous Perl versions C<-p> always returned a false value, and the Fcntl::S_IFIFO constant was not defined. This bug is specific to Microsoft Visual C and never affected Perl binaries built with MinGW. =item * The socket error codes are now more widely supported: The POSIX module will define the symbolic names, like POSIX::EWOULDBLOCK, and stringification of socket error codes in $! works as well now; C:\>perl -MPOSIX -E "$!=POSIX::EWOULDBLOCK; say $!" A non-blocking socket operation could not be completed immediately. =item * flock() will now set sensible error codes in $!. Previous Perl versions copied the value of $^E into $!, which caused much confusion. =item * select() now supports all empty C<fd_set>s more correctly. =item * C<'.\foo'> and C<'..\foo'> were treated differently than C<'./foo'> and C<'../foo'> by C<do> and C<require> [RT #63492]. =item * Improved message window handling means that C<alarm> and C<kill> messages will no longer be dropped under race conditions. =item * Various bits of Perl's build infrastructure are no longer converted to win32 line endings at release time. If this hurts you, please report the problem with the L<perlbug> program included with perl. =back =back =head1 Known Problems This is a list of some significant unfixed bugs, which are regressions from either 5.10.x or 5.8.x. =over 4 =item * Some CPANPLUS tests may fail if there is a functioning file F<../../cpanp-run-perl> outside your build directory. 
The failure shouldn't imply there's a problem with the actual functional software. The bug is already fixed in [RT #74188] and is scheduled for inclusion in perl-v5.12.1. =item * C<List::Util::first> misbehaves in the presence of a lexical C<$_> (typically introduced by C<my $_> or implicitly by C<given>). The variable which gets set for each iteration is the package variable C<$_>, not the lexical C<$_> [RT #67694]. A similar issue may occur in other modules that provide functions which take a block as their first argument, like foo { ... $_ ...} list =item * Some regexes may run much more slowly when run in a child thread compared with the thread the pattern was compiled into [RT #55600]. =item * Things like C<"\N{LATIN SMALL LIGATURE FF}" =~ /\N{LATIN SMALL LETTER F}+/> will appear to hang as they get into a very long running loop [RT #72998]. =item * Several porters have reported mysterious crashes when Perl's entire test suite is run after a build on certain Windows 2000 systems. When run by hand, the individual tests reportedly work fine. =back =head1 Errata =over =item * This one is actually a change introduced in 5.10.0, but it was missed from that release's perldelta, so it is mentioned here instead. A bugfix related to the handling of the C</m> modifier and C<qr> resulted in a change of behaviour between 5.8.x and 5.10.0: # matches in 5.8.x, doesn't match in 5.10.0 $re = qr/^bar/; "foo\nbar" =~ /$re/m; =back =head1 Acknowledgements Perl 5.12.0 represents approximately two years of development since Perl 5.10.0 and contains over 750,000 lines of changes across over 3,000 files from over 200 authors and committers. Perl continues to flourish into its third decade thanks to a vibrant community of users and developers. 
The following people are known to have contributed the improvements that became Perl 5.12.0: Aaron Crane, Abe Timmerman, Abhijit Menon-Sen, Abigail, Adam Russell, Adriano Ferreira, Ævar Arnfjörð Bjarmason, Alan Grover, Alexandr Ciornii, Alex Davies, Alex Vandiver, Andreas Koenig, Andrew Rodland, andrew@sundale.net, Andy Armstrong, Andy Dougherty, Jose AUGUSTE-ETIENNE, Benjamin Smith, Ben Morrow, bharanee rathna, Bo Borgerson, Bo Lindbergh, Brad Gilbert, Bram, Brendan O'Dea, brian d foy, Charles Bailey, Chip Salzenberg, Chris 'BinGOs' Williams, Christoph Lamprecht, Chris Williams, chromatic, Claes Jakobsson, Craig A. Berry, Dan Dascalescu, Daniel Frederick Crisman, Daniel M. Quinlan, Dan Jacobson, Dan Kogai, Dave Mitchell, Dave Rolsky, David Cantrell, David Dick, David Golden, David Mitchell, David M. Syzdek, David Nicol, David Wheeler, Dennis Kaarsemaker, Dintelmann, Peter, Dominic Dunlop, Dr.Ruud, Duke Leto, Enrico Sorcinelli, Eric Brine, Father Chrysostomos, Florian Ragwitz, Frank Wiegand, Gabor Szabo, Gene Sullivan, Geoffrey T. Dairiki, George Greer, Gerard Goossen, Gisle Aas, Goro Fuji, Graham Barr, Green, Paul, Hans Dieter Pearcey, Harmen, H. Merijn Brand, Hugo van der Sanden, Ian Goodacre, Igor Sutton, Ingo Weinhold, James Bence, James Mastros, Jan Dubois, Jari Aalto, Jarkko Hietaniemi, Jay Hannah, Jerry Hedden, Jesse Vincent, Jim Cromie, Jody Belka, John E. Malmberg, John Malmberg, John Peacock, John Peacock via RT, John P. Linderman, John Wright, Josh ben Jore, Jos I. 
Boumans, Karl Williamson, Kenichi Ishigaki, Ken Williams, Kevin Brintnall, Kevin Ryde, Kurt Starsinic, Leon Brocard, Lubomir Rintel, Luke Ross, Marcel Grünauer, Marcus Holland-Moritz, Mark Jason Dominus, Marko Asplund, Martin Hasch, Mashrab Kuvatov, Matt Kraai, Matt S Trout, Max Maischein, Michael Breen, Michael Cartmell, Michael G Schwern, Michael Witten, Mike Giroux, Milosz Tanski, Moritz Lenz, Nicholas Clark, Nick Cleaton, Niko Tyni, Offer Kaye, Osvaldo Villalon, Paul Fenwick, Paul Gaborit, Paul Green, Paul Johnson, Paul Marquess, Philip Hazel, Philippe Bruhat, Rafael Garcia-Suarez, Rainer Tammer, Rajesh Mandalemula, Reini Urban, Renée Bäcker, Ricardo Signes, Ricardo SIGNES, Richard Foley, Rich Rauenzahn, Rick Delaney, Risto Kankkunen, Robert May, Roberto C. Sanchez, Robin Barker, SADAHIRO Tomoyuki, Salvador Ortiz Garcia, Sam Vilain, Scott Lanning, Sébastien Aperghis-Tramoni, Sérgio Durigan Júnior, Shlomi Fish, Simon 'corecode' Schubert, Sisyphus, Slaven Rezic, Smylers, Steffen Müller, Steffen Ullrich, Stepan Kasal, Steve Hay, Steven Schubiger, Steve Peters, Tels, The Doctor, Tim Bunce, Tim Jenness, Todd Rinaldo, Tom Christiansen, Tom Hukins, Tom Wyant, Tony Cook, Torsten Schoenfeld, Tye McQueen, Vadim Konovalov, Vincent Pit, Hio YAMASHINA, Yasuhiro Matsumoto, Yitzchak Scott-Thoennes, Yuval Kogman, Yves Orton, Zefram, Zsban Ambrus This is woefully incomplete as it's automatically generated from version control history. In particular, it doesn't include the names of the (very much appreciated) contributors who reported issues in previous versions of Perl that helped make Perl 5.12.0 better. For a more complete list of all of Perl's historical contributors, please see the C<AUTHORS> file in the Perl 5.12.0 distribution. Our "retired" pumpkings Nicholas Clark and Rafael Garcia-Suarez deserve special thanks for their brilliant and substantive ongoing contributions. Nicholas personally authored over 30% of the patches since 5.10.0. 
Rafael comes in second in patch authorship with 11%, but is first by a long shot in committing patches authored by others, pushing 44% of the commits since 5.10.0 in this category, often after providing considerable coaching to the patch authors. These statistics in no way comprise all of their contributions, but express in shorthand that we couldn't have done it without them. Many of the changes included in this version originated in the CPAN modules included in Perl's core. We're grateful to the entire CPAN community for helping Perl to flourish. =head1 Reporting Bugs If you find what you think is a bug, you might check the articles recently posted to the comp.lang.perl.misc newsgroup and the perl bug database at L<http://rt.perl.org/perlbug/>. There may also be information at L<http://www.perl.org/>, the Perl Home Page. If you believe you have an unreported bug, please run the B<perlbug> program included with your release. Be sure to trim your bug down to a tiny but sufficient test case. Your bug report, along with the output of C<perl -V>, will be sent off to perlbug@perl.org to be analyzed by the Perl porting team. If the bug you are reporting has security implications, which make it inappropriate to send to a publicly archived mailing list, then please send it to perl5-security-report@perl.org. This points to a closed subscription unarchived mailing list, which includes all the core committers, who will be able to help assess the impact of issues, figure out a resolution, and help co-ordinate the release of patches to mitigate or fix the problem across all platforms on which Perl is supported. Please only use this address for security issues in the Perl core, not for modules independently distributed on CPAN. =head1 SEE ALSO The F<Changes> file for an explanation of how to view exhaustive details on what changed. The F<INSTALL> file for how to build Perl. The F<README> file for general stuff. The F<Artistic> and F<Copying> files for copyright information. 
L<http://dev.perl.org/perl5/errata.html> for a list of issues found after this release, as well as a list of CPAN modules known to be incompatible with this release. =cut If you read this file _as_is_, just ignore the funny characters you see. It is written in the POD format (see pod/perlpod.pod) which is specifically designed to be readable as is. =head1 NAME perllinux - Perl version 5 on Linux systems =head1 DESCRIPTION This document describes various features of Linux that will affect how Perl version 5 (hereafter just Perl) is compiled and/or runs. =head2 Deploying Perl on Linux Normally one can install F</usr/bin/perl> on Linux using the distribution's package manager (e.g., C<sudo apt-get install perl> or C<sudo dnf install perl>). Note that sometimes one needs to install some extra system packages in order to be able to use CPAN frontends, and that messing with the system's perl is not always recommended. One can use L<perlbrew|https://perlbrew.pl/> to avoid such issues. Otherwise, perl should build fine on Linux using the mainstream compilers GCC and clang, while following the usual instructions. =head2 Experimental Support for Sun Studio Compilers for Linux OS Sun Microsystems has released a port of their Sun Studio compilers for Linux. As of May 2019, the last stable release took place in 2017, and one can buy support contracts for them. There are some special instructions for building Perl with Sun Studio on Linux. Following the normal C<Configure>, you have to run make as follows: LDLOADLIBS=-lc make C<LDLOADLIBS> is an environment variable used by the linker to link C</ext> modules to glibc. Currently, that environment variable is not getting populated by a combination of C<Config> entries and C<ExtUtils::MakeMaker>. While there may be a bug somewhere in Perl's configuration or C<ExtUtils::MakeMaker> causing the problem, the most likely cause is an incomplete understanding of Sun Studio by this author. 
Further investigation is needed to get this working better. =head1 AUTHOR Steve Peters <steve@fisharerojo.org> Please report any errors, updates, or suggestions to L<https://github.com/Perl/perl5/issues>. =head1 NAME perlunifaq - Perl Unicode FAQ =head1 Q and A This is a list of questions and answers about Unicode in Perl, intended to be read after L<perlunitut>. =head2 perlunitut isn't really a Unicode tutorial, is it? No, and this isn't really a Unicode FAQ. Perl has an abstracted interface for all supported character encodings, so this is actually a generic C<Encode> tutorial and C<Encode> FAQ. But many people think that Unicode is special and magical, and I didn't want to disappoint them, so I decided to call the document a Unicode tutorial. =head2 What character encodings does Perl support? To find out which character encodings your Perl supports, run: perl -MEncode -le "print for Encode->encodings(':all')" =head2 Which version of perl should I use? Well, if you can, upgrade to the most recent, but certainly C<5.8.1> or newer. The tutorial and FAQ assume the latest release. You should also check your modules, and upgrade them if necessary. For example, HTML::Entities requires version >= 1.32 to function correctly, even though the changelog is silent about this. =head2 What about binary data, like images? Well, apart from a bare C<binmode $fh>, you shouldn't treat them specially. (The binmode is needed because otherwise Perl may convert line endings on Win32 systems.) Be careful, though, to never combine text strings with binary strings. If you need text in a binary stream, encode your text strings first using the appropriate encoding, then join them with binary strings. See also: "What if I don't encode?". =head2 When should I decode or encode? Whenever you're communicating text with anything that is external to your perl process, like a database, a text file, a socket, or another program. 
Even if the thing you're communicating with is also written in Perl. =head2 What if I don't decode? Whenever your encoded, binary string is used together with a text string, Perl will assume that your binary string was encoded with ISO-8859-1, also known as latin-1. If it wasn't latin-1, then your data is unpleasantly converted. For example, if it was UTF-8, the individual bytes of multibyte characters are seen as separate characters, and then again converted to UTF-8. Such double encoding can be compared to double HTML encoding (C<&amp;gt;>), or double URI encoding (C<%253E>). This silent implicit decoding is known as "upgrading". That may sound positive, but it's best to avoid it. =head2 What if I don't encode? Your text string will be sent using the bytes in Perl's internal format. In some cases, Perl will warn you that you're doing something wrong, with a friendly warning: Wide character in print at example.pl line 2. Because the internal format is often UTF-8, these bugs are hard to spot, because UTF-8 is usually the encoding you wanted! But don't be lazy, and don't use the fact that Perl's internal format is UTF-8 to your advantage. Encode explicitly to avoid weird bugs, and to show to maintenance programmers that you thought this through. =head2 Is there a way to automatically decode or encode? If all data that comes from a certain handle is encoded in exactly the same way, you can tell the PerlIO system to automatically decode everything, with the C<encoding> layer. If you do this, you can't accidentally forget to decode or encode anymore, on things that use the layered handle. 
You can provide this layer when C<open>ing the file: open my $fh, '>:encoding(UTF-8)', $filename; # auto encoding on write open my $fh, '<:encoding(UTF-8)', $filename; # auto decoding on read Or if you already have an open filehandle: binmode $fh, ':encoding(UTF-8)'; Some database drivers for DBI can also automatically encode and decode, but that is sometimes limited to the UTF-8 encoding. =head2 What if I don't know which encoding was used? Do whatever you can to find out, and if you have to: guess. (Don't forget to document your guess with a comment.) You could open the document in a web browser, and change the character set or character encoding until you can visually confirm that all characters look the way they should. There is no way to reliably detect the encoding automatically, so if people keep sending you data without charset indication, you may have to educate them. =head2 Can I use Unicode in my Perl sources? Yes, you can! If your sources are UTF-8 encoded, you can indicate that with the C<use utf8> pragma. use utf8; This doesn't do anything to your input, or to your output. It only influences the way your sources are read. You can use Unicode in string literals, in identifiers (but they still have to be "word characters" according to C<\w>), and even in custom delimiters. =head2 Data::Dumper doesn't restore the UTF8 flag; is it broken? No, Data::Dumper's Unicode abilities are as they should be. There have been some complaints that it should restore the UTF8 flag when the data is read again with C<eval>. However, you should really not look at the flag, and nothing indicates that Data::Dumper should break this rule. Here's what happens: when Perl reads in a string literal, it sticks to 8 bit encoding as long as it can. (But perhaps originally it was internally encoded as UTF-8, when you dumped it.) When it has to give that up because other characters are added to the text string, it silently upgrades the string to UTF-8. 
If you properly encode your strings for output, none of this is of your concern, and you can just C<eval> dumped data as always. =head2 Why do regex character classes sometimes match only in the ASCII range? Starting in Perl 5.14 (and partially in Perl 5.12), just put a C<use feature 'unicode_strings'> near the beginning of your program. Within its lexical scope you shouldn't have this problem. It also is automatically enabled under C<use feature ':5.12'> or C<use v5.12> or using C<-E> on the command line for Perl 5.12 or higher. The rationale for requiring this is to not break older programs that rely on the way things worked before Unicode came along. Those older programs knew only about the ASCII character set, and so may not work properly for additional characters. When a string is encoded in UTF-8, Perl assumes that the program is prepared to deal with Unicode, but when the string isn't, Perl assumes that only ASCII is wanted, and so those characters that are not ASCII characters aren't recognized as to what they would be in Unicode. C<use feature 'unicode_strings'> tells Perl to treat all characters as Unicode, whether the string is encoded in UTF-8 or not, thus avoiding the problem. However, on earlier Perls, or if you pass strings to subroutines outside the feature's scope, you can force Unicode rules by changing the encoding to UTF-8 by doing C<utf8::upgrade($string)>. This can be used safely on any string, as it checks and does not change strings that have already been upgraded. For a more detailed discussion, see L<Unicode::Semantics> on CPAN. =head2 Why do some characters not uppercase or lowercase correctly? See the answer to the previous question. =head2 How can I determine if a string is a text string or a binary string? You can't. Some use the UTF8 flag for this, but that's misuse, and makes well behaved modules like Data::Dumper look bad. 
The flag is useless for this purpose, because it's off when an 8 bit encoding (by default ISO-8859-1) is used to store the string. This is something you, the programmer, have to keep track of; sorry. You could consider adopting a kind of "Hungarian notation" to help with this. =head2 How do I convert from encoding FOO to encoding BAR? By first converting the FOO-encoded byte string to a text string, and then the text string to a BAR-encoded byte string: my $text_string = decode('FOO', $foo_string); my $bar_string = encode('BAR', $text_string); or by skipping the text string part, and going directly from one binary encoding to the other: use Encode qw(from_to); from_to($string, 'FOO', 'BAR'); # changes contents of $string or by letting automatic decoding and encoding do all the work: open my $foofh, '<:encoding(FOO)', 'example.foo.txt'; open my $barfh, '>:encoding(BAR)', 'example.bar.txt'; print { $barfh } $_ while <$foofh>; =head2 What are C<decode_utf8> and C<encode_utf8>? These are alternate syntaxes for C<decode('utf8', ...)> and C<encode('utf8', ...)>. Do not use these functions for data exchange. Instead use C<decode('UTF-8', ...)> and C<encode('UTF-8', ...)>; see L</What's the difference between UTF-8 and utf8?> below. =head2 What is a "wide character"? This is a term used for characters occupying more than one byte. The Perl warning "Wide character in ..." is caused by such a character. With no specified encoding layer, Perl tries to fit things into a single byte. When it can't, it emits this warning (if warnings are enabled), and uses UTF-8 encoded data instead. To avoid this warning and to avoid having different output encodings in a single stream, always specify an encoding explicitly, for example with a PerlIO layer: binmode STDOUT, ":encoding(UTF-8)"; =head1 INTERNALS =head2 What is "the UTF8 flag"? Please, unless you're hacking the internals, or debugging weirdness, don't think about the UTF8 flag at all. 
That means that you very probably shouldn't use C<is_utf8>, C<_utf8_on> or C<_utf8_off> at all. The UTF8 flag, also called SvUTF8, is an internal flag that indicates that the current internal representation is UTF-8. Without the flag, it is assumed to be ISO-8859-1. Perl converts between these automatically. (Actually Perl usually assumes the representation is ASCII; see L</Why do regex character classes sometimes match only in the ASCII range?> above.) One of Perl's internal formats happens to be UTF-8. Unfortunately, Perl can't keep a secret, so everyone knows about this. That is the source of much confusion. It's better to pretend that the internal format is some unknown encoding, and that you always have to encode and decode explicitly. =head2 What about the C<use bytes> pragma? Don't use it. It makes no sense to deal with bytes in a text string, and it makes no sense to deal with characters in a byte string. Do the proper conversions (by decoding/encoding), and things will work out well: you get character counts for decoded data, and byte counts for encoded data. C<use bytes> is usually a failed attempt to do something useful. Just forget about it. =head2 What about the C<use encoding> pragma? Don't use it. Unfortunately, it assumes that the programmer's environment and that of the user will use the same encoding. It will use the same encoding for the source code and for STDIN and STDOUT. When a program is copied to another machine, the source code does not change, but the STDIO environment might. If you need non-ASCII characters in your source code, make it a UTF-8 encoded file and C<use utf8>. If you need to set the encoding for STDIN, STDOUT, and STDERR, for example based on the user's locale, C<use open>. =head2 What is the difference between C<:encoding> and C<:utf8>? Because UTF-8 is one of Perl's internal formats, you can often just skip the encoding or decoding step, and manipulate the UTF8 flag directly. 
Instead of C<:encoding(UTF-8)>, you can simply use C<:utf8>, which skips the encoding step if the data was already represented as UTF8 internally. This is widely accepted as good behavior when you're writing, but it can be dangerous when reading, because it causes internal inconsistency when you have invalid byte sequences. Using C<:utf8> for input can sometimes result in security breaches, so please use C<:encoding(UTF-8)> instead. Instead of C<decode> and C<encode>, you could use C<_utf8_on> and C<_utf8_off>, but this is considered bad style. Especially C<_utf8_on> can be dangerous, for the same reason that C<:utf8> can. There are some shortcuts for oneliners; see L<-C in perlrun|perlrun/-C [numberE<sol>list]>. =head2 What's the difference between C<UTF-8> and C<utf8>? C<UTF-8> is the official standard. C<utf8> is Perl's way of being liberal in what it accepts. If you have to communicate with things that aren't so liberal, you may want to consider using C<UTF-8>. If you have to communicate with things that are too liberal, you may have to use C<utf8>. The full explanation is in L<Encode/"UTF-8 vs. utf8 vs. UTF8">. C<UTF-8> is internally known as C<utf-8-strict>. The tutorial uses UTF-8 consistently, even where utf8 is actually used internally, because the distinction can be hard to make, and is mostly irrelevant. For example, utf8 can be used for code points that don't exist in Unicode, like 9999999, but if you encode that to UTF-8, you get a substitution character (by default; see L<Encode/"Handling Malformed Data"> for more ways of dealing with this.) Okay, if you insist: the "internal format" is utf8, not UTF-8. (When it's not some other encoding.) =head2 I lost track; what encoding is the internal format really? It's good that you lost track, because you shouldn't depend on the internal format being any specific encoding. But since you asked: by default, the internal format is either ISO-8859-1 (latin-1), or utf8, depending on the history of the string. 
On EBCDIC platforms, this may even be different.

Perl knows how it stored the string internally, and will use that knowledge when you C<encode>. In other words: don't try to find out what the internal encoding for a certain string is, but instead just encode it into the encoding that you want.

=head1 AUTHOR

Juerd Waalboer <#####@juerd.nl>

=head1 SEE ALSO

L<perlunicode>, L<perluniintro>, L<Encode>

=encoding utf8

=head1 NAME

perl5182delta - what is new for perl v5.18.2

=head1 DESCRIPTION

This document describes differences between the 5.18.1 release and the 5.18.2 release.

If you are upgrading from an earlier release such as 5.18.0, first read L<perl5181delta>, which describes differences between 5.18.0 and 5.18.1.

=head1 Modules and Pragmata

=head2 Updated Modules and Pragmata

=over 4

=item *

L<B> has been upgraded from version 1.42_01 to 1.42_02.

The fix for [perl #118525] introduced a regression in the behaviour of C<B::CV::GV>, changing the return value from a C<B::SPECIAL> object on a C<NULL> C<CvGV> to C<undef>. C<B::CV::GV> again returns a C<B::SPECIAL> object in this case. [perl #119413]

=item *

L<B::Concise> has been upgraded from version 0.95 to 0.95_01.

This fixes a bug in dumping unexpected SPECIALs.

=item *

L<English> has been upgraded from version 1.06 to 1.06_01. This fixes an error about the performance of C<$`>, C<$&>, and C<$'>.

=item *

L<File::Glob> has been upgraded from version 1.20 to 1.20_01.

=back

=head1 Documentation

=head2 Changes to Existing Documentation

=over 4

=item *

L<perlrepository> has been restored with a pointer to more useful pages.

=item *

L<perlhack> has been updated with the latest changes from blead.

=back

=head1 Selected Bug Fixes

=over 4

=item *

Perl 5.18.1 introduced a regression along with a bugfix for lexical subs. Some B::SPECIAL results from B::CV::GV became undefs instead. This broke Devel::Cover among other libraries. This has been fixed.
[perl #119351]

=item *

Perl 5.18.0 introduced a regression whereby C<[:^ascii:]>, if used in the same character class as other qualifiers, would fail to match characters in the Latin-1 block. This has been fixed. [perl #120799]

=item *

Perl 5.18.0 introduced a regression when using ->SUPER::method with AUTOLOAD by looking up AUTOLOAD from the current package, rather than the current package’s superclass. This has been fixed. [perl #120694]

=item *

Perl 5.18.0 introduced a regression whereby C<-bareword> was no longer permitted under the C<strict> and C<integer> pragmata when used together. This has been fixed. [perl #120288]

=item *

Previously PerlIOBase_dup didn't check if pushing the new layer succeeded before (optionally) setting the utf8 flag. This could cause segfaults from a null pointer dereference. This has been fixed.

=item *

A buffer overflow with very long identifiers has been fixed.

=item *

A regression from 5.16 in the handling of padranges led to assertion failures if a keyword plugin declined to handle the second ‘my’, but only after creating a padop.

This affected, at least, Devel::CallParser under threaded builds.

This has been fixed.

=item *

The construct C<< $r=qr/.../; /$r/p >> is now handled properly, an issue which had been worsened by changes in 5.18.0. [perl #118213]

=back

=head1 Acknowledgements

Perl 5.18.2 represents approximately 3 months of development since Perl 5.18.1 and contains approximately 980 lines of changes across 39 files from 4 authors.

Perl continues to flourish into its third decade thanks to a vibrant community of users and developers. The following people are known to have contributed the improvements that became Perl 5.18.2:

Craig A. Berry, David Mitchell, Ricardo Signes, Tony Cook.

The list above is almost certainly incomplete as it is automatically generated from version control history. In particular, it does not include the names of the (very much appreciated) contributors who reported issues to the Perl bug tracker.
Many of the changes included in this version originated in the CPAN modules included in Perl's core. We're grateful to the entire CPAN community for helping Perl to flourish.

For a more complete list of all of Perl's historical contributors, please see the F<AUTHORS> file in the Perl source distribution.

=head1 Reporting Bugs

If you find what you think is a bug, you might check the articles recently posted to the comp.lang.perl.misc newsgroup and the perl bug database at http://rt.perl.org/perlbug/ . There may also be information at http://www.perl.org/ , the Perl Home Page.

If you believe you have an unreported bug, please run the L<perlbug> program included with your release. Be sure to trim your bug down to a tiny but sufficient test case. Your bug report, along with the output of C<perl -V>, will be sent off to perlbug@perl.org to be analysed by the Perl porting team.

If the bug you are reporting has security implications, which make it inappropriate to send to a publicly archived mailing list, then please send it to perl5-security-report@perl.org. This points to a closed subscription unarchived mailing list, which includes all the core committers, who will be able to help assess the impact of issues, figure out a resolution, and help co-ordinate the release of patches to mitigate or fix the problem across all platforms on which Perl is supported. Please only use this address for security issues in the Perl core, not for modules independently distributed on CPAN.

=head1 SEE ALSO

The F<Changes> file for an explanation of how to view exhaustive details on what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=cut

=encoding utf8

=head1 NAME

perl5123delta - what is new for perl v5.12.3

=head1 DESCRIPTION

This document describes differences between the 5.12.2 release and the 5.12.3 release.
If you are upgrading from an earlier release such as 5.12.1, first read L<perl5122delta>, which describes differences between 5.12.1 and 5.12.2. The major changes made in 5.12.0 are described in L<perl5120delta>.

=head1 Incompatible Changes

There are no changes intentionally incompatible with 5.12.2. If any exist, they are bugs and reports are welcome.

=head1 Core Enhancements

=head2 C<keys>, C<values> work on arrays

You can now use the C<keys>, C<values>, and C<each> builtin functions on arrays (previously you could only use them on hashes). See L<perlfunc> for details. This is actually a change introduced in perl 5.12.0, but it was missed from that release's perldelta.

=head1 Bug Fixes

"no VERSION" will now correctly deparse with B::Deparse, as will certain constant expressions.

Module::Build should more reliably pass its tests under cygwin.

Lvalue subroutines are again able to return copy-on-write scalars. This had been broken since version 5.10.0.

=head1 Platform Specific Notes

=over 4

=item Solaris

A separate DTrace is now built for miniperl, which means that perl can be compiled with -Dusedtrace on Solaris again.

=item VMS

A number of regressions on VMS have been fixed. In addition to minor cleanup of questionable expressions in F<vms.c>, file permissions should no longer be garbled by the PerlIO layer, and spurious record boundaries should no longer be introduced by the PerlIO layer during output.

For more details and discussion on the latter, see:

    http://www.nntp.perl.org/group/perl.vmsperl/2010/11/msg15419.html

=item VOS

A few very small changes were made to the build process on VOS to better support the platform. Longer-than-32-character filenames are now supported on OpenVOS, and perl now builds properly without IPv6 support.

=back

=head1 Acknowledgements

Perl 5.12.3 represents approximately four months of development since Perl 5.12.2 and contains approximately 2500 lines of changes across 54 files from 16 authors.
Perl continues to flourish into its third decade thanks to a vibrant community of users and developers. The following people are known to have contributed the improvements that became Perl 5.12.3: Craig A. Berry, David Golden, David Leadbeater, Father Chrysostomos, Florian Ragwitz, Jesse Vincent, Karl Williamson, Nick Johnston, Nicolas Kaiser, Paul Green, Rafael Garcia-Suarez, Rainer Tammer, Ricardo Signes, Steffen Mueller, Zsbán Ambrus, Ævar Arnfjörð Bjarmason =head1 Reporting Bugs If you find what you think is a bug, you might check the articles recently posted to the comp.lang.perl.misc newsgroup and the perl bug database at http://rt.perl.org/perlbug/ . There may also be information at http://www.perl.org/ , the Perl Home Page. If you believe you have an unreported bug, please run the B<perlbug> program included with your release. Be sure to trim your bug down to a tiny but sufficient test case. Your bug report, along with the output of C<perl -V>, will be sent off to perlbug@perl.org to be analysed by the Perl porting team. If the bug you are reporting has security implications, which make it inappropriate to send to a publicly archived mailing list, then please send it to perl5-security-report@perl.org. This points to a closed subscription unarchived mailing list, which includes all the core committers, who will be able to help assess the impact of issues, figure out a resolution, and help co-ordinate the release of patches to mitigate or fix the problem across all platforms on which Perl is supported. Please only use this address for security issues in the Perl core, not for modules independently distributed on CPAN. =head1 SEE ALSO The F<Changes> file for an explanation of how to view exhaustive details on what changed. The F<INSTALL> file for how to build Perl. The F<README> file for general stuff. The F<Artistic> and F<Copying> files for copyright information. 
=cut

=head1 NAME

perlpragma - how to write a user pragma

=head1 DESCRIPTION

A pragma is a module which influences some aspect of the compile time or run time behaviour of Perl, such as C<strict> or C<warnings>. With Perl 5.10 you are no longer limited to the built in pragmata; you can now create user pragmata that modify the behaviour of user functions within a lexical scope.

=head1 A basic example

For example, say you need to create a class implementing overloaded mathematical operators, and would like to provide your own pragma that functions much like C<use integer;>. You'd like this code

    use MyMaths;

    my $l = MyMaths->new(1.2);
    my $r = MyMaths->new(3.4);

    print "A: ", $l + $r, "\n";

    use myint;
    print "B: ", $l + $r, "\n";

    {
        no myint;
        print "C: ", $l + $r, "\n";
    }

    print "D: ", $l + $r, "\n";

    no myint;
    print "E: ", $l + $r, "\n";

to give the output

    A: 4.6
    B: 4
    C: 4.6
    D: 4
    E: 4.6

I<i.e.>, where C<use myint;> is in effect, addition operations are forced to integer, whereas by default they are not, with the default behaviour being restored via C<no myint;>.

The minimal implementation of the package C<MyMaths> would be something like this:

    package MyMaths;
    use warnings;
    use strict;
    use myint();
    use overload '+' => sub {
        my ($l, $r) = @_;
        # Pass 1 to check up one call level from here
        if (myint::in_effect(1)) {
            int($$l) + int($$r);
        } else {
            $$l + $$r;
        }
    };

    sub new {
        my ($class, $value) = @_;
        bless \$value, $class;
    }

    1;

Note how we load the user pragma C<myint> with an empty list C<()> to prevent its C<import> being called.
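The effect of an empty import list can be seen with any module. The following is a minimal sketch that uses the core L<POSIX> module purely as a stand-in (it is not part of the C<myint> example): the module is loaded, but nothing is exported into the calling package, so functions must be called by their fully qualified names. The variable names are illustrative only.

```perl
use strict;
use warnings;

# Load POSIX with an empty import list, so POSIX->import is never called.
use POSIX ();

# Nothing was exported into the current package ...
my $floor_imported = defined &main::floor ? 1 : 0;

# ... but the module is loaded, so fully qualified calls still work.
my $rounded = POSIX::floor(3.7);

print "floor imported: $floor_imported\n";
print "POSIX::floor(3.7) = $rounded\n";
```

C<MyMaths> relies on the same behaviour: it needs C<myint::in_effect> to be available, but does not want C<myint> to be switched on inside its own scope.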
The interaction with the Perl compilation happens inside package C<myint>:

    package myint;

    use strict;
    use warnings;

    sub import {
        $^H{"myint/in_effect"} = 1;
    }

    sub unimport {
        $^H{"myint/in_effect"} = 0;
    }

    sub in_effect {
        my $level = shift // 0;
        my $hinthash = (caller($level))[10];
        return $hinthash->{"myint/in_effect"};
    }

    1;

As pragmata are implemented as modules, like any other module, C<use myint;> becomes

    BEGIN {
        require myint;
        myint->import();
    }

and C<no myint;> is

    BEGIN {
        require myint;
        myint->unimport();
    }

Hence the C<import> and C<unimport> routines are called at B<compile time> for the user's code.

User pragmata store their state by writing to the magical hash C<%^H>, hence these two routines manipulate it. The state information in C<%^H> is stored in the optree, and can be retrieved read-only at runtime with C<caller()>, at index 10 of the list of returned results. In the example pragma, retrieval is encapsulated into the routine C<in_effect()>, which takes as parameter the number of call frames to go up to find the value of the pragma in the user's script. This uses C<caller()> to determine the value of C<$^H{"myint/in_effect"}> when each line of the user's script was called, and therefore provide the correct semantics in the subroutine implementing the overloaded addition.

=head1 Key naming

There is only a single C<%^H>, but arbitrarily many modules that want to use its scoping semantics. To avoid stepping on each other's toes, they need to be sure to use different keys in the hash. It is therefore conventional for a module to use only keys that begin with the module's name (the name of its main package) and a "/" character. After this module-identifying prefix, the rest of the key is entirely up to the module: it may include any characters whatsoever. For example, a module C<Foo::Bar> should use keys such as C<Foo::Bar/baz> and C<Foo::Bar/$%/_!>. Modules following this convention all play nicely with each other.
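As a rough illustration of the naming convention together with the lexical scoping of C<%^H>, here is a self-contained sketch. The package C<Foo::Bar>, its key C<Foo::Bar/enabled>, and the helper C<state_here> are hypothetical names chosen for this example. Because everything lives in one file, the pragma is switched on and off with explicit C<BEGIN> blocks, which is what C<use> and C<no> would expand to apart from the C<require>.

```perl
use strict;
use warnings;

# Hypothetical pragma package, defined inline for the sake of the example.
package Foo::Bar;

# The key is prefixed with the package name and a "/".
sub import   { $^H{"Foo::Bar/enabled"} = 1 }
sub unimport { $^H{"Foo::Bar/enabled"} = 0 }

# Look up the hint hash recorded for the code $level frames up.
sub enabled {
    my $level = shift // 0;
    my $hints = (caller($level))[10];
    return $hints && $hints->{"Foo::Bar/enabled"} ? 1 : 0;
}

package main;

# Reports the pragma state at the place state_here() was called from.
sub state_here { Foo::Bar::enabled(1) ? "on" : "off" }

my @seen;
push @seen, state_here();              # before any import
{
    BEGIN { Foo::Bar->import }         # like "use Foo::Bar;"
    push @seen, state_here();
    {
        BEGIN { Foo::Bar->unimport }   # like "no Foo::Bar;"
        push @seen, state_here();
    }
    push @seen, state_here();          # outer block's setting is back
}
push @seen, state_here();              # file scope again

print "@seen\n";
```

Each C<BEGIN> block changes the hint only for the remainder of its enclosing block, so the recorded states alternate as the scopes open and close.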
The Perl core uses a handful of keys in C<%^H> which do not follow this convention, because they predate it. Keys that follow the convention won't conflict with the core's historical keys.

=head1 Implementation details

The optree is shared between threads. This means there is a possibility that the optree will outlive the particular thread (and therefore the interpreter instance) that created it, so true Perl scalars cannot be stored in the optree. Instead a compact form is used, which can only store values that are integers (signed and unsigned), strings or C<undef> - references and floating point values are stringified. If you need to store multiple values or complex structures, you should serialise them, for example with C<pack>. The deletion of a hash key from C<%^H> is recorded, and as ever can be distinguished from the existence of a key with value C<undef> with C<exists>.

B<Don't> attempt to store references to data structures as integers which are retrieved via C<caller> and converted back, as this will not be threadsafe. Accesses would be to the structure without locking (which is not safe for Perl's scalars), and either the structure has to leak, or it has to be freed when its creating thread terminates, which may be before the optree referencing it is deleted, if other threads outlive it.

=encoding utf8

=head1 NAME

perllocale - Perl locale handling (internationalization and localization)

=head1 DESCRIPTION

In the beginning there was ASCII, the "American Standard Code for Information Interchange", which works quite well for Americans with their English alphabet and dollar-denominated currency. But it doesn't work so well even for other English speakers, who may use different currencies, such as the pound sterling (as the symbol for that currency is not in ASCII); and it's hopelessly inadequate for many of the thousands of the world's other languages.
To address these deficiencies, the concept of locales was invented (formally the ISO C, XPG4, POSIX 1.c "locale system"). And applications were and are being written that use the locale mechanism. The process of making such an application take account of its users' preferences in these kinds of matters is called B<internationalization> (often abbreviated as B<i18n>); telling such an application about a particular set of preferences is known as B<localization> (B<l10n>). Perl has been extended to support certain types of locales available in the locale system. This is controlled per application by using one pragma, one function call, and several environment variables. Perl supports single-byte locales that are supersets of ASCII, such as the ISO 8859 ones, and one multi-byte-type locale, UTF-8 ones, described in the next paragraph. Perl doesn't support any other multi-byte locales, such as the ones for East Asian languages. Unfortunately, there are quite a few deficiencies with the design (and often, the implementations) of locales. Unicode was invented (see L<perlunitut> for an introduction to that) in part to address these design deficiencies, and nowadays, there is a series of "UTF-8 locales", based on Unicode. These are locales whose character set is Unicode, encoded in UTF-8. Starting in v5.20, Perl fully supports UTF-8 locales, except for sorting and string comparisons like C<lt> and C<ge>. Starting in v5.26, Perl can handle these reasonably as well, depending on the platform's implementation. However, for earlier releases or for better control, use L<Unicode::Collate>. There are actually two slightly different types of UTF-8 locales: one for Turkic languages and one for everything else. Starting in Perl v5.30, Perl detects Turkic locales by their behaviour, and seamlessly handles both types; previously only the non-Turkic one was supported. 
The name of the locale is ignored; if your system has a C<tr_TR.UTF-8> locale and it doesn't behave like a Turkic locale, perl will treat it like a non-Turkic locale.

Perl continues to support the old non UTF-8 locales as well. There are currently no UTF-8 locales for EBCDIC platforms.

(Unicode is also creating C<CLDR>, the "Common Locale Data Repository", L<http://cldr.unicode.org/> which includes more types of information than are available in the POSIX locale system. At the time of this writing, there was no CPAN module that provides access to this XML-encoded data. However, it is possible to compute the POSIX locale data from them, and earlier CLDR versions had these already extracted for you as UTF-8 locales L<http://unicode.org/Public/cldr/2.0.1/>.)

=head1 WHAT IS A LOCALE

A locale is a set of data that describes various aspects of how various communities in the world categorize their world. These categories are broken down into the following types (some of which include a brief note here):

=over

=item Category C<LC_NUMERIC>: Numeric formatting

This indicates how numbers should be formatted for human readability, for example the character used as the decimal point.

=item Category C<LC_MONETARY>: Formatting of monetary amounts

Z<>

=item Category C<LC_TIME>: Date/Time formatting

Z<>

=item Category C<LC_MESSAGES>: Error and other messages

This is used by Perl itself only for accessing operating system error messages via L<$!|perlvar/$ERRNO> and L<$^E|perlvar/$EXTENDED_OS_ERROR>.

=item Category C<LC_COLLATE>: Collation

This indicates the ordering of letters for comparison and sorting. In Latin alphabets, for example, "b" generally follows "a".

=item Category C<LC_CTYPE>: Character Types

This indicates, for example, if a character is an uppercase letter.

=item Other categories

Some platforms have other categories, dealing with such things as measurement units and paper sizes.
None of these are used directly by Perl, but outside operations that Perl interacts with may use these. See L</Not within the scope of "use locale"> below. =back More details on the categories used by Perl are given below in L</LOCALE CATEGORIES>. Together, these categories go a long way towards being able to customize a single program to run in many different locations. But there are deficiencies, so keep reading. =head1 PREPARING TO USE LOCALES Perl itself (outside the L<POSIX> module) will not use locales unless specifically requested to (but again note that Perl may interact with code that does use them). Even if there is such a request, B<all> of the following must be true for it to work properly: =over 4 =item * B<Your operating system must support the locale system>. If it does, you should find that the C<setlocale()> function is a documented part of its C library. =item * B<Definitions for locales that you use must be installed>. You, or your system administrator, must make sure that this is the case. The available locales, the location in which they are kept, and the manner in which they are installed all vary from system to system. Some systems provide only a few, hard-wired locales and do not allow more to be added. Others allow you to add "canned" locales provided by the system supplier. Still others allow you or the system administrator to define and add arbitrary locales. (You may have to ask your supplier to provide canned locales that are not delivered with your operating system.) Read your system documentation for further illumination. =item * B<Perl must believe that the locale system is supported>. If it does, C<perl -V:d_setlocale> will say that the value for C<d_setlocale> is C<define>. 
=back If you want a Perl application to process and present your data according to a particular locale, the application code should include the S<C<use locale>> pragma (see L</The "use locale" pragma>) where appropriate, and B<at least one> of the following must be true: =over 4 =item 1 B<The locale-determining environment variables (see L</"ENVIRONMENT">) must be correctly set up> at the time the application is started, either by yourself or by whomever set up your system account; or =item 2 B<The application must set its own locale> using the method described in L</The setlocale function>. =back =head1 USING LOCALES =head2 The C<"use locale"> pragma Starting in Perl 5.28, this pragma may be used in L<multi-threaded|threads> applications on systems that have thread-safe locale ability. Some caveats apply, see L</Multi-threaded> below. On systems without this capability, or in earlier Perls, do NOT use this pragma in scripts that have multiple L<threads|threads> active. The locale in these cases is not local to a single thread. Another thread may change the locale at any time, which could cause at a minimum that a given thread is operating in a locale it isn't expecting to be in. On some platforms, segfaults can also occur. The locale change need not be explicit; some operations cause perl to change the locale itself. You are vulnerable simply by having done a S<C<"use locale">>. By default, Perl itself (outside the L<POSIX> module) ignores the current locale. The S<C<use locale>> pragma tells Perl to use the current locale for some operations. Starting in v5.16, there are optional parameters to this pragma, described below, which restrict which operations are affected by it. The current locale is set at execution time by L<setlocale()|/The setlocale function> described below. 
If that function hasn't yet been called in the course of the program's execution, the current locale is that which was determined by the L</"ENVIRONMENT"> in effect at the start of the program. If there is no valid environment, the current locale is whatever the system default has been set to. On POSIX systems, it is likely, but not necessarily, the "C" locale. On Windows, the default is set via the computer's S<C<Control Panel-E<gt>Regional and Language Options>> (or its current equivalent). The operations that are affected by locale are: =over 4 =item B<Not within the scope of C<"use locale">> Only certain operations (all originating outside Perl) should be affected, as follows: =over 4 =item * The current locale is used when going outside of Perl with operations like L<system()|perlfunc/system LIST> or L<qxE<sol>E<sol>|perlop/qxE<sol>STRINGE<sol>>, if those operations are locale-sensitive. =item * Also Perl gives access to various C library functions through the L<POSIX> module. Some of those functions are always affected by the current locale. For example, C<POSIX::strftime()> uses C<LC_TIME>; C<POSIX::strtod()> uses C<LC_NUMERIC>; C<POSIX::strcoll()> and C<POSIX::strxfrm()> use C<LC_COLLATE>. All such functions will behave according to the current underlying locale, even if that locale isn't exposed to Perl space. This applies as well to L<I18N::Langinfo>. =item * XS modules for all categories but C<LC_NUMERIC> get the underlying locale, and hence any C library functions they call will use that underlying locale. For more discussion, see L<perlxs/CAVEATS>. =back Note that all C programs (including the perl interpreter, which is written in C) always have an underlying locale. That locale is the "C" locale unless changed by a call to L<setlocale()|/The setlocale function>. When Perl starts up, it changes the underlying locale to the one which is indicated by the L</ENVIRONMENT>. 
When using the L<POSIX> module or writing XS code, it is important to keep in mind that the underlying locale may be something other than "C", even if the program hasn't explicitly changed it. Z<> =item B<Lingering effects of C<S<use locale>>> Certain Perl operations that are set-up within the scope of a C<use locale> retain that effect even outside the scope. These include: =over 4 =item * The output format of a L<write()|perlfunc/write> is determined by an earlier format declaration (L<perlfunc/format>), so whether or not the output is affected by locale is determined by if the C<format()> is within the scope of a C<use locale>, not whether the C<write()> is. =item * Regular expression patterns can be compiled using L<qrE<sol>E<sol>|perlop/qrE<sol>STRINGE<sol>msixpodualn> with actual matching deferred to later. Again, it is whether or not the compilation was done within the scope of C<use locale> that determines the match behavior, not if the matches are done within such a scope or not. =back Z<> =item B<Under C<"use locale";>> =over 4 =item * All the above operations =item * B<Format declarations> (L<perlfunc/format>) and hence any subsequent C<write()>s use C<LC_NUMERIC>. =item * B<stringification and output> use C<LC_NUMERIC>. These include the results of C<print()>, C<printf()>, C<say()>, and C<sprintf()>. =item * B<The comparison operators> (C<lt>, C<le>, C<cmp>, C<ge>, and C<gt>) use C<LC_COLLATE>. C<sort()> is also affected if used without an explicit comparison function, because it uses C<cmp> by default. B<Note:> C<eq> and C<ne> are unaffected by locale: they always perform a char-by-char comparison of their scalar operands. What's more, if C<cmp> finds that its operands are equal according to the collation sequence specified by the current locale, it goes on to perform a char-by-char comparison, and only returns I<0> (equal) if the operands are char-for-char identical. 
If you really want to know whether two strings--which C<eq> and C<cmp> may consider different--are equal as far as collation in the locale is concerned, see the discussion in L</Category C<LC_COLLATE>: Collation>. =item * B<Regular expressions and case-modification functions> (C<uc()>, C<lc()>, C<ucfirst()>, and C<lcfirst()>) use C<LC_CTYPE> =item * B<The variables L<$!|perlvar/$ERRNO>> (and its synonyms C<$ERRNO> and C<$OS_ERROR>) B<and L<$^E|perlvar/$EXTENDED_OS_ERROR>> (and its synonym C<$EXTENDED_OS_ERROR>) when used as strings use C<LC_MESSAGES>. =back =back The default behavior is restored with the S<C<no locale>> pragma, or upon reaching the end of the block enclosing C<use locale>. Note that C<use locale> calls may be nested, and that what is in effect within an inner scope will revert to the outer scope's rules at the end of the inner scope. The string result of any operation that uses locale information is tainted, as it is possible for a locale to be untrustworthy. See L</"SECURITY">. Starting in Perl v5.16 in a very limited way, and more generally in v5.22, you can restrict which category or categories are enabled by this particular instance of the pragma by adding parameters to it. For example, use locale qw(:ctype :numeric); enables locale awareness within its scope of only those operations (listed above) that are affected by C<LC_CTYPE> and C<LC_NUMERIC>. The possible categories are: C<:collate>, C<:ctype>, C<:messages>, C<:monetary>, C<:numeric>, C<:time>, and the pseudo category C<:characters> (described below). Thus you can say use locale ':messages'; and only L<$!|perlvar/$ERRNO> and L<$^E|perlvar/$EXTENDED_OS_ERROR> will be locale aware. Everything else is unaffected. Since Perl doesn't currently do anything with the C<LC_MONETARY> category, specifying C<:monetary> does effectively nothing. 
Some systems have other categories, such as C<LC_PAPER>, but Perl also doesn't do anything with them, and there is no way to specify them in this pragma's arguments.

You can also easily say to use all categories but one, by either, for example,

    use locale ':!ctype';
    use locale ':not_ctype';

both of which mean to enable locale awareness of all categories but C<LC_CTYPE>. Only one category argument may be specified in a S<C<use locale>> if it is of the negated form.

Prior to v5.22 only one form of the pragma with arguments is available:

    use locale ':not_characters';

(and you have to say C<not_>; you can't use the bang C<!> form). This pseudo category is a shorthand for specifying both C<:collate> and C<:ctype>. Hence, in the negated form, it is nearly the same thing as saying

    use locale qw(:messages :monetary :numeric :time);

We use the term "nearly", because C<:not_characters> also turns on S<C<use feature 'unicode_strings'>> within its scope. This form is less useful in v5.20 and later, and is described fully in L</Unicode and UTF-8>, but briefly, it tells Perl to not use the character portions of the locale definition, that is the C<LC_CTYPE> and C<LC_COLLATE> categories. Instead it will use the native character set (extended by Unicode). When using this parameter, you are responsible for getting the external character set translated into the native/Unicode one (which it already will be if it is one of the increasingly popular UTF-8 locales). There are convenient ways of doing this, as described in L</Unicode and UTF-8>.

=head2 The setlocale function

WARNING! Prior to Perl 5.28 or on a system that does not support thread-safe locale operations, do NOT use this function in a L<thread|threads>. The locale will change in all other threads at the same time, and should your thread get paused by the operating system, and another started, that thread will not have the locale it is expecting.
On some platforms, there can be a race leading to segfaults if two threads call this function nearly simultaneously. This warning does not apply on unthreaded builds, or on perls where C<${^SAFE_LOCALES}> exists and is non-zero; namely Perl 5.28 and later unthreaded or compiled to be locale-thread-safe.

You can switch locales as often as you wish at run time with the C<POSIX::setlocale()> function:

    # Import locale-handling tool set from POSIX module.
    # This example uses: setlocale -- the function call
    #                    LC_CTYPE -- explained below
    # (Showing the testing for success/failure of operations is
    # omitted in these examples to avoid distracting from the main
    # point)

    use POSIX qw(locale_h);
    use locale;
    my $old_locale;

    # query and save the old locale
    $old_locale = setlocale(LC_CTYPE);

    setlocale(LC_CTYPE, "fr_CA.ISO8859-1");
    # LC_CTYPE now in locale "French, Canada, codeset ISO 8859-1"

    setlocale(LC_CTYPE, "");
    # LC_CTYPE now reset to the default defined by the
    # LC_ALL/LC_CTYPE/LANG environment variables, or to the system
    # default.  See below for documentation.

    # restore the old locale
    setlocale(LC_CTYPE, $old_locale);

The first argument of C<setlocale()> gives the B<category>, the second the B<locale>. The category tells in what aspect of data processing you want to apply locale-specific rules. Category names are discussed in L</LOCALE CATEGORIES> and L</"ENVIRONMENT">. The locale is the name of a collection of customization information corresponding to a particular combination of language, country or territory, and codeset. Read on for hints on the naming of locales: not all systems name locales as in the example.

If no second argument is provided and the category is something other than C<LC_ALL>, the function returns a string naming the current locale for the category.
You can use this value as the second argument in a subsequent call to C<setlocale()>, B<but> on some platforms the string is opaque, not something that most people would be able to decipher as to what locale it means. If no second argument is provided and the category is C<LC_ALL>, the result is implementation-dependent. It may be a string of concatenated locale names (separator also implementation-dependent) or a single locale name. Please consult your L<setlocale(3)> man page for details. If a second argument is given and it corresponds to a valid locale, the locale for the category is set to that value, and the function returns the now-current locale value. You can then use this in yet another call to C<setlocale()>. (In some implementations, the return value may sometimes differ from the value you gave as the second argument--think of it as an alias for the value you gave.) As the example shows, if the second argument is an empty string, the category's locale is returned to the default specified by the corresponding environment variables. Generally, this results in a return to the default that was in force when Perl started up: changes to the environment made by the application after startup may or may not be noticed, depending on your system's C library. Note that when a form of C<use locale> that doesn't include all categories is specified, Perl ignores the excluded categories. If C<setlocale()> fails for some reason (for example, an attempt to set to a locale unknown to the system), the locale for the category is not changed, and the function returns C<undef>. Starting in Perl 5.28, on multi-threaded perls compiled on systems that implement POSIX 2008 thread-safe locale operations, this function doesn't actually call the system C<setlocale>. Instead those thread-safe operations are used to emulate the C<setlocale> function, but in a thread-safe manner. 
You can force the thread-safe locale operations to always be used (if available) by recompiling perl with -Accflags='-DUSE_THREAD_SAFE_LOCALE' added to your call to F<Configure>. For further information about the categories, consult L<setlocale(3)>. =head2 Multi-threaded operation Beginning in Perl 5.28, multi-threaded locale operation is supported on systems that implement either the POSIX 2008 or Windows-specific thread-safe locale operations. Many modern systems, such as various Unix variants and Darwin do have this. You can tell if using locales is safe on your system by looking at the read-only boolean variable C<${^SAFE_LOCALES}>. The value is 1 if the perl is not threaded, or if it is using thread-safe locale operations. Thread-safe operations are supported in Windows starting in Visual Studio 2005, and in systems compatible with POSIX 2008. Some platforms claim to support POSIX 2008, but have buggy implementations, so that the hints files for compiling to run on them turn off attempting to use thread-safety. C<${^SAFE_LOCALES}> will be 0 on them. Be aware that writing a multi-threaded application will not be portable to a platform which lacks the native thread-safe locale support. On systems that do have it, you automatically get this behavior for threaded perls, without having to do anything. If for some reason, you don't want to use this capability (perhaps the POSIX 2008 support is buggy on your system), you can manually compile Perl to use the old non-thread-safe implementation by passing the argument C<-Accflags='-DNO_THREAD_SAFE_LOCALE'> to F<Configure>. Except on Windows, this will continue to use certain of the POSIX 2008 functions in some situations. If these are buggy, you can pass the following to F<Configure> instead or additionally: C<-Accflags='-DNO_POSIX_2008_LOCALE'>. This will also keep the code from using thread-safe locales. C<${^SAFE_LOCALES}> will be 0 on systems that turn off the thread-safe operations. 
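
As a rough illustration of the paragraphs above, a program might consult C<${^SAFE_LOCALES}> before deciding whether it is reasonable to call C<setlocale()> from threads. The following is only a sketch: the messages and the variable names C<$threaded> and C<$safe> are made up for this example, and on perls older than 5.28 the variable is assumed simply to be undefined.

```perl
use strict;
use warnings;
use Config;

# Is this perl built with interpreter threads at all?
my $threaded = $Config{useithreads} ? 1 : 0;

# ${^SAFE_LOCALES} is documented for Perl 5.28 and later; on older
# perls we assume it reads as undef and treat that as "unknown".
my $safe = ${^SAFE_LOCALES};

if (!defined $safe) {
    print "No \${^SAFE_LOCALES} here; ",
          ($threaded ? "avoid setlocale() in threads\n"
                     : "but this perl is unthreaded\n");
}
elsif ($safe) {
    print "Locale handling is thread-safe, or this perl is unthreaded\n";
}
else {
    print "Thread-safe locales are off; avoid setlocale() in threads\n";
}
```

A check like this only reports how the running perl was built; it cannot make an unsafe platform safe.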
Normally on unthreaded builds, the traditional C<setlocale()> is used and not the thread-safe locale functions. You can force the use of these on systems that have them by passing C<-Accflags='-DUSE_THREAD_SAFE_LOCALE'> to F<Configure>.

The initial program is started up using the locale specified from the environment, as described in L</ENVIRONMENT>. All newly created threads start with C<LC_ALL> set to C<"C">. Each thread may use C<POSIX::setlocale()> to query or switch its locale at any time, without affecting any other thread. All locale-dependent operations automatically use their thread's locale.

This should be completely transparent to any applications written entirely in Perl (minus a few rarely encountered caveats given in the L</Multi-threaded> section). Information for XS module writers is given in L<perlxs/Locale-aware XS code>.

=head2 Finding locales

For locales available in your system, consult also L<setlocale(3)> to see whether it leads to the list of available locales (search for the I<SEE ALSO> section). If that fails, try the following command lines:

    locale -a

    nlsinfo

    ls /usr/lib/nls/loc

    ls /usr/lib/locale

    ls /usr/lib/nls

    ls /usr/share/locale

and see whether they list something resembling these

    en_US.ISO8859-1     de_DE.ISO8859-1     ru_RU.ISO8859-5
    en_US.iso88591      de_DE.iso88591      ru_RU.iso88595
    en_US               de_DE               ru_RU
    en                  de                  ru
    english             german              russian
    english.iso88591    german.iso88591     russian.iso88595
    english.roman8                          russian.koi8r

Sadly, even though the calling interface for C<setlocale()> has been standardized, names of locales and the directories where the configuration resides have not been. The basic form of the name is I<language_territory>B<.>I<codeset>, but the latter parts after I<language> are not always present. The I<language> and I<country> are usually from the standards B<ISO 639> and B<ISO 3166>, the two-letter abbreviations for the languages and the countries of the world, respectively.
The I<codeset> part often mentions some B<ISO 8859> character set, the Latin codesets. For example, C<ISO 8859-1> is the so-called "Western European codeset" that can be used to encode most Western European languages adequately. Again, there are several ways to write even the name of that one standard. Lamentably. Two special locales are worth particular mention: "C" and "POSIX". Currently these are effectively the same locale: the difference is mainly that the first one is defined by the C standard, the second by the POSIX standard. They define the B<default locale> in which every program starts in the absence of locale information in its environment. (The I<default> default locale, if you will.) Its language is (American) English and its character codeset ASCII or, rarely, a superset thereof (such as the "DEC Multinational Character Set (DEC-MCS)"). B<Warning>. The C locale delivered by some vendors may not actually exactly match what the C standard calls for. So beware. B<NOTE>: Not all systems have the "POSIX" locale (not all systems are POSIX-conformant), so use "C" when you need explicitly to specify this default locale. =head2 LOCALE PROBLEMS You may encounter the following warning message at Perl startup: perl: warning: Setting locale failed. perl: warning: Please check that your locale settings: LC_ALL = "En_US", LANG = (unset) are supported and installed on your system. perl: warning: Falling back to the standard locale ("C"). This means that your locale settings had C<LC_ALL> set to "En_US" and LANG exists but has no value. Perl tried to believe you but could not. Instead, Perl gave up and fell back to the "C" locale, the default locale that is supposed to work no matter what. (On Windows, it first tries falling back to the system default locale.) 
This usually means your locale settings were wrong, they mention locales your system has never heard of, or the locale installation in your system has problems (for example, some system files are broken or missing). There are quick and temporary fixes to these problems, as well as more thorough and lasting fixes. =head2 Testing for broken locales If you are building Perl from source, the Perl test suite file F<lib/locale.t> can be used to test the locales on your system. Setting the environment variable C<PERL_DEBUG_FULL_TEST> to 1 will cause it to output detailed results. For example, on Linux, you could say PERL_DEBUG_FULL_TEST=1 ./perl -T -Ilib lib/locale.t > locale.log 2>&1 Besides many other tests, it will test every locale it finds on your system to see if they conform to the POSIX standard. If any have errors, it will include a summary near the end of the output of which locales passed all its tests, and which failed, and why. =head2 Temporarily fixing locale problems The two quickest fixes are either to render Perl silent about any locale inconsistencies or to run Perl under the default locale "C". Perl's moaning about locale problems can be silenced by setting the environment variable C<PERL_BADLANG> to "0" or "". This method really just sweeps the problem under the carpet: you tell Perl to shut up even when Perl sees that something is wrong. Do not be surprised if later something locale-dependent misbehaves. Perl can be run under the "C" locale by setting the environment variable C<LC_ALL> to "C". This method is perhaps a bit more civilized than the C<PERL_BADLANG> approach, but setting C<LC_ALL> (or other locale variables) may affect other programs as well, not just Perl. In particular, external programs run from within Perl will see these changes. If you make the new settings permanent (read on), all programs you run see the changes. 
See L</"ENVIRONMENT"> for the full list of relevant environment variables and L</"USING LOCALES"> for their effects in Perl. Effects in other programs are easily deducible. For example, the variable C<LC_COLLATE> may well affect your B<sort> program (or whatever the program that arranges "records" alphabetically in your system is called). You can test out changing these variables temporarily, and if the new settings seem to help, put those settings into your shell startup files. Consult your local documentation for the exact details. For Bourne-like shells (B<sh>, B<ksh>, B<bash>, B<zsh>): LC_ALL=en_US.ISO8859-1 export LC_ALL This assumes that we saw the locale "en_US.ISO8859-1" using the commands discussed above. We decided to try that instead of the above faulty locale "En_US"--and in Cshish shells (B<csh>, B<tcsh>) setenv LC_ALL en_US.ISO8859-1 or if you have the "env" application you can do (in any shell) env LC_ALL=en_US.ISO8859-1 perl ... If you do not know what shell you have, consult your local helpdesk or the equivalent. =head2 Permanently fixing locale problems The slower but superior fixes are when you may be able to yourself fix the misconfiguration of your own environment variables. The mis(sing)configuration of the whole system's locales usually requires the help of your friendly system administrator. First, see earlier in this document about L</Finding locales>. That tells how to find which locales are really supported--and more importantly, installed--on your system. In our example error message, environment variables affecting the locale are listed in the order of decreasing importance (and unset variables do not matter). Therefore, having LC_ALL set to "En_US" must have been the bad choice, as shown by the error message. First try fixing locale settings listed first. 
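
One way to check from Perl itself whether a locale name is really installed is to see whether C<POSIX::setlocale()> accepts it, since the function returns C<undef> for a locale unknown to the system. The sketch below assumes a hypothetical list of candidate names in C<@candidates>; substitute the names that the commands in L</Finding locales> show on your system. The hash C<%accepted> is likewise just an illustration.

```perl
use strict;
use warnings;
use POSIX qw(setlocale LC_CTYPE);

# Hypothetical candidate names; replace with ones your system lists.
my @candidates = ('En_US', 'en_US.ISO8859-1', 'en_US.UTF-8', 'C');

# Remember the current LC_CTYPE locale so it can be restored afterwards.
my $saved = setlocale(LC_CTYPE);

my %accepted;
for my $name (@candidates) {
    # setlocale() returns undef if the system does not know the locale.
    my $result = setlocale(LC_CTYPE, $name);
    $accepted{$name} = defined $result ? 1 : 0;
    printf "%-18s %s\n", $name, defined $result ? "ok" : "not available";
}

# Put things back the way they were.
setlocale(LC_CTYPE, $saved) if defined $saved;
```

A name reported as "ok" is a reasonable value to try in C<LC_ALL>. The thread-safety caveats in L</The setlocale function> still apply to a probe like this.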
Second, if using the listed commands you see something B<exactly> (prefix matches do not count and case usually counts) like "En_US" without the quotes, then you should be okay because you are using a locale name that should be installed and available in your system. In this case, see L</Permanently fixing your system's locale configuration>. =head2 Permanently fixing your system's locale configuration This is when you see something like: perl: warning: Please check that your locale settings: LC_ALL = "En_US", LANG = (unset) are supported and installed on your system. but then cannot see that "En_US" listed by the above-mentioned commands. You may see things like "en_US.ISO8859-1", but that isn't the same. In this case, try running under a locale that you can list and which somehow matches what you tried. The rules for matching locale names are a bit vague because standardization is weak in this area. See again the L</Finding locales> about general rules. =head2 Fixing system locale configuration Contact a system administrator (preferably your own) and report the exact error message you get, and ask them to read this same documentation you are now reading. They should be able to check whether there is something wrong with the locale configuration of the system. The L</Finding locales> section is unfortunately a bit vague about the exact commands and places because these things are not that standardized. =head2 The localeconv function The C<POSIX::localeconv()> function allows you to get particulars of the locale-dependent numeric formatting information specified by the current underlying C<LC_NUMERIC> and C<LC_MONETARY> locales (regardless of whether called from within the scope of C<S<use locale>> or not). (If you just want the name of the current locale for a particular category, use C<POSIX::setlocale()> with a single parameter--see L</The setlocale function>.) 
    use POSIX qw(locale_h);

    # Get a reference to a hash of locale-dependent info
    $locale_values = localeconv();

    # Output sorted list of the values
    for (sort keys %$locale_values) {
        printf "%-20s = %s\n", $_, $locale_values->{$_}
    }

C<localeconv()> takes no arguments, and returns B<a reference to> a hash. The keys of this hash are variable names for formatting, such as C<decimal_point> and C<thousands_sep>. The values are the corresponding, er, values. See L<POSIX/localeconv> for a longer example listing the categories an implementation might be expected to provide; some provide more and others fewer. You don't need an explicit C<use locale>, because C<localeconv()> always observes the current locale.

Here's a simple-minded example program that rewrites its command-line parameters as integers correctly formatted in the current locale:

    use POSIX qw(locale_h);

    # Get some of locale's numeric formatting parameters
    my ($thousands_sep, $grouping) =
            @{localeconv()}{'thousands_sep', 'grouping'};

    # Apply defaults if values are missing
    $thousands_sep = ',' unless $thousands_sep;

    # grouping and mon_grouping are packed lists
    # of small integers (characters) telling the
    # grouping (thousand_seps and mon_thousand_seps
    # being the group dividers) of numbers and
    # monetary quantities.  The integers' meanings:
    # 255 means no more grouping, 0 means repeat
    # the previous grouping, 1-254 means use that
    # as the current grouping.  Grouping goes from
    # right to left (low to high digits).  In the
    # below we cheat slightly by never using anything
    # else than the first grouping (whatever that is).
    if ($grouping) {
        @grouping = unpack("C*", $grouping);
    } else {
        @grouping = (3);
    }

    # Format command line params for current locale
    for (@ARGV) {
        $_ = int;    # Chop non-integer part
        # \Q...\E keeps a separator such as "." from acting as a
        # regular expression metacharacter
        1 while
        s/(\d)(\d{$grouping[0]}($|\Q$thousands_sep\E))/$1$thousands_sep$2/;
        print "$_";
    }
    print "\n";

Note that if the platform doesn't have C<LC_NUMERIC> and/or C<LC_MONETARY> available or enabled, the corresponding elements of the hash will be missing.

=head2 I18N::Langinfo

Another interface for querying locale-dependent information is the C<I18N::Langinfo::langinfo()> function.

The following example will import the C<langinfo()> function itself and three constants to be used as arguments to C<langinfo()>: a constant for the abbreviated first day of the week (the numbering starts from Sunday = 1) and two more constants for the affirmative and negative answers for a yes/no question in the current locale.

    use I18N::Langinfo qw(langinfo ABDAY_1 YESSTR NOSTR);

    # Pass the imported constants themselves, not their names as strings
    my ($abday_1, $yesstr, $nostr)
                = map { langinfo($_) } (ABDAY_1, YESSTR, NOSTR);

    print "$abday_1? [$yesstr/$nostr] ";

In other words, in the "C" (or English) locale the above will probably print something like:

    Sun? [yes/no]

See L<I18N::Langinfo> for more information.

=head1 LOCALE CATEGORIES

The following subsections describe basic locale categories. Beyond these, some combination categories allow manipulation of more than one basic category at a time. See L</"ENVIRONMENT"> for a discussion of these.

=head2 Category C<LC_COLLATE>: Collation: Text Comparisons and Sorting

In the scope of a S<C<use locale>> form that includes collation, Perl looks to the C<LC_COLLATE> environment variable to determine the application's notions on collation (ordering) of characters. For example, "b" follows "a" in Latin alphabets, but where do "E<aacute>" and "E<aring>" belong? And while "color" follows "chocolate" in English, what about in traditional Spanish?

The following collations all make sense and you may meet any of them if you C<"use locale">.
A B C D E a b c d e A a B b C c D d E e a A b B c C d D e E a b c d e A B C D E Here is a code snippet to tell what "word" characters are in the current locale, in that locale's order: use locale; print +(sort grep /\w/, map { chr } 0..255), "\n"; Compare this with the characters that you see and their order if you state explicitly that the locale should be ignored: no locale; print +(sort grep /\w/, map { chr } 0..255), "\n"; This machine-native collation (which is what you get unless S<C<use locale>> has appeared earlier in the same block) must be used for sorting raw binary data, whereas the locale-dependent collation of the first example is useful for natural text. As noted in L</USING LOCALES>, C<cmp> compares according to the current collation locale when C<use locale> is in effect, but falls back to a char-by-char comparison for strings that the locale says are equal. You can use C<POSIX::strcoll()> if you don't want this fall-back: use POSIX qw(strcoll); $equal_in_locale = !strcoll("space and case ignored", "SpaceAndCaseIgnored"); C<$equal_in_locale> will be true if the collation locale specifies a dictionary-like ordering that ignores space characters completely and which folds case. Perl uses the platform's C library collation functions C<strcoll()> and C<strxfrm()>. That means you get whatever they give. On some platforms, these functions work well on UTF-8 locales, giving a reasonable default collation for the code points that are important in that locale. (And if they aren't working well, the problem may only be that the locale definition is deficient, so can be fixed by using a better definition file. Unicode's definitions (see L</Freely available locale definitions>) provide reasonable UTF-8 locale collation definitions.) Starting in Perl v5.26, Perl's use of these functions has been made more seamless. This may be sufficient for your needs. 
For more control, and to make sure strings containing any code point (not just the ones important in the locale) collate properly, the L<Unicode::Collate> module is suggested. In non-UTF-8 locales (hence single byte), code points above 0xFF are technically invalid. But if present, again starting in v5.26, they will collate to the same position as the highest valid code point does. This generally gives good results, but the collation order may be skewed if the valid code point gets special treatment when it forms particular sequences with other characters as defined by the locale. When two strings collate identically, the code point order is used as a tie breaker. If Perl detects that there are problems with the locale collation order, it reverts to using non-locale collation rules for that locale. If you have a single string that you want to check for "equality in locale" against several others, you might think you could gain a little efficiency by using C<POSIX::strxfrm()> in conjunction with C<eq>: use POSIX qw(strxfrm); $xfrm_string = strxfrm("Mixed-case string"); print "locale collation ignores spaces\n" if $xfrm_string eq strxfrm("Mixed-casestring"); print "locale collation ignores hyphens\n" if $xfrm_string eq strxfrm("Mixedcase string"); print "locale collation ignores case\n" if $xfrm_string eq strxfrm("mixed-case string"); C<strxfrm()> takes a string and maps it into a transformed string for use in char-by-char comparisons against other transformed strings during collation. "Under the hood", locale-affected Perl comparison operators call C<strxfrm()> for both operands, then do a char-by-char comparison of the transformed strings. By calling C<strxfrm()> explicitly and using a non locale-affected comparison, the example attempts to save a couple of transformations. 
But in fact, it doesn't save anything: Perl magic (see L<perlguts/Magic Variables>) creates the transformed version of a string the first time it's needed in a comparison, then keeps this version around in case it's needed again. An example rewritten the easy way with C<cmp> runs just about as fast. It also copes with null characters embedded in strings; if you call C<strxfrm()> directly, it treats the first null it finds as a terminator. Don't expect the transformed strings it produces to be portable across systems--or even from one revision of your operating system to the next. In short, don't call C<strxfrm()> directly: let Perl do it for you. Note: C<use locale> isn't shown in some of these examples because it isn't needed: C<strcoll()> and C<strxfrm()> are POSIX functions which use the standard system-supplied C<libc> functions that always obey the current C<LC_COLLATE> locale. =head2 Category C<LC_CTYPE>: Character Types In the scope of a S<C<use locale>> form that includes C<LC_CTYPE>, Perl obeys the C<LC_CTYPE> locale setting. This controls the application's notion of which characters are alphabetic, numeric, punctuation, I<etc>. This affects Perl's C<\w> regular expression metanotation, which stands for alphanumeric characters--that is, alphabetic, numeric, and the platform's native underscore. (Consult L<perlre> for more information about regular expressions.) Thanks to C<LC_CTYPE>, depending on your locale setting, characters like "E<aelig>", "E<eth>", "E<szlig>", and "E<oslash>" may be understood as C<\w> characters. It also affects things like C<\s>, C<\D>, and the POSIX character classes, like C<[[:graph:]]>. (See L<perlrecharclass> for more information on all these.) The C<LC_CTYPE> locale also provides the map used in transliterating characters between lower and uppercase. 
This affects the case-mapping functions--C<fc()>, C<lc()>, C<lcfirst()>, C<uc()>, and C<ucfirst()>; case-mapping interpolation with C<\F>, C<\l>, C<\L>, C<\u>, or C<\U> in double-quoted strings and C<s///> substitutions; and case-insensitive regular expression pattern matching using the C<i> modifier.

Starting in v5.20, Perl supports UTF-8 locales for C<LC_CTYPE>, but otherwise Perl only supports single-byte locales, such as the ISO 8859 series. This means that wide character locales, for example for Asian languages, are not well-supported. Use of these locales may cause core dumps. If the platform has the capability for Perl to detect such a locale, starting in Perl v5.22, L<Perl will warn, default enabled|warnings/Category Hierarchy>, using the C<locale> warning category, whenever such a locale is switched into. The UTF-8 locale support is actually a superset of POSIX locales, because it is really full Unicode behavior as if no C<LC_CTYPE> locale were in effect at all (except for tainting; see L</SECURITY>). POSIX locales, even UTF-8 ones, are lacking certain concepts in Unicode, such as the idea that changing the case of a character could expand to be more than one character. Perl in a UTF-8 locale will give you that expansion. Prior to v5.20, Perl treated a UTF-8 locale on some platforms like an ISO 8859-1 one, with some restrictions, and on other platforms more like the "C" locale. For releases v5.16 and v5.18, C<S<use locale ':not_characters'>> could be used as a workaround for this (see L</Unicode and UTF-8>).

Note that there are quite a few things that are unaffected by the current locale. Any literal character is the native character for the given platform. Hence 'A' means the character at code point 65 on ASCII platforms, and 193 on EBCDIC. That may or may not be an 'A' in the current locale, if that locale even has an 'A'. Similarly, all the escape sequences for particular characters, C<\n> for example, always mean the platform's native one.
This means, for example, that C<\N> in regular expressions (every character but new-line) works on the platform character set. Starting in v5.22, Perl will by default warn when switching into a locale that redefines any ASCII printable character (plus C<\t> and C<\n>) into a different class than expected. This is likely to happen on modern locales only on EBCDIC platforms, where, for example, a CCSID 0037 locale on a CCSID 1047 machine moves C<"[">, but it can happen on ASCII platforms with the ISO 646 and other 7-bit locales that are essentially obsolete. Things may still work, depending on what features of Perl are used by the program. For example, in the example from above where C<"|"> becomes a C<\w>, and there are no regular expressions where this matters, the program may still work properly. The warning lists all the characters that it can determine could be adversely affected. B<Note:> A broken or malicious C<LC_CTYPE> locale definition may result in clearly ineligible characters being considered to be alphanumeric by your application. For strict matching of (mundane) ASCII letters and digits--for example, in command strings--locale-aware applications should use C<\w> with the C</a> regular expression modifier. See L</"SECURITY">. =head2 Category C<LC_NUMERIC>: Numeric Formatting After a proper C<POSIX::setlocale()> call, and within the scope of a C<use locale> form that includes numerics, Perl obeys the C<LC_NUMERIC> locale information, which controls an application's idea of how numbers should be formatted for human readability. In most implementations the only effect is to change the character used for the decimal point--perhaps from "." to ",". The functions aren't aware of such niceties as thousands separation and so on. (See L</The localeconv function> if you care about these things.) 
    use POSIX qw(strtod setlocale LC_NUMERIC);
    use locale;

    setlocale LC_NUMERIC, "";

    $n = 5/2;   # Assign numeric 2.5 to $n

    $a = " $n"; # Locale-dependent conversion to string

    print "half five is $n\n";       # Locale-dependent output

    printf "half five is %g\n", $n;  # Locale-dependent output

    print "DECIMAL POINT IS COMMA\n"
              if $n == (strtod("2,5"))[0]; # Locale-dependent conversion

See also L<I18N::Langinfo> and C<RADIXCHAR>.

=head2 Category C<LC_MONETARY>: Formatting of monetary amounts

The C standard defines the C<LC_MONETARY> category, but not a function that is affected by its contents. (Those with experience of standards committees will recognize that the working group decided to punt on the issue.) Consequently, Perl essentially takes no notice of it. If you really want to use C<LC_MONETARY>, you can query its contents--see L</The localeconv function>--and use the information that it returns in your application's own formatting of currency amounts. However, you may well find that the information, voluminous and complex though it may be, still does not quite meet your requirements: currency formatting is a hard nut to crack.

See also L<I18N::Langinfo> and C<CRNCYSTR>.

=head2 Category C<LC_TIME>: Representation of time

Output produced by C<POSIX::strftime()>, which builds a formatted human-readable date/time string, is affected by the current C<LC_TIME> locale. Thus, in a French locale, the output produced by the C<%B> format element (full month name) for the first month of the year would be "janvier". Here's how to get a list of long month names in the current locale:

    use POSIX qw(strftime);
    for (0..11) {
        $long_month_name[$_] =
            strftime("%B", 0, 0, 0, 1, $_, 96);
    }

Note: C<use locale> isn't needed in this example: C<strftime()> is a POSIX function which uses the standard system-supplied C<libc> function that always obeys the current C<LC_TIME> locale.
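
The same approach can be sketched for weekday names using the C<%A> format element. This example is an illustration only: the array name C<@day_name> is made up, and it relies on 7 January 1996 having been a Sunday, so that days 7 through 13 of that month run from Sunday to Saturday.

```perl
use strict;
use warnings;
use POSIX qw(strftime setlocale LC_TIME);

# Pick up LC_TIME from the environment.  If this fails, setlocale()
# returns undef and the previous LC_TIME locale stays in effect.
setlocale(LC_TIME, "");

my @day_name;
for my $i (0 .. 6) {
    # Arguments: format, sec, min, hour, mday, mon (0-based), year-1900.
    # 7 January 1996 was a Sunday, so mday 7..13 is Sunday..Saturday.
    $day_name[$i] = strftime("%A", 0, 0, 0, 7 + $i, 0, 96);
}

print join(", ", @day_name), "\n";
```

What gets printed depends entirely on the C<LC_TIME> locale in force, so no particular output is shown here.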
See also L<I18N::Langinfo> and C<ABDAY_1>..C<ABDAY_7>, C<DAY_1>..C<DAY_7>, C<ABMON_1>..C<ABMON_12>, and C<MON_1>..C<MON_12>.

=head2 Other categories

The remaining locale categories are not currently used by Perl itself. But again note that things Perl interacts with may use these, including extensions outside the standard Perl distribution, and the operating system and its utilities. Note especially that the string value of C<$!> and the error messages given by external utilities may be changed by C<LC_MESSAGES>. If you want to have portable error codes, use C<%!>. See L<Errno>.

=head1 SECURITY

Although the main discussion of Perl security issues can be found in L<perlsec>, a discussion of Perl's locale handling would be incomplete if it did not draw your attention to locale-dependent security issues. Locales--particularly on systems that allow unprivileged users to build their own locales--are untrustworthy. A malicious (or just plain broken) locale can make a locale-aware application give unexpected results. Here are a few possibilities:

=over 4

=item *

Regular expression checks for safe file names or mail addresses using C<\w> may be spoofed by an C<LC_CTYPE> locale that claims that characters such as C<"E<gt>"> and C<"|"> are alphanumeric.

=item *

String interpolation with case-mapping, as in, say, C<$dest = "C:\U$name.$ext">, may produce dangerous results if a bogus C<LC_CTYPE> case-mapping table is in effect.

=item *

A sneaky C<LC_COLLATE> locale could result in the names of students with "D" grades appearing ahead of those with "A"s.

=item *

An application that takes the trouble to use information in C<LC_MONETARY> may format debits as if they were credits and vice versa if that locale has been subverted. Or it might make payments in US dollars instead of Hong Kong dollars.

=item *

The date and day names in dates formatted by C<strftime()> could be manipulated to advantage by a malicious user able to subvert the C<LC_TIME> locale.
("Look--it says I wasn't in the building on Sunday.") =back Such dangers are not peculiar to the locale system: any aspect of an application's environment which may be modified maliciously presents similar challenges. Similarly, they are not specific to Perl: any programming language that allows you to write programs that take account of their environment exposes you to these issues. Perl cannot protect you from all possibilities shown in the examples--there is no substitute for your own vigilance--but, when C<use locale> is in effect, Perl uses the tainting mechanism (see L<perlsec>) to mark string results that become locale-dependent, and which may be untrustworthy in consequence. Here is a summary of the tainting behavior of operators and functions that may be affected by the locale: =over 4 =item * B<Comparison operators> (C<lt>, C<le>, C<ge>, C<gt> and C<cmp>): Scalar true/false (or less/equal/greater) result is never tainted. =item * B<Case-mapping interpolation> (with C<\l>, C<\L>, C<\u>, C<\U>, or C<\F>): The result string containing interpolated material is tainted if a C<use locale> form that includes C<LC_CTYPE> is in effect. =item * B<Matching operator> (C<m//>): Scalar true/false result never tainted. All subpatterns, either delivered as a list-context result or as C<$1> I<etc>., are tainted if a C<use locale> form that includes C<LC_CTYPE> is in effect, and the subpattern regular expression contains a locale-dependent construct. These constructs include C<\w> (to match an alphanumeric character), C<\W> (non-alphanumeric character), C<\b> and C<\B> (word-boundary and non-boundary, which depend on what C<\w> and C<\W> match), C<\s> (whitespace character), C<\S> (non-whitespace character), C<\d> and C<\D> (digits and non-digits), and the POSIX character classes, such as C<[:alpha:]> (see L<perlrecharclass/POSIX Character Classes>). Tainting is also likely if the pattern is to be matched case-insensitively (via C</i>). 
The exception is if all the code points to be matched this way are above 255 and do not have folds under Unicode rules to below 256. Tainting is not done for these because Perl only uses Unicode rules for such code points, and those rules are the same no matter what the current locale. The matched-pattern variables, C<$&>, C<$`> (pre-match), C<$'> (post-match), and C<$+> (last match) also are tainted. =item * B<Substitution operator> (C<s///>): Has the same behavior as the match operator. Also, the left operand of C<=~> becomes tainted when a C<use locale> form that includes C<LC_CTYPE> is in effect, if modified as a result of a substitution based on a regular expression match involving any of the things mentioned in the previous item, or of case-mapping, such as C<\l>, C<\L>, C<\u>, C<\U>, or C<\F>. =item * B<Output formatting functions> (C<printf()> and C<write()>): Results are never tainted because otherwise even output from print, for example C<print(1/7)>, should be tainted if C<use locale> is in effect. =item * B<Case-mapping functions> (C<lc()>, C<lcfirst()>, C<uc()>, C<ucfirst()>): Results are tainted if a C<use locale> form that includes C<LC_CTYPE> is in effect. =item * B<POSIX locale-dependent functions> (C<localeconv()>, C<strcoll()>, C<strftime()>, C<strxfrm()>): Results are never tainted. =back Three examples illustrate locale-dependent tainting. The first program, which ignores its locale, won't run: a value taken directly from the command line may not be used to name an output file when taint checks are enabled. #!/usr/local/bin/perl -T # Run with taint checking # Command line sanity check omitted... $tainted_output_file = shift; open(F, ">$tainted_output_file") or warn "Open of $tainted_output_file failed: $!\n"; The program can be made to run by "laundering" the tainted value through a regular expression: the second example--which still ignores locale information--runs, creating the file named on its command line if it can. 
#!/usr/local/bin/perl -T $tainted_output_file = shift; $tainted_output_file =~ m%[\w/]+%; $untainted_output_file = $&; open(F, ">$untainted_output_file") or warn "Open of $untainted_output_file failed: $!\n"; Compare this with a similar but locale-aware program: #!/usr/local/bin/perl -T $tainted_output_file = shift; use locale; $tainted_output_file =~ m%[\w/]+%; $localized_output_file = $&; open(F, ">$localized_output_file") or warn "Open of $localized_output_file failed: $!\n"; This third program fails to run because C<$&> is tainted: it is the result of a match involving C<\w> while C<use locale> is in effect. =head1 ENVIRONMENT =over 12 =item PERL_SKIP_LOCALE_INIT This environment variable, available starting in Perl v5.20, if set (to any value), tells Perl not to use the rest of the environment variables for initialization. Instead, Perl uses whatever the current locale settings are. This is particularly useful in embedded environments; see L<perlembed/Using embedded Perl with POSIX locales>. =item PERL_BADLANG A string that can suppress Perl's warning about failed locale settings at startup. Failure can occur if the locale support in the operating system is lacking (broken) in some way--or if you mistyped the name of a locale when you set up your environment. If this environment variable is absent, or has a value other than "0" or "", Perl will complain about locale setting failures. B<NOTE>: C<PERL_BADLANG> only gives you a way to hide the warning message. The message tells about some problem in your system's locale support, and you should investigate what the problem is. =back The following environment variables are not specific to Perl: They are part of the standardized (ISO C, XPG4, POSIX 1.c) C<setlocale()> method for controlling an application's opinion on data. Windows is non-POSIX, but Perl arranges for the following to work as described anyway. If the locale given by an environment variable is not valid, Perl tries the next lower one in priority. 
If none are valid, on Windows, the system default locale is then tried. If all else fails, the C<"C"> locale is used. If even that doesn't work, something is badly broken, but Perl tries to forge ahead with whatever the locale settings might be. =over 12 =item C<LC_ALL> C<LC_ALL> is the "override-all" locale environment variable. If set, it overrides all the rest of the locale environment variables. =item C<LANGUAGE> B<NOTE>: C<LANGUAGE> is a GNU extension, it affects you only if you are using the GNU libc. This is the case if you are using e.g. Linux. If you are using "commercial" Unixes you are most probably I<not> using GNU libc and you can ignore C<LANGUAGE>. However, in the case you are using C<LANGUAGE>: it affects the language of informational, warning, and error messages output by commands (in other words, it's like C<LC_MESSAGES>) but it has higher priority than C<LC_ALL>. Moreover, it's not a single value but instead a "path" (":"-separated list) of I<languages> (not locales). See the GNU C<gettext> library documentation for more information. =item C<LC_CTYPE> In the absence of C<LC_ALL>, C<LC_CTYPE> chooses the character type locale. In the absence of both C<LC_ALL> and C<LC_CTYPE>, C<LANG> chooses the character type locale. =item C<LC_COLLATE> In the absence of C<LC_ALL>, C<LC_COLLATE> chooses the collation (sorting) locale. In the absence of both C<LC_ALL> and C<LC_COLLATE>, C<LANG> chooses the collation locale. =item C<LC_MONETARY> In the absence of C<LC_ALL>, C<LC_MONETARY> chooses the monetary formatting locale. In the absence of both C<LC_ALL> and C<LC_MONETARY>, C<LANG> chooses the monetary formatting locale. =item C<LC_NUMERIC> In the absence of C<LC_ALL>, C<LC_NUMERIC> chooses the numeric format locale. In the absence of both C<LC_ALL> and C<LC_NUMERIC>, C<LANG> chooses the numeric format. =item C<LC_TIME> In the absence of C<LC_ALL>, C<LC_TIME> chooses the date and time formatting locale. 
In the absence of both C<LC_ALL> and C<LC_TIME>, C<LANG> chooses the date and time formatting locale. =item C<LANG> C<LANG> is the "catch-all" locale environment variable. If it is set, it is used as the last resort after the overall C<LC_ALL> and the category-specific C<LC_I<foo>>. =back =head2 Examples The C<LC_NUMERIC> controls the numeric output: use locale; use POSIX qw(locale_h); # Imports setlocale() and the LC_ constants. setlocale(LC_NUMERIC, "fr_FR") or die "Pardon"; printf "%g\n", 1.23; # If the "fr_FR" succeeded, probably shows 1,23. and also how strings are parsed by C<POSIX::strtod()> as numbers: use locale; use POSIX qw(locale_h strtod); setlocale(LC_NUMERIC, "de_DE") or die "Entschuldigung"; my $x = strtod("2,34") + 5; print $x, "\n"; # Probably shows 7,34. =head1 NOTES =head2 String C<eval> and C<LC_NUMERIC> A string L<eval|perlfunc/eval EXPR> parses its expression as standard Perl. It is therefore expecting the decimal point to be a dot. If C<LC_NUMERIC> is set to have this be a comma instead, the parsing will be confused, perhaps silently. use locale; use POSIX qw(locale_h); setlocale(LC_NUMERIC, "fr_FR") or die "Pardon"; my $a = 1.2; print eval "$a + 1.5"; print "\n"; prints C<13,5>. This is because in that locale, the comma is the decimal point character. The C<eval> thus expands to: eval "1,2 + 1.5" and the result is not what you likely expected. No warnings are generated. If you do string C<eval>'s within the scope of S<C<use locale>>, you should instead change the C<eval> line to do something like: print eval "no locale; $a + 1.5"; This prints C<2.7>. You could also exclude C<LC_NUMERIC>, if you don't need it, by use locale ':!numeric'; =head2 Backward compatibility Versions of Perl prior to 5.004 B<mostly> ignored locale information, generally behaving as if something similar to the C<"C"> locale were always in force, even if the program environment suggested otherwise (see L</The setlocale function>). 
By default, Perl still behaves this way for backward compatibility. If you want a Perl application to pay attention to locale information, you B<must> use the S<C<use locale>> pragma (see L</The "use locale" pragma>) or, in the unlikely event that you want to do so for just pattern matching, the C</l> regular expression modifier (see L<perlre/Character set modifiers>) to instruct it to do so. Versions of Perl from 5.002 to 5.003 did use the C<LC_CTYPE> information if available; that is, C<\w> did understand what were the letters according to the locale environment variables. The problem was that the user had no control over the feature: if the C library supported locales, Perl used them. =head2 I18N::Collate obsolete In versions of Perl prior to 5.004, per-locale collation was possible using the C<I18N::Collate> library module. This module is now mildly obsolete and should be avoided in new applications. The C<LC_COLLATE> functionality is now integrated into the Perl core language: One can use locale-specific scalar data completely normally with C<use locale>, so there is no longer any need to juggle with the scalar references of C<I18N::Collate>. =head2 Sort speed and memory use impacts Comparing and sorting by locale is usually slower than the default sorting; slow-downs of two to four times have been observed. It will also consume more memory: once a Perl scalar variable has participated in any string comparison or sorting operation obeying the locale collation rules, it will take 3-15 times more memory than before. (The exact multiplier depends on the string's contents, the operating system and the locale.) These downsides are dictated more by the operating system's implementation of the locale system than by Perl. =head2 Freely available locale definitions The Unicode CLDR project extracts the POSIX portion of many of its locales, available at https://unicode.org/Public/cldr/2.0.1/ (Newer versions of CLDR require you to compute the POSIX data yourself. 
See L<http://unicode.org/Public/cldr/latest/>.) There is a large collection of locale definitions at: http://std.dkuug.dk/i18n/WG15-collection/locales/ You should be aware that it is unsupported, and is not claimed to be fit for any purpose. If your system allows installation of arbitrary locales, you may find the definitions useful as they are, or as a basis for the development of your own locales. =head2 I18n and l10n "Internationalization" is often abbreviated as B<i18n> because its first and last letters are separated by eighteen others. (You may guess why the internalin ... internaliti ... i18n tends to get abbreviated.) In the same way, "localization" is often abbreviated to B<l10n>. =head2 An imperfect standard Internationalization, as defined in the C and POSIX standards, can be criticized as incomplete and ungainly. They also have a tendency, like standards groups, to divide the world into nations, when we all know that the world can equally well be divided into bankers, bikers, gamers, and so on. =head1 Unicode and UTF-8 The support of Unicode is new starting from Perl version v5.6, and more fully implemented in versions v5.8 and later. See L<perluniintro>. Starting in Perl v5.20, UTF-8 locales are supported in Perl, except C<LC_COLLATE> is only partially supported; collation support is improved in Perl v5.26 to a level that may be sufficient for your needs (see L</Category C<LC_COLLATE>: Collation: Text Comparisons and Sorting>). If you have Perl v5.16 or v5.18 and can't upgrade, you can use use locale ':not_characters'; When this form of the pragma is used, only the non-character portions of locales are used by Perl, for example C<LC_NUMERIC>. Perl assumes that you have translated all the characters it is to operate on into Unicode (actually the platform's native character set (ASCII or EBCDIC) plus Unicode). 
For data in files, this can conveniently be done by also specifying use open ':locale'; This pragma arranges for all inputs from files to be translated into Unicode from the current locale as specified in the environment (see L</ENVIRONMENT>), and all outputs to files to be translated back into the locale. (See L<open>). On a per-filehandle basis, you can instead use the L<PerlIO::locale> module, or the L<Encode::Locale> module, both available from CPAN. The latter module also has methods to ease the handling of C<ARGV> and environment variables, and can be used on individual strings. If you know that all your locales will be UTF-8, as many are these days, you can use the L<B<-C>|perlrun/-C [numberE<sol>list]> command line switch. This form of the pragma allows essentially seamless handling of locales with Unicode. The collation order will be by Unicode code point order. L<Unicode::Collate> can be used to get Unicode rules collation. All the modules and switches just described can be used in v5.20 with just plain C<use locale>, and, should the input locales not be UTF-8, you'll get the less than ideal behavior, described below, that you get with pre-v5.16 Perls, or when you use the locale pragma without the C<:not_characters> parameter in v5.16 and v5.18. If you are using exclusively UTF-8 locales in v5.20 and higher, the rest of this section does not apply to you. There are two cases, multi-byte and single-byte locales. First multi-byte: The only multi-byte (or wide character) locale that Perl is ever likely to support is UTF-8. 
This is due to the difficulty of implementation, the fact that high quality UTF-8 locales are now published for every area of the world (L<https://unicode.org/Public/cldr/2.0.1/> for ones that are already set-up, but from an earlier version; L<https://unicode.org/Public/cldr/latest/> for the most up-to-date, but you have to extract the POSIX information yourself), and that failing all that you can use the L<Encode> module to translate to/from your locale. So, you'll have to do one of those things if you're using one of these locales, such as Big5 or Shift JIS. For UTF-8 locales, in Perls (pre v5.20) that don't have full UTF-8 locale support, they may work reasonably well (depending on your C library implementation) simply because both they and Perl store characters that take up multiple bytes the same way. However, some, if not most, C library implementations may not process the characters in the upper half of the Latin-1 range (128 - 255) properly under C<LC_CTYPE>. To see if a character is a particular type under a locale, Perl uses the functions like C<isalnum()>. Your C library may not work for UTF-8 locales with those functions, instead only working under the newer wide library functions like C<iswalnum()>, which Perl does not use. These multi-byte locales are treated like single-byte locales, and will have the restrictions described below. Starting in Perl v5.22 a warning message is raised when Perl detects a multi-byte locale that it doesn't fully support. For single-byte locales, Perl generally takes the tack to use locale rules on code points that can fit in a single byte, and Unicode rules for those that can't (though this isn't uniformly applied, see the note at the end of this section). This prevents many problems in locales that aren't UTF-8. Suppose the locale is ISO8859-7, Greek. The character at 0xD7 there is a capital Chi. But in the ISO8859-1 locale, Latin1, it is a multiplication sign. 
The POSIX regular expression character class C<[[:alpha:]]> will magically match 0xD7 in the Greek locale but not in the Latin one. However, there are places where this breaks down. Certain Perl constructs are for Unicode only, such as C<\p{Alpha}>. They assume that 0xD7 always has its Unicode meaning (or the equivalent on EBCDIC platforms). Since Latin1 is a subset of Unicode and 0xD7 is the multiplication sign in both Latin1 and Unicode, C<\p{Alpha}> will never match it, regardless of locale. A similar issue occurs with C<\N{...}>. Prior to v5.20, it is therefore a bad idea to use C<\p{}> or C<\N{}> under plain C<use locale>--I<unless> you can guarantee that the locale will be ISO8859-1. Use POSIX character classes instead. Another problem with this approach is that operations that cross the single byte/multiple byte boundary are not well-defined, and so are disallowed. (This boundary is between the codepoints at 255/256.) For example, lower casing LATIN CAPITAL LETTER Y WITH DIAERESIS (U+0178) should return LATIN SMALL LETTER Y WITH DIAERESIS (U+00FF). But in the Greek locale, for example, there is no character at 0xFF, and Perl has no way of knowing what the character at 0xFF is really supposed to represent. Thus it disallows the operation. In this mode, the lowercase of U+0178 is itself. The same problems ensue if you enable automatic UTF-8-ification of your standard file handles, default C<open()> layer, and C<@ARGV> on non-ISO8859-1, non-UTF-8 locales (by using either the B<-C> command line switch or the C<PERL_UNICODE> environment variable; see L<perlrun|perlrun/-C [numberE<sol>list]>). Things are read in as UTF-8, which would normally imply a Unicode interpretation, but the presence of a locale causes them to be interpreted in that locale instead. For example, a 0xD7 code point in the Unicode input, which should mean the multiplication sign, won't be interpreted by Perl that way under the Greek locale. 
This is not a problem I<provided> you make certain that all locales will always and only be either an ISO8859-1, or, if you don't have a deficient C library, a UTF-8 locale. Still another problem is that this approach can lead to two code points meaning the same character. Thus in a Greek locale, both U+03A7 and U+00D7 are GREEK CAPITAL LETTER CHI. Because of all these problems, starting in v5.22, Perl will raise a warning if a multi-byte (hence Unicode) code point is used when a single-byte locale is in effect. (Although it doesn't check for this if doing so would unreasonably slow execution down.) Vendor locales are notoriously buggy, and it is difficult for Perl to test its locale-handling code because this interacts with code that Perl has no control over; therefore the locale-handling code in Perl may be buggy as well. (However, the Unicode-supplied locales should be better, and there is a feedback mechanism to correct any problems. See L</Freely available locale definitions>.) If you have Perl v5.16, the problems mentioned above go away if you use the C<:not_characters> parameter to the locale pragma (except for vendor bugs in the non-character portions). If you don't have v5.16, and you I<do> have locales that work, using them may be worthwhile for certain specific purposes, as long as you keep in mind the gotchas already mentioned. For example, if the collation for your locales works, it runs faster under locales than under L<Unicode::Collate>; and you gain access to such things as the local currency symbol and the names of the months and days of the week. (But to hammer home the point, in v5.16, you get this access without the downsides of locales by using the C<:not_characters> form of the pragma.) Note: The policy of using locale rules for code points that can fit in a byte, and Unicode rules for those that can't is not uniformly applied. 
Pre-v5.12, it was somewhat haphazard; in v5.12 it was applied fairly consistently to regular expression matching except for bracketed character classes; in v5.14 it was extended to all regex matches; and in v5.16 to the casing operations such as C<\L> and C<uc()>. For collation, in all releases so far, the system's C<strxfrm()> function is called, and whatever it does is what you get. Starting in v5.26, various bugs are fixed with the way perl uses this function. =head1 BUGS =head2 Collation of strings containing embedded C<NUL> characters C<NUL> characters will sort the same as the lowest collating control character does, or to C<"\001"> in the unlikely event that there are no control characters at all in the locale. In cases where the strings don't contain this non-C<NUL> control, the results will be correct, and in many locales, this control, whatever it might be, will rarely be encountered. But there are cases where a C<NUL> should sort before this control, but doesn't. If two strings do collate identically, the one containing the C<NUL> will sort earlier. Prior to 5.26, there were more bugs. =head2 Multi-threaded XS code or C-language libraries called from it that use the system L<C<setlocale(3)>> function (except on Windows) likely will not work from a multi-threaded application without changes. See L<perlxs/Locale-aware XS code>. An XS module that is locale-dependent could have been written under the assumption that it will never be called in a multi-threaded environment, and so uses other non-locale constructs that aren't multi-thread-safe. See L<perlxs/Thread-aware system interfaces>. POSIX does not define a way to get the name of the current per-thread locale. Some systems, such as Darwin and NetBSD, do implement a function, L<querylocale(3)>, to do this. On non-Windows systems without it, such as Linux, there are some additional caveats: =over =item * An embedded perl needs to be started up while the global locale is in effect. 
See L<perlembed/Using embedded Perl with POSIX locales>. =item * It becomes more important for perl to know about all the possible locale categories on the platform, even if they aren't apparently used in your program. Perl knows all of the Linux ones. If your platform has others, you can submit an issue at L<https://github.com/Perl/perl5/issues> for inclusion of it in the next release. In the meantime, it is possible to edit the Perl source to teach it about the category, and then recompile. Search for instances of, say, C<LC_PAPER> in the source, and use that as a template to add the omitted one. =item * It is possible, though hard to do, to call C<POSIX::setlocale> with a locale that it doesn't recognize as syntactically legal, but actually is legal on that system. This should happen only with embedded perls, or if you hand-craft a locale name yourself. =back =head2 Broken systems In certain systems, the operating system's locale support is broken and cannot be fixed or used by Perl. Such deficiencies can and will result in mysterious hangs and/or Perl core dumps when C<use locale> is in effect. When confronted with such a system, please report in excruciating detail to <L<https://github.com/Perl/perl5/issues>>, and also contact your vendor: bug fixes may exist for these problems in your operating system. Sometimes such bug fixes are called an operating system upgrade. If you have the source for Perl, include in the bug report the output of the test described above in L</Testing for broken locales>. =head1 SEE ALSO L<I18N::Langinfo>, L<perluniintro>, L<perlunicode>, L<open>, L<POSIX/localeconv>, L<POSIX/setlocale>, L<POSIX/strcoll>, L<POSIX/strftime>, L<POSIX/strtod>, L<POSIX/strxfrm>. For special considerations when Perl is embedded in a C program, see L<perlembed/Using embedded Perl with POSIX locales>. =head1 HISTORY Jarkko Hietaniemi's original F<perli18n.pod> heavily hacked by Dominic Dunlop, assisted by the perl5-porters. 
Prose worked over a bit by Tom Christiansen, and now maintained by Perl 5 porters. =encoding utf-8 =for stopwords CVE perlsecpolicy SV perl Perl SDBM HackerOne Mitre =head1 NAME perlsecpolicy - Perl security report handling policy =head1 DESCRIPTION The Perl project takes security issues seriously. The responsibility for handling security reports in a timely and effective manner has been delegated to a security team composed of a subset of the Perl core developers. This document describes how the Perl security team operates and how the team evaluates new security reports. =head1 REPORTING SECURITY ISSUES IN PERL If you believe you have found a security vulnerability in the Perl interpreter or modules maintained in the core Perl codebase, email the details to L<perl-security@perl.org|mailto:perl-security@perl.org>. This address is a closed membership mailing list monitored by the Perl security team. You should receive an initial response to your report within 72 hours. If you do not receive a response in that time, please contact the security team lead L<John Lightsey|mailto:john@04755.net> and the L<Perl steering council|mailto:steering-council@perl.org>. When members of the security team reply to your messages, they will generally include the perl-security@perl.org address in the "To" or "CC" fields of the response. This allows all of the security team to follow the discussion and chime in as needed. Use the "Reply-all" functionality of your email client when you send subsequent responses so that the entire security team receives the message. The security team will evaluate your report and make an initial determination of whether it is likely to fit the scope of issues the team handles. General guidelines about how this is determined are detailed in the L</WHAT ARE SECURITY ISSUES> section. 
If your report meets the team's criteria, an issue will be opened in the team's private issue tracker and you will be provided the issue's ID number. Issue identifiers have the form perl-security#NNN. Include this identifier with any subsequent messages you send. The security team will send periodic updates about the status of your issue and guide you through any further action that is required to complete the vulnerability remediation process. The stages vulnerabilities typically go through are explained in the L</HOW WE DEAL WITH SECURITY ISSUES> section. =head1 WHAT ARE SECURITY ISSUES A vulnerability is a behavior of a software system that compromises the system's expected confidentiality, integrity or availability protections. A security issue is a bug in one or more specific components of a software system that creates a vulnerability. Software written in the Perl programming language is typically composed of many layers of software written by many different groups. It can be very complicated to determine which specific layer of a complex real-world application was responsible for preventing a vulnerable behavior, but this is an essential part of fixing the vulnerability. =head2 Software covered by the Perl security team The Perl security team handles security issues in: =over =item * The Perl interpreter =item * The Perl modules shipped with the interpreter that are developed in the core Perl repository =item * The command line tools shipped with the interpreter that are developed in the core Perl repository =back Files under the F<cpan/> directory in Perl's repository and release tarballs are developed and maintained independently. The Perl security team does not handle security issues for these modules. =head2 Bugs that may qualify as security issues in Perl Perl is designed to be a fast and flexible general purpose programming language. The Perl interpreter and Perl modules make writing safe and secure applications easy, but they do have limitations. 
As a general rule, a bug in Perl needs to meet all of the following criteria to be considered a security issue: =over =item * The vulnerable behavior is not mentioned in Perl's documentation or public issue tracker. =item * The vulnerable behavior is not implied by an expected behavior. =item * The vulnerable behavior is not a generally accepted limitation of the implementation. =item * The vulnerable behavior is likely to be exposed to attack in otherwise secure applications written in Perl. =item * The vulnerable behavior provides a specific tangible benefit to an attacker that triggers the behavior. =back =head2 Bugs that do not qualify as security issues in Perl There are certain categories of bugs that are frequently reported to the security team that do not meet the criteria listed above. The following is a list of commonly reported bugs that are not handled as security issues. =head3 Feeding untrusted code to the interpreter The Perl parser is not designed to evaluate untrusted code. If your application requires the evaluation of untrusted code, it should rely on an operating system level sandbox for its security. =head3 Stack overflows due to excessive recursion Excessive recursion is often caused by code that does not enforce limits on inputs. The Perl interpreter assumes limits on recursion will be enforced by the application. =head3 Out of memory errors Common Perl constructs such as C<pack>, the C<x> operator, and regular expressions accept numeric quantifiers that control how much memory will be allocated to store intermediate values or results. If you allow an attacker to supply these quantifiers and consume all available memory, the Perl interpreter will not prevent it. =head3 Escape from a L<Safe> compartment L<Opcode> restrictions and L<Safe> compartments are not supported as security mechanisms. The Perl parser is not designed to evaluate untrusted code. =head3 Use of the C<p> and C<P> pack templates These templates are unsafe by design. 
=head3 Stack not reference-counted issues These bugs typically present as use-after-free errors or as assertion failures on the type of a C<SV>. Stack not reference-counted crashes usually occur because code is both modifying a reference or glob and using the values referenced by that glob or reference. This type of bug is a long standing issue with the Perl interpreter that seldom occurs in normal code. Examples of this type of bug generally assume that attacker-supplied code will be evaluated by the Perl interpreter. =head3 Thawing attacker-supplied data with L<Storable> L<Storable> is designed to be a very fast serialization format. It is not designed to be safe for deserializing untrusted inputs. =head3 Using attacker supplied L<SDBM_File> databases The L<SDBM_File> module is not intended for use with untrusted SDBM databases. =head3 Badly encoded UTF-8 flagged scalars This type of bug occurs when the C<:utf8> PerlIO layer is used to read badly encoded data, or other mechanisms are used to directly manipulate the UTF-8 flag on an SV. A badly encoded UTF-8 flagged SV is not a valid SV. Code that creates SV's in this fashion is corrupting Perl's internal state. =head3 Issues that exist only in blead, or in a release candidate The blead branch and Perl release candidates do not receive security support. Security defects that are present only in pre-release versions of Perl are handled through the normal bug reporting and resolution process. =head3 CPAN modules or other Perl project resources The Perl security team is focused on the Perl interpreter and modules maintained in the core Perl codebase. The team has no special access to fix CPAN modules, applications written in Perl, Perl project websites, Perl mailing lists or the Perl IRC servers. =head3 Emulated POSIX behaviors on Windows systems The Perl interpreter attempts to emulate C<fork>, C<system>, C<exec> and other POSIX behaviors on Windows systems. 
This emulation has many quirks that are extensively documented in Perl's public issue tracker. Changing these behaviors would cause significant disruption for existing users on Windows. =head2 Bugs that require special categorization Some bugs in the Perl interpreter occur in areas of the codebase that are both security sensitive and prone to failure during normal usage. =head3 Regular expressions Untrusted regular expressions are generally safe to compile and match against with several caveats. The following behaviors of Perl's regular expression engine are the developer's responsibility to constrain. The evaluation of untrusted regular expressions while C<use re 'eval';> is in effect is never safe. Regular expressions are not guaranteed to compile or evaluate in any specific finite time frame. Regular expressions may consume all available system memory when they are compiled or evaluated. Regular expressions may cause excessive recursion that halts the perl interpreter. As a general rule, do not expect Perl's regular expression engine to be resistant to denial of service attacks. =head3 L<DB_File>, L<ODBM_File>, or L<GDBM_File> databases These modules rely on external libraries to interact with database files. Bugs caused by reading and writing these file formats are generally caused by the underlying library implementation and are not security issues in Perl. Bugs where Perl mishandles unexpected valid return values from the underlying libraries may qualify as security issues in Perl. =head3 Algorithmic complexity attacks The perl interpreter is reasonably robust to algorithmic complexity attacks. It is not immune to them. Algorithmic complexity bugs that depend on the interpreter processing extremely large amounts of attacker supplied data are not generally handled as security issues. See L<perlsec/Algorithmic Complexity Attacks> for additional information. =head1 HOW WE DEAL WITH SECURITY ISSUES The Perl security team follows responsible disclosure practices. 
Security issues are kept secret until a fix is readily available for most users. This minimizes inherent risks users face from vulnerabilities in Perl.

Hiding problems from the users temporarily is a necessary trade-off to keep them safe. Hiding problems from users permanently is not the goal.

When you report a security issue privately to the L<perl-security@perl.org|mailto:perl-security@perl.org> contact address, we normally expect you to follow responsible disclosure practices in the handling of the report. If you are unable or unwilling to keep the issue secret until a fix is available to users you should state this clearly in the initial report.

The security team's vulnerability remediation workflow is intended to be as open and transparent as possible about the state of your security report.

=head2 Perl's vulnerability remediation workflow

=head3 Initial contact

New vulnerability reports will receive an initial reply within 72 hours from the time they arrive at the security team's mailing list. If you do not receive any response in that time, contact the security team lead L<John Lightsey|mailto:john@04755.net> and the L<Perl steering council|mailto:steering-council@perl.org>.

The initial response sent by the security team will confirm your message was received and provide an estimated time frame for the security team's triage analysis.

=head3 Initial triage

The security team will evaluate the report and determine whether or not it is likely to meet the criteria for handling as a security issue.

The security team aims to complete the initial report triage within two weeks' time. Complex issues that require significant discussion or research may take longer.

If the security report cannot be reproduced or does not meet the team's criteria for handling as a security issue, you will be notified by email and given an opportunity to respond.
=head3 Issue ID assignment Security reports that pass initial triage analysis are turned into issues in the security team's private issue tracker. When a report progresses to this point you will be provided the issue ID for future reference. These identifiers have the format perl-security#NNN or Perl/perl-security#NNN. The assignment of an issue ID does not confirm that a security report represents a vulnerability in Perl. Many reports require further analysis to reach that determination. Issues in the security team's private tracker are used to collect details about the problem and track progress towards a resolution. These notes and other details are not made public when the issue is resolved. Keeping the issue notes private allows the security team to freely discuss attack methods, attack tools, and other related private issues. =head3 Development of patches Members of the security team will inspect the report and related code in detail to produce fixes for supported versions of Perl. If the team discovers that the reported issue does not meet the team's criteria at this stage, you will be notified by email and given an opportunity to respond before the issue is closed. The team may discuss potential fixes with you or provide you with patches for testing purposes during this time frame. No information should be shared publicly at this stage. =head3 CVE ID assignment Once an issue is fully confirmed and a potential fix has been found, the security team will request a CVE identifier for the issue to use in public announcements. Details like the range of vulnerable Perl versions and identities of the people that discovered the flaw need to be collected to submit the CVE ID request. The security team may ask you to clarify the exact name we should use when crediting discovery of the issue. The L</Vulnerability credit and bounties> section of this document explains our preferred format for this credit. Once a CVE ID has been assigned, you will be notified by email. 
The vulnerability should not be discussed publicly at this stage. =head3 Pre-release notifications When the security team is satisfied that the fix for a security issue is ready to release publicly, a pre-release notification announcement is sent to the major redistributors of Perl. This pre-release announcement includes a list of Perl versions that are affected by the flaw, an analysis of the risks to users, patches the security team has produced, and any information about mitigations or backporting fixes to older versions of Perl that the security team has available. The pre-release announcement will include a specific target date when the issue will be announced publicly. The time frame between the pre-release announcement and the release date allows redistributors to prepare and test their own updates and announcements. During this period the vulnerability details and fixes are embargoed and should not be shared publicly. This embargo period may be extended further if problems are discovered during testing. You will be sent the portions of pre-release announcements that are relevant to the specific issue you reported. This email will include the target release date. Additional updates will be sent if the target release date changes. =head3 Pre-release testing The Perl security team does not directly produce official Perl releases. The team releases security fixes by placing commits in Perl's public git repository and sending announcements. Many users and redistributors prefer using official Perl releases rather than applying patches to an older release. The security team works with Perl's release managers to make this possible. New official releases of Perl are generally produced and tested on private systems during the pre-release embargo period. 
=head3 Release of fixes and announcements At the end of the embargo period the security fixes will be committed to Perl's public git repository and announcements will be sent to the L<perl5-porters|https://lists.perl.org/list/perl5-porters.html> and L<oss-security|https://oss-security.openwall.org/wiki/mailing-lists/oss-security> mailing lists. If official Perl releases are ready, they will be published at this time and announced on the L<perl5-porters|https://lists.perl.org/list/perl5-porters.html> mailing list. The security team will send a follow-up notification to everyone that participated in the pre-release embargo period once the release process is finished. Vulnerability reporters and Perl redistributors should not publish their own announcements or fixes until the Perl security team's release process is complete. =head2 Publicly known and zero-day security issues The security team's vulnerability remediation workflow assumes that issues are reported privately and kept secret until they are resolved. This isn't always the case and information occasionally leaks out before a fix is ready. In these situations the team must decide whether operating in secret increases or decreases the risk to users of Perl. In some cases being open about the risk a security issue creates will allow users to defend against it, in other cases calling attention to an unresolved security issue will make it more likely to be misused. =head3 Zero-day security issues If an unresolved critical security issue in Perl is being actively abused to attack systems the security team will send out announcements as rapidly as possible with any mitigations the team has available. Perl's public defect tracker will be used to handle the issue so that additional information, fixes, and CVE IDs are visible to affected users as rapidly as possible. 
=head3 Other leaks of security issue information Depending on the prominence of the information revealed about a security issue and the issue's risk of becoming a zero-day attack, the security team may skip all or part of its normal remediation workflow. If the security team learns of a significant security issue after it has been identified and resolved in Perl's public issue tracker, the team will request a CVE ID and send an announcement to inform users. =head2 Vulnerability credit and bounties The Perl project appreciates the effort security researchers invest in making Perl safe and secure. Since much of this work is hidden from the public, crediting researchers publicly is an important part of the vulnerability remediation process. =head3 Credits in vulnerability announcements When security issues are fixed we will attempt to credit the specific researcher(s) that discovered the flaw in our announcements. Credits are announced using the researcher's preferred full name. If the researcher's contributions were funded by a specific company or part of an organized vulnerability research project, we will include a short name for this group at the researcher's request. Perl's announcements are written in the English language using the 7bit ASCII character set to be reproducible in a variety of formats. We do not include hyperlinks, domain names or marketing material with these acknowledgments. In the event that proper credit for vulnerability discovery cannot be established or there is a disagreement between the Perl security team and the researcher about how the credit should be given, it will be omitted from announcements. =head3 Bounties for Perl vulnerabilities The Perl project is a non-profit volunteer effort. We do not provide any monetary rewards for reporting security issues in Perl. The L<Internet Bug Bounty|https://internetbugbounty.org/> offers monetary rewards for some Perl security issues after they are fully resolved. 
The terms of this program are available at L<HackerOne|https://hackerone.com/ibb-perl>.

This program is not run by the Perl project or the Perl security team.

=cut

=encoding utf8

=head1 NAME

perl5162delta - what is new for perl v5.16.2

=head1 DESCRIPTION

This document describes differences between the 5.16.1 release and the 5.16.2 release.

If you are upgrading from an earlier release such as 5.16.0, first read L<perl5161delta>, which describes differences between 5.16.0 and 5.16.1.

=head1 Incompatible Changes

There are no changes intentionally incompatible with 5.16.0. If any exist, they are bugs, and we request that you submit a report. See L</Reporting Bugs> below.

=head1 Modules and Pragmata

=head2 Updated Modules and Pragmata

=over 4

=item *

L<Module::CoreList> has been upgraded from version 2.70 to version 2.76.

=back

=head1 Configuration and Compilation

=over 4

=item *

configuration should no longer be confused by ls colorization

=back

=head1 Platform Support

=head2 Platform-Specific Notes

=over 4

=item AIX

Configure now always adds -qlanglvl=extc99 to the CC flags on AIX when using xlC. This will make it easier to compile a number of XS-based modules that assume C99 [perl #113778].

=back

=head1 Selected Bug Fixes

=over 4

=item *

fix /\h/ equivalence with /[\h]/ see [perl #114220]

=back

=head1 Known Problems

There are no new known problems.

=head1 Acknowledgements

Perl 5.16.2 represents approximately 2 months of development since Perl 5.16.1 and contains approximately 740 lines of changes across 20 files from 9 authors.

Perl continues to flourish into its third decade thanks to a vibrant community of users and developers. The following people are known to have contributed the improvements that became Perl 5.16.2:

Andy Dougherty, Craig A. Berry, Darin McBride, Dominic Hargreaves, Karen Etheridge, Karl Williamson, Peter Martini, Ricardo Signes, Tony Cook.
The list above is almost certainly incomplete as it is automatically generated from version control history. In particular, it does not include the names of the (very much appreciated) contributors who reported issues to the Perl bug tracker.

For a more complete list of all of Perl's historical contributors, please see the F<AUTHORS> file in the Perl source distribution.

=head1 Reporting Bugs

If you find what you think is a bug, you might check the articles recently posted to the comp.lang.perl.misc newsgroup and the perl bug database at http://rt.perl.org/perlbug/ . There may also be information at http://www.perl.org/ , the Perl Home Page.

If you believe you have an unreported bug, please run the L<perlbug> program included with your release. Be sure to trim your bug down to a tiny but sufficient test case. Your bug report, along with the output of C<perl -V>, will be sent off to perlbug@perl.org to be analysed by the Perl porting team.

If the bug you are reporting has security implications, which make it inappropriate to send to a publicly archived mailing list, then please send it to perl5-security-report@perl.org. This points to a closed subscription unarchived mailing list, which includes all the core committers, who will be able to help assess the impact of issues, figure out a resolution, and help co-ordinate the release of patches to mitigate or fix the problem across all platforms on which Perl is supported. Please only use this address for security issues in the Perl core, not for modules independently distributed on CPAN.

=head1 SEE ALSO

The F<Changes> file for an explanation of how to view exhaustive details on what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=cut

=head1 NAME

perlport - Writing portable Perl

=head1 DESCRIPTION

Perl runs on numerous operating systems.
While most of them share much in common, they also have their own unique features. This document is meant to help you to find out what constitutes portable Perl code. That way once you make a decision to write portably, you know where the lines are drawn, and you can stay within them. There is a tradeoff between taking full advantage of one particular type of computer and taking advantage of a full range of them. Naturally, as you broaden your range and become more diverse, the common factors drop, and you are left with an increasingly smaller area of common ground in which you can operate to accomplish a particular task. Thus, when you begin attacking a problem, it is important to consider under which part of the tradeoff curve you want to operate. Specifically, you must decide whether it is important that the task that you are coding has the full generality of being portable, or whether to just get the job done right now. This is the hardest choice to be made. The rest is easy, because Perl provides many choices, whichever way you want to approach your problem. Looking at it another way, writing portable code is usually about willfully limiting your available choices. Naturally, it takes discipline and sacrifice to do that. The product of portability and convenience may be a constant. You have been warned. Be aware of two important points: =over 4 =item Not all Perl programs have to be portable There is no reason you should not use Perl as a language to glue Unix tools together, or to prototype a Macintosh application, or to manage the Windows registry. If it makes no sense to aim for portability for one reason or another in a given program, then don't bother. =item Nearly all of Perl already I<is> portable Don't be fooled into thinking that it is hard to create portable Perl code. It isn't. Perl tries its level-best to bridge the gaps between what's available on different platforms, and all the means available to use those features. 
Thus almost all Perl code runs on any machine without modification. But there are some significant issues in writing portable code, and this document is entirely about those issues. =back Here's the general rule: When you approach a task commonly done using a whole range of platforms, think about writing portable code. That way, you don't sacrifice much by way of the implementation choices you can avail yourself of, and at the same time you can give your users lots of platform choices. On the other hand, when you have to take advantage of some unique feature of a particular platform, as is often the case with systems programming (whether for Unix, Windows, VMS, etc.), consider writing platform-specific code. When the code will run on only two or three operating systems, you may need to consider only the differences of those particular systems. The important thing is to decide where the code will run and to be deliberate in your decision. The material below is separated into three main sections: main issues of portability (L</"ISSUES">), platform-specific issues (L</"PLATFORMS">), and built-in Perl functions that behave differently on various ports (L</"FUNCTION IMPLEMENTATIONS">). This information should not be considered complete; it includes possibly transient information about idiosyncrasies of some of the ports, almost all of which are in a state of constant evolution. Thus, this material should be considered a perpetual work in progress (C<< <IMG SRC="yellow_sign.gif" ALT="Under Construction"> >>). =head1 ISSUES =head2 Newlines In most operating systems, lines in files are terminated by newlines. Just what is used as a newline may vary from OS to OS. Unix traditionally uses C<\012>, one type of DOSish I/O uses C<\015\012>, S<Mac OS> uses C<\015>, and z/OS uses C<\025>. Perl uses C<\n> to represent the "logical" newline, where what is logical may depend on the platform in use. In MacPerl, C<\n> always means C<\015>. 
On EBCDIC platforms, C<\n> could be C<\025> or C<\045>. In DOSish perls, C<\n> usually means C<\012>, but when accessing a file in "text" mode, perl uses the C<:crlf> layer that translates it to (or from) C<\015\012>, depending on whether you're reading or writing. Unix does the same thing on ttys in canonical mode. C<\015\012> is commonly referred to as CRLF. To trim trailing newlines from text lines use L<C<chomp>|perlfunc/chomp VARIABLE>. With default settings that function looks for a trailing C<\n> character and thus trims in a portable way. When dealing with binary files (or text files in binary mode) be sure to explicitly set L<C<$E<sol>>|perlvar/$E<sol>> to the appropriate value for your file format before using L<C<chomp>|perlfunc/chomp VARIABLE>. Because of the "text" mode translation, DOSish perls have limitations in using L<C<seek>|perlfunc/seek FILEHANDLE,POSITION,WHENCE> and L<C<tell>|perlfunc/tell FILEHANDLE> on a file accessed in "text" mode. Stick to L<C<seek>|perlfunc/seek FILEHANDLE,POSITION,WHENCE>-ing to locations you got from L<C<tell>|perlfunc/tell FILEHANDLE> (and no others), and you are usually free to use L<C<seek>|perlfunc/seek FILEHANDLE,POSITION,WHENCE> and L<C<tell>|perlfunc/tell FILEHANDLE> even in "text" mode. Using L<C<seek>|perlfunc/seek FILEHANDLE,POSITION,WHENCE> or L<C<tell>|perlfunc/tell FILEHANDLE> or other file operations may be non-portable. If you use L<C<binmode>|perlfunc/binmode FILEHANDLE> on a file, however, you can usually L<C<seek>|perlfunc/seek FILEHANDLE,POSITION,WHENCE> and L<C<tell>|perlfunc/tell FILEHANDLE> with arbitrary values safely. A common misconception in socket programming is that S<C<\n eq \012>> everywhere. When using protocols such as common Internet protocols, C<\012> and C<\015> are called for specifically, and the values of the logical C<\n> and C<\r> (carriage return) are not reliable. 

    print $socket "Hi there, client!\r\n";      # WRONG
    print $socket "Hi there, client!\015\012";  # RIGHT

However, using C<\015\012> (or C<\cM\cJ>, or C<\x0D\x0A>) can be tedious and unsightly, as well as confusing to those maintaining the code. As such, the L<C<Socket>|Socket> module supplies the Right Thing for those who want it.

    use Socket qw(:DEFAULT :crlf);
    print $socket "Hi there, client!$CRLF";     # RIGHT

When reading from a socket, remember that the default input record separator L<C<$E<sol>>|perlvar/$E<sol>> is C<\n>, but robust socket code will recognize either C<\012> or C<\015\012> as end of line:

    while (<$socket>) {  # NOT ADVISABLE!
        # ...
    }

Because both CRLF and LF end in LF, the input record separator can be set to LF and any CR stripped later. Better to write:

    use Socket qw(:DEFAULT :crlf);
    local($/) = LF;      # not needed if $/ is already \012

    while (<$socket>) {
        s/$CR?$LF/\n/;   # not sure if socket uses LF or CRLF, OK
    #   s/\015?\012/\n/; # same thing
    }

This example is preferred over the previous one--even for Unix platforms--because now any C<\015>'s (C<\cM>'s) are stripped out (and there was much rejoicing).

Similarly, functions that return text data--such as a function that fetches a web page--should sometimes translate newlines before returning the data, if they've not yet been translated to the local newline representation. A single line of code will often suffice:

    $data =~ s/\015?\012/\n/g;
    return $data;

Some of this may be confusing. Here's a handy reference to the ASCII CR and LF characters. You can print it out and stick it in your wallet.

    LF  eq  \012  eq  \x0A  eq  \cJ  eq  chr(10)  eq  ASCII 10
    CR  eq  \015  eq  \x0D  eq  \cM  eq  chr(13)  eq  ASCII 13

             | Unix | DOS  | Mac  |
        ---------------------------
        \n   |  LF  |  LF  |  CR  |
        \r   |  CR  |  CR  |  LF  |
        \n * |  LF  | CRLF |  CR  |
        \r * |  CR  |  CR  |  LF  |
        ---------------------------
        * text-mode STDIO

The Unix column assumes that you are not accessing a serial line (like a tty) in canonical mode.
If you are, then CR on input becomes "\n", and "\n" on output becomes CRLF.

These are just the most common definitions of C<\n> and C<\r> in Perl. There may well be others. For example, on an EBCDIC implementation such as z/OS (OS/390) or OS/400 (using the ILE, the PASE is ASCII-based) the above material is similar to "Unix" but the code numbers change:

    LF  eq  \025  eq  \x15  eq  \cU  eq  chr(21)  eq  CP-1047 21
    LF  eq  \045  eq  \x25  eq           chr(37)  eq  CP-0037 37
    CR  eq  \015  eq  \x0D  eq  \cM  eq  chr(13)  eq  CP-1047 13
    CR  eq  \015  eq  \x0D  eq  \cM  eq  chr(13)  eq  CP-0037 13

             | z/OS | OS/400 |
        ----------------------
        \n   |  LF  |  LF    |
        \r   |  CR  |  CR    |
        \n * |  LF  |  LF    |
        \r * |  CR  |  CR    |
        ----------------------
        * text-mode STDIO

=head2 Numbers endianness and Width

Different CPUs store integers and floating point numbers in different orders (called I<endianness>) and widths (32-bit and 64-bit being the most common today). This affects your programs when they attempt to transfer numbers in binary format from one CPU architecture to another, usually either "live" via network connection, or by storing the numbers to secondary storage such as a disk file or tape.

Conflicting storage orders make an utter mess out of the numbers. If a little-endian host (Intel, VAX) stores 0x12345678 (305419896 in decimal), a big-endian host (Motorola, Sparc, PA) reads it as 0x78563412 (2018915346 in decimal). Alpha and MIPS can be either: Digital/Compaq used/uses them in little-endian mode; SGI/Cray uses them in big-endian mode. To avoid this problem in network (socket) connections use the L<C<pack>|perlfunc/pack TEMPLATE,LIST> and L<C<unpack>|perlfunc/unpack TEMPLATE,EXPR> formats C<n> and C<N>, the "network" orders. These are guaranteed to be portable.

As of Perl 5.10.0, you can also use the C<E<gt>> and C<E<lt>> modifiers to force big- or little-endian byte-order. This is useful if you want to store signed integers or 64-bit integers, for example.
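The byte-order advice above can be illustrated with a short sketch. The record layout here (a 32-bit ID, a 16-bit port, and a signed 32-bit offset) is a hypothetical one chosen only for illustration; the C<N>, C<n>, and C<< l> >> templates themselves are standard L<C<pack>|perlfunc/pack TEMPLATE,LIST> formats.

```perl
use strict;
use warnings;

# Hypothetical record layout, chosen only for illustration:
# a 32-bit unsigned ID, a 16-bit unsigned port, a signed 32-bit offset.
my ($id, $port, $offset) = (0x12345678, 8080, -2);

# "N" and "n" are always big-endian ("network" order); "l>" forces a
# signed 32-bit big-endian value (the ">" modifier needs Perl 5.10+).
my $wire = pack('N n l>', $id, $port, $offset);

# The byte layout is the same on every host, whatever its native order.
my $hex = unpack('H*', $wire);

# Decoding with the same template recovers the original values.
my @decoded = unpack('N n l>', $wire);

print "$hex\n";
print "@decoded\n";
```

Because the template names the byte order explicitly, the same ten bytes are produced on little-endian and big-endian hosts, which is what makes the data safe to send over a socket or write to a shared file.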
You can explore the endianness of your platform by unpacking a data structure packed in native format such as:

    print unpack("h*", pack("s2", 1, 2)), "\n";
    # '10002000' on e.g. Intel x86 or Alpha 21064 in little-endian mode
    # '00100020' on e.g. Motorola 68040

If you need to distinguish between endian architectures you could use either of the variables set like so:

    $is_big_endian    = unpack("h*", pack("s", 1)) =~ /01/;
    $is_little_endian = unpack("h*", pack("s", 1)) =~ /^1/;

Differing widths can cause truncation even between platforms of equal endianness. The platform of shorter width loses the upper parts of the number. There is no good solution for this problem except to avoid transferring or storing raw binary numbers.

One can circumvent both these problems in two ways. Either transfer and store numbers always in text format, instead of raw binary, or else consider using modules like L<C<Data::Dumper>|Data::Dumper> and L<C<Storable>|Storable> (included as of Perl 5.8). Keeping all data as text significantly simplifies matters.

=head2 Files and Filesystems

Most platforms these days structure files in a hierarchical fashion. So, it is reasonably safe to assume that all platforms support the notion of a "path" to uniquely identify a file on the system. How that path is really written, though, differs considerably.

Although similar, file path specifications differ between Unix, Windows, S<Mac OS>, OS/2, VMS, VOS, S<RISC OS>, and probably others. Unix, for example, is one of the few OSes that has the elegant idea of a single root directory.

DOS, OS/2, VMS, VOS, and Windows can work similarly to Unix with C</> as path separator, or in their own idiosyncratic ways (such as having several root directories and various "unrooted" device files such as NIL: and LPT:).

S<Mac OS> 9 and earlier used C<:> as a path separator instead of C</>.
The filesystem may support neither hard links (L<C<link>|perlfunc/link OLDFILE,NEWFILE>) nor symbolic links (L<C<symlink>|perlfunc/symlink OLDFILE,NEWFILE>, L<C<readlink>|perlfunc/readlink EXPR>, L<C<lstat>|perlfunc/lstat FILEHANDLE>). The filesystem may support neither access timestamp nor change timestamp (meaning that about the only portable timestamp is the modification timestamp), or one second granularity of any timestamps (e.g. the FAT filesystem limits the time granularity to two seconds). The "inode change timestamp" (the L<C<-C>|perlfunc/-X FILEHANDLE> filetest) may really be the "creation timestamp" (which it is not in Unix). VOS perl can emulate Unix filenames with C</> as path separator. The native pathname characters greater-than, less-than, number-sign, and percent-sign are always accepted. S<RISC OS> perl can emulate Unix filenames with C</> as path separator, or go native and use C<.> for path separator and C<:> to signal filesystems and disk names. Don't assume Unix filesystem access semantics: that read, write, and execute are all the permissions there are, and even if they exist, that their semantics (for example what do C<r>, C<w>, and C<x> mean on a directory) are the Unix ones. The various Unix/POSIX compatibility layers usually try to make interfaces like L<C<chmod>|perlfunc/chmod LIST> work, but sometimes there simply is no good mapping. The L<C<File::Spec>|File::Spec> modules provide methods to manipulate path specifications and return the results in native format for each platform. This is often unnecessary as Unix-style paths are understood by Perl on every supported platform, but if you need to produce native paths for a native utility that does not understand Unix syntax, or if you are operating on paths or path components in unknown (and thus possibly native) syntax, L<C<File::Spec>|File::Spec> is your friend. 
Here are two brief examples:

    use File::Spec::Functions;
    chdir(updir());        # go up one directory

    # Concatenate a path from its components
    my $file = catfile(updir(), 'temp', 'file.txt');
    # on Unix:    '../temp/file.txt'
    # on Win32:   '..\temp\file.txt'
    # on VMS:     '[-.temp]file.txt'

In general, production code should not have file paths hardcoded. Making them user-supplied or read from a configuration file is better, keeping in mind that file path syntax varies on different machines.

This is especially noticeable in scripts like Makefiles and test suites, which often assume C</> as a path separator for subdirectories.

Also of use is L<C<File::Basename>|File::Basename> from the standard distribution, which splits a pathname into pieces (base filename, full path to directory, and file suffix).

Even when on a single platform (if you can call Unix a single platform), remember not to count on the existence or the contents of particular system-specific files or directories, like F</etc/passwd>, F</etc/sendmail.conf>, F</etc/resolv.conf>, or even F</tmp/>. For example, F</etc/passwd> may exist but not contain the encrypted passwords, because the system is using some form of enhanced security. Or it may not contain all the accounts, because the system is using NIS. If code does need to rely on such a file, include a description of the file and its format in the code's documentation, then make it easy for the user to override the default location of the file.

Don't assume a text file will end with a newline. They should, but people forget.

Do not have two files or directories of the same name with different case, like F<test.pl> and F<Test.pl>, as many platforms have case-insensitive (or at least case-forgiving) filenames. Also, try not to have non-word characters (except for C<.>) in the names, and keep them to the 8.3 convention, for maximum portability, onerous a burden though this may appear.
Likewise, when using the L<C<AutoSplit>|AutoSplit> module, try to keep your functions to 8.3 naming and case-insensitive conventions; or, at the least, make it so the resulting files have a unique (case-insensitively) first 8 characters.

Whitespace in filenames is tolerated on most systems, but not all, and even on systems where it might be tolerated, some utilities might become confused by such whitespace.

Many systems (DOS, VMS ODS-2) cannot have more than one C<.> in their filenames.

Don't assume C<< > >> won't be the first character of a filename. Always use the three-arg version of L<C<open>|perlfunc/open FILEHANDLE,MODE,EXPR>:

    open(my $fh, '<', $existing_file) or die $!;

Two-arg L<C<open>|perlfunc/open FILEHANDLE,MODE,EXPR> is magic and can translate characters like C<< > >>, C<< < >>, and C<|> in filenames, which is usually the wrong thing to do. L<C<sysopen>|perlfunc/sysopen FILEHANDLE,FILENAME,MODE> and three-arg L<C<open>|perlfunc/open FILEHANDLE,MODE,EXPR> don't have this problem.

Don't use C<:> as a part of a filename since many systems use that for their own semantics (Mac OS Classic for separating pathname components, many networking schemes and utilities for separating the nodename and the pathname, and so on). For the same reasons, avoid C<@>, C<;> and C<|>.

Don't assume that in pathnames you can collapse two leading slashes C<//> into one: some networking and clustering filesystems have special semantics for that. Let the operating system sort it out.

The I<portable filename characters> as defined by ANSI C are

    a b c d e f g h i j k l m n o p q r s t u v w x y z
    A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
    0 1 2 3 4 5 6 7 8 9
    . _ -

and C<-> shouldn't be the first character.
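As a rough illustration of the character rule above, the following sketch checks a single filename component against the portable set. The helper name C<is_portable_filename> is hypothetical and not part of any core module. It tests only the character set and the leading C<->, not the 8.3 length convention or case-insensitive uniqueness.

```perl
use strict;
use warnings;

# is_portable_filename() is a hypothetical helper, not a core function.
# It expects a single path component (no directory separators).
sub is_portable_filename {
    my ($name) = @_;
    return 0 unless defined $name && length $name;

    # Only letters, digits, ".", "_" and "-" are allowed, and "-"
    # must not be the first character.
    return $name =~ /\A[A-Za-z0-9._][A-Za-z0-9._-]*\z/ ? 1 : 0;
}

for my $name ('test.pl', '-rf', 'my file.txt', 'node:path') {
    printf "%-12s %s\n", $name,
        is_portable_filename($name) ? 'portable' : 'not portable';
}
```

A check like this is probably most useful in a distribution's own test suite, where it can flag newly added files before they reach a platform that rejects them.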
If you want to be hypercorrect, stay case-insensitive and within the 8.3 naming convention (all the files and directories have to be unique within one directory if their names are lowercased and truncated to eight characters before the C<.>, if any, and to three characters after the C<.>, if any). (And do not use C<.>s in directory names.)

=head2 System Interaction

Not all platforms provide a command line. These are usually platforms that rely primarily on a Graphical User Interface (GUI) for user interaction. A program requiring a command line interface might not work everywhere. This is probably for the user of the program to deal with, so don't stay up late worrying about it.

Some platforms can't delete or rename files held open by the system; this limitation may also apply to changing filesystem metainformation like file permissions or owners. Remember to L<C<close>|perlfunc/close FILEHANDLE> files when you are done with them. Don't L<C<unlink>|perlfunc/unlink LIST> or L<C<rename>|perlfunc/rename OLDNAME,NEWNAME> an open file. Don't L<C<tie>|perlfunc/tie VARIABLE,CLASSNAME,LIST> or L<C<open>|perlfunc/open FILEHANDLE,MODE,EXPR> a file already tied or opened; L<C<untie>|perlfunc/untie VARIABLE> or L<C<close>|perlfunc/close FILEHANDLE> it first.

Don't open the same file more than once at a time for writing, as some operating systems put mandatory locks on such files.

Don't assume that write/modify permission on a directory gives the right to add or delete files/directories in that directory. That is filesystem specific: in some filesystems you need write/modify permission also (or even just) in the file/directory itself. In some filesystems (AFS, DFS) the permission to add/delete directory entries is a completely separate permission.
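The advice earlier in this section about closing a file before renaming or unlinking it might look like the following in practice. This is a minimal sketch; the file names and the scratch directory are made up for the illustration, and every operation is simply attempted and checked rather than tested for permission beforehand.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec::Functions qw(catfile);

# Scratch directory and file names are illustrative only.
my $dir = tempdir(CLEANUP => 1);
my $old = catfile($dir, 'draft.txt');
my $new = catfile($dir, 'final.txt');

open(my $fh, '>', $old) or die "Cannot create $old: $!";
print {$fh} "some data\n";
close($fh) or die "Cannot close $old: $!";    # close before rename/unlink

rename($old, $new) or die "Cannot rename $old to $new: $!";
my $renamed_ok = (-e $new && !-e $old) ? 1 : 0;

unlink($new) or die "Cannot remove $new: $!";
print "renamed and removed cleanly\n" if $renamed_ok;
```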
Don't assume that a single L<C<unlink>|perlfunc/unlink LIST> completely gets rid of the file: some filesystems (most notably the ones in VMS) have versioned filesystems, and L<C<unlink>|perlfunc/unlink LIST> removes only the most recent one (it doesn't remove all the versions because by default the native tools on those platforms remove just the most recent version, too). The portable idiom to remove all the versions of a file is 1 while unlink "file"; This will terminate if the file is undeletable for some reason (protected, not there, and so on). Don't count on a specific environment variable existing in L<C<%ENV>|perlvar/%ENV>. Don't count on L<C<%ENV>|perlvar/%ENV> entries being case-sensitive, or even case-preserving. Don't try to clear L<C<%ENV>|perlvar/%ENV> by saying C<%ENV = ();>, or, if you really have to, make it conditional on C<$^O ne 'VMS'> since in VMS the L<C<%ENV>|perlvar/%ENV> table is much more than a per-process key-value string table. On VMS, some entries in the L<C<%ENV>|perlvar/%ENV> hash are dynamically created when their key is used on a read if they did not previously exist. The values for C<$ENV{HOME}>, C<$ENV{TERM}>, C<$ENV{PATH}>, and C<$ENV{USER}>, are known to be dynamically generated. The specific names that are dynamically generated may vary with the version of the C library on VMS, and more may exist than are documented. On VMS by default, changes to the L<C<%ENV>|perlvar/%ENV> hash persist after perl exits. Subsequent invocations of perl in the same process can inadvertently inherit environment settings that were meant to be temporary. Don't count on signals or L<C<%SIG>|perlvar/%SIG> for anything. Don't count on filename globbing. Use L<C<opendir>|perlfunc/opendir DIRHANDLE,EXPR>, L<C<readdir>|perlfunc/readdir DIRHANDLE>, and L<C<closedir>|perlfunc/closedir DIRHANDLE> instead. Don't count on per-program environment variables, or per-program current directories. 
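As a sketch of the advice above to read directories directly instead of relying on globbing, the following lists the entries of a directory that match a pattern. The helper name C<list_matching> and the file names are hypothetical, chosen only for this example.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec::Functions qw(catfile);

# Hypothetical helper: return the sorted names in $dir that match $pattern,
# using opendir/readdir/closedir rather than glob().
sub list_matching {
    my ($dir, $pattern) = @_;
    opendir(my $dh, $dir) or die "Cannot open directory $dir: $!";
    my @names = sort grep { $_ =~ $pattern } readdir($dh);
    closedir($dh);
    return @names;
}

# Demonstration in a scratch directory with made-up file names.
my $dir = tempdir(CLEANUP => 1);
for my $name (qw(alpha.pl beta.pm notes.txt)) {
    open(my $fh, '>', catfile($dir, $name)) or die "Cannot create $name: $!";
    close($fh) or die "Cannot close $name: $!";
}

my @found = list_matching($dir, qr/\.p[lm]\z/i);
print "$_\n" for @found;
```

The pattern is matched case-insensitively because, as noted earlier, some filesystems do not preserve case in the names they return.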
Don't count on specific values of L<C<$!>|perlvar/$!>, neither numeric nor especially the string values. Users may switch their locales causing error messages to be translated into their languages. If you can trust a POSIXish environment, you can portably use the symbols defined by the L<C<Errno>|Errno> module, like C<ENOENT>. And don't trust the values of L<C<$!>|perlvar/$!> at all except immediately after a failed system call.

=head2 Command names versus file pathnames

Don't assume that the name used to invoke a command or program with L<C<system>|perlfunc/system LIST> or L<C<exec>|perlfunc/exec LIST> can also be used to test for the existence of the file that holds the executable code for that command or program. First, many systems have "internal" commands that are built into the shell or OS and while these commands can be invoked, there is no corresponding file. Second, some operating systems (e.g., Cygwin, DJGPP, OS/2, and VOS) have required suffixes for executable files; these suffixes are generally permitted on the command name but are not required. Thus, a command like C<perl> might exist in a file named F<perl>, F<perl.exe>, or F<perl.pm>, depending on the operating system. The variable L<C<$Config{_exe}>|Config/C<_exe>> in the L<C<Config>|Config> module holds the executable suffix, if any. Third, the VMS port carefully sets up L<C<$^X>|perlvar/$^X> and L<C<$Config{perlpath}>|Config/C<perlpath>> so that no further processing is required. This is just as well, because the matching regular expression used below would then have to deal with a possible trailing version number in the VMS file name.
To convert L<C<$^X>|perlvar/$^X> to a file pathname, taking account of the requirements of the various operating system possibilities, say:

    use Config;
    my $thisperl = $^X;
    if ($^O ne 'VMS') {
        $thisperl .= $Config{_exe}
            unless $thisperl =~ m/\Q$Config{_exe}\E$/i;
    }

To convert L<C<$Config{perlpath}>|Config/C<perlpath>> to a file pathname, say:

    use Config;
    my $thisperl = $Config{perlpath};
    if ($^O ne 'VMS') {
        $thisperl .= $Config{_exe}
            unless $thisperl =~ m/\Q$Config{_exe}\E$/i;
    }

=head2 Networking

Don't assume that you can reach the public Internet.

Don't assume that there is only one way to get through firewalls to the public Internet.

Don't assume that you can reach the outside world through any port other than 80, or some web proxy. ftp is blocked by many firewalls.

Don't assume that you can send email by connecting to the local SMTP port.

Don't assume that you can reach yourself or any node by the name 'localhost'. The same goes for '127.0.0.1'. You will have to try both.

Don't assume that the host has only one network card, or that it can't bind to many virtual IP addresses.

Don't assume a particular network device name.

Don't assume a particular set of L<C<ioctl>|perlfunc/ioctl FILEHANDLE,FUNCTION,SCALAR>s will work.

Don't assume that you can ping hosts and get replies.

Don't assume that any particular port (service) will respond.

Don't assume that L<C<Sys::Hostname>|Sys::Hostname> (or any other API or command) returns either a fully qualified hostname or a non-qualified hostname: it all depends on how the system had been configured. Also remember that for things such as DHCP and NAT, the hostname you get back might not be very useful.

All the above I<don't>s may look daunting, and they are, but the key is to degrade gracefully if one cannot reach the particular network service one wants. Croaking or hanging do not look very professional.

=head2 Interprocess Communication (IPC)

In general, don't directly access the system in code meant to be portable.
That means, no L<C<system>|perlfunc/system LIST>, L<C<exec>|perlfunc/exec LIST>, L<C<fork>|perlfunc/fork>, L<C<pipe>|perlfunc/pipe READHANDLE,WRITEHANDLE>, L<C<``> or C<qxE<sol>E<sol>>|perlop/C<qxE<sol>I<STRING>E<sol>>>, L<C<open>|perlfunc/open FILEHANDLE,MODE,EXPR> with a C<|>, nor any of the other things that make being a Perl hacker worth being.

Commands that launch external processes are generally supported on most platforms (though many of them do not support any type of forking). The problem with using them arises from what you invoke them on. External tools are often named differently on different platforms, may not be available in the same location, might accept different arguments, can behave differently, and often present their results in a platform-dependent way. Thus, you should seldom depend on them to produce consistent results. (Then again, if you're calling C<netstat -a>, you probably don't expect it to run on both Unix and CP/M.)

One especially common bit of Perl code is opening a pipe to B<sendmail>:

    open(my $mail, '|-', '/usr/lib/sendmail -t')
        or die "cannot fork sendmail: $!";

This is fine for systems programming when sendmail is known to be available. But it is not fine for many non-Unix systems, and even some Unix systems that may not have sendmail installed. If a portable solution is needed, see the various distributions on CPAN that deal with it. L<C<Mail::Mailer>|Mail::Mailer> and L<C<Mail::Send>|Mail::Send> in the C<MailTools> distribution are commonly used, and provide several mailing methods, including C<mail>, C<sendmail>, and direct SMTP (via L<C<Net::SMTP>|Net::SMTP>) if a mail transfer agent is not available. L<C<Mail::Sendmail>|Mail::Sendmail> is a standalone module that provides simple, platform-independent mailing.

The Unix System V IPC (C<msg*(), sem*(), shm*()>) is not available even on all Unix platforms.
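One way to find out whether the System V IPC calls are likely to be present, before relying on them, is to consult the build-time configuration. The sketch below assumes that the C<d_msg>, C<d_sem>, and C<d_shm> entries of C<%Config> reflect what was detected when perl was built; they say nothing certain about a perl that was built on one machine and moved to another.

```perl
use strict;
use warnings;
use Config;

# For each System V IPC family, record whether Configure found it at build
# time: d_msg (message queues), d_sem (semaphores), d_shm (shared memory).
my %sysv_ipc = map {
    my $value = $Config{"d_$_"};
    $_ => (defined $value && $value eq 'define') ? 1 : 0;
} qw(msg sem shm);

for my $family (sort keys %sysv_ipc) {
    printf "%s*(): %s\n", $family,
        $sysv_ipc{$family} ? 'available' : 'not available';
}
```

Code that needs one of these families could then fall back to another mechanism, or skip the relevant tests, when the corresponding entry is false.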
Do not use either the bare result of C<pack("C4", 10, 20, 30, 40)> or bare v-strings (such as C<v10.20.30.40>) to represent IPv4 addresses: both forms just pack the four bytes into network order. That this would be equal to the C language C<in_addr> struct (which is what the socket code internally uses) is not guaranteed. To be portable use the routines of the L<C<Socket>|Socket> module, such as L<C<inet_aton>|Socket/$ip_address = inet_aton $string>, L<C<inet_ntoa>|Socket/$string = inet_ntoa $ip_address>, and L<C<sockaddr_in>|Socket/$sockaddr = sockaddr_in $port, $ip_address>.

The rule of thumb for portable code is: Do it all in portable Perl, or use a module (that may internally implement it with platform-specific code, but exposes a common interface).

=head2 External Subroutines (XS)

XS code can usually be made to work with any platform, but dependent libraries, header files, etc., might not be readily available or portable, or the XS code itself might be platform-specific, just as Perl code might be. If the libraries and headers are portable, then it is normally reasonable to make sure the XS code is portable, too.

A different type of portability issue arises when writing XS code: availability of a C compiler on the end-user's system. C brings with it its own portability issues, and writing XS code will expose you to some of those. Writing purely in Perl is an easier way to achieve portability.

=head2 Standard Modules

In general, the standard modules work across platforms. Notable exceptions are the L<C<CPAN>|CPAN> module (which currently makes connections to external programs that may not be available), platform-specific modules (like L<C<ExtUtils::MM_VMS>|ExtUtils::MM_VMS>), and DBM modules.

There is no one DBM module available on all platforms. L<C<SDBM_File>|SDBM_File> and the others are generally available on all Unix and DOSish ports, but not in MacPerl, where only L<C<NDBM_File>|NDBM_File> and L<C<DB_File>|DB_File> are available.
The good news is that at least some DBM module should be available, and L<C<AnyDBM_File>|AnyDBM_File> will use whichever module it can find. Of course, then the code needs to be fairly strict, dropping to the greatest common factor (e.g., not exceeding 1K for each record), so that it will work with any DBM module. See L<AnyDBM_File> for more details.

=head2 Time and Date

The system's notion of time of day and calendar date is controlled in widely different ways. Don't assume the timezone is stored in C<$ENV{TZ}>, and even if it is, don't assume that you can control the timezone through that variable. Don't assume anything about the three-letter timezone abbreviations (for example, that MST means Mountain Standard Time; it has also been known to stand for Moscow Standard Time). If you need to use timezones, express them in some unambiguous format like the exact number of minutes offset from UTC, or the POSIX timezone format.

Don't assume that the epoch starts at 00:00:00, January 1, 1970, because that is OS- and implementation-specific. It is better to store a date in an unambiguous representation. The ISO 8601 standard defines YYYY-MM-DD as the date format, or YYYY-MM-DDTHH:MM:SS (that's a literal "T" separating the date from the time). Please do use the ISO 8601 instead of making us guess what date 02/03/04 might be. ISO 8601 even sorts nicely as-is. A text representation (like "1987-12-18") can be easily converted into an OS-specific value using a module like L<C<Time::Piece>|Time::Piece> (see L<Time::Piece/Date Parsing>) or L<C<Date::Parse>|Date::Parse>. An array of values, such as those returned by L<C<localtime>|perlfunc/localtime EXPR>, can be converted to an OS-specific representation using L<C<Time::Local>|Time::Local>.

When calculating specific times, such as for tests in time or date modules, it may be appropriate to calculate an offset for the epoch.

    use Time::Local qw(timegm);
    my $offset = timegm(0, 0, 0, 1, 0, 1970);

The value for C<$offset> in Unix will be C<0>, but in Mac OS Classic will be some large number. C<$offset> can then be added to a Unix time value to get what should be the proper value on any system.

=head2 Character sets and character encoding

Assume very little about character sets.

Assume nothing about numerical values (L<C<ord>|perlfunc/ord EXPR>, L<C<chr>|perlfunc/chr NUMBER>) of characters. Do not use explicit code point ranges (like C<\xHH-\xHH>). However, starting in Perl v5.22, regular expression pattern bracketed character class ranges specified like C<qr/[\N{U+HH}-\N{U+HH}]/> are portable, and starting in Perl v5.24, the same ranges are portable in L<C<trE<sol>E<sol>E<sol>>|perlop/C<trE<sol>I<SEARCHLIST>E<sol>I<REPLACEMENTLIST>E<sol>cdsr>>. You can portably use symbolic character classes like C<[:print:]>.

Do not assume that the alphabetic characters are encoded contiguously (in the numeric sense). There may be gaps. Special coding in Perl, however, guarantees that all subsets of C<qr/[A-Z]/>, C<qr/[a-z]/>, and C<qr/[0-9]/> behave as expected. L<C<trE<sol>E<sol>E<sol>>|perlop/C<trE<sol>I<SEARCHLIST>E<sol>I<REPLACEMENTLIST>E<sol>cdsr>> behaves the same for these ranges. In patterns, any range specified with end points using the C<\N{...}> notation ensures character set portability, but it is a bug in Perl v5.22 that this isn't true of L<C<trE<sol>E<sol>E<sol>>|perlop/C<trE<sol>I<SEARCHLIST>E<sol>I<REPLACEMENTLIST>E<sol>cdsr>>, fixed in v5.24.

Do not assume anything about the ordering of the characters. The lowercase letters may come before or after the uppercase letters; the lowercase and uppercase may be interlaced so that both "a" and "A" come before "b"; the accented and other international characters may be interlaced so that E<auml> comes before "b". L<Unicode::Collate> can be used to sort this all out.
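To make the ordering point concrete, here is a small sketch comparing Perl's built-in C<sort> with L<Unicode::Collate> using its default settings. The word list is made up for the example, and no locale-specific tailoring is applied.

```perl
use strict;
use warnings;
use Unicode::Collate;

# Sample data for illustration; \N{U+E4} is LATIN SMALL LETTER A WITH DIAERESIS.
my @words = ('b', 'A', "\N{U+E4}", 'B', 'a');

my @by_codepoint = sort @words;                            # platform-dependent order
my @collated     = Unicode::Collate->new->sort(@words);    # default Unicode collation

binmode(STDOUT, ':encoding(UTF-8)');
print "built-in sort: @by_codepoint\n";
print "collated:      @collated\n";
```

With the default collation, every form of "a" (plain, capital, or accented) sorts ahead of either form of "b", whatever the underlying character set does with the built-in C<sort>.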
=head2 Internationalisation

If you may assume POSIX (a rather large assumption), you may read more about the POSIX locale system from L<perllocale>. The locale system at least attempts to make things a little bit more portable, or at least more convenient and native-friendly for non-English users. The system affects character sets and encoding, and date and time formatting--amongst other things.

If you really want to be international, you should consider Unicode. See L<perluniintro> and L<perlunicode> for more information.

By default Perl assumes your source code is written in an 8-bit ASCII superset. To embed Unicode characters in your strings and regexes, you can use the L<C<\x{HH}> or (more portably) C<\N{U+HH}> notations|perlop/Quote and Quote-like Operators>. You can also use the L<C<utf8>|utf8> pragma and write your code in UTF-8, which lets you use Unicode characters directly (not just in quoted constructs but also in identifiers).

=head2 System Resources

If your code is destined for systems with severely constrained (or missing!) virtual memory systems then you want to be I<especially> mindful of avoiding wasteful constructs such as:

    my @lines = <$very_large_file>;            # bad

    while (<$fh>) {$file .= $_}                # sometimes bad
    my $file = join('', <$fh>);                # better

The last two constructs may appear unintuitive to most people. The first repeatedly grows a string, whereas the second allocates a large chunk of memory in one go. On some systems, the second is more efficient than the first.

=head2 Security

Most multi-user platforms provide basic levels of security, usually implemented at the filesystem level. Some, however, unfortunately do not. Thus the notion of user id, or "home" directory, or even the state of being logged-in, may be unrecognizable on many platforms. If you write programs that are security-conscious, it is usually best to know what type of system you will be running under so that you can write code explicitly for that platform (or class of platforms).
Don't assume the Unix filesystem access semantics: the operating system or the filesystem may be using some ACL systems, which are richer languages than the usual C<rwx>. Even if the C<rwx> exist, their semantics might be different.

(From the security viewpoint, testing for permissions before attempting to do something is silly anyway: if one tries this, there is potential for race conditions. Someone or something might change the permissions between the permissions check and the actual operation. Just try the operation.)

Don't assume the Unix user and group semantics: especially, don't expect L<C<< $< >>|perlvar/$E<lt>> and L<C<< $> >>|perlvar/$E<gt>> (or L<C<$(>|perlvar/$(> and L<C<$)>|perlvar/$)>) to work for switching identities (or memberships).

Don't assume set-uid and set-gid semantics. (And even if you do, think twice: set-uid and set-gid are a known can of security worms.)

=head2 Style

For those times when it is necessary to have platform-specific code, consider keeping the platform-specific code in one place, making porting to other platforms easier. Use the L<C<Config>|Config> module and the special variable L<C<$^O>|perlvar/$^O> to differentiate platforms, as described in L</"PLATFORMS">.

Beware of the "else syndrome":

    if ($^O eq 'MSWin32') {
        # code that assumes Windows
    } else {
        # code that assumes Linux
    }

The C<else> branch should be used for the really ultimate fallback, not for code specific to some platform.

Be careful in the tests you supply with your module or programs. Module code may be fully portable, but its tests might not be. This often happens when tests spawn off other processes or call external programs to aid in the testing, or when (as noted above) the tests assume certain things about the filesystem and paths. Be careful not to depend on a specific output style for errors, such as when checking L<C<$!>|perlvar/$!> after a failed system call.
Using L<C<$!>|perlvar/$!> for anything other than displaying it as output is doubtful (though see the L<C<Errno>|Errno> module for testing reasonably portably for error value). Some platforms expect a certain output format, and Perl on those platforms may have been adjusted accordingly. Most specifically, don't anchor a regex when testing an error value.

=head1 CPAN Testers

Modules uploaded to CPAN are tested by a variety of volunteers on different platforms. These CPAN testers are notified by mail of each new upload, and reply to the list with PASS, FAIL, NA (not applicable to this platform), or UNKNOWN (unknown), along with any relevant notations.

The purpose of the testing is twofold: one, to help developers fix any problems in their code that crop up because of lack of testing on other platforms; two, to provide users with information about whether a given module works on a given platform.

Also see:

=over 4

=item *

Mailing list: cpan-testers-discuss@perl.org

=item *

Testing results: L<https://www.cpantesters.org/>

=back

=head1 PLATFORMS

Perl is built with a L<C<$^O>|perlvar/$^O> variable that indicates the operating system it was built on. This was implemented to help speed up code that would otherwise have to C<use Config> and use the value of L<C<$Config{osname}>|Config/C<osname>>. Of course, to get more detailed information about the system, looking into L<C<%Config>|Config/DESCRIPTION> is certainly recommended.

L<C<%Config>|Config/DESCRIPTION> cannot always be trusted, however, because it was built at compile time. If perl was built in one place, then transferred elsewhere, some values may be wrong. The values may even have been edited after the fact.

=head2 Unix

Perl works on a bewildering variety of Unix and Unix-like platforms (see e.g. most of the files in the F<hints/> directory in the source code kit).
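As a quick illustration of the two sources of platform information described under L</"PLATFORMS">, the following sketch prints both and then picks a label from a small lookup table. The table and its labels are hypothetical and deliberately incomplete; the fallback is an explicit "unrecognised" rather than a guess at some particular platform.

```perl
use strict;
use warnings;
use Config;

# $^O needs no module; %Config requires "use Config" but carries more
# detail, all of it recorded when perl was built.
print "\$^O               = $^O\n";
print "\$Config{osname}   = $Config{osname}\n";
print "\$Config{archname} = $Config{archname}\n";

# Hypothetical, incomplete lookup table keyed on $^O.
my %label_for = (
    MSWin32 => 'Windows',
    cygwin  => 'Cygwin',
    VMS     => 'VMS',
    vos     => 'VOS',
    linux   => 'Linux',
    darwin  => 'Darwin',
);
my $label = exists $label_for{$^O} ? $label_for{$^O} : 'unrecognised';
print "platform label: $label\n";
```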
On most of these systems, the value of L<C<$^O>|perlvar/$^O> (hence L<C<$Config{osname}>|Config/C<osname>>, too) is determined either by lowercasing and stripping punctuation from the first field of the string returned by typing C<uname -a> (or a similar command) at the shell prompt or by testing the file system for the presence of uniquely named files such as a kernel or header file. Here, for example, are a few of the more popular Unix flavors:

    uname         $^O        $Config{archname}
    --------------------------------------------
    AIX           aix        aix
    BSD/OS        bsdos      i386-bsdos
    Darwin        darwin     darwin
    DYNIX/ptx     dynixptx   i386-dynixptx
    FreeBSD       freebsd    freebsd-i386
    Haiku         haiku      BePC-haiku
    Linux         linux      arm-linux
    Linux         linux      armv5tel-linux
    Linux         linux      i386-linux
    Linux         linux      i586-linux
    Linux         linux      ppc-linux
    HP-UX         hpux       PA-RISC1.1
    IRIX          irix       irix
    Mac OS X      darwin     darwin
    NeXT 3        next       next-fat
    NeXT 4        next       OPENSTEP-Mach
    openbsd       openbsd    i386-openbsd
    OSF1          dec_osf    alpha-dec_osf
    reliantunix-n svr4       RM400-svr4
    SCO_SV        sco_sv     i386-sco_sv
    SINIX-N       svr4       RM400-svr4
    sn4609        unicos     CRAY_C90-unicos
    sn6521        unicosmk   t3e-unicosmk
    sn9617        unicos     CRAY_J90-unicos
    SunOS         solaris    sun4-solaris
    SunOS         solaris    i86pc-solaris
    SunOS4        sunos      sun4-sunos

Because the value of L<C<$Config{archname}>|Config/C<archname>> may depend on the hardware architecture, it can vary more than the value of L<C<$^O>|perlvar/$^O>.

=head2 DOS and Derivatives

Perl has long been ported to Intel-style microcomputers running under systems like PC-DOS, MS-DOS, OS/2, and most Windows platforms you can bring yourself to mention (except for Windows CE, if you count that). Users familiar with I<COMMAND.COM> or I<CMD.EXE> style shells should be aware that each of these file specifications may have subtle differences:

    my $filespec0 = "c:/foo/bar/file.txt";
    my $filespec1 = "c:\\foo\\bar\\file.txt";
    my $filespec2 = 'c:\foo\bar\file.txt';
    my $filespec3 = 'c:\\foo\\bar\\file.txt';

System calls accept either C</> or C<\> as the path separator.
However, many command-line utilities of DOS vintage treat C</> as the option prefix, so may get confused by filenames containing C</>. Aside from calling any external programs, C</> will work just fine, and probably better, as it is more consistent with popular usage, and avoids the problem of remembering what to backwhack and what not to.

The DOS FAT filesystem can accommodate only "8.3" style filenames. Under the "case-insensitive, but case-preserving" HPFS (OS/2) and NTFS (NT) filesystems you may have to be careful about case returned with functions like L<C<readdir>|perlfunc/readdir DIRHANDLE> or used with functions like L<C<open>|perlfunc/open FILEHANDLE,MODE,EXPR> or L<C<opendir>|perlfunc/opendir DIRHANDLE,EXPR>.

DOS also treats several filenames as special, such as F<AUX>, F<PRN>, F<NUL>, F<CON>, F<COM1>, F<LPT1>, F<LPT2>, etc. Unfortunately, sometimes these filenames won't even work if you include an explicit directory prefix. It is best to avoid such filenames, if you want your code to be portable to DOS and its derivatives. It's hard to know what these all are, unfortunately.

Users of these operating systems may also wish to make use of scripts such as F<pl2bat.bat> to put wrappers around your scripts.

Newline (C<\n>) is translated as C<\015\012> by the I/O system when reading from and writing to files (see L</"Newlines">). C<binmode($filehandle)> will keep C<\n> translated as C<\012> for that filehandle. L<C<binmode>|perlfunc/binmode FILEHANDLE> should always be used for code that deals with binary data. That's assuming you realize in advance that your data is in binary. General-purpose programs should often assume nothing about their data.

The L<C<$^O>|perlvar/$^O> variable and the L<C<$Config{archname}>|Config/C<archname>> values for various DOSish perls are as follows:

    OS             $^O       $Config{archname}  ID    Version
    ---------------------------------------------------------
    MS-DOS         dos       ?
    PC-DOS         dos       ?
    OS/2           os2       ?
    Windows 3.1    ?         ?                  0     3 01
    Windows 95     MSWin32   MSWin32-x86        1     4 00
    Windows 98     MSWin32   MSWin32-x86        1     4 10
    Windows ME     MSWin32   MSWin32-x86        1     ?
    Windows NT     MSWin32   MSWin32-x86        2     4 xx
    Windows NT     MSWin32   MSWin32-ALPHA      2     4 xx
    Windows NT     MSWin32   MSWin32-ppc        2     4 xx
    Windows 2000   MSWin32   MSWin32-x86        2     5 00
    Windows XP     MSWin32   MSWin32-x86        2     5 01
    Windows 2003   MSWin32   MSWin32-x86        2     5 02
    Windows Vista  MSWin32   MSWin32-x86        2     6 00
    Windows 7      MSWin32   MSWin32-x86        2     6 01
    Windows 7      MSWin32   MSWin32-x64        2     6 01
    Windows 2008   MSWin32   MSWin32-x86        2     6 01
    Windows 2008   MSWin32   MSWin32-x64        2     6 01
    Windows CE     MSWin32   ?                  3
    Cygwin         cygwin    cygwin

The various MSWin32 Perls can distinguish the OS they are running on via the value of the fifth element of the list returned from L<C<Win32::GetOSVersion()>|Win32/Win32::GetOSVersion()>. For example:

    if ($^O eq 'MSWin32') {
        my @os_version_info = Win32::GetOSVersion();
        print +('3.1','95','NT')[$os_version_info[4]],"\n";
    }

There are also L<C<Win32::IsWinNT()>|Win32/Win32::IsWinNT()>, L<C<Win32::IsWin95()>|Win32/Win32::IsWin95()>, and L<C<Win32::GetOSName()>|Win32/Win32::GetOSName()>; try L<C<perldoc Win32>|Win32>. The very portable L<C<POSIX::uname()>|POSIX/C<uname>> will work too:

    c:\> perl -MPOSIX -we "print join '|', uname"
    Windows NT|moonru|5.0|Build 2195 (Service Pack 2)|x86

Errors set by Winsock functions are now put directly into C<$^E>, and the relevant C<WSAE*> error codes are now exported from the L<Errno> and L<POSIX> modules for testing this against.

The previous behavior of putting the errors (converted to POSIX-style C<E*> error codes since Perl 5.20.0) into C<$!> was buggy due to the non-equivalence of like-named Winsock and POSIX error constants, a relationship between which has unfortunately been established in one way or another since Perl 5.8.0.

The new behavior provides a much more robust solution for checking Winsock errors in portable software without accidentally matching POSIX tests that were intended for other OSes and may have different meanings for Winsock.
The old behavior is currently retained, warts and all, for backwards compatibility, but users are encouraged to change any code that tests C<$!> against C<E*> constants for Winsock errors to instead test C<$^E> against C<WSAE*> constants. After a suitable deprecation period, which started with Perl 5.24, the old behavior may be removed, leaving C<$!> unchanged after Winsock function calls, to avoid any possible confusion over which error variable to check. Also see: =over 4 =item * The djgpp environment for DOS, L<http://www.delorie.com/djgpp/> and L<perldos>. =item * The EMX environment for DOS, OS/2, etc. emx@iaehv.nl, L<ftp://hobbes.nmsu.edu/pub/os2/dev/emx/> Also L<perlos2>. =item * Build instructions for Win32 in L<perlwin32>, or under the Cygnus environment in L<perlcygwin>. =item * The C<Win32::*> modules in L<Win32>. =item * The ActiveState Pages, L<https://www.activestate.com/> =item * The Cygwin environment for Win32; F<README.cygwin> (installed as L<perlcygwin>), L<https://www.cygwin.com/> =item * The U/WIN environment for Win32, L<http://www.research.att.com/sw/tools/uwin/> =item * Build instructions for OS/2, L<perlos2> =back =head2 VMS Perl on VMS is discussed in L<perlvms> in the Perl distribution. The official name of VMS as of this writing is OpenVMS. Interacting with Perl from the Digital Command Language (DCL) shell often requires a different set of quotation marks than Unix shells do. For example: $ perl -e "print ""Hello, world.\n""" Hello, world. There are several ways to wrap your Perl scripts in DCL F<.COM> files, if you are so inclined. For example: $ write sys$output "Hello from DCL!" $ if p1 .eqs. "" $ then perl -x 'f$environment("PROCEDURE") $ else perl -x - 'p1 'p2 'p3 'p4 'p5 'p6 'p7 'p8 $ deck/dollars="__END__" #!/usr/bin/perl print "Hello from Perl!\n"; __END__ $ endif Do take care with C<$ ASSIGN/nolog/user SYS$COMMAND: SYS$INPUT> if your Perl-in-DCL script expects to do things like C<< $read = <STDIN>; >>. 
The VMS operating system has two filesystems, designated by their on-disk structure (ODS) level: ODS-2 and its successor ODS-5. The initial port of Perl to VMS pre-dates ODS-5, but all current testing and development assumes ODS-5 and its capabilities, including case preservation, extended characters in filespecs, and names up to 8192 bytes long. Perl on VMS can accept either VMS- or Unix-style file specifications as in either of the following: $ perl -ne "print if /perl_setup/i" SYS$LOGIN:LOGIN.COM $ perl -ne "print if /perl_setup/i" /sys$login/login.com but not a mixture of both as in: $ perl -ne "print if /perl_setup/i" sys$login:/login.com Can't open sys$login:/login.com: file specification syntax error In general, the easiest path to portability is always to specify filenames in Unix format unless they will need to be processed by native commands or utilities. Because of this latter consideration, the L<File::Spec> module by default returns native format specifications regardless of input format. This default may be reversed so that filenames are always reported in Unix format by specifying the C<DECC$FILENAME_UNIX_REPORT> feature logical in the environment. The file type, or extension, is always present in a VMS-format file specification even if it's zero-length. This means that, by default, L<C<readdir>|perlfunc/readdir DIRHANDLE> will return a trailing dot on a file with no extension, so where you would see C<"a"> on Unix you'll see C<"a."> on VMS. However, the trailing dot may be suppressed by enabling the C<DECC$READDIR_DROPDOTNOTYPE> feature in the environment (see the CRTL documentation on feature logical names). What C<\n> represents depends on the type of file opened. It usually represents C<\012> but it could also be C<\015>, C<\012>, C<\015\012>, C<\000>, C<\040>, or nothing depending on the file organization and record format. 
The L<C<VMS::Stdio>|VMS::Stdio> module provides access to the special C<fopen()> requirements of files with unusual attributes on VMS. The value of L<C<$^O>|perlvar/$^O> on OpenVMS is "VMS". To determine the architecture that you are running on refer to L<C<$Config{archname}>|Config/C<archname>>. On VMS, perl determines the UTC offset from the C<SYS$TIMEZONE_DIFFERENTIAL> logical name. Although the VMS epoch began at 17-NOV-1858 00:00:00.00, calls to L<C<localtime>|perlfunc/localtime EXPR> are adjusted to count offsets from 01-JAN-1970 00:00:00.00, just like Unix. Also see: =over 4 =item * F<README.vms> (installed as F<README_vms>), L<perlvms> =item * vmsperl list, vmsperl-subscribe@perl.org =item * vmsperl on the web, L<http://www.sidhe.org/vmsperl/index.html> =item * VMS Software Inc. web site, L<http://www.vmssoftware.com> =back =head2 VOS Perl on VOS (also known as OpenVOS) is discussed in F<README.vos> in the Perl distribution (installed as L<perlvos>). Perl on VOS can accept either VOS- or Unix-style file specifications as in either of the following: $ perl -ne "print if /perl_setup/i" >system>notices $ perl -ne "print if /perl_setup/i" /system/notices or even a mixture of both as in: $ perl -ne "print if /perl_setup/i" >system/notices Even though VOS allows the slash character to appear in object names, because the VOS port of Perl interprets it as a pathname delimiting character, VOS files, directories, or links whose names contain a slash character cannot be processed. Such files must be renamed before they can be processed by Perl. Older releases of VOS (prior to OpenVOS Release 17.0) limit file names to 32 or fewer characters, prohibit file names from starting with a C<-> character, and prohibit file names from containing C< > (space) or any character from the set C<< !#%&'()*;<=>? >>. Newer releases of VOS (OpenVOS Release 17.0 or later) support a feature known as extended names. 
On these releases, file names can contain up to 255 characters, are prohibited from starting with a C<-> character, and the set of prohibited characters is reduced to C<< #%*<>? >>. There are restrictions involving spaces and apostrophes: these characters must not begin or end a name, nor can they immediately precede or follow a period. Additionally, a space must not immediately precede another space or hyphen. Specifically, the following character combinations are prohibited: space-space, space-hyphen, period-space, space-period, period-apostrophe, apostrophe-period, leading or trailing space, and leading or trailing apostrophe. Although an extended file name is limited to 255 characters, a path name is still limited to 256 characters. The value of L<C<$^O>|perlvar/$^O> on VOS is "vos". To determine the architecture that you are running on refer to L<C<$Config{archname}>|Config/C<archname>>. Also see: =over 4 =item * F<README.vos> (installed as L<perlvos>) =item * The VOS mailing list. There is no specific mailing list for Perl on VOS. You can contact the Stratus Technologies Customer Assistance Center (CAC) for your region, or you can use the contact information located in the distribution files on the Stratus Anonymous FTP site. =item * Stratus Technologies on the web at L<http://www.stratus.com> =item * VOS Open-Source Software on the web at L<http://ftp.stratus.com/pub/vos/vos.html> =back =head2 EBCDIC Platforms v5.22 core Perl runs on z/OS (formerly OS/390). Theoretically it could run on the successors of OS/400 on AS/400 minicomputers as well as VM/ESA, and BS2000 for S/390 Mainframes. Such computers use EBCDIC character sets internally (usually Character Code Set ID 0037 for OS/400 and either 1047 or POSIX-BC for S/390 systems). The rest of this section may need updating, but we don't know what it should say. Please submit comments to L<https://github.com/Perl/perl5/issues>. 
On the mainframe, Perl currently works under the "Unix system services for OS/390" (formerly known as OpenEdition), VM/ESA OpenEdition, or the BS2000 POSIX-BC system (BS2000 is supported in Perl 5.6 and greater). See L<perlos390> for details. Note that for OS/400 there is also a port of Perl 5.8.1/5.10.0 or later to the PASE which is ASCII-based (as opposed to ILE which is EBCDIC-based), see L<perlos400>.

As of R2.5 of USS for OS/390 and Version 2.3 of VM/ESA these Unix sub-systems do not support the C<#!> shebang trick for script invocation. Hence, on OS/390 and VM/ESA Perl scripts can be executed with a header similar to the following simple script:

    : # use perl
        eval 'exec /usr/local/bin/perl -S $0 ${1+"$@"}'
            if 0;
    #!/usr/local/bin/perl     # just a comment really

    print "Hello from perl!\n";

OS/390 will support the C<#!> shebang trick in release 2.8 and beyond. Calls to L<C<system>|perlfunc/system LIST> and backticks can use POSIX shell syntax on all S/390 systems.

On the AS/400, if PERL5 is in your library list, you may need to wrap your Perl scripts in a CL procedure to invoke them like so:

    BEGIN
      CALL PGM(PERL5/PERL) PARM('/QOpenSys/hello.pl')
    ENDPGM

This will invoke the Perl script F<hello.pl> in the root of the QOpenSys file system. On the AS/400, calls to L<C<system>|perlfunc/system LIST> or backticks must use CL syntax.

On these platforms, bear in mind that the EBCDIC character set may have an effect on what happens with some Perl functions (such as L<C<chr>|perlfunc/chr NUMBER>, L<C<pack>|perlfunc/pack TEMPLATE,LIST>, L<C<print>|perlfunc/print FILEHANDLE LIST>, L<C<printf>|perlfunc/printf FILEHANDLE FORMAT, LIST>, L<C<ord>|perlfunc/ord EXPR>, L<C<sort>|perlfunc/sort SUBNAME LIST>, L<C<sprintf>|perlfunc/sprintf FORMAT, LIST>, L<C<unpack>|perlfunc/unpack TEMPLATE,EXPR>), as well as bit-fiddling with ASCII constants using operators like L<C<^>, C<&> and C<|>|perlop/Bitwise String Operators>, not to mention dealing with socket interfaces to ASCII computers (see L</"Newlines">).

Fortunately, most web servers for the mainframe will correctly translate the C<\n> in the following statement to its ASCII equivalent (C<\r> is the same under both Unix and z/OS):

    print "Content-type: text/html\r\n\r\n";

The values of L<C<$^O>|perlvar/$^O> on some of these platforms include:

    uname         $^O        $Config{archname}
    --------------------------------------------
    OS/390        os390      os390
    OS400         os400      os400
    POSIX-BC      posix-bc   BS2000-posix-bc

Some simple tricks for determining if you are running on an EBCDIC platform could include any of the following (perhaps all):

    if ("\t" eq "\005")  { print "EBCDIC may be spoken here!\n"; }

    if (ord('A') == 193) { print "EBCDIC may be spoken here!\n"; }

    if (chr(169) eq 'z') { print "EBCDIC may be spoken here!\n"; }

One thing you may not want to rely on is the EBCDIC encoding of punctuation characters since these may differ from code page to code page (and once your module or script is rumoured to work with EBCDIC, folks will want it to work with all EBCDIC character sets).

Also see:

=over 4

=item *

L<perlos390>, L<perlos400>, L<perlbs2000>, L<perlebcdic>.

=item *

The perl-mvs@perl.org list is for discussion of porting issues as well as general usage issues for all EBCDIC Perls. Send a message body of "subscribe perl-mvs" to majordomo@perl.org.

=item *

AS/400 Perl information at L<http://as400.rochester.ibm.com/> as well as on CPAN in the F<ports/> directory.

=back

=head2 Acorn RISC OS

Because Acorns use ASCII with newlines (C<\n>) in text files as C<\012> like Unix, and because Unix filename emulation is turned on by default, most simple scripts will probably work "out of the box". The native filesystem is modular, and individual filesystems are free to be case-sensitive or insensitive, and are usually case-preserving. Some native filesystems have name length limits, to which file and directory names are silently truncated. Scripts should be aware that the standard filesystem currently has a name length limit of B<10> characters, with up to 77 items in a directory, but other filesystems may not impose such limitations.

Native filenames are of the form

    Filesystem#Special_Field::DiskName.$.Directory.Directory.File

where

    Special_Field is not usually present, but may contain . and $ .
    Filesystem =~ m|[A-Za-z0-9_]|
    DiskName   =~ m|[A-Za-z0-9_/]|
    $ represents the root directory
    . is the path separator
    @ is the current directory (per filesystem but machine global)
    ^ is the parent directory
    Directory and File =~ m|[^\0- "\.\$\%\&:\@\\^\|\177]+|

The default filename translation is roughly C<tr|/.|./|>, swapping dots and slashes.

Note that C<"ADFS::HardDisk.$.File" ne 'ADFS::HardDisk.$.File'> and that the second stage of C<$> interpolation in regular expressions will fall foul of the L<C<$.>|perlvar/$.> variable if scripts are not careful.

Logical paths specified by system variables containing comma-separated search lists are also allowed; hence C<System:Modules> is a valid filename, and the filesystem will prefix C<Modules> with each section of C<System$Path> until a name is made that points to an object on disk. Writing to a new file C<System:Modules> would be allowed only if C<System$Path> contains a single item list.
The filesystem will also expand system variables in filenames if enclosed in angle brackets, so C<< <System$Dir>.Modules >> would look for the file S<C<$ENV{'System$Dir'} . 'Modules'>>. The obvious implication of this is that B<fully qualified filenames can start with C<< <> >>> and the three-argument form of L<C<open>|perlfunc/open FILEHANDLE,MODE,EXPR> should always be used.

Because C<.> was in use as a directory separator and filenames could not be assumed to be unique after 10 characters, Acorn implemented the C compiler to strip the trailing C<.c> C<.h> C<.s> and C<.o> suffix from filenames specified in source code and store the respective files in subdirectories named after the suffix. Hence files are translated:

    foo.h           h.foo
    C:foo.h         C:h.foo        (logical path variable)
    sys/os.h        sys.h.os       (C compiler groks Unix-speak)
    10charname.c    c.10charname
    10charname.o    o.10charname
    11charname_.c   c.11charname   (assuming filesystem truncates at 10)

The Unix emulation library's translation of filenames to native assumes that this sort of translation is required, and it allows a user-defined list of known suffixes that it will transpose in this fashion. This may seem transparent, but consider that with these rules F<foo/bar/baz.h> and F<foo/bar/h/baz> both map to F<foo.bar.h.baz>, and that L<C<readdir>|perlfunc/readdir DIRHANDLE> and L<C<glob>|perlfunc/glob EXPR> cannot and do not attempt to emulate the reverse mapping. Other C<.>'s in filenames are translated to C</>.

As implied above, the environment accessed through L<C<%ENV>|perlvar/%ENV> is global, and the convention is that program specific environment variables are of the form C<Program$Name>. Each filesystem maintains a current directory, and the current filesystem's current directory is the B<global> current directory.
Consequently, sociable programs don't change the current directory but rely on full pathnames, and programs (and Makefiles) cannot assume that they can spawn a child process which can change the current directory without affecting its parent (and everyone else for that matter). Because native operating system filehandles are global and are currently allocated down from 255, with 0 being a reserved value, the Unix emulation library emulates Unix filehandles. Consequently, you can't rely on passing C<STDIN>, C<STDOUT>, or C<STDERR> to your children. The desire of users to express filenames of the form C<< <Foo$Dir>.Bar >> on the command line unquoted causes problems, too: L<C<``>|perlop/C<qxE<sol>I<STRING>E<sol>>> command output capture has to perform a guessing game. It assumes that a string C<< <[^<>]+\$[^<>]> >> is a reference to an environment variable, whereas anything else involving C<< < >> or C<< > >> is redirection, and generally manages to be 99% right. Of course, the problem remains that scripts cannot rely on any Unix tools being available, or that any tools found have Unix-like command line arguments. Extensions and XS are, in theory, buildable by anyone using free tools. In practice, many don't, as users of the Acorn platform are used to binary distributions. MakeMaker does run, but no available make currently copes with MakeMaker's makefiles; even if and when this should be fixed, the lack of a Unix-like shell will cause problems with makefile rules, especially lines of the form C<cd sdbm && make all>, and anything using quoting. S<"RISC OS"> is the proper name for the operating system, but the value in L<C<$^O>|perlvar/$^O> is "riscos" (because we don't like shouting). =head2 Other perls Perl has been ported to many platforms that do not fit into any of the categories listed above. Some, such as AmigaOS, QNX, Plan 9, and VOS, have been well-integrated into the standard Perl source code kit. 
You may need to see the F<ports/> directory on CPAN for information, and possibly binaries, for the likes of: aos, Atari ST, lynxos, riscos, Novell Netware, Tandem Guardian, I<etc.> (Yes, we know that some of these OSes may fall under the Unix category, but we are not a standards body.) Some approximate operating system names and their L<C<$^O>|perlvar/$^O> values in the "OTHER" category include: OS $^O $Config{archname} ------------------------------------------ Amiga DOS amigaos m68k-amigos See also: =over 4 =item * Amiga, F<README.amiga> (installed as L<perlamiga>). =item * A free perl5-based PERL.NLM for Novell Netware is available in precompiled binary and source code form from L<http://www.novell.com/> as well as from CPAN. =item * S<Plan 9>, F<README.plan9> =back =head1 FUNCTION IMPLEMENTATIONS Listed below are functions that are either completely unimplemented or else have been implemented differently on various platforms. Preceding each description will be, in parentheses, a list of platforms that the description applies to. The list may well be incomplete, or even wrong in some places. When in doubt, consult the platform-specific README files in the Perl source distribution, and any other documentation resources accompanying a given port. Be aware, moreover, that even among Unix-ish systems there are variations. For many functions, you can also query L<C<%Config>|Config/DESCRIPTION>, exported by default from the L<C<Config>|Config> module. For example, to check whether the platform has the L<C<lstat>|perlfunc/lstat FILEHANDLE> call, check L<C<$Config{d_lstat}>|Config/C<d_lstat>>. See L<Config> for a full description of available variables. =head2 Alphabetical Listing of Perl Functions =over 8 =item -X (Win32) C<-w> only inspects the read-only file attribute (FILE_ATTRIBUTE_READONLY), which determines whether the directory can be deleted, not whether it can be written to. 
Directories always have read and write access unless denied by discretionary access control lists (DACLs). (VMS) C<-r>, C<-w>, C<-x>, and C<-o> tell whether the file is accessible, which may not reflect UIC-based file protections. (S<RISC OS>) C<-s> by name on an open file will return the space reserved on disk, rather than the current extent. C<-s> on an open filehandle returns the current size. (Win32, VMS, S<RISC OS>) C<-R>, C<-W>, C<-X>, C<-O> are indistinguishable from C<-r>, C<-w>, C<-x>, C<-o>. (Win32, VMS, S<RISC OS>) C<-g>, C<-k>, C<-l>, C<-u>, C<-A> are not particularly meaningful. (VMS, S<RISC OS>) C<-p> is not particularly meaningful. (VMS) C<-d> is true if passed a device spec without an explicit directory. (Win32) C<-x> (or C<-X>) determine if a file ends in one of the executable suffixes. C<-S> is meaningless. (S<RISC OS>) C<-x> (or C<-X>) determine if a file has an executable file type. =item alarm (Win32) Emulated using timers that must be explicitly polled whenever Perl wants to dispatch "safe signals" and therefore cannot interrupt blocking system calls. =item atan2 (Tru64, HP-UX 10.20) Due to issues with various CPUs, math libraries, compilers, and standards, results for C<atan2> may vary depending on any combination of the above. Perl attempts to conform to the Open Group/IEEE standards for the results returned from C<atan2>, but cannot force the issue if the system Perl is run on does not allow it. The current version of the standards for C<atan2> is available at L<http://www.opengroup.org/onlinepubs/009695399/functions/atan2.html>. =item binmode (S<RISC OS>) Meaningless. (VMS) Reopens file and restores pointer; if function fails, underlying filehandle may be closed, or pointer may be in a different position. (Win32) The value returned by L<C<tell>|perlfunc/tell FILEHANDLE> may be affected after the call, and the filehandle may be flushed. 
=item chmod (Win32) Only good for changing "owner" read-write access; "group" and "other" bits are meaningless. (S<RISC OS>) Only good for changing "owner" and "other" read-write access. (VOS) Access permissions are mapped onto VOS access-control list changes. (Cygwin) The actual permissions set depend on the value of the C<CYGWIN> variable in the SYSTEM environment settings. (Android) Setting the exec bit on some locations (generally F</sdcard>) will return true but not actually set the bit. (VMS) A mode argument of zero sets permissions to the user's default permission mask rather than disabling all permissions. =item chown (S<Plan 9>, S<RISC OS>) Not implemented. (Win32) Does nothing, but won't fail. (VOS) A little funky, because VOS's notion of ownership is a little funky. =item chroot (Win32, VMS, S<Plan 9>, S<RISC OS>, VOS) Not implemented. =item crypt (Win32) May not be available if library or source was not provided when building perl. (Android) Not implemented. =item dbmclose (VMS, S<Plan 9>, VOS) Not implemented. =item dbmopen (VMS, S<Plan 9>, VOS) Not implemented. =item dump (S<RISC OS>) Not useful. (Cygwin, Win32) Not supported. (VMS) Invokes VMS debugger. =item exec (Win32) C<exec LIST> without the use of indirect object syntax (C<exec PROGRAM LIST>) may fall back to trying the shell if the first C<spawn()> fails. Note that the list form of exec() is emulated since the Win32 API CreateProcess() accepts a simple string rather than an array of command-line arguments. This may have security implications for your code. (SunOS, Solaris, HP-UX) Does not automatically flush output handles on some platforms. (Symbian OS) Not supported. =item exit (VMS) Emulates Unix C<exit> (which considers C<exit 1> to indicate an error) by mapping the C<1> to C<SS$_ABORT> (C<44>). This behavior may be overridden with the pragma L<C<use vmsish 'exit'>|vmsish/C<vmsish exit>>. 
As with the CRTL's C<exit()> function, C<exit 0> is also mapped to an exit status of C<SS$_NORMAL> (C<1>); this mapping cannot be overridden. Any other argument to C<exit> is used directly as Perl's exit status. On VMS, unless the future POSIX_EXIT mode is enabled, the exit code should always be a valid VMS exit code and not a generic number. When the POSIX_EXIT mode is enabled, a generic number will be encoded in a method compatible with the C library _POSIX_EXIT macro so that it can be decoded by other programs, particularly ones written in C, like the GNV package. (Solaris) C<exit> resets file pointers, which is a problem when called from a child process (created by L<C<fork>|perlfunc/fork>) in L<C<BEGIN>|perlmod/BEGIN, UNITCHECK, CHECK, INIT and END>. A workaround is to use L<C<POSIX::_exit>|POSIX/C<_exit>>. exit unless $Config{archname} =~ /\bsolaris\b/; require POSIX; POSIX::_exit(0); =item fcntl (Win32) Not implemented. (VMS) Some functions available based on the version of VMS. =item flock (VMS, S<RISC OS>, VOS) Not implemented. =item fork (AmigaOS, S<RISC OS>, VMS) Not implemented. (Win32) Emulated using multiple interpreters. See L<perlfork>. (SunOS, Solaris, HP-UX) Does not automatically flush output handles on some platforms. =item getlogin (S<RISC OS>) Not implemented. =item getpgrp (Win32, VMS, S<RISC OS>) Not implemented. =item getppid (Win32, S<RISC OS>) Not implemented. =item getpriority (Win32, VMS, S<RISC OS>, VOS) Not implemented. =item getpwnam (Win32) Not implemented. (S<RISC OS>) Not useful. =item getgrnam (Win32, VMS, S<RISC OS>) Not implemented. =item getnetbyname (Android, Win32, S<Plan 9>) Not implemented. =item getpwuid (Win32) Not implemented. (S<RISC OS>) Not useful. =item getgrgid (Win32, VMS, S<RISC OS>) Not implemented. =item getnetbyaddr (Android, Win32, S<Plan 9>) Not implemented. =item getprotobynumber (Android) Not implemented. =item getpwent (Android, Win32) Not implemented. =item getgrent (Android, Win32, VMS) Not implemented. 
=item gethostbyname (S<Irix 5>) C<gethostbyname('localhost')> does not work everywhere: you may have to use C<gethostbyname('127.0.0.1')>. =item gethostent (Win32) Not implemented. =item getnetent (Android, Win32, S<Plan 9>) Not implemented. =item getprotoent (Android, Win32, S<Plan 9>) Not implemented. =item getservent (Win32, S<Plan 9>) Not implemented. =item seekdir (Android) Not implemented. =item sethostent (Android, Win32, S<Plan 9>, S<RISC OS>) Not implemented. =item setnetent (Win32, S<Plan 9>, S<RISC OS>) Not implemented. =item setprotoent (Android, Win32, S<Plan 9>, S<RISC OS>) Not implemented. =item setservent (S<Plan 9>, Win32, S<RISC OS>) Not implemented. =item endpwent (Win32) Not implemented. (Android) Either not implemented or a no-op. =item endgrent (Android, S<RISC OS>, VMS, Win32) Not implemented. =item endhostent (Android, Win32) Not implemented. =item endnetent (Android, Win32, S<Plan 9>) Not implemented. =item endprotoent (Android, Win32, S<Plan 9>) Not implemented. =item endservent (S<Plan 9>, Win32) Not implemented. =item getsockopt (S<Plan 9>) Not implemented. =item glob This operator is implemented via the L<C<File::Glob>|File::Glob> extension on most platforms. See L<File::Glob> for portability information. =item gmtime In theory, C<gmtime> is reliable from -2**63 to 2**63-1. However, because work-arounds in the implementation use floating point numbers, it will become inaccurate as the time gets larger. This is a bug and will be fixed in the future. (VOS) Time values are 32-bit quantities. =item ioctl (VMS) Not implemented. (Win32) Available only for socket handles, and it does what the C<ioctlsocket()> call in the Winsock API does. (S<RISC OS>) Available only for socket handles. =item kill (S<RISC OS>) Not implemented, hence not useful for taint checking. (Win32) C<kill> doesn't send a signal to the identified process like it does on Unix platforms. 
Instead C<kill($sig, $pid)> terminates the process identified by C<$pid>, and makes it exit immediately with exit status C<$sig>. As in Unix, if C<$sig> is 0 and the specified process exists, it returns true without actually terminating it. (Win32) C<kill(-9, $pid)> will terminate the process specified by C<$pid> and recursively all child processes owned by it. This is different from the Unix semantics, where the signal will be delivered to all processes in the same process group as the process specified by C<$pid>. (VMS) A pid of -1 indicating all processes on the system is not currently supported. =item link (S<RISC OS>, VOS) Not implemented. (AmigaOS) Link count not updated because hard links are not quite that hard (They are sort of half-way between hard and soft links). (Win32) Hard links are implemented on Win32 under NTFS only. They are natively supported on Windows 2000 and later. On Windows NT they are implemented using the Windows POSIX subsystem support and the Perl process will need Administrator or Backup Operator privileges to create hard links. (VMS) Available on 64 bit OpenVMS 8.2 and later. =item localtime C<localtime> has the same range as L</gmtime>, but because time zone rules change, its accuracy for historical and future times may degrade but usually by no more than an hour. =item lstat (S<RISC OS>) Not implemented. (Win32) Return values (especially for device and inode) may be bogus. =item msgctl =item msgget =item msgsnd =item msgrcv (Android, Win32, VMS, S<Plan 9>, S<RISC OS>, VOS) Not implemented. =item open (S<RISC OS>) Open modes C<|-> and C<-|> are unsupported. (SunOS, Solaris, HP-UX) Opening a process does not automatically flush output handles on some platforms. (Win32) Both of modes C<|-> and C<-|> are supported, but the list form is emulated since the Win32 API CreateProcess() accepts a simple string rather than an array of arguments. This may have security implications for your code. 
=item readlink (Win32, VMS, S<RISC OS>) Not implemented. =item rename (Win32) Can't move directories between directories on different logical volumes. =item rewinddir (Win32) Will not cause L<C<readdir>|perlfunc/readdir DIRHANDLE> to re-read the directory stream. The entries already read before the C<rewinddir> call will just be returned again from a cache buffer. =item select (Win32, VMS) Only implemented on sockets. (S<RISC OS>) Only reliable on sockets. Note that the L<C<select FILEHANDLE>|perlfunc/select FILEHANDLE> form is generally portable. =item semctl =item semget =item semop (Android, Win32, VMS, S<RISC OS>) Not implemented. =item setgrent (Android, VMS, Win32, S<RISC OS>) Not implemented. =item setpgrp (Win32, VMS, S<RISC OS>, VOS) Not implemented. =item setpriority (Win32, VMS, S<RISC OS>, VOS) Not implemented. =item setpwent (Android, Win32, S<RISC OS>) Not implemented. =item setsockopt (S<Plan 9>) Not implemented. =item shmctl =item shmget =item shmread =item shmwrite (Android, Win32, VMS, S<RISC OS>) Not implemented. =item sleep (Win32) Emulated using synchronization functions such that it can be interrupted by L<C<alarm>|perlfunc/alarm SECONDS>, and limited to a maximum of 4294967 seconds, approximately 49 days. =item socketpair (S<RISC OS>) Not implemented. (VMS) Available on 64 bit OpenVMS 8.2 and later. =item stat Platforms that do not have C<rdev>, C<blksize>, or C<blocks> will return these as C<''>, so numeric comparison or manipulation of these fields may cause 'not numeric' warnings. (S<Mac OS X>) C<ctime> not supported on UFS. (Win32) C<ctime> is creation time instead of inode change time. (Win32) C<dev> and C<ino> are not meaningful. (VMS) C<dev> and C<ino> are not necessarily reliable. (S<RISC OS>) C<mtime>, C<atime> and C<ctime> all return the last modification time. C<dev> and C<ino> are not necessarily reliable. (OS/2) C<dev>, C<rdev>, C<blksize>, and C<blocks> are not available. 
C<ino> is not meaningful and will differ between stat calls on the same file. (Cygwin) Some versions of cygwin when doing a C<stat("foo")> and not finding it may then attempt to C<stat("foo.exe")>. (Win32) C<stat> needs to open the file to determine the link count and update attributes that may have been changed through hard links. Setting L<C<${^WIN32_SLOPPY_STAT}>|perlvar/${^WIN32_SLOPPY_STAT}> to a true value speeds up C<stat> by not performing this operation. =item symlink (Win32, S<RISC OS>) Not implemented. (VMS) Implemented on 64 bit VMS 8.3. VMS requires the symbolic link to be in Unix syntax if it is intended to resolve to a valid path. =item syscall (Win32, VMS, S<RISC OS>, VOS) Not implemented. =item sysopen (S<Mac OS>, OS/390) The traditional C<0>, C<1>, and C<2> MODEs are implemented with different numeric values on some systems. The flags exported by L<C<Fcntl>|Fcntl> (C<O_RDONLY>, C<O_WRONLY>, C<O_RDWR>) should work everywhere though. =item system (Win32) As an optimization, may not call the command shell specified in C<$ENV{PERL5SHELL}>. C<system(1, @args)> spawns an external process and immediately returns its process designator, without waiting for it to terminate. Return value may be used subsequently in L<C<wait>|perlfunc/wait> or L<C<waitpid>|perlfunc/waitpid PID,FLAGS>. Failure to C<spawn()> a subprocess is indicated by setting L<C<$?>|perlvar/$?> to C<<< 255 << 8 >>>. L<C<$?>|perlvar/$?> is set in a way compatible with Unix (i.e. the exit status of the subprocess is obtained by C<<< $? >> 8 >>>, as described in the documentation). Note that the list form of system() is emulated since the Win32 API CreateProcess() accepts a simple string rather than an array of command-line arguments. This may have security implications for your code. (S<RISC OS>) There is no shell to process metacharacters, and the native standard is to pass a command line terminated by "\n" "\r" or "\0" to the spawned program. 
Redirection such as C<< > foo >> is performed (if at all) by the run time library of the spawned program. C<system LIST> will call the Unix emulation library's L<C<exec>|perlfunc/exec LIST> emulation, which attempts to provide emulation of the stdin, stdout, stderr in force in the parent, provided the child program uses a compatible version of the emulation library. C<system SCALAR> will call the native command line directly and no such emulation of a child Unix program will occur. Mileage B<will> vary. (Win32) C<system LIST> without the use of indirect object syntax (C<system PROGRAM LIST>) may fall back to trying the shell if the first C<spawn()> fails. (SunOS, Solaris, HP-UX) Does not automatically flush output handles on some platforms. (VMS) As with Win32, C<system(1, @args)> spawns an external process and immediately returns its process designator without waiting for the process to terminate. In this case the return value may be used subsequently in L<C<wait>|perlfunc/wait> or L<C<waitpid>|perlfunc/waitpid PID,FLAGS>. Otherwise the return value is POSIX-like (shifted up by 8 bits), which only allows room for a made-up value derived from the severity bits of the native 32-bit condition code (unless overridden by L<C<use vmsish 'status'>|vmsish/C<vmsish status>>). If the native condition code is one that has a POSIX value encoded, the POSIX value will be decoded to extract the expected exit value. For more details see L<perlvms/$?>. =item telldir (Android) Not implemented. =item times (Win32) "Cumulative" times will be bogus. On anything other than Windows NT or Windows 2000, "system" time will be bogus, and "user" time is actually the time returned by the L<C<clock()>|clock(3)> function in the C runtime library. (S<RISC OS>) Not useful. =item truncate (Older versions of VMS) Not implemented. (VOS) Truncation to same-or-shorter lengths only. 
(Win32) If a FILEHANDLE is supplied, it must be writable and opened in append mode (i.e., use C<<< open(my $fh, '>>', 'filename') >>> or C<sysopen(my $fh, ..., O_APPEND|O_RDWR)>). If a filename is supplied, it should not be held open elsewhere. =item umask Returns C<undef> where unavailable. (AmigaOS) C<umask> works but the correct permissions are set only when the file is finally closed. =item utime (VMS, S<RISC OS>) Only the modification time is updated. (Win32) May not behave as expected. Behavior depends on the C runtime library's implementation of L<C<utime()>|utime(2)>, and the filesystem being used. The FAT filesystem typically does not support an "access time" field, and it may limit timestamps to a granularity of two seconds. =item wait =item waitpid (Win32) Can only be applied to process handles returned for processes spawned using C<system(1, ...)> or pseudo processes created with L<C<fork>|perlfunc/fork>. (S<RISC OS>) Not useful. =back =head1 Supported Platforms The following platforms are known to build Perl 5.12 (as of April 2010, its release date) from the standard source code distribution available at L<http://www.cpan.org/src> =over =item Linux (x86, ARM, IA64) =item HP-UX =item AIX =item Win32 =over =item Windows 2000 =item Windows XP =item Windows Server 2003 =item Windows Vista =item Windows Server 2008 =item Windows 7 =back =item Cygwin Some tests are known to fail: =over =item * F<ext/XS-APItest/t/call_checker.t> - see L<https://github.com/Perl/perl5/issues/10750> =item * F<dist/I18N-Collate/t/I18N-Collate.t> =item * F<ext/Win32CORE/t/win32core.t> - may fail on recent cygwin installs. =back =item Solaris (x86, SPARC) =item OpenVMS =over =item Alpha (7.2 and later) =item I64 (8.2 and later) =back =item Symbian =item NetBSD =item FreeBSD =item Debian GNU/kFreeBSD =item Haiku =item Irix (6.5. What else?)
=item OpenBSD =item Dragonfly BSD =item Midnight BSD =item QNX Neutrino RTOS (6.5.0) =item MirOS BSD =item Stratus OpenVOS (17.0 or later) Caveats: =over =item time_t issues that may or may not be fixed =back =item Symbian (Series 60 v3, 3.2 and 5 - what else?) =item Stratus VOS / OpenVOS =item AIX =item Android =item FreeMiNT Perl now builds with FreeMiNT/Atari. It fails a few tests, which need some investigation. The FreeMiNT port uses GNU dld for loadable module capabilities, so ensure you have that library installed when building perl. =back =head1 EOL Platforms =head2 (Perl 5.20) The following platforms were supported by a previous version of Perl but have been officially removed from Perl's source code as of 5.20: =over =item AT&T 3b1 =back =head2 (Perl 5.14) The following platforms were supported up to 5.10. They may still have worked in 5.12, but supporting code has been removed for 5.14: =over =item Windows 95 =item Windows 98 =item Windows ME =item Windows NT4 =back =head2 (Perl 5.12) The following platforms were supported by a previous version of Perl but have been officially removed from Perl's source code as of 5.12: =over =item Atari MiNT =item Apollo Domain/OS =item Apple Mac OS 8/9 =item Tenon Machten =back =head1 Supported Platforms (Perl 5.8) As of July 2002 (the Perl release 5.8.0), the following platforms were able to build Perl from the standard source code distribution available at L<http://www.cpan.org/src/> AIX BeOS BSD/OS (BSDi) Cygwin DG/UX DOS DJGPP 1) DYNIX/ptx EPOC R5 FreeBSD HI-UXMPP (Hitachi) (5.8.0 worked but we didn't know it) HP-UX IRIX Linux Mac OS Classic Mac OS X (Darwin) MPE/iX NetBSD NetWare NonStop-UX ReliantUNIX (formerly SINIX) OpenBSD OpenVMS (formerly VMS) Open UNIX (Unixware) (since Perl 5.8.1/5.9.0) OS/2 OS/400 (using the PASE) (since Perl 5.8.1/5.9.0) POSIX-BC (formerly BS2000) QNX Solaris SunOS 4 SUPER-UX (NEC) Tru64 UNIX (formerly DEC OSF/1, Digital UNIX) UNICOS UNICOS/mk UTS VOS / OpenVOS Win95/98/ME/2K/XP 2) WinCE
z/OS (formerly OS/390) VM/ESA 1) in DOS mode either the DOS or OS/2 ports can be used 2) compilers: Borland, MinGW (GCC), VC6 The following platforms worked with the previous releases (5.6 and 5.7), but we did not manage either to fix or to test these in time for the 5.8.0 release. There is a very good chance that many of these will work fine with the 5.8.0. BSD/OS DomainOS Hurd LynxOS MachTen PowerMAX SCO SV SVR4 Unixware Windows 3.1 Known to be broken for 5.8.0 (but 5.6.1 and 5.7.2 can be used): AmigaOS 3 The following platforms have been known to build Perl from source in the past (5.005_03 and earlier), but we haven't been able to verify their status for the current release, either because the hardware/software platforms are rare or because we don't have an active champion on these platforms--or both. They used to work, though, so go ahead and try compiling them, and let L<https://github.com/Perl/perl5/issues> know of any trouble. 3b1 A/UX ConvexOS CX/UX DC/OSx DDE SMES DOS EMX Dynix EP/IX ESIX FPS GENIX Greenhills ISC MachTen 68k MPC NEWS-OS NextSTEP OpenSTEP Opus Plan 9 RISC/os SCO ODT/OSR Stellar SVR2 TI1500 TitanOS Ultrix Unisys Dynix The following platforms have their own source code distributions and binaries available via L<http://www.cpan.org/ports/> Perl release OS/400 (ILE) 5.005_02 Tandem Guardian 5.004 The following platforms have only binaries available via L<http://www.cpan.org/ports/index.html> : Perl release Acorn RISCOS 5.005_02 AOS 5.002 LynxOS 5.004_02 Although we do suggest that you always build your own Perl from the source code, both for maximal configurability and for security, in case you are in a hurry you can check L<http://www.cpan.org/ports/index.html> for binary distributions. 
=head1 SEE ALSO

L<perlaix>, L<perlamiga>, L<perlbs2000>, L<perlcygwin>, L<perldos>, L<perlebcdic>, L<perlfreebsd>, L<perlhurd>, L<perlhpux>, L<perlirix>, L<perlmacos>, L<perlmacosx>, L<perlnetware>, L<perlos2>, L<perlos390>, L<perlos400>, L<perlplan9>, L<perlqnx>, L<perlsolaris>, L<perltru64>, L<perlunicode>, L<perlvms>, L<perlvos>, L<perlwin32>, and L<Win32>.

=head1 AUTHORS / CONTRIBUTORS

Abigail <abigail@abigail.be>,
Charles Bailey <bailey@newman.upenn.edu>,
Graham Barr <gbarr@pobox.com>,
Tom Christiansen <tchrist@perl.com>,
Nicholas Clark <nick@ccl4.org>,
Thomas Dorner <Thomas.Dorner@start.de>,
Andy Dougherty <doughera@lafayette.edu>,
Dominic Dunlop <domo@computer.org>,
Neale Ferguson <neale@vma.tabnsw.com.au>,
David J. Fiander <davidf@mks.com>,
Paul Green <Paul.Green@stratus.com>,
M.J.T. Guy <mjtg@cam.ac.uk>,
Jarkko Hietaniemi <jhi@iki.fi>,
Luther Huffman <lutherh@stratcom.com>,
Nick Ing-Simmons <nick@ing-simmons.net>,
Andreas J. KE<ouml>nig <a.koenig@mind.de>,
Markus Laker <mlaker@contax.co.uk>,
Andrew M. Langmead <aml@world.std.com>,
Lukas Mai <l.mai@web.de>,
Larry Moore <ljmoore@freespace.net>,
Paul Moore <Paul.Moore@uk.origin-it.com>,
Chris Nandor <pudge@pobox.com>,
Matthias Neeracher <neeracher@mac.com>,
Philip Newton <pne@cpan.org>,
Gary Ng <71564.1743@CompuServe.COM>,
Tom Phoenix <rootbeer@teleport.com>,
AndrE<eacute> Pirard <A.Pirard@ulg.ac.be>,
Peter Prymmer <pvhp@forte.com>,
Hugo van der Sanden <hv@crypt0.demon.co.uk>,
Gurusamy Sarathy <gsar@activestate.com>,
Paul J. Schinder <schinder@pobox.com>,
Michael G Schwern <schwern@pobox.com>,
Dan Sugalski <dan@sidhe.org>,
Nathan Torkington <gnat@frii.com>,
John Malmberg <wb8tyw@qsl.net>

If you read this file _as_is_, just ignore the funny characters you see. It is written in the POD format (see pod/perlpod.pod) which is specially designed to be readable as is. But if you have been into Perl you probably already know this.

=head1 NAME perlsynology - Perl 5 on Synology DSM systems =head1 DESCRIPTION Synology manufactures a vast number of Network Attached Storage (NAS) devices that are very popular in large organisations as well as small businesses and homes. The NAS systems are equipped with Synology Disk Storage Manager (DSM), which is a trimmed-down Linux system enhanced with several tools for managing the NAS. There are several flavours of hardware: Marvell Armada (ARMv5tel, ARMv7l), Intel Atom (i686, x86_64), Freescale QorIQ (PPC), and more. For a full list see the L<Synology FAQ|https://forum.synology.com/wiki/index.php/What_kind_of_CPU_does_my_NAS_have>. Since it is based on Linux, the NAS can run many popular Linux software packages, including Perl. In fact, Synology provides a ready-to-install package for Perl, depending on the version of DSM the installed perl ranges from 5.8.6 on DSM-4.3 to 5.24.0 on DSM-6.1. There is an active user community that provides many software packages for the Synology DSM systems; at the time of writing this document they provide Perl version 5.24.1. This document describes various features of Synology DSM operating system that will affect how Perl 5 (hereafter just Perl) is configured, compiled and/or runs. It has been compiled and verified by Johan Vromans for the Synology DS413 (QorIQ), with feedback from H.Merijn Brand (DS213, ARMv5tel and RS815, Intel Atom x64). =head2 Setting up the build environment =head3 DSM 5 As DSM is a trimmed-down Linux system, it lacks many of the tools and libraries commonly found on Linux. The basic tools like sh, cp, rm, etc. are implemented using L<BusyBox|https://en.wikipedia.org/wiki/BusyBox>. =over 4 =item * Using your favourite browser open the DSM management page and start the Package Center. =item * If you want to smoke test Perl, install C<Perl>. 
=item *

In Settings, add the following Package Sources:

    https://www.cphub.net
    http://packages.quadrat4.de

=item *

Still in Settings, in Channel Update, select Beta Channel.

=item *

Press Refresh. In the left panel the item "Community" will appear. Click it. Select "Bootstrap Installer Beta" and install it.

=item *

Likewise, install "iPKGui Beta".

The application window should now show an icon for iPKGui.

=item *

Start iPKGui. Install the packages C<make>, C<gcc> and C<coreutils>.

If you want to smoke test Perl, install C<patch>.

=back

The next step is to add some symlinks to system libraries. For example, the development software expects a library C<libm.so> that normally is a symlink to C<libm.so.6>. Synology only provides the latter and not the symlink.

Here the actual architecture of the Synology system matters. You have to find out where the gcc libraries have been installed. Look in /opt for a directory similar to arm-none-linux-gnueabi or powerpc-linux-gnuspe. In the instructions below I'll use powerpc-linux-gnuspe as an example.

=over 4

=item *

On the DSM management page start the Control Panel.

=item *

Click Terminal, and enable SSH service.

=item *

Close Terminal and the Control Panel.

=item *

Open a shell on the Synology using ssh and become root.

=item *

Execute the following commands:

    cd /lib
    ln -s libm.so.6 libm.so
    ln -s libcrypt.so.1 libcrypt.so
    ln -s libdl.so.2 libdl.so
    cd /opt/powerpc-linux-gnuspe/lib    # or /opt/arm-none-linux-gnueabi/lib
    ln -s /lib/libdl.so.2 libdl.so

=back

B<WARNING:> When you perform a system software upgrade, these links will disappear and need to be re-established.

=head3 DSM 6

Using iPkg has been deprecated on DSM 6, but an alternative is available for DSM 6: entware/opkg. For instructions on how to use that, please read L<Install Entware-ng on Synology NAS|https://github.com/Entware-ng/Entware-ng/wiki/Install-on-Synology-NAS>.

That sadly does not (yet) work on QorIQ.
At the moment of writing, the supported architectures are armv5, armv7, mipsel, wl500g, x86_32, and x86_64. Check L<here|https://pkg.entware.net/binaries/> for supported platforms.

Entware-ng comes with a precompiled 5.24.1 (June 2017) that allows building shared XS code. Note that this installation does B<not> use a site_perl folder. The available C<cpan> works, and it also works for XS modules if all required development packages are installed too.

=head2 Compiling Perl 5

When the build environment has been set up, building and testing Perl is straightforward. The only thing you need to do is download the sources as usual, and add a file Policy.sh as follows:

    # Administrivia.
    perladmin="your.email@goes.here"

    # Install Perl in a tree in /opt/perl instead of /opt/bin.
    prefix=/opt/perl

    # Select the compiler. Note that there is no 'cc' alias or link.
    cc=gcc

    # Build flags.
    ccflags="-DDEBUGGING"

    # Library and include paths.
    libpth="/lib"
    locincpth="/opt/include"
    loclibpth="/lib"

You may want to create the destination directory and give it the right permissions before installing, thus eliminating the need to build Perl as a super user.

In the directory where you unpacked the sources, issue the familiar commands:

    ./Configure -des
    make
    make test
    make install

=head2 Known problems

=head3 Configure

No known problems yet.

=head3 Build

=over 4

=item Error message "No error definitions found".

This error is generated when it is not possible to find the local definitions for error codes, due to the uncommon structure of the Synology file system.

This error was fixed in the Perl development git for version 5.19, commit 7a8f1212e5482613c8a5b0402528e3105b26ff24.

=back

=head3 Failing tests

=over 4

=item F<ext/DynaLoader/t/DynaLoader.t>

One subtest fails due to the uncommon structure of the Synology file system. The file F</lib/glibc.so> is missing.

B<WARNING:> Do not symlink F</lib/glibc.so.6> to F</lib/glibc.so> or some system components will start to fail.

=back

=head2 Smoke testing Perl 5

If building completes successfully, you can set up smoke testing as described in the Test::Smoke documentation.

For smoke testing you need a running Perl. You can either install the Synology supplied package for Perl 5.8.6, or build and install your own, much more recent version.

Note that I could not run successful smokes when initiated by the Synology Task Scheduler. I resorted to initiating the smokes via a cron job run on another system, using ssh:

    ssh nas1 wrk/Test-Smoke/smoke/smokecurrent.sh

=head3 Local patches

When local patches are applied with smoke testing, the test driver will automatically request regeneration of certain tables after the patches are applied. The Synology supplied Perl 5.8.6 (at least on the DS413) B<is NOT capable> of generating these tables. It will generate opcodes with bogus values, causing the build to fail.

You can prevent regeneration by adding the setting

    'flags' => 0,

to the smoke config, or by adding another patch that inserts

    exit 0 if $] == 5.008006;

in the beginning of the C<regen.pl> program.

=head2 Adding libraries

The above procedure describes a basic environment and hence results in a basic Perl. If you want to add libraries to Perl, you may need some extra settings.

For example, the basic Perl does not have any of the DB libraries (db, dbm, ndbm, gdbm). You can add these using iPKGui; however, you need to set the environment variable LD_LIBRARY_PATH to the appropriate value:

    LD_LIBRARY_PATH=/lib:/opt/lib
    export LD_LIBRARY_PATH

This setting needs to be in effect while Perl is built, and also when the programs are run.

=head1 REVISION

June 2017, for Synology DSM 5.1.5022 and DSM 6.1-15101-4.

=head1 AUTHOR

Johan Vromans <jvromans@squirrel.nl>

H. Merijn Brand <h.m.brand@xs4all.nl>

=cut

=head1 NAME

perlapio - perl's IO abstraction interface.

=head1 SYNOPSIS #define PERLIO_NOT_STDIO 0 /* For co-existence with stdio only */ #include <perlio.h> /* Usually via #include <perl.h> */ PerlIO *PerlIO_stdin(void); PerlIO *PerlIO_stdout(void); PerlIO *PerlIO_stderr(void); PerlIO *PerlIO_open(const char *path,const char *mode); PerlIO *PerlIO_fdopen(int fd, const char *mode); PerlIO *PerlIO_reopen(const char *path, /* deprecated */ const char *mode, PerlIO *old); int PerlIO_close(PerlIO *f); int PerlIO_stdoutf(const char *fmt,...) int PerlIO_puts(PerlIO *f,const char *string); int PerlIO_putc(PerlIO *f,int ch); SSize_t PerlIO_write(PerlIO *f,const void *buf,size_t numbytes); int PerlIO_printf(PerlIO *f, const char *fmt,...); int PerlIO_vprintf(PerlIO *f, const char *fmt, va_list args); int PerlIO_flush(PerlIO *f); int PerlIO_eof(PerlIO *f); int PerlIO_error(PerlIO *f); void PerlIO_clearerr(PerlIO *f); int PerlIO_getc(PerlIO *d); int PerlIO_ungetc(PerlIO *f,int ch); SSize_t PerlIO_read(PerlIO *f, void *buf, size_t numbytes); int PerlIO_fileno(PerlIO *f); void PerlIO_setlinebuf(PerlIO *f); Off_t PerlIO_tell(PerlIO *f); int PerlIO_seek(PerlIO *f, Off_t offset, int whence); void PerlIO_rewind(PerlIO *f); int PerlIO_getpos(PerlIO *f, SV *save); /* prototype changed */ int PerlIO_setpos(PerlIO *f, SV *saved); /* prototype changed */ int PerlIO_fast_gets(PerlIO *f); int PerlIO_has_cntptr(PerlIO *f); SSize_t PerlIO_get_cnt(PerlIO *f); char *PerlIO_get_ptr(PerlIO *f); void PerlIO_set_ptrcnt(PerlIO *f, char *ptr, SSize_t count); int PerlIO_canset_cnt(PerlIO *f); /* deprecated */ void PerlIO_set_cnt(PerlIO *f, int count); /* deprecated */ int PerlIO_has_base(PerlIO *f); char *PerlIO_get_base(PerlIO *f); SSize_t PerlIO_get_bufsiz(PerlIO *f); PerlIO *PerlIO_importFILE(FILE *stdio, const char *mode); FILE *PerlIO_exportFILE(PerlIO *f, const char *mode); FILE *PerlIO_findFILE(PerlIO *f); void PerlIO_releaseFILE(PerlIO *f,FILE *stdio); int PerlIO_apply_layers(PerlIO *f, const char *mode, const char *layers); int 
PerlIO_binmode(PerlIO *f, int ptype, int imode, const char *layers); void PerlIO_debug(const char *fmt,...); =for apidoc Amh|int |PerlIO_apply_layers|PerlIO *f|const char *mode|const char *layers =for apidoc Amh|int |PerlIO_binmode|PerlIO *f|int ptype|int imode|const char *layers =for apidoc ATmh|int |PerlIO_canset_cnt|PerlIO *f =for apidoc Amh|void |PerlIO_debug|const char *fmt|... =for apidoc ATmh|FILE *|PerlIO_exportFILE|PerlIO *f|const char *mode =for apidoc ATmh|int |PerlIO_fast_gets|PerlIO *f =for apidoc ATmh|PerlIO*|PerlIO_fdopen|int fd|const char *mode =for apidoc ATmh|FILE *|PerlIO_findFILE|PerlIO *f =for apidoc ATmh|int |PerlIO_getc|PerlIO *d =for apidoc ATmh|int |PerlIO_getpos|PerlIO *f|SV *save =for apidoc ATmh|int |PerlIO_has_base|PerlIO *f =for apidoc ATmh|int |PerlIO_has_cntptr|PerlIO *f =for apidoc ATmh|PerlIO*|PerlIO_importFILE|FILE *stdio|const char *mode =for apidoc ATmh|PerlIO*|PerlIO_open|const char *path|const char *mode =for apidoc Amh|int |PerlIO_printf|PerlIO *f|const char *fmt|... =for apidoc ATmh|int |PerlIO_putc|PerlIO *f|int ch =for apidoc ATmh|int |PerlIO_puts|PerlIO *f|const char *string =for apidoc ATmh|void |PerlIO_releaseFILE|PerlIO *f|FILE *stdio =for apidoc Amh|PerlIO *|PerlIO_reopen|const char *path|const char *mode|PerlIO *old =for apidoc ATmh|void |PerlIO_rewind|PerlIO *f =for apidoc ATmh|int |PerlIO_setpos|PerlIO *f|SV *saved =for apidoc Amh|int |PerlIO_stdoutf|const char *fmt|... 
=for apidoc ATmh|int |PerlIO_ungetc|PerlIO *f|int ch =for apidoc ATmh|int |PerlIO_vprintf|PerlIO *f|const char *fmt|va_list args =for apidoc PerlIO_stdin =for apidoc PerlIO_stdout =for apidoc PerlIO_stderr =for apidoc PerlIO_close =for apidoc PerlIO_write =for apidoc PerlIO_flush =for apidoc PerlIO_eof =for apidoc PerlIO_error =for apidoc PerlIO_clearerr =for apidoc PerlIO_read =for apidoc PerlIO_fileno =for apidoc PerlIO_setlinebuf =for apidoc PerlIO_tell =for apidoc PerlIO_seek =for apidoc PerlIO_get_cnt =for apidoc PerlIO_get_ptr =for apidoc PerlIO_set_ptrcnt =for apidoc PerlIO_set_cnt =for apidoc PerlIO_get_base =for apidoc PerlIO_get_bufsiz =head1 DESCRIPTION Perl's source code, and extensions that want maximum portability, should use the above functions instead of those defined in ANSI C's I<stdio.h>. The perl headers (in particular "perlio.h") will C<#define> them to the I/O mechanism selected at Configure time. The functions are modeled on those in I<stdio.h>, but parameter order has been "tidied up a little". C<PerlIO *> takes the place of FILE *. Like FILE * it should be treated as opaque (it is probably safe to assume it is a pointer to something). There are currently two implementations: =over 4 =item 1. USE_STDIO All above are #define'd to stdio functions or are trivial wrapper functions which call stdio. In this case I<only> PerlIO * is a FILE *. This has been the default implementation since the abstraction was introduced in perl5.003_02. =item 2. USE_PERLIO Introduced just after perl5.7.0, this is a re-implementation of the above abstraction which allows perl more control over how IO is done as it decouples IO from the way the operating system and C library choose to do things. For USE_PERLIO PerlIO * has an extra layer of indirection - it is a pointer-to-a-pointer. This allows the PerlIO * to remain with a known value while swapping the implementation around underneath I<at run time>. 
In this case all the above are true (but very simple) functions which call the underlying implementation. This is the only implementation for which C<PerlIO_apply_layers()> does anything "interesting". The USE_PERLIO implementation is described in L<perliol>. =back Because "perlio.h" is a thin layer (for efficiency) the semantics of these functions are somewhat dependent on the underlying implementation. Where these variations are understood they are noted below. Unless otherwise noted, functions return 0 on success, or a negative value (usually C<EOF> which is usually -1) and set C<errno> on error. =over 4 =item B<PerlIO_stdin()>, B<PerlIO_stdout()>, B<PerlIO_stderr()> Use these rather than C<stdin>, C<stdout>, C<stderr>. They are written to look like "function calls" rather than variables because this makes it easier to I<make them> function calls if platform cannot export data to loaded modules, or if (say) different "threads" might have different values. =item B<PerlIO_open(path, mode)>, B<PerlIO_fdopen(fd,mode)> These correspond to fopen()/fdopen() and the arguments are the same. Return C<NULL> and set C<errno> if there is an error. There may be an implementation limit on the number of open handles, which may be lower than the limit on the number of open files - C<errno> may not be set when C<NULL> is returned if this limit is exceeded. =item B<PerlIO_reopen(path,mode,f)> While this currently exists in both implementations, perl itself does not use it. I<As perl does not use it, it is not well tested.> Perl prefers to C<dup> the new low-level descriptor to the descriptor used by the existing PerlIO. This may become the behaviour of this function in the future. =item B<PerlIO_printf(f,fmt,...)>, B<PerlIO_vprintf(f,fmt,a)> These are fprintf()/vfprintf() equivalents. =item B<PerlIO_stdoutf(fmt,...)> This is printf() equivalent. printf is #defined to this function, so it is (currently) legal to use C<printf(fmt,...)> in perl sources. 
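The PerlIO_reopen() entry above says that perl prefers to C<dup> a new low-level descriptor onto the descriptor already used by an existing handle. As a rough illustration of that idea only, the sketch below does the same thing with plain POSIX calls (open(), dup2(), close()) so that it compiles without perl's headers. It is not perl's actual implementation, and C<redirect_fd> is a hypothetical helper name.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/* Make the already-open descriptor target_fd refer to the file at path.
 * Returns 0 on success, or -1 with errno set on failure. */
int redirect_fd(int target_fd, const char *path, int flags, mode_t mode)
{
    int fd = open(path, flags, mode);
    if (fd < 0)
        return -1;                  /* errno set by open() */
    if (fd == target_fd)
        return 0;                   /* open() already landed on the target */
    if (dup2(fd, target_fd) < 0) {  /* atomically replaces target_fd */
        int saved = errno;
        close(fd);
        errno = saved;              /* report the dup2() failure, not close()'s */
        return -1;
    }
    close(fd);                      /* the temporary descriptor is no longer needed */
    return 0;
}
```

Code holding the old descriptor number keeps working after the call, which is the property that makes this approach attractive compared with closing and reopening a stream.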
=item B<PerlIO_read(f,buf,count)>, B<PerlIO_write(f,buf,count)> These correspond functionally to fread() and fwrite() but the arguments and return values are different. The PerlIO_read() and PerlIO_write() signatures have been modeled on the more sane low level read() and write() functions instead: The "file" argument is passed first, there is only one "count", and the return value can distinguish between error and C<EOF>. Returns a byte count if successful (which may be zero or positive), returns negative value and sets C<errno> on error. Depending on implementation C<errno> may be C<EINTR> if operation was interrupted by a signal. =item B<PerlIO_close(f)> Depending on implementation C<errno> may be C<EINTR> if operation was interrupted by a signal. =item B<PerlIO_puts(f,s)>, B<PerlIO_putc(f,c)> These correspond to fputs() and fputc(). Note that arguments have been revised to have "file" first. =item B<PerlIO_ungetc(f,c)> This corresponds to ungetc(). Note that arguments have been revised to have "file" first. Arranges that next read operation will return the byte B<c>. Despite the implied "character" in the name only values in the range 0..0xFF are defined. Returns the byte B<c> on success or -1 (C<EOF>) on error. The number of bytes that can be "pushed back" may vary, only 1 character is certain, and then only if it is the last character that was read from the handle. =item B<PerlIO_getc(f)> This corresponds to getc(). Despite the c in the name only byte range 0..0xFF is supported. Returns the character read or -1 (C<EOF>) on error. =item B<PerlIO_eof(f)> This corresponds to feof(). Returns a true/false indication of whether the handle is at end of file. For terminal devices this may or may not be "sticky" depending on the implementation. The flag is cleared by PerlIO_seek(), or PerlIO_rewind(). =item B<PerlIO_error(f)> This corresponds to ferror(). Returns a true/false indication of whether there has been an IO error on the handle. 
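The return convention described for PerlIO_read() above (a byte count on success, zero at end of file, a negative value with C<errno> set on error, possibly C<EINTR>) is the one used by the low-level read() call it is modeled on. As a minimal sketch of how a caller might handle the three outcomes, the loop below is written against plain POSIX read() rather than PerlIO itself, so it compiles without perl's headers. C<read_some> is a hypothetical helper name, not part of any perl API.

```c
#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

/* Read up to want bytes from fd into buf, retrying when interrupted.
 * Returns the number of bytes read (0 means end of file was reached
 * before any data arrived), or -1 with errno set on a genuine error. */
ssize_t read_some(int fd, void *buf, size_t want)
{
    char *p = buf;
    size_t got = 0;

    while (got < want) {
        ssize_t n = read(fd, p + got, want - got);
        if (n > 0) {
            got += (size_t)n;       /* partial read: keep going */
            continue;
        }
        if (n == 0)
            break;                  /* end of file */
        if (errno == EINTR)
            continue;               /* interrupted by a signal: retry */
        return -1;                  /* real error, errno already set */
    }
    return (ssize_t)got;
}
```

A PerlIO-based version would presumably keep the same three-way test on the result of PerlIO_read(); whether retrying on C<EINTR> is appropriate depends on the implementation, as the text notes.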
=item B<PerlIO_fileno(f)> This corresponds to fileno(), note that on some platforms, the meaning of "fileno" may not match Unix. Returns -1 if the handle has no open descriptor associated with it. =item B<PerlIO_clearerr(f)> This corresponds to clearerr(), i.e., clears 'error' and (usually) 'eof' flags for the "stream". Does not return a value. =item B<PerlIO_flush(f)> This corresponds to fflush(). Sends any buffered write data to the underlying file. If called with C<NULL> this may flush all open streams (or core dump with some USE_STDIO implementations). Calling on a handle open for read only, or on which last operation was a read of some kind may lead to undefined behaviour on some USE_STDIO implementations. The USE_PERLIO (layers) implementation tries to behave better: it flushes all open streams when passed C<NULL>, and attempts to retain data on read streams either in the buffer or by seeking the handle to the current logical position. =item B<PerlIO_seek(f,offset,whence)> This corresponds to fseek(). Sends buffered write data to the underlying file, or discards any buffered read data, then positions the file descriptor as specified by B<offset> and B<whence> (sic). This is the correct thing to do when switching between read and write on the same handle (see issues with PerlIO_flush() above). Offset is of type C<Off_t> which is a perl Configure value which may not be same as stdio's C<off_t>. =item B<PerlIO_tell(f)> This corresponds to ftell(). Returns the current file position, or (Off_t) -1 on error. May just return value system "knows" without making a system call or checking the underlying file descriptor (so use on shared file descriptors is not safe without a PerlIO_seek()). Return value is of type C<Off_t> which is a perl Configure value which may not be same as stdio's C<off_t>. =item B<PerlIO_getpos(f,p)>, B<PerlIO_setpos(f,p)> These correspond (loosely) to fgetpos() and fsetpos(). 
Rather than stdio's Fpos_t they expect a "Perl Scalar Value" to be passed. What is stored there should be considered opaque. The layout of the data may vary from handle to handle. When not using stdio, or if the platform does not have the stdio calls, they are implemented in terms of PerlIO_tell() and PerlIO_seek().

=item B<PerlIO_rewind(f)>

This corresponds to rewind(). It is usually defined as being

    PerlIO_seek(f,(Off_t)0L, SEEK_SET);
    PerlIO_clearerr(f);

=item B<PerlIO_tmpfile()>

This corresponds to tmpfile(), i.e., returns an anonymous PerlIO or NULL on error. The system will attempt to automatically delete the file when closed. On Unix the file is usually C<unlink>-ed just after it is created so it does not matter how it gets closed. On other systems the file may only be deleted if closed via PerlIO_close() and/or the program exits via C<exit>. Depending on the implementation there may be "race conditions" which allow other processes access to the file, though in general it will be safer in this regard than ad hoc schemes.

=item B<PerlIO_setlinebuf(f)>

This corresponds to setlinebuf(). Does not return a value. What constitutes a "line" is implementation dependent but usually means that writing "\n" flushes the buffer. What happens with things like "this\nthat" is uncertain. (Perl core uses it I<only> when "dumping"; it has nothing to do with $| auto-flush.)

=back

=head2 Co-existence with stdio

There is outline support for co-existence of PerlIO with stdio. Obviously if PerlIO is implemented in terms of stdio there is no problem. In other cases, however, mechanisms must exist to create a FILE * which can be passed to library code which is going to use stdio calls.

The first step is to add this line:

    #define PERLIO_NOT_STDIO 0

I<before> including any perl header files. (This will probably become the default at some point.) That prevents "perlio.h" from attempting to #define stdio functions onto PerlIO functions.

XS code is probably better using "typemap" if it expects FILE * arguments. The standard typemap will be adjusted to comprehend any changes in this area. =over 4 =item B<PerlIO_importFILE(f,mode)> Used to get a PerlIO * from a FILE *. The mode argument should be a string as would be passed to fopen/PerlIO_open. If it is NULL then - for legacy support - the code will (depending upon the platform and the implementation) either attempt to empirically determine the mode in which I<f> is open, or use "r+" to indicate a read/write stream. Once called the FILE * should I<ONLY> be closed by calling C<PerlIO_close()> on the returned PerlIO *. The PerlIO is set to textmode. Use PerlIO_binmode if this is not the desired mode. This is B<not> the reverse of PerlIO_exportFILE(). =item B<PerlIO_exportFILE(f,mode)> Given a PerlIO * create a 'native' FILE * suitable for passing to code expecting to be compiled and linked with ANSI C I<stdio.h>. The mode argument should be a string as would be passed to fopen/PerlIO_open. If it is NULL then - for legacy support - the FILE * is opened in same mode as the PerlIO *. The fact that such a FILE * has been 'exported' is recorded, (normally by pushing a new :stdio "layer" onto the PerlIO *), which may affect future PerlIO operations on the original PerlIO *. You should not call C<fclose()> on the file unless you call C<PerlIO_releaseFILE()> to disassociate it from the PerlIO *. (Do not use PerlIO_importFILE() for doing the disassociation.) Calling this function repeatedly will create a FILE * on each call (and will push an :stdio layer each time as well). =item B<PerlIO_releaseFILE(p,f)> Calling PerlIO_releaseFILE informs PerlIO that all use of FILE * is complete. It is removed from the list of 'exported' FILE *s, and the associated PerlIO * should revert to its original behaviour. Use this to disassociate a file from a PerlIO * that was associated using PerlIO_exportFILE(). 
=item B<PerlIO_findFILE(f)> Returns a native FILE * used by a stdio layer. If there is none, it will create one with PerlIO_exportFILE. In either case the FILE * should be considered as belonging to PerlIO subsystem and should only be closed by calling C<PerlIO_close()>. =back =head2 "Fast gets" Functions In addition to standard-like API defined so far above there is an "implementation" interface which allows perl to get at internals of PerlIO. The following calls correspond to the various FILE_xxx macros determined by Configure - or their equivalent in other implementations. This section is really of interest to only those concerned with detailed perl-core behaviour, implementing a PerlIO mapping or writing code which can make use of the "read ahead" that has been done by the IO system in the same way perl does. Note that any code that uses these interfaces must be prepared to do things the traditional way if a handle does not support them. =over 4 =item B<PerlIO_fast_gets(f)> Returns true if implementation has all the interfaces required to allow perl's C<sv_gets> to "bypass" normal IO mechanism. This can vary from handle to handle. PerlIO_fast_gets(f) = PerlIO_has_cntptr(f) && \ PerlIO_canset_cnt(f) && \ 'Can set pointer into buffer' =item B<PerlIO_has_cntptr(f)> Implementation can return pointer to current position in the "buffer" and a count of bytes available in the buffer. Do not use this - use PerlIO_fast_gets. =item B<PerlIO_get_cnt(f)> Return count of readable bytes in the buffer. Zero or negative return means no more bytes available. =item B<PerlIO_get_ptr(f)> Return pointer to next readable byte in buffer, accessing via the pointer (dereferencing) is only safe if PerlIO_get_cnt() has returned a positive value. Only positive offsets up to value returned by PerlIO_get_cnt() are allowed. =item B<PerlIO_set_ptrcnt(f,p,c)> Set pointer into buffer, and a count of bytes still in the buffer. 
Should be used only to set pointer to within range implied by previous calls to C<PerlIO_get_ptr> and C<PerlIO_get_cnt>. The two values I<must> be consistent with each other (implementation may only use one or the other or may require both). =item B<PerlIO_canset_cnt(f)> Implementation can adjust its idea of number of bytes in the buffer. Do not use this - use PerlIO_fast_gets. =item B<PerlIO_set_cnt(f,c)> Obscure - set count of bytes in the buffer. Deprecated. Only usable if PerlIO_canset_cnt() returns true. Currently used in only doio.c to force count less than -1 to -1. Perhaps should be PerlIO_set_empty or similar. This call may actually do nothing if "count" is deduced from pointer and a "limit". Do not use this - use PerlIO_set_ptrcnt(). =item B<PerlIO_has_base(f)> Returns true if implementation has a buffer, and can return pointer to whole buffer and its size. Used by perl for B<-T> / B<-B> tests. Other uses would be very obscure... =item B<PerlIO_get_base(f)> Return I<start> of buffer. Access only positive offsets in the buffer up to the value returned by PerlIO_get_bufsiz(). =item B<PerlIO_get_bufsiz(f)> Return the I<total number of bytes> in the buffer, this is neither the number that can be read, nor the amount of memory allocated to the buffer. Rather it is what the operating system and/or implementation happened to C<read()> (or whatever) last time IO was requested. =back =head2 Other Functions =over 4 =item PerlIO_apply_layers(f,mode,layers) The new interface to the USE_PERLIO implementation. The layers ":crlf" and ":raw" are only ones allowed for other implementations and those are silently ignored. (As of perl5.8 ":raw" is deprecated.) Use PerlIO_binmode() below for the portable case. =item PerlIO_binmode(f,ptype,imode,layers) The hook used by perl's C<binmode> operator. B<ptype> is perl's character for the kind of IO: =over 8 =item 'E<lt>' read =item 'E<gt>' write =item '+' read/write =back B<imode> is C<O_BINARY> or C<O_TEXT>. 
B<layers> is a string of layers to apply; only ":crlf" makes sense in the non USE_PERLIO case. (As of perl5.8 ":raw" is deprecated in favour of passing NULL.)

Portable cases are:

    PerlIO_binmode(f,ptype,O_BINARY,NULL);

and

    PerlIO_binmode(f,ptype,O_TEXT,":crlf");

On Unix these calls probably have no effect whatsoever. Elsewhere they alter "\n" to CR,LF translation and possibly cause a special text "end of file" indicator to be written or honoured on read. The effect of making the call after doing any IO to the handle depends on the implementation. (It may be ignored, affect any data which is already buffered as well, or only apply to subsequent data.)

=item PerlIO_debug(fmt,...)

PerlIO_debug is a printf()-like function which can be used for debugging. No return value. Its main use is inside PerlIO where using real printf, warn() etc. would recursively call PerlIO and be a problem.

PerlIO_debug writes to the file named by $ENV{'PERLIO_DEBUG'} or defaults to stderr if the environment variable is not defined. Typical use might be:

    Bourne shells (sh, ksh, bash, zsh, ash, ...):
     PERLIO_DEBUG=/tmp/perliodebug.log ./perl -Di somescript some args

    Csh/Tcsh:
     setenv PERLIO_DEBUG /tmp/perliodebug.log
     ./perl -Di somescript some args

    If you have the "env" utility:
     env PERLIO_DEBUG=/tmp/perliodebug.log ./perl -Di somescript args

    Win32:
     set PERLIO_DEBUG=perliodebug.log
     perl -Di somescript some args

On a Perl built without C<-DDEBUGGING>, or when the C<-Di> command-line switch is not specified, or under taint, PerlIO_debug() is a no-op.

=back

If you read this file _as_is_, just ignore the funny characters you see. It is written in the POD format (see pod/perlpod.pod) which is specifically designed to be readable as is.

=head1 NAME

perlfreebsd - Perl version 5 on FreeBSD systems

=head1 DESCRIPTION

This document describes various features of FreeBSD that will affect how Perl version 5 (hereafter just Perl) is compiled and/or runs.

=head2 FreeBSD core dumps from readdir_r with ithreads

When perl is configured to use ithreads, it will use re-entrant library calls
in preference to non-re-entrant versions. There is a bug in FreeBSD's
C<readdir_r> function in versions 4.5 and earlier that can cause a SEGV when
reading large directories. A patch for FreeBSD libc is available
(see L<http://www.freebsd.org/cgi/query-pr.cgi?pr=misc/30631> )
which has been integrated into FreeBSD 4.6.

=head2 C<$^X> doesn't always contain a full path in FreeBSD

perl sets C<$^X> where possible to a full path by asking the operating
system. On FreeBSD the full path of the perl interpreter is found by using
C<sysctl> with C<KERN_PROC_PATHNAME> if that is supported, else by reading
the symlink F</proc/curproc/file>. FreeBSD 7 and earlier have a bug where
either approach sometimes returns an incorrect value
(see L<http://www.freebsd.org/cgi/query-pr.cgi?pr=35703> ).
In these cases perl will fall back to the old behaviour of using C's
C<argv[0]> value for C<$^X>.

=head1 AUTHOR

Nicholas Clark <nick@ccl4.org>, collating wisdom supplied by Slaven Rezic
and Tim Bunce.

Please report any errors, updates, or suggestions to
L<https://github.com/Perl/perl5/issues>.

=encoding utf8

=head1 NAME

perlunicook - cookbookish examples of handling Unicode in Perl

=head1 DESCRIPTION

This manpage contains short recipes demonstrating how to handle common Unicode
operations in Perl, plus one complete program at the end. Any undeclared
variables in individual recipes are assumed to have a previous appropriate
value in them.
=head1 EXAMPLES

=head2 ℞ 0: Standard preamble

Unless otherwise noted, all examples below require this standard preamble
to work correctly, with the C<#!> adjusted to work on your system:

 #!/usr/bin/env perl

 use utf8;      # so literals and identifiers can be in UTF-8
 use v5.12;     # or later to get "unicode_strings" feature
 use strict;    # quote strings, declare variables
 use warnings;  # on by default
 use warnings  qw(FATAL utf8);    # fatalize encoding glitches
 use open      qw(:std :encoding(UTF-8)); # undeclared streams in UTF-8
 use charnames qw(:full :short);  # unneeded in v5.16

This I<does> mean that even Unix programmers must C<binmode> their binary
streams, or open them with C<:raw>, but that's the only way to get at them
portably anyway.

B<WARNING>: C<use autodie> (pre 2.26) and C<use open> do not get along with
each other.

=head2 ℞ 1: Generic Unicode-savvy filter

Always decompose on the way in, then recompose on the way out.

 use Unicode::Normalize;

 while (<>) {
     $_ = NFD($_);   # decompose + reorder canonically
     ...
 } continue {
     print NFC($_);  # recompose (where possible) + reorder canonically
 }

=head2 ℞ 2: Fine-tuning Unicode warnings

As of v5.14, Perl distinguishes three subclasses of UTF‑8 warnings.

 use v5.14;                  # subwarnings unavailable any earlier
 no warnings "nonchar";      # the 66 forbidden non-characters
 no warnings "surrogate";    # UTF-16/CESU-8 nonsense
 no warnings "non_unicode";  # for codepoints over 0x10_FFFF

=head2 ℞ 3: Declare source in utf8 for identifiers and literals

Without the all-critical C<use utf8> declaration, putting UTF‑8 in your
literals and identifiers won’t work right. If you used the standard
preamble just given above, this already happened.
If you did, you can do things like this:

 use utf8;

 my $measure   = "Ångström";
 my @μsoft     = qw( cp852 cp1251 cp1252 );
 my @ὑπέρμεγας = qw( ὑπέρ  μεγας );
 my @鯉        = qw( koi8-f koi8-u koi8-r );
 my $motto     = "👪 💗 🐪"; # FAMILY, GROWING HEART, DROMEDARY CAMEL

If you forget C<use utf8>, high bytes will be misunderstood as
separate characters, and nothing will work right.

=head2 ℞ 4: Characters and their numbers

The C<ord> and C<chr> functions work transparently on all codepoints,
not just on ASCII alone, nor in fact even just on Unicode alone.

 # ASCII characters
 ord("A")
 chr(65)

 # characters from the Basic Multilingual Plane
 ord("Σ")
 chr(0x3A3)

 # beyond the BMP
 ord("𝑛")               # MATHEMATICAL ITALIC SMALL N
 chr(0x1D45B)

 # beyond Unicode! (up to MAXINT)
 ord("\x{20_0000}")
 chr(0x20_0000)

=head2 ℞ 5: Unicode literals by character number

In an interpolated literal, whether a double-quoted string or a
regex, you may specify a character by its number using the
C<\x{I<HHHHHH>}> escape.

 String: "\x{3a3}"
 Regex:  /\x{3a3}/

 String: "\x{1d45b}"
 Regex:  /\x{1d45b}/

 # even non-BMP ranges in regex work fine
 /[\x{1D434}-\x{1D467}]/

=head2 ℞ 6: Get character name by number

 use charnames ();
 my $name = charnames::viacode(0x03A3);

=head2 ℞ 7: Get character number by name

 use charnames ();
 my $number = charnames::vianame("GREEK CAPITAL LETTER SIGMA");

=head2 ℞ 8: Unicode named characters

Use the C<< \N{I<charname>} >> notation to get the character
by that name for use in interpolated literals (double-quoted
strings and regexes). In v5.16, there is an implicit

 use charnames qw(:full :short);

But prior to v5.16, you must be explicit about which set of charnames you
want. The C<:full> names are the official Unicode character name, alias, or
sequence, which all share a namespace.

 use charnames qw(:full :short latin greek);

 "\N{MATHEMATICAL ITALIC SMALL N}"      # :full
 "\N{GREEK CAPITAL LETTER SIGMA}"       # :full

Anything else is a Perl-specific convenience abbreviation.
Specify one or more scripts by names if you want short names that are
script-specific.

 "\N{Greek:Sigma}"                      # :short
 "\N{ae}"                               #  latin
 "\N{epsilon}"                          #  greek

The v5.16 release also supports a C<:loose> import for loose matching of
character names, which works just like loose matching of property names:
that is, it disregards case, whitespace, and underscores:

 "\N{euro sign}"                        # :loose (from v5.16)

Starting in v5.32, you can also use

 qr/\p{name=euro sign}/

to get official Unicode named characters in regular expressions. Loose
matching is always done for these.

=head2 ℞ 9: Unicode named sequences

These look just like character names but return multiple codepoints.
Notice the C<%vx> vector-print functionality in C<printf>.

 use charnames qw(:full);
 my $seq = "\N{LATIN CAPITAL LETTER A WITH MACRON AND GRAVE}";
 printf "U+%v04X\n", $seq;
 U+0100.0300

=head2 ℞ 10: Custom named characters

Use C<:alias> to give your own lexically scoped nicknames to existing
characters, or even to give unnamed private-use characters useful names.

 use charnames ":full", ":alias" => {
     ecute => "LATIN SMALL LETTER E WITH ACUTE",
     "APPLE LOGO" => 0xF8FF, # private use character
 };

 "\N{ecute}"
 "\N{APPLE LOGO}"

=head2 ℞ 11: Names of CJK codepoints

Sinograms like “東京” come back with character names of
C<CJK UNIFIED IDEOGRAPH-6771> and C<CJK UNIFIED IDEOGRAPH-4EAC>,
because their “names” vary. The CPAN C<Unicode::Unihan> module
has a large database for decoding these (and a whole lot more), provided you
know how to understand its output.
 # cpan -i Unicode::Unihan
 use Unicode::Unihan;
 my $str = "東京";
 my $unhan = Unicode::Unihan->new;
 for my $lang (qw(Mandarin Cantonese Korean JapaneseOn JapaneseKun)) {
     printf "CJK $str in %-12s is ", $lang;
     say $unhan->$lang($str);
 }

prints:

 CJK 東京 in Mandarin     is DONG1JING1
 CJK 東京 in Cantonese    is dung1ging1
 CJK 東京 in Korean       is TONGKYENG
 CJK 東京 in JapaneseOn   is TOUKYOU KEI KIN
 CJK 東京 in JapaneseKun  is HIGASHI AZUMAMIYAKO

If you have a specific romanization scheme in mind,
use the specific module:

 # cpan -i Lingua::JA::Romanize::Japanese
 use Lingua::JA::Romanize::Japanese;
 my $k2r = Lingua::JA::Romanize::Japanese->new;
 my $str = "東京";
 say "Japanese for $str is ", $k2r->chars($str);

prints

 Japanese for 東京 is toukyou

=head2 ℞ 12: Explicit encode/decode

On rare occasion, such as a database read, you may be
given encoded text you need to decode.

  use Encode qw(encode decode);

  my $chars = decode("shiftjis", $bytes, 1);
 # OR
  my $bytes = encode("MIME-Header-ISO_2022_JP", $chars, 1);

For streams all in the same encoding, don't use encode/decode; instead
set the file encoding when you open the file or immediately after with
C<binmode> as described later below.

=head2 ℞ 13: Decode program arguments as utf8

     $ perl -CA ...
 or
     $ export PERL_UNICODE=A
 or
    use Encode qw(decode);
    @ARGV = map { decode('UTF-8', $_, 1) } @ARGV;

=head2 ℞ 14: Decode program arguments as locale encoding

    # cpan -i Encode::Locale
    use Encode qw(decode);
    use Encode::Locale;

    # use "locale" as an arg to encode/decode
    @ARGV = map { decode(locale => $_, 1) } @ARGV;

=head2 ℞ 15: Declare STD{IN,OUT,ERR} to be utf8

Use a command-line option, an environment variable, or else
call C<binmode> explicitly:

     $ perl -CS ...
 or
     $ export PERL_UNICODE=S
 or
     use open qw(:std :encoding(UTF-8));
 or
     binmode(STDIN,  ":encoding(UTF-8)");
     binmode(STDOUT, ":utf8");
     binmode(STDERR, ":utf8");

=head2 ℞ 16: Declare STD{IN,OUT,ERR} to be in locale encoding

    # cpan -i Encode::Locale
    use Encode;
    use Encode::Locale;

    # or as a stream for binmode or open
    binmode STDIN,  ":encoding(console_in)"  if -t STDIN;
    binmode STDOUT, ":encoding(console_out)" if -t STDOUT;
    binmode STDERR, ":encoding(console_out)" if -t STDERR;

=head2 ℞ 17: Make file I/O default to utf8

Files opened without an encoding argument will be in UTF-8:

     $ perl -CD ...
 or
     $ export PERL_UNICODE=D
 or
     use open qw(:encoding(UTF-8));

=head2 ℞ 18: Make all I/O and args default to utf8

     $ perl -CSDA ...
 or
     $ export PERL_UNICODE=SDA
 or
     use open qw(:std :encoding(UTF-8));
     use Encode qw(decode);
     @ARGV = map { decode('UTF-8', $_, 1) } @ARGV;

=head2 ℞ 19: Open file with specific encoding

Specify stream encoding. This is the normal way
to deal with encoded text, not by calling low-level
functions.

 # input file
     open(my $in_file, "< :encoding(UTF-16)", "wintext");
 OR
     open(my $in_file, "<", "wintext");
     binmode($in_file, ":encoding(UTF-16)");
 THEN
     my $line = <$in_file>;

 # output file
     open(my $out_file, "> :encoding(cp1252)", "wintext");
 OR
     open(my $out_file, ">", "wintext");
     binmode($out_file, ":encoding(cp1252)");
 THEN
     print $out_file "some text\n";

More layers than just the encoding can be specified here. For example,
the incantation C<":raw :encoding(UTF-16LE) :crlf"> includes implicit
CRLF handling.

=head2 ℞ 20: Unicode casing

Unicode casing is very different from ASCII casing.
 uc("henry ⅷ")  # "HENRY Ⅷ"
 uc("tschüß")   # "TSCHÜSS"  notice ß => SS

 # both are true:
 "tschüß"  =~ /TSCHÜSS/i   # notice ß => SS
 "Σίσυφος" =~ /ΣΊΣΥΦΟΣ/i   # notice Σ,σ,ς sameness

=head2 ℞ 21: Unicode case-insensitive comparisons

Also available in the CPAN L<Unicode::CaseFold> module,
the new C<fc> “foldcase” function from v5.16 grants
access to the same Unicode casefolding as the C</i>
pattern modifier has always used:

 use feature "fc"; # fc() function is from v5.16

 # sort case-insensitively
 my @sorted = sort { fc($a) cmp fc($b) } @list;

 # both are true:
 fc("tschüß")  eq fc("TSCHÜSS")
 fc("Σίσυφος") eq fc("ΣΊΣΥΦΟΣ")

=head2 ℞ 22: Match Unicode linebreak sequence in regex

A Unicode linebreak matches the two-character CRLF
grapheme or any of seven vertical whitespace characters.
Good for dealing with textfiles coming from different
operating systems.

 \R

 s/\R/\n/g;  # normalize all linebreaks to \n

=head2 ℞ 23: Get character category

Find the general category of a numeric codepoint.

 use Unicode::UCD qw(charinfo);
 my $cat = charinfo(0x3A3)->{category};  # "Lu"

=head2 ℞ 24: Disabling Unicode-awareness in builtin charclasses

Disable C<\w>, C<\b>, C<\s>, C<\d>, and the POSIX
classes from working correctly on Unicode either in this
scope, or in just one regex.

 use v5.14;
 use re "/a";

 # OR

 my($num) = $str =~ /(\d+)/a;

Or use specific un-Unicode properties, like C<\p{ahex}>
and C<\p{POSIX_Digit}>. Properties still work normally
no matter what charset modifiers (C</d /u /l /a /aa>)
should be in effect.

=head2 ℞ 25: Match Unicode properties in regex with \p, \P

These all match a single codepoint with the given
property. Use C<\P> in place of C<\p> to match
one codepoint lacking that property.
 \pL, \pN, \pS, \pP, \pM, \pZ, \pC
 \p{Sk}, \p{Ps}, \p{Lt}
 \p{alpha}, \p{upper}, \p{lower}
 \p{Latin}, \p{Greek}
 \p{script_extensions=Latin}, \p{scx=Greek}
 \p{East_Asian_Width=Wide}, \p{EA=W}
 \p{Line_Break=Hyphen}, \p{LB=HY}
 \p{Numeric_Value=4}, \p{NV=4}

=head2 ℞ 26: Custom character properties

Define at compile-time your own custom character
properties for use in regexes.

 # using private-use characters
 sub In_Tengwar { "E000\tE07F\n" }

 if (/\p{In_Tengwar}/) { ... }

 # blending existing properties
 sub Is_GraecoRoman_Title {<<'END_OF_SET'}
 +utf8::IsLatin
 +utf8::IsGreek
 &utf8::IsTitle
 END_OF_SET

 if (/\p{Is_GraecoRoman_Title}/) { ... }

=head2 ℞ 27: Unicode normalization

Typically render into NFD on input and NFC on output. Using NFKC or NFKD
functions improves recall on searches, assuming you've already done the same
to the text to be searched. Note that this is about much more than just
pre-combined compatibility glyphs; it also reorders marks according to their
canonical combining classes and weeds out singletons.

 use Unicode::Normalize;
 my $nfd  = NFD($orig);
 my $nfc  = NFC($orig);
 my $nfkd = NFKD($orig);
 my $nfkc = NFKC($orig);

=head2 ℞ 28: Convert non-ASCII Unicode numerics

Unless you’ve used C</a> or C</aa>, C<\d> matches more than
ASCII digits only, but Perl’s implicit string-to-number
conversion does not currently recognize these. Here’s how to
convert such strings manually.

 use v5.14;  # needed for num() function
 use Unicode::UCD qw(num);

 my $str = "got Ⅻ and ४५६७ and ⅞ and here";
 my @nums = ();
 while ($str =~ /(\d+|\N)/g) {  # not just ASCII!
    push @nums, num($1);
 }
 say "@nums";   #     12      4567      0.875

 use charnames qw(:full);
 my $nv = num("\N{RUMI DIGIT ONE}\N{RUMI DIGIT TWO}");

=head2 ℞ 29: Match Unicode grapheme cluster in regex

Programmer-visible “characters” are codepoints matched by C</./s>,
but user-visible “characters” are graphemes matched by C</\X/>.

 # Find vowel *plus* any combining diacritics,underlining,etc.
 my $nfd = NFD($orig);
 $nfd =~ / (?=[aeiou]) \X /xi

=head2 ℞ 30: Extract by grapheme instead of by codepoint (regex)

 # match and grab five first graphemes
 my($first_five) = $str =~ /^ ( \X{5} ) /x;

=head2 ℞ 31: Extract by grapheme instead of by codepoint (substr)

 # cpan -i Unicode::GCString
 use Unicode::GCString;
 my $gcs = Unicode::GCString->new($str);
 my $first_five = $gcs->substr(0, 5);

=head2 ℞ 32: Reverse string by grapheme

Reversing by codepoint messes up diacritics, mistakenly converting
C<crème brûlée> into C<éel̂urb em̀erc> instead of into C<eélûrb emèrc>;
so reverse by grapheme instead. Both these approaches work
right no matter what normalization the string is in:

 $str = join("", reverse $str =~ /\X/g);

 # OR: cpan -i Unicode::GCString
 use Unicode::GCString;
 $str = reverse Unicode::GCString->new($str);

=head2 ℞ 33: String length in graphemes

The string C<brûlée> has six graphemes but up to eight codepoints.
This counts by grapheme, not by codepoint:

 my $str = "brûlée";
 my $count = 0;
 while ($str =~ /\X/g) { $count++ }

  # OR: cpan -i Unicode::GCString
 use Unicode::GCString;
 my $gcs = Unicode::GCString->new($str);
 my $count = $gcs->length;

=head2 ℞ 34: Unicode column-width for printing

Perl’s C<printf>, C<sprintf>, and C<format> think all
codepoints take up 1 print column, but many take 0 or 2.
Here to show that normalization makes no difference,
we print out both forms:

 use Unicode::GCString;
 use Unicode::Normalize;

 my @words = qw/crème brûlée/;
 @words = map { NFC($_), NFD($_) } @words;

 for my $str (@words) {
     my $gcs = Unicode::GCString->new($str);
     my $cols = $gcs->columns;
     my $pad = " " x (10 - $cols);
     say $str, $pad, " |";
 }

generates this to show that it pads correctly no matter
the normalization:

 crème      |
 crème      |
 brûlée     |
 brûlée     |

=head2 ℞ 35: Unicode collation

Text sorted by numeric codepoint follows no reasonable alphabetic order;
use the UCA for sorting text.
 use Unicode::Collate;
 my $col = Unicode::Collate->new();
 my @list = $col->sort(@old_list);

See the I<ucsort> program from the L<Unicode::Tussle> CPAN module
for a convenient command-line interface to this module.

=head2 ℞ 36: Case- I<and> accent-insensitive Unicode sort

Specify a collation strength of level 1 to ignore case and
diacritics, only looking at the basic character.

 use Unicode::Collate;
 my $col = Unicode::Collate->new(level => 1);
 my @list = $col->sort(@old_list);

=head2 ℞ 37: Unicode locale collation

Some locales have special sorting rules.

 # either use v5.12, OR: cpan -i Unicode::Collate::Locale
 use Unicode::Collate::Locale;
 my $col = Unicode::Collate::Locale->new(locale => "de__phonebook");
 my @list = $col->sort(@old_list);

The I<ucsort> program mentioned above accepts a C<--locale> parameter.

=head2 ℞ 38: Making C<cmp> work on text instead of codepoints

Instead of this:

 @srecs = sort {
     $b->{AGE}   <=>  $a->{AGE}
                 ||
     $a->{NAME}  cmp  $b->{NAME}
 } @recs;

Use this:

 my $coll = Unicode::Collate->new();
 for my $rec (@recs) {
     $rec->{NAME_key} = $coll->getSortKey( $rec->{NAME} );
 }
 @srecs = sort {
     $b->{AGE}       <=>  $a->{AGE}
                     ||
     $a->{NAME_key}  cmp  $b->{NAME_key}
 } @recs;

=head2 ℞ 39: Case- I<and> accent-insensitive comparisons

Use a collator object to compare Unicode text by character
instead of by codepoint.

 use Unicode::Collate;
 my $es = Unicode::Collate->new(
     level => 1,
     normalization => undef
 );

  # now both are true:
 $es->eq("García",  "GARCIA" );
 $es->eq("Márquez", "MARQUEZ");

=head2 ℞ 40: Case- I<and> accent-insensitive locale comparisons

Same, but in a specific locale.

 my $de = Unicode::Collate::Locale->new(
            locale => "de__phonebook",
          );

 # now this is true:
 $de->eq("tschüß", "TSCHUESS");  # notice ü => UE, ß => SS

=head2 ℞ 41: Unicode linebreaking

Break up text into lines according to Unicode rules.

 # cpan -i Unicode::LineBreak
 use Unicode::LineBreak;
 use charnames qw(:full);

 my $para = "This is a super\N{HYPHEN}long string. " x 20;
 my $fmt = Unicode::LineBreak->new;
 print $fmt->break($para), "\n";

=head2 ℞ 42: Unicode text in DBM hashes, the tedious way

Using a regular Perl string as a key or value for a DBM
hash will trigger a wide character exception if any codepoints
won’t fit into a byte. Here’s how to manually manage the translation:

    use DB_File;
    use Encode qw(encode decode);
    tie %dbhash, "DB_File", "pathname";

 # STORE

    # assume $uni_key and $uni_value are abstract Unicode strings
    my $enc_key   = encode("UTF-8", $uni_key, 1);
    my $enc_value = encode("UTF-8", $uni_value, 1);
    $dbhash{$enc_key} = $enc_value;

 # FETCH

    # assume $uni_key holds a normal Perl string (abstract Unicode)
    my $enc_key   = encode("UTF-8", $uni_key, 1);
    my $enc_value = $dbhash{$enc_key};
    my $uni_value = decode("UTF-8", $enc_value, 1);

=head2 ℞ 43: Unicode text in DBM hashes, the easy way

Here’s how to implicitly manage the translation; all encoding
and decoding is done automatically, just as with streams that
have a particular encoding attached to them:

    use DB_File;
    use DBM_Filter;

    my $dbobj = tie %dbhash, "DB_File", "pathname";
    $dbobj->Filter_Value("utf8");  # this is the magic bit

 # STORE

    # assume $uni_key and $uni_value are abstract Unicode strings
    $dbhash{$uni_key} = $uni_value;

  # FETCH

    # $uni_key holds a normal Perl string (abstract Unicode)
    my $uni_value = $dbhash{$uni_key};

=head2 ℞ 44: PROGRAM: Demo of Unicode collation and printing

Here’s a full program showing how to make use of locale-sensitive
sorting, Unicode casing, and managing print widths when some of the
characters take up zero or two columns, not just one column each time.
When run, the following program produces this nicely aligned output:

    Crème Brûlée....... €2.00
    Éclair............. €1.60
    Fideuà............. €4.20
    Hamburger.......... €6.00
    Jamón Serrano...... €4.45
    Linguiça........... €7.00
    Pâté............... €4.15
    Pears.............. €2.00
    Pêches............. €2.25
    Smørbrød........... €5.75
    Spätzle............ €5.50
    Xoriço............. €3.00
    Γύρος.............. €6.50
    막걸리............. €4.00
    おもち............. €2.65
    お好み焼き......... €8.00
    シュークリーム..... €1.85
    寿司............... €9.99
    包子............... €7.50

Here's that program; tested on v5.14.

 #!/usr/bin/env perl
 # umenu - demo sorting and printing of Unicode food
 #
 # (obligatory and increasingly long preamble)
 #
 use utf8;
 use v5.14;                       # for locale sorting
 use strict;
 use warnings;
 use warnings  qw(FATAL utf8);    # fatalize encoding faults
 use open      qw(:std :encoding(UTF-8)); # undeclared streams in UTF-8
 use charnames qw(:full :short);  # unneeded in v5.16

 # std modules
 use Unicode::Normalize;          # std perl distro as of v5.8
 use List::Util qw(max);          # std perl distro as of v5.10
 use Unicode::Collate::Locale;    # std perl distro as of v5.14

 # cpan modules
 use Unicode::GCString;           # from CPAN

 # forward defs
 sub pad($$$);
 sub colwidth(_);
 sub entitle(_);

 my %price = (
     "γύρος"             => 6.50, # gyros
     "pears"             => 2.00, # like um, pears
     "linguiça"          => 7.00, # spicy sausage, Portuguese
     "xoriço"            => 3.00, # chorizo sausage, Catalan
     "hamburger"         => 6.00, # burgermeister meisterburger
     "éclair"            => 1.60, # dessert, French
     "smørbrød"          => 5.75, # sandwiches, Norwegian
     "spätzle"           => 5.50, # Bayerisch noodles, little sparrows
     "包子"              => 7.50, # bao1 zi5, steamed pork buns, Mandarin
     "jamón serrano"     => 4.45, # country ham, Spanish
     "pêches"            => 2.25, # peaches, French
     "シュークリーム"    => 1.85, # cream-filled pastry like eclair
     "막걸리"            => 4.00, # makgeolli, Korean rice wine
     "寿司"              => 9.99, # sushi, Japanese
     "おもち"            => 2.65, # omochi, rice cakes, Japanese
     "crème brûlée"      => 2.00, # crema catalana
     "fideuà"            => 4.20, # more noodles, Valencian
                                  # (Catalan=fideuada)
     "pâté"              => 4.15, # gooseliver paste, French
     "お好み焼き"        => 8.00, # okonomiyaki, Japanese
 );

 my $width = 5 + max map { colwidth } keys %price;

 # So the Asian stuff comes out in an order that someone
 # who reads those scripts won't freak out over; the
 # CJK stuff will be in JIS X 0208 order that way.
 my $coll  = Unicode::Collate::Locale->new(locale => "ja");

 for my $item ($coll->sort(keys %price)) {
     print pad(entitle($item), $width, ".");
     printf " €%.2f\n", $price{$item};
 }

 sub pad($$$) {
     my($str, $width, $padchar) = @_;
     return $str . ($padchar x ($width - colwidth($str)));
 }

 sub colwidth(_) {
     my($str) = @_;
     return Unicode::GCString->new($str)->columns;
 }

 sub entitle(_) {
     my($str) = @_;
     $str =~ s{ (?=\pL)(\S)     (\S*) }
              { ucfirst($1) . lc($2)  }xge;
     return $str;
 }

=head1 SEE ALSO

See these manpages, some of which are CPAN modules:
L<perlunicode>, L<perluniprops>,
L<perlre>, L<perlrecharclass>,
L<perluniintro>, L<perlunitut>, L<perlunifaq>,
L<PerlIO>, L<DB_File>, L<DBM_Filter>, L<DBM_Filter::utf8>,
L<Encode>, L<Encode::Locale>,
L<Unicode::UCD>,
L<Unicode::Normalize>,
L<Unicode::GCString>, L<Unicode::LineBreak>,
L<Unicode::Collate>, L<Unicode::Collate::Locale>,
L<Unicode::Unihan>,
L<Unicode::CaseFold>,
L<Unicode::Tussle>,
L<Lingua::JA::Romanize::Japanese>,
L<Lingua::ZH::Romanize::Pinyin>,
L<Lingua::KO::Romanize::Hangul>.

The L<Unicode::Tussle> CPAN module includes many programs
to help with working with Unicode, including
these programs to fully or partly replace standard utilities:
I<tcgrep> instead of I<egrep>,
I<uniquote> instead of I<cat -v> or I<hexdump>,
I<uniwc> instead of I<wc>,
I<unilook> instead of I<look>,
I<unifmt> instead of I<fmt>,
and
I<ucsort> instead of I<sort>.
For exploring Unicode character names and character properties,
see its I<uniprops>, I<unichars>, and I<uninames> programs.
It also supplies these programs, all of which are general filters that do
Unicode-y things:
I<unititle> and I<unicaps>;
I<uniwide> and I<uninarrow>;
I<unisupers> and I<unisubs>;
I<nfd>, I<nfc>, I<nfkd>, and I<nfkc>;
and I<uc>, I<lc>, and I<tc>.
Finally, see the published Unicode Standard (page numbers are from version
6.0.0), including these specific annexes and technical reports:

=over

=item §3.13 Default Case Algorithms, page 113; §4.2 Case, pages 120–122; Case Mappings, pages 166–172, especially Caseless Matching starting on page 170.

=item UAX #44: Unicode Character Database

=item UTS #18: Unicode Regular Expressions

=item UAX #15: Unicode Normalization Forms

=item UTS #10: Unicode Collation Algorithm

=item UAX #29: Unicode Text Segmentation

=item UAX #14: Unicode Line Breaking Algorithm

=item UAX #11: East Asian Width

=back

=head1 AUTHOR

Tom Christiansen E<lt>tchrist@perl.comE<gt> wrote this, with occasional
kibitzing from Larry Wall and Jeffrey Friedl in the background.

=head1 COPYRIGHT AND LICENCE

Copyright © 2012 Tom Christiansen.

This program is free software; you may redistribute it and/or modify it
under the same terms as Perl itself.

Most of these examples taken from the current edition of the “Camel Book”;
that is, from the 4ᵗʰ Edition of I<Programming Perl>, Copyright © 2012 Tom
Christiansen I<et al.>, 2012-02-13 by O’Reilly Media. The code itself is
freely redistributable, and you are encouraged to transplant, fold,
spindle, and mutilate any of the examples in this manpage however you please
for inclusion into your own programs without any encumbrance whatsoever.
Acknowledgement via code comment is polite but not required.

=head1 REVISION HISTORY

v1.0.0 – first public release, 2012-02-27

If you read this file _as_is_, just ignore the funny characters you see.
It is written in the POD format (see pod/perlpod.pod) which is specially
designed to be readable as is.

=head1 NAME

perlaix - Perl version 5 on IBM AIX (UNIX) systems

=head1 DESCRIPTION

This document describes various features of IBM's UNIX operating
system AIX that will affect how Perl version 5 (hereafter just Perl)
is compiled and/or runs.
=head2 Compiling Perl 5 on AIX

For information on compilers on older versions of AIX, see L</Compiling
Perl 5 on older AIX versions up to 4.3.3>.

When compiling Perl, you must use an ANSI C compiler. AIX does not ship
an ANSI compliant C compiler with AIX by default, but binary builds
of gcc for AIX are widely available. A version of gcc is also included in
the AIX Toolbox which is shipped with AIX.

=head2 Supported Compilers

Currently all versions of IBM's "xlc", "xlc_r", "cc", "cc_r" or
"vac" ANSI/C compiler will work for building Perl if that compiler
works on your system.

If you plan to link Perl to any module that requires thread-support,
like DBD::Oracle, it is better to use the _r version of the compiler.
This will not build a threaded Perl, but a thread-enabled Perl. See
also L</Threaded Perl> later on.

As of writing (2010-09) only the I<IBM XL C for AIX> or I<IBM XL C/C++
for AIX> compiler is supported by IBM on AIX 5L/6.1/7.1.

The following compiler versions are currently supported by IBM:

    IBM XL C and IBM XL C/C++ V8, V9, V10, V11

The XL C for AIX is integrated in the XL C/C++ for AIX compiler and
therefore also supported.

If you choose XL C/C++ V9 you need APAR IZ35785 installed,
otherwise the integrated SDBM_File does not compile correctly due
to an optimization bug. You can circumvent this problem by
adding -qipa to the optimization flags (-Doptimize='-O -qipa').
The PTF for APAR IZ35785 which solves this problem is available
from IBM (April 2009 PTF for XL C/C++ Enterprise Edition for AIX, V9.0).

If you choose XL C/C++ V11 you need the April 2010 PTF (or newer)
installed, otherwise you will not get a working Perl version.

Perl can be compiled with either IBM's ANSI C compiler or with gcc.
The former is recommended, as not only can it compile Perl with no
difficulty, but it can also take advantage of features listed later
that require the use of IBM compiler-specific command-line flags.
If you decide to use gcc, make sure your installation is recent and
complete, and be sure to read the Perl INSTALL file for more gcc-specific
details. Please report any hoops you had to jump through to the
development team.

=head2 Incompatibility with AIX Toolbox lib gdbm

If the AIX Toolbox version of lib gdbm < 1.8.3-5 is installed on your
system then Perl will not work. This library contains the header files
/opt/freeware/include/gdbm/dbm.h|ndbm.h which conflict with the AIX
system versions. The lib gdbm will be automatically removed from the
wanted libraries if the presence of one of these two header files is
detected. If you want to build Perl with GDBM support then please install
at least gdbm-devel-1.8.3-5 (or higher).

=head2 Perl 5 was successfully compiled and tested on:

 Perl   | AIX Level           | Compiler Level          | w th | w/o th
 -------+---------------------+-------------------------+------+-------
 5.12.2 |5.1 TL9 32 bit       | XL C/C++ V7             | OK   | OK
 5.12.2 |5.1 TL9 64 bit       | XL C/C++ V7             | OK   | OK
 5.12.2 |5.2 TL10 SP8 32 bit  | XL C/C++ V8             | OK   | OK
 5.12.2 |5.2 TL10 SP8 32 bit  | gcc 3.2.2               | OK   | OK
 5.12.2 |5.2 TL10 SP8 64 bit  | XL C/C++ V8             | OK   | OK
 5.12.2 |5.3 TL8 SP8 32 bit   | XL C/C++ V9 + IZ35785   | OK   | OK
 5.12.2 |5.3 TL8 SP8 32 bit   | gcc 4.2.4               | OK   | OK
 5.12.2 |5.3 TL8 SP8 64 bit   | XL C/C++ V9 + IZ35785   | OK   | OK
 5.12.2 |5.3 TL10 SP3 32 bit  | XL C/C++ V11 + Apr 2010 | OK   | OK
 5.12.2 |5.3 TL10 SP3 64 bit  | XL C/C++ V11 + Apr 2010 | OK   | OK
 5.12.2 |6.1 TL1 SP7 32 bit   | XL C/C++ V10            | OK   | OK
 5.12.2 |6.1 TL1 SP7 64 bit   | XL C/C++ V10            | OK   | OK
 5.13   |7.1 TL0 SP1 32 bit   | XL C/C++ V11 + Jul 2010 | OK   | OK
 5.13   |7.1 TL0 SP1 64 bit   | XL C/C++ V11 + Jul 2010 | OK   | OK

 w th   = with thread support
 w/o th = without thread support
 OK     = tested

Successfully tested means that all "make test" runs finish with a
result of 100% OK. All tests were conducted with -Duseshrplib set.

All tests were conducted on the oldest supported AIX technology level
with the latest support package applied.
If the tested AIX version is out of support (AIX 4.3.3, 5.1, 5.2) then
the last available support level was used.

=head2 Building Dynamic Extensions on AIX

Starting from Perl 5.7.2 (and consequently 5.8.x / 5.10.x / 5.12.x)
and AIX 4.3 or newer, Perl uses the AIX native dynamic loading interface
in the so-called runtime linking mode instead of the emulated interface
that was used in Perl releases 5.6.1 and earlier, or on AIX releases 4.2
and earlier. This change does break backward compatibility with compiled
modules from earlier Perl releases. The change was made to make Perl
more compliant with other applications like Apache/mod_perl which are
using the AIX native interface. This change also enables the use of
C++ code with static constructors and destructors in Perl extensions,
which was not possible using the emulated interface.

It is highly recommended to use the new interface.

=head2 Using Large Files with Perl

Should yield no problems.

=head2 Threaded Perl

Should yield no problems with AIX 5.1 / 5.2 / 5.3 / 6.1 / 7.1.

IBM uses the AIX system Perl (V5.6.0 on AIX 5.1 and V5.8.2 on
AIX 5.2 / 5.3 and 6.1; V5.8.8 on AIX 5.3 TL11 and AIX 6.1 TL4; V5.10.1
on AIX 7.1) for some AIX system scripts. If you switch the links in
/usr/bin from the AIX system Perl (/usr/opt/perl5) to the newly built
Perl then you get the same features as with the IBM AIX system Perl if
the threaded options are used.

The threaded Perl build also works on AIX 5.1, but the IBM Perl
build (Perl v5.6.0) is not threaded on AIX 5.1.

Perl 5.12 and newer is not compatible with the IBM fileset perl.libext.

=head2 64-bit Perl

If your AIX system is installed with 64-bit support, you can expect 64-bit
configurations to work. If you want to use 64-bit Perl on AIX 6.1
you need an APAR for a libc.a bug which affects (n)dbm_XXX functions.
The APAR number for this problem is IZ39077.
If you need more memory (larger data segment) for your Perl programs you
can set:

    /etc/security/limits
    default:                    (or your user)
        data = -1               (default is 262144 * 512 byte)

With the default setting the size is limited to 128MB.
The -1 removes this limit. If the "make test" fails please change
your /etc/security/limits as stated above.

=head2 Long doubles

IBM calls its implementation of long doubles 128-bit, but it is not
the IEEE 128-bit ("quadruple precision") which would give 113 bits
of mantissa (nor is it implemented in hardware); instead it is a
special software implementation called "double-double", which gives
106 bits of mantissa.

There seem to be various problems in this long double implementation.
If Configure detects this brokenness, it will disable the long double support.
This can be overridden with explicit C<-Duselongdouble> (or C<-Dusemorebits>,
which enables both long doubles and 64 bit integers). If you decide to
enable long doubles, for most of the broken things Perl has implemented
workarounds, but the handling of the special values infinity and NaN
remains badly broken: for example infinity plus zero results in NaN.

=head2 Recommended Options AIX 5.1/5.2/5.3/6.1 and 7.1 (threaded/32-bit)

With the following options you get a threaded Perl version which
passes all make tests in threaded 32-bit mode, which is the default
configuration for the Perl builds that AIX ships with.

    rm config.sh
    ./Configure \
    -d \
    -Dcc=cc_r \
    -Duseshrplib \
    -Dusethreads \
    -Dprefix=/usr/opt/perl5_32

The -Dprefix option will install Perl in a directory parallel to the
IBM AIX system Perl installation.

=head2 Recommended Options AIX 5.1/5.2/5.3/6.1 and 7.1 (32-bit)

With the following options you get a Perl version which passes
all make tests in 32-bit mode.

    rm config.sh
    ./Configure \
    -d \
    -Dcc=cc_r \
    -Duseshrplib \
    -Dprefix=/usr/opt/perl5_32

The -Dprefix option will install Perl in a directory parallel to the
IBM AIX system Perl installation.
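Once a build made with options like these has been installed, its
configuration can be checked from Perl itself. The following is a minimal
sketch, assuming only the core C<Config> module; the function name
C<build_summary> and the keys of the hash it returns are illustrative
choices, not part of any Perl or AIX interface.

```perl
use strict;
use warnings;
use Config;

# Summarise the Configure-time choices recorded in %Config.
# The hash keys on the left are names chosen for this sketch.
sub build_summary {
    return {
        compiler       => $Config{cc},        # value given to -Dcc
        prefix         => $Config{prefix},    # value given to -Dprefix
        threaded       => ($Config{usethreads} // '') eq 'define' ? 1 : 0,
        shared_libperl => ($Config{useshrplib} // '') eq 'true'   ? 1 : 0,
        all_64bit      => ($Config{use64bitall} // '') eq 'define' ? 1 : 0,
        ptrsize        => $Config{ptrsize},   # bytes per C pointer
    };
}

my $summary = build_summary();
printf "%-15s %s\n", $_, $summary->{$_} for sort keys %$summary;
```

A build configured with C<-Dusethreads> and C<-Duseshrplib> should report 1
for both C<threaded> and C<shared_libperl>; a 64-bit build made with
C<-Duse64bitall> should report 1 for C<all_64bit> and a C<ptrsize> of 8.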
=head2 Recommended Options AIX 5.1/5.2/5.3/6.1 and 7.1 (threaded/64-bit)

With the following options you get a threaded Perl version which passes all make tests in 64-bit mode.

    export OBJECT_MODE=64 / setenv OBJECT_MODE 64 (depending on your shell)

    rm config.sh
    ./Configure \
    -d \
    -Dcc=cc_r \
    -Duseshrplib \
    -Dusethreads \
    -Duse64bitall \
    -Dprefix=/usr/opt/perl5_64

=head2 Recommended Options AIX 5.1/5.2/5.3/6.1 and 7.1 (64-bit)

With the following options you get a Perl version which passes all make tests in 64-bit mode.

    export OBJECT_MODE=64 / setenv OBJECT_MODE 64 (depending on your shell)

    rm config.sh
    ./Configure \
    -d \
    -Dcc=cc_r \
    -Duseshrplib \
    -Duse64bitall \
    -Dprefix=/usr/opt/perl5_64

The -Dprefix option will install Perl in a directory parallel to the IBM AIX system Perl installation.

If you choose gcc to compile 64-bit Perl then you need to add the following option:

    -Dcc='gcc -maix64'

=head2 Compiling Perl 5 on AIX 7.1.0

A regression in AIX 7 causes a failure in make test in Time::Piece during daylight savings time. APAR IV16514 provides the fix for this. A quick test to see if it's required, assuming it is currently daylight savings in Eastern Time, would be to run C< TZ=EST5 date +%Z >. This will come back with C<EST> normally, but nothing if you have the problem.

=head2 Compiling Perl 5 on older AIX versions up to 4.3.3

Because AIX 4.3.3 reached end-of-service on December 31, 2003, this information is provided as is. The Perl versions prior to Perl 5.8.9 could be compiled on AIX up to 4.3.3 with the following settings (your mileage may vary):

When compiling Perl, you must use an ANSI C compiler. AIX does not ship an ANSI compliant C compiler with AIX by default, but binary builds of gcc for AIX are widely available.

At the moment of writing, AIX supports two different native C compilers, for which you have to pay: B<xlC> and B<vac>.
If you decide to use either of these two (which is quite a lot easier than using gcc), be sure to upgrade to the latest available patch level. Currently: xlC.C 3.1.4.10 or 3.6.6.0 or 4.0.2.2 or 5.0.2.9 or 6.0.0.3 vac.C 4.4.0.3 or 5.0.2.6 or 6.0.0.1 note that xlC has the OS version in the name as of version 4.0.2.0, so you will find xlC.C for AIX-5.0 as package xlC.aix50.rte 5.0.2.0 or 6.0.0.3 subversions are not the same "latest" on all OS versions. For example, the latest xlC-5 on aix41 is 5.0.2.9, while on aix43, it is 5.0.2.7. Perl can be compiled with either IBM's ANSI C compiler or with gcc. The former is recommended, as not only can it compile Perl with no difficulty, but also can take advantage of features listed later that require the use of IBM compiler-specific command-line flags. The IBM's compiler patch levels 5.0.0.0 and 5.0.1.0 have compiler optimization bugs that affect compiling perl.c and regcomp.c, respectively. If Perl's configuration detects those compiler patch levels, optimization is turned off for the said source code files. Upgrading to at least 5.0.2.0 is recommended. If you decide to use gcc, make sure your installation is recent and complete, and be sure to read the Perl INSTALL file for more gcc-specific details. Please report any hoops you had to jump through to the development team. =head2 OS level Before installing the patches to the IBM C-compiler you need to know the level of patching for the Operating System. IBM's command 'oslevel' will show the base, but is not always complete (in this example oslevel shows 4.3.NULL, whereas the system might run most of 4.3.THREE): # oslevel 4.3.0.0 # lslpp -l | grep 'bos.rte ' bos.rte 4.3.3.75 COMMITTED Base Operating System Runtime bos.rte 4.3.2.0 COMMITTED Base Operating System Runtime # The same might happen to AIX 5.1 or other OS levels. 
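Because C<oslevel> can under-report, it may help to pick the highest C<bos.rte> level out of the C<lslpp> listing programmatically. The following sketch works on the two sample lines shown above; the helper name C<cmp_level> is hypothetical, and on a real system the input would more likely come from a pipe to C<lslpp>.

```perl
use strict;
use warnings;

# Sample `lslpp -l | grep 'bos.rte '` output, copied from the text above.
my @lslpp = (
    'bos.rte 4.3.3.75 COMMITTED Base Operating System Runtime',
    'bos.rte 4.3.2.0 COMMITTED Base Operating System Runtime',
);

# Compare two dotted four-part levels numerically, field by field.
sub cmp_level {
    my @x = split /\./, $_[0];
    my @y = split /\./, $_[1];
    for my $i (0 .. 3) {
        my $c = ($x[$i] // 0) <=> ($y[$i] // 0);
        return $c if $c;
    }
    return 0;
}

# Extract the level column and sort so the highest level comes first.
my @levels    = map { /^bos\.rte\s+(\d+(?:\.\d+){3})\s/ ? $1 : () } @lslpp;
my ($highest) = sort { cmp_level($b, $a) } @levels;

print "highest bos.rte level: $highest\n";
```

A plain string sort would not be reliable here, since it would rank 4.3.3.9 above 4.3.3.75; comparing each field as a number avoids that.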
As a side note, Perl cannot be built without bos.adt.syscalls and bos.adt.libm installed:

    # lslpp -l | egrep "syscalls|libm"
    bos.adt.libm      5.1.0.25  COMMITTED  Base Application Development
    bos.adt.syscalls  5.1.0.36  COMMITTED  System Calls Application
    #

=head2 Building Dynamic Extensions on AIX E<lt> 5L

AIX supports dynamically loadable objects as well as shared libraries. Shared libraries by convention end with the suffix .a, which is a bit misleading, as an archive can contain static as well as dynamic members. For Perl dynamically loaded objects we use the .so suffix also used on many other platforms.

Note that starting from Perl 5.7.2 (and consequently 5.8.0) and AIX 4.3 or newer, Perl uses the AIX native dynamic loading interface in the so-called runtime linking mode, instead of the emulated interface that was used in Perl releases 5.6.1 and earlier, or on AIX releases 4.2 and earlier. This change does break backward compatibility with compiled modules from earlier Perl releases. The change was made to make Perl more compliant with other applications like Apache/mod_perl which are using the AIX native interface. This change also enables the use of C++ code with static constructors and destructors in Perl extensions, which was not possible using the emulated interface.

=head2 The IBM ANSI C Compiler

All defaults for Configure can be used.

If you've chosen to use vac 4, be sure to run 4.4.0.3. Older versions will cause problems later on. For vac 5 be sure to run at least 5.0.1.0, but vac 5.0.2.6 or up is highly recommended. Note that since IBM has removed vac 5.0.2.1 through 5.0.2.5 from the software depot, these versions should be considered obsolete.

Here's a brief guide on how to upgrade the compiler to the latest level. Of course this is subject to change.
You can only upgrade versions from ftp-available updates if the first three digit groups are the same (in where you can skip intermediate unlike the patches in the developer snapshots of Perl), or to one version up where the "base" is available. In other words, the AIX compiler patches are cumulative. vac.C.4.4.0.1 => vac.C.4.4.0.3 is OK (vac.C.4.4.0.2 not needed) xlC.C.3.1.3.3 => xlC.C.3.1.4.10 is NOT OK (xlC.C.3.1.4.0 is not available) # ftp ftp.software.ibm.com Connected to service.boulder.ibm.com. : welcome message ... Name (ftp.software.ibm.com:merijn): anonymous 331 Guest login ok, send your complete e-mail address as password. Password: ... accepted login stuff ftp> cd /aix/fixes/v4/ ftp> dir other other.ll output to local-file: other.ll? y 200 PORT command successful. 150 Opening ASCII mode data connection for /bin/ls. 226 Transfer complete. ftp> dir xlc xlc.ll output to local-file: xlc.ll? y 200 PORT command successful. 150 Opening ASCII mode data connection for /bin/ls. 226 Transfer complete. ftp> bye ... 
goodbye messages # ls -l *.ll -rw-rw-rw- 1 merijn system 1169432 Nov 2 17:29 other.ll -rw-rw-rw- 1 merijn system 29170 Nov 2 17:29 xlc.ll On AIX 4.2 using xlC, we continue: # lslpp -l | fgrep 'xlC.C ' xlC.C 3.1.4.9 COMMITTED C for AIX Compiler xlC.C 3.1.4.0 COMMITTED C for AIX Compiler # grep 'xlC.C.3.1.4.*.bff' xlc.ll -rw-r--r-- 1 45776101 1 6286336 Jul 22 1996 xlC.C.3.1.4.1.bff -rw-rw-r-- 1 45776101 1 6173696 Aug 24 1998 xlC.C.3.1.4.10.bff -rw-r--r-- 1 45776101 1 6319104 Aug 14 1996 xlC.C.3.1.4.2.bff -rw-r--r-- 1 45776101 1 6316032 Oct 21 1996 xlC.C.3.1.4.3.bff -rw-r--r-- 1 45776101 1 6315008 Dec 20 1996 xlC.C.3.1.4.4.bff -rw-rw-r-- 1 45776101 1 6178816 Mar 28 1997 xlC.C.3.1.4.5.bff -rw-rw-r-- 1 45776101 1 6188032 May 22 1997 xlC.C.3.1.4.6.bff -rw-rw-r-- 1 45776101 1 6191104 Sep 5 1997 xlC.C.3.1.4.7.bff -rw-rw-r-- 1 45776101 1 6185984 Jan 13 1998 xlC.C.3.1.4.8.bff -rw-rw-r-- 1 45776101 1 6169600 May 27 1998 xlC.C.3.1.4.9.bff # wget ftp://ftp.software.ibm.com/aix/fixes/v4/xlc/xlC.C.3.1.4.10.bff # On AIX 4.3 using vac, we continue: # lslpp -l | grep 'vac.C ' vac.C 5.0.2.2 COMMITTED C for AIX Compiler vac.C 5.0.2.0 COMMITTED C for AIX Compiler # grep 'vac.C.5.0.2.*.bff' other.ll -rw-rw-r-- 1 45776101 1 13592576 Apr 16 2001 vac.C.5.0.2.0.bff -rw-rw-r-- 1 45776101 1 14133248 Apr 9 2002 vac.C.5.0.2.3.bff -rw-rw-r-- 1 45776101 1 14173184 May 20 2002 vac.C.5.0.2.4.bff -rw-rw-r-- 1 45776101 1 14192640 Nov 22 2002 vac.C.5.0.2.6.bff # wget ftp://ftp.software.ibm.com/aix/fixes/v4/other/vac.C.5.0.2.6.bff # Likewise on all other OS levels. Then execute the following command, and fill in its choices # smit install_update -> Install and Update from LATEST Available Software * INPUT device / directory for software [ vac.C.5.0.2.6.bff ] [ OK ] [ OK ] Follow the messages ... and you're done. If you like a more web-like approach, a good start point can be L<http://www14.software.ibm.com/webapp/download/downloadaz.jsp> and click "C for AIX", and follow the instructions. 
=head2 The usenm option

If linking miniperl

    cc -o miniperl ... miniperlmain.o opmini.o perl.o ... -lm -lc ...

causes errors like these

    ld: 0711-317 ERROR: Undefined symbol: .aintl
    ld: 0711-317 ERROR: Undefined symbol: .copysignl
    ld: 0711-317 ERROR: Undefined symbol: .syscall
    ld: 0711-317 ERROR: Undefined symbol: .eaccess
    ld: 0711-317 ERROR: Undefined symbol: .setresuid
    ld: 0711-317 ERROR: Undefined symbol: .setresgid
    ld: 0711-317 ERROR: Undefined symbol: .setproctitle
    ld: 0711-345 Use the -bloadmap or -bnoquiet option to obtain more
                                    information.

you could retry with

    make realclean
    rm config.sh
    ./Configure -Dusenm ...

which makes Configure use the C<nm> tool when scanning for library symbols, which usually is not done in AIX.

Related to this, you probably should not use the C<-r> option of Configure in AIX, because that affects how the C<nm> tool is used.

=head2 Using GNU's gcc for building Perl

Using gcc-3.x (tested with 3.0.4, 3.1, and 3.2) now works out of the box, as do recent gcc-2.9 builds available directly from IBM as part of their Linux compatibility packages, available here:

    http://www.ibm.com/servers/aix/products/aixos/linux/

=head2 Using Large Files with Perl E<lt> 5L

Should yield no problems.

=head2 Threaded Perl E<lt> 5L

Threads seem to work OK, though at the moment not all tests pass when threads are used in combination with 64-bit configurations.

You may get a warning when doing a threaded build:

    "pp_sys.c", line 4640.39: 1506-280 (W) Function argument assignment
    between types "unsigned char*" and "const void*" is not allowed.

The exact line number may vary, but if the warning (W) comes from a line like this

    hent = PerlSock_gethostbyaddr(addr, (Netdb_hlen_t) addrlen, addrtype);

in the "pp_ghostent" function, you may ignore it safely. The warning is caused by the reentrant variant of gethostbyaddr() having a slightly different prototype than its non-reentrant variant, but the difference is not really significant here.
=head2 64-bit Perl E<lt> 5L

If your AIX is installed with 64-bit support, you can expect 64-bit configurations to work. In combination with threads some tests might still fail.

=head2 AIX 4.2 and extensions using C++ with statics

In AIX 4.2 Perl extensions that use C++ functions that use statics may have problems in that the statics are not getting initialized. In newer AIX releases this has been solved by linking Perl with the libC_r library, but unfortunately in AIX 4.2 the said library has an obscure bug where the various functions related to time (such as time() and gettimeofday()) return broken values, and therefore in AIX 4.2 Perl is not linked against the libC_r.

=head1 AUTHORS

Rainer Tammer <tammer@tammer.net>

=cut

=head1 NAME

perluniintro - Perl Unicode introduction

=head1 DESCRIPTION

This document gives a general idea of Unicode and how to use Unicode in Perl. See L</Further Resources> for references to more in-depth treatments of Unicode.

=head2 Unicode

Unicode is a character set standard which plans to codify all of the writing systems of the world, plus many other symbols.

Unicode and ISO/IEC 10646 are coordinated standards that unify almost all other modern character set standards, covering more than 80 writing systems and hundreds of languages, including all commercially-important modern languages. All characters in the largest Chinese, Japanese, and Korean dictionaries are also encoded. The standards will eventually cover almost all characters in more than 250 writing systems and thousands of languages. Unicode 1.0 was released in October 1991, and 6.0 in October 2010.

A Unicode I<character> is an abstract entity. It is not bound to any particular integer width, especially not to the C language C<char>. Unicode is language-neutral and display-neutral: it does not encode the language of the text, and it does not generally define fonts or other graphical layout details.
Unicode operates on characters and on text built from those characters. Unicode defines characters like C<LATIN CAPITAL LETTER A> or C<GREEK SMALL LETTER ALPHA> and unique numbers for the characters, in this case 0x0041 and 0x03B1, respectively. These unique numbers are called I<code points>. A code point is essentially the position of the character within the set of all possible Unicode characters, and thus in Perl, the term I<ordinal> is often used interchangeably with it. The Unicode standard prefers using hexadecimal notation for the code points. If numbers like C<0x0041> are unfamiliar to you, take a peek at a later section, L</"Hexadecimal Notation">. The Unicode standard uses the notation C<U+0041 LATIN CAPITAL LETTER A>, to give the hexadecimal code point and the normative name of the character. Unicode also defines various I<properties> for the characters, like "uppercase" or "lowercase", "decimal digit", or "punctuation"; these properties are independent of the names of the characters. Furthermore, various operations on the characters like uppercasing, lowercasing, and collating (sorting) are defined. A Unicode I<logical> "character" can actually consist of more than one internal I<actual> "character" or code point. For Western languages, this is adequately modelled by a I<base character> (like C<LATIN CAPITAL LETTER A>) followed by one or more I<modifiers> (like C<COMBINING ACUTE ACCENT>). This sequence of base character and modifiers is called a I<combining character sequence>. Some non-western languages require more complicated models, so Unicode created the I<grapheme cluster> concept, which was later further refined into the I<extended grapheme cluster>. For example, a Korean Hangul syllable is considered a single logical character, but most often consists of three actual Unicode characters: a leading consonant followed by an interior vowel followed by a trailing consonant. 
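The Hangul case can be checked directly in Perl. The sketch below assumes one particular syllable built from three conjoining jamo (U+1100, U+1161, U+11A8), chosen only for illustration; it counts the code points with C<length()> and the extended grapheme clusters with C<\X>.

```perl
use strict;
use warnings;

# Three conjoining jamo: leading consonant, vowel, trailing consonant.
my $syllable = "\x{1100}\x{1161}\x{11A8}";

my $code_points = length $syllable;        # counts code points
my @clusters    = $syllable =~ /\X/g;      # extended grapheme clusters

printf "%d code points, %d grapheme cluster(s)\n",
    $code_points, scalar @clusters;
```

The string holds three code points, but C<\X> consumes all of them as a single cluster.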
Whether to call these extended grapheme clusters "characters" depends on your point of view. If you are a programmer, you probably would tend towards seeing each element in the sequences as one unit, or "character". However from the user's point of view, the whole sequence could be seen as one "character" since that's probably what it looks like in the context of the user's language. In this document, we take the programmer's point of view: one "character" is one Unicode code point. For some combinations of base character and modifiers, there are I<precomposed> characters. There is a single character equivalent, for example, for the sequence C<LATIN CAPITAL LETTER A> followed by C<COMBINING ACUTE ACCENT>. It is called C<LATIN CAPITAL LETTER A WITH ACUTE>. These precomposed characters are, however, only available for some combinations, and are mainly meant to support round-trip conversions between Unicode and legacy standards (like ISO 8859). Using sequences, as Unicode does, allows for needing fewer basic building blocks (code points) to express many more potential grapheme clusters. To support conversion between equivalent forms, various I<normalization forms> are also defined. Thus, C<LATIN CAPITAL LETTER A WITH ACUTE> is in I<Normalization Form Composed>, (abbreviated NFC), and the sequence C<LATIN CAPITAL LETTER A> followed by C<COMBINING ACUTE ACCENT> represents the same character in I<Normalization Form Decomposed> (NFD). Because of backward compatibility with legacy encodings, the "a unique number for every character" idea breaks down a bit: instead, there is "at least one number for every character". The same character could be represented differently in several legacy encodings. The converse is not true: some code points do not have an assigned character. Firstly, there are unallocated code points within otherwise used blocks. Secondly, there are special Unicode control characters that do not represent true characters. 
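The relationship between the precomposed character and the combining sequence described above can be verified with the core C<Unicode::Normalize> module. This is a small sketch; the variable names are illustrative only.

```perl
use strict;
use warnings;
use Unicode::Normalize qw(NFC NFD);

my $precomposed = "\N{U+00C1}";     # LATIN CAPITAL LETTER A WITH ACUTE
my $sequence    = "A\N{U+0301}";    # A followed by COMBINING ACUTE ACCENT

# Code point by code point, the two strings differ...
my $same_raw = ($precomposed eq $sequence) ? 1 : 0;

# ...but each normalization form maps one representation onto the other.
my $same_nfc = (NFC($sequence)    eq $precomposed) ? 1 : 0;
my $same_nfd = (NFD($precomposed) eq $sequence)    ? 1 : 0;

print "raw: $same_raw, NFC: $same_nfc, NFD: $same_nfd\n";
```

Normalizing both sides to the same form before comparing is the usual way to treat such strings as equal.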
When Unicode was first conceived, it was thought that all the world's characters could be represented using a 16-bit word; that is a maximum of C<0x10000> (or 65,536) characters would be needed, from C<0x0000> to C<0xFFFF>. This soon proved to be wrong, and since Unicode 2.0 (July 1996), Unicode has been defined all the way up to 21 bits (C<0x10FFFF>), and Unicode 3.1 (March 2001) defined the first characters above C<0xFFFF>. The first C<0x10000> characters are called the I<Plane 0>, or the I<Basic Multilingual Plane> (BMP). With Unicode 3.1, 17 (yes, seventeen) planes in all were defined--but they are nowhere near full of defined characters, yet.

When a new language is being encoded, Unicode generally will choose a C<block> of consecutive unallocated code points for its characters. So far, the number of code points in these blocks has always been evenly divisible by 16. Extras in a block, not currently needed, are left unallocated, for future growth. But there have been occasions when a later release needed more code points than the available extras, and a new block had to be allocated somewhere else, not contiguous to the initial one, to handle the overflow. Thus, it became apparent early on that "block" wasn't an adequate organizing principle, and so the C<Script> property was created. (Later an improved script property was added as well, the C<Script_Extensions> property.) Those code points that are in overflow blocks can still have the same script as the original ones. The script concept fits more closely with natural language: there is C<Latin> script, C<Greek> script, and so on; and there are several artificial scripts, like C<Common> for characters that are used in multiple scripts, such as mathematical symbols. Scripts usually span varied parts of several blocks. For more information about scripts, see L<perlunicode/Scripts>.
The division into blocks exists, but it is almost completely accidental--an artifact of how the characters have been and still are allocated. (Note that this paragraph has oversimplified things for the sake of this being an introduction. Unicode doesn't really encode languages, but the writing systems for them--their scripts; and one script can be used by many languages. Unicode also encodes things that aren't really about languages, such as symbols like C<BAGGAGE CLAIM>.)

The Unicode code points are just abstract numbers. To input and output these abstract numbers, the numbers must be I<encoded> or I<serialised> somehow. Unicode defines several I<character encoding forms>, of which I<UTF-8> is the most popular. UTF-8 is a variable length encoding that encodes Unicode characters as 1 to 4 bytes. Other encodings include UTF-16 and UTF-32 and their big- and little-endian variants (UTF-8 is byte-order independent). The ISO/IEC 10646 defines the UCS-2 and UCS-4 encoding forms.

For more information about encodings--for instance, to learn what I<surrogates> and I<byte order marks> (BOMs) are--see L<perlunicode>.

=head2 Perl's Unicode Support

Starting from Perl v5.6.0, Perl has had the capacity to handle Unicode natively. Perl v5.8.0, however, is the first recommended release for serious Unicode work. The maintenance release 5.6.1 fixed many of the problems of the initial Unicode implementation, but for example regular expressions still do not work with Unicode in 5.6.1. Perl v5.14.0 is the first release where Unicode support is (almost) seamlessly integrable without some gotchas. (There are a few exceptions. Firstly, some differences in L<quotemeta|perlfunc/quotemeta> were fixed starting in Perl 5.16.0. Secondly, some differences in L<the range operator|perlop/Range Operators> were fixed starting in Perl 5.26.0. Thirdly, some differences in L<split|perlfunc/split> were fixed starting in Perl 5.28.0.)
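To make the "1 to 4 bytes" property of UTF-8 mentioned earlier concrete, the following sketch encodes four code points with the core C<Encode> module and reports the length of each result. The code points 0x41, 0x3B1 and 0x263A appear elsewhere in this document; 0x1F600 is an extra value picked here only because it lies above C<0xFFFF>.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Map each code point to the number of bytes in its UTF-8 encoding.
my %utf8_len;
for my $cp (0x41, 0x3B1, 0x263A, 0x1F600) {
    my $bytes = encode('UTF-8', chr $cp);
    $utf8_len{$cp} = length $bytes;
    printf "U+%04X -> %d byte(s): %s\n", $cp, length $bytes,
        join ' ', map { sprintf '%02X', ord } split //, $bytes;
}
```

The byte count grows with the magnitude of the code point, which is why UTF-8 is called a variable length encoding.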
To enable this seamless support, you should C<use feature 'unicode_strings'> (which is automatically selected if you C<use 5.012> or higher). See L<feature>. (5.14 also fixes a number of bugs and departures from the Unicode standard.)

Before Perl v5.8.0, C<use utf8> was used to declare that operations in the current block or file would be Unicode-aware. This model was found to be wrong, or at least clumsy: the "Unicodeness" is now carried with the data, instead of being attached to the operations. Starting with Perl v5.8.0, only one case remains where an explicit C<use utf8> is needed: if your Perl script itself is encoded in UTF-8, you can use UTF-8 in your identifier names, and in string and regular expression literals, by saying C<use utf8>. This is not the default because scripts with legacy 8-bit data in them would break. See L<utf8>.

=head2 Perl's Unicode Model

Perl supports both pre-5.6 strings of eight-bit native bytes, and strings of Unicode characters. The general principle is that Perl tries to keep its data as eight-bit bytes for as long as possible, but as soon as Unicodeness cannot be avoided, the data is transparently upgraded to Unicode. Prior to Perl v5.14.0, the upgrade was not completely transparent (see L<perlunicode/The "Unicode Bug">), and for backwards compatibility, full transparency is not gained unless C<use feature 'unicode_strings'> (see L<feature>) or C<use 5.012> (or higher) is selected.

Internally, Perl currently uses either the native eight-bit character set of the platform (for example Latin-1) or UTF-8 to encode Unicode strings. Specifically, if all code points in the string are C<0xFF> or less, Perl uses the native eight-bit character set. Otherwise, it uses UTF-8.

A user of Perl does not normally need to know nor care how Perl happens to encode its internal strings, but it becomes relevant when outputting Unicode strings to a stream without a PerlIO layer (one with the "default" encoding).
In such a case, the raw bytes used internally (the native character set or UTF-8, as appropriate for each string) will be used, and a "Wide character" warning will be issued if those strings contain a character beyond 0x00FF. For example, perl -e 'print "\x{DF}\n", "\x{0100}\x{DF}\n"' produces a fairly useless mixture of native bytes and UTF-8, as well as a warning: Wide character in print at ... To output UTF-8, use the C<:encoding> or C<:utf8> output layer. Prepending binmode(STDOUT, ":utf8"); to this sample program ensures that the output is completely UTF-8, and removes the program's warning. You can enable automatic UTF-8-ification of your standard file handles, default C<open()> layer, and C<@ARGV> by using either the C<-C> command line switch or the C<PERL_UNICODE> environment variable, see L<perlrun|perlrun/-C [numberE<sol>list]> for the documentation of the C<-C> switch. Note that this means that Perl expects other software to work the same way: if Perl has been led to believe that STDIN should be UTF-8, but then STDIN coming in from another command is not UTF-8, Perl will likely complain about the malformed UTF-8. All features that combine Unicode and I/O also require using the new PerlIO feature. Almost all Perl 5.8 platforms do use PerlIO, though: you can see whether yours is by running "perl -V" and looking for C<useperlio=define>. =head2 Unicode and EBCDIC Perl 5.8.0 added support for Unicode on EBCDIC platforms. This support was allowed to lapse in later releases, but was revived in 5.22. Unicode support is somewhat more complex to implement since additional conversions are needed. See L<perlebcdic> for more information. On EBCDIC platforms, the internal Unicode encoding form is UTF-EBCDIC instead of UTF-8. 
The difference is that UTF-8 is "ASCII-safe", in that ASCII characters encode to UTF-8 as-is, while UTF-EBCDIC is "EBCDIC-safe", in that all the basic characters (which include all those that have ASCII equivalents, like C<"A">, C<"0">, C<"%">, I<etc.>) are the same in both EBCDIC and UTF-EBCDIC. Often, documentation will use the term "UTF-8" to mean UTF-EBCDIC as well. This is the case in this document.

=head2 Creating Unicode

This section applies fully to Perls starting with v5.22. Various caveats for earlier releases are in the L</Earlier releases caveats> subsection below.

To create Unicode characters in literals, use the C<\N{...}> notation in double-quoted strings:

    my $smiley_from_name = "\N{WHITE SMILING FACE}";
    my $smiley_from_code_point = "\N{U+263a}";

Similarly, they can be used in regular expression literals

    $smiley =~ /\N{WHITE SMILING FACE}/;
    $smiley =~ /\N{U+263a}/;

or, starting in v5.32:

    $smiley =~ /\p{Name=WHITE SMILING FACE}/;
    $smiley =~ /\p{Name=whitesmilingface}/;

At run-time you can use:

    use charnames ();
    my $hebrew_alef_from_name
                      = charnames::string_vianame("HEBREW LETTER ALEF");
    my $hebrew_alef_from_code_point = charnames::string_vianame("U+05D0");

Naturally, C<ord()> will do the reverse: it turns a character into a code point.

There are other runtime options as well. You can use C<pack()>:

    my $hebrew_alef_from_code_point = pack("U", 0x05d0);

Or you can use C<chr()>, though it is less convenient in the general case:

    $hebrew_alef_from_code_point = chr(utf8::unicode_to_native(0x05d0));
    utf8::upgrade($hebrew_alef_from_code_point);

The C<utf8::unicode_to_native()> and C<utf8::upgrade()> aren't needed if the argument is above 0xFF, so the above could have been written as

    $hebrew_alef_from_code_point = chr(0x05d0);

since 0x5d0 is above 255.
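As a cross-check, the different ways of creating the same character shown above can be compared against each other. The sketch below assumes Perl v5.14 or later (for C<charnames::string_vianame()>) on an ASCII platform; the variable names are illustrative.

```perl
use strict;
use warnings;
use charnames ();

# Four routes to U+05D0 HEBREW LETTER ALEF.
my $from_name   = charnames::string_vianame("HEBREW LETTER ALEF");
my $from_escape = "\N{U+05D0}";
my $from_pack   = pack("U", 0x05d0);
my $from_chr    = chr(0x05d0);

my $all_equal = ($from_name   eq $from_escape
              && $from_escape eq $from_pack
              && $from_pack   eq $from_chr) ? 1 : 0;

# ord() turns the character back into its code point.
printf "all equal: %s, ord: U+%04X\n",
    ($all_equal ? "yes" : "no"), ord $from_chr;
```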
C<\x{}> and C<\o{}> can also be used to specify code points at compile time in double-quotish strings, but, for backward compatibility with older Perls, the same rules apply as with C<chr()> for code points less than 256. C<utf8::unicode_to_native()> is used so that the Perl code is portable to EBCDIC platforms. You can omit it if you're I<really> sure no one will ever want to use your code on a non-ASCII platform. Starting in Perl v5.22, calls to it on ASCII platforms are optimized out, so there's no performance penalty at all in adding it. Or you can simply use the other constructs that don't require it. See L</"Further Resources"> for how to find all these names and numeric codes. =head3 Earlier releases caveats On EBCDIC platforms, prior to v5.22, using C<\N{U+...}> doesn't work properly. Prior to v5.16, using C<\N{...}> with a character name (as opposed to a C<U+...> code point) required a S<C<use charnames :full>>. Prior to v5.14, there were some bugs in C<\N{...}> with a character name (as opposed to a C<U+...> code point). C<charnames::string_vianame()> was introduced in v5.14. Prior to that, C<charnames::vianame()> should work, but only if the argument is of the form C<"U+...">. Your best bet there for runtime Unicode by character name is probably: use charnames (); my $hebrew_alef_from_name = pack("U", charnames::vianame("HEBREW LETTER ALEF")); =head2 Handling Unicode Handling Unicode is for the most part transparent: just use the strings as usual. Functions like C<index()>, C<length()>, and C<substr()> will work on the Unicode characters; regular expressions will work on the Unicode characters (see L<perlunicode> and L<perlretut>). Note that Perl considers grapheme clusters to be separate characters, so for example print length("\N{LATIN CAPITAL LETTER A}\N{COMBINING ACUTE ACCENT}"), "\n"; will print 2, not 1. The only exception is that regular expressions have C<\X> for matching an extended grapheme cluster. 
(Thus C<\X> in a regular expression would match the entire sequence of both the example characters.) Life is not quite so transparent, however, when working with legacy encodings, I/O, and certain special cases: =head2 Legacy Encodings When you combine legacy data and Unicode, the legacy data needs to be upgraded to Unicode. Normally the legacy data is assumed to be ISO 8859-1 (or EBCDIC, if applicable). The C<Encode> module knows about many encodings and has interfaces for doing conversions between those encodings: use Encode 'decode'; $data = decode("iso-8859-3", $data); # convert from legacy =head2 Unicode I/O Normally, writing out Unicode data print FH $some_string_with_unicode, "\n"; produces raw bytes that Perl happens to use to internally encode the Unicode string. Perl's internal encoding depends on the system as well as what characters happen to be in the string at the time. If any of the characters are at code points C<0x100> or above, you will get a warning. To ensure that the output is explicitly rendered in the encoding you desire--and to avoid the warning--open the stream with the desired encoding. Some examples: open FH, ">:utf8", "file"; open FH, ">:encoding(ucs2)", "file"; open FH, ">:encoding(UTF-8)", "file"; open FH, ">:encoding(shift_jis)", "file"; and on already open streams, use C<binmode()>: binmode(STDOUT, ":utf8"); binmode(STDOUT, ":encoding(ucs2)"); binmode(STDOUT, ":encoding(UTF-8)"); binmode(STDOUT, ":encoding(shift_jis)"); The matching of encoding names is loose: case does not matter, and many encodings have several aliases. Note that the C<:utf8> layer must always be specified exactly like that; it is I<not> subject to the loose matching of encoding names. Also note that currently C<:utf8> is unsafe for input, because it accepts the data without validating that it is indeed valid UTF-8; you should instead use C<:encoding(UTF-8)> (with or without a hyphen). 
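The effect of an output layer can be observed without touching the file system by writing to an in-memory file handle. This sketch is an illustration only: it prints one wide character through C<:encoding(UTF-8)> and then inspects the bytes that arrived in the buffer.

```perl
use strict;
use warnings;

# Open an in-memory "file" with an explicit UTF-8 encoding layer.
my $buffer = '';
open(my $out, '>:encoding(UTF-8)', \$buffer) or die "open: $!";
print {$out} chr(0x100);
close($out) or die "close: $!";

# The buffer now holds the encoded bytes, not the character.
my $hex = join ' ', map { sprintf '%02x', ord } split //, $buffer;
print "$hex\n";
```

Because the layer was given explicitly, no "Wide character" warning is issued and the output is well-defined UTF-8.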
See L<PerlIO> for the C<:utf8> layer, L<PerlIO::encoding> and L<Encode::PerlIO> for the C<:encoding()> layer, and L<Encode::Supported> for many encodings supported by the C<Encode> module. Reading in a file that you know happens to be encoded in one of the Unicode or legacy encodings does not magically turn the data into Unicode in Perl's eyes. To do that, specify the appropriate layer when opening files open(my $fh,'<:encoding(UTF-8)', 'anything'); my $line_of_unicode = <$fh>; open(my $fh,'<:encoding(Big5)', 'anything'); my $line_of_unicode = <$fh>; The I/O layers can also be specified more flexibly with the C<open> pragma. See L<open>, or look at the following example. use open ':encoding(UTF-8)'; # input/output default encoding will be # UTF-8 open X, ">file"; print X chr(0x100), "\n"; close X; open Y, "<file"; printf "%#x\n", ord(<Y>); # this should print 0x100 close Y; With the C<open> pragma you can use the C<:locale> layer BEGIN { $ENV{LC_ALL} = $ENV{LANG} = 'ru_RU.KOI8-R' } # the :locale will probe the locale environment variables like # LC_ALL use open OUT => ':locale'; # russki parusski open(O, ">koi8"); print O chr(0x430); # Unicode CYRILLIC SMALL LETTER A = KOI8-R 0xc1 close O; open(I, "<koi8"); printf "%#x\n", ord(<I>), "\n"; # this should print 0xc1 close I; These methods install a transparent filter on the I/O stream that converts data from the specified encoding when it is read in from the stream. The result is always Unicode. The L<open> pragma affects all the C<open()> calls after the pragma by setting default layers. If you want to affect only certain streams, use explicit layers directly in the C<open()> call. You can switch encodings on an already opened stream by using C<binmode()>; see L<perlfunc/binmode>. The C<:locale> does not currently work with C<open()> and C<binmode()>, only with the C<open> pragma. The C<:utf8> and C<:encoding(...)> methods do work with all of C<open()>, C<binmode()>, and the C<open> pragma. 
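The two ways of attaching an input layer, directly in C<open()> or later with C<binmode()>, can be compared side by side. The sketch below reads from an in-memory buffer holding the UTF-8 bytes of U+0430 (the Cyrillic letter used in the example above); the buffer and variable names are assumptions made for the illustration.

```perl
use strict;
use warnings;

# UTF-8 bytes for U+0430 CYRILLIC SMALL LETTER A, followed by a newline.
my $bytes = "\xD0\xB0\n";

# Layer given directly to open().
open(my $in, '<:encoding(UTF-8)', \$bytes) or die "open: $!";
my $decoded = <$in>;
close($in);
chomp $decoded;

# Layer added afterwards with binmode() on an already open handle.
open(my $raw, '<', \$bytes) or die "open: $!";
binmode($raw, ':encoding(UTF-8)') or die "binmode: $!";
my $decoded_later = <$raw>;
close($raw);
chomp $decoded_later;

printf "%d character(s), U+%04X\n", length $decoded, ord $decoded;
```

In both cases the two input bytes become a single character once the layer has decoded them.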
Similarly, you may use these I/O layers on output streams to automatically convert Unicode to the specified encoding when it is written to the stream. For example, the following snippet copies the contents of the file "text.jis" (encoded as ISO-2022-JP, aka JIS) to the file "text.utf8", encoded as UTF-8: open(my $nihongo, '<:encoding(iso-2022-jp)', 'text.jis'); open(my $unicode, '>:utf8', 'text.utf8'); while (<$nihongo>) { print $unicode $_ } The naming of encodings, both by the C<open()> and by the C<open> pragma allows for flexible names: C<koi8-r> and C<KOI8R> will both be understood. Common encodings recognized by ISO, MIME, IANA, and various other standardisation organisations are recognised; for a more detailed list see L<Encode::Supported>. C<read()> reads characters and returns the number of characters. C<seek()> and C<tell()> operate on byte counts, as does C<sysseek()>. C<sysread()> and C<syswrite()> should not be used on file handles with character encoding layers, they behave badly, and that behaviour has been deprecated since perl 5.24. Notice that because of the default behaviour of not doing any conversion upon input if there is no default layer, it is easy to mistakenly write code that keeps on expanding a file by repeatedly encoding the data: # BAD CODE WARNING open F, "file"; local $/; ## read in the whole file of 8-bit characters $t = <F>; close F; open F, ">:encoding(UTF-8)", "file"; print F $t; ## convert to UTF-8 on output close F; If you run this code twice, the contents of the F<file> will be twice UTF-8 encoded. A C<use open ':encoding(UTF-8)'> would have avoided the bug, or explicitly opening also the F<file> for input as UTF-8. B<NOTE>: the C<:utf8> and C<:encoding> features work only if your Perl has been built with L<PerlIO>, which is the default on most systems. =head2 Displaying Unicode As Text Sometimes you might want to display Perl scalars containing Unicode as simple ASCII (or EBCDIC) text. 
The following subroutine converts its argument so that Unicode characters with code points greater than 255 are displayed as C<\x{...}>, control characters (like C<\n>) are displayed as C<\x..>, and the rest of the characters as themselves: sub nice_string { join("", map { $_ > 255 # if wide character... ? sprintf("\\x{%04X}", $_) # \x{...} : chr($_) =~ /[[:cntrl:]]/ # else if control character... ? sprintf("\\x%02X", $_) # \x.. : quotemeta(chr($_)) # else quoted or as themselves } unpack("W*", $_[0])); # unpack Unicode characters } For example, nice_string("foo\x{100}bar\n") returns the string 'foo\x{0100}bar\x0A' which is ready to be printed. (C<\\x{}> is used here instead of C<\\N{}>, since it's most likely that you want to see what the native values are.) =head2 Special Cases =over 4 =item * Starting in Perl 5.28, it is illegal for bit operators, like C<~>, to operate on strings containing code points above 255. =item * The vec() function may produce surprising results if used on strings containing characters with ordinal values above 255. In such a case, the results are consistent with the internal encoding of the characters, but not with much else. So don't do that, and starting in Perl 5.28, a deprecation message is issued if you do so, becoming illegal in Perl 5.32. =item * Peeking At Perl's Internal Encoding Normal users of Perl should never care how Perl encodes any particular Unicode string (because the normal ways to get at the contents of a string with Unicode--via input and output--should always be via explicitly-defined I/O layers). But if you must, there are two ways of looking behind the scenes. 
One way of peeking inside the internal encoding of Unicode characters
is to use C<unpack("C*", ...)> to get the bytes of whatever the string
encoding happens to be, or C<unpack("U0..", ...)> to get the bytes of the
UTF-8 encoding:

    # this prints  c4 80  for the UTF-8 bytes 0xc4 0x80
    print join(" ", unpack("U0(H2)*", pack("U", 0x100))), "\n";

Yet another way would be to use the Devel::Peek module:

    perl -MDevel::Peek -e 'Dump(chr(0x100))'

That shows the C<UTF8> flag in FLAGS and both the UTF-8 bytes
and Unicode characters in C<PV>.  See also later in this document
the discussion about the C<utf8::is_utf8()> function.

=back

=head2 Advanced Topics

=over 4

=item *

String Equivalence

The question of string equivalence turns somewhat complicated
in Unicode: what do you mean by "equal"?

(Is C<LATIN CAPITAL LETTER A WITH ACUTE> equal to
C<LATIN CAPITAL LETTER A>?)

The short answer is that by default Perl compares equivalence (C<eq>,
C<ne>) based only on code points of the characters.  In the above
case, the answer is no (because 0x00C1 != 0x0041).  But sometimes, any
CAPITAL LETTER A's should be considered equal, or even A's of any case.

The long answer is that you need to consider character normalization
and casing issues: see L<Unicode::Normalize>, Unicode Technical Report #15,
L<Unicode Normalization Forms|https://www.unicode.org/unicode/reports/tr15> and
sections on case mapping in the L<Unicode Standard|https://www.unicode.org>.

As of Perl 5.8.0, the "Full" case-folding of I<Case
Mappings/SpecialCasing> is implemented.  Bugs remained in C<qr//i> when
it was used with these mappings; they were mostly fixed by 5.14, and
essentially entirely by 5.18.

=item *

String Collation

People like to see their strings nicely sorted--or as Unicode
parlance goes, collated.  But again, what do you mean by collate?

(Does C<LATIN CAPITAL LETTER A WITH ACUTE> come before or after
C<LATIN CAPITAL LETTER A WITH GRAVE>?)
The short answer is that by default, Perl compares strings (C<lt>, C<le>, C<cmp>, C<ge>, C<gt>) based only on the code points of the characters. In the above case, the answer is "after", since C<0x00C1> > C<0x00C0>. The long answer is that "it depends", and a good answer cannot be given without knowing (at the very least) the language context. See L<Unicode::Collate>, and I<Unicode Collation Algorithm> L<https://www.unicode.org/unicode/reports/tr10/> =back =head2 Miscellaneous =over 4 =item * Character Ranges and Classes Character ranges in regular expression bracketed character classes ( e.g., C</[a-z]/>) and in the C<tr///> (also known as C<y///>) operator are not magically Unicode-aware. What this means is that C<[A-Za-z]> will not magically start to mean "all alphabetic letters" (not that it does mean that even for 8-bit characters; for those, if you are using locales (L<perllocale>), use C</[[:alpha:]]/>; and if not, use the 8-bit-aware property C<\p{alpha}>). All the properties that begin with C<\p> (and its inverse C<\P>) are actually character classes that are Unicode-aware. There are dozens of them, see L<perluniprops>. Starting in v5.22, you can use Unicode code points as the end points of regular expression pattern character ranges, and the range will include all Unicode code points that lie between those end points, inclusive. qr/ [ \N{U+03} - \N{U+20} ] /xx includes the code points C<\N{U+03}>, C<\N{U+04}>, ..., C<\N{U+20}>. This also works for ranges in C<tr///> starting in Perl v5.24. =item * String-To-Number Conversions Unicode does define several other decimal--and numeric--characters besides the familiar 0 to 9, such as the Arabic and Indic digits. Perl does not support string-to-number conversion for digits other than ASCII C<0> to C<9> (and ASCII C<a> to C<f> for hexadecimal). To get safe conversions from any Unicode string, use L<Unicode::UCD/num()>. =back =head2 Questions With Answers =over 4 =item * Will My Old Scripts Break? 
Very probably not.  Unless you are generating Unicode characters
somehow, old behaviour should be preserved.  About the only behaviour
that has changed and which could start generating Unicode is the old
behaviour of C<chr()> where supplying an argument more than 255
produced a character modulo 255.  C<chr(300)>, for example, was equal
to C<chr(45)> or "-" (in ASCII); now it is LATIN CAPITAL LETTER I WITH
BREVE.

=item *

How Do I Make My Scripts Work With Unicode?

Very little work should be needed since nothing changes until you
generate Unicode data.  The most important thing is getting input as
Unicode; for that, see the earlier I/O discussion.
To get full seamless Unicode support, add
C<use feature 'unicode_strings'> (or C<use 5.012> or higher) to your
script.

=item *

How Do I Know Whether My String Is In Unicode?

You shouldn't have to care.  But you may if your Perl is before 5.14.0
or you haven't specified C<use feature 'unicode_strings'> or C<use
5.012> (or higher) because otherwise the rules for the code points
in the range 128 to 255 are different depending on
whether the string they are contained within is in Unicode or not.
(See L<perlunicode/When Unicode Does Not Happen>.)

To determine if a string is in Unicode, use:

    print utf8::is_utf8($string) ? 1 : 0, "\n";

But note that this doesn't mean that any of the characters in the
string are necessarily UTF-8 encoded, or that any of the characters have
code points greater than 0xFF (255) or even 0x80 (128), or that the
string has any characters at all.  All that C<is_utf8()> does is
return the value of the internal "utf8ness" flag attached to the
C<$string>.  If the flag is off, the bytes in the scalar are interpreted
as a single byte encoding.  If the flag is on, the bytes in the scalar
are interpreted as the (variable-length, potentially multi-byte) UTF-8 encoded
code points of the characters.  Bytes added to a UTF-8 encoded string are
automatically upgraded to UTF-8.
If mixed non-UTF-8 and UTF-8 scalars are merged (double-quoted interpolation, explicit concatenation, or printf/sprintf parameter substitution), the result will be UTF-8 encoded as if copies of the byte strings were upgraded to UTF-8: for example, $a = "ab\x80c"; $b = "\x{100}"; print "$a = $b\n"; the output string will be UTF-8-encoded C<ab\x80c = \x{100}\n>, but C<$a> will stay byte-encoded. Sometimes you might really need to know the byte length of a string instead of the character length. For that use the C<bytes> pragma and the C<length()> function: my $unicode = chr(0x100); print length($unicode), "\n"; # will print 1 use bytes; print length($unicode), "\n"; # will print 2 # (the 0xC4 0x80 of the UTF-8) no bytes; =item * How Do I Find Out What Encoding a File Has? You might try L<Encode::Guess>, but it has a number of limitations. =item * How Do I Detect Data That's Not Valid In a Particular Encoding? Use the C<Encode> package to try converting it. For example, use Encode 'decode'; if (eval { decode('UTF-8', $string, Encode::FB_CROAK); 1 }) { # $string is valid UTF-8 } else { # $string is not valid UTF-8 } Or use C<unpack> to try decoding it: use warnings; @chars = unpack("C0U*", $string_of_bytes_that_I_think_is_utf8); If invalid, a C<Malformed UTF-8 character> warning is produced. The "C0" means "process the string character per character". Without that, the C<unpack("U*", ...)> would work in C<U0> mode (the default if the format string starts with C<U>) and it would return the bytes making up the UTF-8 encoding of the target string, something that will always work. =item * How Do I Convert Binary Data Into a Particular Encoding, Or Vice Versa? This probably isn't as useful as you might think. Normally, you shouldn't need to. 
In one sense, what you are asking doesn't make much sense: encodings are for characters, and binary data are not "characters", so converting "data" into some encoding isn't meaningful unless you know in what character set and encoding the binary data is in, in which case it's not just binary data, now is it? If you have a raw sequence of bytes that you know should be interpreted via a particular encoding, you can use C<Encode>: use Encode 'from_to'; from_to($data, "iso-8859-1", "UTF-8"); # from latin-1 to UTF-8 The call to C<from_to()> changes the bytes in C<$data>, but nothing material about the nature of the string has changed as far as Perl is concerned. Both before and after the call, the string C<$data> contains just a bunch of 8-bit bytes. As far as Perl is concerned, the encoding of the string remains as "system-native 8-bit bytes". You might relate this to a fictional 'Translate' module: use Translate; my $phrase = "Yes"; Translate::from_to($phrase, 'english', 'deutsch'); ## phrase now contains "Ja" The contents of the string changes, but not the nature of the string. Perl doesn't know any more after the call than before that the contents of the string indicates the affirmative. Back to converting data. If you have (or want) data in your system's native 8-bit encoding (e.g. Latin-1, EBCDIC, etc.), you can use pack/unpack to convert to/from Unicode. $native_string = pack("W*", unpack("U*", $Unicode_string)); $Unicode_string = pack("U*", unpack("W*", $native_string)); If you have a sequence of bytes you B<know> is valid UTF-8, but Perl doesn't know it yet, you can make Perl a believer, too: $Unicode = $bytes; utf8::decode($Unicode); or: $Unicode = pack("U0a*", $bytes); You can find the bytes that make up a UTF-8 sequence with @bytes = unpack("C*", $Unicode_string) and you can create well-formed Unicode with $Unicode_string = pack("U*", 0xff, ...) =item * How Do I Display Unicode? How Do I Input Unicode? 
See L<http://www.alanwood.net/unicode/> and L<http://www.cl.cam.ac.uk/~mgk25/unicode.html> =item * How Does Unicode Work With Traditional Locales? If your locale is a UTF-8 locale, starting in Perl v5.26, Perl works well for all categories; before this, starting with Perl v5.20, it works for all categories but C<LC_COLLATE>, which deals with sorting and the C<cmp> operator. But note that the standard C<L<Unicode::Collate>> and C<L<Unicode::Collate::Locale>> modules offer much more powerful solutions to collation issues, and work on earlier releases. For other locales, starting in Perl 5.16, you can specify use locale ':not_characters'; to get Perl to work well with them. The catch is that you have to translate from the locale character set to/from Unicode yourself. See L</Unicode IE<sol>O> above for how to use open ':locale'; to accomplish this, but full details are in L<perllocale/Unicode and UTF-8>, including gotchas that happen if you don't specify C<:not_characters>. =back =head2 Hexadecimal Notation The Unicode standard prefers using hexadecimal notation because that more clearly shows the division of Unicode into blocks of 256 characters. Hexadecimal is also simply shorter than decimal. You can use decimal notation, too, but learning to use hexadecimal just makes life easier with the Unicode standard. The C<U+HHHH> notation uses hexadecimal, for example. The C<0x> prefix means a hexadecimal number, the digits are 0-9 I<and> a-f (or A-F, case doesn't matter). Each hexadecimal digit represents four bits, or half a byte. C<print 0x..., "\n"> will show a hexadecimal number in decimal, and C<printf "%x\n", $decimal> will show a decimal number in hexadecimal. If you have just the "hex digits" of a hexadecimal number, you can use the C<hex()> function. 
print 0x0009, "\n"; # 9 print 0x000a, "\n"; # 10 print 0x000f, "\n"; # 15 print 0x0010, "\n"; # 16 print 0x0011, "\n"; # 17 print 0x0100, "\n"; # 256 print 0x0041, "\n"; # 65 printf "%x\n", 65; # 41 printf "%#x\n", 65; # 0x41 print hex("41"), "\n"; # 65 =head2 Further Resources =over 4 =item * Unicode Consortium L<https://www.unicode.org/> =item * Unicode FAQ L<https://www.unicode.org/unicode/faq/> =item * Unicode Glossary L<https://www.unicode.org/glossary/> =item * Unicode Recommended Reading List The Unicode Consortium has a list of articles and books, some of which give a much more in depth treatment of Unicode: L<http://unicode.org/resources/readinglist.html> =item * Unicode Useful Resources L<https://www.unicode.org/unicode/onlinedat/resources.html> =item * Unicode and Multilingual Support in HTML, Fonts, Web Browsers and Other Applications L<http://www.alanwood.net/unicode/> =item * UTF-8 and Unicode FAQ for Unix/Linux L<http://www.cl.cam.ac.uk/~mgk25/unicode.html> =item * Legacy Character Sets L<http://www.czyborra.com/> L<http://www.eki.ee/letter/> =item * You can explore various information from the Unicode data files using the C<Unicode::UCD> module. =back =head1 UNICODE IN OLDER PERLS If you cannot upgrade your Perl to 5.8.0 or later, you can still do some Unicode processing by using the modules C<Unicode::String>, C<Unicode::Map8>, and C<Unicode::Map>, available from CPAN. If you have the GNU recode installed, you can also use the Perl front-end C<Convert::Recode> for character conversions. The following are fast conversions from ISO 8859-1 (Latin-1) bytes to UTF-8 bytes and back, the code works even with older Perl 5 versions. 
    # ISO 8859-1 to UTF-8
    s/([\x80-\xFF])/chr(0xC0|ord($1)>>6).chr(0x80|ord($1)&0x3F)/eg;

    # UTF-8 to ISO 8859-1
    s/([\xC2\xC3])([\x80-\xBF])/chr(ord($1)<<6&0xC0|ord($2)&0x3F)/eg;

=head1 SEE ALSO

L<perlunitut>, L<perlunicode>, L<Encode>, L<open>, L<utf8>, L<bytes>,
L<perlretut>, L<perlrun>, L<Unicode::Collate>, L<Unicode::Normalize>,
L<Unicode::UCD>

=head1 ACKNOWLEDGMENTS

Thanks to the kind readers of the perl5-porters@perl.org,
perl-unicode@perl.org, linux-utf8@nl.linux.org, and unicore@unicode.org
mailing lists for their valuable feedback.

=head1 AUTHOR, COPYRIGHT, AND LICENSE

Copyright 2001-2011 Jarkko Hietaniemi E<lt>jhi@iki.fiE<gt>.
Now maintained by Perl 5 Porters.

This document may be distributed under the same terms as Perl itself.

=head1 NAME

perlrun - how to execute the Perl interpreter

=head1 SYNOPSIS

B<perl> S<[ B<-sTtuUWX> ]>
    S<[ B<-hv> ] [ B<-V>[:I<configvar>] ]>
    S<[ B<-cw> ] [ B<-d>[B<t>][:I<debugger>] ] [ B<-D>[I<number/list>] ]>
    S<[ B<-pna> ] [ B<-F>I<pattern> ] [ B<-l>[I<octal>] ] [ B<-0>[I<octal/hexadecimal>] ]>
    S<[ B<-I>I<dir> ] [ B<-m>[B<->]I<module> ] [ B<-M>[B<->]I<'module...'> ] [ B<-f> ]>
    S<[ B<-C [I<number/list>] >]>
    S<[ B<-S> ]>
    S<[ B<-x>[I<dir>] ]>
    S<[ B<-i>[I<extension>] ]>
    S<[ [B<-e>|B<-E>] I<'command'> ] [ B<--> ] [ I<programfile> ] [ I<argument> ]...>

=head1 DESCRIPTION

The normal way to run a Perl program is by making it directly
executable, or else by passing the name of the source file as an
argument on the command line.  (An interactive Perl environment
is also possible--see L<perldebug> for details on how to do that.)
Upon startup, Perl looks for your program in one of the following
places:

=over 4

=item 1.

Specified line by line via L<-e|/-e commandline> or L<-E|/-E commandline>
switches on the command line.

=item 2.

Contained in the file specified by the first filename on the command line.
(Note that systems supporting the C<#!> notation invoke interpreters this
way. See L</Location of Perl>.)
=item 3. Passed in implicitly via standard input. This works only if there are no filename arguments--to pass arguments to a STDIN-read program you must explicitly specify a "-" for the program name. =back With methods 2 and 3, Perl starts parsing the input file from the beginning, unless you've specified a L</-x> switch, in which case it scans for the first line starting with C<#!> and containing the word "perl", and starts there instead. This is useful for running a program embedded in a larger message. (In this case you would indicate the end of the program using the C<__END__> token.) The C<#!> line is always examined for switches as the line is being parsed. Thus, if you're on a machine that allows only one argument with the C<#!> line, or worse, doesn't even recognize the C<#!> line, you still can get consistent switch behaviour regardless of how Perl was invoked, even if L</-x> was used to find the beginning of the program. Because historically some operating systems silently chopped off kernel interpretation of the C<#!> line after 32 characters, some switches may be passed in on the command line, and some may not; you could even get a "-" without its letter, if you're not careful. You probably want to make sure that all your switches fall either before or after that 32-character boundary. Most switches don't actually care if they're processed redundantly, but getting a "-" instead of a complete switch could cause Perl to try to execute standard input instead of your program. And a partial L<-I|/-Idirectory> switch could also cause odd results. Some switches do care if they are processed twice, for instance combinations of L<-l|/-l[octnum]> and L<-0|/-0[octalE<sol>hexadecimal]>. Either put all the switches after the 32-character boundary (if applicable), or replace the use of B<-0>I<digits> by C<BEGIN{ $/ = "\0digits"; }>. Parsing of the C<#!> switches starts wherever "perl" is mentioned in the line. 
The sequences "-*" and "- " are specifically ignored so that you could, if you were so inclined, say #!/bin/sh #! -*- perl -*- -p eval 'exec perl -x -wS $0 ${1+"$@"}' if 0; to let Perl see the L</-p> switch. A similar trick involves the I<env> program, if you have it. #!/usr/bin/env perl The examples above use a relative path to the perl interpreter, getting whatever version is first in the user's path. If you want a specific version of Perl, say, perl5.14.1, you should place that directly in the C<#!> line's path. If the C<#!> line does not contain the word "perl" nor the word "indir", the program named after the C<#!> is executed instead of the Perl interpreter. This is slightly bizarre, but it helps people on machines that don't do C<#!>, because they can tell a program that their SHELL is F</usr/bin/perl>, and Perl will then dispatch the program to the correct interpreter for them. After locating your program, Perl compiles the entire program to an internal form. If there are any compilation errors, execution of the program is not attempted. (This is unlike the typical shell script, which might run part-way through before finding a syntax error.) If the program is syntactically correct, it is executed. If the program runs off the end without hitting an exit() or die() operator, an implicit C<exit(0)> is provided to indicate successful completion. =head2 #! and quoting on non-Unix systems X<hashbang> X<#!> Unix's C<#!> technique can be simulated on other systems: =over 4 =item OS/2 Put extproc perl -S -your_switches as the first line in C<*.cmd> file (L</-S> due to a bug in cmd.exe's `extproc' handling). =item MS-DOS Create a batch file to run your program, and codify it in C<ALTERNATE_SHEBANG> (see the F<dosish.h> file in the source distribution for more information). =item Win95/NT The Win95/NT installation, when using the ActiveState installer for Perl, will modify the Registry to associate the F<.pl> extension with the perl interpreter. 
If you install Perl by other means (including building from the sources), you may have to modify the Registry yourself. Note that this means you can no longer tell the difference between an executable Perl program and a Perl library file. =item VMS Put $ perl -mysw 'f$env("procedure")' 'p1' 'p2' 'p3' 'p4' 'p5' 'p6' 'p7' 'p8' ! $ exit++ + ++$status != 0 and $exit = $status = undef; at the top of your program, where B<-mysw> are any command line switches you want to pass to Perl. You can now invoke the program directly, by saying C<perl program>, or as a DCL procedure, by saying C<@program> (or implicitly via F<DCL$PATH> by just using the name of the program). This incantation is a bit much to remember, but Perl will display it for you if you say C<perl "-V:startperl">. =back Command-interpreters on non-Unix systems have rather different ideas on quoting than Unix shells. You'll need to learn the special characters in your command-interpreter (C<*>, C<\> and C<"> are common) and how to protect whitespace and these characters to run one-liners (see L<-e|/-e commandline> below). On some systems, you may have to change single-quotes to double ones, which you must I<not> do on Unix or Plan 9 systems. You might also have to change a single % to a %%. For example: # Unix perl -e 'print "Hello world\n"' # MS-DOS, etc. perl -e "print \"Hello world\n\"" # VMS perl -e "print ""Hello world\n""" The problem is that none of this is reliable: it depends on the command and it is entirely possible neither works. If I<4DOS> were the command shell, this would probably work better: perl -e "print <Ctrl-x>"Hello world\n<Ctrl-x>"" B<CMD.EXE> in Windows NT slipped a lot of standard Unix functionality in when nobody was looking, but just try to find documentation for its quoting rules. There is no general solution to all of this. It's just a mess. =head2 Location of Perl X<perl, location of interpreter> It may seem obvious to say, but Perl is useful only when users can easily find it. 
When possible, it's good for both F</usr/bin/perl> and
F</usr/local/bin/perl> to be symlinks to the actual binary.  If
that can't be done, system administrators are strongly encouraged
to put (symlinks to) perl and its accompanying utilities into a
directory typically found along a user's PATH, or in some other
obvious and convenient place.

In this documentation, C<#!/usr/bin/perl> on the first line of the program
will stand in for whatever method works on your system.  You are
advised to use a specific path if you care about a specific version.

    #!/usr/local/bin/perl5.14

or, if you just want to be running at least a given version, place a
statement like this at the top of your program:

    use 5.014;

=head2 Command Switches
X<perl, command switches> X<command switches>

As with all standard commands, a single-character switch may be
clustered with the following switch, if any.

    #!/usr/bin/perl -spi.orig   # same as -s -p -i.orig

A C<--> signals the end of options and disables further option processing. Any
arguments after the C<--> are treated as filenames and arguments.

Switches include:

=over 5

=item B<-0>[I<octal/hexadecimal>]
X<-0> X<$/>

specifies the input record separator (C<$/>) as an octal or
hexadecimal number.  If there are no digits, the null character is the
separator.  Other switches may precede or follow the digits.  For
example, if you have a version of I<find> which can print filenames
terminated by the null character, you can say this:

    find . -name '*.orig' -print0 | perl -n0e unlink

The special value 00 will cause Perl to slurp files in paragraph mode.
Any value 0400 or above will cause Perl to slurp files whole, but by convention
the value 0777 is the one normally used for this purpose.

You can also specify the separator character using hexadecimal notation:
B<-0xI<HHH...>>, where the C<I<H>> are valid hexadecimal digits.  Unlike
the octal form, this one may be used to specify any Unicode character, even
those beyond 0xFF.
So if you I<really> want a record separator of 0777, specify it as
B<-0x1FF>.  (This means that you cannot use the L</-x> option with a
directory name that consists of hexadecimal digits, or else Perl will
think you have specified a hex number to B<-0>.)

=item B<-a>
X<-a> X<autosplit>

turns on autosplit mode when used with a L</-n> or L</-p>.  An implicit
split command to the @F array is done as the first thing inside the
implicit while loop produced by the L</-n> or L</-p>.

    perl -ane 'print pop(@F), "\n";'

is equivalent to

    while (<>) {
        @F = split(' ');
        print pop(@F), "\n";
    }

An alternate delimiter may be specified using L<-F|/-Fpattern>.

B<-a> implicitly sets L</-n>.

=item B<-C [I<number/list>]>
X<-C>

The B<-C> flag controls some of the Perl Unicode features.

As of 5.8.1, the B<-C> can be followed either by a number or a list
of option letters.  The letters, their numeric values, and effects
are as follows; listing the letters is equal to summing the numbers.

    I     1   STDIN is assumed to be in UTF-8
    O     2   STDOUT will be in UTF-8
    E     4   STDERR will be in UTF-8
    S     7   I + O + E
    i     8   UTF-8 is the default PerlIO layer for input streams
    o    16   UTF-8 is the default PerlIO layer for output streams
    D    24   i + o
    A    32   the @ARGV elements are expected to be strings encoded
              in UTF-8
    L    64   normally the "IOEioA" are unconditional, the L makes
              them conditional on the locale environment variables
              (the LC_ALL, LC_CTYPE, and LANG, in the order of
              decreasing precedence) -- if the variables indicate
              UTF-8, then the selected "IOEioA" are in effect
    a   256   Set ${^UTF8CACHE} to -1, to run the UTF-8 caching
              code in debugging mode.

=for documenting_the_underdocumented
perl.h gives W/128 as PERL_UNICODE_WIDESYSCALLS "/* for Sarathy */"

=for todo
perltodo mentions Unicode in %ENV and filenames. I guess that these will be
options e and f (or F).

For example, B<-COE> and B<-C6> will both turn on UTF-8-ness on both
STDOUT and STDERR.  Repeating letters is just redundant, not cumulative
nor toggling.
The C<io> options mean that any subsequent open() (or similar I/O operations) in main program scope will have the C<:utf8> PerlIO layer implicitly applied to them, in other words, UTF-8 is expected from any input stream, and UTF-8 is produced to any output stream. This is just the default set via L<C<${^OPEN}>|perlvar/${^OPEN}>, with explicit layers in open() and with binmode() one can manipulate streams as usual. This has no effect on code run in modules. B<-C> on its own (not followed by any number or option list), or the empty string C<""> for the L</PERL_UNICODE> environment variable, has the same effect as B<-CSDL>. In other words, the standard I/O handles and the default C<open()> layer are UTF-8-fied I<but> only if the locale environment variables indicate a UTF-8 locale. This behaviour follows the I<implicit> (and problematic) UTF-8 behaviour of Perl 5.8.0. (See L<perl581delta/UTF-8 no longer default under UTF-8 locales>.) You can use B<-C0> (or C<"0"> for C<PERL_UNICODE>) to explicitly disable all the above Unicode features. The read-only magic variable C<${^UNICODE}> reflects the numeric value of this setting. This variable is set during Perl startup and is thereafter read-only. If you want runtime effects, use the three-arg open() (see L<perlfunc/open>), the two-arg binmode() (see L<perlfunc/binmode>), and the C<open> pragma (see L<open>). (In Perls earlier than 5.8.1 the B<-C> switch was a Win32-only switch that enabled the use of Unicode-aware "wide system call" Win32 APIs. This feature was practically unused, however, and the command line switch was therefore "recycled".) B<Note:> Since perl 5.10.1, if the B<-C> option is used on the C<#!> line, it must be specified on the command line as well, since the standard streams are already set up at this point in the execution of the perl interpreter. You can also use binmode() to set the encoding of an I/O stream. 
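The relationship between the B<-C> option letters and their numeric form can be sketched as follows. The helper name C<c_letters_to_number> is hypothetical and exists only for this illustration; the values are copied from the table in this section.

```perl
use strict;
use warnings;

# Numeric values of the -C option letters, copied from the table above.
my %C_VALUE = (
    I => 1,  O => 2,  E => 4,  S => 7,
    i => 8,  o => 16, D => 24,
    A => 32, L => 64, a => 256,
);

# Hypothetical helper: turn a letter list such as "SDL" into its number.
# Values are OR-ed rather than added so that repeated or overlapping
# letters (for example "SI") are not counted twice.
sub c_letters_to_number {
    my ($letters) = @_;
    my $number = 0;
    for my $letter (split //, $letters) {
        die "Unknown -C letter '$letter'\n"
            unless exists $C_VALUE{$letter};
        $number |= $C_VALUE{$letter};
    }
    return $number;
}

printf "-C%s is -C%d\n", $_, c_letters_to_number($_) for qw(OE SDL SD);
```

Using a bitwise OR rather than a plain sum matches the statement that repeating letters is redundant rather than cumulative.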
=item B<-c>
X<-c>

causes Perl to check the syntax of the program and then exit without
executing it.  Actually, it I<will> execute any C<BEGIN>, C<UNITCHECK>,
or C<CHECK> blocks and any C<use> statements: these are considered as
occurring outside the execution of your program.  C<INIT> and C<END>
blocks, however, will be skipped.

=item B<-d>
X<-d> X<-dt>

=item B<-dt>

runs the program under the Perl debugger.  See L<perldebug>.
If B<t> is specified, it indicates to the debugger that threads
will be used in the code being debugged.

=item B<-d:>I<MOD[=bar,baz]>
X<-d> X<-dt>

=item B<-dt:>I<MOD[=bar,baz]>

runs the program under the control of a debugging, profiling, or tracing
module installed as C<Devel::I<MOD>>. E.g., B<-d:DProf> executes the
program using the C<Devel::DProf> profiler.  As with the L<-M|/-M[-]module>
flag, options may be passed to the C<Devel::I<MOD>> package where they will
be received and interpreted by the C<Devel::I<MOD>::import> routine.  Again,
like B<-M>, use B<-d:-I<MOD>> to call C<Devel::I<MOD>::unimport> instead of
import.  The comma-separated list of options must follow a C<=> character.
If B<t> is specified, it indicates to the debugger that threads will be used
in the code being debugged.  See L<perldebug>.

=item B<-D>I<letters>
X<-D> X<DEBUGGING> X<-DDEBUGGING>

=item B<-D>I<number>

sets debugging flags. This switch is enabled only if your perl binary has
been built with debugging enabled: normal production perls won't have
been.

For example, to watch how perl executes your program, use B<-Dtls>.
Another nice value is B<-Dx>, which lists your compiled syntax tree, and
B<-Dr> displays compiled regular expressions; the format of the output is
explained in L<perldebguts>.
As an alternative, specify a number instead of a list of letters (e.g.,
B<-D14> is equivalent to B<-Dtls>):

            1  p  Tokenizing and parsing (with v, displays parse
                  stack)
            2  s  Stack snapshots (with v, displays all stacks)
            4  l  Context (loop) stack processing
            8  t  Trace execution
           16  o  Method and overloading resolution
           32  c  String/numeric conversions
           64  P  Print profiling info, source file input state
          128  m  Memory and SV allocation
          256  f  Format processing
          512  r  Regular expression parsing and execution
         1024  x  Syntax tree dump
         2048  u  Tainting checks
         4096  U  Unofficial, User hacking (reserved for private,
                  unreleased use)
        16384  X  Scratchpad allocation
        32768  D  Cleaning up
        65536  S  Op slab allocation
       131072  T  Tokenizing
       262144  R  Include reference counts of dumped variables
                  (eg when using -Ds)
       524288  J  show s,t,P-debug (don't Jump over) on opcodes within
                  package DB
      1048576  v  Verbose: use in conjunction with other flags to
                  increase the verbosity of the output.  Is a no-op on
                  many of the other flags
      2097152  C  Copy On Write
      4194304  A  Consistency checks on internal structures
      8388608  q  quiet - currently only suppresses the "EXECUTING"
                  message
     16777216  M  trace smart match resolution
     33554432  B  dump suBroutine definitions, including special
                  Blocks like BEGIN
     67108864  L  trace Locale-related info; what gets output is very
                  subject to change
    134217728  i  trace PerlIO layer processing.  Set PERLIO_DEBUG to
                  the filename to trace to.
    268435456  y  trace y///, tr/// compilation and execution

All these flags require B<-DDEBUGGING> when you compile the Perl
executable (but see C<:opd> in L<Devel::Peek> or L<re/'debug' mode>
which may change this).
See the F<INSTALL> file in the Perl source distribution
for how to do this.

If you're just trying to get a print out of each line of Perl code
as it executes, the way that C<sh -x> provides for shell scripts,
you can't use Perl's B<-D> switch.
Instead do this # If you have "env" utility env PERLDB_OPTS="NonStop=1 AutoTrace=1 frame=2" perl -dS program # Bourne shell syntax $ PERLDB_OPTS="NonStop=1 AutoTrace=1 frame=2" perl -dS program # csh syntax % (setenv PERLDB_OPTS "NonStop=1 AutoTrace=1 frame=2"; perl -dS program) See L<perldebug> for details and variations. =item B<-e> I<commandline> X<-e> may be used to enter one line of program. If B<-e> is given, Perl will not look for a filename in the argument list. Multiple B<-e> commands may be given to build up a multi-line script. Make sure to use semicolons where you would in a normal program. =item B<-E> I<commandline> X<-E> behaves just like L<-e|/-e commandline>, except that it implicitly enables all optional features (in the main compilation unit). See L<feature>. =item B<-f> X<-f> X<sitecustomize> X<sitecustomize.pl> Disable executing F<$Config{sitelib}/sitecustomize.pl> at startup. Perl can be built so that it by default will try to execute F<$Config{sitelib}/sitecustomize.pl> at startup (in a BEGIN block). This is a hook that allows the sysadmin to customize how Perl behaves. It can for instance be used to add entries to the @INC array to make Perl find modules in non-standard locations. Perl actually inserts the following code: BEGIN { do { local $!; -f "$Config{sitelib}/sitecustomize.pl"; } && do "$Config{sitelib}/sitecustomize.pl"; } Since it is an actual C<do> (not a C<require>), F<sitecustomize.pl> doesn't need to return a true value. The code is run in package C<main>, in its own lexical scope. However, if the script dies, C<$@> will not be set. The value of C<$Config{sitelib}> is also determined in C code and not read from C<Config.pm>, which is not loaded. The code is executed I<very> early. For example, any changes made to C<@INC> will show up in the output of `perl -V`. Of course, C<END> blocks will be likewise executed very late. 
To determine at runtime if this capability has been compiled in your
perl, you can check the value of C<$Config{usesitecustomize}>.

=item B<-F>I<pattern>
X<-F>

specifies the pattern to split on for L</-a>.  The pattern may be
surrounded by C<//>, C<"">, or C<''>, otherwise it will be put in single
quotes.  You can't use literal whitespace or NUL characters in the pattern.

B<-F> implicitly sets both L</-a> and L</-n>.

=item B<-h>
X<-h>

prints a summary of the options.

=item B<-i>[I<extension>]
X<-i> X<in-place>

specifies that files processed by the C<E<lt>E<gt>> construct are to be
edited in-place.  It does this by renaming the input file, opening the
output file by the original name, and selecting that output file as the
default for print() statements.  The extension, if supplied, is used to
modify the name of the old file to make a backup copy, following these
rules:

If no extension is supplied, and your system supports it, the original
I<file> is kept open without a name while the output is redirected to
a new file with the original I<filename>.  When perl exits, cleanly or not,
the original I<file> is unlinked.

If the extension doesn't contain a C<*>, then it is appended to the
end of the current filename as a suffix.  If the extension does
contain one or more C<*> characters, then each C<*> is replaced
with the current filename.
In Perl terms, you could think of this as:

    ($backup = $extension) =~ s/\*/$file_name/g;

This allows you to add a prefix to the backup file, instead of (or in
addition to) a suffix:

    $ perl -pi'orig_*' -e 's/bar/baz/' fileA  # backup to
                                              # 'orig_fileA'

Or even to place backup copies of the original files into another
directory (provided the directory already exists):

    $ perl -pi'old/*.orig' -e 's/bar/baz/' fileA  # backup to
                                                  # 'old/fileA.orig'

These sets of one-liners are equivalent:

    $ perl -pi -e 's/bar/baz/' fileA          # overwrite current file
    $ perl -pi'*' -e 's/bar/baz/' fileA       # overwrite current file

    $ perl -pi'.orig' -e 's/bar/baz/' fileA   # backup to 'fileA.orig'
    $ perl -pi'*.orig' -e 's/bar/baz/' fileA  # backup to 'fileA.orig'

From the shell, saying

    $ perl -p -i.orig -e "s/foo/bar/; ... "

is the same as using the program:

    #!/usr/bin/perl -pi.orig
    s/foo/bar/;

which is equivalent to

    #!/usr/bin/perl
    $extension = '.orig';
    LINE: while (<>) {
        if ($ARGV ne $oldargv) {
            if ($extension !~ /\*/) {
                $backup = $ARGV . $extension;
            }
            else {
                ($backup = $extension) =~ s/\*/$ARGV/g;
            }
            rename($ARGV, $backup);
            open(ARGVOUT, ">$ARGV");
            select(ARGVOUT);
            $oldargv = $ARGV;
        }
        s/foo/bar/;
    }
    continue {
        print;  # this prints to original filename
    }
    select(STDOUT);

except that the B<-i> form doesn't need to compare $ARGV to $oldargv to
know when the filename has changed.  It does, however, use ARGVOUT for
the selected filehandle.  Note that STDOUT is restored as the default
output filehandle after the loop.

As shown above, Perl creates the backup file whether or not any output
is actually changed.  So this is just a fancy way to copy files:

    $ perl -p -i'/some/file/path/*' -e 1 file1 file2 file3...
or
    $ perl -p -i'.orig' -e 1 file1 file2 file3...

You can use C<eof> without parentheses to locate the end of each input
file, in case you want to append to each file, or reset line numbering
(see example in L<perlfunc/eof>).

If, for a given file, Perl is unable to create the backup file as
specified in the extension then it will skip that file and continue on
with the next one (if it exists).

For a discussion of issues surrounding file permissions and B<-i>, see
L<perlfaq5/Why does Perl let me delete read-only files?  Why does -i clobber
protected files?  Isn't this a bug in Perl?>.

You cannot use B<-i> to create directories or to strip extensions from
files.

Perl does not expand C<~> in filenames, which is good, since some
folks use it for their backup files:

    $ perl -pi~ -e 's/foo/bar/' file1 file2 file3...

Note that because B<-i> renames or deletes the original file before
creating a new file of the same name, Unix-style soft and hard links will
not be preserved.

Finally, the B<-i> switch does not impede execution when no
files are given on the command line.  In this case, no backup is made
(the original file cannot, of course, be determined) and processing
proceeds from STDIN to STDOUT as might be expected.

=item B<-I>I<directory>
X<-I> X<@INC>

Directories specified by B<-I> are prepended to the search path for
modules (C<@INC>).

=item B<-l>[I<octnum>]
X<-l> X<$/> X<$\>

enables automatic line-ending processing.  It has two separate
effects.  First, it automatically chomps C<$/> (the input record
separator) when used with L</-n> or L</-p>.  Second, it assigns C<$\>
(the output record separator) to have the value of I<octnum> so
that any print statements will have that separator added back on.
If I<octnum> is omitted, sets C<$\> to the current value of
C<$/>.  For instance, to trim lines to 80 columns:

    perl -lpe 'substr($_, 80) = ""'

Note that the assignment C<$\ = $/> is done when the switch is processed,
so the input record separator can be different than the output record
separator if the B<-l> switch is followed by a
L<-0|/-0[octalE<sol>hexadecimal]> switch:

    gnufind / -print0 | perl -ln0e 'print "found $_" if -p'

This sets C<$\> to newline and then sets C<$/> to the null character.

=item B<-m>[B<->]I<module>
X<-m> X<-M>

=item B<-M>[B<->]I<module>

=item B<-M>[B<->]I<'module ...'>

=item B<-[mM]>[B<->]I<module=arg[,arg]...>

B<-m>I<module> executes C<use> I<module> C<();> before executing your
program.  This loads the module, but does not call its C<import> method,
so does not import subroutines and does not give effect to a pragma.

B<-M>I<module> executes C<use> I<module> C<;> before executing your
program.  This loads the module and calls its C<import> method, causing
the module to have its default effect, typically importing subroutines
or giving effect to a pragma.
You can use quotes to add extra code after the module name,
e.g., C<'-MI<MODULE> qw(foo bar)'>.

If the first character after the B<-M> or B<-m> is a dash (B<->)
then the 'use' is replaced with 'no'.
This makes no difference for B<-m>.

A little builtin syntactic sugar means you can also say
B<-mI<MODULE>=foo,bar> or B<-MI<MODULE>=foo,bar> as a shortcut for
B<'-MI<MODULE> qw(foo bar)'>.  This avoids the need to use quotes when
importing symbols.  The actual code generated by B<-MI<MODULE>=foo,bar> is
C<use module split(/,/,q{foo,bar})>.  Note that the C<=> form
removes the distinction between B<-m> and B<-M>; that is,
B<-mI<MODULE>=foo,bar> is the same as B<-MI<MODULE>=foo,bar>.

A consequence of the C<split> formulation
is that B<-MI<MODULE>=number> never does a version check,
unless C<I<MODULE>::import()> itself is set up to do a version check, which
could happen for example if I<MODULE> inherits from L<Exporter>.

=item B<-n>
X<-n>

causes Perl to assume the following loop around your program, which
makes it iterate over filename arguments somewhat like I<sed -n> or
I<awk>:

  LINE:
    while (<>) {
        ...             # your program goes here
    }

Note that the lines are not printed by default.  See L</-p> to have
lines printed.  If a file named by an argument cannot be opened for
some reason, Perl warns you about it and moves on to the next file.

Also note that C<< <> >> passes command line arguments to
L<perlfunc/open>, which doesn't necessarily interpret them as file names.
See L<perlop> for possible security implications.

Here is an efficient way to delete all files that haven't been modified for
at least a week:

    find . -mtime +7 -print | perl -nle unlink

This is faster than using the B<-exec> switch of I<find> because you don't
have to start a process on every filename found (but it's not faster
than using the B<-delete> switch available in newer versions of I<find>).
It does suffer from the bug of mishandling newlines in pathnames, which
you can fix if you follow the example under
L<-0|/-0[octalE<sol>hexadecimal]>.

C<BEGIN> and C<END> blocks may be used to capture control before or after
the implicit program loop, just as in I<awk>.

=item B<-p>
X<-p>

causes Perl to assume the following loop around your program, which
makes it iterate over filename arguments somewhat like I<sed>:

  LINE:
    while (<>) {
        ...             # your program goes here
    } continue {
        print or die "-p destination: $!\n";
    }

If a file named by an argument cannot be opened for some reason, Perl
warns you about it, and moves on to the next file.  Note that the
lines are printed automatically.  An error occurring during printing is
treated as fatal.  To suppress printing use the L</-n> switch.  A B<-p>
overrides a B<-n> switch.

C<BEGIN> and C<END> blocks may be used to capture control before or after
the implicit loop, just as in I<awk>.

=item B<-s>
X<-s>

enables rudimentary switch parsing for switches on the command
line after the program name but before any filename arguments (or before
an argument of B<-->).  Any switch found there is removed from @ARGV and sets the
corresponding variable in the Perl program.  The following program
prints "1" if the program is invoked with a B<-xyz> switch, and "abc"
if it is invoked with B<-xyz=abc>.


    #!/usr/bin/perl -s
    if ($xyz) { print "$xyz\n" }

Do note that a switch like B<--help> creates the variable C<${-help}>, which is
not compliant with C<use strict "refs">.  Also, when using this option on a
script with warnings enabled you may get a lot of spurious "used only once"
warnings.

=item B<-S>
X<-S>

makes Perl use the L</PATH> environment variable to search for the
program unless the name of the program contains path separators.

On some platforms, this also makes Perl append suffixes to the
filename while searching for it.  For example, on Win32 platforms,
the ".bat" and ".cmd" suffixes are appended if a lookup for the
original name fails, and if the name does not already end in one
of those suffixes.  If your Perl was compiled with C<DEBUGGING> turned
on, using the L<-Dp|/-Dletters> switch to Perl shows how the search
progresses.

Typically this is used to emulate C<#!> startup on platforms that don't
support C<#!>.  It's also convenient when debugging a script that uses C<#!>,
and is thus normally found by the shell's $PATH search mechanism.

This example works on many platforms that have a shell compatible with
Bourne shell:

    #!/usr/bin/perl
    eval 'exec /usr/bin/perl -wS $0 ${1+"$@"}'
            if $running_under_some_shell;

The system ignores the first line and feeds the program to F</bin/sh>,
which proceeds to try to execute the Perl program as a shell script.
The shell executes the second line as a normal shell command, and thus
starts up the Perl interpreter.  On some systems $0 doesn't always
contain the full pathname, so the L</-S> tells Perl to search for the
program if necessary.  After Perl locates the program, it parses the
lines and ignores them because the variable $running_under_some_shell
is never true.  If the program will be interpreted by csh, you will need
to replace C<${1+"$@"}> with C<$*>, even though that doesn't understand
embedded spaces (and such) in the argument list.

To start up I<sh> rather than I<csh>, some systems may have to replace the
C<#!> line with a line containing just a colon, which will be politely
ignored by Perl.  Other systems can't control that, and need a totally
devious construct that will work under any of I<csh>, I<sh>, or Perl,
such as the following:

    eval '(exit $?0)' && eval 'exec perl -wS $0 ${1+"$@"}'
    & eval 'exec /usr/bin/perl -wS $0 $argv:q'
            if $running_under_some_shell;

If the filename supplied contains directory separators (and so is an
absolute or relative pathname), and if that file is not found,
platforms that append file extensions will do so and try to look
for the file with those extensions added, one by one.

On DOS-like platforms, if the program does not contain directory
separators, it will first be searched for in the current directory
before being searched for on the PATH.  On Unix platforms, the
program will be searched for strictly on the PATH.

=item B<-t>
X<-t>

Like L</-T>, but taint checks will issue warnings rather than fatal
errors.  These warnings can now be controlled normally with C<no warnings
qw(taint)>.

B<Note: This is not a substitute for C<-T>!> This is meant to be
used I<only> as a temporary development aid while securing legacy code:
for real production code and for new secure code written from scratch,
always use the real L</-T>.

=item B<-T>
X<-T>

turns on "taint" checks so you can test them.  Ordinarily
these checks are done only when running setuid or setgid.  It's a
good idea to turn them on explicitly for programs that run on behalf
of someone else whom you might not necessarily trust, such as CGI
programs or any internet servers you might write in Perl.  See
L<perlsec> for details.  For security reasons, this option must be
seen by Perl quite early; usually this means it must appear early
on the command line or in the C<#!> line for systems which support
that construct.

=item B<-u>
X<-u>

This switch causes Perl to dump core after compiling your
program.
You can then in theory take this core dump and turn it
into an executable file by using the I<undump> program (not supplied).
This speeds startup at the expense of some disk space (which you
can minimize by stripping the executable).  (Still, a "hello world"
executable comes out to about 200K on my machine.)  If you want to
execute a portion of your program before dumping, use the C<CORE::dump()>
function instead.  Note: availability of I<undump> is platform
specific and may not be available for a specific port of Perl.

=item B<-U>
X<-U>

allows Perl to do unsafe operations.  Currently the only "unsafe"
operations are attempting to unlink directories while running as superuser
and running setuid programs with fatal taint checks turned into warnings.
Note that warnings must be enabled along with this option to actually
I<generate> the taint-check warnings.

=item B<-v>
X<-v>

prints the version and patchlevel of your perl executable.

=item B<-V>
X<-V>

prints a summary of the major perl configuration values and the current
values of @INC.

=item B<-V:>I<configvar>

Prints to STDOUT the value of the named configuration variable(s),
with multiples when your C<I<configvar>> argument looks like a regex (has
non-letters).  For example:

    $ perl -V:libc
        libc='/lib/libc-2.2.4.so';
    $ perl -V:lib.
        libs='-lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lc';
        libc='/lib/libc-2.2.4.so';
    $ perl -V:lib.*
        libpth='/usr/local/lib /lib /usr/lib';
        libs='-lnsl -lgdbm -ldb -ldl -lm -lcrypt -lutil -lc';
        lib_ext='.a';
        libc='/lib/libc-2.2.4.so';
        libperl='libperl.a';
        ....

Additionally, extra colons can be used to control formatting.  A
trailing colon suppresses the linefeed and terminator ";", allowing
you to embed queries into shell commands.  (mnemonic: PATH separator
":".)

    $ echo "compression-vars: " `perl -V:z.*: ` " are here !"
    compression-vars:  zcat='' zip='zip'  are here !

A leading colon removes the "name=" part of the response; this allows
you to map to the name you need.
(mnemonic: empty label)

    $ echo "goodvfork="`./perl -Ilib -V::usevfork`
    goodvfork=false;

Leading and trailing colons can be used together if you need
positional parameter values without the names.  Note that in the case
below, the C<PERL_API> params are returned in alphabetical order.

    $ echo building_on `perl -V::osname: -V::PERL_API_.*:` now
    building_on 'linux' '5' '1' '9' now

=item B<-w>
X<-w>

prints warnings about dubious constructs, such as variable names
mentioned only once and scalar variables used
before being set; redefined subroutines; references to undefined
filehandles; filehandles opened read-only that you are attempting
to write on; values used as a number that don't I<look> like numbers;
using an array as though it were a scalar; if your subroutines
recurse more than 100 deep; and innumerable other things.

This switch really just enables the global C<$^W> variable; normally,
the lexically scoped C<use warnings> pragma is preferred.  You
can disable or promote into fatal errors specific warnings using
C<__WARN__> hooks, as described in L<perlvar> and L<perlfunc/warn>.
See also L<perldiag> and L<perltrap>.  A fine-grained warning
facility is also available if you want to manipulate entire classes
of warnings; see L<warnings>.

=item B<-W>
X<-W>

Enables all warnings regardless of C<no warnings> or C<$^W>.
See L<warnings>.

=item B<-X>
X<-X>

Disables all warnings regardless of C<use warnings> or C<$^W>.
See L<warnings>.

Forbidden in L</C<PERL5OPT>>.

=item B<-x>
X<-x>

=item B<-x>I<directory>

tells Perl that the program is embedded in a larger chunk of unrelated
text, such as in a mail message.  Leading garbage will be
discarded until the first line that starts with C<#!> and contains the
string "perl".  Any meaningful switches on that line will be applied.

All references to line numbers by the program (warnings, errors, ...)
will treat the C<#!> line as the first line.
Thus a warning on the 2nd line of the program, which is on the 100th
line in the file will be reported as line 2, not as line 100.
This can be overridden by using the C<#line> directive.
(See L<perlsyn/"Plain Old Comments (Not!)">)

If a directory name is specified, Perl will switch to that directory
before running the program.  The B<-x> switch controls only the
disposal of leading garbage.  The program must be terminated with
C<__END__> if there is trailing garbage to be ignored;  the program
can process any or all of the trailing garbage via the C<DATA> filehandle
if desired.

The directory, if specified, must appear immediately following the B<-x>
with no intervening whitespace.

=back

=head1 ENVIRONMENT
X<perl, environment variables>

=over 12

=item HOME
X<HOME>

Used if C<chdir> has no argument.

=item LOGDIR
X<LOGDIR>

Used if C<chdir> has no argument and L</HOME> is not set.

=item PATH
X<PATH>

Used in executing subprocesses, and in finding the program if L</-S> is
used.

=item PERL5LIB
X<PERL5LIB>

A list of directories in which to look for Perl library files before
looking in the standard library.
Any architecture-specific and version-specific directories,
such as F<version/archname/>, F<version/>, or F<archname/> under the
specified locations are automatically included if they exist, with this
lookup done at interpreter startup time.  In addition, any directories
matching the entries in C<$Config{inc_version_list}> are added.
(These typically would be for older compatible perl versions installed
in the same directory tree.)

If PERL5LIB is not defined, L</PERLLIB> is used.  Directories are separated
(like in PATH) by a colon on Unixish platforms and by a semicolon on
Windows (the proper path separator being given by the command C<perl
-V:I<path_sep>>).

When running taint checks, either because the program was running setuid or
setgid, or the L</-T> or L</-t> switch was specified, neither PERL5LIB nor
L</PERLLIB> is consulted.
The program should instead say:

    use lib "/my/directory";

=item PERL5OPT
X<PERL5OPT>

Command-line options (switches).  Switches in this variable are treated
as if they were on every Perl command line.  Only the B<-[CDIMTUWdmtw]>
switches are allowed.  When running taint checks (either because the
program was running setuid or setgid, or because the L</-T> or L</-t>
switch was used), this variable is ignored.  If PERL5OPT begins with
B<-T>, tainting will be enabled and subsequent options ignored.  If
PERL5OPT begins with B<-t>, tainting will be enabled, a writable dot
removed from @INC, and subsequent options honored.

=item PERLIO
X<PERLIO>

A space (or colon) separated list of PerlIO layers.  If perl is built
to use PerlIO system for IO (the default) these layers affect Perl's IO.

It is conventional to start layer names with a colon (for example, C<:perlio>)
to emphasize their similarity to variable "attributes".  But the code that
parses layer specification strings, which is also used to decode the PERLIO
environment variable, treats the colon as a separator.

An unset or empty PERLIO is equivalent to the default set of layers for
your platform; for example, C<:unix:perlio> on Unix-like systems
and C<:unix:crlf> on Windows and other DOS-like systems.

The list becomes the default for I<all> Perl's IO.  Consequently only built-in
layers can appear in this list, as external layers (such as C<:encoding()>) need
IO in order to load them!  See L<"open pragma"|open> for how to add external
encodings as defaults.

Layers it makes sense to include in the PERLIO environment
variable are briefly summarized below.  For more details see L<PerlIO>.

=over 8

=item :crlf
X<:crlf>

A layer which does CRLF to C<"\n"> translation distinguishing "text" and
"binary" files in the manner of MS-DOS and similar operating systems,
and also provides buffering similar to C<:perlio> on these architectures.

=item :perlio
X<:perlio>

This is a re-implementation of stdio-like buffering written as a
PerlIO layer.
As such it will call whatever layer is below it for
its operations, typically C<:unix>.

=item :stdio
X<:stdio>

This layer provides a PerlIO interface by wrapping system's ANSI C "stdio"
library calls.  The layer provides both buffering and IO.
Note that the C<:stdio> layer does I<not> do CRLF translation even if that
is the platform's normal behaviour.  You will need a C<:crlf> layer above it
to do that.

=item :unix
X<:unix>

Low-level layer that calls C<read>, C<write>, C<lseek>, etc.

=item :win32
X<:win32>

On Win32 platforms this I<experimental> layer uses native "handle" IO
rather than a Unix-like numeric file descriptor layer.  Known to be
buggy in this release (5.30).

=back

The default set of layers should give acceptable results on all platforms.

For Unix platforms that will be the equivalent of ":unix:perlio" or ":stdio".
Configure is set up to prefer the ":stdio" implementation if the system's library
provides for fast access to the buffer (not common on modern architectures);
otherwise, it uses the ":unix:perlio" implementation.

On Win32 the default in this release (5.30) is ":unix:crlf".  Win32's ":stdio"
has a number of bugs/mis-features for Perl IO which depend somewhat
on the version and vendor of the C compiler.  Using our own C<:crlf> layer as
the buffer avoids those issues and makes things more uniform.

This release (5.30) uses C<:unix> as the bottom layer on Win32, and so
still uses the C compiler's numeric file descriptor routines.  There
is an experimental native C<:win32> layer, which is expected to be
enhanced and may eventually become the default under Win32.

The PERLIO environment variable is completely ignored when Perl
is run in taint mode.

=item PERLIO_DEBUG
X<PERLIO_DEBUG>

If set to the name of a file or device when Perl is run with the
L<-Di|/-Dletters> command-line switch, the logging of certain operations
of the PerlIO subsystem will be redirected to the specified file rather
than going to stderr, which is the default.
The file is opened in append mode.  Typical uses are in Unix:

    % env PERLIO_DEBUG=/tmp/perlio.log perl -Di script ...

and under Win32, the approximately equivalent:

    > set PERLIO_DEBUG=CON
    perl -Di script ...

This functionality is disabled for setuid scripts, for scripts run
with L</-T>, and for scripts run on a Perl built without C<-DDEBUGGING>
support.

=item PERLLIB
X<PERLLIB>

A list of directories in which to look for Perl library
files before looking in the standard library.
If L</PERL5LIB> is defined, PERLLIB is not used.

The PERLLIB environment variable is completely ignored when Perl
is run in taint mode.

=item PERL5DB
X<PERL5DB>

The command used to load the debugger code.  The default is:

    BEGIN { require "perl5db.pl" }

The PERL5DB environment variable is only used when Perl is started with
a bare L</-d> switch.

=item PERL5DB_THREADED
X<PERL5DB_THREADED>

If set to a true value, indicates to the debugger that the code being
debugged uses threads.

=item PERL5SHELL (specific to the Win32 port)
X<PERL5SHELL>

On Win32 ports only, may be set to an alternative shell that Perl must use
internally for executing "backtick" commands or system().  Default is
C<cmd.exe /x/d/c> on WindowsNT and C<command.com /c> on Windows95.  The
value is considered space-separated.  Precede any character that
needs to be protected, like a space or backslash, with another backslash.

Note that Perl doesn't use COMSPEC for this purpose because
COMSPEC has a high degree of variability among users, leading to
portability concerns.  Besides, Perl can use a shell that may not be
fit for interactive use, and setting COMSPEC to such a shell may
interfere with the proper functioning of other programs (which usually
look in COMSPEC to find a shell fit for interactive use).

Before Perl 5.10.0 and 5.8.8, PERL5SHELL was not taint checked
when running external commands.  It is recommended that
you explicitly set (or delete) C<$ENV{PERL5SHELL}> when running
in taint mode under Windows.

=item PERL_ALLOW_NON_IFS_LSP (specific to the Win32 port)
X<PERL_ALLOW_NON_IFS_LSP>

Set to 1 to allow the use of non-IFS compatible LSPs (Layered Service Providers).
Perl normally searches for an IFS-compatible LSP because this is required
for its emulation of Windows sockets as real filehandles.  However, this may
cause problems if you have a firewall such as I<McAfee Guardian>, which requires
that all applications use its LSP but which is not IFS-compatible, because clearly
Perl will normally avoid using such an LSP.

Setting this environment variable to 1 means that Perl will simply use the
first suitable LSP enumerated in the catalog, which keeps I<McAfee Guardian>
happy--and in that particular case Perl still works too because I<McAfee
Guardian>'s LSP actually plays other games which allow applications
requiring IFS compatibility to work.

=item PERL_DEBUG_MSTATS
X<PERL_DEBUG_MSTATS>

Relevant only if Perl is compiled with the C<malloc> included with the Perl
distribution; that is, if C<perl -V:d_mymalloc> is "define".

If set, this dumps out memory statistics after execution.  If set
to an integer greater than one, also dumps out memory statistics
after compilation.

=item PERL_DESTRUCT_LEVEL
X<PERL_DESTRUCT_LEVEL>

Controls the behaviour of global destruction of objects and other
references.  See L<perlhacktips/PERL_DESTRUCT_LEVEL> for more information.

=item PERL_DL_NONLAZY
X<PERL_DL_NONLAZY>

Set to C<"1"> to have Perl resolve I<all> undefined symbols when it loads
a dynamic library.  The default behaviour is to resolve symbols when
they are used.  Setting this variable is useful during testing of
extensions, as it ensures that you get an error on misspelled function
names even if the test suite doesn't call them.

=item PERL_ENCODING
X<PERL_ENCODING>

If using the C<use encoding> pragma without an explicit encoding name, the
PERL_ENCODING environment variable is consulted for an encoding name.

=item PERL_HASH_SEED
X<PERL_HASH_SEED>

(Since Perl 5.8.1, new semantics in Perl 5.18.0)  Used to override
the randomization of Perl's internal hash function.  The value is expressed
in hexadecimal, and may include a leading 0x.  Truncated patterns
are treated as though they are suffixed with sufficient 0's as required.

If the option is provided, and C<PERL_PERTURB_KEYS> is NOT set, then
a value of '0' implies C<PERL_PERTURB_KEYS=0> and any other value
implies C<PERL_PERTURB_KEYS=2>.

B<PLEASE NOTE: The hash seed is sensitive information>.  Hashes are
randomized to protect against local and remote attacks against Perl
code.  By manually setting a seed, this protection may be partially or
completely lost.

See L<perlsec/"Algorithmic Complexity Attacks">, L</PERL_PERTURB_KEYS>, and
L</PERL_HASH_SEED_DEBUG> for more information.

=item PERL_PERTURB_KEYS
X<PERL_PERTURB_KEYS>

(Since Perl 5.18.0)  When set to C<"0"> or C<"NO">, traversing keys
will be repeatable from run to run for the same C<PERL_HASH_SEED>.
Insertion into a hash will not change the order, except to provide
for more space in the hash.  When combined with setting PERL_HASH_SEED
this mode is as close to pre 5.18 behavior as you can get.

When set to C<"1"> or C<"RANDOM">, traversing keys will be randomized.
Every time a hash is inserted into, the key order will change in a random
fashion.  The order may not be repeatable in a following program run
even if the PERL_HASH_SEED has been specified.  This is the default
mode for perl.

When set to C<"2"> or C<"DETERMINISTIC">, inserting keys into a hash
will cause the key order to change, but in a way that is repeatable
from program run to program run.

B<NOTE:> Use of this option is considered insecure, and is intended only
for debugging non-deterministic behavior in Perl's hash function.  Do
not use it in production.

See L<perlsec/"Algorithmic Complexity Attacks"> and L</PERL_HASH_SEED>
and L</PERL_HASH_SEED_DEBUG> for more information.

You can get and set the key traversal mask for a specific hash by using
the C<hash_traversal_mask()> function from L<Hash::Util>.

=item PERL_HASH_SEED_DEBUG
X<PERL_HASH_SEED_DEBUG>

(Since Perl 5.8.1.)  Set to C<"1"> to display (to STDERR) information
about the hash function, seed, and what type of key traversal
randomization is in effect at the beginning of execution.  This, combined
with L</PERL_HASH_SEED> and L</PERL_PERTURB_KEYS> is intended to aid in
debugging nondeterministic behaviour caused by hash randomization.

B<Note> that any information about the hash function, especially the hash
seed is B<sensitive information>: by knowing it, one can craft a denial-of-service
attack against Perl code, even remotely; see L<perlsec/"Algorithmic Complexity Attacks">
for more information.  B<Do not disclose the hash seed> to people who
don't need to know it.  See also L<C<hash_seed()>|Hash::Util/hash_seed> and
L<C<hash_traversal_mask()>|Hash::Util/hash_traversal_mask>.

An example output might be:

 HASH_FUNCTION = ONE_AT_A_TIME_HARD HASH_SEED = 0x652e9b9349a7a032 PERTURB_KEYS = 1 (RANDOM)

=item PERL_MEM_LOG
X<PERL_MEM_LOG>

If your Perl was configured with B<-Accflags=-DPERL_MEM_LOG>, setting
the environment variable C<PERL_MEM_LOG> enables logging debug
messages.  The value has the form C<< <I<number>>[m][s][t] >>, where
C<I<number>> is the file descriptor number you want to write to (2 is
default), and the combination of letters specifies that you want
information about (m)emory and/or (s)v, optionally with
(t)imestamps.  For example, C<PERL_MEM_LOG=1mst> logs all
information to stdout.  You can write to other opened file descriptors
in a variety of ways:

    $ 3>foo3 PERL_MEM_LOG=3m perl ...

=item PERL_ROOT (specific to the VMS port)
X<PERL_ROOT>

A translation-concealed rooted logical name that contains Perl and the
logical device for the @INC path on VMS only.
Other logical names that
affect Perl on VMS include PERLSHR, PERL_ENV_TABLES, and
SYS$TIMEZONE_DIFFERENTIAL, but are optional and discussed further in
L<perlvms> and in F<README.vms> in the Perl source distribution.

=item PERL_SIGNALS
X<PERL_SIGNALS>

Available in Perls 5.8.1 and later.  If set to C<"unsafe">, the pre-Perl-5.8.0
signal behaviour (which is immediate but unsafe) is restored.  If set
to C<safe>, then safe (but deferred) signals are used.  See
L<perlipc/"Deferred Signals (Safe Signals)">.

=item PERL_UNICODE
X<PERL_UNICODE>

Equivalent to the L<-C|/-C [numberE<sol>list]> command-line switch.  Note
that this is not a boolean variable.  Setting this to C<"1"> is not the
right way to "enable Unicode" (whatever that would mean).  You can use
C<"0"> to "disable Unicode", though (or alternatively unset PERL_UNICODE
in your shell before starting Perl).  See the description of the
L<-C|/-C [numberE<sol>list]> switch for more information.

=item PERL_USE_UNSAFE_INC
X<PERL_USE_UNSAFE_INC>

If perl has been configured to not have the current directory in
L<C<@INC>|perlvar/@INC> by default, this variable can be set to C<"1">
to reinstate it.  It's primarily intended for use while building and
testing modules that have not been updated to deal with "." not being in
C<@INC> and should not be set in the environment for day-to-day use.

=item SYS$LOGIN (specific to the VMS port)
X<SYS$LOGIN>

Used if chdir has no argument and L</HOME> and L</LOGDIR> are not set.

=item PERL_INTERNAL_RAND_SEED
X<PERL_INTERNAL_RAND_SEED>

Set to a non-negative integer to seed the random number generator used
internally by perl for a variety of purposes.

Ignored if perl is run setuid or setgid.  Used only for some limited
startup randomization (hash keys) if perl is started with tainting
enabled (C<-T> or C<-t>).

Perl may be built to ignore this variable.

=back

Perl also has environment variables that control how Perl handles data
specific to particular natural languages; see L<perllocale>.

Perl and its various modules and components, including its test frameworks, may sometimes make use of certain other environment variables. Some of these are specific to a particular platform. Please consult the appropriate module documentation and any documentation for your platform (like L<perlsolaris>, L<perllinux>, L<perlmacosx>, L<perlwin32>, etc) for variables peculiar to those specific situations. Perl makes all environment variables available to the program being executed, and passes these along to any child processes it starts. However, programs running setuid would do well to execute the following lines before doing anything else, just to keep people honest: $ENV{PATH} = "/bin:/usr/bin"; # or whatever you need $ENV{SHELL} = "/bin/sh" if exists $ENV{SHELL}; delete @ENV{qw(IFS CDPATH ENV BASH_ENV)}; =head1 ORDER OF APPLICATION Some options, in particular C<-I>, C<-M>, C<PERL5LIB> and C<PERL5OPT> can interact, and the order in which they are applied is important. Note that this section does not document what I<actually> happens inside the perl interpreter, it documents what I<effectively> happens. =over =item -I The effect of multiple C<-I> options is to C<unshift> them onto C<@INC> from right to left. So for example: perl -I 1 -I 2 -I 3 will first prepend C<3> onto the front of C<@INC>, then prepend C<2>, and then prepend C<1>. The result is that C<@INC> begins with: qw(1 2 3) =item -M Multiple C<-M> options are processed from left to right. So this: perl -Mlib=1 -Mlib=2 -Mlib=3 will first use the L<lib> pragma to prepend C<1> to C<@INC>, then it will prepend C<2>, then it will prepend C<3>, resulting in an C<@INC> that begins with: qw(3 2 1) =item the PERL5LIB environment variable This contains a list of directories, separated by colons. The entire list is prepended to C<@INC> in one go. 
This: PERL5LIB=1:2:3 perl will result in an C<@INC> that begins with: qw(1 2 3) =item combinations of -I, -M and PERL5LIB C<PERL5LIB> is applied first, then all the C<-I> arguments, then all the C<-M> arguments. This: PERL5LIB=e1:e2 perl -I i1 -Mlib=m1 -I i2 -Mlib=m2 will result in an C<@INC> that begins with: qw(m2 m1 i1 i2 e1 e2) =item the PERL5OPT environment variable This contains a space separated list of switches. We only consider the effects of C<-M> and C<-I> in this section. After normal processing of C<-I> switches from the command line, all the C<-I> switches in C<PERL5OPT> are extracted. They are processed from left to right instead of from right to left. Also note that while whitespace is allowed between a C<-I> and its directory on the command line, it is not allowed in C<PERL5OPT>. After normal processing of C<-M> switches from the command line, all the C<-M> switches in C<PERL5OPT> are extracted. They are processed from left to right, I<i.e.> the same as those on the command line. An example may make this clearer: export PERL5OPT="-Mlib=optm1 -Iopti1 -Mlib=optm2 -Iopti2" export PERL5LIB=e1:e2 perl -I i1 -Mlib=m1 -I i2 -Mlib=m2 will result in an C<@INC> that begins with: qw( optm2 optm1 m2 m1 opti2 opti1 i1 i2 e1 e2 ) =item Other complications There are some complications that are ignored in the examples above: =over =item arch and version subdirs All of C<-I>, C<PERL5LIB> and C<use lib> will also prepend arch and version subdirs if they are present =item sitecustomize.pl =back =back =head1 NAME X<regular expression> X<regex> X<regexp> perlre - Perl regular expressions =head1 DESCRIPTION This page describes the syntax of regular expressions in Perl. If you haven't used regular expressions before, a tutorial introduction is available in L<perlretut>. If you know just a little about them, a quick-start introduction is available in L<perlrequick>.
Except for the L</The Basics> section, this page assumes you are familiar with regular expression basics, such as what a "pattern" is, what it looks like, and how it is basically used. For a reference on how they are used, plus various examples of the same, see discussions of C<m//>, C<s///>, C<qr//> and C<"??"> in L<perlop/"Regexp Quote-Like Operators">. New in v5.22, L<C<use re 'strict'>|re/'strict' mode> applies stricter rules than otherwise when compiling regular expression patterns. It can find things that, while legal, may not be what you intended. =head2 The Basics X<regular expression, version 8> X<regex, version 8> X<regexp, version 8> Regular expressions are strings with the very particular syntax and meaning described in this document and auxiliary documents referred to by this one. The strings are called "patterns". Patterns are used to determine if some other string, called the "target", has (or doesn't have) the characteristics specified by the pattern. We call this "matching" the target string against the pattern. Usually the match is done by having the target be the first operand, and the pattern be the second operand, of one of the two binary operators C<=~> and C<!~>, listed in L<perlop/Binding Operators>; and the pattern will have been converted from an ordinary string by one of the operators in L<perlop/"Regexp Quote-Like Operators">, like so: $foo =~ m/abc/ This evaluates to true if and only if the string in the variable C<$foo> contains somewhere in it, the sequence of characters "a", "b", then "c". (The C<=~ m>, or match operator, is described in L<perlop/m/PATTERN/msixpodualngc>.) Patterns that aren't already stored in some variable must be delimited, at both ends, by delimiter characters. These are often, as in the example above, forward slashes, and the typical way a pattern is written in documentation is with those slashes.
In most cases, the delimiter is the same character, fore and aft, but there are a few cases where a character looks like it has a mirror-image mate, where the opening version is the beginning delimiter, and the closing one is the ending delimiter, like $foo =~ m<abc> Most times, the pattern is evaluated in double-quotish context, but it is possible to choose delimiters to force single-quotish, like $foo =~ m'abc' If the pattern contains its delimiter within it, that delimiter must be escaped. Prefixing it with a backslash (I<e.g.>, C<"/foo\/bar/">) serves this purpose. Any single character in a pattern matches that same character in the target string, unless the character is a I<metacharacter> with a special meaning described in this document. A sequence of non-metacharacters matches the same sequence in the target string, as we saw above with C<m/abc/>. Only a few characters (all of them being ASCII punctuation characters) are metacharacters. The most commonly used one is a dot C<".">, which normally matches almost any character (including a dot itself). You can cause characters that normally function as metacharacters to be interpreted literally by prefixing them with a C<"\">, just like the pattern's delimiter must be escaped if it also occurs within the pattern. Thus, C<"\."> matches just a literal dot, C<"."> instead of its normal meaning. This means that the backslash is also a metacharacter, so C<"\\"> matches a single C<"\">. And a sequence that contains an escaped metacharacter matches the same sequence (but without the escape) in the target string. So, the pattern C</blur\\fl/> would match any target string that contains the sequence C<"blur\fl">. The metacharacter C<"|"> is used to match one thing or another. Thus $foo =~ m/this|that/ is TRUE if and only if C<$foo> contains either the sequence C<"this"> or the sequence C<"that">.
Like all metacharacters, prefixing the C<"|"> with a backslash makes it match the plain punctuation character; in its case, the VERTICAL LINE. $foo =~ m/this\|that/ is TRUE if and only if C<$foo> contains the sequence C<"this|that">. You aren't limited to just a single C<"|">. $foo =~ m/fee|fie|foe|fum/ is TRUE if and only if C<$foo> contains any of those 4 sequences from the children's story "Jack and the Beanstalk". As you can see, the C<"|"> binds less tightly than a sequence of ordinary characters. We can override this by using the grouping metacharacters, the parentheses C<"("> and C<")">. $foo =~ m/th(is|at) thing/ is TRUE if and only if C<$foo> contains either the sequence S<C<"this thing">> or the sequence S<C<"that thing">>. The portions of the string that match the portions of the pattern enclosed in parentheses are normally made available separately for use later in the pattern, substitution, or program. This is called "capturing", and it can get complicated. See L</Capture groups>. The first alternative includes everything from the last pattern delimiter (C<"(">, C<"(?:"> (described later), I<etc>. or the beginning of the pattern) up to the first C<"|">, and the last alternative contains everything from the last C<"|"> to the next closing pattern delimiter. That's why it's common practice to include alternatives in parentheses: to minimize confusion about where they start and end. Alternatives are tried from left to right, so the first alternative found for which the entire expression matches, is the one that is chosen. This means that alternatives are not necessarily greedy. For example: when matching C<foo|foot> against C<"barefoot">, only the C<"foo"> part will match, as that is the first alternative tried, and it successfully matches the target string. (This might not seem important, but it is important when you are capturing matched text using parentheses.) 
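The left-to-right rule for alternatives can be checked with a short script. This is only a minimal sketch; the variable names are illustrative and are not part of any Perl API.

```perl
use strict;
use warnings;

my $target = "barefoot";

# "foo" is tried first and succeeds, so "foot" is never attempted.
my ($short_first) = $target =~ /(foo|foot)/;

# Reordering the alternatives lets the longer one be tried first.
my ($long_first) = $target =~ /(foot|foo)/;

print "foo|foot captured: $short_first\n";   # foo
print "foot|foo captured: $long_first\n";    # foot
```

When the longest match matters, one option is to list the longer alternatives first, as in the second pattern.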
Besides taking away the special meaning of a metacharacter, a prefixed backslash changes some letter and digit characters away from matching just themselves to instead have special meaning. These are called "escape sequences", and all such are described in L<perlrebackslash>. A backslash sequence (of a letter or digit) that doesn't currently have special meaning to Perl will raise a warning if warnings are enabled, as those are reserved for potential future use. One such sequence is C<\b>, which matches a boundary of some sort. C<\b{wb}> and a few others give specialized types of boundaries. (They are all described in detail starting at L<perlrebackslash/\b{}, \b, \B{}, \B>.) Note that these don't match characters, but the zero-width spaces between characters. They are an example of a L<zero-width assertion|/Assertions>. Consider again, $foo =~ m/fee|fie|foe|fum/ It evaluates to TRUE if, besides those 4 words, any of the sequences "feed", "field", "Defoe", "fume", and many others are in C<$foo>. By judicious use of C<\b> (or better (because it is designed to handle natural language) C<\b{wb}>), we can make sure that only the Giant's words are matched: $foo =~ m/\b(fee|fie|foe|fum)\b/ $foo =~ m/\b{wb}(fee|fie|foe|fum)\b{wb}/ The final example shows that the characters C<"{"> and C<"}"> are metacharacters. Another use for escape sequences is to specify characters that cannot (or which you prefer not to) be written literally. These are described in detail in L<perlrebackslash/Character Escapes>, but the next three paragraphs briefly describe some of them. Various control characters can be written in C language style: C<"\n"> matches a newline, C<"\t"> a tab, C<"\r"> a carriage return, C<"\f"> a form feed, I<etc>. More generally, C<\I<nnn>>, where I<nnn> is a string of three octal digits, matches the character whose native code point is I<nnn>. You can easily run into trouble if you don't have exactly three digits. 
So always use three, or since Perl 5.14, you can use C<\o{...}> to specify any number of octal digits. Similarly, C<\xI<nn>>, where I<nn> are hexadecimal digits, matches the character whose native ordinal is I<nn>. Again, not using exactly two digits is a recipe for disaster, but you can use C<\x{...}> to specify any number of hex digits. Besides being a metacharacter, the C<"."> is an example of a "character class", something that can match any single character of a given set of them. In its case, the set is just about all possible characters. Perl predefines several character classes besides the C<".">; there is a separate reference page about just these, L<perlrecharclass>. You can define your own custom character classes, by putting into your pattern in the appropriate place(s), a list of all the characters you want in the set. You do this by enclosing the list within C<[]> bracket characters. These are called "bracketed character classes" when we are being precise, but often the word "bracketed" is dropped. (Dropping it usually doesn't cause confusion.) This means that the C<"["> character is another metacharacter. It doesn't match anything just by itself; it is used only to tell Perl that what follows it is a bracketed character class. If you want to match a literal left square bracket, you must escape it, like C<"\[">. The matching C<"]"> is also a metacharacter; again it doesn't match anything by itself, but just marks the end of your custom class to Perl. It is an example of a "sometimes metacharacter". It isn't a metacharacter if there is no corresponding C<"[">, and matches its literal self: print "]" =~ /]/; # prints 1 The list of characters within the character class gives the set of characters matched by the class. C<"[abc]"> matches a single "a" or "b" or "c". But if the first character after the C<"["> is C<"^">, the class instead matches any character not in the list. 
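As a small illustration of a bracketed character class and its negated form, the following sketch filters a list of single-character strings. The list and the variable names are made up for the example.

```perl
use strict;
use warnings;

my @candidates = ('a', 'b', 'c', 'd', '^');

# [abc] matches exactly one of "a", "b" or "c".
my @in_class = grep { /^[abc]$/ } @candidates;

# A leading "^" complements the class: any single character
# other than "a", "b" or "c".
my @not_in_class = grep { /^[^abc]$/ } @candidates;

print "in class:     @in_class\n";       # a b c
print "not in class: @not_in_class\n";   # d ^
```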
Within a list, the C<"-"> character specifies a range of characters, so that C<a-z> represents all characters between "a" and "z", inclusive. If you want either C<"-"> or C<"]"> itself to be a member of a class, put it at the start of the list (possibly after a C<"^">), or escape it with a backslash. C<"-"> is also taken literally when it is at the end of the list, just before the closing C<"]">. (The following all specify the same class of three characters: C<[-az]>, C<[az-]>, and C<[a\-z]>. All are different from C<[a-z]>, which specifies a class containing twenty-six characters, even on EBCDIC-based character sets.) There is lots more to bracketed character classes; full details are in L<perlrecharclass/Bracketed Character Classes>. =head3 Metacharacters X<metacharacter> X<\> X<^> X<.> X<$> X<|> X<(> X<()> X<[> X<[]> L</The Basics> introduced some of the metacharacters. This section gives them all. Most of them have the same meaning as in the I<egrep> command. Only the C<"\"> is always a metacharacter. The others are metacharacters just sometimes. The following table lists all of them, summarizes their use, and gives the contexts where they are metacharacters. Outside those contexts or if prefixed by a C<"\">, they match their corresponding punctuation character. In some cases, their meaning varies depending on various pattern modifiers that alter the default behaviors. See L</Modifiers>.

            PURPOSE                                  WHERE
 \   Escape the next character                    Always, except when
                                                  escaped by another \
 ^   Match the beginning of the string            Not in []
       (or line, if /m is used)
 ^   Complement the [] class                      At the beginning of []
 .   Match any single character except newline    Not in []
       (under /s, includes newline)
 $   Match the end of the string                  Not in [], but can
       (or before newline at the end of the       mean interpolate a
       string; or before any newline if /m is     scalar
       used)
 |   Alternation                                  Not in []
 ()  Grouping                                     Not in []
 [   Start Bracketed Character class              Not in []
 ]   End Bracketed Character class                Only in [], and
                                                    not first
 *   Matches the preceding element 0 or more      Not in []
       times
 +   Matches the preceding element 1 or more      Not in []
       times
 ?   Matches the preceding element 0 or 1         Not in []
       times
 {   Starts a sequence that gives number(s)       Not in []
       of times the preceding element can be
       matched
 {   when following certain escape sequences
       starts a modifier to the meaning of the
       sequence
 }   End sequence started by {
 -   Indicates a range                            Only in [] interior
 #   Beginning of comment, extends to line end    Only with /x modifier

Notice that most of the metacharacters lose their special meaning when they occur in a bracketed character class, except C<"^"> has a different meaning when it is at the beginning of such a class. And C<"-"> and C<"]"> are metacharacters only at restricted positions within bracketed character classes; while C<"}"> is a metacharacter only when closing a special construct started by C<"{">. In double-quotish context, as is usually the case, you need to be careful about C<"$"> and the non-metacharacter C<"@">. Those could interpolate variables, which may or may not be what you intended. These rules were designed for compactness of expression, rather than legibility and maintainability. The L</E<sol>x and E<sol>xx> pattern modifiers allow you to insert white space to improve readability. And use of S<C<L<re 'strict'|re/'strict' mode>>> adds extra checking to catch some typos that might silently compile into something unintended.
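The caution about C<"$"> and C<"@"> in double-quotish context can be illustrated with a brief sketch. The strings and variable names below are invented for the example; the point is only to show where interpolation happens and where a backslash prevents it.

```perl
use strict;
use warnings;

my $suffix = "ing";

# $suffix is interpolated; the trailing "$" is still the
# end-of-string anchor.
my $interpolated = "testing" =~ /$suffix$/ ? 1 : 0;

# A backslash stops interpolation, so this looks for a literal "$5".
my $literal_dollar = 'cost: $5' =~ /\$5/ ? 1 : 0;

# Escaping "@" keeps Perl from trying to interpolate an array
# named @example.
my $literal_at = 'user@example.com' =~ /user\@example\.com/ ? 1 : 0;

print "$interpolated $literal_dollar $literal_at\n";   # 1 1 1
```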
By default, the C<"^"> character is guaranteed to match only the beginning of the string, the C<"$"> character only the end (or before the newline at the end), and Perl does certain optimizations with the assumption that the string contains only one line. Embedded newlines will not be matched by C<"^"> or C<"$">. You may, however, wish to treat a string as a multi-line buffer, such that the C<"^"> will match after any newline within the string (except if the newline is the last character in the string), and C<"$"> will match before any newline. At the cost of a little more overhead, you can do this by using the L</C<E<sol>m>> modifier on the pattern match operator. (Older programs did this by setting C<$*>, but this option was removed in perl 5.10.) X<^> X<$> X</m> To simplify multi-line substitutions, the C<"."> character never matches a newline unless you use the L<C<E<sol>s>|/s> modifier, which in effect tells Perl to pretend the string is a single line--even if it isn't. X<.> X</s> =head2 Modifiers =head3 Overview The default behavior for matching can be changed, using various modifiers. Modifiers that relate to the interpretation of the pattern are listed just below. Modifiers that alter the way a pattern is used by Perl are detailed in L<perlop/"Regexp Quote-Like Operators"> and L<perlop/"Gory details of parsing quoted constructs">. =over 4 =item B<C<m>> X</m> X<regex, multiline> X<regexp, multiline> X<regular expression, multiline> Treat the string being matched against as multiple lines. That is, change C<"^"> and C<"$"> from matching the start of the string's first line and the end of its last line to matching the start and end of each line within the string. =item B<C<s>> X</s> X<regex, single-line> X<regexp, single-line> X<regular expression, single-line> Treat the string as single line. That is, change C<"."> to match any character whatsoever, even a newline, which normally it would not match. 
Used together, as C</ms>, they let the C<"."> match any character whatsoever, while still allowing C<"^"> and C<"$"> to match, respectively, just after and just before newlines within the string. =item B<C<i>> X</i> X<regex, case-insensitive> X<regexp, case-insensitive> X<regular expression, case-insensitive> Do case-insensitive pattern matching. For example, "A" will match "a" under C</i>. If locale matching rules are in effect, the case map is taken from the current locale for code points less than 255, and from Unicode rules for larger code points. However, matches that would cross the Unicode rules/non-Unicode rules boundary (ords 255/256) will not succeed, unless the locale is a UTF-8 one. See L<perllocale>. There are a number of Unicode characters that match a sequence of multiple characters under C</i>. For example, C<LATIN SMALL LIGATURE FI> should match the sequence C<fi>. Perl is not currently able to do this when the multiple characters are in the pattern and are split between groupings, or when one or more are quantified. Thus "\N{LATIN SMALL LIGATURE FI}" =~ /fi/i; # Matches "\N{LATIN SMALL LIGATURE FI}" =~ /[fi][fi]/i; # Doesn't match! "\N{LATIN SMALL LIGATURE FI}" =~ /fi*/i; # Doesn't match! # The below doesn't match, and it isn't clear what $1 and $2 would # be even if it did!! "\N{LATIN SMALL LIGATURE FI}" =~ /(f)(i)/i; # Doesn't match! Perl doesn't match multiple characters in a bracketed character class unless the character that maps to them is explicitly mentioned, and it doesn't match them at all if the character class is inverted, which otherwise could be highly confusing. See L<perlrecharclass/Bracketed Character Classes>, and L<perlrecharclass/Negation>. =item B<C<x>> and B<C<xx>> X</x> Extend your pattern's legibility by permitting whitespace and comments. 
Details in L</E<sol>x and E<sol>xx> =item B<C<p>> X</p> X<regex, preserve> X<regexp, preserve> Preserve the string matched such that C<${^PREMATCH}>, C<${^MATCH}>, and C<${^POSTMATCH}> are available for use after matching. In Perl 5.20 and higher this is ignored. Due to a new copy-on-write mechanism, C<${^PREMATCH}>, C<${^MATCH}>, and C<${^POSTMATCH}> will be available after the match regardless of the modifier. =item B<C<a>>, B<C<d>>, B<C<l>>, and B<C<u>> X</a> X</d> X</l> X</u> These modifiers, all new in 5.14, affect which character-set rules (Unicode, I<etc>.) are used, as described below in L</Character set modifiers>. =item B<C<n>> X</n> X<regex, non-capture> X<regexp, non-capture> X<regular expression, non-capture> Prevent the grouping metacharacters C<()> from capturing. This modifier, new in 5.22, will stop C<$1>, C<$2>, I<etc>... from being filled in. "hello" =~ /(hi|hello)/; # $1 is "hello" "hello" =~ /(hi|hello)/n; # $1 is undef This is equivalent to putting C<?:> at the beginning of every capturing group: "hello" =~ /(?:hi|hello)/; # $1 is undef C</n> can be negated on a per-group basis. Alternatively, named captures may still be used. "hello" =~ /(?-n:(hi|hello))/n; # $1 is "hello" "hello" =~ /(?<greet>hi|hello)/n; # $1 is "hello", $+{greet} is # "hello" =item Other Modifiers There are a number of flags that can be found at the end of regular expression constructs that are I<not> generic regular expression flags, but apply to the operation being performed, like matching or substitution (C<m//> or C<s///> respectively). 
Flags described further in L<perlretut/"Using regular expressions in Perl"> are: c - keep the current position during repeated matching g - globally match the pattern repeatedly in the string Substitution-specific modifiers described in L<perlop/"s/PATTERN/REPLACEMENT/msixpodualngcer"> are: e - evaluate the right-hand side as an expression ee - evaluate the right side as a string then eval the result o - pretend to optimize your code, but actually introduce bugs r - perform non-destructive substitution and return the new value =back Regular expression modifiers are usually written in documentation as I<e.g.>, "the C</x> modifier", even though the delimiter in question might not really be a slash. The modifiers C</imnsxadlup> may also be embedded within the regular expression itself using the C<(?...)> construct, see L</Extended Patterns> below. =head3 Details on some modifiers Some of the modifiers require more explanation than given in the L</Overview> above. =head4 C</x> and C</xx> A single C</x> tells the regular expression parser to ignore most whitespace that is neither backslashed nor within a bracketed character class. You can use this to break up your regular expression into more readable parts. Also, the C<"#"> character is treated as a metacharacter introducing a comment that runs up to the pattern's closing delimiter, or to the end of the current line if the pattern extends onto the next line. Hence, this is very much like an ordinary Perl code comment. (You can include the closing delimiter within the comment only if you precede it with a backslash, so be careful!) Use of C</x> means that if you want real whitespace or C<"#"> characters in the pattern (outside a bracketed character class, which is unaffected by C</x>), then you'll either have to escape them (using backslashes or C<\Q...\E>) or encode them using octal, hex, or C<\N{}> or C<\p{name=...}> escapes. 
It is ineffective to try to continue a comment onto the next line by escaping the C<\n> with a backslash or C<\Q>. You can use L</(?#text)> to create a comment that ends earlier than the end of the current line, but C<text> also can't contain the closing delimiter unless escaped with a backslash. A common pitfall is to forget that C<"#"> characters begin a comment under C</x> and are not matched literally. Just keep that in mind when trying to puzzle out why a particular C</x> pattern isn't working as expected. Starting in Perl v5.26, if the modifier has a second C<"x"> within it, it does everything that a single C</x> does, but additionally non-backslashed SPACE and TAB characters within bracketed character classes are also generally ignored, and hence can be added to make the classes more readable. / [d-e g-i 3-7]/xx /[ ! @ " # $ % ^ & * () = ? <> ' ]/xx may be easier to grasp than the squashed equivalents /[d-eg-i3-7]/ /[!@"#$%^&*()=?<>']/ Taken together, these features go a long way towards making Perl's regular expressions more readable. Here's an example: # Delete (most) C comments. $program =~ s { /\* # Match the opening delimiter. .*? # Match a minimal number of characters. \*/ # Match the closing delimiter. } []gsx; Note that anything inside a C<\Q...\E> stays unaffected by C</x>. And note that C</x> doesn't affect space interpretation within a single multi-character construct. For example in C<\x{...}>, regardless of the C</x> modifier, there can be no spaces. Same for a L<quantifier|/Quantifiers> such as C<{3}> or C<{5,}>. Similarly, C<(?:...)> can't have a space between the C<"(">, C<"?">, and C<":">. Within any delimiters for such a construct, allowed spaces are not affected by C</x>, and depend on the construct. For example, C<\x{...}> can't have spaces because hexadecimal numbers don't have spaces in them. 
But, Unicode properties can have spaces, so in C<\p{...}> there can be spaces that follow the Unicode rules, for which see L<perluniprops/Properties accessible through \p{} and \P{}>. X</x> The set of characters that are deemed whitespace are those that Unicode calls "Pattern White Space", namely: U+0009 CHARACTER TABULATION U+000A LINE FEED U+000B LINE TABULATION U+000C FORM FEED U+000D CARRIAGE RETURN U+0020 SPACE U+0085 NEXT LINE U+200E LEFT-TO-RIGHT MARK U+200F RIGHT-TO-LEFT MARK U+2028 LINE SEPARATOR U+2029 PARAGRAPH SEPARATOR =head4 Character set modifiers C</d>, C</u>, C</a>, and C</l>, available starting in 5.14, are called the character set modifiers; they affect the character set rules used for the regular expression. The C</d>, C</u>, and C</l> modifiers are not likely to be of much use to you, and so you need not worry about them very much. They exist for Perl's internal use, so that complex regular expression data structures can be automatically serialized and later exactly reconstituted, including all their nuances. But, since Perl can't keep a secret, and there may be rare instances where they are useful, they are documented here. The C</a> modifier, on the other hand, may be useful. Its purpose is to allow code that is to work mostly on ASCII data to not have to concern itself with Unicode. Briefly, C</l> sets the character set to that of whatever B<L>ocale is in effect at the time of the execution of the pattern match. C</u> sets the character set to B<U>nicode. C</a> also sets the character set to Unicode, BUT adds several restrictions for B<A>SCII-safe matching. C</d> is the old, problematic, pre-5.14 B<D>efault character set behavior. Its only use is to force that old behavior. At any given time, exactly one of these modifiers is in effect. Their existence allows Perl to keep the originally compiled behavior of a regular expression, regardless of what rules are in effect when it is actually executed. 
And if it is interpolated into a larger regex, the original's rules continue to apply to it, and don't affect the other parts. The C</l> and C</u> modifiers are automatically selected for regular expressions compiled within the scope of various pragmas, and we recommend that in general, you use those pragmas instead of specifying these modifiers explicitly. For one thing, the modifiers affect only pattern matching, and do not extend to even any replacement done, whereas using the pragmas gives consistent results for all appropriate operations within their scopes. For example, s/foo/\Ubar/il will match "foo" using the locale's rules for case-insensitive matching, but the C</l> does not affect how the C<\U> operates. Most likely you want both of them to use locale rules. To do this, instead compile the regular expression within the scope of C<use locale>. This both implicitly adds the C</l>, and applies locale rules to the C<\U>. The lesson is to C<use locale>, and not C</l> explicitly. Similarly, it would be better to use C<use feature 'unicode_strings'> instead of, s/foo/\Lbar/iu to get Unicode rules, as the C<\L> in the former (but not necessarily the latter) would also use Unicode rules. More detail on each of the modifiers follows. Most likely you don't need to know this detail for C</l>, C</u>, and C</d>, and can skip ahead to L<E<sol>a|/E<sol>a (and E<sol>aa)>. =head4 /l means to use the current locale's rules (see L<perllocale>) when pattern matching. For example, C<\w> will match the "word" characters of that locale, and C<"/i"> case-insensitive matching will match according to the locale's case folding rules. The locale used will be the one in effect at the time of execution of the pattern match. This may not be the same as the compilation-time locale, and can differ from one match to another if there is an intervening call of the L<setlocale() function|perllocale/The setlocale function>. Prior to v5.20, Perl did not support multi-byte locales. 
Starting then, UTF-8 locales are supported. No other multi byte locales are ever likely to be supported. However, in all locales, one can have code points above 255 and these will always be treated as Unicode no matter what locale is in effect. Under Unicode rules, there are a few case-insensitive matches that cross the 255/256 boundary. Except for UTF-8 locales in Perls v5.20 and later, these are disallowed under C</l>. For example, 0xFF (on ASCII platforms) does not caselessly match the character at 0x178, C<LATIN CAPITAL LETTER Y WITH DIAERESIS>, because 0xFF may not be C<LATIN SMALL LETTER Y WITH DIAERESIS> in the current locale, and Perl has no way of knowing if that character even exists in the locale, much less what code point it is. In a UTF-8 locale in v5.20 and later, the only visible difference between locale and non-locale in regular expressions should be tainting (see L<perlsec>). This modifier may be specified to be the default by C<use locale>, but see L</Which character set modifier is in effect?>. X</l> =head4 /u means to use Unicode rules when pattern matching. On ASCII platforms, this means that the code points between 128 and 255 take on their Latin-1 (ISO-8859-1) meanings (which are the same as Unicode's). (Otherwise Perl considers their meanings to be undefined.) Thus, under this modifier, the ASCII platform effectively becomes a Unicode platform; and hence, for example, C<\w> will match any of the more than 100_000 word characters in Unicode. Unlike most locales, which are specific to a language and country pair, Unicode classifies all the characters that are letters I<somewhere> in the world as C<\w>. For example, your locale might not think that C<LATIN SMALL LETTER ETH> is a letter (unless you happen to speak Icelandic), but Unicode does. Similarly, all the characters that are decimal digits somewhere in the world will match C<\d>; this is hundreds, not 10, possible matches. 
And some of those digits look like some of the 10 ASCII digits, but mean a different number, so a human could easily think a number is a different quantity than it really is. For example, C<BENGALI DIGIT FOUR> (U+09EA) looks very much like an C<ASCII DIGIT EIGHT> (U+0038), and C<LEPCHA DIGIT SIX> (U+1C46) looks very much like an C<ASCII DIGIT FIVE> (U+0035). And C<\d+> may match strings of digits that are a mixture from different writing systems, creating a security issue. A fraudulent website, for example, could display the price of something using U+1C46, and it would appear to the user that something cost 500 units, but it really costs 600. A browser that enforced script runs (L</Script Runs>) would prevent that fraudulent display. L<Unicode::UCD/num()> can also be used to sort this out. Or the C</a> modifier can be used to force C<\d> to match just the ASCII 0 through 9. Also, under this modifier, case-insensitive matching works on the full set of Unicode characters. The C<KELVIN SIGN>, for example, matches the letters "k" and "K"; and C<LATIN SMALL LIGATURE FF> matches the sequence "ff", which, if you're not prepared, might make it look like a hexadecimal constant, presenting another potential security issue. See L<https://unicode.org/reports/tr36> for a detailed discussion of Unicode security issues. This modifier may be specified to be the default by C<use feature 'unicode_strings'>, C<use locale ':not_characters'>, or C<L<use 5.012|perlfunc/use VERSION>> (or higher), but see L</Which character set modifier is in effect?>.
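The difference between C</u> and C</a> for C<\d> can be seen in a short sketch. It assumes Perl 5.14 or later (for the modifiers themselves), and the variable names are illustrative only.

```perl
use strict;
use warnings;

my $bengali_four = "\x{09EA}";           # BENGALI DIGIT FOUR
my $mixed        = "1" . $bengali_four;  # ASCII digit, then a Bengali digit

my $u_single = $bengali_four =~ /^\d$/u  ? 1 : 0;  # Unicode rules: a digit
my $a_single = $bengali_four =~ /^\d$/a  ? 1 : 0;  # ASCII-restricted: no
my $u_mixed  = $mixed        =~ /^\d+$/u ? 1 : 0;  # mixed scripts still match
my $a_mixed  = $mixed        =~ /^\d+$/a ? 1 : 0;  # rejected under /a

print "$u_single $a_single $u_mixed $a_mixed\n";   # 1 0 1 0
```

The third result is the mixed-script case described above: under C</u>, C<\d+> accepts a run of digits drawn from two different writing systems.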
X</u> =head4 /d This modifier means to use the "Default" native rules of the platform except when there is cause to use Unicode rules instead, as follows: =over 4 =item 1 the target string is encoded in UTF-8; or =item 2 the pattern is encoded in UTF-8; or =item 3 the pattern explicitly mentions a code point that is above 255 (say by C<\x{100}>); or =item 4 the pattern uses a Unicode name (C<\N{...}>); or =item 5 the pattern uses a Unicode property (C<\p{...}> or C<\P{...}>); or =item 6 the pattern uses a Unicode break (C<\b{...}> or C<\B{...}>); or =item 7 the pattern uses L</C<(?[ ])>> =item 8 the pattern uses L<C<(*script_run: ...)>|/Script Runs> =back Another mnemonic for this modifier is "Depends", as the rules actually used depend on various things, and as a result you can get unexpected results. See L<perlunicode/The "Unicode Bug">. The Unicode Bug has become rather infamous, leading to yet another (without swearing) name for this modifier, "Dodgy". Unless the pattern or string are encoded in UTF-8, only ASCII characters can match positively. Here are some examples of how that works on an ASCII platform: $str = "\xDF"; # $str is not in UTF-8 format. $str =~ /^\w/; # No match, as $str isn't in UTF-8 format. $str .= "\x{0e0b}"; # Now $str is in UTF-8 format. $str =~ /^\w/; # Match! $str is now in UTF-8 format. chop $str; $str =~ /^\w/; # Still a match! $str remains in UTF-8 format. This modifier is automatically selected by default when none of the others are, so yet another name for it is "Default". Because of the unexpected behaviors associated with this modifier, you probably should only explicitly use it to maintain weird backward compatibilities. =head4 /a (and /aa) This modifier stands for ASCII-restrict (or ASCII-safe). This modifier may be doubled-up to increase its effect. When it appears singly, it causes the sequences C<\d>, C<\s>, C<\w>, and the Posix character classes to match only in the ASCII range. 
They thus revert to their pre-5.6, pre-Unicode meanings. Under C</a>, C<\d> always means precisely the digits C<"0"> to C<"9">; C<\s> means the five characters C<[ \f\n\r\t]>, and starting in Perl v5.18, the vertical tab; C<\w> means the 63 characters C<[A-Za-z0-9_]>; and likewise, all the Posix classes such as C<[[:print:]]> match only the appropriate ASCII-range characters. This modifier is useful for people who only incidentally use Unicode, and who do not wish to be burdened with its complexities and security concerns. With C</a>, one can write C<\d> with confidence that it will only match ASCII characters, and should the need arise to match beyond ASCII, you can instead use C<\p{Digit}> (or C<\p{Word}> for C<\w>). There are similar C<\p{...}> constructs that can match beyond ASCII both white space (see L<perlrecharclass/Whitespace>), and Posix classes (see L<perlrecharclass/POSIX Character Classes>). Thus, this modifier doesn't mean you can't use Unicode, it means that to get Unicode matching you must explicitly use a construct (C<\p{}>, C<\P{}>) that signals Unicode. As you would expect, this modifier causes, for example, C<\D> to mean the same thing as C<[^0-9]>; in fact, all non-ASCII characters match C<\D>, C<\S>, and C<\W>. C<\b> still means to match at the boundary between C<\w> and C<\W>, using the C</a> definitions of them (similarly for C<\B>). Otherwise, C</a> behaves like the C</u> modifier, in that case-insensitive matching uses Unicode rules; for example, "k" will match the Unicode C<\N{KELVIN SIGN}> under C</i> matching, and code points in the Latin1 range, above ASCII will have Unicode rules when it comes to case-insensitive matching. To forbid ASCII/non-ASCII matches (like "k" with C<\N{KELVIN SIGN}>), specify the C<"a"> twice, for example C</aai> or C</aia>. (The first occurrence of C<"a"> restricts the C<\d>, I<etc>., and the second occurrence adds the C</i> restrictions.) 
But note that code points outside the ASCII range will use Unicode rules for C</i> matching, so the modifier doesn't really restrict things to just ASCII; it just forbids the intermixing of ASCII and non-ASCII. To summarize, this modifier provides protection for applications that don't wish to be exposed to all of Unicode. Specifying it twice gives added protection. This modifier may be specified to be the default by C<use re '/a'> or C<use re '/aa'>. If you do so, you may actually have occasion to use the C</u> modifier explicitly if there are a few regular expressions where you do want full Unicode rules (but even here, it's best if everything is under feature C<"unicode_strings">, along with the C<use re '/aa'>). Also see L</Which character set modifier is in effect?>. X</a> X</aa> =head4 Which character set modifier is in effect? Which of these modifiers is in effect at any given point in a regular expression depends on a fairly complex set of interactions. These have been designed so that in general you don't have to worry about it, but this section gives the gory details. As explained below in L</Extended Patterns> it is possible to explicitly specify modifiers that apply only to portions of a regular expression. The innermost always has priority over any outer ones, and one applying to the whole expression has priority over any of the default settings that are described in the remainder of this section. The C<L<use re 'E<sol>foo'|re/"'/flags' mode">> pragma can be used to set default modifiers (including these) for regular expressions compiled within its scope. This pragma has precedence over the other pragmas listed below that also change the defaults. Otherwise, C<L<use locale|perllocale>> sets the default modifier to C</l>; and C<L<use feature 'unicode_strings'|feature>>, or C<L<use 5.012|perlfunc/use VERSION>> (or higher) set the default to C</u> when not in the same scope as either C<L<use locale|perllocale>> or C<L<use bytes|bytes>>. 
(C<L<use locale ':not_characters'|perllocale/Unicode and UTF-8>> also sets the default to C</u>, overriding any plain C<use locale>.) Unlike the mechanisms mentioned above, these affect operations besides regular expressions pattern matching, and so give more consistent results with other operators, including using C<\U>, C<\l>, I<etc>. in substitution replacements. If none of the above apply, for backwards compatibility reasons, the C</d> modifier is the one in effect by default. As this can lead to unexpected results, it is best to specify which other rule set should be used. =head4 Character set modifier behavior prior to Perl 5.14 Prior to 5.14, there were no explicit modifiers, but C</l> was implied for regexes compiled within the scope of C<use locale>, and C</d> was implied otherwise. However, interpolating a regex into a larger regex would ignore the original compilation in favor of whatever was in effect at the time of the second compilation. There were a number of inconsistencies (bugs) with the C</d> modifier, where Unicode rules would be used when inappropriate, and vice versa. C<\p{}> did not imply Unicode rules, and neither did all occurrences of C<\N{}>, until 5.12. =head2 Regular Expressions =head3 Quantifiers Quantifiers are used when a particular portion of a pattern needs to match a certain number (or numbers) of times. If there isn't a quantifier the number of times to match is exactly one. The following standard quantifiers are recognized: X<metacharacter> X<quantifier> X<*> X<+> X<?> X<{n}> X<{n,}> X<{n,m}> * Match 0 or more times + Match 1 or more times ? 
Match 1 or 0 times {n} Match exactly n times {n,} Match at least n times {n,m} Match at least n but not more than m times (If a non-escaped curly bracket occurs in a context other than one of the quantifiers listed above, where it does not form part of a backslashed sequence like C<\x{...}>, it is either a fatal syntax error, or treated as a regular character, generally with a deprecation warning raised. To escape it, you can precede it with a backslash (C<"\{">) or enclose it within square brackets (C<"[{]">). This change will allow for future syntax extensions (like making the lower bound of a quantifier optional), and better error checking of quantifiers). The C<"*"> quantifier is equivalent to C<{0,}>, the C<"+"> quantifier to C<{1,}>, and the C<"?"> quantifier to C<{0,1}>. I<n> and I<m> are limited to non-negative integral values less than a preset limit defined when perl is built. This is usually 32766 on the most common platforms. The actual limit can be seen in the error message generated by code such as this: $_ **= $_ , / {$_} / for 2 .. 42; By default, a quantified subpattern is "greedy", that is, it will match as many times as possible (given a particular starting location) while still allowing the rest of the pattern to match. If you want it to match the minimum number of times possible, follow the quantifier with a C<"?">. Note that the meanings don't change, just the "greediness": X<metacharacter> X<greedy> X<greediness> X<?> X<*?> X<+?> X<??> X<{n}?> X<{n,}?> X<{n,m}?> *? Match 0 or more times, not greedily +? Match 1 or more times, not greedily ?? Match 0 or 1 time, not greedily {n}? Match exactly n times, not greedily (redundant) {n,}? Match at least n times, not greedily {n,m}? Match at least n but not more than m times, not greedily Normally when a quantified subpattern does not allow the rest of the overall pattern to match, Perl will backtrack. However, this behaviour is sometimes undesirable. 
Thus Perl provides the "possessive" quantifier form as well. *+ Match 0 or more times and give nothing back ++ Match 1 or more times and give nothing back ?+ Match 0 or 1 time and give nothing back {n}+ Match exactly n times and give nothing back (redundant) {n,}+ Match at least n times and give nothing back {n,m}+ Match at least n but not more than m times and give nothing back For instance, 'aaaa' =~ /a++a/ will never match, as the C<a++> will gobble up all the C<"a">'s in the string and won't leave any for the remaining part of the pattern. This feature can be extremely useful to give perl hints about where it shouldn't backtrack. For instance, the typical "match a double-quoted string" problem can be most efficiently performed when written as: /"(?:[^"\\]++|\\.)*+"/ as we know that if the final quote does not match, backtracking will not help. See the independent subexpression L</C<< (?>I<pattern>) >>> for more details; possessive quantifiers are just syntactic sugar for that construct. For instance the above example could also be written as follows: /"(?>(?:(?>[^"\\]+)|\\.)*)"/ Note that the possessive quantifier modifier can not be combined with the non-greedy modifier. This is because it would make no sense. 
Consider the following equivalency table:

    Illegal         Legal
    ------------    ------
    X??+            X{0}
    X+?+            X{1}
    X{min,max}?+    X{min}

=head3 Escape sequences

Because patterns are processed as double-quoted strings, the following
also work:

 \t          tab                   (HT, TAB)
 \n          newline               (LF, NL)
 \r          return                (CR)
 \f          form feed             (FF)
 \a          alarm (bell)          (BEL)
 \e          escape (think troff)  (ESC)
 \cK         control char          (example: VT)
 \x{}, \x00  character whose ordinal is the given hexadecimal number
 \N{name}    named Unicode character or character sequence
 \N{U+263D}  Unicode character     (example: FIRST QUARTER MOON)
 \o{}, \000  character whose ordinal is the given octal number
 \l          lowercase next char (think vi)
 \u          uppercase next char (think vi)
 \L          lowercase until \E (think vi)
 \U          uppercase until \E (think vi)
 \Q          quote (disable) pattern metacharacters until \E
 \E          end either case modification or quoted section, think vi

Details are in L<perlop/Quote and Quote-like Operators>.

=head3 Character Classes and other Special Escapes

In addition, Perl defines the following:
X<\g> X<\k> X<\K> X<backreference>

 Sequence   Note Description
  [...]     [1]  Match a character according to the rules of the
                   bracketed character class defined by the "...".
                   Example: [a-z] matches "a" or "b" or "c" ... or "z"
  [[:...:]] [2]  Match a character according to the rules of the POSIX
                   character class "..." within the outer bracketed
                   character class.  Example: [[:upper:]] matches any
                   uppercase character.
  (?[...])  [8]  Extended bracketed character class
  \w        [3]  Match a "word" character (alphanumeric plus "_", plus
                   other connector punctuation chars plus Unicode
                   marks)
  \W        [3]  Match a non-"word" character
  \s        [3]  Match a whitespace character
  \S        [3]  Match a non-whitespace character
  \d        [3]  Match a decimal digit character
  \D        [3]  Match a non-digit character
  \pP       [3]  Match P, named property.  Use \p{Prop} for longer names
  \PP       [3]  Match non-P
  \X        [4]  Match Unicode "eXtended grapheme cluster"
  \1        [5]  Backreference to a specific capture group or buffer.
                   '1' may actually be any positive integer.
  \g1       [5]  Backreference to a specific or previous group,
  \g{-1}    [5]  The number may be negative indicating a relative
                   previous group and may optionally be wrapped in
                   curly brackets for safer parsing.
  \g{name}  [5]  Named backreference
  \k<name>  [5]  Named backreference
  \K        [6]  Keep the stuff left of the \K, don't include it in $&
  \N        [7]  Any character but \n.  Not affected by /s modifier
  \v        [3]  Vertical whitespace
  \V        [3]  Not vertical whitespace
  \h        [3]  Horizontal whitespace
  \H        [3]  Not horizontal whitespace
  \R        [4]  Linebreak

=over 4

=item [1]

See L<perlrecharclass/Bracketed Character Classes> for details.

=item [2]

See L<perlrecharclass/POSIX Character Classes> for details.

=item [3]

See L<perlunicode/Unicode Character Properties> for details.

=item [4]

See L<perlrebackslash/Misc> for details.

=item [5]

See L</Capture groups> below for details.

=item [6]

See L</Extended Patterns> below for details.

=item [7]

Note that C<\N> has two meanings.  When of the form C<\N{I<NAME>}>, it
matches the character or character sequence whose name is I<NAME>; and
similarly when of the form C<\N{U+I<hex>}>, it matches the character
whose Unicode code point is I<hex>.  Otherwise it matches any character
but C<\n>.

=item [8]

See L<perlrecharclass/Extended Bracketed Character Classes> for details.

=back

=head3 Assertions

Besides L<C<"^"> and C<"$">|/Metacharacters>, Perl defines the following
zero-width assertions:
X<zero-width assertion> X<assertion> X<regex, zero-width assertion>
X<regexp, zero-width assertion>
X<regular expression, zero-width assertion>
X<\b> X<\B> X<\A> X<\Z> X<\z> X<\G>

 \b{}   Match at Unicode boundary of specified type
 \B{}   Match where corresponding \b{} doesn't match
 \b     Match a \w\W or \W\w boundary
 \B     Match except at a \w\W or \W\w boundary
 \A     Match only at beginning of string
 \Z     Match only at end of string, or before newline at the end
 \z     Match only at end of string
 \G     Match only at pos() (e.g. at the end-of-match position
        of prior m//g)

A Unicode boundary (C<\b{}>), available starting in v5.22, is a spot
between two characters, or before the first character in the string, or
after the final character in the string where certain criteria defined
by Unicode are met.  See L<perlrebackslash/\b{}, \b, \B{}, \B> for
details.

A word boundary (C<\b>) is a spot between two characters
that has a C<\w> on one side of it and a C<\W> on the other side
of it (in either order), counting the imaginary characters off the
beginning and end of the string as matching a C<\W>.  (Within
character classes C<\b> represents backspace rather than a word
boundary, just as it normally does in any double-quoted string.)
The C<\A> and C<\Z> are just like C<"^"> and C<"$">, except that they
won't match multiple times when the C</m> modifier is used, while
C<"^"> and C<"$"> will match at every internal line boundary.  To match
the actual end of the string and not ignore an optional trailing
newline, use C<\z>.
X<\b> X<\A> X<\Z> X<\z> X</m>

The C<\G> assertion can be used to chain global matches (using
C<m//g>), as described in L<perlop/"Regexp Quote-Like Operators">.
It is also useful when writing C<lex>-like scanners, when you have
several patterns that you want to match against successive substrings
of your string; see the previous reference.  The actual location
where C<\G> will match can also be influenced by using C<pos()> as
an lvalue: see L<perlfunc/pos>.  Note that the rule for zero-length
matches (see L</"Repeated Patterns Matching a Zero-length Substring">)
is modified somewhat, in that contents to the left of C<\G> are
not counted when determining the length of the match.  Thus the
following will not match forever:
X<\G>

     my $string = 'ABC';
     pos($string) = 1;
     while ($string =~ /(.\G)/g) {
         print $1;
     }

It will print 'A' and then terminate, as it considers the match to
be zero-width, and thus will not match at the same position twice in a
row.

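The C<lex>-like scanner use of C<\G> mentioned above might look like the following minimal sketch. The token names and the input string are made up for illustration. It relies on the C</c> modifier, documented in L<perlop>, which keeps C<pos()> unchanged when a match fails.

```perl
use strict;
use warnings;

my $src = 'x = 42 + y1';    # hypothetical input
my @tokens;

pos($src) = 0;
while (pos($src) < length $src) {
    # Each alternative is anchored with \G at the current position;
    # /c preserves pos() on failure so the next alternative can try.
    if    ($src =~ /\G\s+/gc)             { next }
    elsif ($src =~ /\G(\d+)/gc)           { push @tokens, "NUM:$1" }
    elsif ($src =~ /\G([A-Za-z_]\w*)/gc)  { push @tokens, "ID:$1" }
    elsif ($src =~ /\G([=+])/gc)          { push @tokens, "OP:$1" }
    else { die "unexpected character at offset " . pos($src) }
}

print join(' ', @tokens), "\n";
```

Every alternative consumes at least one character, so the loop always advances and cannot spin at a single position.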
It is worth noting that C<\G> improperly used can result in an infinite loop. Take care when using patterns that include C<\G> in an alternation. Note also that C<s///> will refuse to overwrite part of a substitution that has already been replaced; so for example this will stop after the first iteration, rather than iterating its way backwards through the string: $_ = "123456789"; pos = 6; s/.(?=.\G)/X/g; print; # prints 1234X6789, not XXXXX6789 =head3 Capture groups The grouping construct C<( ... )> creates capture groups (also referred to as capture buffers). To refer to the current contents of a group later on, within the same pattern, use C<\g1> (or C<\g{1}>) for the first, C<\g2> (or C<\g{2}>) for the second, and so on. This is called a I<backreference>. X<regex, capture buffer> X<regexp, capture buffer> X<regex, capture group> X<regexp, capture group> X<regular expression, capture buffer> X<backreference> X<regular expression, capture group> X<backreference> X<\g{1}> X<\g{-1}> X<\g{name}> X<relative backreference> X<named backreference> X<named capture buffer> X<regular expression, named capture buffer> X<named capture group> X<regular expression, named capture group> X<%+> X<$+{name}> X<< \k<name> >> There is no limit to the number of captured substrings that you may use. Groups are numbered with the leftmost open parenthesis being number 1, I<etc>. If a group did not match, the associated backreference won't match either. (This can happen if the group is optional, or in a different branch of an alternation.) You can omit the C<"g">, and write C<"\1">, I<etc>, but there are some issues with this form, described below. You can also refer to capture groups relatively, by using a negative number, so that C<\g-1> and C<\g{-1}> both refer to the immediately preceding capture group, and C<\g-2> and C<\g{-2}> both refer to the group before it. 
For example: / (Y) # group 1 ( # group 2 (X) # group 3 \g{-1} # backref to group 3 \g{-3} # backref to group 1 ) /x would match the same as C</(Y) ( (X) \g3 \g1 )/x>. This allows you to interpolate regexes into larger regexes and not have to worry about the capture groups being renumbered. You can dispense with numbers altogether and create named capture groups. The notation is C<(?E<lt>I<name>E<gt>...)> to declare and C<\g{I<name>}> to reference. (To be compatible with .Net regular expressions, C<\g{I<name>}> may also be written as C<\k{I<name>}>, C<\kE<lt>I<name>E<gt>> or C<\k'I<name>'>.) I<name> must not begin with a number, nor contain hyphens. When different groups within the same pattern have the same name, any reference to that name assumes the leftmost defined group. Named groups count in absolute and relative numbering, and so can also be referred to by those numbers. (It's possible to do things with named capture groups that would otherwise require C<(??{})>.) Capture group contents are dynamically scoped and available to you outside the pattern until the end of the enclosing block or until the next successful match, whichever comes first. (See L<perlsyn/"Compound Statements">.) You can refer to them by absolute number (using C<"$1"> instead of C<"\g1">, I<etc>); or by name via the C<%+> hash, using C<"$+{I<name>}">. Braces are required in referring to named capture groups, but are optional for absolute or relative numbered ones. Braces are safer when creating a regex by concatenating smaller strings. For example if you have C<qr/$a$b/>, and C<$a> contained C<"\g1">, and C<$b> contained C<"37">, you would get C</\g137/> which is probably not what you intended. The C<\g> and C<\k> notations were introduced in Perl 5.10.0. Prior to that there were no named nor relative numbered capture groups. Absolute numbered groups were referred to using C<\1>, C<\2>, I<etc>., and this notation is still accepted (and likely always will be). 
But it leads to some ambiguities if there are more than 9 capture groups, as C<\10> could mean either the tenth capture group, or the character whose ordinal in octal is 010 (a backspace in ASCII). Perl resolves this ambiguity by interpreting C<\10> as a backreference only if at least 10 left parentheses have opened before it. Likewise C<\11> is a backreference only if at least 11 left parentheses have opened before it. And so on. C<\1> through C<\9> are always interpreted as backreferences. There are several examples below that illustrate these perils. You can avoid the ambiguity by always using C<\g{}> or C<\g> if you mean capturing groups; and for octal constants always using C<\o{}>, or for C<\077> and below, using 3 digits padded with leading zeros, since a leading zero implies an octal constant. The C<\I<digit>> notation also works in certain circumstances outside the pattern. See L</Warning on \1 Instead of $1> below for details. Examples: s/^([^ ]*) *([^ ]*)/$2 $1/; # swap first two words /(.)\g1/ # find first doubled char and print "'$1' is the first doubled character\n"; /(?<char>.)\k<char>/ # ... a different way and print "'$+{char}' is the first doubled character\n"; /(?'char'.)\g1/ # ... mix and match and print "'$1' is the first doubled character\n"; if (/Time: (..):(..):(..)/) { # parse out values $hours = $1; $minutes = $2; $seconds = $3; } /(.)(.)(.)(.)(.)(.)(.)(.)(.)\g10/ # \g10 is a backreference /(.)(.)(.)(.)(.)(.)(.)(.)(.)\10/ # \10 is octal /((.)(.)(.)(.)(.)(.)(.)(.)(.))\10/ # \10 is a backreference /((.)(.)(.)(.)(.)(.)(.)(.)(.))\010/ # \010 is octal $a = '(.)\1'; # Creates problems when concatenated. $b = '(.)\g{1}'; # Avoids the problems. "aa" =~ /${a}/; # True "aa" =~ /${b}/; # True "aa0" =~ /${a}0/; # False! "aa0" =~ /${b}0/; # True "aa\x08" =~ /${a}0/; # True! "aa\x08" =~ /${b}0/; # False Several special variables also refer back to portions of the previous match. C<$+> returns whatever the last bracket match matched. 
C<$&> returns the entire matched string. (At one point C<$0> did also, but now it returns the name of the program.) C<$`> returns everything before the matched string. C<$'> returns everything after the matched string. And C<$^N> contains whatever was matched by the most-recently closed group (submatch). C<$^N> can be used in extended patterns (see below), for example to assign a submatch to a variable. X<$+> X<$^N> X<$&> X<$`> X<$'> These special variables, like the C<%+> hash and the numbered match variables (C<$1>, C<$2>, C<$3>, I<etc>.) are dynamically scoped until the end of the enclosing block or until the next successful match, whichever comes first. (See L<perlsyn/"Compound Statements">.) X<$+> X<$^N> X<$&> X<$`> X<$'> X<$1> X<$2> X<$3> X<$4> X<$5> X<$6> X<$7> X<$8> X<$9> B<NOTE>: Failed matches in Perl do not reset the match variables, which makes it easier to write code that tests for a series of more specific cases and remembers the best match. B<WARNING>: If your code is to run on Perl 5.16 or earlier, beware that once Perl sees that you need one of C<$&>, C<$`>, or C<$'> anywhere in the program, it has to provide them for every pattern match. This may substantially slow your program. Perl uses the same mechanism to produce C<$1>, C<$2>, I<etc>, so you also pay a price for each pattern that contains capturing parentheses. (To avoid this cost while retaining the grouping behaviour, use the extended regular expression C<(?: ... )> instead.) But if you never use C<$&>, C<$`> or C<$'>, then patterns I<without> capturing parentheses will not be penalized. So avoid C<$&>, C<$'>, and C<$`> if you can, but if you can't (and some algorithms really appreciate them), once you've used them once, use them at will, because you've already paid the price. X<$&> X<$`> X<$'> Perl 5.16 introduced a slightly more efficient mechanism that notes separately whether each of C<$`>, C<$&>, and C<$'> have been seen, and thus may only need to copy part of the string. 
Perl 5.20 introduced a much more efficient copy-on-write mechanism which eliminates any slowdown. As another workaround for this problem, Perl 5.10.0 introduced C<${^PREMATCH}>, C<${^MATCH}> and C<${^POSTMATCH}>, which are equivalent to C<$`>, C<$&> and C<$'>, B<except> that they are only guaranteed to be defined after a successful match that was executed with the C</p> (preserve) modifier. The use of these variables incurs no global performance penalty, unlike their punctuation character equivalents, however at the trade-off that you have to tell perl when you want to use them. As of Perl 5.20, these three variables are equivalent to C<$`>, C<$&> and C<$'>, and C</p> is ignored. X</p> X<p modifier> =head2 Quoting metacharacters Backslashed metacharacters in Perl are alphanumeric, such as C<\b>, C<\w>, C<\n>. Unlike some other regular expression languages, there are no backslashed symbols that aren't alphanumeric. So anything that looks like C<\\>, C<\(>, C<\)>, C<\[>, C<\]>, C<\{>, or C<\}> is always interpreted as a literal character, not a metacharacter. This was once used in a common idiom to disable or quote the special meanings of regular expression metacharacters in a string that you want to use for a pattern. Simply quote all non-"word" characters: $pattern =~ s/(\W)/\\$1/g; (If C<use locale> is set, then this depends on the current locale.) Today it is more common to use the C<L<quotemeta()|perlfunc/quotemeta>> function or the C<\Q> metaquoting escape sequence to disable all metacharacters' special meanings like this: /$unquoted\Q$quoted\E$unquoted/ Beware that if you put literal backslashes (those not inside interpolated variables) between C<\Q> and C<\E>, double-quotish backslash interpolation may lead to confusing results. If you I<need> to use literal backslashes within C<\Q...\E>, consult L<perlop/"Gory details of parsing quoted constructs">. C<quotemeta()> and C<\Q> are fully described in L<perlfunc/quotemeta>. 
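As a small illustration of why quoting matters, the sketch below compares interpolating a string directly with interpolating it under C<\Q...\E>. The strings are made up for the example.

```perl
use strict;
use warnings;

my $literal = 'a.b*c';                  # text to be matched literally
my $text    = 'xx a.b*c yy aXbbc';

# Interpolated as-is, "." and "*" act as metacharacters.
my @loose  = $text =~ /$literal/g;

# With \Q...\E the same string matches only itself.
my @strict = $text =~ /\Q$literal\E/g;

# quotemeta() yields the escaped form for later use.
my $quoted = quotemeta $literal;

print "loose:  @loose\n";
print "strict: @strict\n";
print "quoted: $quoted\n";
```

Here the unquoted pattern misses the literal text altogether, because C<b*> followed by C<c> cannot match C<b*c>, and it matches the unrelated C<aXbbc> instead.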

=head2 Extended Patterns

Perl also defines a consistent extension syntax for features not
found in standard tools like B<awk> and
B<lex>.  The syntax for most of these is a pair of parentheses with a
question mark as the first thing within the parentheses.  The character
after the question mark indicates the extension.

A question mark was chosen for this and for the minimal-matching
construct because 1) question marks are rare in older regular
expressions, and 2) whenever you see one, you should stop and
"question" exactly what is going on.  That's psychology....

=over 4

=item C<(?#I<text>)>
X<(?#)>

A comment.  The I<text> is ignored.
Note that Perl closes
the comment as soon as it sees a C<")">, so there is no way to put a literal
C<")"> in the comment.  The pattern's closing delimiter must be escaped by
a backslash if it appears in the comment.

See L</E<sol>x> for another way to have comments in patterns.

Note that a comment can go just about anywhere, except in the middle of
an escape sequence.  Examples:

 qr/foo(?#comment)bar/  # Matches 'foobar'

 # The pattern below matches 'abcd', 'abccd', or 'abcccd'
 qr/abc(?#comment between literal and its quantifier){1,3}d/

 # The pattern below generates a syntax error, because the '\p' must
 # be followed immediately by a '{'.
 qr/\p(?#comment between \p and its property name){Any}/

 # The pattern below generates a syntax error, because the initial
 # '\(' is a literal opening parenthesis, and so there is nothing
 # for the closing ')' to match
 qr/\(?#the backslash means this isn't a comment)p{Any}/

 # Comments can be used to fold long patterns into multiple lines
 qr/First part of a long regex(?#
   )remaining part/

=item C<(?adlupimnsx-imnsx)>

=item C<(?^alupimnsx)>
X<(?)> X<(?^)>

Zero or more embedded pattern-match modifiers, to be turned on (or
turned off if preceded by C<"-">) for the remainder of the pattern or
the remainder of the enclosing pattern group (if any).

This is particularly useful for dynamically-generated patterns, such as those read in from a configuration file, taken from an argument, or specified in a table somewhere. Consider the case where some patterns want to be case-sensitive and some do not: The case-insensitive ones merely need to include C<(?i)> at the front of the pattern. For example: $pattern = "foobar"; if ( /$pattern/i ) { } # more flexible: $pattern = "(?i)foobar"; if ( /$pattern/ ) { } These modifiers are restored at the end of the enclosing group. For example, ( (?i) blah ) \s+ \g1 will match C<blah> in any case, some spaces, and an exact (I<including the case>!) repetition of the previous word, assuming the C</x> modifier, and no C</i> modifier outside this group. These modifiers do not carry over into named subpatterns called in the enclosing group. In other words, a pattern such as C<((?i)(?&I<NAME>))> does not change the case-sensitivity of the I<NAME> pattern. A modifier is overridden by later occurrences of this construct in the same scope containing the same modifier, so that /((?im)foo(?-m)bar)/ matches all of C<foobar> case insensitively, but uses C</m> rules for only the C<foo> portion. The C<"a"> flag overrides C<aa> as well; likewise C<aa> overrides C<"a">. The same goes for C<"x"> and C<xx>. Hence, in /(?-x)foo/xx both C</x> and C</xx> are turned off during matching C<foo>. And in /(?x)foo/x C</x> but NOT C</xx> is turned on for matching C<foo>. (One might mistakenly think that since the inner C<(?x)> is already in the scope of C</x>, that the result would effectively be the sum of them, yielding C</xx>. It doesn't work that way.) Similarly, doing something like C<(?xx-x)foo> turns off all C<"x"> behavior for matching C<foo>, it is not that you subtract 1 C<"x"> from 2 to get 1 C<"x"> remaining. Any of these modifiers can be set to apply globally to all regular expressions compiled within the scope of a C<use re>. See L<re/"'/flags' mode">. 
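The two points above, embedding C<(?i)> in patterns that arrive as data and the restoring of modifiers at the end of the enclosing group, can be sketched as follows. The rule table and log lines are hypothetical stand-ins for patterns read from a configuration file.

```perl
use strict;
use warnings;

# Hypothetical rules, as might be read from a configuration file.
my @rules = (
    '(?i)error',    # this rule asks for case-insensitive matching
    'WARN',         # this one stays case-sensitive
);
my @lines = ('Error: disk full', 'warn: low battery', 'WARN: fan stopped');

my @hits;
for my $line (@lines) {
    # The matching code needs no per-rule flag handling.
    push @hits, $line if grep { $line =~ /$_/ } @rules;
}

# (?i) is in effect only until the end of the enclosing group, so the
# backreference outside the group must repeat the exact case.
my $re         = qr/ ( (?i) blah ) \s+ \g1 /x;
my $same_case  = 'BLAH BLAH' =~ $re ? 1 : 0;
my $mixed_case = 'BLAH blah' =~ $re ? 1 : 0;

print scalar(@hits), " matching lines\n";
print "same case: $same_case, mixed case: $mixed_case\n";
```

Patterns taken from outside the program can run arbitrary constructs, so this approach is best limited to trusted configuration sources.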
Starting in Perl 5.14, a C<"^"> (caret or circumflex accent) immediately after the C<"?"> is a shorthand equivalent to C<d-imnsx>. Flags (except C<"d">) may follow the caret to override it. But a minus sign is not legal with it. Note that the C<"a">, C<"d">, C<"l">, C<"p">, and C<"u"> modifiers are special in that they can only be enabled, not disabled, and the C<"a">, C<"d">, C<"l">, and C<"u"> modifiers are mutually exclusive: specifying one de-specifies the others, and a maximum of one (or two C<"a">'s) may appear in the construct. Thus, for example, C<(?-p)> will warn when compiled under C<use warnings>; C<(?-d:...)> and C<(?dl:...)> are fatal errors. Note also that the C<"p"> modifier is special in that its presence anywhere in a pattern has a global effect. Having zero modifiers makes this a no-op (so why did you specify it, unless it's generated code), and starting in v5.30, warns under L<C<use re 'strict'>|re/'strict' mode>. =item C<(?:I<pattern>)> X<(?:)> =item C<(?adluimnsx-imnsx:I<pattern>)> =item C<(?^aluimnsx:I<pattern>)> X<(?^:)> This is for clustering, not capturing; it groups subexpressions like C<"()">, but doesn't make backreferences as C<"()"> does. So @fields = split(/\b(?:a|b|c)\b/) matches the same field delimiters as @fields = split(/\b(a|b|c)\b/) but doesn't spit out the delimiters themselves as extra fields (even though that's the behaviour of L<perlfunc/split> when its pattern contains capturing groups). It's also cheaper not to capture characters if you don't need to. Any letters between C<"?"> and C<":"> act as flags modifiers as with C<(?adluimnsx-imnsx)>. For example, /(?s-i:more.*than).*million/i is equivalent to the more verbose /(?:(?s-i)more.*than).*million/i Note that any C<()> constructs enclosed within this one will still capture unless the C</n> modifier is in effect. Like the L</(?adlupimnsx-imnsx)> construct, C<aa> and C<"a"> override each other, as do C<xx> and C<"x">. They are not additive. 
So, doing something like C<(?xx-x:foo)> turns off all C<"x"> behavior for matching C<foo>. Starting in Perl 5.14, a C<"^"> (caret or circumflex accent) immediately after the C<"?"> is a shorthand equivalent to C<d-imnsx>. Any positive flags (except C<"d">) may follow the caret, so (?^x:foo) is equivalent to (?x-imns:foo) The caret tells Perl that this cluster doesn't inherit the flags of any surrounding pattern, but uses the system defaults (C<d-imnsx>), modified by any flags specified. The caret allows for simpler stringification of compiled regular expressions. These look like (?^:pattern) with any non-default flags appearing between the caret and the colon. A test that looks at such stringification thus doesn't need to have the system default flags hard-coded in it, just the caret. If new flags are added to Perl, the meaning of the caret's expansion will change to include the default for those flags, so the test will still work, unchanged. Specifying a negative flag after the caret is an error, as the flag is redundant. Mnemonic for C<(?^...)>: A fresh beginning since the usual use of a caret is to match at the beginning. =item C<(?|I<pattern>)> X<(?|)> X<Branch reset> This is the "branch reset" pattern, which has the special property that the capture groups are numbered from the same starting point in each alternation branch. It is available starting from perl 5.10.0. Capture groups are numbered from left to right, but inside this construct the numbering is restarted for each branch. The numbering within each branch will be as normal, and any groups following this construct will be numbered as though the construct contained only one branch, that being the one with the most capture groups in it. This construct is useful when you want to capture one of a number of alternative matches. Consider the following pattern. The numbers underneath show in which group the captured content will be stored. 

    # before  ---------------branch-reset----------- after
    / ( a )  (?| x ( y ) z | (p (q) r) | (t) u (v) ) ( z ) /x
    # 1            2         2  3        2     3     4

Be careful when using the branch reset pattern in combination with
named captures. Named captures are implemented as being aliases to
numbered groups holding the captures, and that interferes with the
implementation of the branch reset pattern. If you are using named
captures in a branch reset pattern, it's best to use the same names,
in the same order, in each of the alternations:

   /(?|  (?<a> x ) (?<b> y )
      |  (?<a> z ) (?<b> w )) /x

Not doing so may lead to surprises:

  "12" =~ /(?| (?<a> \d+ ) | (?<b> \D+))/x;
  say $+{a};    # Prints '12'
  say $+{b};    # *Also* prints '12'.

The problem here is that both the group named C<< a >> and the group
named C<< b >> are aliases for the group belonging to C<< $1 >>.

=item Lookaround Assertions
X<look-around assertion> X<lookaround assertion> X<look-around> X<lookaround>

Lookaround assertions are zero-width patterns which match a specific
pattern without including it in C<$&>. Positive assertions match when
their subpattern matches, negative assertions match when their subpattern
fails. Lookbehind matches text up to the current match position,
lookahead matches text following the current match position.

=over 4

=item C<(?=I<pattern>)>

=item C<(*pla:I<pattern>)>

=item C<(*positive_lookahead:I<pattern>)>
X<(?=)>
X<(*pla>
X<(*positive_lookahead>
X<look-ahead, positive> X<lookahead, positive>

A zero-width positive lookahead assertion.  For example, C</\w+(?=\t)/>
matches a word followed by a tab, without including the tab in C<$&>.

=item C<(?!I<pattern>)>

=item C<(*nla:I<pattern>)>

=item C<(*negative_lookahead:I<pattern>)>
X<(?!)>
X<(*nla>
X<(*negative_lookahead>
X<look-ahead, negative> X<lookahead, negative>

A zero-width negative lookahead assertion.  For example C</foo(?!bar)/>
matches any occurrence of "foo" that isn't followed by "bar".  Note
however that lookahead and lookbehind are NOT the same thing.
You cannot use this for lookbehind. If you are looking for a "bar" that isn't preceded by a "foo", C</(?!foo)bar/> will not do what you want. That's because the C<(?!foo)> is just saying that the next thing cannot be "foo"--and it's not, it's a "bar", so "foobar" will match. Use lookbehind instead (see below). =item C<(?<=I<pattern>)> =item C<\K> =item C<(*plb:I<pattern>)> =item C<(*positive_lookbehind:I<pattern>)> X<(?<=)> X<(*plb> X<(*positive_lookbehind> X<look-behind, positive> X<lookbehind, positive> X<\K> A zero-width positive lookbehind assertion. For example, C</(?<=\t)\w+/> matches a word that follows a tab, without including the tab in C<$&>. Prior to Perl 5.30, it worked only for fixed-width lookbehind, but starting in that release, it can handle variable lengths from 1 to 255 characters as an experimental feature. The feature is enabled automatically if you use a variable length lookbehind assertion, but will raise a warning at pattern compilation time, unless turned off, in the C<experimental::vlb> category. This is to warn you that the exact behavior is subject to change should feedback from actual use in the field indicate to do so; or even complete removal if the problems found are not practically surmountable. You can achieve close to pre-5.30 behavior by fatalizing warnings in this category. There is a special form of this construct, called C<\K> (available since Perl 5.10.0), which causes the regex engine to "keep" everything it had matched prior to the C<\K> and not include it in C<$&>. This effectively provides non-experimental variable-length lookbehind of any length. And, there is a technique that can be used to handle variable length lookbehinds on earlier releases, and longer than 255 characters. It is described in L<http://www.drregex.com/2019/02/variable-length-lookbehinds-actually.html>. Note that under C</i>, a few single characters match two or three other characters. 
This makes them variable length, and the 255 length applies to the maximum number of characters in the match. For example C<qr/\N{LATIN SMALL LETTER SHARP S}/i> matches the sequence C<"ss">. Your lookbehind assertion could contain 127 Sharp S characters under C</i>, but adding a 128th would generate a compilation error, as that could match 256 C<"s"> characters in a row. The use of C<\K> inside of another lookaround assertion is allowed, but the behaviour is currently not well defined. For various reasons C<\K> may be significantly more efficient than the equivalent C<< (?<=...) >> construct, and it is especially useful in situations where you want to efficiently remove something following something else in a string. For instance s/(foo)bar/$1/g; can be rewritten as the much more efficient s/foo\Kbar//g; Use of the non-greedy modifier C<"?"> may not give you the expected results if it is within a capturing group within the construct. =item C<(?<!I<pattern>)> =item C<(*nlb:I<pattern>)> =item C<(*negative_lookbehind:I<pattern>)> X<(?<!)> X<(*nlb> X<(*negative_lookbehind> X<look-behind, negative> X<lookbehind, negative> A zero-width negative lookbehind assertion. For example C</(?<!bar)foo/> matches any occurrence of "foo" that does not follow "bar". Prior to Perl 5.30, it worked only for fixed-width lookbehind, but starting in that release, it can handle variable lengths from 1 to 255 characters as an experimental feature. The feature is enabled automatically if you use a variable length lookbehind assertion, but will raise a warning at pattern compilation time, unless turned off, in the C<experimental::vlb> category. This is to warn you that the exact behavior is subject to change should feedback from actual use in the field indicate to do so; or even complete removal if the problems found are not practically surmountable. You can achieve close to pre-5.30 behavior by fatalizing warnings in this category. 
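The difference between lookahead and lookbehind, and the use of C<\K> as a stand-in for positive lookbehind, can be seen in a few fixed-width cases. This is a minimal sketch; the sample strings and variable names are invented for illustration, and it deliberately avoids variable-length lookbehind so that it raises no C<experimental::vlb> warning.

```perl
use strict;
use warnings;

# (?!foo)bar only asserts that the text at the "bar" position is not
# "foo", which is always true, so it does not filter out "foobar".
my $lookahead_hit  = "foobar" =~ /(?!foo)bar/  ? 1 : 0;

# (?<!foo)bar inspects the text before the match position instead.
my $lookbehind_hit = "foobar" =~ /(?<!foo)bar/ ? 1 : 0;
my $plain_hit      = "a bar"  =~ /(?<!foo)bar/ ? 1 : 0;

# \K drops everything matched so far from $&, so only "bar" is replaced.
(my $trimmed = "foobar foobaz") =~ s/foo\Kbar//g;

print "lookahead=$lookahead_hit lookbehind=$lookbehind_hit ",
      "plain=$plain_hit trimmed=<$trimmed>\n";
# prints "lookahead=1 lookbehind=0 plain=1 trimmed=<foo foobaz>"
```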
There is a technique that can be used to handle variable length lookbehinds on earlier releases, and longer than 255 characters. It is described in L<http://www.drregex.com/2019/02/variable-length-lookbehinds-actually.html>. Note that under C</i>, a few single characters match two or three other characters. This makes them variable length, and the 255 length applies to the maximum number of characters in the match. For example C<qr/\N{LATIN SMALL LETTER SHARP S}/i> matches the sequence C<"ss">. Your lookbehind assertion could contain 127 Sharp S characters under C</i>, but adding a 128th would generate a compilation error, as that could match 256 C<"s"> characters in a row. Use of the non-greedy modifier C<"?"> may not give you the expected results if it is within a capturing group within the construct. =back =item C<< (?<I<NAME>>I<pattern>) >> =item C<(?'I<NAME>'I<pattern>)> X<< (?<NAME>) >> X<(?'NAME')> X<named capture> X<capture> A named capture group. Identical in every respect to normal capturing parentheses C<()> but for the additional fact that the group can be referred to by name in various regular expression constructs (like C<\g{I<NAME>}>) and can be accessed by name after a successful match via C<%+> or C<%->. See L<perlvar> for more details on the C<%+> and C<%-> hashes. If multiple distinct capture groups have the same name, then C<$+{I<NAME>}> will refer to the leftmost defined group in the match. The forms C<(?'I<NAME>'I<pattern>)> and C<< (?<I<NAME>>I<pattern>) >> are equivalent. B<NOTE:> While the notation of this construct is the same as the similar function in .NET regexes, the behavior is not. In Perl the groups are numbered sequentially regardless of being named or not. Thus in the pattern /(x)(?<foo>y)(z)/ C<$+{foo}> will be the same as C<$2>, and C<$3> will contain 'z' instead of the opposite which is what a .NET regex hacker might expect. Currently I<NAME> is restricted to simple identifiers only. 
In other words, it must match C</^[_A-Za-z][_A-Za-z0-9]*\z/> or its Unicode extension (see L<utf8>), though it isn't extended by the locale (see L<perllocale>). B<NOTE:> In order to make things easier for programmers with experience with the Python or PCRE regex engines, the pattern C<< (?PE<lt>I<NAME>E<gt>I<pattern>) >> may be used instead of C<< (?<I<NAME>>I<pattern>) >>; however this form does not support the use of single quotes as a delimiter for the name. =item C<< \k<I<NAME>> >> =item C<< \k'I<NAME>' >> Named backreference. Similar to numeric backreferences, except that the group is designated by name and not number. If multiple groups have the same name then it refers to the leftmost defined group in the current match. It is an error to refer to a name not defined by a C<< (?<I<NAME>>) >> earlier in the pattern. Both forms are equivalent. B<NOTE:> In order to make things easier for programmers with experience with the Python or PCRE regex engines, the pattern C<< (?P=I<NAME>) >> may be used instead of C<< \k<I<NAME>> >>. =item C<(?{ I<code> })> X<(?{})> X<regex, code in> X<regexp, code in> X<regular expression, code in> B<WARNING>: Using this feature safely requires that you understand its limitations. Code executed that has side effects may not perform identically from version to version due to the effect of future optimisations in the regex engine. For more information on this, see L</Embedded Code Execution Frequency>. This zero-width assertion executes any embedded Perl code. It always succeeds, and its return value is set as C<$^R>. In literal patterns, the code is parsed at the same time as the surrounding code. While within the pattern, control is passed temporarily back to the perl parser, until the logically-balancing closing brace is encountered. 
This is similar to the way that an array index expression in a literal string is handled, for example "abc$array[ 1 + f('[') + g()]def" In particular, braces do not need to be balanced: s/abc(?{ f('{'); })/def/ Even in a pattern that is interpolated and compiled at run-time, literal code blocks will be compiled once, at perl compile time; the following prints "ABCD": print "D"; my $qr = qr/(?{ BEGIN { print "A" } })/; my $foo = "foo"; /$foo$qr(?{ BEGIN { print "B" } })/; BEGIN { print "C" } In patterns where the text of the code is derived from run-time information rather than appearing literally in a source code /pattern/, the code is compiled at the same time that the pattern is compiled, and for reasons of security, C<use re 'eval'> must be in scope. This is to stop user-supplied patterns containing code snippets from being executable. In situations where you need to enable this with C<use re 'eval'>, you should also have taint checking enabled. Better yet, use the carefully constrained evaluation within a Safe compartment. See L<perlsec> for details about both these mechanisms. From the viewpoint of parsing, lexical variable scope and closures, /AAA(?{ BBB })CCC/ behaves approximately like /AAA/ && do { BBB } && /CCC/ Similarly, qr/AAA(?{ BBB })CCC/ behaves approximately like sub { /AAA/ && do { BBB } && /CCC/ } In particular: { my $i = 1; $r = qr/(?{ print $i })/ } my $i = 2; /$r/; # prints "1" Inside a C<(?{...})> block, C<$_> refers to the string the regular expression is matching against. You can also use C<pos()> to know what is the current position of matching within this string. The code block introduces a new scope from the perspective of lexical variable declarations, but B<not> from the perspective of C<local> and similar localizing behaviours. So later code blocks within the same pattern will still see the values which were localized in earlier blocks. 
These accumulated localizations are undone either at the end of a successful match, or if the assertion is backtracked (compare L</"Backtracking">). For example, $_ = 'a' x 8; m< (?{ $cnt = 0 }) # Initialize $cnt. ( a (?{ local $cnt = $cnt + 1; # Update $cnt, # backtracking-safe. }) )* aaaa (?{ $res = $cnt }) # On success copy to # non-localized location. >x; will initially increment C<$cnt> up to 8; then during backtracking, its value will be unwound back to 4, which is the value assigned to C<$res>. At the end of the regex execution, C<$cnt> will be wound back to its initial value of 0. This assertion may be used as the condition in a (?(condition)yes-pattern|no-pattern) switch. If I<not> used in this way, the result of evaluation of I<code> is put into the special variable C<$^R>. This happens immediately, so C<$^R> can be used from other C<(?{ I<code> })> assertions inside the same regular expression. The assignment to C<$^R> above is properly localized, so the old value of C<$^R> is restored if the assertion is backtracked; compare L</"Backtracking">. Note that the special variable C<$^N> is particularly useful with code blocks to capture the results of submatches in variables without having to keep track of the number of nested parentheses. For example: $_ = "The brown fox jumps over the lazy dog"; /the (\S+)(?{ $color = $^N }) (\S+)(?{ $animal = $^N })/i; print "color = $color, animal = $animal\n"; =item C<(??{ I<code> })> X<(??{})> X<regex, postponed> X<regexp, postponed> X<regular expression, postponed> B<WARNING>: Using this feature safely requires that you understand its limitations. Code executed that has side effects may not perform identically from version to version due to the effect of future optimisations in the regex engine. For more information on this, see L</Embedded Code Execution Frequency>. This is a "postponed" regular subexpression. 
It behaves in I<exactly> the same way as a C<(?{ I<code> })> code block as
described above, except that its return value, rather than being assigned
to C<$^R>, is treated as a pattern, compiled if it's a string (or used as-is
if it's a C<qr//> object), then matched as if it were inserted instead of
this construct.

During the matching of this sub-pattern, it has its own set of
captures which are valid during the sub-match, but are discarded once
control returns to the main pattern. For example, the following matches,
with the inner pattern capturing "B" and matching "BB", while the outer
pattern captures "A":

    my $inner = '(.)\1';
    "ABBA" =~ /^(.)(??{ $inner })\1/;
    print $1; # prints "A"

Note that this means that there is no way for the inner pattern to refer
to a capture group defined outside.  (The code block itself can use C<$1>,
I<etc>., to refer to the enclosing pattern's capture groups.)  Thus, although

    ('a' x 100)=~/(??{'(.)' x 100})/

I<will> match, it will I<not> set C<$1> on exit.

The following pattern matches a parenthesized group:

 $re = qr{
            \(
            (?:
               (?> [^()]+ )  # Non-parens without backtracking
             |
               (??{ $re })   # Group with matching parens
            )*
            \)
         }x;

See also
L<C<(?I<PARNO>)>|/(?I<PARNO>) (?-I<PARNO>) (?+I<PARNO>) (?R) (?0)>
for a different, more efficient way to accomplish
the same task.

Executing a postponed regular expression too many times without
consuming any input string will also result in a fatal error.  The depth
at which that happens is compiled into perl, so it can be changed with a
custom build.

=item C<(?I<PARNO>)> C<(?-I<PARNO>)> C<(?+I<PARNO>)> C<(?R)> C<(?0)>
X<(?PARNO)> X<(?1)> X<(?R)> X<(?0)> X<(?-1)> X<(?+1)> X<(?-PARNO)> X<(?+PARNO)>
X<regex, recursive> X<regexp, recursive> X<regular expression, recursive>
X<regex, relative recursion> X<GOSUB> X<GOSTART>

Recursive subpattern. Treat the contents of a given capture buffer in the
current pattern as an independent subpattern and attempt to match it at
the current position in the string.
Information about capture state from the caller for things like backreferences is available to the subpattern, but capture buffers set by the subpattern are not visible to the caller. Similar to C<(??{ I<code> })> except that it does not involve executing any code or potentially compiling a returned pattern string; instead it treats the part of the current pattern contained within a specified capture group as an independent pattern that must match at the current position. Also different is the treatment of capture buffers, unlike C<(??{ I<code> })> recursive patterns have access to their caller's match state, so one can use backreferences safely. I<PARNO> is a sequence of digits (not starting with 0) whose value reflects the paren-number of the capture group to recurse to. C<(?R)> recurses to the beginning of the whole pattern. C<(?0)> is an alternate syntax for C<(?R)>. If I<PARNO> is preceded by a plus or minus sign then it is assumed to be relative, with negative numbers indicating preceding capture groups and positive ones following. Thus C<(?-1)> refers to the most recently declared group, and C<(?+1)> indicates the next group to be declared. Note that the counting for relative recursion differs from that of relative backreferences, in that with recursion unclosed groups B<are> included. The following pattern matches a function C<foo()> which may contain balanced parentheses as the argument. $re = qr{ ( # paren group 1 (full function) foo ( # paren group 2 (parens) \( ( # paren group 3 (contents of parens) (?: (?> [^()]+ ) # Non-parens without backtracking | (?2) # Recurse to start of paren group 2 )* ) \) ) ) }x; If the pattern was used as follows 'foo(bar(baz)+baz(bop))'=~/$re/ and print "\$1 = $1\n", "\$2 = $2\n", "\$3 = $3\n"; the output produced should be the following: $1 = foo(bar(baz)+baz(bop)) $2 = (bar(baz)+baz(bop)) $3 = bar(baz)+baz(bop) If there is no corresponding capture group defined, then it is a fatal error. 
Recursing deeply without consuming any input string will also result in a fatal error. The depth at which that happens is compiled into perl, so it can be changed with a custom build. The following shows how using negative indexing can make it easier to embed recursive patterns inside of a C<qr//> construct for later use: my $parens = qr/(\((?:[^()]++|(?-1))*+\))/; if (/foo $parens \s+ \+ \s+ bar $parens/x) { # do something here... } B<Note> that this pattern does not behave the same way as the equivalent PCRE or Python construct of the same form. In Perl you can backtrack into a recursed group, in PCRE and Python the recursed into group is treated as atomic. Also, modifiers are resolved at compile time, so constructs like C<(?i:(?1))> or C<(?:(?i)(?1))> do not affect how the sub-pattern will be processed. =item C<(?&I<NAME>)> X<(?&NAME)> Recurse to a named subpattern. Identical to C<(?I<PARNO>)> except that the parenthesis to recurse to is determined by name. If multiple parentheses have the same name, then it recurses to the leftmost. It is an error to refer to a name that is not declared somewhere in the pattern. B<NOTE:> In order to make things easier for programmers with experience with the Python or PCRE regex engines the pattern C<< (?P>I<NAME>) >> may be used instead of C<< (?&I<NAME>) >>. =item C<(?(I<condition>)I<yes-pattern>|I<no-pattern>)> X<(?()> =item C<(?(I<condition>)I<yes-pattern>)> Conditional expression. Matches I<yes-pattern> if I<condition> yields a true value, matches I<no-pattern> otherwise. A missing pattern always matches. C<(I<condition>)> should be one of: =over 4 =item an integer in parentheses (which is valid if the corresponding pair of parentheses matched); =item a lookahead/lookbehind/evaluate zero-width assertion; =item a name in angle brackets or single quotes (which is valid if a group with the given name matched); =item the special symbol C<(R)> (true when evaluated inside of recursion or eval). 
Additionally the C<"R"> may be followed by a number, (which will be true when evaluated when recursing inside of the appropriate group), or by C<&I<NAME>>, in which case it will be true only when evaluated during recursion in the named group. =back Here's a summary of the possible predicates: =over 4 =item C<(1)> C<(2)> ... Checks if the numbered capturing group has matched something. Full syntax: C<< (?(1)then|else) >> =item C<(E<lt>I<NAME>E<gt>)> C<('I<NAME>')> Checks if a group with the given name has matched something. Full syntax: C<< (?(<name>)then|else) >> =item C<(?=...)> C<(?!...)> C<(?<=...)> C<(?<!...)> Checks whether the pattern matches (or does not match, for the C<"!"> variants). Full syntax: C<< (?(?=I<lookahead>)I<then>|I<else>) >> =item C<(?{ I<CODE> })> Treats the return value of the code block as the condition. Full syntax: C<< (?(?{ I<code> })I<then>|I<else>) >> =item C<(R)> Checks if the expression has been evaluated inside of recursion. Full syntax: C<< (?(R)I<then>|I<else>) >> =item C<(R1)> C<(R2)> ... Checks if the expression has been evaluated while executing directly inside of the n-th capture group. This check is the regex equivalent of if ((caller(0))[3] eq 'subname') { ... } In other words, it does not check the full recursion stack. Full syntax: C<< (?(R1)I<then>|I<else>) >> =item C<(R&I<NAME>)> Similar to C<(R1)>, this predicate checks to see if we're executing directly inside of the leftmost group with a given name (this is the same logic used by C<(?&I<NAME>)> to disambiguate). It does not check the full stack, but only the name of the innermost active recursion. Full syntax: C<< (?(R&I<name>)I<then>|I<else>) >> =item C<(DEFINE)> In this case, the yes-pattern is never directly executed, and no no-pattern is allowed. Similar in spirit to C<(?{0})> but more efficient. See below for details. Full syntax: C<< (?(DEFINE)I<definitions>...) >> =back For example: m{ ( \( )? 
[^()]+ (?(1) \) ) }x matches a chunk of non-parentheses, possibly included in parentheses themselves. A special form is the C<(DEFINE)> predicate, which never executes its yes-pattern directly, and does not allow a no-pattern. This allows one to define subpatterns which will be executed only by the recursion mechanism. This way, you can define a set of regular expression rules that can be bundled into any pattern you choose. It is recommended that for this usage you put the DEFINE block at the end of the pattern, and that you name any subpatterns defined within it. Also, it's worth noting that patterns defined this way probably will not be as efficient, as the optimizer is not very clever about handling them. An example of how this might be used is as follows: /(?<NAME>(?&NAME_PAT))(?<ADDR>(?&ADDRESS_PAT)) (?(DEFINE) (?<NAME_PAT>....) (?<ADDRESS_PAT>....) )/x Note that capture groups matched inside of recursion are not accessible after the recursion returns, so the extra layer of capturing groups is necessary. Thus C<$+{NAME_PAT}> would not be defined even though C<$+{NAME}> would be. Finally, keep in mind that subpatterns created inside a DEFINE block count towards the absolute and relative number of captures, so this: my @captures = "a" =~ /(.) # First capture (?(DEFINE) (?<EXAMPLE> 1 ) # Second capture )/x; say scalar @captures; Will output 2, not 1. This is particularly important if you intend to compile the definitions with the C<qr//> operator, and later interpolate them in another pattern. =item C<< (?>I<pattern>) >> =item C<< (*atomic:I<pattern>) >> X<(?E<gt>pattern)> X<(*atomic> X<backtrack> X<backtracking> X<atomic> X<possessive> An "independent" subexpression, one which matches the substring that a standalone I<pattern> would match if anchored at the given position, and it matches I<nothing other than this substring>. 
This construct is useful for optimizations of what would otherwise be
"eternal" matches, because it will not backtrack (see L</"Backtracking">).
It may also be useful in places where the "grab all you can, and do not
give anything back" semantic is desirable.

For example: C<< ^(?>a*)ab >> will never match, since C<< (?>a*) >>
(anchored at the beginning of string, as above) will match I<all>
characters C<"a"> at the beginning of string, leaving no C<"a"> for
C<ab> to match.  In contrast, C<a*ab> will match the same as C<a+b>,
since the match of the subgroup C<a*> is influenced by the following
group C<ab> (see L</"Backtracking">).  In particular, C<a*> inside
C<a*ab> will match fewer characters than a standalone C<a*>, since
this makes the tail match.

C<< (?>I<pattern>) >> does not disable backtracking altogether once it has
matched. It is still possible to backtrack past the construct, but not
into it. So C<< ((?>a*)|(?>b*))ar >> will still match "bar".

An effect similar to C<< (?>I<pattern>) >> may be achieved by writing
C<(?=(I<pattern>))\g{-1}>.  This matches the same substring as a standalone
I<pattern>, and the following C<\g{-1}> eats the matched string; it therefore
makes a zero-length assertion into an analogue of C<< (?>...) >>.
(The difference between these two constructs is that the second one
uses a capturing group, thus shifting ordinals of backreferences
in the rest of a regular expression.)

Consider this pattern:

    m{ \(
          (
            [^()]+           # x+
          |
            \( [^()]* \)
          )+
       \)
     }x

That will efficiently match a nonempty group with matching parentheses
two levels deep or less.  However, if there is no such group, it
will take virtually forever on a long string.  That's because there
are so many different ways to split a long string into several
substrings.  This is what C<(.+)+> is doing, and C<(.+)+> is similar
to a subpattern of the above pattern.
Consider that the pattern above detects no-match on
C<((()aaaaaaaaaaaaaaaaaa> in several seconds, and that each extra
letter doubles this time.  This exponential performance will make it
appear that your program has hung.  However, a tiny change to this pattern

    m{ \(
          (
            (?> [^()]+ )        # change x+ above to (?> x+ )
          |
            \( [^()]* \)
          )+
       \)
     }x

which uses C<< (?>...) >> matches exactly when the one above does (verifying
this yourself would be a productive exercise), but finishes in a fourth
of the time when used on a similar string with 1000000 C<"a">s.  Be aware,
however, that when this construct is followed by a
quantifier, it currently triggers a warning message under
the C<use warnings> pragma or B<-w> switch saying it
C<"matches null string many times in regex">.

On simple groups, such as the pattern C<< (?> [^()]+ ) >>, a comparable
effect may be achieved by negative lookahead, as in C<[^()]+ (?! [^()] )>.
This was only 4 times slower on a string with 1000000 C<"a">s.

The "grab all you can, and do not give anything back" semantic is desirable
in many situations where at first sight a simple C<()*> looks like
the correct solution.  Suppose we parse text with comments being delimited
by C<"#"> followed by some optional (horizontal) whitespace.  Contrary to
its appearance, C<#[ \t]*> I<is not> the correct subexpression to match
the comment delimiter, because it may "give up" some whitespace if
the remainder of the pattern can be made to match that way.  The correct
answer is either of these:

    (?>#[ \t]*)
    #[ \t]*(?![ \t])

For example, to grab non-empty comments into C<$1>, one should use either
of these:

    / (?> \# [ \t]* ) (        .+ ) /x;
    /     \# [ \t]*   ( [^ \t] .* ) /x;

Which one you pick depends on which of these expressions better reflects
the above specification of comments.

In some literature this construct is called "atomic matching" or
"possessive matching".

Possessive quantifiers are equivalent to putting the item they are applied
to inside of one of these constructs.
The following equivalences apply:

    Quantifier Form     Bracketing Form
    ---------------     ---------------
    PAT*+               (?>PAT*)
    PAT++               (?>PAT+)
    PAT?+               (?>PAT?)
    PAT{min,max}+       (?>PAT{min,max})

Nested C<(?E<gt>...)> constructs are not no-ops, even if at first glance
they might seem to be.  This is because the nested C<(?E<gt>...)> can
restrict internal backtracking that otherwise might occur.  For example,

 "abc" =~ /(?>a[bc]*c)/

matches, but

 "abc" =~ /(?>a(?>[bc]*)c)/

does not.

=item C<(?[ ])>

See L<perlrecharclass/Extended Bracketed Character Classes>.

Note that this feature is currently L<experimental|perlpolicy/experimental>;
using it yields a warning in the C<experimental::regex_sets> category.

=back

=head2 Backtracking
X<backtrack> X<backtracking>

NOTE: This section presents an abstract approximation of regular
expression behavior.  For a more rigorous (and complicated) view of
the rules involved in selecting a match among possible alternatives,
see L</Combining RE Pieces>.

A fundamental feature of regular expression matching involves the
notion called I<backtracking>, which is currently used (when needed)
by all non-possessive regular expression quantifiers, namely C<"*">,
C<*?>, C<"+">, C<+?>, C<{n,m}>, and C<{n,m}?>.  Backtracking is often
optimized internally, but the general principle outlined here is valid.

For a regular expression to match, the I<entire> regular expression must
match, not just part of it.  So if the beginning of a pattern containing a
quantifier succeeds in a way that causes later parts in the pattern to
fail, the matching engine backs up and recalculates the beginning
part--that's why it's called backtracking.
Here is an example of backtracking:  Let's say you want to find the
word following "foo" in the string "Food is on the foo table.":

    $_ = "Food is on the foo table.";
    if ( /\b(foo)\s+(\w+)/i ) {
        print "$2 follows $1.\n";
    }

When the match runs, the first part of the regular expression (C<\b(foo)>)
finds a possible match right at the beginning of the string, and loads up
C<$1> with "Foo".  However, as soon as the matching engine sees that there's
no whitespace following the "Foo" that it had saved in C<$1>, it realizes its
mistake and starts over again one character after where it had the
tentative match.  This time it goes all the way until the next occurrence
of "foo". The complete regular expression matches this time, and you get
the expected output of "table follows foo."

Sometimes minimal matching can help a lot.  Imagine you'd like to match
everything between "foo" and "bar".  Initially, you write something
like this:

    $_ = "The food is under the bar in the barn.";
    if ( /foo(.*)bar/ ) {
        print "got <$1>\n";
    }

Which perhaps unexpectedly yields:

    got <d is under the bar in the >

That's because C<.*> was greedy, so you get everything between the
I<first> "foo" and the I<last> "bar".  Here it's more effective
to use minimal matching to make sure you get the text between a "foo"
and the first "bar" thereafter.

    if ( /foo(.*?)bar/ ) { print "got <$1>\n" }

    got <d is under the >

Here's another example. Let's say you'd like to match a number at the end
of a string, and you also want to keep the preceding part of the match.
So you write this:

    $_ = "I have 2 numbers: 53147";
    if ( /(.*)(\d*)/ ) {                                # Wrong!
        print "Beginning is <$1>, number is <$2>.\n";
    }

That won't work at all, because C<.*> was greedy and gobbled up the
whole string. As C<\d*> can match on an empty string the complete
regular expression matched successfully.

    Beginning is <I have 2 numbers: 53147>, number is <>.
Here are some variants, most of which don't work:

    $_ = "I have 2 numbers: 53147";
    @pats = qw{
        (.*)(\d*)
        (.*)(\d+)
        (.*?)(\d*)
        (.*?)(\d+)
        (.*)(\d+)$
        (.*?)(\d+)$
        (.*)\b(\d+)$
        (.*\D)(\d+)$
    };

    for $pat (@pats) {
        printf "%-12s ", $pat;
        if ( /$pat/ ) {
            print "<$1> <$2>\n";
        } else {
            print "FAIL\n";
        }
    }

That will print out:

    (.*)(\d*)    <I have 2 numbers: 53147> <>
    (.*)(\d+)    <I have 2 numbers: 5314> <7>
    (.*?)(\d*)   <> <>
    (.*?)(\d+)   <I have > <2>
    (.*)(\d+)$   <I have 2 numbers: 5314> <7>
    (.*?)(\d+)$  <I have 2 numbers: > <53147>
    (.*)\b(\d+)$ <I have 2 numbers: > <53147>
    (.*\D)(\d+)$ <I have 2 numbers: > <53147>

As you see, this can be a bit tricky. It's important to realize that a regular expression is merely a set of assertions that gives a definition of success. There may be 0, 1, or several different ways that the definition might succeed against a particular string. And if there are multiple ways it might succeed, you need to understand backtracking to know which variety of success you will achieve.

When using lookahead assertions and negations, this can all get even trickier. Imagine you'd like to find a sequence of non-digits not followed by "123". You might try to write that as

    $_ = "ABC123";
    if ( /^\D*(?!123)/ ) {                # Wrong!
        print "Yup, no 123 in $_\n";
    }

But that isn't going to work; at least, not the way you're hoping. It claims that there is no 123 in the string. Here's a clearer picture of why that pattern matches, contrary to popular expectations:

    $x = 'ABC123';
    $y = 'ABC445';

    print "1: got $1\n" if $x =~ /^(ABC)(?!123)/;
    print "2: got $1\n" if $y =~ /^(ABC)(?!123)/;

    print "3: got $1\n" if $x =~ /^(\D*)(?!123)/;
    print "4: got $1\n" if $y =~ /^(\D*)(?!123)/;

This prints

    2: got ABC
    3: got AB
    4: got ABC

You might have expected test 3 to fail because it seems to be a more general purpose version of test 1. The important difference between them is that test 3 contains a quantifier (C<\D*>) and so can use backtracking, whereas test 1 will not.
What's happening is that you've asked "Is it true that at the start of C<$x>, following 0 or more non-digits, you have something that's not 123?" If the pattern matcher had let C<\D*> expand to "ABC", this would have caused the whole pattern to fail. The search engine will initially match C<\D*> with "ABC". Then it will try to match C<(?!123)> with "123", which fails. But because a quantifier (C<\D*>) has been used in the regular expression, the search engine can backtrack and retry the match differently in the hope of matching the complete regular expression. The pattern really, I<really> wants to succeed, so it uses the standard pattern back-off-and-retry and lets C<\D*> expand to just "AB" this time. Now there's indeed something following "AB" that is not "123". It's "C123", which suffices. We can deal with this by using both an assertion and a negation. We'll say that the first part in C<$1> must be followed both by a digit and by something that's not "123". Remember that the lookaheads are zero-width expressions--they only look, but don't consume any of the string in their match. So rewriting this way produces what you'd expect; that is, case 5 will fail, but case 6 succeeds: print "5: got $1\n" if $x =~ /^(\D*)(?=\d)(?!123)/; print "6: got $1\n" if $y =~ /^(\D*)(?=\d)(?!123)/; 6: got ABC In other words, the two zero-width assertions next to each other work as though they're ANDed together, just as you'd use any built-in assertions: C</^$/> matches only if you're at the beginning of the line AND the end of the line simultaneously. The deeper underlying truth is that juxtaposition in regular expressions always means AND, except when you write an explicit OR using the vertical bar. C</ab/> means match "a" AND (then) match "b", although the attempted matches are made at different positions because "a" is not a zero-width assertion, but a one-width assertion. 
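
Another way to stop C<\D*> from backing off to "AB" is to make the quantifier possessive. The following is a minimal sketch rather than part of the original examples; it assumes Perl 5.10 or later, where possessive quantifiers such as C<\D*+> are available, and the sub name C<leading_nondigits> is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: return the run of non-digits at the start of $str,
# provided it is not immediately followed by "123"; otherwise return undef.
sub leading_nondigits {
    my ($str) = @_;
    # \D*+ is possessive: once it has taken every leading non-digit it
    # will not give any back, so (?!123) is tested only at that position.
    return $str =~ /^(\D*+)(?!123)/ ? $1 : undef;
}

for my $str ('ABC123', 'ABC445') {
    my $got = leading_nondigits($str);
    print "$str: ", defined $got ? "got $got" : "no match", "\n";
}
```

With the possessive form, "ABC123" no longer matches, because the engine cannot shrink "ABC" to "AB" to satisfy the negative lookahead, while "ABC445" still yields "ABC".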
B<WARNING>: Particularly complicated regular expressions can take exponential time to solve because of the immense number of possible ways they can use backtracking to try for a match. For example, without internal optimizations done by the regular expression engine, this will take a painfully long time to run:

    'aaaaaaaaaaaa' =~ /((a{0,5}){0,5})*[c]/

And if you used C<"*">'s in the internal groups instead of limiting them to 0 through 5 matches, then it would take forever--or until you ran out of stack space. Moreover, these internal optimizations are not always applicable. For example, if you put C<{0,5}> instead of C<"*"> on the external group, no current optimization is applicable, and the match takes a long time to finish.

A powerful tool for optimizing such beasts is what is known as an "independent group", which does not backtrack (see L</C<< (?>pattern) >>>). Note also that zero-length lookahead/lookbehind assertions will not backtrack to make the tail match, since they are in "logical" context: only whether they match is considered relevant. For an example where side-effects of lookahead I<might> have influenced the following match, see L</C<< (?>pattern) >>>.

=head2 Script Runs
X<(*script_run:...)> X<(sr:...)>
X<(*atomic_script_run:...)> X<(asr:...)>

A script run is basically a sequence of characters, all from the same Unicode script (see L<perlunicode/Scripts>), such as Latin or Greek. In most places a single word would never be written in multiple scripts, unless it is a spoofing attack. An infamous example is

 paypal.com

Those letters could all be Latin (as in the example just above), or they could be all Cyrillic (except for the dot), or they could be a mixture of the two. In the case of an internet address the C<.com> would be in Latin, and any Cyrillic ones would cause it to be a mixture, not a script run.
Someone clicking on such a link would not be directed to the real Paypal website, but an attacker would craft a look-alike one to attempt to gather sensitive information from the person. Starting in Perl 5.28, it is now easy to detect strings that aren't script runs. Simply enclose just about any pattern like either of these: (*script_run:pattern) (*sr:pattern) What happens is that after I<pattern> succeeds in matching, it is subjected to the additional criterion that every character in it must be from the same script (see exceptions below). If this isn't true, backtracking occurs until something all in the same script is found that matches, or all possibilities are exhausted. This can cause a lot of backtracking, but generally, only malicious input will result in this, though the slow down could cause a denial of service attack. If your needs permit, it is best to make the pattern atomic to cut down on the amount of backtracking. This is so likely to be what you want, that instead of writing this: (*script_run:(?>pattern)) you can write either of these: (*atomic_script_run:pattern) (*asr:pattern) (See L</C<(?E<gt>I<pattern>)>>.) In Taiwan, Japan, and Korea, it is common for text to have a mixture of characters from their native scripts and base Chinese. Perl follows Unicode's UTS 39 (L<https://unicode.org/reports/tr39/>) Unicode Security Mechanisms in allowing such mixtures. For example, the Japanese scripts Katakana and Hiragana are commonly mixed together in practice, along with some Chinese characters, and hence are treated as being in a single script run by Perl. The rules used for matching decimal digits are slightly stricter. Many scripts have their own sets of digits equivalent to the Western C<0> through C<9> ones. A few, such as Arabic, have more than one set. For a string to be considered a script run, all digits in it must come from the same set of ten, as determined by the first digit encountered. 
As an example,

 qr/(*script_run: \d+ \b )/x

guarantees that the digits matched will all be from the same set of 10. You won't get a look-alike digit from a different script that has a different value than what it appears to be.

Unicode has three pseudo scripts that are handled specially.

"Unknown" is applied to code points whose meaning has yet to be determined. Perl currently will match, as a script run, any single-character string consisting of one of these code points. But any string longer than one code point containing one of these will not be considered a script run.

"Inherited" is applied to characters that modify another, such as an accent of some type. These are considered to be in the script of the master character, and so never cause a script run to not match.

The other one is "Common". This consists of mostly punctuation, emoji, and characters used in mathematics and music, the ASCII digits C<0> through C<9>, and full-width forms of these digits. These characters can appear intermixed in text in many of the world's scripts. These also don't cause a script run to not match. But like other scripts, all digits in a run must come from the same set of 10.

This construct is non-capturing. You can add parentheses to I<pattern> to capture, if desired. You will have to do this if you plan to use L</(*ACCEPT) (*ACCEPT:arg)> and not have it bypass the script run checking.

The C<Script_Extensions> property as modified by UTS 39 (L<https://unicode.org/reports/tr39/>) is used as the basis for this feature.

To summarize,

=over 4

=item *

All length 0 or length 1 sequences are script runs.

=item *

A longer sequence is a script run if and only if B<all> of the following conditions are met:

Z<>

=over

=item 1

No code point in the sequence has the C<Script_Extensions> property of C<Unknown>.

This currently means that all code points in the sequence have been assigned by Unicode to be characters that aren't private use nor surrogate code points.

=item 2 All characters in the sequence come from the Common script and/or the Inherited script and/or a single other script. The script of a character is determined by the C<Script_Extensions> property as modified by UTS 39 (L<https://unicode.org/reports/tr39/>), as described above. =item 3 All decimal digits in the sequence come from the same block of 10 consecutive digits. =back =back =head2 Special Backtracking Control Verbs These special patterns are generally of the form C<(*I<VERB>:I<arg>)>. Unless otherwise stated the I<arg> argument is optional; in some cases, it is mandatory. Any pattern containing a special backtracking verb that allows an argument has the special behaviour that when executed it sets the current package's C<$REGERROR> and C<$REGMARK> variables. When doing so the following rules apply: On failure, the C<$REGERROR> variable will be set to the I<arg> value of the verb pattern, if the verb was involved in the failure of the match. If the I<arg> part of the pattern was omitted, then C<$REGERROR> will be set to the name of the last C<(*MARK:I<NAME>)> pattern executed, or to TRUE if there was none. Also, the C<$REGMARK> variable will be set to FALSE. On a successful match, the C<$REGERROR> variable will be set to FALSE, and the C<$REGMARK> variable will be set to the name of the last C<(*MARK:I<NAME>)> pattern executed. See the explanation for the C<(*MARK:I<NAME>)> verb below for more details. B<NOTE:> C<$REGERROR> and C<$REGMARK> are not magic variables like C<$1> and most other regex-related variables. They are not local to a scope, nor readonly, but instead are volatile package variables similar to C<$AUTOLOAD>. They are set in the package containing the code that I<executed> the regex (rather than the one that compiled it, where those differ). If necessary, you can use C<local> to localize changes to these variables to a specific scope before executing a regex. 
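
The rules for C<$REGMARK> on a successful match, and the advice to use C<local>, can be sketched as follows. This is a minimal illustration, assuming Perl 5.10 or later; the sub name C<which_branch> and the mark names C<feline> and C<canine> are made up for the example. Under C<use strict> the two package variables have to be declared with C<our> before they can be read.

```perl
use strict;
use warnings;

# $REGMARK and $REGERROR are ordinary package variables, so declare them.
our ($REGMARK, $REGERROR);

# Hypothetical helper: report which alternative matched, without letting
# the regex clobber the caller's view of $REGMARK and $REGERROR.
sub which_branch {
    my ($str) = @_;
    local ($REGMARK, $REGERROR);    # previous values return when the sub exits
    my $matched = $str =~ /\A(?:cat(*MARK:feline)|dog(*MARK:canine))\z/;
    my $mark = $matched ? $REGMARK : undef;   # copy before the scope ends
    return $mark;
}

$REGMARK = 'outer';                 # stands in for a value set earlier
my $branch = which_branch('dog');
print "branch: $branch\n";
print "REGMARK afterwards: $REGMARK\n";
```

Copying C<$REGMARK> into a lexical before returning keeps the result independent of when the localized values are restored.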
If a pattern does not contain a special backtracking verb that allows an argument, then C<$REGERROR> and C<$REGMARK> are not touched at all. =over 3 =item Verbs =over 4 =item C<(*PRUNE)> C<(*PRUNE:I<NAME>)> X<(*PRUNE)> X<(*PRUNE:NAME)> This zero-width pattern prunes the backtracking tree at the current point when backtracked into on failure. Consider the pattern C</I<A> (*PRUNE) I<B>/>, where I<A> and I<B> are complex patterns. Until the C<(*PRUNE)> verb is reached, I<A> may backtrack as necessary to match. Once it is reached, matching continues in I<B>, which may also backtrack as necessary; however, should B not match, then no further backtracking will take place, and the pattern will fail outright at the current starting position. The following example counts all the possible matching strings in a pattern (without actually matching any of them). 'aaab' =~ /a+b?(?{print "$&\n"; $count++})(*FAIL)/; print "Count=$count\n"; which produces: aaab aaa aa a aab aa a ab a Count=9 If we add a C<(*PRUNE)> before the count like the following 'aaab' =~ /a+b?(*PRUNE)(?{print "$&\n"; $count++})(*FAIL)/; print "Count=$count\n"; we prevent backtracking and find the count of the longest matching string at each matching starting point like so: aaab aab ab Count=3 Any number of C<(*PRUNE)> assertions may be used in a pattern. See also C<<< L<< /(?>I<pattern>) >> >>> and possessive quantifiers for other ways to control backtracking. In some cases, the use of C<(*PRUNE)> can be replaced with a C<< (?>pattern) >> with no functional difference; however, C<(*PRUNE)> can be used to handle cases that cannot be expressed using a C<< (?>pattern) >> alone. =item C<(*SKIP)> C<(*SKIP:I<NAME>)> X<(*SKIP)> This zero-width pattern is similar to C<(*PRUNE)>, except that on failure it also signifies that whatever text that was matched leading up to the C<(*SKIP)> pattern being executed cannot be part of I<any> match of this pattern. 
This effectively means that the regex engine "skips" forward to this position on failure and tries to match again, (assuming that there is sufficient room to match). The name of the C<(*SKIP:I<NAME>)> pattern has special significance. If a C<(*MARK:I<NAME>)> was encountered while matching, then it is that position which is used as the "skip point". If no C<(*MARK)> of that name was encountered, then the C<(*SKIP)> operator has no effect. When used without a name the "skip point" is where the match point was when executing the C<(*SKIP)> pattern. Compare the following to the examples in C<(*PRUNE)>; note the string is twice as long: 'aaabaaab' =~ /a+b?(*SKIP)(?{print "$&\n"; $count++})(*FAIL)/; print "Count=$count\n"; outputs aaab aaab Count=2 Once the 'aaab' at the start of the string has matched, and the C<(*SKIP)> executed, the next starting point will be where the cursor was when the C<(*SKIP)> was executed. =item C<(*MARK:I<NAME>)> C<(*:I<NAME>)> X<(*MARK)> X<(*MARK:NAME)> X<(*:NAME)> This zero-width pattern can be used to mark the point reached in a string when a certain part of the pattern has been successfully matched. This mark may be given a name. A later C<(*SKIP)> pattern will then skip forward to that point if backtracked into on failure. Any number of C<(*MARK)> patterns are allowed, and the I<NAME> portion may be duplicated. In addition to interacting with the C<(*SKIP)> pattern, C<(*MARK:I<NAME>)> can be used to "label" a pattern branch, so that after matching, the program can determine which branches of the pattern were involved in the match. When a match is successful, the C<$REGMARK> variable will be set to the name of the most recently executed C<(*MARK:I<NAME>)> that was involved in the match. 
This can be used to determine which branch of a pattern was matched without using a separate capture group for each branch, which in turn can result in a performance improvement, as perl cannot optimize C</(?:(x)|(y)|(z))/> as efficiently as something like C</(?:x(*MARK:x)|y(*MARK:y)|z(*MARK:z))/>. When a match has failed, and unless another verb has been involved in failing the match and has provided its own name to use, the C<$REGERROR> variable will be set to the name of the most recently executed C<(*MARK:I<NAME>)>. See L</(*SKIP)> for more details. As a shortcut C<(*MARK:I<NAME>)> can be written C<(*:I<NAME>)>. =item C<(*THEN)> C<(*THEN:I<NAME>)> This is similar to the "cut group" operator C<::> from Raku. Like C<(*PRUNE)>, this verb always matches, and when backtracked into on failure, it causes the regex engine to try the next alternation in the innermost enclosing group (capturing or otherwise) that has alternations. The two branches of a C<(?(I<condition>)I<yes-pattern>|I<no-pattern>)> do not count as an alternation, as far as C<(*THEN)> is concerned. Its name comes from the observation that this operation combined with the alternation operator (C<"|">) can be used to create what is essentially a pattern-based if/then/else block: ( COND (*THEN) FOO | COND2 (*THEN) BAR | COND3 (*THEN) BAZ ) Note that if this operator is used and NOT inside of an alternation then it acts exactly like the C<(*PRUNE)> operator. / A (*PRUNE) B / is the same as / A (*THEN) B / but / ( A (*THEN) B | C ) / is not the same as / ( A (*PRUNE) B | C ) / as after matching the I<A> but failing on the I<B> the C<(*THEN)> verb will backtrack and try I<C>; but the C<(*PRUNE)> verb will simply fail. =item C<(*COMMIT)> C<(*COMMIT:I<arg>)> X<(*COMMIT)> This is the Raku "commit pattern" C<< <commit> >> or C<:::>. It's a zero-width pattern similar to C<(*SKIP)>, except that when backtracked into on failure it causes the match to fail outright. 
No further attempts to find a valid match by advancing the start pointer will occur again. For example, 'aaabaaab' =~ /a+b?(*COMMIT)(?{print "$&\n"; $count++})(*FAIL)/; print "Count=$count\n"; outputs aaab Count=1 In other words, once the C<(*COMMIT)> has been entered, and if the pattern does not match, the regex engine will not try any further matching on the rest of the string. =item C<(*FAIL)> C<(*F)> C<(*FAIL:I<arg>)> X<(*FAIL)> X<(*F)> This pattern matches nothing and always fails. It can be used to force the engine to backtrack. It is equivalent to C<(?!)>, but easier to read. In fact, C<(?!)> gets optimised into C<(*FAIL)> internally. You can provide an argument so that if the match fails because of this C<FAIL> directive the argument can be obtained from C<$REGERROR>. It is probably useful only when combined with C<(?{})> or C<(??{})>. =item C<(*ACCEPT)> C<(*ACCEPT:I<arg>)> X<(*ACCEPT)> This pattern matches nothing and causes the end of successful matching at the point at which the C<(*ACCEPT)> pattern was encountered, regardless of whether there is actually more to match in the string. When inside of a nested pattern, such as recursion, or in a subpattern dynamically generated via C<(??{})>, only the innermost pattern is ended immediately. If the C<(*ACCEPT)> is inside of capturing groups then the groups are marked as ended at the point at which the C<(*ACCEPT)> was encountered. For instance: 'AB' =~ /(A (A|B(*ACCEPT)|C) D)(E)/x; will match, and C<$1> will be C<AB> and C<$2> will be C<"B">, C<$3> will not be set. If another branch in the inner parentheses was matched, such as in the string 'ACDE', then the C<"D"> and C<"E"> would have to be matched as well. You can provide an argument, which will be available in the var C<$REGMARK> after the match completes. 
=back =back =head2 Warning on C<\1> Instead of C<$1> Some people get too used to writing things like: $pattern =~ s/(\W)/\\\1/g; This is grandfathered (for \1 to \9) for the RHS of a substitute to avoid shocking the B<sed> addicts, but it's a dirty habit to get into. That's because in PerlThink, the righthand side of an C<s///> is a double-quoted string. C<\1> in the usual double-quoted string means a control-A. The customary Unix meaning of C<\1> is kludged in for C<s///>. However, if you get into the habit of doing that, you get yourself into trouble if you then add an C</e> modifier. s/(\d+)/ \1 + 1 /eg; # causes warning under -w Or if you try to do s/(\d+)/\1000/; You can't disambiguate that by saying C<\{1}000>, whereas you can fix it with C<${1}000>. The operation of interpolation should not be confused with the operation of matching a backreference. Certainly they mean two different things on the I<left> side of the C<s///>. =head2 Repeated Patterns Matching a Zero-length Substring B<WARNING>: Difficult material (and prose) ahead. This section needs a rewrite. Regular expressions provide a terse and powerful programming language. As with most other power tools, power comes together with the ability to wreak havoc. A common abuse of this power stems from the ability to make infinite loops using regular expressions, with something as innocuous as: 'foo' =~ m{ ( o? )* }x; The C<o?> matches at the beginning of "C<foo>", and since the position in the string is not moved by the match, C<o?> would match again and again because of the C<"*"> quantifier. Another common way to create a similar cycle is with the looping modifier C</g>: @matches = ( 'foo' =~ m{ o? }xg ); or print "match: <$&>\n" while 'foo' =~ m{ o? }xg; or the loop implied by C<split()>. However, long experience has shown that many programming tasks may be significantly simplified by using repeated subexpressions that may match zero-length substrings. 
Here's a simple example:

    @chars = split //, $string;           # // is not magic in split
    ($whitewashed = $string) =~ s/()/ /g; # parens avoid magic s// /

Thus Perl allows such constructs, by I<forcefully breaking the infinite loop>. The rules for this are different for lower-level loops given by the greedy quantifiers C<*+{}>, and for higher-level ones like the C</g> modifier or C<split()> operator.

The lower-level loops are I<interrupted> (that is, the loop is broken) when Perl detects that a repeated expression matched a zero-length substring. Thus

   m{ (?: NON_ZERO_LENGTH | ZERO_LENGTH )* }x;

is made equivalent to

   m{ (?: NON_ZERO_LENGTH )* (?: ZERO_LENGTH )? }x;

For example, this program

   #!perl -l
   "aaaaab" =~ /
     (?:
        a                 # non-zero
        |                 # or
       (?{print "hello"}) # print hello whenever this
                          #    branch is tried
       (?=(b))            # zero-width assertion
     )*  # any number of times
    /x;
   print $&;
   print $1;

prints

   hello
   aaaaa
   b

Notice that "hello" is only printed once, as when Perl sees that the sixth iteration of the outermost C<(?:)*> matches a zero-length string, it stops the C<"*">.

The higher-level loops preserve an additional state between iterations: whether the last match was zero-length. To break the loop, the following match after a zero-length match is prohibited to have a length of zero. This prohibition interacts with backtracking (see L</"Backtracking">), and so the I<second best> match is chosen if the I<best> match is of zero length.

For example:

    $_ = 'bar';
    s/\w??/<$&>/g;

results in C<< <><b><><a><><r><> >>. At each position of the string the best match given by non-greedy C<??> is the zero-length match, and the I<second best> match is what is matched by C<\w>. Thus zero-length matches alternate with one-character-long matches.

Similarly, for repeated C<m/()/g> the second-best match is the match at the position one notch further in the string.

The additional state of being I<matched with zero-length> is associated with the matched string, and is reset by each assignment to C<pos()>. Zero-length matches at the end of the previous match are ignored during C<split>. =head2 Combining RE Pieces Each of the elementary pieces of regular expressions which were described before (such as C<ab> or C<\Z>) could match at most one substring at the given position of the input string. However, in a typical regular expression these elementary pieces are combined into more complicated patterns using combining operators C<ST>, C<S|T>, C<S*> I<etc>. (in these examples C<"S"> and C<"T"> are regular subexpressions). Such combinations can include alternatives, leading to a problem of choice: if we match a regular expression C<a|ab> against C<"abc">, will it match substring C<"a"> or C<"ab">? One way to describe which substring is actually matched is the concept of backtracking (see L</"Backtracking">). However, this description is too low-level and makes you think in terms of a particular implementation. Another description starts with notions of "better"/"worse". All the substrings which may be matched by the given regular expression can be sorted from the "best" match to the "worst" match, and it is the "best" match which is chosen. This substitutes the question of "what is chosen?" by the question of "which matches are better, and which are worse?". Again, for elementary pieces there is no such question, since at most one match at a given position is possible. This section describes the notion of better/worse for combining operators. In the description below C<"S"> and C<"T"> are regular subexpressions. =over 4 =item C<ST> Consider two possible matches, C<AB> and C<A'B'>, C<"A"> and C<A'> are substrings which can be matched by C<"S">, C<"B"> and C<B'> are substrings which can be matched by C<"T">. If C<"A"> is a better match for C<"S"> than C<A'>, C<AB> is a better match than C<A'B'>. 
If C<"A"> and C<A'> coincide: C<AB> is a better match than C<AB'> if C<"B"> is a better match for C<"T"> than C<B'>. =item C<S|T> When C<"S"> can match, it is a better match than when only C<"T"> can match. Ordering of two matches for C<"S"> is the same as for C<"S">. Similar for two matches for C<"T">. =item C<S{REPEAT_COUNT}> Matches as C<SSS...S> (repeated as many times as necessary). =item C<S{min,max}> Matches as C<S{max}|S{max-1}|...|S{min+1}|S{min}>. =item C<S{min,max}?> Matches as C<S{min}|S{min+1}|...|S{max-1}|S{max}>. =item C<S?>, C<S*>, C<S+> Same as C<S{0,1}>, C<S{0,BIG_NUMBER}>, C<S{1,BIG_NUMBER}> respectively. =item C<S??>, C<S*?>, C<S+?> Same as C<S{0,1}?>, C<S{0,BIG_NUMBER}?>, C<S{1,BIG_NUMBER}?> respectively. =item C<< (?>S) >> Matches the best match for C<"S"> and only that. =item C<(?=S)>, C<(?<=S)> Only the best match for C<"S"> is considered. (This is important only if C<"S"> has capturing parentheses, and backreferences are used somewhere else in the whole regular expression.) =item C<(?!S)>, C<(?<!S)> For this grouping operator there is no need to describe the ordering, since only whether or not C<"S"> can match is important. =item C<(??{ I<EXPR> })>, C<(?I<PARNO>)> The ordering is the same as for the regular expression which is the result of I<EXPR>, or the pattern contained by capture group I<PARNO>. =item C<(?(I<condition>)I<yes-pattern>|I<no-pattern>)> Recall that which of I<yes-pattern> or I<no-pattern> actually matches is already determined. The ordering of the matches is the same as for the chosen subexpression. =back The above recipes describe the ordering of matches I<at a given position>. One more rule is needed to understand how a match is determined for the whole regular expression: a match at an earlier position is always better than a match at a later position. =head2 Creating Custom RE Engines As of Perl 5.10.0, one can create custom regular expression engines. 
This is not for the faint of heart, as they have to plug in at the C level. See L<perlreapi> for more details. As an alternative, overloaded constants (see L<overload>) provide a simple way to extend the functionality of the RE engine, by substituting one pattern for another. Suppose that we want to enable a new RE escape-sequence C<\Y|> which matches at a boundary between whitespace characters and non-whitespace characters. Note that C<(?=\S)(?<!\S)|(?!\S)(?<=\S)> matches exactly at these positions, so we want to have each C<\Y|> in the place of the more complicated version. We can create a module C<customre> to do this: package customre; use overload; sub import { shift; die "No argument to customre::import allowed" if @_; overload::constant 'qr' => \&convert; } sub invalid { die "/$_[0]/: invalid escape '\\$_[1]'"} # We must also take care of not escaping the legitimate \\Y| # sequence, hence the presence of '\\' in the conversion rules. my %rules = ( '\\' => '\\\\', 'Y|' => qr/(?=\S)(?<!\S)|(?!\S)(?<=\S)/ ); sub convert { my $re = shift; $re =~ s{ \\ ( \\ | Y . ) } { $rules{$1} or invalid($re,$1) }sgex; return $re; } Now C<use customre> enables the new escape in constant regular expressions, I<i.e.>, those without any runtime variable interpolations. As documented in L<overload>, this conversion will work only over literal parts of regular expressions. For C<\Y|$re\Y|> the variable part of this regular expression needs to be converted explicitly (but only if the special meaning of C<\Y|> should be enabled inside C<$re>): use customre; $re = <>; chomp $re; $re = customre::convert $re; /\Y|$re\Y|/; =head2 Embedded Code Execution Frequency The exact rules for how often C<(??{})> and C<(?{})> are executed in a pattern are unspecified. In the case of a successful match you can assume that they DWIM and will be executed in left to right order the appropriate number of times in the accepting path of the pattern as would any other meta-pattern. 
How non-accepting pathways and match failures affect the number of times a pattern is executed is specifically unspecified and may vary depending on what optimizations can be applied to the pattern and is likely to change from version to version. For instance in "aaabcdeeeee"=~/a(?{print "a"})b(?{print "b"})cde/; the exact number of times "a" or "b" are printed out is unspecified for failure, but you may assume they will be printed at least once during a successful match, additionally you may assume that if "b" is printed, it will be preceded by at least one "a". In the case of branching constructs like the following: /a(b|(?{ print "a" }))c(?{ print "c" })/; you can assume that the input "ac" will output "ac", and that "abc" will output only "c". When embedded code is quantified, successful matches will call the code once for each matched iteration of the quantifier. For example: "good" =~ /g(?:o(?{print "o"}))*d/; will output "o" twice. =head2 PCRE/Python Support As of Perl 5.10.0, Perl supports several Python/PCRE-specific extensions to the regex syntax. While Perl programmers are encouraged to use the Perl-specific syntax, the following are also accepted: =over 4 =item C<< (?PE<lt>I<NAME>E<gt>I<pattern>) >> Define a named capture group. Equivalent to C<< (?<I<NAME>>I<pattern>) >>. =item C<< (?P=I<NAME>) >> Backreference to a named capture group. Equivalent to C<< \g{I<NAME>} >>. =item C<< (?P>I<NAME>) >> Subroutine call to a named capture group. Equivalent to C<< (?&I<NAME>) >>. =back =head1 BUGS There are a number of issues with regard to case-insensitive matching in Unicode rules. See C<"i"> under L</Modifiers> above. This document varies from difficult to understand to completely and utterly opaque. The wandering prose riddled with jargon is hard to fathom in several places. This document needs a rewrite that separates the tutorial content from the reference content. 
=head1 SEE ALSO

The syntax of patterns used in Perl pattern matching evolved from those supplied in the Bell Labs Research Unix 8th Edition (Version 8) regex routines. (The code is actually derived (distantly) from Henry Spencer's freely redistributable reimplementation of those V8 routines.)

L<perlrequick>.

L<perlretut>.

L<perlop/"Regexp Quote-Like Operators">.

L<perlop/"Gory details of parsing quoted constructs">.

L<perlfaq6>.

L<perlfunc/pos>.

L<perllocale>.

L<perlebcdic>.

I<Mastering Regular Expressions> by Jeffrey Friedl, published by O'Reilly and Associates.

=encoding utf8

=head1 NAME

perl5283delta - what is new for perl v5.28.3

=head1 DESCRIPTION

This document describes differences between the 5.28.2 release and the 5.28.3 release.

If you are upgrading from an earlier release such as 5.28.1, first read L<perl5282delta>, which describes differences between 5.28.1 and 5.28.2.

=head1 Security

=head2 [CVE-2020-10543] Buffer overflow caused by a crafted regular expression

A signed C<size_t> integer overflow in the storage space calculations for nested regular expression quantifiers could cause a heap buffer overflow in Perl's regular expression compiler that overwrites memory allocated after the regular expression storage space with attacker supplied data.

The target system needs a sufficient amount of memory to allocate partial expansions of the nested quantifiers prior to the overflow occurring. This requirement is unlikely to be met on 64-bit systems.

Discovered by: ManhND of The Tarantula Team, VinCSS (a member of Vingroup).

=head2 [CVE-2020-10878] Integer overflow via malformed bytecode produced by a crafted regular expression

Integer overflows in the calculation of offsets between instructions for the regular expression engine could cause corruption of the intermediate language state of a compiled regular expression.
An attacker could abuse this behaviour to insert instructions into the compiled form of a Perl regular expression. Discovered by: Hugo van der Sanden and Slaven Rezic. =head2 [CVE-2020-12723] Buffer overflow caused by a crafted regular expression Recursive calls to C<S_study_chunk()> by Perl's regular expression compiler to optimize the intermediate language representation of a regular expression could cause corruption of the intermediate language state of a compiled regular expression. Discovered by: Sergey Aleynikov. =head2 Additional Note An application written in Perl would only be vulnerable to any of the above flaws if it evaluates regular expressions supplied by the attacker. Evaluating regular expressions in this fashion is known to be dangerous since the regular expression engine does not protect against denial of service attacks in this usage scenario. =head1 Incompatible Changes There are no changes intentionally incompatible with Perl 5.28.2. If any exist, they are bugs, and we request that you submit a report. See L</Reporting Bugs> below. =head1 Modules and Pragmata =head2 Updated Modules and Pragmata =over 4 =item * L<Module::CoreList> has been upgraded from version 5.20190419 to 5.20200601_28. =back =head1 Testing Tests were added and changed to reflect the other additions and changes in this release. =head1 Acknowledgements Perl 5.28.3 represents approximately 13 months of development since Perl 5.28.2 and contains approximately 3,100 lines of changes across 48 files from 16 authors. Excluding auto-generated files, documentation and release tools, there were approximately 1,700 lines of changes to 9 .pm, .t, .c and .h files. Perl continues to flourish into its fourth decade thanks to a vibrant community of users and developers. 
The following people are known to have contributed the improvements that became Perl 5.28.3: Chris 'BinGOs' Williams, Dan Book, Hugo van der Sanden, James E Keenan, John Lightsey, Karen Etheridge, Karl Williamson, Matthew Horsfall, Max Maischein, Nicolas R., Renee Baecker, Sawyer X, Steve Hay, Tom Hukins, Tony Cook, Zak B. Elep. The list above is almost certainly incomplete as it is automatically generated from version control history. In particular, it does not include the names of the (very much appreciated) contributors who reported issues to the Perl bug tracker. Many of the changes included in this version originated in the CPAN modules included in Perl's core. We're grateful to the entire CPAN community for helping Perl to flourish. For a more complete list of all of Perl's historical contributors, please see the F<AUTHORS> file in the Perl source distribution. =head1 Reporting Bugs If you find what you think is a bug, you might check the perl bug database at L<https://github.com/Perl/perl5/issues>. There may also be information at L<https://www.perl.org/>, the Perl Home Page. If you believe you have an unreported bug, please open an issue at L<https://github.com/Perl/perl5/issues>. Be sure to trim your bug down to a tiny but sufficient test case. If the bug you are reporting has security implications which make it inappropriate to send to a public issue tracker, then see L<perlsec/SECURITY VULNERABILITY CONTACT INFORMATION> for details of how to report the issue. =head1 Give Thanks If you wish to thank the Perl 5 Porters for the work we have done in Perl 5, you can do so by running the C<perlthanks> program: perlthanks This will send an email to the Perl 5 Porters list with your show of thanks. =head1 SEE ALSO The F<Changes> file for an explanation of how to view exhaustive details on what changed. The F<INSTALL> file for how to build Perl. The F<README> file for general stuff. The F<Artistic> and F<Copying> files for copyright information. =cut

If you read this file _as_is_, just ignore the funny characters you see. It is written in the POD format (see pod/perlpod.pod) which is specially designed to be readable as is. =head1 NAME perltru64 - Perl version 5 on Tru64 (formerly known as Digital UNIX formerly known as DEC OSF/1) systems =head1 DESCRIPTION This document describes various features of HP's (formerly Compaq's, formerly Digital's) Unix operating system (Tru64) that will affect how Perl version 5 (hereafter just Perl) is configured, compiled and/or runs. =head2 Compiling Perl 5 on Tru64 The recommended compiler to use in Tru64 is the native C compiler. The native compiler produces much faster code (the speed difference is noticeable: several dozen percent) and also more correct code: if you are considering using the GNU C compiler you should use release 2.95.3 at the very least, since all older gcc releases are known to produce broken code when compiling Perl. One manifestation of this brokenness is the lib/sdbm test dumping core; another is many of the op/regexp and op/pat, or ext/Storable tests dumping core (the exact pattern of failures depending on the GCC release and optimization flags). Both the native cc and gcc seem to consume lots of memory when building Perl. toke.c is a known trouble spot when optimizing: 256 megabytes of data section seems to be enough. Another known trouble spot is the mktables script which builds the Unicode support tables. The default setting of the process data section in Tru64 should be one gigabyte, but some sites/setups might have lowered that. The configuration process of Perl checks for too low process limits, lowers the optimization for toke.c if necessary, and also gives advice on how to raise the process limits (for example: C<ulimit -d 262144>). Also, Configure might abort with Build a threading Perl? [n] Configure[2437]: Syntax error at line 1 : 'config.sh' is not expected.
This indicates that Configure is being run with a broken Korn shell (even though you think you are using a Bourne shell by using "sh Configure" or "./Configure"). The Korn shell bug was reported to Compaq in February 1999. In the meantime, the reason ksh is being used is that you have the environment variable BIN_SH set to 'xpg4'. This causes /bin/sh to delegate its duties to /bin/posix/sh (a ksh). Unset the environment variable and rerun Configure. =head2 Using Large Files with Perl on Tru64 In Tru64, Perl is automatically able to use large files, that is, files larger than 2 gigabytes. There is no need to use the Configure -Duselargefiles option as described in INSTALL (though using the option is harmless). =head2 Threaded Perl on Tru64 If you want to use threads, you should primarily use the Perl 5.8.0 threads model by running Configure with -Duseithreads. Perl threading is going to work only in Tru64 4.0 and newer releases; older operating system releases like 3.2 probably won't work properly with threads. In Tru64 V5 (at least V5.1A, V5.1B) you cannot build threaded Perl with gcc because the system header <pthread.h> explicitly checks for supported C compilers, gcc (at least 3.2.2) not being one of them. But the system C compiler should work just fine. =head2 Long Doubles on Tru64 You cannot Configure Perl to use long doubles unless you have at least Tru64 V5.0; the long double support simply wasn't functional enough before that. Perl's Configure will override attempts to use long doubles (you can notice this by Configure finding out that the modfl() function does not work as it should). At the time of this writing (June 2002), there is a known bug in the Tru64 libc printing of long doubles when not using "e" notation. The values are correct and usable, but you only get a limited number of digits displayed unless you force the issue by using C<printf "%.33e",$num> or the like.
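The workaround just mentioned can be tried with a one-off script such as the following sketch. The variable C<$num> and its value are arbitrary, and how many of the printed digits are meaningful depends on whether your Perl was configured with long doubles.

```perl
use strict;
use warnings;
use Config;

my $num = 1 / 3;

# Report whether this perl was built with long doubles (-Duselongdouble).
print "uselongdouble: ", ( $Config{uselongdouble} ? "yes" : "no" ), "\n";

# Default stringification shows only a limited number of digits ...
print "default: $num\n";

# ... whereas an explicit "e" format asks for 33 digits after the point.
my $forced = sprintf "%.33e", $num;
print "forced:  $forced\n";
```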
For Tru64 versions V5.0A through V5.1A, a patch is expected sometime after perl 5.8.0 is released. If your libc has not yet been patched, you'll get a warning from Configure when selecting long doubles. =head2 DB_File tests failing on Tru64 The DB_File tests (db-btree.t, db-hash.t, db-recno.t) may fail if you have installed a newer version of Berkeley DB into the system and the -I and -L compiler and linker flags introduce version conflicts with the DB 1.85 headers and libraries that came with Tru64. For example, mixing a DB v2 library with the DB v1 headers is a bad idea. The first option is to avoid the conflict: watch out for Configure options -Dlocincpth and -Dloclibpth, and check your /usr/local/include and /usr/local/lib since they are included by default. The second option is to explicitly instruct Configure to detect the newer Berkeley DB installation, by supplying the right directories with C<-Dlocincpth=/some/include> and C<-Dloclibpth=/some/lib> B<and>, before running "make test", setting your LD_LIBRARY_PATH to F</some/lib>. The third option is to work around the problem by disabling DB_File completely when building Perl by specifying -Ui_db to Configure, and then using the BerkeleyDB module from CPAN instead of DB_File. The BerkeleyDB module works with Berkeley DB versions 2.* or greater. Berkeley DB 4.1.25 has been tested with Tru64 V5.1A and found to work. The latest Berkeley DB can be found at L<http://www.sleepycat.com>. =head2 64-bit Perl on Tru64 In Tru64, Perl's integers are automatically 64 bits wide, so there is no need to use the Configure -Duse64bitint option as described in INSTALL. Similarly, there is no need for -Duse64bitall since pointers are automatically 64 bits wide. =head2 Warnings about floating-point overflow when compiling Perl on Tru64 When compiling Perl in Tru64 you may (depending on the compiler release) see two warnings like this cc: Warning: numeric.c, line 104: In this statement, floating-point overflow occurs in evaluating the expression "1.8e308".
(floatoverfl) return HUGE_VAL; -----------^ and when compiling the POSIX extension cc: Warning: const-c.inc, line 2007: In this statement, floating-point overflow occurs in evaluating the expression "1.8e308". (floatoverfl) return HUGE_VAL; -------------------^ The exact line numbers may vary between Perl releases. The warnings are benign and can be ignored: in later C compiler releases the warnings should be gone. When the file F<pp_sys.c> is being compiled you may (depending on the operating system release) see an additional compiler flag being used: C<-DNO_EFF_ONLY_OK>. This is normal and refers to a feature that is relevant only if you use the C<filetest> pragma. In older releases of the operating system the feature was broken and NO_EFF_ONLY_OK instructs Perl not to use the feature. =head1 Testing Perl on Tru64 During "make test" the C<comp>/C<cpp> test will be skipped because on Tru64 it cannot be tested before Perl has been installed. The test refers to the use of the C<-P> option of Perl. =head1 ext/ODBM_File/odbm Test Failing With Static Builds The ext/ODBM_File/odbm test is known to fail with static builds (Configure -Uusedl) due to a known bug in Tru64's static libdbm library. The good news is that you very probably don't need to ever use the ODBM_File extension since the more advanced NDBM_File works fine, not to mention the even more advanced DB_File. =head1 Perl Fails Because Of Unresolved Symbol sockatmark If you get an error like Can't load '.../OSF1/lib/perl5/5.8.0/alpha-dec_osf/auto/IO/IO.so' for module IO: Unresolved symbol in .../lib/perl5/5.8.0/alpha-dec_osf/auto/IO/IO.so: sockatmark at .../lib/perl5/5.8.0/alpha-dec_osf/XSLoader.pm line 75. you need to either recompile your Perl in Tru64 4.0D or upgrade your Tru64 4.0D to at least 4.0F: the sockatmark() system call was added in Tru64 4.0F, and the IO extension refers to that symbol. =head1 read_cur_obj_info: bad file magic number You may be mixing the Tru64 cc/ar/ld with the GNU gcc/ar/ld.
That may work, but sometimes it doesn't (your gcc or GNU utils may have been compiled for an incompatible OS release). Try 'which ar' and 'which ld' (or try 'ar --version' and 'ld --version', which work only for the GNU tools, and will announce themselves to be such), and adjust your PATH so that you are consistently using either the native tools or the GNU tools. After fixing your PATH, you should do 'make distclean' and start all the way from running Configure since you may have quite a confused situation. =head1 AUTHOR Jarkko Hietaniemi <jhi@iki.fi> =cut

=encoding utf-8 =head1 NAME perlgov - Perl Rules of Governance =head1 PREAMBLE We are forming a system of governance for development of the Perl programming language. The scope of governance includes the language definition, its implementation, its test suite, its documentation, and the policies and procedures by which it is developed and maintained. The system of governance includes definitions of the groups that will make decisions, the rules by which these groups are formed and changed, and the enumerated powers and constraints on the activities of these governing groups. In forming a system of governance, we seek to achieve the following goals: =over =item * We want a system that is functional. That means the governing groups may decide to undertake large changes, or they may decide to act conservatively, but they will act with intent and clear communication rather than fail to reach decisions when needed. =item * We want a system that is trusted. That means that a reasonable contributor to Perl might disagree with decisions made by the governing groups, but will accept that they were made in good faith in consultation with relevant communities outside the governing groups. =item * We want a system that is sustainable.
That means it has provisions to self-modify, including ways of adding new members to the governing groups, ways to survive members becoming inactive, and ways of amending the rules of governance themselves if needed. =item * We want a system that is transparent. That means that it will prefer policies that manage ordinary matters in public, and it will prefer secrecy in a limited number of situations. =item * We want a system that is respectful. That means that it will establish standards of civil discourse that allow for healthy disagreement but avoid rancor and hostility in the community for which it is responsible. =back =head1 Mandate Perl language governance shall work to: =over =item * Maintain the quality, stability, and continuity of the Perl language and interpreter =item * Guide the evolution of the Perl language and interpreter =item * Establish and oversee the policies, procedures, systems, and mechanisms that enable a community of contributors to the Perl language and interpreter =item * Encourage discussion and consensus among contributors as preferential to formal decision making by governance groups =item * Facilitate communication between contributors and external stakeholders in the broader Perl ecosystem =back =head1 Definitions This document describes three roles involved in governance: =over =item "Core Team" =item "Steering Council" =item "Vote Administrator" =back A section on each follows. =head2 The Core Team The Core Team are a group of trusted volunteers involved in the ongoing development of the Perl language and interpreter. They are not required to be language developers or committers. References to specific votes are explained in the "Rules for Voting" section. =head3 Powers In addition to their contributions to the Perl language, the Core Team sets the rules of Perl governance, decides who participates in what role in governance, and delegates substantial decision making power to the Steering Council. 
Specifically: =over =item * They elect the Steering Council and have the power to remove Steering Council members. =item * In concert with the Steering Council, they manage Core Team membership. =item * In concert with the Steering Council, they have the power to modify the Perl Rules of Governance. =back The Core Team do not have any authority over parts of the Perl ecosystem unrelated to developing and releasing the language itself. These include, but are not limited to: =over =item * The Perl Foundation =item * CPAN administration and CPAN authors =item * perl.org, metacpan.org, and other community-maintained websites and services =item * Perl conferences and events, except those organized directly by the Core Team =item * Perl-related intellectual property legally owned by third-parties, except as allowed by applicable licenses or agreements =back =head3 Membership The initial Core Team members will be specified when this document is first ratified. Any Core Team member may nominate someone to be added to the Core Team by sending the nomination to the Steering Council. The Steering Council must approve or reject the nomination. If approved, the Steering Council will organize a Membership Change Vote to ratify the addition. Core Team members should demonstrate: =over =item * A solid track record of being constructive and helpful =item * Significant contributions to the project's goals, in any form =item * Willingness to dedicate some time to improving Perl =back Contributions are not limited to code. Here is an incomplete list of areas where contributions may be considered for joining the Core Team: =over =item * Working on community management and outreach =item * Providing support on mailing lists, IRC, or other forums =item * Triaging tickets =item * Writing patches (code, docs, or tests) =item * Reviewing patches (code, docs, or tests) =item * Participating in design discussions =item * Providing expertise in a particular domain (security, i18n, etc.) 
=item * Managing Perl infrastructure (websites, CI, documentation, etc.) =item * Maintaining significant projects in the Perl ecosystem =item * Creating visual designs =back Core Team membership acknowledges sustained and valuable efforts that align well with the philosophy and the goals of the Perl project. Core Team members are expected to act as role models for the community and custodians of the project, on behalf of the community and all those who rely on Perl. =head3 Term Core Team members serve until they are removed. =head3 Removal Core Team Members may resign their position at any time. In exceptional circumstances, it may be necessary to remove someone from the Core Team against their will, such as for flagrant or repeated violations of a Code of Conduct. Any Core Team member may send a recall request to the Steering Council naming the individual to be removed. The Steering Council must approve or reject the recall request. If approved, the Steering Council will organize a Membership Change vote to ratify the removal. If the removed member is also on the Steering Council, then they are removed from the Steering Council as well. =head3 Inactivity Core Team members who have stopped contributing are encouraged to declare themselves "inactive". Inactive members do not nominate or vote. Inactive members may declare themselves active at any time, except when a vote has been proposed and is not concluded. Eligibility to nominate or vote will be determined by the Vote Administrator. To record and honor their contributions, inactive Core Team members will continue to be listed alongside active members. =head3 No Confidence in the Steering Council The Core Team may remove either a single Steering Council member or the entire Steering Council via a No Confidence Vote. A No Confidence Vote is triggered when a Core Team member calls for one publicly on an appropriate project communication channel, and another Core Team member seconds the proposal. 
If a No Confidence Vote removes all Steering Council members, the Vote Administrator of the No Confidence Vote will then administer an election to select a new Steering Council. =head3 Amending Perl Rules of Governance Any Core Team member may propose amending the Perl Rules of Governance by sending a proposal to the Steering Council. The Steering Council must decide to approve or reject the proposal. If approved, the Steering Council will administer an Amendment Vote. =head3 Rules for Voting Membership Change, Amendment, and No Confidence Votes require 2/3 of participating votes from Core Team members to pass. A Vote Administrator must be selected following the rules in the "Vote Administrator" section. The vote occurs in two steps: =over =item 1 The Vote Administrator describes the proposal being voted upon. The Core Team then may discuss the matter in advance of voting. =item 2 Active Core Team members vote in favor or against the proposal. Voting is performed anonymously. =back For a Membership Change Vote, each phase will last one week. For Amendment and No Confidence Votes, each phase will last two weeks. =head2 The Steering Council The Steering Council is a 3-person committee, elected by the Core Team. Candidates are not required to be members of the Core Team. Non-member candidates are added to the Core Team if elected as if by a Membership Change Vote. References to specific elections are explained in the "Rules for Elections" section. =head3 Powers The Steering Council has broad authority to make decisions about the development of the Perl language, the interpreter, and all other components, systems and processes that result in new releases of the language interpreter. 
For example, it can: =over =item * Manage the schedule and process for shipping new releases =item * Establish procedures for proposing, discussing and deciding upon changes to the language =item * Delegate power to individuals on or outside the Steering Council =back Decisions of the Steering Council will be made by majority vote of non-vacant seats on the council. The Steering Council should look for ways to use these powers as little as possible. Instead of voting, it's better to seek consensus. Instead of ruling on individual cases, it's better to define standards and processes that apply to all cases. As with the Core Team, the Steering Council does not have any authority over parts of the Perl ecosystem unrelated to developing and releasing the language itself. The Steering Council does not have the power to modify the Perl Rules of Governance, except as provided in the section "Amending Perl Rules of Governance". =head3 Term A new Steering Council will be chosen by a Term Election within two weeks after each stable feature release (that is, change to C<PERL_REVISION> or C<PERL_VERSION>) or after two years, whichever comes first. The council members will serve until the completion of the next Term Election unless they are removed. =head3 Removal Steering Council members may resign their position at any time. Whenever there are vacancies on the Steering Council, the council will organize a Special Election within one week after the vacancy occurs. If the entire Steering Council is ever vacant, a Term Election will be held instead. If a Steering Council member is deceased, or drops out of touch and cannot be contacted for a month or longer, then the rest of the council may vote to declare their seat vacant. If an absent member returns after such a declaration is made, they are not reinstated automatically, but may run in the Special Election to fill the vacancy. 
Otherwise, Steering Council members may only be removed before the end of their term through a No Confidence Vote by the Core Team. =head3 Rules for Elections Term and Special Election are ranked-choice votes to construct an ordered list of candidates to fill vacancies in the Steering Council. A Vote Administrator must be selected following the rules in the "Vote Administrator" section. Both Term and Special Elections occur in two stages: =over =item 1 Candidates advertise their interest in serving. Candidates must be nominated by an active Core Team member. Self-nominations are allowed. Nominated candidates may share a statement about their candidacy with the Core Team. =item 2 Active Core Team Members vote by ranking all candidates. Voting is performed anonymously. After voting is complete, candidates are ranked using the Condorcet Internet Voting Service's proportional representation mode. If a tie occurs, it may be resolved by mutual agreement among the tied candidates, or else the tie will be resolved through random selection by the Vote Administrator. =back Anyone voted off the Core Team is not eligible to be a candidate for Steering Council unless re-instated to the Core Team. For a Term Election, each phase will last two weeks. At the end of the second phase, the top three ranked candidates are elected as the new Steering Council. For a Special Election, each phase will last one week. At the end of the second phase, vacancies are filled from the ordered list of candidates until no vacancies remain. The election of the first Steering Council will be a Term Election. Ricardo Signes will be the Vote Administrator for the initial Term Election unless he is a candidate, in which case he will select a non-candidate administrator to replace him. =head2 The Vote Administrator Every election or vote requires a Vote Administrator who manages communication, collection of secret ballots, and all other necessary activities to complete the voting process. 
Unless otherwise specified, the Steering Council selects the Vote Administrator. A Vote Administrator must not be a member of the Steering Council nor a candidate or subject of the vote. A Vote Administrator may be a member of the Core Team and, if so, may cast a vote while also serving as administrator. If the Vote Administrator becomes a candidate during an election vote, they will appoint a non-candidate replacement. If the entire Steering Council is vacant or is the subject of a No Confidence Vote, then the Core Team will select a Vote Administrator by consensus. If consensus cannot be reached within one week, the President of The Perl Foundation will select a Vote Administrator. =head1 Core Team Members The current members of the Perl Core Team are: =over =item * Abhijit Menon-Sen (inactive) =item * Andy Dougherty =item * Chad Granum =item * Chris 'BinGOs' Williams =item * Craig Berry =item * Dagfinn Ilmari Mannsåker =item * Dave Mitchell =item * David Golden =item * H. Merijn Brand =item * Hugo van der Sanden =item * James E Keenan =item * Jan Dubois (inactive) =item * Jesse Vincent (inactive) =item * Karen Etheridge =item * Karl Williamson =item * Leon Timmermans =item * Matthew Horsfall =item * Max Maischein =item * Nicholas Clark =item * Nicolas R. =item * Paul "LeoNerd" Evans =item * Philippe "BooK" Bruhat =item * Ricardo Signes =item * Sawyer X =item * Steve Hay =item * Stuart Mackintosh =item * Todd Rinaldo =item * Tony Cook =back

=head1 NAME perlnewmod - preparing a new module for distribution =head1 DESCRIPTION This document gives you some suggestions about how to go about writing Perl modules, preparing them for distribution, and making them available via CPAN. One of the things that makes Perl really powerful is the fact that Perl hackers tend to want to share the solutions to problems they've faced, so you and I don't have to battle with the same problem again.
The main way they do this is by abstracting the solution into a Perl module. If you don't know what one of these is, the rest of this document isn't going to be much use to you. You're also missing out on an awful lot of useful code; consider having a look at L<perlmod>, L<perlmodlib> and L<perlmodinstall> before coming back here. When you've found that there isn't a module available for what you're trying to do, and you've had to write the code yourself, consider packaging up the solution into a module and uploading it to CPAN so that others can benefit. You should also take a look at L<perlmodstyle> for best practices in making a module. =head2 Warning We're going to primarily concentrate on Perl-only modules here, rather than XS modules. XS modules serve a rather different purpose, and you should consider different things before distributing them - the popularity of the library you are gluing, the portability to other operating systems, and so on. However, the notes on preparing the Perl side of the module and packaging and distributing it will apply equally well to an XS module as a pure-Perl one. =head2 What should I make into a module? You should make a module out of any code that you think is going to be useful to others. Anything that's likely to fill a hole in the communal library and which someone else can slot directly into their program. Any part of your code which you can isolate and extract and plug into something else is a likely candidate. Let's take an example. Suppose you're reading in data from a local format into a hash-of-hashes in Perl, turning that into a tree, walking the tree and then piping each node to an Acme Transmogrifier Server. Now, quite a few people have the Acme Transmogrifier, and you've had to write something to talk the protocol from scratch - you'd almost certainly want to make that into a module. 
The level at which you pitch it is up to you: you might want protocol-level modules analogous to L<Net::SMTP|Net::SMTP> which then talk to higher level modules analogous to L<Mail::Send|Mail::Send>. The choice is yours, but you do want to get a module out for that server protocol. Nobody else on the planet is going to talk your local data format, so we can ignore that. But what about the thing in the middle? Building tree structures from Perl variables and then traversing them is a nice, general problem, and if nobody's already written a module that does that, you might want to modularise that code too. So hopefully you've now got a few ideas about what's good to modularise. Let's now see how it's done. =head2 Step-by-step: Preparing the ground Before we even start scraping out the code, there are a few things we'll want to do in advance. =over 3 =item Look around Dig into a bunch of modules to see how they're written. I'd suggest starting with L<Text::Tabs|Text::Tabs>, since it's in the standard library and is nice and simple, and then looking at something a little more complex like L<File::Copy|File::Copy>. For object oriented code, L<WWW::Mechanize> or the C<Email::*> modules provide some good examples. These should give you an overall feel for how modules are laid out and written. =item Check it's new There are a lot of modules on CPAN, and it's easy to miss one that's similar to what you're planning on contributing. Have a good plough through L<http://metacpan.org> and make sure you're not the one reinventing the wheel! =item Discuss the need You might love it. You might feel that everyone else needs it. But there might not actually be any real demand for it out there. If you're unsure about the demand your module will have, consider asking the C<module-authors@perl.org> mailing list (send an email to C<module-authors-subscribe@perl.org> to subscribe; see L<https://lists.perl.org/list/module-authors.html> for more information and a link to the archives). 
=item Choose a name Perl modules included on CPAN have a naming hierarchy you should try to fit in with. See L<perlmodlib> for more details on how this works, and browse around CPAN and the modules list to get a feel of it. At the very least, remember this: modules should be title capitalised, (This::Thing) fit in with a category, and explain their purpose succinctly. =item Check again While you're doing that, make really sure you haven't missed a module similar to the one you're about to write. When you've got your name sorted out and you're sure that your module is wanted and not currently available, it's time to start coding. =back =head2 Step-by-step: Making the module =over 3 =item Start with F<module-starter> or F<h2xs> The F<module-starter> utility is distributed as part of the L<Module::Starter|Module::Starter> CPAN package. It creates a directory with stubs of all the necessary files to start a new module, according to recent "best practice" for module development, and is invoked from the command line, thus: module-starter --module=Foo::Bar \ --author="Your Name" --email=yourname@cpan.org If you do not wish to install the L<Module::Starter|Module::Starter> package from CPAN, F<h2xs> is an older tool, originally intended for the development of XS modules, which comes packaged with the Perl distribution. A typical invocation of L<h2xs|h2xs> for a pure Perl module is: h2xs -AX --skip-exporter --use-new-tests -n Foo::Bar The C<-A> omits the Autoloader code, C<-X> omits XS elements, C<--skip-exporter> omits the Exporter code, C<--use-new-tests> sets up a modern testing environment, and C<-n> specifies the name of the module. =item Use L<strict|strict> and L<warnings|warnings> A module's code has to be warning and strict-clean, since you can't guarantee the conditions that it'll be used under. Besides, you wouldn't want to distribute code that wasn't warning or strict-clean anyway, right? 
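A minimal sketch of a module preamble that follows this advice is shown below. The package name C<Foo::Bar> is the same placeholder used in the commands above, while the C<double> subroutine is purely hypothetical and only there to give the pragmas something to check.

```perl
package Foo::Bar;

# Enable both pragmas before any other code in the module.
use strict;
use warnings;

our $VERSION = '0.01';

# Hypothetical subroutine: every variable is declared with "my",
# so the code compiles cleanly under strict.
sub double {
    my ($n) = @_;
    return $n * 2;
}

1;    # a module file must end with a true value
```

Running C<perl -c> on the file is a quick way to confirm that it still compiles cleanly after each change.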
=item Use L<Carp|Carp> The L<Carp|Carp> module allows you to present your error messages from the caller's perspective; this gives you a way to signal a problem with the caller and not your module. For instance, if you say this: warn "No hostname given"; the user will see something like this: No hostname given at /usr/local/lib/perl5/site_perl/5.6.0/Net/Acme.pm line 123. which looks like your module is doing something wrong. Instead, you want to put the blame on the user, and say this: No hostname given at bad_code, line 10. You do this by using L<Carp|Carp> and replacing your C<warn>s with C<carp>s. If you need to C<die>, say C<croak> instead. However, keep C<warn> and C<die> in place for your sanity checks - where it really is your module at fault. =item Use L<Exporter|Exporter> - wisely! L<Exporter|Exporter> gives you a standard way of exporting symbols and subroutines from your module into the caller's namespace. For instance, saying C<use Net::Acme qw(&frob)> would import the C<frob> subroutine. The package variable C<@EXPORT> will determine which symbols will get exported when the caller simply says C<use Net::Acme> - you will hardly ever want to put anything in there. C<@EXPORT_OK>, on the other hand, specifies which symbols you're willing to export. If you do want to export a bunch of symbols, use the C<%EXPORT_TAGS> and define a standard export set - look at L<Exporter> for more details. =item Use L<plain old documentation|perlpod> The work isn't over until the paperwork is done, and you're going to need to put in some time writing some documentation for your module. C<module-starter> or C<h2xs> will provide a stub for you to fill in; if you're not sure about the format, look at L<perlpod> for an introduction. Provide a good synopsis of how your module is used in code, a description, and then notes on the syntax and function of the individual subroutines or methods. Use Perl comments for developer notes and POD for end-user notes. 
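Pulling the preceding items together, a module that follows this advice might look roughly like the sketch below. C<Net::Acme> and C<frob> are the hypothetical names already used above; the function body, the version number and the POD text are assumptions made up for the example. The module is written inline in a block so that the sketch runs as a single file; in a real distribution it would live in F<lib/Net/Acme.pm> and be loaded with C<use Net::Acme qw(frob)>.

```perl
use strict;
use warnings;

# Net::Acme and frob() are hypothetical names taken from the text above.
# In a real distribution this block would be the file lib/Net/Acme.pm.
{
    package Net::Acme;

    use strict;
    use warnings;
    use Carp qw(carp);
    use Exporter qw(import);

    our $VERSION     = '0.01';
    our @EXPORT      = ();                        # nothing exported by default
    our @EXPORT_OK   = qw(frob);                  # exported only on request
    our %EXPORT_TAGS = ( all => [@EXPORT_OK] );   # use Net::Acme qw(:all)

    sub frob {
        my ($hostname) = @_;
        if ( !defined $hostname || $hostname eq '' ) {
            carp "No hostname given";    # reported at the caller's line
            return;
        }
        return "frobbed $hostname";
    }
}

=head1 NAME

Net::Acme - hypothetical example module

=head1 SYNOPSIS

    use Net::Acme qw(frob);
    my $status = frob('example.org');

=cut

# What "use Net::Acme qw(frob)" does once the module is in its own file.
Net::Acme->import('frob');

my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

my $status  = frob('example.org');
my $nothing = frob('');

print "$status\n";
print "caller saw: $warnings[0]" if @warnings;
```

Because C<frob> uses C<carp> rather than C<warn>, the file and line in the message are those of the calling code, not of the module.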
=item Write tests You're encouraged to create self-tests for your module to ensure it's working as intended on the myriad platforms Perl supports; if you upload your module to CPAN, a host of testers will build your module and send you the results of the tests. Again, C<module-starter> and C<h2xs> provide a test framework which you can extend - you should do something more than just checking your module will compile. L<Test::Simple|Test::Simple> and L<Test::More|Test::More> are good places to start when writing a test suite. =item Write the F<README> If you're uploading to CPAN, the automated gremlins will extract the README file and place that in your CPAN directory. It'll also appear in the main F<by-module> and F<by-category> directories if you make it onto the modules list. It's a good idea to put here what the module actually does in detail. =item Write F<Changes> Add any user-visible changes since the last release to your F<Changes> file. =back =head2 Step-by-step: Distributing your module =over 3 =item Get a CPAN user ID Every developer publishing modules on CPAN needs a CPAN ID. Visit C<L<http://pause.perl.org/>>, select "Request PAUSE Account", and wait for your request to be approved by the PAUSE administrators. =item C<perl Makefile.PL; make test; make distcheck; make dist> Once again, C<module-starter> or C<h2xs> has done all the work for you. They produce the standard C<Makefile.PL> you see when you download and install modules, and this produces a Makefile with a C<dist> target. Once you've ensured that your module passes its own tests - always a good thing to make sure - you can C<make distcheck> to make sure everything looks OK, followed by C<make dist>, and the Makefile will hopefully produce you a nice tarball of your module, ready for upload. =item Upload the tarball The email you got when you received your CPAN ID will tell you how to log in to PAUSE, the Perl Authors Upload SErver. From the menus there, you can upload your module to CPAN. 
Alternatively, you can use the F<cpan-upload> script, part of the L<CPAN::Uploader> distribution on CPAN. =item Fix bugs! Once you start accumulating users, they'll send you bug reports. If you're lucky, they'll even send you patches. Welcome to the joys of maintaining a software project... =back =head1 AUTHOR Simon Cozens, C<simon@cpan.org> Updated by Kirrily "Skud" Robert, C<skud@cpan.org> =head1 SEE ALSO L<perlmod>, L<perlmodlib>, L<perlmodinstall>, L<h2xs>, L<strict>, L<Carp>, L<Exporter>, L<perlpod>, L<Test::Simple>, L<Test::More>, L<ExtUtils::MakeMaker>, L<Module::Build>, L<Module::Starter>, L<http://www.cpan.org/>, Ken Williams' tutorial on building your own module at L<http://mathforum.org/~ken/perl_modules.html>
=encoding utf8 =head1 NAME perl588delta - what is new for perl v5.8.8 =head1 DESCRIPTION This document describes differences between the 5.8.7 release and the 5.8.8 release. =head1 Incompatible Changes There are no changes intentionally incompatible with 5.8.7. If any exist, they are bugs and reports are welcome. =head1 Core Enhancements =over =item * C<chdir>, C<chmod> and C<chown> can now work on filehandles as well as filenames, if the system supports respectively C<fchdir>, C<fchmod> and C<fchown>, thanks to a patch provided by Gisle Aas. =back =head1 Modules and Pragmata =over =item * C<Attribute::Handlers> upgraded to version 0.78_02 =over =item * Documentation typo fix =back =item * C<attrs> upgraded to version 1.02 =over =item * Internal cleanup only =back =item * C<autouse> upgraded to version 1.05 =over =item * Simplified implementation =back =item * C<B> upgraded to version 1.09_01 =over =item * The inheritance hierarchy of the C<B::> modules has been corrected; C<B::NV> now inherits from C<B::SV> (instead of C<B::IV>).
=back =item * C<blib> upgraded to version 1.03 =over =item * Documentation typo fix =back =item * C<ByteLoader> upgraded to version 0.06 =over =item * Internal cleanup =back =item * C<CGI> upgraded to version 3.15 =over =item * Extraneous "?" from C<self_url()> removed =item * C<scrolling_list()> select attribute fixed =item * C<virtual_port> now works properly with the https protocol =item * C<upload_hook()> and C<append()> now work in function-oriented mode =item * C<POST_MAX> doesn't cause the client to hang any more =item * Automatic tab indexes are now disabled and a new C<-tabindex> pragma has been added to turn automatic indexes back on =item * C<end_form()> doesn't emit empty (and non-validating) C<< <div> >> =item * C<CGI::Carp> works better in certain mod_perl configurations =item * Setting C<$CGI::TMPDIRECTORY> is now effective =item * Enhanced documentation =back =item * C<charnames> upgraded to version 1.05 =over =item * C<viacode()> now accepts hex strings and has been optimized. =back =item * C<CPAN> upgraded to version 1.76_02 =over =item * 1 minor bug fix for Win32 =back =item * C<Cwd> upgraded to version 3.12 =over =item * C<canonpath()> on Win32 now collapses F<foo\..> sections correctly. =item * Improved behaviour on Symbian OS. =item * Enhanced documentation and typo fixes =item * Internal cleanup =back =item * C<Data::Dumper> upgraded to version 2.121_08 =over =item * A problem where C<Data::Dumper> would sometimes update the iterator state of hashes has been fixed =item * Numeric labels now work =item * Internal cleanup =back =item * C<DB> upgraded to version 1.01 =over =item * A problem where the state of the regexp engine would sometimes get clobbered when running under the debugger has been fixed. =back =item * C<DB_File> upgraded to version 1.814 =over =item * Adds support for Berkeley DB 4.4.
=back =item * C<Devel::DProf> upgraded to version 20050603.00 =over =item * Internal cleanup =back =item * C<Devel::Peek> upgraded to version 1.03 =over =item * Internal cleanup =back =item * C<Devel::PPPort> upgraded to version 3.06_01 =over =item * C<--compat-version> argument checking has been improved =item * Files passed on the command line are filtered by default =item * C<--nofilter> option to override the filtering has been added =item * Enhanced documentation =back =item * C<diagnostics> upgraded to version 1.15 =over =item * Documentation typo fix =back =item * C<Digest> upgraded to version 1.14 =over =item * The constructor now knows which module implements SHA-224 =item * Documentation tweaks and typo fixes =back =item * C<Digest::MD5> upgraded to version 2.36 =over =item * C<XSLoader> is now used for faster loading =item * Enhanced documentation including MD5 weaknesses discovered lately =back =item * C<Dumpvalue> upgraded to version 1.12 =over =item * Documentation fix =back =item * C<DynaLoader> upgraded but unfortunately we're not able to increment its version number :-( =over =item * Implements C<dl_unload_file> on Win32 =item * Internal cleanup =item * C<XSLoader> 0.06 incorporated; small optimisation for calling C<bootstrap_inherit()> and documentation enhancements. =back =item * C<Encode> upgraded to version 2.12 =over =item * A coderef is now acceptable for C<CHECK>! =item * 3 new characters added to the ISO-8859-7 encoding =item * New encoding C<MIME-Header-ISO_2022_JP> added =item * Problem with partial characters and C<< encoding(utf-8-strict) >> fixed. 
=item * Documentation enhancements and typo fixes =back =item * C<English> upgraded to version 1.02 =over =item * the C<< $COMPILING >> variable has been added =back =item * C<ExtUtils::Constant> upgraded to version 0.17 =over =item * Improved compatibility with older versions of perl =back =item * C<ExtUtils::MakeMaker> upgraded to version 6.30 (was 6.17) =over =item * Too much to list here; see L<http://search.cpan.org/dist/ExtUtils-MakeMaker/Changes> =back =item * C<File::Basename> upgraded to version 2.74, with changes contributed by Michael Schwern. =over =item * Documentation clarified and errors corrected. =item * C<basename> now strips trailing path separators before processing the name. =item * C<basename> now returns C</> for parameter C</>, to make C<basename> consistent with the shell utility of the same name. =item * The suffix is no longer stripped if it is identical to the remaining characters in the name, again for consistency with the shell utility. =item * Some internal code cleanup. =back =item * C<File::Copy> upgraded to version 2.09 =over =item * Copying a file onto itself used to fail. =item * Moving a file between file systems now preserves the access and modification time stamps =back =item * C<File::Find> upgraded to version 1.10 =over =item * Win32 portability fixes =item * Enhanced documentation =back =item * C<File::Glob> upgraded to version 1.05 =over =item * Internal cleanup =back =item * C<File::Path> upgraded to version 1.08 =over =item * C<mkpath> now preserves C<errno> when C<mkdir> fails =back =item * C<File::Spec> upgraded to version 3.12 =over =item * C<< File::Spec->rootdir() >> now returns C<\> on Win32, instead of C</> =item * C<$^O> could sometimes become tainted. This has been fixed. =item * C<canonpath> on Win32 now collapses C<foo/..> (or C<foo\..>) sections correctly, rather than doing the "misguided" work it was previously doing. 
Note that C<canonpath> on Unix still does B<not> collapse these sections, as doing so would be incorrect. =item * Some documentation improvements =item * Some internal code cleanup =back =item * C<FileCache> upgraded to version 1.06 =over =item * POD formatting errors in the documentation fixed =back =item * C<Filter::Simple> upgraded to version 0.82 =item * C<FindBin> upgraded to version 1.47 =over =item * Now works better with directories where access rights are more restrictive than usual. =back =item * C<GDBM_File> upgraded to version 1.08 =over =item * Internal cleanup =back =item * C<Getopt::Long> upgraded to version 2.35 =over =item * C<prefix_pattern> has now been complemented by a new configuration option C<long_prefix_pattern> that allows the user to specify what prefix patterns should have long option style semantics applied. =item * Options can now take multiple values at once (experimental) =item * Various bug fixes =back =item * C<if> upgraded to version 0.05 =over =item * Give more meaningful error messages from C<if> when invoked with a condition in list context. 
=item * Restore backwards compatibility with earlier versions of perl =back =item * C<IO> upgraded to version 1.22 =over =item * Enhanced documentation =item * Internal cleanup =back =item * C<IPC::Open2> upgraded to version 1.02 =over =item * Enhanced documentation =back =item * C<IPC::Open3> upgraded to version 1.02 =over =item * Enhanced documentation =back =item * C<List::Util> upgraded to version 1.18 (was 1.14) =over =item * Fix pure-perl version of C<refaddr> to avoid blessing an un-blessed reference =item * Use C<XSLoader> for faster loading =item * Fixed various memory leaks =item * Internal cleanup and portability fixes =back =item * C<Math::Complex> upgraded to version 1.35 =over =item * C<atan2(0, i)> now works, as do all the (computable) complex argument cases =item * Fixes for certain bugs in C<make> and C<emake> =item * Support returning the I<k>th root directly =item * Support C<[2,-3pi/8]> in C<emake> =item * Support C<inf> for C<make>/C<emake> =item * Document C<make>/C<emake> more visibly =back =item * C<Math::Trig> upgraded to version 1.03 =over =item * Add more great circle routines: C<great_circle_waypoint> and C<great_circle_destination> =back =item * C<MIME::Base64> upgraded to version 3.07 =over =item * Use C<XSLoader> for faster loading =item * Enhanced documentation =item * Internal cleanup =back =item * C<NDBM_File> upgraded to version 1.06 =over =item * Enhanced documentation =back =item * C<ODBM_File> upgraded to version 1.06 =over =item * Documentation typo fixed =item * Internal cleanup =back =item * C<Opcode> upgraded to version 1.06 =over =item * Enhanced documentation =item * Internal cleanup =back =item * C<open> upgraded to version 1.05 =over =item * Enhanced documentation =back =item * C<overload> upgraded to version 1.04 =over =item * Enhanced documentation =back =item * C<PerlIO> upgraded to version 1.04 =over =item * C<PerlIO::via> now iterates over layers properly =item * C<PerlIO::scalar> now understands C<< $/ = "" >>
=item * C<encoding(utf-8-strict)> with partial characters now works =item * Enhanced documentation =item * Internal cleanup =back =item * C<Pod::Functions> upgraded to version 1.03 =over =item * Documentation typos fixed =back =item * C<Pod::Html> upgraded to version 1.0504 =over =item * HTML output will now correctly link to C<=item>s on the same page, and should be valid XHTML. =item * Variable names are recognized as intended =item * Documentation typos fixed =back =item * C<Pod::Parser> upgraded to version 1.32 =over =item * Allow files that start with C<=head> on the first line =item * Win32 portability fix =item * Exit status of C<pod2usage> fixed =item * New C<-noperldoc> switch for C<pod2usage> =item * Arbitrary URL schemes now allowed =item * Documentation typos fixed =back =item * C<POSIX> upgraded to version 1.09 =over =item * Documentation typos fixed =item * Internal cleanup =back =item * C<re> upgraded to version 0.05 =over =item * Documentation typo fixed =back =item * C<Safe> upgraded to version 2.12 =over =item * Minor documentation enhancement =back =item * C<SDBM_File> upgraded to version 1.05 =over =item * Documentation typo fixed =item * Internal cleanup =back =item * C<Socket> upgraded to version 1.78 =over =item * Internal cleanup =back =item * C<Storable> upgraded to version 2.15 =over =item * This includes the C<STORABLE_attach> hook functionality added by Adam Kennedy, and more frugal memory requirements when storing under C<ithreads>, by using the C<ithreads> cloning tracking code. =back =item * C<Switch> upgraded to version 2.10_01 =over =item * Documentation typos fixed =back =item * C<Sys::Syslog> upgraded to version 0.13 =over =item * Now provides numeric macros and meaningful C<Exporter> tags. =item * No longer uses C<Sys::Hostname> as it may provide useless values in unconfigured network environments, so instead uses C<INADDR_LOOPBACK> directly. =item * C<syslog()> now uses local timestamp. 
=item * C<setlogmask()> now behaves like its C counterpart. =item * C<setlogsock()> will now C<croak()> as documented. =item * Improved error and warnings messages. =item * Improved documentation. =back =item * C<Term::ANSIColor> upgraded to version 1.10 =over =item * Fixes a bug in C<colored> when C<$EACHLINE> is set that caused it to not color lines consisting solely of 0 (literal zero). =item * Improved tests. =back =item * C<Term::ReadLine> upgraded to version 1.02 =over =item * Documentation tweaks =back =item * C<Test::Harness> upgraded to version 2.56 (was 2.48) =over =item * The C<Test::Harness> timer is now off by default. =item * Now shows elapsed time in milliseconds. =item * Various bug fixes =back =item * C<Test::Simple> upgraded to version 0.62 (was 0.54) =over =item * C<is_deeply()> no longer fails to work for many cases =item * Various minor bug fixes =item * Documentation enhancements =back =item * C<Text::Tabs> upgraded to version 2005.0824 =over =item * Provides a faster implementation of C<expand> =back =item * C<Text::Wrap> upgraded to version 2005.082401 =over =item * Adds C<$Text::Wrap::separator2>, which allows you to preserve existing newlines but add line-breaks with some other string. =back =item * C<threads> upgraded to version 1.07 =over =item * C<threads> will now honour C<no warnings 'threads'> =item * A thread's interpreter is now freed after C<< $t->join() >> rather than after C<undef $t>, which should fix some C<ithreads> memory leaks. (Fixed by Dave Mitchell) =item * Some documentation typo fixes. =back =item * C<threads::shared> upgraded to version 0.94 =over =item * Documentation changes only =item * Note: An improved implementation of C<threads::shared> is available on CPAN - this will be merged into 5.8.9 if it proves stable. 
=back =item * C<Tie::Hash> upgraded to version 1.02 =over =item * Documentation typo fixed =back =item * C<Time::HiRes> upgraded to version 1.86 (was 1.66) =over =item * C<clock_nanosleep()> and C<clock()> functions added =item * Support for the POSIX C<clock_gettime()> and C<clock_getres()> has been added =item * Return C<undef> or an empty list if the C C<gettimeofday()> function fails =item * Improved C<nanosleep> detection =item * Internal cleanup =item * Enhanced documentation =back =item * C<Unicode::Collate> upgraded to version 0.52 =over =item * Now implements UCA Revision 14 (based on Unicode 4.1.0). =item * C<< Unicode::Collate->new >> method no longer overwrites user's C<$_> =item * Enhanced documentation =back =item * C<Unicode::UCD> upgraded to version 0.24 =over =item * Documentation typos fixed =back =item * C<User::grent> upgraded to version 1.01 =over =item * Documentation typo fixed =back =item * C<utf8> upgraded to version 1.06 =over =item * Documentation typos fixed =back =item * C<vmsish> upgraded to version 1.02 =over =item * Documentation typos fixed =back =item * C<warnings> upgraded to version 1.05 =over =item * Gentler messing with C<Carp::> internals =item * Internal cleanup =item * Documentation update =back =item * C<Win32> upgraded to version 0.2601 =for cynics And how many perl 5.8.x versions can I release ahead of Vista? =over =item * Provides Windows Vista support to C<Win32::GetOSName> =item * Documentation enhancements =back =item * C<XS::Typemap> upgraded to version 0.02 =over =item * Internal cleanup =back =back =head1 Utility Changes =head2 C<h2xs> enhancements C<h2xs> implements new option C<--use-xsloader> to force use of C<XSLoader> even in backwards compatible modules. The handling of authors' names that had apostrophes has been fixed. Any enums with negative values are now skipped. =head2 C<perlivp> enhancements C<perlivp> implements new option C<-a> and will not check for F<*.ph> files by default any more. 
Use the C<-a> option to run I<all> tests. =head1 New Documentation The L<perlglossary> manpage is a glossary of terms used in the Perl documentation, technical and otherwise, kindly provided by O'Reilly Media, Inc. =head1 Performance Enhancements =over 4 =item * Weak reference creation is now I<O(1)> rather than I<O(n)>, courtesy of Nicholas Clark. Weak reference deletion remains I<O(n)>, but if deletion only happens at program exit, it may be skipped completely. =item * Salvador Fandiño provided improvements to reduce the memory usage of C<sort> and to speed up some cases. =item * Jarkko Hietaniemi and Andy Lester worked to mark as much data as possible in the C source files as C<static>, to increase the proportion of the executable file that the operating system can share between processes, and thus reduce real memory usage on multi-user systems. =back =head1 Installation and Configuration Improvements Parallel makes should work properly now, although there may still be problems if C<make test> is instructed to run in parallel. Building with Borland's compilers on Win32 should work more smoothly. In particular, Steve Hay has worked to sidestep many warnings emitted by their compilers and at least one C compiler internal error. C<Configure> will now detect C<clearenv> and C<unsetenv>, thanks to a patch from Alan Burlison. It will also probe for C<futimes> and whether C<sprintf> correctly returns the length of the formatted string, which will both be used in perl 5.8.9. There are improved hints for next-3.0, vmesa, IX, Darwin, Solaris, Linux, DEC/OSF, HP-UX and MPE/iX. Perl extensions on Windows can now be statically built into the Perl DLL, thanks to work by Vadim Konovalov. (This improvement was actually in 5.8.7, but was accidentally omitted from L<perl587delta>).
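For readers unfamiliar with the weak references mentioned under Performance Enhancements above, the following sketch shows what creating one and having it cleared looks like from Perl code. It uses C<weaken> and C<isweak> from L<Scalar::Util>; the data structure is made up for illustration and says nothing about how the internal bookkeeping was changed.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken isweak);

# Two hashes that refer to each other would normally form a cycle that
# reference counting can never free.
my $parent = { name => 'parent' };
my $child  = { name => 'child', parent => $parent };
$parent->{child} = $child;

# Weakening the back-link means it no longer keeps the parent alive.
weaken( $child->{parent} );
print "back-link is weak\n" if isweak( $child->{parent} );

# Dropping the last strong reference frees the parent, and the weak
# reference is set to undef automatically.
undef $parent;
print "parent freed\n" unless defined $child->{parent};
```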
=head1 Selected Bug Fixes =head2 no warnings 'category' works correctly with -w Previously when running with warnings enabled globally via C<-w>, selective disabling of specific warning categories would actually turn off all warnings. This is now fixed; now C<no warnings 'io';> will only turn off warnings in the C<io> class. Previously it would erroneously turn off all warnings. This bug fix may cause some programs to start correctly issuing warnings. =head2 Remove over-optimisation Perl 5.8.4 introduced a change so that assignments of C<undef> to a scalar, or of an empty list to an array or a hash, were optimised away. As this could cause problems when C<goto> jumps were involved, this change has been backed out. =head2 sprintf() fixes Using the sprintf() function with some formats could lead to a buffer overflow in some specific cases. This has been fixed, along with several other bugs, notably in bounds checking. In related fixes, it was possible for badly written code that did not follow the documentation of C<Sys::Syslog> to have formatting vulnerabilities. C<Sys::Syslog> has been changed to protect people from poor quality third party code. =head2 Debugger and Unicode slowdown It had been reported that running under perl's debugger when processing Unicode data could cause unexpectedly large slowdowns. The most likely cause of this was identified and fixed by Nicholas Clark. =head2 Smaller fixes =over 4 =item * C<FindBin> now works better with directories where access rights are more restrictive than usual. =item * Several memory leaks in ithreads were closed. An improved implementation of C<threads::shared> is available on CPAN - this will be merged into 5.8.9 if it proves stable. =item * Trailing spaces are now trimmed from C<$!> and C<$^E>. =item * Operations that require perl to read a process's list of groups, such as reads of C<$(> and C<$)>, now dynamically allocate memory rather than using a fixed sized array. 
The fixed size array could cause C stack exhaustion on systems configured to use large numbers of groups. =item * C<PerlIO::scalar> now works better with non-default C<$/> settings. =item * You can now use the C<x> operator to repeat a C<qw//> list. This used to raise a syntax error. =item * The debugger now correctly traces execution in eval("")uated code that contains #line directives. =item * The value of the C<open> pragma is no longer ignored for three-argument opens. =item * The optimisation of C<for (reverse @a)> introduced in perl 5.8.6 could misbehave when the array had undefined elements and was used in LVALUE context. Dave Mitchell provided a fix. =item * Some case insensitive matches between UTF-8 encoded data and 8 bit regexps, and vice versa, could give malformed character warnings. These have been fixed by Dave Mitchell and Yves Orton. =item * C<lcfirst> and C<ucfirst> could corrupt the string for certain cases where the length of the UTF-8 encoding of the string in lower case, upper case or title case differed. This was fixed by Nicholas Clark. =item * Perl will now use the C library calls C<unsetenv> and C<clearenv> if present to delete keys from C<%ENV> and delete C<%ENV> entirely, thanks to a patch from Alan Burlison. =back =head1 New or Changed Diagnostics =head2 Attempt to set length of freed array This is a new warning, produced in situations such as this: $r = do {my @a; \$#a}; $$r = 503; =head2 Non-string passed as bitmask This is a new warning, produced when a number has been passed as an argument to select(), instead of a bitmask. # Wrong, will now warn $rin = fileno(STDIN); ($nfound,$timeleft) = select($rout=$rin, undef, undef, $timeout); # Should be $rin = ''; vec($rin,fileno(STDIN),1) = 1; ($nfound,$timeleft) = select($rout=$rin, undef, undef, $timeout); =head2 Search pattern not terminated or ternary operator parsed as search pattern This syntax error indicates that the lexer couldn't find the final delimiter of a C<?PATTERN?> construct.
Mentioning the ternary operator in this error message makes it easier to diagnose syntax errors. =head1 Changed Internals There has been a fair amount of refactoring of the C<C> source code, partly to make it tidier and more maintainable. The resulting object code and the C<perl> binary may well be smaller than in 5.8.7, in particular due to a change contributed by Dave Mitchell which reworked the warnings code to be significantly smaller. Apart from being smaller and possibly faster, there should be no user-detectable changes. Andy Lester supplied many improvements to determine which function parameters and local variables could actually be declared C<const> to the C compiler. Steve Peters provided new C<*_set> macros and reworked the core to use these rather than assigning to macros in LVALUE context. Dave Mitchell improved the lexer debugging output under C<-DT>. Nicholas Clark changed the string buffer allocation so that it is now rounded up to the next multiple of 4 (or 8 on platforms with 64 bit pointers). This should reduce the number of calls to C<realloc> without actually using any extra memory. The C<HV>'s array of C<HE*>s is now allocated at the correct (minimal) size, thanks to another change by Nicholas Clark. Compile with C<-DPERL_USE_LARGE_HV_ALLOC> to use the old, sloppier, default. For XS or embedding debugging purposes, if perl is compiled with C<-DDEBUG_LEAKING_SCALARS_FORK_DUMP> in addition to C<-DDEBUG_LEAKING_SCALARS> then a child process is C<fork>ed just before global destruction, which is used to display the values of any scalars found to have leaked at the end of global destruction. Without this, the scalars have already been freed sufficiently at the point of detection that it is impossible to produce any meaningful dump of their contents. This feature was implemented by the indefatigable Nicholas Clark, based on an idea by Mike Giroux.
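The effect of the string buffer rounding described above can be illustrated with a little arithmetic. This is a sketch in Perl, not the C code from the perl source: C<round_up> is a hypothetical helper, and the assumption that a length which is already a multiple is left unchanged is ours.

```perl
use strict;
use warnings;

# Round $len up to a multiple of $align, which must be a power of two.
# Lengths that are already a multiple are left unchanged (an assumption
# made for this sketch).
sub round_up {
    my ( $len, $align ) = @_;
    return ( $len + $align - 1 ) & ~( $align - 1 );
}

# Grow a string one byte at a time and count how many distinct buffer
# sizes would be requested, with and without rounding.
my ( %exact, %rounded );
for my $len ( 1 .. 32 ) {
    $exact{$len}++;
    $rounded{ round_up( $len, 8 ) }++;
}

printf "distinct sizes without rounding: %d\n",   scalar keys %exact;
printf "distinct sizes with rounding to 8: %d\n", scalar keys %rounded;
```

Fewer distinct sizes means fewer occasions on which a growing string needs a larger buffer, which is the sense in which rounding can save C<realloc> calls.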
=head1 Platform Specific Problems The optimiser on HP-UX 11.23 (Itanium 2) is currently partly disabled (scaled down to +O1) when using HP C-ANSI-C; the cause of problems at higher optimisation levels is still unclear. There are a handful of remaining test failures on VMS, mostly due to test fixes and minor module tweaks with too many dependencies to integrate into this release from the development stream, where they have all been corrected. The following is a list of expected failures with the patch number of the fix where that is known: ext/Devel/PPPort/t/ppphtest.t #26913 ext/List/Util/t/p_tainted.t #26912 lib/ExtUtils/t/PL_FILES.t #26813 lib/ExtUtils/t/basic.t #26813 t/io/fs.t t/op/cmp.t =head1 Reporting Bugs If you find what you think is a bug, you might check the articles recently posted to the comp.lang.perl.misc newsgroup and the perl bug database at http://bugs.perl.org. There may also be information at http://www.perl.org, the Perl Home Page. If you believe you have an unreported bug, please run the B<perlbug> program included with your release. Be sure to trim your bug down to a tiny but sufficient test case. Your bug report, along with the output of C<perl -V>, will be sent off to perlbug@perl.org to be analysed by the Perl porting team. You can browse and search the Perl 5 bugs at http://bugs.perl.org/ =head1 SEE ALSO The F<Changes> file for exhaustive details on what changed. The F<INSTALL> file for how to build Perl. The F<README> file for general stuff. The F<Artistic> and F<Copying> files for copyright information. =cut
=encoding utf8 =head1 NAME perl5122delta - what is new for perl v5.12.2 =head1 DESCRIPTION This document describes differences between the 5.12.1 release and the 5.12.2 release.
If you are upgrading from an earlier major version, such as 5.10.1, first read L<perl5120delta>, which describes differences between 5.10.1 and 5.12.0, as well as L<perl5121delta>, which describes earlier changes in the 5.12 stable release series. =head1 Incompatible Changes There are no changes intentionally incompatible with 5.12.1. If any exist, they are bugs and reports are welcome. =head1 Core Enhancements Other than the bug fixes listed below, there should be no user-visible changes to the core language in this release. =head1 Modules and Pragmata =head2 New Modules and Pragmata This release does not introduce any new modules or pragmata. =head2 Pragmata Changes In the previous release, C<no I<VERSION>;> statements triggered a bug which could cause L<feature> bundles to be loaded and L<strict> mode to be enabled unintentionally. =head2 Updated Modules =over 4 =item C<Carp> Upgraded from version 1.16 to 1.17. L<Carp> now detects incomplete L<caller()|perlfunc/"caller EXPR"> overrides and avoids using bogus C<@DB::args>. To provide backtraces, Carp relies on particular behaviour of the caller built-in. Carp now detects if other code has overridden this with an incomplete implementation, and modifies its backtrace accordingly. Previously, incomplete overrides would cause incorrect values in backtraces (best case), or obscure fatal errors (worst case). This fixes certain cases of C<Bizarre copy of ARRAY> caused by modules overriding C<caller()> incorrectly. =item C<CPANPLUS> A patch to F<cpanp-run-perl> has been backported from CPANPLUS C<0.9004>. This resolves L<RT #55964|http://rt.cpan.org/Public/Bug/Display.html?id=55964> and L<RT #57106|http://rt.cpan.org/Public/Bug/Display.html?id=57106>, both of which related to failures to install distributions that use C<Module::Install::DSL>. =item C<File::Glob> A regression in which failing to find C<CORE::GLOBAL::glob> after loading C<File::Glob> caused a crash has been fixed.
Now, it correctly falls back to external globbing via C<pp_glob>. =item C<File::Copy> C<File::Copy::copy(FILE, DIR)> is now documented. =item C<File::Spec> Upgraded from version 3.31 to 3.31_01. Several portability fixes were made in C<File::Spec::VMS>: a colon is now recognized as a delimiter in native filespecs; caret-escaped delimiters are recognized for better handling of extended filespecs; C<catpath()> returns an empty directory rather than the current directory if the input directory name is empty; C<abs2rel()> properly handles Unix-style input. =back =head1 Utility Changes =over =item * F<perlbug> now always gives the reporter a chance to change the email address it guesses for them. =item * F<perlbug> should no longer warn about uninitialized values when using the C<-d> and C<-v> options. =back =head1 Changes to Existing Documentation =over =item * The existing policy on backward-compatibility and deprecation has been added to L<perlpolicy>, along with definitions of terms like I<deprecation>. =item * L<perlfunc/srand>'s usage has been clarified. =item * The entry for L<perlfunc/die> was reorganized to emphasize its role in the exception mechanism. =item * Perl's L<INSTALL> file has been clarified to explicitly state that Perl requires a C89 compliant ANSI C Compiler. =item * L<IO::Socket>'s C<getsockopt()> and C<setsockopt()> have been documented. =item * F<alarm()>'s inability to interrupt blocking IO on Windows has been documented. =item * L<Math::TrulyRandom> hasn't been updated since 1996 and has been removed as a recommended solution for random number generation. =item * L<perlrun> has been updated to clarify the behaviour of octal flags to F<perl>. =item * To ease user confusion, C<$#> and C<$*>, two special variables that were removed in earlier versions of Perl have been documented. 
=item * The version of L<perlfaq> shipped with the Perl core has been updated from the official FAQ version, which is now maintained in the C<briandfoy/perlfaq> branch of the Perl repository at L<git://perl5.git.perl.org/perl.git>. =back =head1 Installation and Configuration Improvements =head2 Configuration improvements =over =item * The C<d_u32align> configuration probe on ARM has been fixed. =back =head2 Compilation improvements =over =item * An "C<incompatible operand types>" error in ternary expressions when building with C<clang> has been fixed. =item * Perl now skips setuid C<File::Copy> tests on partitions it detects to be mounted as C<nosuid>. =back =head1 Selected Bug Fixes =over 4 =item * A possible segfault in the C<T_PTROBJ> default typemap has been fixed. =item * A possible memory leak when using L<caller()|perlfunc/"caller EXPR"> to set C<@DB::args> has been fixed. =item * Several memory leaks when loading XS modules have been fixed. =item * C<unpack()> now handles scalar context correctly for C<%32H> and C<%32u>, fixing a potential crash. Previously, C<split()> would crash because the third item on the stack wasn't the regular expression it expected. C<unpack("%2H", ...)> would return both the unpacked result and the checksum on the stack, as would C<unpack("%2u", ...)>. L<[perl #73814]|http://rt.perl.org/rt3/Ticket/Display.html?id=73814> =item * Perl now avoids using memory after calling C<free()> in F<pp_require> when there are CODEREFs in C<@INC>. =item * A bug that could cause "C<Unknown error>" messages when "C<call_sv(code, G_EVAL)>" is called from an XS destructor has been fixed. =item * The implementation of the C<open $fh, 'E<gt>', \$buffer> feature now supports get/set magic and thus tied buffers correctly. =item * The C<pp_getc>, C<pp_tell>, and C<pp_eof> opcodes now make room on the stack for their return values in cases where no argument was passed in. 
=item * When matching Unicode strings, inappropriate backtracking would under some conditions result in a C<Malformed UTF-8 character (fatal)> error. This should no longer occur. See L<[perl #75680]|http://rt.perl.org/rt3/Public/Bug/Display.html?id=75680>. =back =head1 Platform Specific Notes =head2 AIX =over =item * F<README.aix> has been updated with information about the XL C/C++ V11 compiler suite. =back =head2 Windows =over =item * When building Perl with the mingw64 x64 cross-compiler, the C<incpath>, C<libpth>, C<ldflags>, C<lddlflags> and C<ldflags_nolargefiles> values in F<Config.pm> and F<Config_heavy.pl> were not previously being set correctly because, with that compiler, the include and lib directories are not immediately below C<$(CCHOME)>. =back =head2 VMS =over =item * F<git_version.h> is now installed on VMS. This was an oversight in v5.12.0 which caused some extensions to fail to build. =item * Several memory leaks in L<stat()|perlfunc/"stat FILEHANDLE"> have been fixed. =item * A memory leak in C<Perl_rename()> due to a double allocation has been fixed. =item * A memory leak in C<vms_fid_to_name()> (used by C<realpath()> and C<realname()>) has been fixed. =back =head1 Acknowledgements Perl 5.12.2 represents approximately three months of development since Perl 5.12.1 and contains approximately 2,000 lines of changes across 100 files from 36 authors. Perl continues to flourish into its third decade thanks to a vibrant community of users and developers. The following people are known to have contributed the improvements that became Perl 5.12.2: Abigail, Ævar Arnfjörð Bjarmason, Ben Morrow, brian d foy, Brian Phillips, Chas. Owens, Chris 'BinGOs' Williams, Chris Williams, Craig A. 
Berry, Curtis Jewell, Dan Dascalescu, David Golden, David Mitchell, Father Chrysostomos, Florian Ragwitz, George Greer, H.Merijn Brand, Jan Dubois, Jesse Vincent, Jim Cromie, Karl Williamson, Lars Dɪᴇᴄᴋᴏᴡ 迪拉斯, Leon Brocard, Maik Hentsche, Matt S Trout, Nicholas Clark, Rafael Garcia-Suarez, Rainer Tammer, Ricardo Signes, Salvador Ortiz Garcia, Sisyphus, Slaven Rezic, Steffen Mueller, Tony Cook, Vincent Pit and Yves Orton. =head1 Reporting Bugs If you find what you think is a bug, you might check the articles recently posted to the comp.lang.perl.misc newsgroup and the perl bug database at http://rt.perl.org/perlbug/ . There may also be information at http://www.perl.org/ , the Perl Home Page. If you believe you have an unreported bug, please run the B<perlbug> program included with your release. Be sure to trim your bug down to a tiny but sufficient test case. Your bug report, along with the output of C<perl -V>, will be sent off to perlbug@perl.org to be analysed by the Perl porting team. If the bug you are reporting has security implications, which make it inappropriate to send to a publicly archived mailing list, then please send it to perl5-security-report@perl.org. This points to a closed subscription unarchived mailing list, which includes all the core committers, who will be able to help assess the impact of issues, figure out a resolution, and help co-ordinate the release of patches to mitigate or fix the problem across all platforms on which Perl is supported. Please only use this address for security issues in the Perl core, not for modules independently distributed on CPAN. =head1 SEE ALSO The F<Changes> file for an explanation of how to view exhaustive details on what changed. The F<INSTALL> file for how to build Perl. The F<README> file for general stuff. The F<Artistic> and F<Copying> files for copyright information. =cut If you read this file _as_is_, just ignore the funny characters you see. 
It is written in the POD format (see pod/perlpod.pod) which is specially designed to be readable as is. =head1 NAME perlandroid - Perl under Android =head1 SYNOPSIS The first portions of this document contain instructions to cross-compile Perl for Android 2.0 and later, using the binaries provided by Google. The latter portions describe how to build perl natively using one of the toolchains available on the Play Store. =head1 DESCRIPTION This document describes how to set up your host environment when attempting to build Perl for Android. =head1 Cross-compilation These instructions assume a Unixish build environment on your host system; they've been tested on Linux and OS X, and may work on Cygwin and MSYS. While Google also provides an NDK for Windows, these steps won't work natively there, although it may be possible to cross-compile through different means. If your host system's architecture is 32 bits, remember to change the C<x86_64>'s below to C<x86>'s. In a similar vein, the examples below use the 4.8 toolchain; if you want to use something older or newer (for example, the 4.4.3 toolchain included in the 8th revision of the NDK), just change those to the relevant version. =head2 Get the Android Native Development Kit (NDK) You can download the NDK from L<https://developer.android.com/tools/sdk/ndk/index.html>. You'll want the normal, non-legacy version. =head2 Determine the architecture you'll be cross-compiling for There are three possible options: arm-linux-androideabi for ARM, mipsel-linux-android for MIPS, and simply x86 for x86. As of 2014, most Android devices run on ARM, so that is generally a safe bet. With those two in hand, you should add $ANDROID_NDK/toolchains/$TARGETARCH-4.8/prebuilt/`uname | tr '[A-Z]' '[a-z]'`-x86_64/bin to your C<PATH>, where C<$ANDROID_NDK> is the location where you unpacked the NDK, and C<$TARGETARCH> is your target's architecture. 
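As a rough sketch of the two steps above, the following shell fragment picks a target and puts the matching toolchain directory on C<PATH>. The NDK location under C<$HOME> and the default to ARM are assumptions made for illustration only, and the C<HOSTOS> and C<TOOLCHAIN_BIN> variable names are helpers that are not used elsewhere in this document.

```shell
# Assumed values: adjust both to match your own setup.
ANDROID_NDK="${ANDROID_NDK:-$HOME/android-ndk}"    # hypothetical unpack location
TARGETARCH="${TARGETARCH:-arm-linux-androideabi}"  # ARM; or mipsel-linux-android, or x86

# Lower-cased host OS name, e.g. "linux" or "darwin".
HOSTOS=$(uname | tr '[A-Z]' '[a-z]')

# Directory holding the prebuilt 4.8 cross-toolchain binaries (helper name).
TOOLCHAIN_BIN="$ANDROID_NDK/toolchains/$TARGETARCH-4.8/prebuilt/$HOSTOS-x86_64/bin"

export ANDROID_NDK TARGETARCH
export PATH="$TOOLCHAIN_BIN:$PATH"
```

Prepending rather than appending is a choice made here so that the NDK tools are found first; on a 32-bit host, replace C<x86_64> with C<x86> as described earlier.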
=head2 Set up a standalone toolchain This creates a working sysroot that we can feed to Configure later. $ export ANDROID_TOOLCHAIN=/tmp/my-toolchain-$TARGETARCH $ export SYSROOT=$ANDROID_TOOLCHAIN/sysroot $ $ANDROID_NDK/build/tools/make-standalone-toolchain.sh \ --platform=android-9 \ --install-dir=$ANDROID_TOOLCHAIN \ --system=`uname | tr '[A-Z]' '[a-z]'`-x86_64 \ --toolchain=$TARGETARCH-4.8 =head2 adb or ssh? adb is the Android Debug Bridge. For our purposes, it's basically a way of establishing an ssh connection to an Android device without having to install anything on the device itself, as long as the device is either on the same local network as the host, or it is connected to the host through USB. Perl can be cross-compiled using either adb or a normal ssh connection; in general, if you can connect your device to the host using a USB port, or if you don't feel like installing an sshd app on your device, you may want to use adb, although you may be forced to switch to ssh if your device is not rooted and you're unlucky -- more on that later. Alternatively, if you're cross-compiling to an emulator, you'll have to use adb. =head3 adb To use adb, download the Android SDK from L<https://developer.android.com/sdk/index.html>. The "SDK Tools Only" version should suffice -- if you downloaded the ADT Bundle, you can find the sdk under F<$ADT_BUNDLE/sdk/>. Add F<$ANDROID_SDK/platform-tools> to your C<PATH>, which should give you access to adb. You'll now have to find your device's name using C<adb devices>, and later pass that to Configure through C<-Dtargethost=$DEVICE>. However, before calling Configure, you need to check if using adb is a viable choice in the first place. Because Android doesn't have a F</tmp>, nor does it allow executables in the sdcard, we need to find somewhere in the device for Configure to put some files in, as well as for the tests to run in. If your device is rooted, then you're good. 
Try running these: $ export TARGETDIR=/mnt/asec/perl $ adb -s $DEVICE shell "echo sh -c '\"mkdir $TARGETDIR\"' | su --" These commands will create the directory we need, after which you can move on to the next step. F</mnt/asec> is mounted as a tmpfs in Android, but it's only accessible to root. If your device is not rooted, you may still be in luck. Try running this: $ export TARGETDIR=/data/local/tmp/perl $ adb -s $DEVICE shell "mkdir $TARGETDIR" If the command works, you can move to the next step, but beware: B<You'll have to remove the directory from the device once you are done! Unlike F</mnt/asec>, F</data/local/tmp> may not get automatically garbage collected once you shut off the phone>. If neither of those works, then you can't use adb to cross-compile to your device. Either try rooting it, or go for the ssh route. =head3 ssh To use ssh, you'll need to install and run an sshd app and set it up properly. There are several paid and free apps that do this rather easily, so you should be able to spot one on the store. Remember that Perl requires a passwordless connection, so set up a public key. Note that several apps spew crap to stderr every time you connect, which can throw off Configure. You may need to monkeypatch the part of Configure that creates C<run-ssh> to have it discard stderr. Since you're using ssh, you'll have to pass some extra arguments to Configure: -Dtargetrun=ssh -Dtargethost=$TARGETHOST -Dtargetuser=$TARGETUSER -Dtargetport=$TARGETPORT =head2 Configure and beyond With all of the previous done, you're now ready to call Configure. If using adb, a "basic" Configure line will look like this: $ ./Configure -des -Dusedevel -Dusecrosscompile -Dtargetrun=adb \ -Dcc=$TARGETARCH-gcc \ -Dsysroot=$SYSROOT \ -Dtargetdir=$TARGETDIR \ -Dtargethost=$DEVICE If using ssh, it's not too different -- we just change targetrun to ssh, and pass in targetuser and targetport. 
It ends up looking like this: $ ./Configure -des -Dusedevel -Dusecrosscompile -Dtargetrun=ssh \ -Dcc=$TARGETARCH-gcc \ -Dsysroot=$SYSROOT \ -Dtargetdir=$TARGETDIR \ -Dtargethost="$TARGETHOST" \ -Dtargetuser=$TARGETUSER \ -Dtargetport=$TARGETPORT Now you're ready to run C<make> and C<make test>! As a final word of warning, if you're using adb, C<make test> may appear to hang; this is because it doesn't output anything until it finishes running all tests. You can check its progress by logging into the device, moving to F<$TARGETDIR>, and looking at the file F<output.stdout>. =head3 Notes =over =item * If you are targeting x86 Android, you will have to change C<$TARGETARCH-gcc> to C<i686-linux-android-gcc>. =item * On some older low-end devices -- think early 2.2 era -- some tests, particularly F<t/re/uniprops.t>, may crash the phone, causing it to turn itself off once, and then back on again. =back =head1 Native Builds While Google doesn't provide a native toolchain for Android, you can still get one from the Play Store. =head2 CCTools You may be able to get the CCTools app, which is free. Keep in mind that you want a full toolchain; some apps tend to default to installing only a barebones version without some important utilities, like ar or nm. Once you have the toolchain set up properly, the only remaining hurdle is actually locating where on the device it was installed. For example, CCTools installs its toolchain in F</data/data/com.pdaxrom.cctools/root/cctools>. With the path in hand, compiling perl is little more than: export SYSROOT=<location of the native toolchain> export LD_LIBRARY_PATH="$SYSROOT/lib:`pwd`:`pwd`/lib:`pwd`/lib/auto:$LD_LIBRARY_PATH" sh Configure -des -Dsysroot=$SYSROOT -Alibpth="/system/lib /vendor/lib" =head2 Termux L<Termux|https://termux.com/> provides an Android terminal emulator and Linux environment. It comes with a cross-compiled perl already installed. 
Natively compiling perl 5.30 or later should be as straightforward as: sh Configure -des -Alibpth="/system/lib /vendor/lib" This certainly works on Android 8.1 (Oreo) at least... =head1 AUTHOR Brian Fraser <fraserbn@gmail.com> =cut =head1 NAME perldebguts - Guts of Perl debugging =head1 DESCRIPTION This is not L<perldebug>, which tells you how to use the debugger. This manpage describes low-level details concerning the debugger's internals, which range from difficult to impossible to understand for anyone who isn't incredibly intimate with Perl's guts. Caveat lector. =head1 Debugger Internals Perl has special debugging hooks at compile-time and run-time used to create debugging environments. These hooks are not to be confused with the I<perl -Dxxx> command described in L<perlrun|perlrun/-Dletters>, which is usable only if a special Perl is built per the instructions in the F<INSTALL> podpage in the Perl source tree. For example, whenever you call Perl's built-in C<caller> function from the package C<DB>, the arguments that the corresponding stack frame was called with are copied to the C<@DB::args> array. These mechanisms are enabled by calling Perl with the B<-d> switch. Specifically, the following additional features are enabled (cf. L<perlvar/$^P>): =over 4 =item * Perl inserts the contents of C<$ENV{PERL5DB}> (or C<BEGIN {require 'perl5db.pl'}> if not present) before the first line of your program. =item * Each array C<@{"_<$filename"}> holds the lines of $filename for a file compiled by Perl. The same is also true for C<eval>ed strings that contain subroutines, or which are currently being executed. The $filename for C<eval>ed strings looks like C<(eval 34)>. Values in this array are magical in numeric context: they compare equal to zero only if the line is not breakable. =item * Each hash C<%{"_<$filename"}> contains breakpoints and actions keyed by line number. 
Individual entries (as opposed to the whole hash) are settable. Perl only cares about Boolean true here, although the values used by F<perl5db.pl> have the form C<"$break_condition\0$action">. The same holds for evaluated strings that contain subroutines, or which are currently being executed. The $filename for C<eval>ed strings looks like C<(eval 34)>. =item * Each scalar C<${"_<$filename"}> contains C<"_<$filename">. This is also the case for evaluated strings that contain subroutines, or which are currently being executed. The $filename for C<eval>ed strings looks like C<(eval 34)>. =item * After each C<require>d file is compiled, but before it is executed, C<DB::postponed(*{"_<$filename"})> is called if the subroutine C<DB::postponed> exists. Here, the $filename is the expanded name of the C<require>d file, as found in the values of %INC. =item * After each subroutine C<subname> is compiled, the existence of C<$DB::postponed{subname}> is checked. If this key exists, C<DB::postponed(subname)> is called if the C<DB::postponed> subroutine also exists. =item * A hash C<%DB::sub> is maintained, whose keys are subroutine names and whose values have the form C<filename:startline-endline>. C<filename> has the form C<(eval 34)> for subroutines defined inside C<eval>s. =item * When the execution of your program reaches a point that can hold a breakpoint, the C<DB::DB()> subroutine is called if any of the variables C<$DB::trace>, C<$DB::single>, or C<$DB::signal> is true. These variables are not C<local>izable. This feature is disabled when executing inside C<DB::DB()>, including functions called from it unless C<< $^D & (1<<30) >> is true. =item * When execution of the program reaches a subroutine call, a call to C<&DB::sub>(I<args>) is made instead, with C<$DB::sub> set to identify the called subroutine. (This doesn't happen if the calling subroutine was compiled in the C<DB> package.) 
C<$DB::sub> normally holds the name of the called subroutine, if it has a name by which it can be looked up. Failing that, C<$DB::sub> will hold a reference to the called subroutine. Either way, the C<&DB::sub> subroutine can use C<$DB::sub> as a reference by which to call the called subroutine, which it will normally want to do. X<&DB::lsub> If the call is to an lvalue subroutine and C<&DB::lsub> is defined, C<&DB::lsub>(I<args>) is called instead; otherwise Perl falls back to C<&DB::sub>(I<args>). =item * When execution of the program uses C<goto> to enter a non-XS subroutine and the 0x80 bit is set in C<$^P>, a call to C<&DB::goto> is made, with C<$DB::sub> set to identify the subroutine being entered. The call to C<&DB::goto> does not replace the C<goto>; the requested subroutine will still be entered once C<&DB::goto> has returned. C<$DB::sub> normally holds the name of the subroutine being entered, if it has one. Failing that, C<$DB::sub> will hold a reference to the subroutine being entered. Unlike when C<&DB::sub> is called, it is not guaranteed that C<$DB::sub> can be used as a reference to operate on the subroutine being entered. =back Note that if C<&DB::sub> needs external data for it to work, no subroutine call is possible without it. As an example, the standard debugger's C<&DB::sub> depends on the C<$DB::deep> variable (it defines how many levels of recursion deep into the debugger you can go before a mandatory break). If C<$DB::deep> is not defined, subroutine calls are not possible, even though C<&DB::sub> exists. =head2 Writing Your Own Debugger =head3 Environment Variables The C<PERL5DB> environment variable can be used to define a debugger. 
For example, the minimal "working" debugger (it actually doesn't do anything) consists of one line: sub DB::DB {} It can easily be defined like this: $ PERL5DB="sub DB::DB {}" perl -d your-script Another brief debugger, slightly more useful, can be created with only the line: sub DB::DB {print ++$i; scalar <STDIN>} This debugger prints a number which increments for each statement encountered and waits for you to hit a newline before continuing to the next statement. The following debugger is actually useful: { package DB; sub DB {} sub sub {print ++$i, " $sub\n"; &$sub} } It prints the sequence number of each subroutine call and the name of the called subroutine. Note that C<&DB::sub> is being compiled into the package C<DB> through the use of the C<package> directive. When it starts, the debugger reads your rc file (F<./.perldb> or F<~/.perldb> under Unix), which can set important options. (A subroutine (C<&afterinit>) can be defined here as well; it is executed after the debugger completes its own initialization.) After the rc file is read, the debugger reads the PERLDB_OPTS environment variable and uses it to set debugger options. The contents of this variable are treated as if they were the argument of an C<o ...> debugger command (q.v. in L<perldebug/"Configurable Options">). =head3 Debugger Internal Variables In addition to the file and subroutine-related variables mentioned above, the debugger also maintains various magical internal variables. =over 4 =item * C<@DB::dbline> is an alias for C<@{"::_<current_file"}>, which holds the lines of the currently-selected file (compiled by Perl), either explicitly chosen with the debugger's C<f> command, or implicitly by flow of execution. Values in this array are magical in numeric context: they compare equal to zero only if the line is not breakable. 
=item * C<%DB::dbline> is an alias for C<%{"::_<current_file"}>, which contains breakpoints and actions keyed by line number in the currently-selected file, either explicitly chosen with the debugger's C<f> command, or implicitly by flow of execution. As previously noted, individual entries (as opposed to the whole hash) are settable. Perl only cares about Boolean true here, although the values used by F<perl5db.pl> have the form C<"$break_condition\0$action">. =back =head3 Debugger Customization Functions Some functions are provided to simplify customization. =over 4 =item * See L<perldebug/"Configurable Options"> for a description of options parsed by C<DB::parse_options(string)>. =item * C<DB::dump_trace(skip[,count])> skips the specified number of frames and returns a list containing information about the calling frames (all of them, if C<count> is missing). Each entry is a reference to a hash with keys C<context> (either C<.>, C<$>, or C<@>), C<sub> (subroutine name, or info about C<eval>), C<args> (C<undef> or a reference to an array), C<file>, and C<line>. =item * C<DB::print_trace(FH, skip[, count[, short]])> prints formatted info about caller frames. The last two functions may be convenient as arguments to the C<< < >> and C<< << >> commands. =back Note that any variables and functions that are not documented in this manpage (or in L<perldebug>) are considered for internal use only, and as such are subject to change without notice. =head1 Frame Listing Output Examples The C<frame> option can be used to control the output of frame information. For example, contrast this expression trace: $ perl -de 42 Stack dump during die enabled outside of evals. Loading DB routines from perl5db.pl patch level 0.94 Emacs support available. Enter h or 'h h' for help. 
main::(-e:1): 0 DB<1> sub foo { 14 } DB<2> sub bar { 3 } DB<3> t print foo() * bar() main::((eval 172):3): print foo() + bar(); main::foo((eval 168):2): main::bar((eval 170):2): 42 with this one, once the C<o>ption C<frame=2> has been set: DB<4> o f=2 frame = '2' DB<5> t print foo() * bar() 3: foo() * bar() entering main::foo 2: sub foo { 14 }; exited main::foo entering main::bar 2: sub bar { 3 }; exited main::bar 42 By way of demonstration, we present below a laborious listing resulting from setting your C<PERLDB_OPTS> environment variable to the value C<f=n N>, and running I<perl -d -V> from the command line. Examples using various values of C<n> are shown to give you a feel for the difference between settings. Long though it may be, this is not a complete listing, but only excerpts. =over 4 =item 1 entering main::BEGIN entering Config::BEGIN Package lib/Exporter.pm. Package lib/Carp.pm. Package lib/Config.pm. entering Config::TIEHASH entering Exporter::import entering Exporter::export entering Config::myconfig entering Config::FETCH entering Config::FETCH entering Config::FETCH entering Config::FETCH =item 2 entering main::BEGIN entering Config::BEGIN Package lib/Exporter.pm. Package lib/Carp.pm. exited Config::BEGIN Package lib/Config.pm. entering Config::TIEHASH exited Config::TIEHASH entering Exporter::import entering Exporter::export exited Exporter::export exited Exporter::import exited main::BEGIN entering Config::myconfig entering Config::FETCH exited Config::FETCH entering Config::FETCH exited Config::FETCH entering Config::FETCH =item 3 in $=main::BEGIN() from /dev/null:0 in $=Config::BEGIN() from lib/Config.pm:2 Package lib/Exporter.pm. Package lib/Carp.pm. Package lib/Config.pm. 
in $=Config::TIEHASH('Config') from lib/Config.pm:644 in $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0 in $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from li in @=Config::myconfig() from /dev/null:0 in $=Config::FETCH(ref(Config), 'package') from lib/Config.pm:574 in $=Config::FETCH(ref(Config), 'baserev') from lib/Config.pm:574 in $=Config::FETCH(ref(Config), 'PERL_VERSION') from lib/Config.pm:574 in $=Config::FETCH(ref(Config), 'PERL_SUBVERSION') from lib/Config.pm:574 in $=Config::FETCH(ref(Config), 'osname') from lib/Config.pm:574 in $=Config::FETCH(ref(Config), 'osvers') from lib/Config.pm:574 =item 4 in $=main::BEGIN() from /dev/null:0 in $=Config::BEGIN() from lib/Config.pm:2 Package lib/Exporter.pm. Package lib/Carp.pm. out $=Config::BEGIN() from lib/Config.pm:0 Package lib/Config.pm. in $=Config::TIEHASH('Config') from lib/Config.pm:644 out $=Config::TIEHASH('Config') from lib/Config.pm:644 in $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0 in $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/ out $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/ out $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0 out $=main::BEGIN() from /dev/null:0 in @=Config::myconfig() from /dev/null:0 in $=Config::FETCH(ref(Config), 'package') from lib/Config.pm:574 out $=Config::FETCH(ref(Config), 'package') from lib/Config.pm:574 in $=Config::FETCH(ref(Config), 'baserev') from lib/Config.pm:574 out $=Config::FETCH(ref(Config), 'baserev') from lib/Config.pm:574 in $=Config::FETCH(ref(Config), 'PERL_VERSION') from lib/Config.pm:574 out $=Config::FETCH(ref(Config), 'PERL_VERSION') from lib/Config.pm:574 in $=Config::FETCH(ref(Config), 'PERL_SUBVERSION') from lib/Config.pm:574 =item 5 in $=main::BEGIN() from /dev/null:0 in $=Config::BEGIN() from lib/Config.pm:2 Package lib/Exporter.pm. Package lib/Carp.pm. 
out $=Config::BEGIN() from lib/Config.pm:0 Package lib/Config.pm. in $=Config::TIEHASH('Config') from lib/Config.pm:644 out $=Config::TIEHASH('Config') from lib/Config.pm:644 in $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0 in $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/E out $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/E out $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0 out $=main::BEGIN() from /dev/null:0 in @=Config::myconfig() from /dev/null:0 in $=Config::FETCH('Config=HASH(0x1aa444)', 'package') from lib/Config.pm:574 out $=Config::FETCH('Config=HASH(0x1aa444)', 'package') from lib/Config.pm:574 in $=Config::FETCH('Config=HASH(0x1aa444)', 'baserev') from lib/Config.pm:574 out $=Config::FETCH('Config=HASH(0x1aa444)', 'baserev') from lib/Config.pm:574 =item 6 in $=CODE(0x15eca4)() from /dev/null:0 in $=CODE(0x182528)() from lib/Config.pm:2 Package lib/Exporter.pm. out $=CODE(0x182528)() from lib/Config.pm:0 scalar context return from CODE(0x182528): undef Package lib/Config.pm. in $=Config::TIEHASH('Config') from lib/Config.pm:628 out $=Config::TIEHASH('Config') from lib/Config.pm:628 scalar context return from Config::TIEHASH: empty hash in $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0 in $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/Exporter.pm:171 out $=Exporter::export('Config', 'main', 'myconfig', 'config_vars') from lib/Exporter.pm:171 scalar context return from Exporter::export: '' out $=Exporter::import('Config', 'myconfig', 'config_vars') from /dev/null:0 scalar context return from Exporter::import: '' =back In all cases shown above, the line indentation shows the call tree. If bit 2 of C<frame> is set, a line is printed on exit from a subroutine as well. If bit 4 is set, the arguments are printed along with the caller info. 
If bit 8 is set, the arguments are printed even if they are tied or references. If bit 16 is set, the return value is printed, too. When a package is compiled, a line like this Package lib/Carp.pm. is printed with proper indentation. =head1 Debugging Regular Expressions There are two ways to enable debugging output for regular expressions. If your perl is compiled with C<-DDEBUGGING>, you may use the B<-Dr> flag on the command line, and C<-Drv> for more verbose information. Otherwise, one can C<use re 'debug'>, which has effects at both compile time and run time. Since Perl 5.9.5, this pragma is lexically scoped. =head2 Compile-time Output The debugging output at compile time looks like this: Compiling REx '[bc]d(ef*g)+h[ij]k$' size 45 Got 364 bytes for offset annotations. first at 1 rarest char g at 0 rarest char d at 0 1: ANYOF[bc](12) 12: EXACT <d>(14) 14: CURLYX[0] {1,32767}(28) 16: OPEN1(18) 18: EXACT <e>(20) 20: STAR(23) 21: EXACT <f>(0) 23: EXACT <g>(25) 25: CLOSE1(27) 27: WHILEM[1/1](0) 28: NOTHING(29) 29: EXACT <h>(31) 31: ANYOF[ij](42) 42: EXACT <k>(44) 44: EOL(45) 45: END(0) anchored 'de' at 1 floating 'gh' at 3..2147483647 (checking floating) stclass 'ANYOF[bc]' minlen 7 Offsets: [45] 1[4] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 5[1] 0[0] 12[1] 0[0] 6[1] 0[0] 7[1] 0[0] 9[1] 8[1] 0[0] 10[1] 0[0] 11[1] 0[0] 12[0] 12[0] 13[1] 0[0] 14[4] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 18[1] 0[0] 19[1] 20[0] Omitting $` $& $' support. The first line shows the pre-compiled form of the regex. The second shows the size of the compiled form (in arbitrary units, usually 4-byte words) and the total number of bytes allocated for the offset/length table, usually 4+C<size>*8. The next line shows the label I<id> of the first node that does a match. The anchored 'de' at 1 floating 'gh' at 3..2147483647 (checking floating) stclass 'ANYOF[bc]' minlen 7 line (split into two lines above) contains optimizer information. 
In the example shown, the optimizer found that the match should contain a substring C<de> at offset 1, plus substring C<gh> at some offset between 3 and infinity. Moreover, when checking for these substrings (to abandon impossible matches quickly), Perl will check for the substring C<gh> before checking for the substring C<de>. The optimizer may also use the knowledge that the match starts (at the C<first> I<id>) with a character class, and no string shorter than 7 characters can possibly match. The fields of interest which may appear in this line are =over 4 =item C<anchored> I<STRING> C<at> I<POS> =item C<floating> I<STRING> C<at> I<POS1..POS2> See above. =item C<matching floating/anchored> Which substring to check first. =item C<minlen> The minimal length of the match. =item C<stclass> I<TYPE> Type of first matching node. =item C<noscan> Don't scan for the found substrings. =item C<isall> Means that the optimizer information is all that the regular expression contains, and thus one does not need to enter the regex engine at all. =item C<GPOS> Set if the pattern contains C<\G>. =item C<plus> Set if the pattern starts with a repeated char (as in C<x+y>). =item C<implicit> Set if the pattern starts with C<.*>. =item C<with eval> Set if the pattern contains eval-groups, such as C<(?{ code })> and C<(??{ code })>. =item C<anchored(TYPE)> If the pattern may match only at a handful of places, with C<TYPE> being C<SBOL>, C<MBOL>, or C<GPOS>. See the table below. =back If a substring is known to match at end-of-line only, it may be followed by C<$>, as in C<floating 'k'$>. The optimizer-specific information is used to avoid entering the (slow) regex engine on strings that definitely will not match. If the C<isall> flag is set, a call to the regex engine may be avoided even when the optimizer found an appropriate place for the match. Above the optimizer section is the list of I<nodes> of the compiled form of the regex. 
Each line has the format C< >I<id>: I<TYPE> I<OPTIONAL-INFO> (I<next-id>) =head2 Types of Nodes Here are the current possible types, with short descriptions: =for comment This table is generated by regen/regcomp.pl. Any changes made here will be lost. =for regcomp.pl begin # TYPE arg-description [regnode-struct-suffix] [longjump-len] DESCRIPTION # Exit points END no End of program. SUCCEED no Return from a subroutine, basically. # Line Start Anchors: SBOL no Match "" at beginning of line: /^/, /\A/ MBOL no Same, assuming multiline: /^/m # Line End Anchors: SEOL no Match "" at end of line: /$/ MEOL no Same, assuming multiline: /$/m EOS no Match "" at end of string: /\z/ # Match Start Anchors: GPOS no Matches where last m//g left off. # Word Boundary Opcodes: BOUND no Like BOUNDA for non-utf8, otherwise like BOUNDU BOUNDL no Like BOUND/BOUNDU, but \w and \W are defined by current locale BOUNDU no Match "" at any boundary of a given type using /u rules. BOUNDA no Match "" at any boundary between \w\W or \W\w, where \w is [_a-zA-Z0-9] NBOUND no Like NBOUNDA for non-utf8, otherwise like NBOUNDU NBOUNDL no Like NBOUND/NBOUNDU, but \w and \W are defined by current locale NBOUNDU no Match "" at any non-boundary of a given type using /u rules. NBOUNDA no Match "" between any \w\w or \W\W, where \w is [_a-zA-Z0-9] # [Special] alternatives: REG_ANY no Match any one character (except newline). SANY no Match any one character. 
ANYOF sv Match character in (or not in) this class, charclass single char match only ANYOFD sv Like ANYOF, but /d is in effect charclass ANYOFL sv Like ANYOF, but /l is in effect charclass ANYOFPOSIXL sv Like ANYOFL, but matches [[:posix:]] charclass_ classes posixl ANYOFH sv 1 Like ANYOF, but only has "High" matches, none in the bitmap; the flags field contains the lowest matchable UTF-8 start byte ANYOFHb sv 1 Like ANYOFH, but all matches share the same UTF-8 start byte, given in the flags field ANYOFHr sv 1 Like ANYOFH, but the flags field contains packed bounds for all matchable UTF-8 start bytes. ANYOFHs sv 1 Like ANYOFHb, but has a string field that gives the leading matchable UTF-8 bytes; flags field is len ANYOFR packed 1 Matches any character in the range given by its packed args: upper 12 bits is the max delta from the base lower 20; the flags field contains the lowest matchable UTF-8 start byte ANYOFRb packed 1 Like ANYOFR, but all matches share the same UTF-8 start byte, given in the flags field ANYOFM byte 1 Like ANYOF, but matches an invariant byte as determined by the mask and arg NANYOFM byte 1 complement of ANYOFM # POSIX Character Classes: POSIXD none Some [[:class:]] under /d; the FLAGS field gives which one POSIXL none Some [[:class:]] under /l; the FLAGS field gives which one POSIXU none Some [[:class:]] under /u; the FLAGS field gives which one POSIXA none Some [[:class:]] under /a; the FLAGS field gives which one NPOSIXD none complement of POSIXD, [[:^class:]] NPOSIXL none complement of POSIXL, [[:^class:]] NPOSIXU none complement of POSIXU, [[:^class:]] NPOSIXA none complement of POSIXA, [[:^class:]] CLUMP no Match any extended grapheme cluster sequence # Alternation # BRANCH The set of branches constituting a single choice are # hooked together with their "next" pointers, since # precedence prevents anything being concatenated to # any individual branch. 
The "next" pointer of the last # BRANCH in a choice points to the thing following the # whole choice. This is also where the final "next" # pointer of each individual branch points; each branch # starts with the operand node of a BRANCH node. # BRANCH node Match this alternative, or the next... # Literals EXACT str Match this string (flags field is the length). # In a long string node, the U32 argument is the length, and is # immediately followed by the string. LEXACT len:str 1 Match this long string (preceded by length; flags unused). EXACTL str Like EXACT, but /l is in effect (used so locale-related warnings can be checked for) EXACTF str Like EXACT, but match using /id rules; (string not UTF-8, ASCII folded; non-ASCII not) EXACTFL str Like EXACT, but match using /il rules; (string not likely to be folded) EXACTFU str Like EXACT, but match using /iu rules; (string folded) EXACTFAA str Like EXACT, but match using /iaa rules; (string folded except in non-UTF8 patterns: MICRO, SHARP S; folded length <= unfolded) EXACTFUP str Like EXACT, but match using /iu rules; (string not UTF-8, folded except MICRO, SHARP S: hence Problematic) EXACTFLU8 str Like EXACTFU, but use /il, UTF-8, (string is folded, and everything in it is above 255 EXACTFAA_NO_TRIE str Like EXACT, but match using /iaa rules (string not UTF-8, not guaranteed to be folded, not currently trie-able) EXACT_REQ8 str Like EXACT, but only UTF-8 encoded targets can match LEXACT_REQ8 len:str 1 Like LEXACT, but only UTF-8 encoded targets can match EXACTFU_REQ8 str Like EXACTFU, but only UTF-8 encoded targets can match EXACTFU_S_EDGE str /di rules, but nothing in it precludes /ui, except begins and/or ends with [Ss]; (string not UTF-8; compile-time only) # Do nothing types NOTHING no Match empty string. # A variant of above which delimits a group, thus stops optimizations TAIL no Match empty string. Can jump here from outside. 
# Loops # STAR,PLUS '?', and complex '*' and '+', are implemented as # circular BRANCH structures. Simple cases # (one character per match) are implemented with STAR # and PLUS for speed and to minimize recursive plunges. # STAR node Match this (simple) thing 0 or more times. PLUS node Match this (simple) thing 1 or more times. CURLY sv 2 Match this simple thing {n,m} times. CURLYN no 2 Capture next-after-this simple thing CURLYM no 2 Capture this medium-complex thing {n,m} times. CURLYX sv 2 Match this complex thing {n,m} times. # This terminator creates a loop structure for CURLYX WHILEM no Do curly processing and see if rest matches. # Buffer related # OPEN,CLOSE,GROUPP ...are numbered at compile time. OPEN num 1 Mark this point in input as start of #n. CLOSE num 1 Close corresponding OPEN of #n. SROPEN none Same as OPEN, but for script run SRCLOSE none Close preceding SROPEN REF num 1 Match some already matched string REFF num 1 Match already matched string, using /di rules. REFFL num 1 Match already matched string, using /li rules. REFFU num 1 Match already matched string, usng /ui. REFFA num 1 Match already matched string, using /aai rules. # Named references. Code in regcomp.c assumes that these all are after # the numbered references REFN no-sv 1 Match some already matched string REFFN no-sv 1 Match already matched string, using /di rules. REFFLN no-sv 1 Match already matched string, using /li rules. REFFUN num 1 Match already matched string, using /ui rules. REFFAN num 1 Match already matched string, using /aai rules. # Support for long RE LONGJMP off 1 1 Jump far away. BRANCHJ off 1 1 BRANCH with long offset. 
# Special Case Regops IFMATCH off 1 1 Succeeds if the following matches; non-zero flags "f", next_off "o" means lookbehind assertion starting "f..(f-o)" characters before current UNLESSM off 1 1 Fails if the following matches; non-zero flags "f", next_off "o" means lookbehind assertion starting "f..(f-o)" characters before current SUSPEND off 1 1 "Independent" sub-RE. IFTHEN off 1 1 Switch, should be preceded by switcher. GROUPP num 1 Whether the group matched. # The heavy worker EVAL evl/flags Execute some Perl code. 2L # Modifiers MINMOD no Next operator is not greedy. LOGICAL no Next opcode should set the flag only. # This is not used yet RENUM off 1 1 Group with independently numbered parens. # Trie Related # Behave the same as A|LIST|OF|WORDS would. The '..C' variants # have inline charclass data (ascii only), the 'C' store it in the # structure. TRIE trie 1 Match many EXACT(F[ALU]?)? at once. flags==type TRIEC trie Same as TRIE, but with embedded charclass charclass data AHOCORASICK trie 1 Aho Corasick stclass. flags==type AHOCORASICKC trie Same as AHOCORASICK, but with embedded charclass charclass data # Regex Subroutines GOSUB num/ofs 2L recurse to paren arg1 at (signed) ofs arg2 # Special conditionals GROUPPN no-sv 1 Whether the group matched. INSUBP num 1 Whether we are in a specific recurse. DEFINEP none 1 Never execute directly. # Backtracking Verbs ENDLIKE none Used only for the type field of verbs OPFAIL no-sv 1 Same as (?!), but with verb arg ACCEPT no-sv/num Accepts the current matched string, with 2L verbar # Verbs With Arguments VERB no-sv 1 Used only for the type field of verbs PRUNE no-sv 1 Pattern fails at this startpoint if no- backtracking through this MARKPOINT no-sv 1 Push the current location for rollback by cut. 
SKIP no-sv 1 On failure skip forward (to the mark) before retrying COMMIT no-sv 1 Pattern fails outright if backtracking through this CUTGROUP no-sv 1 On failure go to the next alternation in the group # Control what to keep in $&. KEEPS no $& begins here. # New charclass like patterns LNBREAK none generic newline pattern # SPECIAL REGOPS # This is not really a node, but an optimized away piece of a "long" # node. To simplify debugging output, we mark it as if it were a node OPTIMIZED off Placeholder for dump. # Special opcode with the property that no opcode in a compiled program # will ever be of this type. Thus it can be used as a flag value that # no other opcode has been seen. END is used similarly, in that an END # node cant be optimized. So END implies "unoptimizable" and PSEUDO # mean "not seen anything to optimize yet". PSEUDO off Pseudo opcode for internal use. REGEX_SET depth p Regex set, temporary node used in pre- optimization compilation =for regcomp.pl end =for unprinted-credits Next section M-J. Dominus (mjd-perl-patch+@plover.com) 20010421 Following the optimizer information is a dump of the offset/length table, here split across several lines: Offsets: [45] 1[4] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 5[1] 0[0] 12[1] 0[0] 6[1] 0[0] 7[1] 0[0] 9[1] 8[1] 0[0] 10[1] 0[0] 11[1] 0[0] 12[0] 12[0] 13[1] 0[0] 14[4] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 0[0] 18[1] 0[0] 19[1] 20[0] The first line here indicates that the offset/length table contains 45 entries. Each entry is a pair of integers, denoted by C<offset[length]>. Entries are numbered starting with 1, so entry #1 here is C<1[4]> and entry #12 is C<5[1]>. C<1[4]> indicates that the node labeled C<1:> (the C<1: ANYOF[bc]>) begins at character position 1 in the pre-compiled form of the regex, and has a length of 4 characters. 
C<5[1]> in position 12 indicates that the node labeled C<12:> (the C<< 12: EXACT <d> >>) begins at character position 5 in the pre-compiled form of the regex, and has a length of 1 character. C<12[1]> in position 14 indicates that the node labeled C<14:> (the C<< 14: CURLYX[0] {1,32767} >>) begins at character position 12 in the pre-compiled form of the regex, and has a length of 1 character---that is, it corresponds to the C<+> symbol in the precompiled regex. C<0[0]> items indicate that there is no corresponding node. =head2 Run-time Output First of all, when doing a match, one may get no run-time output even if debugging is enabled. This means that the regex engine was never entered and that all of the job was therefore done by the optimizer. If the regex engine was entered, the output may look like this: Matching '[bc]d(ef*g)+h[ij]k$' against 'abcdefg__gh__' Setting an EVAL scope, savestack=3 2 <ab> <cdefg__gh_> | 1: ANYOF 3 <abc> <defg__gh_> | 11: EXACT <d> 4 <abcd> <efg__gh_> | 13: CURLYX {1,32767} 4 <abcd> <efg__gh_> | 26: WHILEM 0 out of 1..32767 cc=effff31c 4 <abcd> <efg__gh_> | 15: OPEN1 4 <abcd> <efg__gh_> | 17: EXACT <e> 5 <abcde> <fg__gh_> | 19: STAR EXACT <f> can match 1 times out of 32767... Setting an EVAL scope, savestack=3 6 <bcdef> <g__gh__> | 22: EXACT <g> 7 <bcdefg> <__gh__> | 24: CLOSE1 7 <bcdefg> <__gh__> | 26: WHILEM 1 out of 1..32767 cc=effff31c Setting an EVAL scope, savestack=12 7 <bcdefg> <__gh__> | 15: OPEN1 7 <bcdefg> <__gh__> | 17: EXACT <e> restoring \1 to 4(4)..7 failed, try continuation... 7 <bcdefg> <__gh__> | 27: NOTHING 7 <bcdefg> <__gh__> | 28: EXACT <h> failed... failed... The most significant information in the output is about the particular I<node> of the compiled regex that is currently being tested against the target string. The format of these lines is C< >I<STRING-OFFSET> <I<PRE-STRING>> <I<POST-STRING>> |I<ID>: I<TYPE> The I<TYPE> info is indented with respect to the backtracking level. 
Other incidental information appears interspersed within.

=head1 Debugging Perl Memory Usage

Perl is a profligate wastrel when it comes to memory use. There is a saying that to estimate memory usage of Perl, assume a reasonable algorithm for memory allocation, multiply that estimate by 10, and while you still may miss the mark, at least you won't be quite so astonished. This is not absolutely true, but may provide a good grasp of what happens.

Assume that an integer cannot take less than 20 bytes of memory, a float cannot take less than 24 bytes, a string cannot take less than 32 bytes (all these examples assume 32-bit architectures; the results are quite a bit worse on 64-bit architectures). If a variable is accessed in two of three different ways (which require an integer, a float, or a string), the memory footprint may increase yet another 20 bytes. A sloppy malloc(3) implementation can inflate these numbers dramatically.

On the opposite end of the scale, a declaration like

  sub foo;

may take up to 500 bytes of memory, depending on which release of Perl you're running.

Anecdotal estimates of source-to-compiled code bloat suggest an eightfold increase. This means that the compiled form of reasonable (normally commented, properly indented etc.) code will take about eight times more space in memory than the code took on disk.

The B<-DL> command-line switch has been obsolete since circa Perl 5.6.0 (it was available only if Perl was built with C<-DDEBUGGING>). The switch was used to track Perl's memory allocations and possible memory leaks. These days the use of malloc debugging tools like F<Purify> or F<valgrind> is suggested instead. See also L<perlhacktips/PERL_MEM_LOG>.

One way to find out how much memory is being used by Perl data structures is to install the Devel::Size module from CPAN: it gives you the minimum number of bytes required to store a particular data structure. Please be mindful of the difference between the size() and total_size().
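The difference between size() and total_size() can be seen with a small sketch like the one below. It assumes Devel::Size has been installed from CPAN and falls back to a message if it has not; the helper name C<measure> and the sample data are made up for illustration, and the byte counts it prints will vary by perl build and platform.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (size, total_size) in bytes for a
# reference, or an empty list when Devel::Size (a CPAN module, not
# part of core perl) is not installed.
sub measure {
    my ($ref) = @_;
    return unless eval { require Devel::Size; 1 };
    return (Devel::Size::size($ref), Devel::Size::total_size($ref));
}

# Sample data: a hash holding a nested array and a string.
my %data = (list => [ 1 .. 100 ], name => 'x' x 50);

if (my ($size, $total) = measure(\%data)) {
    # size() does not follow references; total_size() does.
    print "size():       $size bytes\n";
    print "total_size(): $total bytes\n";
}
else {
    print "Devel::Size is not installed\n";
}
```

For a nested structure such as this one, total_size() would be expected to report the larger figure, since it also counts the array reached through the C<list> key.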
If Perl has been compiled using Perl's malloc you can analyze Perl memory usage by setting $ENV{PERL_DEBUG_MSTATS}. =head2 Using C<$ENV{PERL_DEBUG_MSTATS}> If your perl is using Perl's malloc() and was compiled with the necessary switches (this is the default), then it will print memory usage statistics after compiling your code when C<< $ENV{PERL_DEBUG_MSTATS} > 1 >>, and before termination of the program when C<< $ENV{PERL_DEBUG_MSTATS} >= 1 >>. The report format is similar to the following example: $ PERL_DEBUG_MSTATS=2 perl -e "require Carp" Memory allocation statistics after compilation: (buckets 4(4)..8188(8192) 14216 free: 130 117 28 7 9 0 2 2 1 0 0 437 61 36 0 5 60924 used: 125 137 161 55 7 8 6 16 2 0 1 74 109 304 84 20 Total sbrk(): 77824/21:119. Odd ends: pad+heads+chain+tail: 0+636+0+2048. Memory allocation statistics after execution: (buckets 4(4)..8188(8192) 30888 free: 245 78 85 13 6 2 1 3 2 0 1 315 162 39 42 11 175816 used: 265 176 1112 111 26 22 11 27 2 1 1 196 178 1066 798 39 Total sbrk(): 215040/47:145. Odd ends: pad+heads+chain+tail: 0+2192+0+6144. It is possible to ask for such a statistic at arbitrary points in your execution using the mstat() function out of the standard Devel::Peek module. Here is some explanation of that format: =over 4 =item C<buckets SMALLEST(APPROX)..GREATEST(APPROX)> Perl's malloc() uses bucketed allocations. Every request is rounded up to the closest bucket size available, and a bucket is taken from the pool of buckets of that size. The line above describes the limits of buckets currently in use. Each bucket has two sizes: memory footprint and the maximal size of user data that can fit into this bucket. Suppose in the above example that the smallest bucket were size 4. The biggest bucket would have usable size 8188, and the memory footprint would be 8192. In a Perl built for debugging, some buckets may have negative usable size. This means that these buckets cannot (and will not) be used. 
For larger buckets, the memory footprint may be one page greater than a power of 2. If so, the corresponding power of two is printed in the C<APPROX> field above. =item Free/Used The 1 or 2 rows of numbers following that correspond to the number of buckets of each size between C<SMALLEST> and C<GREATEST>. In the first row, the sizes (memory footprints) of buckets are powers of two--or possibly one page greater. In the second row, if present, the memory footprints of the buckets are between the memory footprints of two buckets "above". For example, suppose under the previous example, the memory footprints were free: 8 16 32 64 128 256 512 1024 2048 4096 8192 4 12 24 48 80 With a non-C<DEBUGGING> perl, the buckets starting from C<128> have a 4-byte overhead, and thus an 8192-long bucket may take up to 8188-byte allocations. =item C<Total sbrk(): SBRKed/SBRKs:CONTINUOUS> The first two fields give the total amount of memory perl sbrk(2)ed (ess-broken? :-) and number of sbrk(2)s used. The third number is what perl thinks about continuity of returned chunks. So long as this number is positive, malloc() will assume that it is probable that sbrk(2) will provide continuous memory. Memory allocated by external libraries is not counted. =item C<pad: 0> The amount of sbrk(2)ed memory needed to keep buckets aligned. =item C<heads: 2192> Although memory overhead of bigger buckets is kept inside the bucket, for smaller buckets, it is kept in separate areas. This field gives the total size of these areas. =item C<chain: 0> malloc() may want to subdivide a bigger bucket into smaller buckets. If only a part of the deceased bucket is left unsubdivided, the rest is kept as an element of a linked list. This field gives the total size of these chunks. =item C<tail: 6144> To minimize the number of sbrk(2)s, malloc() asks for more memory. This field gives the size of the yet unused part, which is sbrk(2)ed, but never touched. 
=back

=head1 SEE ALSO

L<perldebug>, L<perlguts>, L<perlrun>, L<re>, and L<Devel::DProf>.

=head1 NAME

perlvms - VMS-specific documentation for Perl

=head1 DESCRIPTION

Gathered below are notes describing details of Perl 5's behavior on VMS. They are a supplement to the regular Perl 5 documentation, so we have focussed on the ways in which Perl 5 functions differently under VMS than it does under Unix, and on the interactions between Perl and the rest of the operating system. We haven't tried to duplicate complete descriptions of Perl features from the main Perl documentation, which can be found in the F<[.pod]> subdirectory of the Perl distribution.

We hope these notes will save you from confusion and lost sleep when writing Perl scripts on VMS. If you find we've missed something you think should appear here, please don't hesitate to drop a line to vmsperl@perl.org.

=head1 Installation

Directions for building and installing Perl 5 can be found in the file F<README.vms> in the main source directory of the Perl distribution.

=head1 Organization of Perl Images

=head2 Core Images

During the build process, three Perl images are produced. F<Miniperl.Exe> is an executable image which contains all of the basic functionality of Perl, but cannot take advantage of Perl XS extensions and has a hard-wired list of library locations for loading pure-Perl modules. It is used extensively to build and test Perl and various extensions, but is not installed.

Most of the complete Perl resides in the shareable image F<PerlShr.Exe>, which provides a core to which the Perl executable image and all Perl extensions are linked. It is generally located via the logical name F<PERLSHR>. While it's possible to put the image in F<SYS$SHARE> to make it loadable, that's not recommended.
And while you may wish to INSTALL the image for performance reasons, you should not install it with privileges; if you do, the result will not be what you expect as image privileges are disabled during Perl start-up. Finally, F<Perl.Exe> is an executable image containing the main entry point for Perl, as well as some initialization code. It should be placed in a public directory, and made world executable. In order to run Perl with command line arguments, you should define a foreign command to invoke this image. =head2 Perl Extensions Perl extensions are packages which provide both XS and Perl code to add new functionality to perl. (XS is a meta-language which simplifies writing C code which interacts with Perl, see L<perlxs> for more details.) The Perl code for an extension is treated like any other library module - it's made available in your script through the appropriate C<use> or C<require> statement, and usually defines a Perl package containing the extension. The portion of the extension provided by the XS code may be connected to the rest of Perl in either of two ways. In the B<static> configuration, the object code for the extension is linked directly into F<PerlShr.Exe>, and is initialized whenever Perl is invoked. In the B<dynamic> configuration, the extension's machine code is placed into a separate shareable image, which is mapped by Perl's DynaLoader when the extension is C<use>d or C<require>d in your script. This allows you to maintain the extension as a separate entity, at the cost of keeping track of the additional shareable image. Most extensions can be set up as either static or dynamic. The source code for an extension usually resides in its own directory. 
At least three files are generally provided: I<Extshortname>F<.xs> (where I<Extshortname> is the portion of the extension's name following the last C<::>), containing the XS code, I<Extshortname>F<.pm>, the Perl library module for the extension, and F<Makefile.PL>, a Perl script which uses the C<MakeMaker> library modules supplied with Perl to generate a F<Descrip.MMS> file for the extension. =head2 Installing static extensions Since static extensions are incorporated directly into F<PerlShr.Exe>, you'll have to rebuild Perl to incorporate a new extension. You should edit the main F<Descrip.MMS> or F<Makefile> you use to build Perl, adding the extension's name to the C<ext> macro, and the extension's object file to the C<extobj> macro. You'll also need to build the extension's object file, either by adding dependencies to the main F<Descrip.MMS>, or using a separate F<Descrip.MMS> for the extension. Then, rebuild F<PerlShr.Exe> to incorporate the new code. Finally, you'll need to copy the extension's Perl library module to the F<[.>I<Extname>F<]> subdirectory under one of the directories in C<@INC>, where I<Extname> is the name of the extension, with all C<::> replaced by C<.> (e.g. the library module for extension Foo::Bar would be copied to a F<[.Foo.Bar]> subdirectory). =head2 Installing dynamic extensions In general, the distributed kit for a Perl extension includes a file named Makefile.PL, which is a Perl program which is used to create a F<Descrip.MMS> file which can be used to build and install the files required by the extension. The kit should be unpacked into a directory tree B<not> under the main Perl source directory, and the procedure for building the extension is simply $ perl Makefile.PL ! Create Descrip.MMS $ mmk ! Build necessary files $ mmk test ! Run test code, if supplied $ mmk install ! Install into public Perl tree VMS support for this process in the current release of Perl is sufficient to handle most extensions. 
(See the MakeMaker documentation for more details on installation options for extensions.) =over 4 =item * the F<[.Lib.Auto.>I<Arch>I<$PVers>I<Extname>F<]> subdirectory of one of the directories in C<@INC> (where I<PVers> is the version of Perl you're using, as supplied in C<$]>, with '.' converted to '_'), or =item * one of the directories in C<@INC>, or =item * a directory which the extensions Perl library module passes to the DynaLoader when asking it to map the shareable image, or =item * F<Sys$Share> or F<Sys$Library>. =back If the shareable image isn't in any of these places, you'll need to define a logical name I<Extshortname>, where I<Extshortname> is the portion of the extension's name after the last C<::>, which translates to the full file specification of the shareable image. =head1 File specifications =head2 Syntax We have tried to make Perl aware of both VMS-style and Unix-style file specifications wherever possible. You may use either style, or both, on the command line and in scripts, but you may not combine the two styles within a single file specification. VMS Perl interprets Unix pathnames in much the same way as the CRTL (I<e.g.> the first component of an absolute path is read as the device name for the VMS file specification). There are a set of functions provided in the C<VMS::Filespec> package for explicit interconversion between VMS and Unix syntax; its documentation provides more details. We've tried to minimize the dependence of Perl library modules on Unix syntax, but you may find that some of these, as well as some scripts written for Unix systems, will require that you use Unix syntax, since they will assume that '/' is the directory separator, I<etc.> If you find instances of this in the Perl distribution itself, please let us know, so we can try to work around them. 
Also when working on Perl programs on VMS, if you need a syntax in a specific operating system format, then you need either to check the appropriate DECC$ feature logical, or call a conversion routine to force it to that format. The feature logical name DECC$FILENAME_UNIX_REPORT modifies traditional Perl behavior in the conversion of file specifications from Unix to VMS format in order to follow the extended character handling rules now expected by the CRTL. Specifically, when this feature is in effect, the C<./.../> in a Unix path is now translated to C<[.^.^.^.]> instead of the traditional VMS C<[...]>. To be compatible with what MakeMaker expects, if a VMS path cannot be translated to a Unix path, it is passed through unchanged, so C<unixify("[...]")> will return C<[...]>. There are several ambiguous cases where a conversion routine cannot determine whether an input filename is in Unix format or in VMS format, since now both VMS and Unix file specifications may have characters in them that could be mistaken for syntax delimiters of the other type. So some pathnames simply cannot be used in a mode that allows either type of pathname to be present. Perl will tend to assume that an ambiguous filename is in Unix format. Allowing "." as a version delimiter is simply incompatible with determining whether a pathname is in VMS format or in Unix format with extended file syntax. There is no way to know whether "perl-5.8.6" is a Unix "perl-5.8.6" or a VMS "perl-5.8;6" when passing it to unixify() or vmsify(). The DECC$FILENAME_UNIX_REPORT logical name controls how Perl interprets filenames to the extent that Perl uses the CRTL internally for many purposes, and attempts to follow CRTL conventions for reporting filenames. The DECC$FILENAME_UNIX_ONLY feature differs in that it expects all filenames passed to the C run-time to be already in Unix format. 
This feature is not yet supported in Perl since Perl uses traditional OpenVMS file specifications internally and in the test harness, and it is not yet clear whether this mode will be useful or useable. The feature logical name DECC$POSIX_COMPLIANT_PATHNAMES is new with the RMS Symbolic Link SDK and included with OpenVMS v8.3, but is not yet supported in Perl.

=head2 Filename Case

Perl enables DECC$EFS_CASE_PRESERVE and DECC$ARGV_PARSE_STYLE by default. Note that the latter only takes effect when extended parse is set in the process in which Perl is running. When these features are explicitly disabled in the environment or the CRTL does not support them, Perl follows the traditional CRTL behavior of downcasing command-line arguments and returning file specifications in lower case only.

I<N. B.> It is very easy to get tripped up using a mixture of other programs, external utilities, and Perl scripts that are in varying states of being able to handle case preservation. For example, a file created by an older version of an archive utility or a build utility such as MMK or MMS may generate a filename in all upper case even on an ODS-5 volume. If this filename is later retrieved by a Perl script or module in a case preserving environment, that upper case name may not match the mixed-case or lower-case expectations of the Perl code. Your best bet is to follow an all-or-nothing approach to case preservation: either don't use it at all, or make sure your entire toolchain and application environment support and use it.

OpenVMS Alpha v7.3-1 and later and all versions of OpenVMS I64 support case sensitivity as a process setting (see C<SET PROCESS /CASE_LOOKUP=SENSITIVE>). Perl does not currently support case sensitivity on VMS, but it may in the future, so Perl programs should use the C<< File::Spec->case_tolerant >> method to determine the state, and not the C<$^O> variable.
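As a rough illustration of that advice, the sketch below asks C<File::Spec> whether the current platform treats file names as case tolerant rather than testing C<$^O>. The helper name C<same_file_name> is hypothetical and not part of any module; it only shows one way the result of C<case_tolerant> might be used.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper (not part of File::Spec): compare two file names
# according to the case rules that File::Spec reports for this platform.
sub same_file_name {
    my ($first, $second) = @_;
    return File::Spec->case_tolerant
        ? lc($first) eq lc($second)
        : $first eq $second;
}

printf "case tolerant: %s\n",
    File::Spec->case_tolerant ? "yes" : "no";
printf "names match:   %s\n",
    same_file_name('PERL.C', 'perl.c') ? "yes" : "no";
```

Because the check goes through C<File::Spec>, the same code should keep working if case-sensitivity support on VMS changes in a later release.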
=head2 Symbolic Links When built on an ODS-5 volume with symbolic links enabled, Perl by default supports symbolic links when the requisite support is available in the filesystem and CRTL (generally 64-bit OpenVMS v8.3 and later). There are a number of limitations and caveats to be aware of when working with symbolic links on VMS. Most notably, the target of a valid symbolic link must be expressed as a Unix-style path and it must exist on a volume visible from your POSIX root (see the C<SHOW ROOT> command in DCL help). For further details on symbolic link capabilities and requirements, see chapter 12 of the CRTL manual that ships with OpenVMS v8.3 or later. =head2 Wildcard expansion File specifications containing wildcards are allowed both on the command line and within Perl globs (e.g. C<E<lt>*.cE<gt>>). If the wildcard filespec uses VMS syntax, the resultant filespecs will follow VMS syntax; if a Unix-style filespec is passed in, Unix-style filespecs will be returned. Similar to the behavior of wildcard globbing for a Unix shell, one can escape command line wildcards with double quotation marks C<"> around a perl program command line argument. However, owing to the stripping of C<"> characters carried out by the C handling of argv you will need to escape a construct such as this one (in a directory containing the files F<PERL.C>, F<PERL.EXE>, F<PERL.H>, and F<PERL.OBJ>): $ perl -e "print join(' ',@ARGV)" perl.* perl.c perl.exe perl.h perl.obj in the following triple quoted manner: $ perl -e "print join(' ',@ARGV)" """perl.*""" perl.* In both the case of unquoted command line arguments or in calls to C<glob()> VMS wildcard expansion is performed. (csh-style wildcard expansion is available if you use C<File::Glob::glob>.) If the wildcard filespec contains a device or directory specification, then the resultant filespecs will also contain a device and directory; otherwise, device and directory information are removed. 
VMS-style resultant filespecs will contain a full device and directory, while Unix-style resultant filespecs will contain only as much of a directory path as was present in the input filespec. For example, if your default directory is Perl_Root:[000000], the expansion of C<[.t]*.*> will yield filespecs like "perl_root:[t]base.dir", while the expansion of C<t/*/*> will yield filespecs like "t/base.dir". (This is done to match the behavior of glob expansion performed by Unix shells.) Similarly, the resultant filespec will contain the file version only if one was present in the input filespec. =head2 Pipes Input and output pipes to Perl filehandles are supported; the "file name" is passed to lib$spawn() for asynchronous execution. You should be careful to close any pipes you have opened in a Perl script, lest you leave any "orphaned" subprocesses around when Perl exits. You may also use backticks to invoke a DCL subprocess, whose output is used as the return value of the expression. The string between the backticks is handled as if it were the argument to the C<system> operator (see below). In this case, Perl will wait for the subprocess to complete before continuing. The mailbox (MBX) that perl can create to communicate with a pipe defaults to a buffer size of 8192 on 64-bit systems, 512 on VAX. The default buffer size is adjustable via the logical name PERL_MBX_SIZE provided that the value falls between 128 and the SYSGEN parameter MAXBUF inclusive. For example, to set the mailbox size to 32767 use C<$ENV{'PERL_MBX_SIZE'} = 32767;> and then open and use pipe constructs. An alternative would be to issue the command: $ Define PERL_MBX_SIZE 32767 before running your wide record pipe program. A larger value may improve performance at the expense of the BYTLM UAF quota. =head1 PERL5LIB and PERLLIB The PERL5LIB and PERLLIB environment elements work as documented in L<perl>, except that the element separator is, by default, '|' instead of ':'. 
However, when running under a Unix shell as determined by the logical name C<GNV$UNIX_SHELL>, the separator will be ':' as on Unix systems. The directory specifications may use either VMS or Unix syntax.

=head1 The Perl Forked Debugger

The Perl forked debugger places the debugger commands and output in a separate X-11 terminal window so that commands and output from multiple processes are not mixed together.

Perl on VMS supports an emulation of the forked debugger when Perl is run on a VMS system that has X11 support installed.

To use the forked debugger, you need to have the default display set to an X-11 Server and some environment variables set that Unix expects.

The forked debugger requires the environment variable C<TERM> to be C<xterm>, and the environment variable C<DISPLAY> to exist. C<xterm> must be in lower case.

  $define TERM "xterm"

  $define DISPLAY "hostname:0.0"

Currently the value of C<DISPLAY> is ignored. It is recommended that it be set to be the hostname of the display, the server and screen in Unix notation. In the future the value of DISPLAY may be honored by Perl instead of using the default display.

It may be helpful to always use the forked debugger so that script I/O is separated from debugger I/O. You can force the debugger to be forked by assigning a value to the logical name C<PERLDB_PIDS> that is not a process identification number.

  $define PERLDB_PIDS XXXX

=head1 PERL_VMS_EXCEPTION_DEBUG

Defining the logical name PERL_VMS_EXCEPTION_DEBUG as "ENABLE" causes the VMS debugger to be invoked if a fatal exception that is not otherwise handled is raised. The purpose of this is to allow debugging of internal Perl problems that would cause such a condition.

This allows the programmer to look at the execution stack and variables to find out the cause of the exception. As the debugger is being invoked as the Perl interpreter is about to do a fatal exit, continuing the execution in debug mode is usually not practical.
Starting Perl in the VMS debugger may change the program execution profile in a way that such problems are not reproduced. The C<kill> function can be used to test this functionality from within a program. In typical VMS style, only the first letter of the value of this logical name is actually checked in a case insensitive mode, and it is considered enabled if it is the value "T","1" or "E". This logical name must be defined before Perl is started. =head1 Command line =head2 I/O redirection and backgrounding Perl for VMS supports redirection of input and output on the command line, using a subset of Bourne shell syntax: =over 4 =item * C<E<lt>file> reads stdin from C<file>, =item * C<E<gt>file> writes stdout to C<file>, =item * C<E<gt>E<gt>file> appends stdout to C<file>, =item * C<2E<gt>file> writes stderr to C<file>, =item * C<2E<gt>E<gt>file> appends stderr to C<file>, and =item * C<< 2>&1 >> redirects stderr to stdout. =back In addition, output may be piped to a subprocess, using the character '|'. Anything after this character on the command line is passed to a subprocess for execution; the subprocess takes the output of Perl as its input. Finally, if the command line ends with '&', the entire command is run in the background as an asynchronous subprocess. =head2 Command line switches The following command line switches behave differently under VMS than described in L<perlrun>. Note also that in order to pass uppercase switches to Perl, you need to enclose them in double-quotes on the command line, since the CRTL downcases all unquoted strings. On newer 64 bit versions of OpenVMS, a process setting now controls if the quoting is needed to preserve the case of command line arguments. =over 4 =item -i If the C<-i> switch is present but no extension for a backup copy is given, then inplace editing creates a new version of a file; the existing copy is not deleted. 
(Note that if an extension is given, an existing file is renamed to the backup file, as is the case under other operating systems, so it does not remain as a previous version under the original filename.)

=item -S

If the C<"-S"> or C<-"S"> switch is present I<and> the script name does not contain a directory, then Perl translates the logical name DCL$PATH as a searchlist, using each translation as a directory in which to look for the script. In addition, if no file type is specified, Perl looks in each directory for a file matching the name specified, with a blank type, a type of F<.pl>, and a type of F<.com>, in that order.

=item -u

The C<-u> switch causes the VMS debugger to be invoked after the Perl program is compiled, but before it has run. It does not create a core dump file.

=back

=head1 Perl functions

As of the time this document was last revised, the following Perl functions were implemented in the VMS port of Perl (functions marked with * are discussed in more detail below):

    file tests*, abs, alarm, atan2, backticks*, binmode*, bless,
    caller, chdir, chmod, chown, chomp, chop, chr,
    close, closedir, cos, crypt*, defined, delete, die, do, dump*,
    each, endgrent, endpwent, eof, eval, exec*, exists, exit, exp,
    fileno, flock, getc, getgrent*, getgrgid*, getgrnam, getlogin,
    getppid, getpwent*, getpwnam*, getpwuid*, glob, gmtime*, goto,
    grep, hex, ioctl, import, index, int, join, keys, kill*,
    last, lc, lcfirst, lchown*, length, link*, local, localtime, log,
    lstat, m//, map, mkdir, my, next, no, oct, open, opendir, ord,
    pack, pipe, pop, pos, print, printf, push, q//, qq//, qw//,
    qx//*, quotemeta, rand, read, readdir, readlink*, redo, ref,
    rename, require, reset, return, reverse, rewinddir, rindex,
    rmdir, s///, scalar, seek, seekdir, select(internal),
    select (system call)*, setgrent, setpwent, shift, sin, sleep,
    socketpair, sort, splice, split, sprintf, sqrt, srand, stat,
    study, substr, symlink*, sysread, system*, syswrite, tell,
    telldir, tie, time, times*, tr///, uc,
ucfirst, umask, undef, unlink*, unpack, untie, unshift, use, utime*, values, vec, wait, waitpid*, wantarray, warn, write, y/// The following functions were not implemented in the VMS port, and calling them produces a fatal error (usually) or undefined behavior (rarely, we hope): chroot, dbmclose, dbmopen, fork*, getpgrp, getpriority, msgctl, msgget, msgsend, msgrcv, semctl, semget, semop, setpgrp, setpriority, shmctl, shmget, shmread, shmwrite, syscall The following functions are available on Perls compiled with Dec C 5.2 or greater and running VMS 7.0 or greater: truncate The following functions are available on Perls built on VMS 7.2 or greater: fcntl (without locking) The following functions may or may not be implemented, depending on what type of socket support you've built into your copy of Perl: accept, bind, connect, getpeername, gethostbyname, getnetbyname, getprotobyname, getservbyname, gethostbyaddr, getnetbyaddr, getprotobynumber, getservbyport, gethostent, getnetent, getprotoent, getservent, sethostent, setnetent, setprotoent, setservent, endhostent, endnetent, endprotoent, endservent, getsockname, getsockopt, listen, recv, select(system call)*, send, setsockopt, shutdown, socket The following function is available on Perls built on 64 bit OpenVMS v8.2 with hard links enabled on an ODS-5 formatted build disk. CRTL support is in principle available as of OpenVMS v7.3-1, and better configuration support could detect this. link The following functions are available on Perls built on 64 bit OpenVMS v8.2 and later. CRTL support is in principle available as of OpenVMS v7.3-2, and better configuration support could detect this. getgrgid, getgrnam, getpwnam, getpwuid, setgrent, ttyname The following functions are available on Perls built on 64 bit OpenVMS v8.2 and later. statvfs, socketpair =over 4 =item File tests The tests C<-b>, C<-B>, C<-c>, C<-C>, C<-d>, C<-e>, C<-f>, C<-o>, C<-M>, C<-s>, C<-S>, C<-t>, C<-T>, and C<-z> work as advertised. 
The return values for C<-r>, C<-w>, and C<-x> tell you whether you can actually access the file; this may not reflect the UIC-based file protections. Since real and effective UIC don't differ under VMS, C<-O>, C<-R>, C<-W>, and C<-X> are equivalent to C<-o>, C<-r>, C<-w>, and C<-x>. Similarly, several other tests, including C<-A>, C<-g>, C<-k>, C<-l>, C<-p>, and C<-u>, aren't particularly meaningful under VMS, and the values returned by these tests reflect whatever your CRTL C<stat()> routine does to the equivalent bits in the st_mode field. Finally, C<-d> returns true if passed a device specification without an explicit directory (e.g. C<DUA1:>), as well as if passed a directory. There are DECC feature logical names AND ODS-5 volume attributes that also control what values are returned for the date fields. Note: Some sites have reported problems when using the file-access tests (C<-r>, C<-w>, and C<-x>) on files accessed via DEC's DFS. Specifically, since DFS does not currently provide access to the extended file header of files on remote volumes, attempts to examine the ACL fail, and the file tests will return false, with C<$!> indicating that the file does not exist. You can use C<stat> on these files, since that checks UIC-based protection only, and then manually check the appropriate bits, as defined by your C compiler's F<stat.h>, in the mode value it returns, if you need an approximation of the file's protections. =item backticks Backticks create a subprocess, and pass the enclosed string to it for execution as a DCL command. Since the subprocess is created directly via C<lib$spawn()>, any valid DCL command string may be specified. =item binmode FILEHANDLE The C<binmode> operator will attempt to insure that no translation of carriage control occurs on input from or output to this filehandle. 
Since this involves reopening the file and then restoring its file position indicator, if this function returns FALSE, the underlying filehandle may no longer point to an open file, or may point to a different position in the file than before C<binmode> was called.

Note that C<binmode> is generally not necessary when using normal filehandles; it is provided so that you can control I/O to existing record-structured files when necessary. You can also use the C<vmsfopen> function in the VMS::Stdio extension to gain finer control of I/O to files and devices with different record structures.

=item crypt PLAINTEXT, USER

The C<crypt> operator uses the C<sys$hash_password> system service to generate the hashed representation of PLAINTEXT. If USER is a valid username, the algorithm and salt values are taken from that user's UAF record. If it is not, then the preferred algorithm and a salt of 0 are used. The quadword encrypted value is returned as an 8-character string.

The value returned by C<crypt> may be compared against the encrypted password from the UAF returned by the C<getpw*> functions, in order to authenticate users. If you're going to do this, remember that the encrypted password in the UAF was generated using uppercase username and password strings; you'll have to upcase the arguments to C<crypt> to ensure that you get the proper value:

    sub validate_passwd {
        my($user,$passwd) = @_;
        my($pwdhash);
        # Compare the UAF hash with the hash of the upcased arguments.
        if ( !($pwdhash = (getpwnam($user))[1]) ||
               $pwdhash ne crypt("\U$passwd","\U$user") ) {
            intruder_alert($user);
        }
        return 1;
    }

=item die

C<die> will force the native VMS exit status to be an SS$_ABORT code if neither the C<$!> nor the C<$?> status value is one that would cause the native status to be interpreted as being what VMS classifies as SEVERE_ERROR severity for DCL error handling.
When C<PERL_VMS_POSIX_EXIT> is active (see L</"$?"> below), the native VMS exit status value will have either one of the C<$!> or C<$?> or C<$^E> or the Unix value 255 encoded into it in a way that the effective original value can be decoded by other programs written in C, including Perl and the GNV package. As per the normal non-VMS behavior of C<die> if either C<$!> or C<$?> are non-zero, one of those values will be encoded into a native VMS status value. If both of the Unix status values are 0, and the C<$^E> value is set one of ERROR or SEVERE_ERROR severity, then the C<$^E> value will be used as the exit code as is. If none of the above apply, the Unix value of 255 will be encoded into a native VMS exit status value. Please note a significant difference in the behavior of C<die> in the C<PERL_VMS_POSIX_EXIT> mode is that it does not force a VMS SEVERE_ERROR status on exit. The Unix exit values of 2 through 255 will be encoded in VMS status values with severity levels of SUCCESS. The Unix exit value of 1 will be encoded in a VMS status value with a severity level of ERROR. This is to be compatible with how the VMS C library encodes these values. The minimum severity level set by C<die> in C<PERL_VMS_POSIX_EXIT> mode may be changed to be ERROR or higher in the future depending on the results of testing and further review. See L</"$?"> for a description of the encoding of the Unix value to produce a native VMS status containing it. =item dump Rather than causing Perl to abort and dump core, the C<dump> operator invokes the VMS debugger. If you continue to execute the Perl program under the debugger, control will be transferred to the label specified as the argument to C<dump>, or, if no label was specified, back to the beginning of the program. All other state of the program (I<e.g.> values of variables, open file handles) are not affected by calling C<dump>. 
=item exec LIST A call to C<exec> will cause Perl to exit, and to invoke the command given as an argument to C<exec> via C<lib$do_command>. If the argument begins with '@' or '$' (other than as part of a filespec), then it is executed as a DCL command. Otherwise, the first token on the command line is treated as the filespec of an image to run, and an attempt is made to invoke it (using F<.Exe> and the process defaults to expand the filespec) and pass the rest of C<exec>'s argument to it as parameters. If the token has no file type, and matches a file with null type, then an attempt is made to determine whether the file is an executable image which should be invoked using C<MCR> or a text file which should be passed to DCL as a command procedure. =item fork While in principle the C<fork> operator could be implemented via (and with the same rather severe limitations as) the CRTL C<vfork()> routine, and while some internal support to do just that is in place, the implementation has never been completed, making C<fork> currently unavailable. A true kernel C<fork()> is expected in a future version of VMS, and the pseudo-fork based on interpreter threads may be available in a future version of Perl on VMS (see L<perlfork>). In the meantime, use C<system>, backticks, or piped filehandles to create subprocesses. =item getpwent =item getpwnam =item getpwuid These operators obtain the information described in L<perlfunc>, if you have the privileges necessary to retrieve the named user's UAF information via C<sys$getuai>. If not, then only the C<$name>, C<$uid>, and C<$gid> items are returned. The C<$dir> item contains the login directory in VMS syntax, while the C<$comment> item contains the login directory in Unix syntax. The C<$gcos> item contains the owner field from the UAF record. The C<$quota> item is not used. 
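As a rough illustration of the field layout described for the C<getpw*> operators, the sketch below collects the items named above into a hash. The helper name C<describe_user> and the hash keys are hypothetical, chosen only for this example; the comments restate what each field holds on VMS according to the text above.

```perl
use strict;
use warnings;

# describe_user() is a hypothetical helper, not part of Perl or its VMS port.
# It returns a hash reference for a known user, or nothing for an unknown one.
sub describe_user {
    my ($user) = @_;
    my @pw = getpwnam($user);
    return unless @pw;    # lookup failed

    # Standard field order documented in perlfunc.
    my ($name, $passwd, $uid, $gid, $quota, $comment, $gcos, $dir) = @pw;

    return {
        name       => $name,
        uid        => $uid,
        gid        => $gid,
        owner      => $gcos,       # on VMS: owner field from the UAF record
        login_vms  => $dir,        # on VMS: login directory in VMS syntax
        login_unix => $comment,    # on VMS: login directory in Unix syntax
    };
}
```

Without the privileges needed for C<sys$getuai>, only the C<name>, C<uid>, and C<gid> entries can be expected to hold useful values.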
=item gmtime The C<gmtime> operator will function properly if you have a working CRTL C<gmtime()> routine, or if the logical name SYS$TIMEZONE_DIFFERENTIAL is defined as the number of seconds which must be added to UTC to yield local time. (This logical name is defined automatically if you are running a version of VMS with built-in UTC support.) If neither of these cases is true, a warning message is printed, and C<undef> is returned. =item kill In most cases, C<kill> is implemented via the undocumented system service C<$SIGPRC>, which has the same calling sequence as C<$FORCEX>, but throws an exception in the target process rather than forcing it to call C<$EXIT>. Generally speaking, C<kill> follows the behavior of the CRTL's C<kill()> function, but unlike that function can be called from within a signal handler. Also, unlike the C<kill> in some versions of the CRTL, Perl's C<kill> checks the validity of the signal passed in and returns an error rather than attempting to send an unrecognized signal. Also, negative signal values don't do anything special under VMS; they're just converted to the corresponding positive value. =item qx// See the entry on C<backticks> above. =item select (system call) If Perl was not built with socket support, the system call version of C<select> is not available at all. If socket support is present, then the system call version of C<select> functions only for file descriptors attached to sockets. It will not provide information about regular files or pipes, since the CRTL C<select()> routine does not provide this functionality. =item stat EXPR Since VMS keeps track of files according to a different scheme than Unix, it's not really possible to represent the file's ID in the C<st_dev> and C<st_ino> fields of a C<struct stat>. Perl tries its best, though, and the values it uses are pretty unlikely to be the same for two different files. We can't guarantee this, though, so caveat scriptor. 
=item system LIST

The C<system> operator creates a subprocess, and passes its arguments to the subprocess for execution as a DCL command. Since the subprocess is created directly via C<lib$spawn()>, any valid DCL command string may be specified. If the string begins with '@', it is treated as a DCL command unconditionally. Otherwise, if the first token contains a character used as a delimiter in a file specification (e.g. C<:> or C<]>), an attempt is made to expand it using a default type of F<.Exe> and the process defaults, and if successful, the resulting file is invoked via C<MCR>. This allows you to invoke an image directly simply by passing the file specification to C<system>, a common Unixish idiom. If the token has no file type, and matches a file with null type, then an attempt is made to determine whether the file is an executable image which should be invoked using C<MCR> or a text file which should be passed to DCL as a command procedure.

If LIST consists of the empty string, C<system> spawns an interactive DCL subprocess, in the same fashion as typing B<SPAWN> at the DCL prompt.

Perl waits for the subprocess to complete before continuing execution in the current process. As described in L<perlfunc>, the return value of C<system> is a fake "status" which follows POSIX semantics unless the pragma C<use vmsish 'status'> is in effect; see the description of C<$?> in this document for more detail.

=item time

The value returned by C<time> is the offset in seconds from 01-JAN-1970 00:00:00 (just like the CRTL's time() routine), in order to make life easier for code coming in from the POSIX/Unix world.

=item times

The array returned by the C<times> operator is divided up according to the same rules as the CRTL C<times()> routine uses.
Therefore, the "system time" elements will always be 0, since there is no difference between "user time" and "system" time under VMS, and the time accumulated by a subprocess may or may not appear separately in the "child time" field, depending on whether C<times()> keeps track of subprocesses separately. Note especially that the VAXCRTL (at least) keeps track only of subprocesses spawned using C<fork()> and C<exec()>; it will not accumulate the times of subprocesses spawned via pipes, C<system()>, or backticks. =item unlink LIST C<unlink> will delete the highest version of a file only; in order to delete all versions, you need to say 1 while unlink LIST; You may need to make this change to scripts written for a Unix system which expect that after a call to C<unlink>, no files with the names passed to C<unlink> will exist. (Note: This can be changed at compile time; if you C<use Config> and C<$Config{'d_unlink_all_versions'}> is C<define>, then C<unlink> will delete all versions of a file on the first call.) C<unlink> will delete a file if at all possible, even if it requires changing file protection (though it won't try to change the protection of the parent directory). You can tell whether you've got explicit delete access to a file by using the C<VMS::Filespec::candelete> operator. For instance, in order to delete only files to which you have delete access, you could say something like sub safe_unlink { my($file,$num); foreach $file (@_) { next unless VMS::Filespec::candelete($file); $num += unlink $file; } $num; } (or you could just use C<VMS::Stdio::remove>, if you've installed the VMS::Stdio extension distributed with Perl). If C<unlink> has to change the file protection to delete the file, and you interrupt it in midstream, the file may be left intact, but with a changed ACL allowing you delete access. This behavior of C<unlink> is to be compatible with POSIX behavior and not traditional VMS behavior. 
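The two deletion behaviors described for C<unlink> can be handled by one helper, sketched below. The name C<purge_files> is hypothetical. The sketch assumes only what the text above states: that C<$Config{'d_unlink_all_versions'}> is C<define> when a single C<unlink> call removes every version, and that otherwise each call removes one version.

```perl
use strict;
use warnings;
use Config;

# purge_files() is a hypothetical helper name used for illustration.
# It returns the number of successful unlink calls.
sub purge_files {
    my @files   = @_;
    my $deleted = 0;
    for my $file (@files) {
        if (($Config{'d_unlink_all_versions'} // '') eq 'define') {
            # One call already removes every version of the file.
            $deleted += unlink $file;
        }
        else {
            # Each call removes one version; repeat until unlink fails.
            $deleted++ while unlink $file;
        }
    }
    return $deleted;
}
```

On a system without file versions the loop simply runs once per file, so the same code stays portable.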
=item utime LIST

This operator changes only the modification time of the file (VMS revision date) on ODS-2 volumes and ODS-5 volumes without access dates enabled. On ODS-5 volumes with access dates enabled, the true access time is modified.

=item waitpid PID,FLAGS

If PID is a subprocess started by a piped C<open()> (see L<perlfunc/open>), C<waitpid> will wait for that subprocess, and return its final status value in C<$?>. If PID is a subprocess created in some other way (e.g. SPAWNed before Perl was invoked), C<waitpid> will simply check once per second whether the process has completed, and return when it has. (If PID specifies a process that isn't a subprocess of the current process, and you invoked Perl with the C<-w> switch, a warning will be issued.)

Returns PID on success, -1 on error. The FLAGS argument is ignored in all cases.

=back

=head1 Perl variables

The following VMS-specific information applies to the indicated "special" Perl variables, in addition to the general information in L<perlvar>. Where there is a conflict, this information takes precedence.

=over 4

=item %ENV

The operation of the C<%ENV> hash depends on the translation of the logical name F<PERL_ENV_TABLES>. If defined, it should be a search list, each element of which specifies a location for C<%ENV> elements. If you tell Perl to read or set the element C<$ENV{>I<name>C<}>, then Perl uses the translations of F<PERL_ENV_TABLES> as follows:

=over 4

=item CRTL_ENV

This string tells Perl to consult the CRTL's internal C<environ> array of key-value pairs, using I<name> as the key. In most cases, this contains only a few keys, but if Perl was invoked via the C C<exec[lv]e()> function, as is the case for some embedded Perl applications or when running under a shell such as GNV bash, the C<environ> array may have been populated by the calling program.

=item CLISYM_[LOCAL]

A string beginning with C<CLISYM_> tells Perl to consult the CLI's symbol tables, using I<name> as the name of the symbol.
When reading an element of C<%ENV>, the local symbol table is scanned first, followed by the global symbol table. The characters following C<CLISYM_> are significant when an element of C<%ENV> is set or deleted: if the complete string is C<CLISYM_LOCAL>, the change is made in the local symbol table; otherwise the global symbol table is changed.

=item Any other string

If an element of F<PERL_ENV_TABLES> translates to any other string, that string is used as the name of a logical name table, which is consulted using I<name> as the logical name. The normal search order of access modes is used.

=back

F<PERL_ENV_TABLES> is translated once when Perl starts up; any changes you make while Perl is running do not affect the behavior of C<%ENV>. If F<PERL_ENV_TABLES> is not defined, then Perl defaults to consulting first the logical name tables specified by F<LNM$FILE_DEV>, and then the CRTL C<environ> array. This default order is reversed when the logical name F<GNV$UNIX_SHELL> is defined, such as when running under GNV bash.

For operations on C<%ENV> entries based on logical names or DCL symbols, the key string is treated as if it were entirely uppercase, regardless of the case actually specified in the Perl expression. Entries in C<%ENV> based on the CRTL's C<environ> array preserve the case of the key string when stored, and lookups are case sensitive.

When an element of C<%ENV> is read, the locations to which F<PERL_ENV_TABLES> points are checked in order, and the value obtained from the first successful lookup is returned. If the name of the C<%ENV> element contains a semi-colon, it and any characters after it are removed. These are ignored when the CRTL C<environ> array or a CLI symbol table is consulted. However, if the name is looked up in a logical name table, the suffix after the semi-colon is treated as the translation index to be used for the lookup. This lets you look up successive values for search list logical names.
For instance, if you say $ Define STORY once,upon,a,time,there,was $ perl -e "for ($i = 0; $i <= 6; $i++) " - _$ -e "{ print $ENV{'story;'.$i},' '}" Perl will print C<ONCE UPON A TIME THERE WAS>, assuming, of course, that F<PERL_ENV_TABLES> is set up so that the logical name C<story> is found, rather than a CLI symbol or CRTL C<environ> element with the same name. When an element of C<%ENV> is set to a defined string, the corresponding definition is made in the location to which the first translation of F<PERL_ENV_TABLES> points. If this causes a logical name to be created, it is defined in supervisor mode. (The same is done if an existing logical name was defined in executive or kernel mode; an existing user or supervisor mode logical name is reset to the new value.) If the value is an empty string, the logical name's translation is defined as a single C<NUL> (ASCII C<\0>) character, since a logical name cannot translate to a zero-length string. (This restriction does not apply to CLI symbols or CRTL C<environ> values; they are set to the empty string.) When an element of C<%ENV> is set to C<undef>, the element is looked up as if it were being read, and if it is found, it is deleted. (An item "deleted" from the CRTL C<environ> array is set to the empty string.) Using C<delete> to remove an element from C<%ENV> has a similar effect, but after the element is deleted, another attempt is made to look up the element, so an inner-mode logical name or a name in another location will replace the logical name just deleted. In either case, only the first value found searching PERL_ENV_TABLES is altered. It is not possible at present to define a search list logical name via %ENV. The element C<$ENV{DEFAULT}> is special: when read, it returns Perl's current default device and directory, and when set, it resets them, regardless of the definition of F<PERL_ENV_TABLES>. It cannot be cleared or deleted; attempts to do so are silently ignored. 
Note that if you want to pass on any elements of the C-local environ array to a subprocess which isn't started by fork/exec, or isn't running a C program, you can "promote" them to logical names in the current process, which will then be inherited by all subprocesses, by saying

    foreach my $key (qw[C-local keys you want promoted]) {
        my $temp = $ENV{$key}; # read from C-local array
        $ENV{$key} = $temp;    # and define as logical name
    }

(You can't just say C<$ENV{$key} = $ENV{$key}>, since the Perl optimizer is smart enough to elide the expression.)

Don't try to clear C<%ENV> by saying C<%ENV = ();>; it will throw a fatal error. Clearing C<%ENV> would be equivalent to doing the following from DCL:

    DELETE/LOGICAL *

You can imagine how bad things would be if, for example, the SYS$MANAGER or SYS$SYSTEM logical names were deleted.

At present, the first time you iterate over C<%ENV> using C<keys> or C<values>, you will incur a time penalty as all logical names are read, in order to fully populate C<%ENV>. Subsequent iterations will not reread logical names, so they won't be as slow, but they also won't reflect any changes to logical name tables caused by other programs.

You do need to be careful with the logical names representing process-permanent files, such as C<SYS$INPUT> and C<SYS$OUTPUT>. The translations for these logical names are prepended with a two-byte binary value (0x1B 0x00) that needs to be stripped off if you want to use them. (In previous versions of Perl it wasn't possible to get the values of these logical names, as the null byte acted as an end-of-string marker.)

=item $!

The string value of C<$!> is that returned by the CRTL's strerror() function, so it will include the VMS message for VMS-specific errors. The numeric value of C<$!> is the value of C<errno>, except if errno is EVMSERR, in which case C<$!> contains the value of vaxc$errno. Setting C<$!> always sets errno to the value specified.
If this value is EVMSERR, it also sets vaxc$errno to 4 (NONAME-F-NOMSG), so that the string value of C<$!> won't reflect the VMS error message from before C<$!> was set. =item $^E This variable provides direct access to VMS status values in vaxc$errno, which are often more specific than the generic Unix-style error messages in C<$!>. Its numeric value is the value of vaxc$errno, and its string value is the corresponding VMS message string, as retrieved by sys$getmsg(). Setting C<$^E> sets vaxc$errno to the value specified. While Perl attempts to keep the vaxc$errno value to be current, if errno is not EVMSERR, it may not be from the current operation. =item $? The "status value" returned in C<$?> is synthesized from the actual exit status of the subprocess in a way that approximates POSIX wait(5) semantics, in order to allow Perl programs to portably test for successful completion of subprocesses. The low order 8 bits of C<$?> are always 0 under VMS, since the termination status of a process may or may not have been generated by an exception. The next 8 bits contain the termination status of the program. If the child process follows the convention of C programs compiled with the _POSIX_EXIT macro set, the status value will contain the actual value of 0 to 255 returned by that program on a normal exit. With the _POSIX_EXIT macro set, the Unix exit value of zero is represented as a VMS native status of 1, and the Unix values from 2 to 255 are encoded by the equation: VMS_status = 0x35a000 + (unix_value * 8) + 1. And in the special case of Unix value 1 the encoding is: VMS_status = 0x35a000 + 8 + 2 + 0x10000000. 
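The two encoding rules just given can be written out as a small function. This is only a sketch of the arithmetic stated above, not code taken from Perl or the CRTL, and the function name C<posix_exit_to_vms_status> is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: encode a Unix exit value (0..255) as described above
# for programs built with the _POSIX_EXIT macro set.
sub posix_exit_to_vms_status {
    my ($unix_value) = @_;
    die "exit value must be between 0 and 255\n"
        if $unix_value < 0 || $unix_value > 255;

    return 1 if $unix_value == 0;                  # VMS native status of 1
    return 0x35a000 + 8 + 2 + 0x10000000           # special case for value 1
        if $unix_value == 1;
    return 0x35a000 + ($unix_value * 8) + 1;       # values 2 through 255
}

# Show a few encoded values in hexadecimal.
printf "%3d -> 0x%08x\n", $_, posix_exit_to_vms_status($_) for 0, 1, 2, 255;
```

The low three bits of each result are the VMS severity: 1 (success) for the values 0 and 2 through 255, and 2 (error) for the Unix value 1.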
For other termination statuses, the severity portion of the subprocess's exit status is used: if the severity was success or informational, these bits are all 0; if the severity was warning, they contain a value of 1; if the severity was error or fatal error, they contain the actual severity bits, which turns out to be a value of 2 for error and 4 for severe_error. Fatal is another term for the severe_error status.

As a result, C<$?> will always be zero if the subprocess's exit status indicated successful completion, and non-zero if a warning or error occurred or a program compliant with encoding _POSIX_EXIT values was run and set a status.

How can you tell the difference between a non-zero status that is the result of a VMS native error status and one that is an encoded Unix status? You cannot, unless you look at the C<${^CHILD_ERROR_NATIVE}> value, which holds the actual VMS status value, and check its severity bits. If the severity bits are equal to 1 and the numeric value of C<$?> is 0 or between 2 and 255, then C<$?> accurately reflects a value passed back from a Unix application. If C<$?> is 1 and the severity bits indicate a VMS error (2), then C<$?> is from a Unix application exit value.

In practice, Perl scripts that call programs that return _POSIX_EXIT type status values will be expecting those values, and programs that call traditional VMS programs will either be expecting the previous behavior or just checking for a non-zero status.

And success is always the value 0 in all behaviors.

When the actual VMS termination status of the child is an error, internally the C<$!> value will be set to the closest Unix errno value to that error so that Perl scripts that test for error messages will see the expected Unix style error message instead of a VMS message.

Conversely, when setting C<$?> in an END block, an attempt is made to convert the POSIX value into a native status intelligible to the operating system upon exiting Perl.
What this boils down to is that setting C<$?> to zero results in the generic success value SS$_NORMAL, and setting C<$?> to a non-zero value results in the generic failure status SS$_ABORT. See also L<perlport/exit>.

With the C<PERL_VMS_POSIX_EXIT> logical name defined as "ENABLE", setting C<$?> will cause the new value to be encoded into C<$^E> so that either the original parent or child exit status values 0 to 255 can be automatically recovered by C programs expecting _POSIX_EXIT behavior. If both a parent and a child exit value are non-zero, then it will be assumed that this is actually a VMS native status value to be passed through. The special value of 0xFFFF is almost a NOOP as it will cause the current native VMS status in the C library to become the current native Perl VMS status, and is handled this way as it is known to not be a valid native VMS status value. It is recommended that only values in the range of normal Unix parent or child status numbers, 0 to 255, be used.

The pragma C<use vmsish 'status'> makes C<$?> reflect the actual VMS exit status instead of the default emulation of POSIX status described above. This pragma also disables the conversion of non-zero values to SS$_ABORT when setting C<$?> in an END block (but zero will still be converted to SS$_NORMAL).

Do not use the pragma C<use vmsish 'status'> with C<PERL_VMS_POSIX_EXIT> enabled, as they are at times requesting conflicting actions and the consequence of ignoring this advice will be undefined to allow future improvements in the POSIX exit handling.

In general, with C<PERL_VMS_POSIX_EXIT> enabled, more detailed information will be available in the exit status for DCL scripts or other native VMS tools, and will give the expected information for POSIX programs. It has not been made the default in order to preserve backward compatibility.

N.B. Setting C<DECC$FILENAME_UNIX_REPORT> implicitly enables C<PERL_VMS_POSIX_EXIT>.

=item $|

Setting C<$|> for an I/O stream causes data to be flushed all the way to
disk on each write (I<i.e.> not just to the underlying RMS buffers for a
file). In other words, it's equivalent to calling fflush() and fsync()
from C.

=back

=head1 Standard modules with VMS-specific differences

=head2 SDBM_File

SDBM_File works properly on VMS. It has, however, one minor difference.
The database directory file created has a F<.sdbm_dir> extension rather
than a F<.dir> extension. F<.dir> files are VMS filesystem directory
files, and using them for other purposes could cause unacceptable problems.

=head1 Revision date

Please see the git repository for revision history.

=head1 AUTHOR

Charles Bailey  bailey@cor.newman.upenn.edu
Craig Berry  craigberry@mac.com
Dan Sugalski  dan@sidhe.org
John Malmberg  wb8tyw@qsl.net

=encoding utf8

If you are reading this file in an ordinary text editor, please ignore the
odd markup characters in the text. This file is written in POD (Plain Old
Documentation), a format specifically designed to be readable as is. For
more information about the format, see the perlpod online documentation.

=head1 NAME

perlcn - Simplified Chinese Perl guide

=head1 DESCRIPTION

Welcome to the world of Perl!

Starting with version 5.8.0, Perl has extensive Unicode support, and with
it support for many encodings beyond the Latin family; CJK (Chinese,
Japanese, Korean) is among them. Unicode is an international standard that
aims to cover all of the world's characters: those of the Western world,
the Eastern world, and everything in between (Greek, Syriac, Arabic,
Hebrew, Indic scripts, Native American scripts, and so on). It also
accommodates multiple operating systems and platforms (such as PC and
Macintosh).

Perl itself operates on Unicode. This means that string data inside Perl
can be represented in Unicode, and that Perl's functions and operators
(such as regular expression matching) can also operate on Unicode. For
input and output, to handle data stored in pre-Unicode encodings, Perl
provides the Encode module, which makes it easy to read and write data in
legacy encodings.
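
As a minimal sketch of what reading and writing legacy-encoded data with Encode can look like, the snippet below decodes a hard-coded EUC-CN byte string and re-encodes it as UTF-8. The variable names and the sample bytes (the two characters of "中文") are illustrative assumptions, not part of the original guide; only the C<decode> and C<encode> functions exported by Encode are used.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Illustrative input: "中文" as raw EUC-CN bytes, as they might arrive
# from a legacy file (the byte values are an assumption for the demo).
my $euc_bytes = "\xD6\xD0\xCE\xC4";

# decode() turns encoded bytes into a Perl character string.
my $text = decode('euc-cn', $euc_bytes);

# encode() turns the character string back into bytes, here as UTF-8.
my $utf8_bytes = encode('UTF-8', $text);

printf "%d characters, %d EUC-CN bytes, %d UTF-8 bytes\n",
    length($text), length($euc_bytes), length($utf8_bytes);
```

The same two calls work in the other direction: decode UTF-8 input and encode to 'euc-cn' to produce data for a legacy consumer.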
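
The Encode extension supports the following Simplified Chinese encodings
('gb2312' means 'euc-cn'):

    euc-cn      Extended Unix Code, commonly known as the Guobiao (GB) code
    gb2312-raw  The raw (low-bit) GB2312 character map
    gb12345     The raw Traditional Chinese encoding used in mainland China
    iso-ir-165  GB2312 + GB6345 + GB8565 + additional characters
    cp936       Code page 936; can also be specified as 'GBK' (extended GB)
    hz          7-bit escaped GB2312 encoding

For example, to convert a file in EUC-CN encoding to Unicode, just enter
the following command:

    perl -Mencoding=euc-cn,STDOUT,utf8 -pe1 < file.euc-cn > file.utf8

Perl also ships with "piconv", a character conversion utility written
entirely in Perl. It is used as follows:

    piconv -f euc-cn -t utf8 < file.euc-cn > file.utf8
    piconv -f utf8 -t euc-cn < file.utf8 > file.euc-cn

In addition, with the encoding module you can easily write code that works
in units of characters, as shown below:

    #!/usr/bin/env perl
    # Enable euc-cn string parsing; standard input, output and error
    # are all set to the euc-cn encoding
    use encoding 'euc-cn', STDIN => 'euc-cn', STDOUT => 'euc-cn';
    print length("骆驼");            #  2 (double quotes denote characters)
    print length('骆驼');            #  4 (single quotes denote bytes)
    print index("谆谆教诲", "蛔唤"); # -1 (does not contain this substring)
    print index('谆谆教诲', '蛔唤'); #  1 (starts at the second byte)

In the last example, the second byte of the first "谆" combines with the
first byte of the second "谆" to form the EUC-CN character "蛔", and the
second byte of the second "谆" combines with the first byte of "教" to
form "唤". This solves a problem that used to be common when matching
EUC-CN text.

=head2 Extra Chinese encodings

If you need more Chinese encodings, you can download the Encode::HanExtra
module from CPAN (L<https://www.cpan.org/>). It currently provides the
following encoding:

    gb18030     The extended GB code, which includes Traditional Chinese

In addition, the Encode::HanConvert module provides two encodings for
converting between Simplified and Traditional Chinese:

    big5-simp   Big5 Traditional Chinese <=> Unicode Simplified Chinese
    gbk-trad    GBK Simplified Chinese <=> Unicode Traditional Chinese

To convert between GBK and Big5, see the b2g.pl and g2b.pl programs that
ship with that module, or use the following in your own program:

    use Encode::HanConvert;
    $euc_cn = big5_to_gb($big5); # convert from Big5 to GBK
    $big5 = gb_to_big5($euc_cn); # convert from GBK to Big5

=head2 Further information

Please refer to the extensive documentation that ships with Perl
(unfortunately all written in English) to learn more about Perl and about
how to use Unicode.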
However, there is also a wealth of external resources:

=head2 Web sites offering Perl resources

=over 4

=item L<https://www.perl.org/>

The Perl home page

=item L<https://www.perl.com/>

A collection of articles run by the Perl Foundation

=item L<https://www.cpan.org/>

Comprehensive Perl Archive Network

=item L<https://lists.perl.org/>

A list of Perl mailing lists

=back

=head2 Web sites for learning Perl

=over 4

=item L<http://www.oreilly.com.cn/index.php?func=booklist&cat=68>

O'Reilly Perl books in Simplified Chinese

=back

=head2 Perl user groups

=over 4

=item L<https://www.pm.org/groups/asia.html>

A list of Perl Mongers groups in China

=back

=head2 Unicode-related web sites

=over 4

=item L<https://www.unicode.org/>

The Unicode Consortium (the maker of the Unicode standard)

=item L<https://www.cl.cam.ac.uk/%7Emgk25/unicode.html>

UTF-8 and Unicode FAQ for Unix/Linux

=back

=head1 SEE ALSO

L<Encode>, L<Encode::CN>, L<encoding>, L<perluniintro>, L<perlunicode>

=head1 AUTHORS

Jarkko Hietaniemi E<lt>jhi@iki.fiE<gt>

Audrey Tang (唐凤) E<lt>audreyt@audreyt.orgE<gt>

=cut

If you read this file _as_is_, just ignore the funny characters you see.
It is written in the POD format (see pod/perlpod.pod) which is specifically
designed to be readable as is.

=head1 NAME

perlnetware - Perl for NetWare

=head1 DESCRIPTION

This file gives instructions for building Perl 5.7 and above, and also
Perl modules for NetWare. Before you start, you may want to read the
README file found in the top level directory into which the Perl source
code distribution was extracted. Make sure you read and understand
the terms under which the software is being distributed.

=head1 BUILD

This section describes the steps to be performed to build a Perl NLM
and other associated NLMs.

=head2 Tools & SDK

The build requires the CodeWarrior compiler and linker. In addition,
the "NetWare SDK", "NLM & NetWare Libraries for C" and
"NetWare Server Protocol Libraries for C", all available at
L<http://developer.novell.com/wiki/index.php/Category:Novell_Developer_Kit>,
are required. Microsoft Visual C++ version 4.2 or later is also
required.

=head2 Setup

The build process is dependent on the location of the NetWare SDK.
Once the Tools & SDK are installed, the build environment has to
be set up. The following batch files set up the environment.

=over 4

=item SetNWBld.bat

This file takes two parameters as input: the first is the NetWare SDK
path, and the second is the path to the CodeWarrior compiler and tools.
Executing this file sets these paths and also sets the build type to
Release by default.

=item Buildtype.bat

This is used to set the build type to debug or release. Change the
build type only after executing SetNWBld.bat.

Example:

=over

=item 1.

Typing "buildtype d on" at the command prompt causes the buildtype
to be set to Debug type with D2 flag set.

=item 2.

Typing "buildtype d off" or "buildtype d" at the command prompt causes
the buildtype to be set to Debug type with D1 flag set.

=item 3.

Typing "buildtype r" at the command prompt sets it to Release Build type.

=back

=back

=head2 Make

The make process runs only under WinNT shell. The NetWare makefile is
located under the NetWare folder. This makes use of miniperl.exe to
run some of the Perl scripts. To create miniperl.exe, first set the
required paths for the Visual C++ compiler (specify vcvars32 location) at
the command prompt. Then run nmake from the win32 folder through the WinNT
command prompt. The build process can be stopped after miniperl.exe is
created. Then run nmake from the NetWare folder through the WinNT command
prompt.

Currently the following two build types are tested on NetWare:

=over 4

=item *

USE_MULTI, USE_ITHREADS & USE_IMP_SYS defined

=item *

USE_MULTI & USE_IMP_SYS defined and USE_ITHREADS not defined

=back

=head2 Interpreter

Once miniperl.exe has been created, run nmake from the NetWare folder.
This will build the Perl interpreter for NetWare as I<perl.nlm>.
This is copied under the I<Release> folder if you are doing
a release build, or under the I<Debug> folder for debug builds.

=head2 Extensions

The make process also creates the Perl extensions as
I<< <Extension>.nlm >>.

=head1 INSTALL

To install NetWare Perl onto a NetWare server, first map the Sys
volume of a NetWare server to I<i:>. This is because the makefile by
default sets the drive letter to I<i:>. Type I<nmake nwinstall> from
the NetWare folder on a WinNT command prompt. This will copy the binaries
and module files onto the NetWare server under the I<sys:\Perl>
folder. The Perl interpreter, I<perl.nlm>, is copied under the
I<sys:\perl\system> folder. Copy this to the I<sys:\system> folder.

Example: At the command prompt, type "nmake nwinstall".
This will install NetWare Perl on the NetWare Server.
Similarly, if you type "nmake install", this will cause the
binaries to be installed on the local machine.
(Typically under the c:\perl folder)

=head1 BUILD NEW EXTENSIONS

To build extensions other than standard extensions, NetWare Perl has
to be installed on Windows along with Windows Perl. The Perl for
Windows can be either downloaded from the CPAN site and built using
the sources, or the binaries can be directly downloaded from the
ActiveState site. Installation can be done by invoking I<nmake
install> from the NetWare folder on a WinNT command prompt after
building NetWare Perl by following the steps given above. This will copy
all the *.pm files and other required files. Documentation files are
not copied. Thus one must first install Windows Perl, then install
NetWare Perl.

Once this is done, do the following to build any extension:

=over 4

=item *

Change to the extension directory where its source files are present.

=item *

Run the following command at the command prompt:

    perl -II<path to NetWare lib dir> -II<path to lib> Makefile.pl

Example:

    perl -Ic:/perl/5.6.1/lib/NetWare-x86-multi-thread \
                                  -Ic:\perl\5.6.1\lib MakeFile.pl

or

    perl -Ic:/perl/5.8.0/lib/NetWare-x86-multi-thread \
                                  -Ic:\perl\5.8.0\lib MakeFile.pl

=item *

nmake

=item *

nmake install

Install will copy the files into the Windows machine where NetWare
Perl is installed and these files may have to be copied to the NetWare
server manually. Alternatively, pass I<INSTALLSITELIB=i:\perl\lib> as
an input to makefile.pl above. Here I<i:> is the mapped drive to the
sys: volume of the server where Perl on NetWare is installed. Now
typing I<nmake install> will copy the files onto the NetWare server.

Example: You can execute the following on the command prompt.

    perl -Ic:/perl/5.6.1/lib/NetWare-x86-multi-thread \
              -Ic:\perl\5.6.1\lib MakeFile.pl INSTALLSITELIB=i:\perl\lib

or

    perl -Ic:/perl/5.8.0/lib/NetWare-x86-multi-thread \
              -Ic:\perl\5.8.0\lib MakeFile.pl INSTALLSITELIB=i:\perl\lib

=item *

Note: Some modules downloaded from CPAN may require NetWare related
API in order to build on NetWare. Other modules may however build
smoothly with or without minor changes depending on the type of
module.

=back

=head1 ACKNOWLEDGEMENTS

The makefile for Win32 is used as a reference to create the makefile
for NetWare. Also, the make process for NetWare port uses
miniperl.exe to run scripts during the make and installation process.

=head1 AUTHORS

Anantha Kesari H Y (hyanantha@novell.com)
Aditya C (caditya@novell.com)

=head1 DATE

=over 4

=item *

Created - 18 Jan 2001

=item *

Modified - 25 June 2001

=item *

Modified - 13 July 2001

=item *

Modified - 28 May 2002

=back

-*- buffer-read-only: t -*-
!!!!!!!   DO NOT EDIT THIS FILE   !!!!!!!
This file is built by pod/perlmodlib.PL extracting documentation from the
Perl source files.
Any changes made here will be lost!

=head1 NAME

perlmodlib - constructing new Perl modules and finding existing ones

=head1 THE PERL MODULE LIBRARY

Many modules are included in the Perl distribution. These are described
below, and all end in F<.pm>. You may discover compiled library
files (usually ending in F<.so>) or small pieces of modules to be
autoloaded (ending in F<.al>); these were automatically generated
by the installation process. You may also discover files in the
library directory that end in either F<.pl> or F<.ph>. These are
old libraries supplied so that old programs that use them still
run. The F<.pl> files will all eventually be converted into standard
modules, and the F<.ph> files made by B<h2ph> will probably end up
as extension modules made by B<h2xs>. (Some F<.ph> values may
already be available through the POSIX, Errno, or Fcntl modules.)
The B<pl2pm> file in the distribution may help in your conversion,
but it's just a mechanical process and therefore far from bulletproof.

=head2 Pragmatic Modules

They work somewhat like compiler directives (pragmata) in that they
tend to affect the compilation of your program, and thus will usually
work well only when used within a C<use> or C<no>. Most of these
are lexically scoped, so an inner BLOCK may countermand them
by saying:

    no integer;
    no strict 'refs';
    no warnings;

which lasts until the end of that BLOCK.

Some pragmas are lexically scoped--typically those that affect the
C<$^H> hints variable. Others affect the current package instead,
like C<use vars> and C<use subs>, which allow you to predeclare
variables or subroutines within a particular I<file> rather than
just a block. Such declarations are effective for the entire file
for which they were declared. You cannot rescind them with C<no
vars> or C<no subs>.

The following pragmas are defined (and have their own documentation).

=over 12

=item attributes

Get/set subroutine or variable attributes

=item autodie

Replace functions with ones that succeed or die with lexical scope

=item autodie::exception

Exceptions from autodying functions.

=item autodie::exception::system

Exceptions from autodying system().

=item autodie::hints

Provide hints about user subroutines to autodie

=item autodie::skip

Skip a package when throwing autodie exceptions

=item autouse

Postpone load of modules until a function is used

=item base

Establish an ISA relationship with base classes at compile time

=item bigint

Transparent BigInteger support for Perl

=item bignum

Transparent BigNumber support for Perl

=item bigrat

Transparent BigNumber/BigRational support for Perl

=item blib

Use MakeMaker's uninstalled version of a package

=item bytes

Expose the individual bytes of characters

=item charnames

Access to Unicode character names and named character sequences; also define character names

=item constant

Declare constants

=item deprecate

Perl pragma for deprecating the inclusion of a module in core

=item diagnostics

Produce verbose warning diagnostics

=item encoding

Allows you to write your script in non-ASCII and non-UTF-8

=item encoding::warnings

Warn on implicit encoding conversions

=item experimental

Experimental features made easy

=item feature

Enable new features

=item fields

Compile-time class fields

=item filetest

Control the filetest permission operators

=item if

C<use> a Perl module if a condition holds

=item integer

Use integer arithmetic instead of floating point

=item less

Request less of something

=item lib

Manipulate @INC at compile time

=item locale

Use or avoid POSIX locales for built-in operations

=item mro

Method Resolution Order

=item ok

Alternative to Test::More::use_ok

=item open

Set default PerlIO layers for input and output

=item ops

Restrict unsafe operations when compiling

=item overload

Package for overloading Perl operations

=item overloading

Lexically control overloading

=item parent

Establish an ISA relationship with base classes at compile time

=item re

Alter regular expression behaviour

=item sigtrap

Enable simple signal handling

=item sort

Control sort() behaviour

=item strict

Restrict unsafe constructs

=item subs

Predeclare subroutine names

=item threads

Perl interpreter-based threads

=item threads::shared

Perl extension for sharing data structures between threads

=item utf8

Enable/disable UTF-8 (or UTF-EBCDIC) in source code

=item vars

Predeclare global variable names

=item version

Perl extension for Version Objects

=item vmsish

Control VMS-specific language features

=item warnings

Control optional warnings

=item warnings::register

Warnings import function

=back

=head2 Standard Modules

Standard, bundled modules are all expected to behave in a well-defined
manner with respect to namespace pollution because they use the
Exporter module. See their own documentation for details.

It's possible that not all modules listed below are installed on your
system. For example, the GDBM_File module will not be installed if you
don't have the gdbm library.

=over 12

=item Amiga::ARexx

Perl extension for ARexx support

=item Amiga::Exec

Perl extension for low-level Amiga support

=item AnyDBM_File

Provide framework for multiple DBMs

=item App::Cpan

Easily interact with CPAN from the command line

=item App::Prove

Implements the C<prove> command.

=item App::Prove::State

State storage for the C<prove> command.

=item App::Prove::State::Result

Individual test suite results.

=item App::Prove::State::Result::Test

Individual test results.

=item Archive::Tar

Module for manipulations of tar archives

=item Archive::Tar::File

A subclass for in-memory extracted file from Archive::Tar

=item Attribute::Handlers

Simpler definition of attribute handlers

=item AutoLoader

Load subroutines only on demand

=item AutoSplit

Split a package for autoloading

=item B

The Perl Compiler Backend

=item B::Concise

Walk Perl syntax tree, printing concise info about ops

=item B::Deparse

Perl compiler backend to produce perl code

=item B::Op_private

OP op_private flag definitions

=item B::Showlex

Show lexical variables used in functions or files

=item B::Terse

Walk Perl syntax tree, printing terse info about ops

=item B::Xref

Generates cross reference reports for Perl programs

=item Benchmark

Benchmark running times of Perl code

=item C<IO::Socket::IP>

Family-neutral IP socket supporting both IPv4 and IPv6

=item C<Socket>

Networking constants and support functions

=item CORE

Namespace for Perl's core routines

=item CPAN

Query, download and build perl modules from CPAN sites

=item CPAN::API::HOWTO

A recipe book for programming with CPAN.pm

=item CPAN::Debug

Internal debugging for CPAN.pm

=item CPAN::Distroprefs

Read and match distroprefs

=item CPAN::FirstTime

Utility for CPAN::Config file Initialization

=item CPAN::HandleConfig

Internal configuration handling for CPAN.pm

=item CPAN::Kwalify

Interface between CPAN.pm and Kwalify.pm

=item CPAN::Meta

The distribution metadata for a CPAN dist

=item CPAN::Meta::Converter

Convert CPAN distribution metadata structures

=item CPAN::Meta::Feature

An optional feature provided by a CPAN distribution

=item CPAN::Meta::History

History of CPAN Meta Spec changes

=item CPAN::Meta::History::Meta_1_0

Version 1.0 metadata specification for META.yml

=item CPAN::Meta::History::Meta_1_1

Version 1.1 metadata specification for META.yml

=item CPAN::Meta::History::Meta_1_2

Version 1.2 metadata specification for META.yml

=item CPAN::Meta::History::Meta_1_3

Version 1.3 metadata specification for META.yml

=item CPAN::Meta::History::Meta_1_4

Version 1.4 metadata specification for META.yml

=item CPAN::Meta::Merge

Merging CPAN Meta fragments

=item CPAN::Meta::Prereqs

A set of distribution prerequisites by phase and type

=item CPAN::Meta::Requirements

A set of version requirements for a CPAN dist

=item CPAN::Meta::Spec

Specification for CPAN distribution metadata

=item CPAN::Meta::Validator

Validate CPAN distribution metadata structures

=item CPAN::Meta::YAML

Read and write a subset of YAML for CPAN Meta files

=item CPAN::Nox

Wrapper around CPAN.pm without using any XS module

=item CPAN::Plugin

Base class for CPAN shell extensions

=item CPAN::Plugin::Specfile

Proof of concept implementation of a trivial CPAN::Plugin

=item CPAN::Queue

Internal queue support for CPAN.pm

=item CPAN::Tarzip

Internal handling of tar archives for CPAN.pm

=item CPAN::Version

Utility functions to compare CPAN versions

=item Carp

Alternative warn and die for modules

=item Class::Struct

Declare struct-like datatypes as Perl classes

=item Compress::Raw::Bzip2

Low-Level Interface to bzip2 compression library

=item Compress::Raw::Zlib

Low-Level Interface to zlib compression library

=item Compress::Zlib

Interface to zlib compression library

=item Config

Access Perl configuration information

=item Config::Extensions

Hash lookup of which core extensions were built.

=item Config::Perl::V

Structured data retrieval of perl -V output

=item Cwd

Get pathname of current working directory

=item DB

Programmatic interface to the Perl debugging API

=item DBM_Filter

Filter DBM keys/values

=item DBM_Filter::compress

Filter for DBM_Filter

=item DBM_Filter::encode

Filter for DBM_Filter

=item DBM_Filter::int32

Filter for DBM_Filter

=item DBM_Filter::null

Filter for DBM_Filter

=item DBM_Filter::utf8

Filter for DBM_Filter

=item DB_File

Perl5 access to Berkeley DB version 1.x

=item Data::Dumper

Stringified perl data structures, suitable for both printing and C<eval>

=item Devel::PPPort

Perl/Pollution/Portability

=item Devel::Peek

A data debugging tool for the XS programmer

=item Devel::SelfStubber

Generate stubs for a SelfLoading module

=item Digest

Modules that calculate message digests

=item Digest::MD5

Perl interface to the MD5 Algorithm

=item Digest::SHA

Perl extension for SHA-1/224/256/384/512

=item Digest::base

Digest base class

=item Digest::file

Calculate digests of files

=item DirHandle

(obsolete) supply object methods for directory handles

=item Dumpvalue

Provides screen dump of Perl data.

=item DynaLoader

Dynamically load C libraries into Perl code

=item Encode

Character encodings in Perl

=item Encode::Alias

Alias definitions to encodings

=item Encode::Byte

Single Byte Encodings

=item Encode::CJKConstants

Internally used by Encode::??::ISO_2022_*

=item Encode::CN

China-based Chinese Encodings

=item Encode::CN::HZ

Internally used by Encode::CN

=item Encode::Config

Internally used by Encode

=item Encode::EBCDIC

EBCDIC Encodings

=item Encode::Encoder

Object Oriented Encoder

=item Encode::Encoding

Encode Implementation Base Class

=item Encode::GSM0338

ETSI GSM 03.38 Encoding

=item Encode::Guess

Guesses encoding from data

=item Encode::JP

Japanese Encodings

=item Encode::JP::H2Z

Internally used by Encode::JP::2022_JP*

=item Encode::JP::JIS7

Internally used by Encode::JP

=item Encode::KR

Korean Encodings

=item Encode::KR::2022_KR

Internally used by Encode::KR

=item Encode::MIME::Header

MIME encoding for an unstructured email header

=item Encode::MIME::Name

Internally used by Encode

=item Encode::PerlIO

A detailed document on Encode and PerlIO

=item Encode::Supported

Encodings supported by Encode

=item Encode::Symbol

Symbol Encodings

=item Encode::TW

Taiwan-based Chinese Encodings

=item Encode::Unicode

Various Unicode Transformation Formats

=item Encode::Unicode::UTF7

UTF-7 encoding

=item English

Use nice English (or awk) names for ugly punctuation variables

=item Env

Perl module that imports environment variables as scalars or arrays

=item Errno

System errno constants

=item Exporter

Implements default import method for modules

=item Exporter::Heavy

Exporter guts

=item ExtUtils::CBuilder

Compile and link C code for Perl modules

=item ExtUtils::CBuilder::Platform::Windows

Builder class for Windows platforms

=item ExtUtils::Command

Utilities to replace common UNIX commands in Makefiles etc.

=item ExtUtils::Command::MM

Commands for the MM's to use in Makefiles

=item ExtUtils::Constant

Generate XS code to import C header constants

=item ExtUtils::Constant::Base

Base class for ExtUtils::Constant objects

=item ExtUtils::Constant::Utils

Helper functions for ExtUtils::Constant

=item ExtUtils::Constant::XS

Generate C code for XS modules' constants.

=item ExtUtils::Embed

Utilities for embedding Perl in C/C++ applications

=item ExtUtils::Install

Install files from here to there

=item ExtUtils::Installed

Inventory management of installed modules

=item ExtUtils::Liblist

Determine libraries to use and how to use them

=item ExtUtils::MM

OS adjusted ExtUtils::MakeMaker subclass

=item ExtUtils::MM::Utils

ExtUtils::MM methods without dependency on ExtUtils::MakeMaker

=item ExtUtils::MM_AIX

AIX specific subclass of ExtUtils::MM_Unix

=item ExtUtils::MM_Any

Platform-agnostic MM methods

=item ExtUtils::MM_BeOS

Methods to override UN*X behaviour in ExtUtils::MakeMaker

=item ExtUtils::MM_Cygwin

Methods to override UN*X behaviour in ExtUtils::MakeMaker

=item ExtUtils::MM_DOS

DOS specific subclass of ExtUtils::MM_Unix

=item ExtUtils::MM_Darwin

Special behaviors for OS X

=item ExtUtils::MM_MacOS

Once produced Makefiles for MacOS Classic

=item ExtUtils::MM_NW5

Methods to override UN*X behaviour in ExtUtils::MakeMaker

=item ExtUtils::MM_OS2

Methods to override UN*X behaviour in ExtUtils::MakeMaker

=item ExtUtils::MM_QNX

QNX specific subclass of ExtUtils::MM_Unix

=item ExtUtils::MM_UWIN

U/WIN specific subclass of ExtUtils::MM_Unix

=item ExtUtils::MM_Unix

Methods used by ExtUtils::MakeMaker

=item ExtUtils::MM_VMS

Methods to override UN*X behaviour in ExtUtils::MakeMaker

=item ExtUtils::MM_VOS

VOS specific subclass of ExtUtils::MM_Unix

=item ExtUtils::MM_Win32

Methods to override UN*X behaviour in ExtUtils::MakeMaker

=item ExtUtils::MM_Win95

Method to customize MakeMaker for Win9X

=item ExtUtils::MY

ExtUtils::MakeMaker subclass for customization

=item ExtUtils::MakeMaker

Create a
module Makefile

=item ExtUtils::MakeMaker::Config

Wrapper around Config.pm

=item ExtUtils::MakeMaker::FAQ

Frequently Asked Questions About MakeMaker

=item ExtUtils::MakeMaker::Locale

Bundled Encode::Locale

=item ExtUtils::MakeMaker::Tutorial

Writing a module with MakeMaker

=item ExtUtils::Manifest

Utilities to write and check a MANIFEST file

=item ExtUtils::Miniperl

Write the C code for miniperlmain.c and perlmain.c

=item ExtUtils::Mkbootstrap

Make a bootstrap file for use by DynaLoader

=item ExtUtils::Mksymlists

Write linker options files for dynamic extension

=item ExtUtils::Packlist

Manage .packlist files

=item ExtUtils::ParseXS

Converts Perl XS code into C code

=item ExtUtils::ParseXS::Constants

Initialization values for some globals

=item ExtUtils::ParseXS::Eval

Clean package to evaluate code in

=item ExtUtils::ParseXS::Utilities

Subroutines used with ExtUtils::ParseXS

=item ExtUtils::Typemaps

Read/Write/Modify Perl/XS typemap files

=item ExtUtils::Typemaps::Cmd

Quick commands for handling typemaps

=item ExtUtils::Typemaps::InputMap

Entry in the INPUT section of a typemap

=item ExtUtils::Typemaps::OutputMap

Entry in the OUTPUT section of a typemap

=item ExtUtils::Typemaps::Type

Entry in the TYPEMAP section of a typemap

=item ExtUtils::XSSymSet

Keep sets of symbol names palatable to the VMS linker

=item ExtUtils::testlib

Add blib/* directories to @INC

=item Fatal

Replace functions with equivalents which succeed or die

=item Fcntl

Load the C Fcntl.h defines

=item File::Basename

Parse file paths into directory, filename and suffix.

=item File::Compare

Compare files or filehandles

=item File::Copy

Copy files or filehandles

=item File::DosGlob

DOS like globbing and then some

=item File::Fetch

A generic file fetching mechanism

=item File::Find

Traverse a directory tree.

=item File::Glob

Perl extension for BSD glob routine

=item File::GlobMapper

Extend File Glob to Allow Input and Output Files

=item File::Path

Create or remove directory trees

=item File::Spec

Portably perform operations on file names

=item File::Spec::AmigaOS

File::Spec for AmigaOS

=item File::Spec::Cygwin

Methods for Cygwin file specs

=item File::Spec::Epoc

Methods for Epoc file specs

=item File::Spec::Functions

Portably perform operations on file names

=item File::Spec::Mac

File::Spec for Mac OS (Classic)

=item File::Spec::OS2

Methods for OS/2 file specs

=item File::Spec::Unix

File::Spec for Unix, base for other File::Spec modules

=item File::Spec::VMS

Methods for VMS file specs

=item File::Spec::Win32

Methods for Win32 file specs

=item File::Temp

Return name and handle of a temporary file safely

=item File::stat

By-name interface to Perl's built-in stat() functions

=item FileCache

Keep more files open than the system permits

=item FileHandle

Supply object methods for filehandles

=item Filter::Simple

Simplified source filtering

=item Filter::Util::Call

Perl Source Filter Utility Module

=item FindBin

Locate directory of original perl script

=item GDBM_File

Perl5 access to the gdbm library.

=item Getopt::Long

Extended processing of command line options

=item Getopt::Std

Process single-character switches with switch clustering

=item HTTP::Tiny

A small, simple, correct HTTP/1.1 client

=item Hash::Util

A selection of general-utility hash subroutines

=item Hash::Util::FieldHash

Support for Inside-Out Classes

=item I18N::Collate

Compare 8-bit scalar data according to the current locale

=item I18N::LangTags

Functions for dealing with RFC3066-style language tags

=item I18N::LangTags::Detect

Detect the user's language preferences

=item I18N::LangTags::List

Tags and names for human languages

=item I18N::Langinfo

Query locale information

=item IO

Load various IO modules

=item IO::Compress::Base

Base Class for IO::Compress modules

=item IO::Compress::Bzip2

Write bzip2 files/buffers

=item IO::Compress::Deflate

Write RFC 1950 files/buffers

=item IO::Compress::FAQ

Frequently Asked Questions about IO::Compress

=item IO::Compress::Gzip

Write RFC 1952 files/buffers

=item IO::Compress::RawDeflate

Write RFC 1951 files/buffers

=item IO::Compress::Zip

Write zip files/buffers

=item IO::Dir

Supply object methods for directory handles

=item IO::File

Supply object methods for filehandles

=item IO::Handle

Supply object methods for I/O handles

=item IO::Pipe

Supply object methods for pipes

=item IO::Poll

Object interface to system poll call

=item IO::Seekable

Supply seek based methods for I/O objects

=item IO::Select

OO interface to the select system call

=item IO::Socket

Object interface to socket communications

=item IO::Socket::INET

Object interface for AF_INET domain sockets

=item IO::Socket::UNIX

Object interface for AF_UNIX domain sockets

=item IO::Uncompress::AnyInflate

Uncompress zlib-based (zip, gzip) file/buffer

=item IO::Uncompress::AnyUncompress

Uncompress gzip, zip, bzip2, xz, lzma, lzip, lzf or lzop file/buffer

=item IO::Uncompress::Base

Base Class for IO::Uncompress modules

=item IO::Uncompress::Bunzip2

Read bzip2 files/buffers

=item IO::Uncompress::Gunzip

Read
RFC 1952 files/buffers

=item IO::Uncompress::Inflate

Read RFC 1950 files/buffers

=item IO::Uncompress::RawInflate

Read RFC 1951 files/buffers

=item IO::Uncompress::Unzip

Read zip files/buffers

=item IO::Zlib

IO:: style interface to L<Compress::Zlib>

=item IPC::Cmd

Finding and running system commands made easy

=item IPC::Msg

SysV Msg IPC object class

=item IPC::Open2

Open a process for both reading and writing using open2()

=item IPC::Open3

Open a process for reading, writing, and error handling using open3()

=item IPC::Semaphore

SysV Semaphore IPC object class

=item IPC::SharedMem

SysV Shared Memory IPC object class

=item IPC::SysV

System V IPC constants and system calls

=item Internals

Reserved special namespace for internals related functions

=item JSON::PP

JSON::XS compatible pure-Perl module.

=item JSON::PP::Boolean

Dummy module providing JSON::PP::Boolean

=item List::Util

A selection of general-utility list subroutines

=item List::Util::XS

Indicate if List::Util was compiled with a C compiler

=item Locale::Maketext

Framework for localization

=item Locale::Maketext::Cookbook

Recipes for using Locale::Maketext

=item Locale::Maketext::Guts

Deprecated module to load Locale::Maketext utf8 code

=item Locale::Maketext::GutsLoader

Deprecated module to load Locale::Maketext utf8 code

=item Locale::Maketext::Simple

Simple interface to Locale::Maketext::Lexicon

=item Locale::Maketext::TPJ13

Article about software localization

=item MIME::Base64

Encoding and decoding of base64 strings

=item MIME::QuotedPrint

Encoding and decoding of quoted-printable strings

=item Math::BigFloat

Arbitrary size floating point math package

=item Math::BigInt

Arbitrary size integer/float math package

=item Math::BigInt::Calc

Pure Perl module to support Math::BigInt

=item Math::BigInt::FastCalc

Math::BigInt::Calc with some XS for more speed

=item Math::BigInt::Lib

Virtual parent class for Math::BigInt libraries

=item Math::BigRat

Arbitrary big rational numbers

=item Math::Complex

Complex
numbers and associated mathematical functions

=item Math::Trig

Trigonometric functions

=item Memoize

Make functions faster by trading space for time

=item Memoize::AnyDBM_File

Glue to provide EXISTS for AnyDBM_File for Storable use

=item Memoize::Expire

Plug-in module for automatic expiration of memoized values

=item Memoize::ExpireFile

Test for Memoize expiration semantics

=item Memoize::ExpireTest

Test for Memoize expiration semantics

=item Memoize::NDBM_File

Glue to provide EXISTS for NDBM_File for Storable use

=item Memoize::SDBM_File

Glue to provide EXISTS for SDBM_File for Storable use

=item Memoize::Storable

Store Memoized data in Storable database

=item Module::CoreList

What modules shipped with versions of perl

=item Module::CoreList::Utils

What utilities shipped with versions of perl

=item Module::Load

Runtime require of both modules and files

=item Module::Load::Conditional

Looking up module information / loading at runtime

=item Module::Loaded

Mark modules as loaded or unloaded

=item Module::Metadata

Gather package and POD information from perl module files

=item NDBM_File

Tied access to ndbm files

=item NEXT

Provide a pseudo-class NEXT (et al) that allows method redispatch

=item Net::Cmd

Network Command class (as used by FTP, SMTP etc)

=item Net::Config

Local configuration data for libnet

=item Net::Domain

Attempt to evaluate the current host's internet name and domain

=item Net::FTP

FTP Client class

=item Net::FTP::dataconn

FTP Client data connection class

=item Net::NNTP

NNTP Client class

=item Net::Netrc

OO interface to users netrc file

=item Net::POP3

Post Office Protocol 3 Client class (RFC1939)

=item Net::Ping

Check a remote host for reachability

=item Net::SMTP

Simple Mail Transfer Protocol Client

=item Net::Time

Time and daytime network client interface

=item Net::hostent

By-name interface to Perl's built-in gethost*() functions

=item Net::libnetFAQ

Libnet Frequently Asked Questions

=item Net::netent

By-name interface to Perl's built-in
getnet*() functions =item Net::protoent By-name interface to Perl's built-in getproto*() functions =item Net::servent By-name interface to Perl's built-in getserv*() functions =item O Generic interface to Perl Compiler backends =item ODBM_File Tied access to odbm files =item Opcode Disable named opcodes when compiling perl code =item POSIX Perl interface to IEEE Std 1003.1 =item Params::Check A generic input parsing/checking mechanism. =item Parse::CPAN::Meta Parse META.yml and META.json CPAN metadata files =item Perl::OSType Map Perl operating system names to generic types =item PerlIO On demand loader for PerlIO layers and root of PerlIO::* name space =item PerlIO::encoding Encoding layer =item PerlIO::mmap Memory mapped IO =item PerlIO::scalar In-memory IO, scalar IO =item PerlIO::via Helper class for PerlIO layers implemented in perl =item PerlIO::via::QuotedPrint PerlIO layer for quoted-printable strings =item Pod::Checker Check pod documents for syntax errors =item Pod::Escapes For resolving Pod EE<lt>...E<gt> sequences =item Pod::Functions Group Perl's functions a la perlfunc.pod =item Pod::Html Module to convert pod files to HTML =item Pod::Man Convert POD data to formatted *roff input =item Pod::ParseLink Parse an LE<lt>E<gt> formatting code in POD text =item Pod::Perldoc Look up Perl documentation in Pod format. =item Pod::Perldoc::BaseTo Base for Pod::Perldoc formatters =item Pod::Perldoc::GetOptsOO Customized option parser for Pod::Perldoc =item Pod::Perldoc::ToANSI Render Pod with ANSI color escapes =item Pod::Perldoc::ToChecker Let Perldoc check Pod for errors =item Pod::Perldoc::ToMan Let Perldoc render Pod as man pages =item Pod::Perldoc::ToNroff Let Perldoc convert Pod to nroff =item Pod::Perldoc::ToPod Let Perldoc render Pod as ... Pod! 
=item Pod::Perldoc::ToRtf Let Perldoc render Pod as RTF =item Pod::Perldoc::ToTerm Render Pod with terminal escapes =item Pod::Perldoc::ToText Let Perldoc render Pod as plaintext =item Pod::Perldoc::ToTk Let Perldoc use Tk::Pod to render Pod =item Pod::Perldoc::ToXml Let Perldoc render Pod as XML =item Pod::Simple Framework for parsing Pod =item Pod::Simple::Checker Check the Pod syntax of a document =item Pod::Simple::Debug Put Pod::Simple into trace/debug mode =item Pod::Simple::DumpAsText Dump Pod-parsing events as text =item Pod::Simple::DumpAsXML Turn Pod into XML =item Pod::Simple::HTML Convert Pod to HTML =item Pod::Simple::HTMLBatch Convert several Pod files to several HTML files =item Pod::Simple::JustPod Just the Pod, the whole Pod, and nothing but the Pod =item Pod::Simple::LinkSection Represent "section" attributes of L codes =item Pod::Simple::Methody Turn Pod::Simple events into method calls =item Pod::Simple::PullParser A pull-parser interface to parsing Pod =item Pod::Simple::PullParserEndToken End-tokens from Pod::Simple::PullParser =item Pod::Simple::PullParserStartToken Start-tokens from Pod::Simple::PullParser =item Pod::Simple::PullParserTextToken Text-tokens from Pod::Simple::PullParser =item Pod::Simple::PullParserToken Tokens from Pod::Simple::PullParser =item Pod::Simple::RTF Format Pod as RTF =item Pod::Simple::Search Find POD documents in directory trees =item Pod::Simple::SimpleTree Parse Pod into a simple parse tree =item Pod::Simple::Subclassing Write a formatter as a Pod::Simple subclass =item Pod::Simple::Text Format Pod as plaintext =item Pod::Simple::TextContent Get the text content of Pod =item Pod::Simple::XHTML Format Pod as validating XHTML =item Pod::Simple::XMLOutStream Turn Pod into XML =item Pod::Text Convert POD data to formatted text =item Pod::Text::Color Convert POD data to formatted color ASCII text =item Pod::Text::Overstrike Convert POD data to formatted overstrike text =item Pod::Text::Termcap Convert POD data to 
ASCII text with format escapes =item Pod::Usage Print a usage message from embedded pod documentation =item SDBM_File Tied access to sdbm files =item Safe Compile and execute code in restricted compartments =item Scalar::Util A selection of general-utility scalar subroutines =item Search::Dict Look - search for key in dictionary file =item SelectSaver Save and restore selected file handle =item SelfLoader Load functions only on demand =item Storable Persistence for Perl data structures =item Sub::Util A selection of utility subroutines for subs and CODE references =item Symbol Manipulate Perl symbols and their names =item Sys::Hostname Try every conceivable way to get hostname =item Sys::Syslog Perl interface to the UNIX syslog(3) calls =item Sys::Syslog::Win32 Win32 support for Sys::Syslog =item TAP::Base Base class that provides common functionality to L<TAP::Parser> =item TAP::Formatter::Base Base class for harness output delegates =item TAP::Formatter::Color Run Perl test scripts with color =item TAP::Formatter::Console Harness output delegate for default console output =item TAP::Formatter::Console::ParallelSession Harness output delegate for parallel console output =item TAP::Formatter::Console::Session Harness output delegate for default console output =item TAP::Formatter::File Harness output delegate for file output =item TAP::Formatter::File::Session Harness output delegate for file output =item TAP::Formatter::Session Abstract base class for harness output delegate =item TAP::Harness Run test scripts with statistics =item TAP::Harness::Env Parsing harness related environmental variables where appropriate =item TAP::Object Base class that provides common functionality to all C<TAP::*> modules =item TAP::Parser Parse L<TAP|Test::Harness::TAP> output =item TAP::Parser::Aggregator Aggregate TAP::Parser results =item TAP::Parser::Grammar A grammar for the Test Anything Protocol. 
=item TAP::Parser::Iterator Base class for TAP source iterators =item TAP::Parser::Iterator::Array Iterator for array-based TAP sources =item TAP::Parser::Iterator::Process Iterator for process-based TAP sources =item TAP::Parser::Iterator::Stream Iterator for filehandle-based TAP sources =item TAP::Parser::IteratorFactory Figures out which SourceHandler objects to use for a given Source =item TAP::Parser::Multiplexer Multiplex multiple TAP::Parsers =item TAP::Parser::Result Base class for TAP::Parser output objects =item TAP::Parser::Result::Bailout Bailout result token. =item TAP::Parser::Result::Comment Comment result token. =item TAP::Parser::Result::Plan Plan result token. =item TAP::Parser::Result::Pragma TAP pragma token. =item TAP::Parser::Result::Test Test result token. =item TAP::Parser::Result::Unknown Unknown result token. =item TAP::Parser::Result::Version TAP syntax version token. =item TAP::Parser::Result::YAML YAML result token. =item TAP::Parser::ResultFactory Factory for creating TAP::Parser output objects =item TAP::Parser::Scheduler Schedule tests during parallel testing =item TAP::Parser::Scheduler::Job A single testing job. =item TAP::Parser::Scheduler::Spinner A no-op job. =item TAP::Parser::Source A TAP source & meta data about it =item TAP::Parser::SourceHandler Base class for different TAP source handlers =item TAP::Parser::SourceHandler::Executable Stream output from an executable TAP source =item TAP::Parser::SourceHandler::File Stream TAP from a text file. =item TAP::Parser::SourceHandler::Handle Stream TAP from an IO::Handle or a GLOB. =item TAP::Parser::SourceHandler::Perl Stream TAP from a Perl executable =item TAP::Parser::SourceHandler::RawTAP Stream output from raw TAP in a scalar/array ref. 
=item TAP::Parser::YAMLish::Reader Read YAMLish data from iterator =item TAP::Parser::YAMLish::Writer Write YAMLish data =item Term::ANSIColor Color screen output using ANSI escape sequences =item Term::Cap Perl termcap interface =item Term::Complete Perl word completion module =item Term::ReadLine Perl interface to various C<readline> packages. =item Test Provides a simple framework for writing test scripts =item Test2 Framework for writing test tools that all work together. =item Test2::API Primary interface for writing Test2 based testing tools. =item Test2::API::Breakage What breaks at what version =item Test2::API::Context Object to represent a testing context. =item Test2::API::Instance Object used by Test2::API under the hood =item Test2::API::Stack Object to manage a stack of L<Test2::Hub> =item Test2::Event Base class for events =item Test2::Event::Bail Bailout! =item Test2::Event::Diag Diag event type =item Test2::Event::Encoding Set the encoding for the output stream =item Test2::Event::Exception Exception event =item Test2::Event::Fail Event for a simple failed assertion =item Test2::Event::Generic Generic event type. =item Test2::Event::Note Note event type =item Test2::Event::Ok Ok event type =item Test2::Event::Pass Event for a simple passing assertion =item Test2::Event::Plan The event of a plan =item Test2::Event::Skip Skip event type =item Test2::Event::Subtest Event for subtest types =item Test2::Event::TAP::Version Event for TAP version. =item Test2::Event::V2 Second generation event. =item Test2::Event::Waiting Tell all procs/threads it is time to be done =item Test2::EventFacet Base class for all event facets. =item Test2::EventFacet::About Facet with event details. =item Test2::EventFacet::Amnesty Facet for assertion amnesty. =item Test2::EventFacet::Assert Facet representing an assertion. =item Test2::EventFacet::Control Facet for hub actions and behaviors. =item Test2::EventFacet::Error Facet for errors that need to be shown. 
=item Test2::EventFacet::Hub Facet for the hubs an event passes through. =item Test2::EventFacet::Info Facet for information a developer might care about. =item Test2::EventFacet::Info::Table Intermediary representation of a table. =item Test2::EventFacet::Meta Facet for meta-data =item Test2::EventFacet::Parent Facet for events that contain other events =item Test2::EventFacet::Plan Facet for setting the plan =item Test2::EventFacet::Render Facet that dictates how to render an event. =item Test2::EventFacet::Trace Debug information for events =item Test2::Formatter Namespace for formatters. =item Test2::Formatter::TAP Standard TAP formatter =item Test2::Hub The conduit through which all events flow. =item Test2::Hub::Interceptor Hub used by interceptor to grab results. =item Test2::Hub::Interceptor::Terminator Exception class used by =item Test2::Hub::Subtest Hub used by subtests =item Test2::IPC Turn on IPC for threading or forking support. =item Test2::IPC::Driver Base class for Test2 IPC drivers. =item Test2::IPC::Driver::Files Temp dir + Files concurrency model. =item Test2::Tools::Tiny Tiny set of tools for unfortunate souls who cannot use =item Test2::Transition Transition notes when upgrading to Test2 =item Test2::Util Tools used by Test2 and friends. =item Test2::Util::ExternalMeta Allow third party tools to safely attach meta-data =item Test2::Util::Facets2Legacy Convert facet data to the legacy event API. =item Test2::Util::HashBase Build hash based classes. =item Test2::Util::Trace Legacy wrapper for L<Test2::EventFacet::Trace>.
=item Test::Builder Backend for building test libraries =item Test::Builder::Formatter Test::Builder subclass of Test2::Formatter::TAP =item Test::Builder::IO::Scalar A copy of IO::Scalar for Test::Builder =item Test::Builder::Module Base class for test modules =item Test::Builder::Tester Test testsuites that have been built with =item Test::Builder::Tester::Color Turn on colour in Test::Builder::Tester =item Test::Builder::TodoDiag Test::Builder subclass of Test2::Event::Diag =item Test::Harness Run Perl standard test scripts with statistics =item Test::Harness::Beyond Beyond make test =item Test::More Yet another framework for writing test scripts =item Test::Simple Basic utilities for writing tests. =item Test::Tester Ease testing test modules built with Test::Builder =item Test::Tester::Capture Help testing test modules built with Test::Builder =item Test::Tester::CaptureRunner Help testing test modules built with Test::Builder =item Test::Tutorial A tutorial about writing really basic tests =item Test::use::ok Alternative to Test::More::use_ok =item Text::Abbrev Abbrev - create an abbreviation table from a list =item Text::Balanced Extract delimited text sequences from strings. 
=item Text::ParseWords Parse text into an array of tokens or array of arrays =item Text::Tabs Expand and unexpand tabs like unix expand(1) and unexpand(1) =item Text::Wrap Line wrapping to form simple paragraphs =item Thread Manipulate threads in Perl (for old code only) =item Thread::Queue Thread-safe queues =item Thread::Semaphore Thread-safe semaphores =item Tie::Array Base class for tied arrays =item Tie::File Access the lines of a disk file via a Perl array =item Tie::Handle Base class definitions for tied handles =item Tie::Hash Base class definitions for tied hashes =item Tie::Hash::NamedCapture Named regexp capture buffers =item Tie::Memoize Add data to hash when needed =item Tie::RefHash Use references as hash keys =item Tie::Scalar Base class definitions for tied scalars =item Tie::StdHandle Base class definitions for tied handles =item Tie::SubstrHash Fixed-table-size, fixed-key-length hashing =item Time::HiRes High resolution alarm, sleep, gettimeofday, interval timers =item Time::Local Efficiently compute time from local and GMT time =item Time::Piece Object Oriented time objects =item Time::Seconds A simple API to convert seconds to other date values =item Time::gmtime By-name interface to Perl's built-in gmtime() function =item Time::localtime By-name interface to Perl's built-in localtime() function =item Time::tm Internal object used by Time::gmtime and Time::localtime =item UNIVERSAL Base class for ALL classes (blessed references) =item Unicode::Collate Unicode Collation Algorithm =item Unicode::Collate::CJK::Big5 Weighting CJK Unified Ideographs =item Unicode::Collate::CJK::GB2312 Weighting CJK Unified Ideographs =item Unicode::Collate::CJK::JISX0208 Weighting JIS KANJI for Unicode::Collate =item Unicode::Collate::CJK::Korean Weighting CJK Unified Ideographs =item Unicode::Collate::CJK::Pinyin Weighting CJK Unified Ideographs =item Unicode::Collate::CJK::Stroke Weighting CJK Unified Ideographs =item Unicode::Collate::CJK::Zhuyin Weighting CJK 
Unified Ideographs =item Unicode::Collate::Locale Linguistic tailoring for DUCET via Unicode::Collate =item Unicode::Normalize Unicode Normalization Forms =item Unicode::UCD Unicode character database =item User::grent By-name interface to Perl's built-in getgr*() functions =item User::pwent By-name interface to Perl's built-in getpw*() functions =item VMS::DCLsym Perl extension to manipulate DCL symbols =item VMS::Filespec Convert between VMS and Unix file specification syntax =item VMS::Stdio Standard I/O functions via VMS extensions =item Win32 Interfaces to some Win32 API Functions =item Win32API::File Low-level access to Win32 system API calls for files/dirs. =item Win32CORE Win32 CORE function stubs =item XS::APItest Test the perl C API =item XS::Typemap Module to test the XS typemaps distributed with perl =item XSLoader Dynamically load C libraries into Perl code =item autodie::Scope::Guard Wrapper class for calling subs at end of scope =item autodie::Scope::GuardStack Hook stack for managing scopes via %^H =item autodie::Util Internal Utility subroutines for autodie and Fatal =item version::Internals Perl extension for Version Objects =back To find out I<all> modules installed on your system, including those without documentation or outside the standard release, just use the following command (under the default win32 shell, double quotes should be used instead of single quotes). % perl -MFile::Find=find -MFile::Spec::Functions -Tlwe \ 'find { wanted => sub { print canonpath $_ if /\.pm\z/ }, no_chdir => 1 }, @INC' (The -T is here to prevent '.' from being listed in @INC.) They should all have their own documentation installed and accessible via your system man(1) command. If you do not have a B<find> program, you can use the Perl B<find2perl> program instead, which generates Perl code as output you can run through perl. If you have a B<man> program but it doesn't find your modules, you'll have to fix your manpath. See L<perl> for details. 
If you have no system B<man> command, you might try the B<perldoc> program.

Note also that the command C<perldoc perllocal> gives you a (possibly incomplete) list of the modules that have been further installed on your system. (The perllocal.pod file is updated by the standard MakeMaker install process.)

=head2 Extension Modules

Extension modules are written in C (or a mix of Perl and C). They are usually dynamically loaded into Perl if and when you need them, but may also be linked in statically. Supported extension modules include Socket, Fcntl, and POSIX.

Many popular C extension modules do not come bundled (at least, not completely) due to their sizes, volatility, or simply lack of time for adequate testing and configuration across the multitude of platforms on which Perl was beta-tested. You are encouraged to look for them on CPAN (described below), or using web search engines like Google or DuckDuckGo.

=head1 CPAN

CPAN stands for Comprehensive Perl Archive Network; it's a globally replicated trove of Perl materials, including documentation, style guides, tricks and traps, alternate ports to non-Unix systems and occasional binary distributions for these. Search engines for CPAN can be found at https://www.cpan.org/

Most importantly, CPAN includes around a thousand unbundled modules, some of which require a C compiler to build.
Major categories of modules are:

=over

=item *

Language Extensions and Documentation Tools

=item *

Development Support

=item *

Operating System Interfaces

=item *

Networking, Device Control (modems) and InterProcess Communication

=item *

Data Types and Data Type Utilities

=item *

Database Interfaces

=item *

User Interfaces

=item *

Interfaces to / Emulations of Other Programming Languages

=item *

File Names, File Systems and File Locking (see also File Handles)

=item *

String Processing, Language Text Processing, Parsing, and Searching

=item *

Option, Argument, Parameter, and Configuration File Processing

=item *

Internationalization and Locale

=item *

Authentication, Security, and Encryption

=item *

World Wide Web, HTML, HTTP, CGI, MIME

=item *

Server and Daemon Utilities

=item *

Archiving and Compression

=item *

Images, Pixmap and Bitmap Manipulation, Drawing, and Graphing

=item *

Mail and Usenet News

=item *

Control Flow Utilities (callbacks and exceptions etc)

=item *

File Handle and Input/Output Stream Utilities

=item *

Miscellaneous Modules

=back

The list of the registered CPAN sites follows. Please note that the sorting order is alphabetical on fields:

    Continent
       |
       |-->Country
             |
             |-->[state/province]
                       |
                       |-->ftp
                       |
                       |-->[http]

and thus the North American servers happen to be listed between the European and the South American sites.
Registered CPAN sites =for maintainers Generated by Porting/make_modlib_cpan.pl =head2 Africa =over 4 =item South Africa http://mirror.is.co.za/pub/cpan/ ftp://ftp.is.co.za/pub/cpan/ http://cpan.mirror.ac.za/ ftp://cpan.mirror.ac.za/ http://cpan.saix.net/ ftp://ftp.saix.net/pub/CPAN/ http://ftp.wa.co.za/pub/CPAN/ ftp://ftp.wa.co.za/pub/CPAN/ =item Uganda http://mirror.ucu.ac.ug/cpan/ =item Zimbabwe http://mirror.zol.co.zw/CPAN/ ftp://mirror.zol.co.zw/CPAN/ =back =head2 Asia =over 4 =item Bangladesh http://mirror.dhakacom.com/CPAN/ ftp://mirror.dhakacom.com/CPAN/ =item China http://cpan.communilink.net/ http://ftp.cuhk.edu.hk/pub/packages/perl/CPAN/ ftp://ftp.cuhk.edu.hk/pub/packages/perl/CPAN/ http://mirrors.hust.edu.cn/CPAN/ http://mirrors.neusoft.edu.cn/cpan/ http://mirror.lzu.edu.cn/CPAN/ http://mirrors.163.com/cpan/ http://mirrors.sohu.com/CPAN/ http://mirrors.ustc.edu.cn/CPAN/ ftp://mirrors.ustc.edu.cn/CPAN/ http://mirrors.xmu.edu.cn/CPAN/ ftp://mirrors.xmu.edu.cn/CPAN/ http://mirrors.zju.edu.cn/CPAN/ =item India http://cpan.excellmedia.net/ http://perlmirror.indialinks.com/ =item Indonesia http://kambing.ui.ac.id/cpan/ http://cpan.pesat.net.id/ http://mirror.poliwangi.ac.id/CPAN/ http://kartolo.sby.datautama.net.id/CPAN/ http://mirror.wanxp.id/cpan/ =item Iran http://mirror.yazd.ac.ir/cpan/ =item Israel http://biocourse.weizmann.ac.il/CPAN/ =item Japan http://ftp.jaist.ac.jp/pub/CPAN/ ftp://ftp.jaist.ac.jp/pub/CPAN/ http://mirror.jre655.com/CPAN/ ftp://mirror.jre655.com/CPAN/ ftp://ftp.kddilabs.jp/CPAN/ http://ftp.nara.wide.ad.jp/pub/CPAN/ ftp://ftp.nara.wide.ad.jp/pub/CPAN/ http://ftp.riken.jp/lang/CPAN/ ftp://ftp.riken.jp/lang/CPAN/ ftp://ftp.u-aizu.ac.jp/pub/CPAN/ http://ftp.yz.yamagata-u.ac.jp/pub/lang/cpan/ ftp://ftp.yz.yamagata-u.ac.jp/pub/lang/cpan/ =item Kazakhstan http://mirror.neolabs.kz/CPAN/ ftp://mirror.neolabs.kz/CPAN/ =item Philippines http://mirror.pregi.net/CPAN/ ftp://mirror.pregi.net/CPAN/ http://mirror.rise.ph/cpan/ 
ftp://mirror.rise.ph/cpan/ =item Qatar http://mirror.qnren.qa/CPAN/ ftp://mirror.qnren.qa/CPAN/ =item Republic of Korea http://cpan.mirror.cdnetworks.com/ ftp://cpan.mirror.cdnetworks.com/CPAN/ http://ftp.kaist.ac.kr/pub/CPAN/ ftp://ftp.kaist.ac.kr/CPAN/ http://ftp.kr.freebsd.org/pub/CPAN/ ftp://ftp.kr.freebsd.org/pub/CPAN/ http://mirror.navercorp.com/CPAN/ http://ftp.neowiz.com/CPAN/ ftp://ftp.neowiz.com/CPAN/ =item Singapore http://cpan.mirror.choon.net/ http://mirror.0x.sg/CPAN/ ftp://mirror.0x.sg/CPAN/ =item Taiwan http://cpan.cdpa.nsysu.edu.tw/Unix/Lang/CPAN/ ftp://cpan.cdpa.nsysu.edu.tw/Unix/Lang/CPAN/ http://cpan.stu.edu.tw/ ftp://ftp.stu.edu.tw/CPAN/ http://ftp.yzu.edu.tw/CPAN/ ftp://ftp.yzu.edu.tw/CPAN/ http://cpan.nctu.edu.tw/ ftp://cpan.nctu.edu.tw/ http://ftp.ubuntu-tw.org/mirror/CPAN/ ftp://ftp.ubuntu-tw.org/mirror/CPAN/ =item Turkey http://cpan.ulak.net.tr/ ftp://ftp.ulak.net.tr/pub/perl/CPAN/ http://mirror.vit.com.tr/mirror/CPAN/ ftp://mirror.vit.com.tr/CPAN/ =item Viet Nam http://mirrors.digipower.vn/CPAN/ http://mirror.downloadvn.com/cpan/ http://mirrors.vinahost.vn/CPAN/ =back =head2 Europe =over 4 =item Austria http://cpan.inode.at/ ftp://cpan.inode.at/ http://mirror.easyname.at/cpan/ ftp://mirror.easyname.at/cpan/ http://gd.tuwien.ac.at/languages/perl/CPAN/ ftp://gd.tuwien.ac.at/pub/CPAN/ =item Belarus http://ftp.byfly.by/pub/CPAN/ ftp://ftp.byfly.by/pub/CPAN/ http://mirror.datacenter.by/pub/CPAN/ ftp://mirror.datacenter.by/pub/CPAN/ =item Belgium http://ftp.belnet.be/ftp.cpan.org/ ftp://ftp.belnet.be/mirror/ftp.cpan.org/ http://cpan.cu.be/ http://lib.ugent.be/CPAN/ http://cpan.weepeetelecom.be/ =item Bosnia and Herzegovina http://cpan.mirror.ba/ ftp://ftp.mirror.ba/CPAN/ =item Bulgaria http://mirrors.neterra.net/CPAN/ ftp://mirrors.neterra.net/CPAN/ http://mirrors.netix.net/CPAN/ ftp://mirrors.netix.net/CPAN/ =item Croatia http://ftp.carnet.hr/pub/CPAN/ ftp://ftp.carnet.hr/pub/CPAN/ =item Czech Republic http://mirror.dkm.cz/cpan/ 
ftp://mirror.dkm.cz/cpan/ ftp://ftp.fi.muni.cz/pub/CPAN/ http://mirrors.nic.cz/CPAN/ ftp://mirrors.nic.cz/pub/CPAN/ http://cpan.mirror.vutbr.cz/ ftp://mirror.vutbr.cz/cpan/ =item Denmark http://www.cpan.dk/ http://mirrors.dotsrc.org/cpan/ ftp://mirrors.dotsrc.org/cpan/ =item Finland ftp://ftp.funet.fi/pub/languages/perl/CPAN/ =item France http://ftp.ciril.fr/pub/cpan/ ftp://ftp.ciril.fr/pub/cpan/ http://distrib-coffee.ipsl.jussieu.fr/pub/mirrors/cpan/ ftp://distrib-coffee.ipsl.jussieu.fr/pub/mirrors/cpan/ http://ftp.lip6.fr/pub/perl/CPAN/ ftp://ftp.lip6.fr/pub/perl/CPAN/ http://mirror.ibcp.fr/pub/CPAN/ ftp://ftp.oleane.net/pub/CPAN/ http://cpan.mirrors.ovh.net/ftp.cpan.org/ ftp://cpan.mirrors.ovh.net/ftp.cpan.org/ http://cpan.enstimac.fr/ =item Germany http://mirror.23media.de/cpan/ ftp://mirror.23media.de/cpan/ http://artfiles.org/cpan.org/ ftp://artfiles.org/cpan.org/ http://mirror.bibleonline.ru/cpan/ http://mirror.checkdomain.de/CPAN/ ftp://mirror.checkdomain.de/CPAN/ http://cpan.noris.de/ http://mirror.de.leaseweb.net/CPAN/ ftp://mirror.de.leaseweb.net/CPAN/ http://cpan.mirror.euserv.net/ ftp://mirror.euserv.net/cpan/ http://ftp-stud.hs-esslingen.de/pub/Mirrors/CPAN/ ftp://mirror.fraunhofer.de/CPAN/ ftp://ftp.freenet.de/pub/ftp.cpan.org/pub/CPAN/ http://ftp.hosteurope.de/pub/CPAN/ ftp://ftp.hosteurope.de/pub/CPAN/ ftp://ftp.fu-berlin.de/unix/languages/perl/ http://ftp.gwdg.de/pub/languages/perl/CPAN/ ftp://ftp.gwdg.de/pub/languages/perl/CPAN/ http://ftp.hawo.stw.uni-erlangen.de/CPAN/ ftp://ftp.hawo.stw.uni-erlangen.de/CPAN/ http://cpan.mirror.iphh.net/ ftp://cpan.mirror.iphh.net/pub/CPAN/ ftp://ftp.mpi-inf.mpg.de/pub/perl/CPAN/ http://cpan.netbet.org/ http://mirror.netcologne.de/cpan/ ftp://mirror.netcologne.de/cpan/ ftp://mirror.petamem.com/CPAN/ http://www.planet-elektronik.de/CPAN/ http://ftp.halifax.rwth-aachen.de/cpan/ ftp://ftp.halifax.rwth-aachen.de/cpan/ http://mirror.softaculous.com/cpan/ http://ftp.u-tx.net/CPAN/ ftp://ftp.u-tx.net/CPAN/ 
http://mirror.reismil.ch/CPAN/ =item Greece http://cpan.cc.uoc.gr/mirrors/CPAN/ ftp://ftp.cc.uoc.gr/mirrors/CPAN/ http://ftp.ntua.gr/pub/lang/perl/ ftp://ftp.ntua.gr/pub/lang/perl/ =item Hungary http://mirror.met.hu/CPAN/ =item Ireland http://ftp.heanet.ie/mirrors/ftp.perl.org/pub/CPAN/ ftp://ftp.heanet.ie/mirrors/ftp.perl.org/pub/CPAN/ =item Italy http://bo.mirror.garr.it/mirrors/CPAN/ ftp://ftp.eutelia.it/CPAN_Mirror/ http://cpan.panu.it/ ftp://ftp.panu.it/pub/mirrors/perl/CPAN/ http://cpan.muzzy.it/ =item Latvia http://kvin.lv/pub/CPAN/ =item Lithuania http://ftp.litnet.lt/pub/CPAN/ ftp://ftp.litnet.lt/pub/CPAN/ =item Moldova http://mirror.as43289.net/pub/CPAN/ ftp://mirror.as43289.net/pub/CPAN/ =item Netherlands http://cpan.cs.uu.nl/ ftp://ftp.cs.uu.nl/pub/CPAN/ http://mirror.nl.leaseweb.net/CPAN/ ftp://mirror.nl.leaseweb.net/CPAN/ http://ftp.nluug.nl/languages/perl/CPAN/ ftp://ftp.nluug.nl/pub/languages/perl/CPAN/ http://mirror.transip.net/CPAN/ ftp://mirror.transip.net/CPAN/ http://cpan.mirror.triple-it.nl/ http://ftp.tudelft.nl/cpan/ ftp://ftp.tudelft.nl/pub/CPAN/ ftp://download.xs4all.nl/pub/mirror/CPAN/ =item Norway http://cpan.uib.no/ ftp://cpan.uib.no/pub/CPAN/ ftp://ftp.uninett.no/pub/languages/perl/CPAN/ http://cpan.vianett.no/ =item Poland http://ftp.agh.edu.pl/CPAN/ ftp://ftp.agh.edu.pl/CPAN/ http://ftp.piotrkosoft.net/pub/mirrors/CPAN/ ftp://ftp.piotrkosoft.net/pub/mirrors/CPAN/ ftp://ftp.ps.pl/pub/CPAN/ http://sunsite.icm.edu.pl/pub/CPAN/ ftp://sunsite.icm.edu.pl/pub/CPAN/ =item Portugal http://cpan.dcc.fc.up.pt/ http://mirrors.fe.up.pt/pub/CPAN/ http://cpan.perl-hackers.net/ http://cpan.perl.pt/ =item Romania http://mirrors.hostingromania.ro/cpan.org/ ftp://ftp.lug.ro/CPAN/ http://mirrors.m247.ro/CPAN/ http://mirrors.evowise.com/CPAN/ http://mirrors.teentelecom.net/CPAN/ ftp://mirrors.teentelecom.net/CPAN/ http://mirrors.xservers.ro/CPAN/ =item Russian Federation ftp://ftp.aha.ru/CPAN/ http://cpan.rinet.ru/ ftp://cpan.rinet.ru/pub/mirror/CPAN/ 
http://cpan-mirror.rbc.ru/pub/CPAN/ http://mirror.rol.ru/CPAN/ http://cpan.uni-altai.ru/ http://cpan.webdesk.ru/ ftp://cpan.webdesk.ru/cpan/ http://mirror.yandex.ru/mirrors/cpan/ ftp://mirror.yandex.ru/mirrors/cpan/ =item Serbia http://mirror.sbb.rs/CPAN/ ftp://mirror.sbb.rs/CPAN/ =item Slovakia http://cpan.lnx.sk/ http://tux.rainside.sk/CPAN/ ftp://tux.rainside.sk/CPAN/ =item Slovenia http://ftp.arnes.si/software/perl/CPAN/ ftp://ftp.arnes.si/software/perl/CPAN/ =item Spain http://mirrors.evowise.com/CPAN/ http://osl.ugr.es/CPAN/ http://ftp.rediris.es/mirror/CPAN/ ftp://ftp.rediris.es/mirror/CPAN/ =item Sweden http://ftp.acc.umu.se/mirror/CPAN/ ftp://ftp.acc.umu.se/mirror/CPAN/ =item Switzerland http://www.pirbot.com/mirrors/cpan/ http://mirror.switch.ch/ftp/mirror/CPAN/ ftp://mirror.switch.ch/mirror/CPAN/ =item Ukraine http://cpan.ip-connect.vn.ua/ ftp://cpan.ip-connect.vn.ua/mirror/cpan/ =item United Kingdom http://cpan.mirror.anlx.net/ ftp://ftp.mirror.anlx.net/CPAN/ http://mirror.bytemark.co.uk/CPAN/ ftp://mirror.bytemark.co.uk/CPAN/ http://mirrors.coreix.net/CPAN/ http://cpan.etla.org/ ftp://cpan.etla.org/pub/CPAN/ http://cpan.cpantesters.org/ http://mirror.sax.uk.as61049.net/CPAN/ http://mirror.sov.uk.goscomb.net/CPAN/ http://www.mirrorservice.org/sites/cpan.perl.org/CPAN/ ftp://ftp.mirrorservice.org/sites/cpan.perl.org/CPAN/ http://mirror.ox.ac.uk/sites/www.cpan.org/ ftp://mirror.ox.ac.uk/sites/www.cpan.org/ http://ftp.ticklers.org/pub/CPAN/ ftp://ftp.ticklers.org/pub/CPAN/ http://cpan.mirrors.uk2.net/ ftp://mirrors.uk2.net/pub/CPAN/ http://mirror.ukhost4u.com/CPAN/ =back =head2 North America =over 4 =item Canada http://CPAN.mirror.rafal.ca/ ftp://CPAN.mirror.rafal.ca/pub/CPAN/ http://mirror.csclub.uwaterloo.ca/CPAN/ ftp://mirror.csclub.uwaterloo.ca/CPAN/ http://mirrors.gossamer-threads.com/CPAN/ http://mirror.its.dal.ca/cpan/ ftp://mirror.its.dal.ca/cpan/ ftp://ftp.ottix.net/pub/CPAN/ =item Costa Rica http://mirrors.ucr.ac.cr/CPAN/ =item Mexico 
http://www.msg.com.mx/CPAN/ ftp://ftp.msg.com.mx/pub/CPAN/ =item United States =over 8 =item Alabama http://mirror.teklinks.com/CPAN/ =item Arizona http://mirror.n5tech.com/CPAN/ http://mirrors.namecheap.com/CPAN/ ftp://mirrors.namecheap.com/CPAN/ =item California http://cpan.develooper.com/ http://httpupdate127.cpanel.net/CPAN/ http://mirrors.sonic.net/cpan/ ftp://mirrors.sonic.net/cpan/ http://www.perl.com/CPAN/ http://cpan.yimg.com/ =item Idaho http://mirrors.syringanetworks.net/CPAN/ ftp://mirrors.syringanetworks.net/CPAN/ =item Illinois http://cpan.mirrors.hoobly.com/ http://mirror.team-cymru.org/CPAN/ ftp://mirror.team-cymru.org/CPAN/ =item Indiana http://cpan.netnitco.net/ ftp://cpan.netnitco.net/pub/mirrors/CPAN/ ftp://ftp.uwsg.iu.edu/pub/perl/CPAN/ =item Kansas http://mirrors.concertpass.com/cpan/ =item Massachusetts http://mirrors.ccs.neu.edu/CPAN/ =item Michigan http://cpan.cse.msu.edu/ ftp://cpan.cse.msu.edu/ http://httpupdate118.cpanel.net/CPAN/ http://mirrors-usa.go-parts.com/cpan/ http://ftp.wayne.edu/CPAN/ ftp://ftp.wayne.edu/CPAN/ =item New Hampshire http://mirror.metrocast.net/cpan/ =item New Jersey http://mirror.datapipe.net/CPAN/ ftp://mirror.datapipe.net/pub/CPAN/ http://www.hoovism.com/CPAN/ ftp://ftp.hoovism.com/CPAN/ http://cpan.mirror.nac.net/ =item New York http://mirror.cc.columbia.edu/pub/software/cpan/ ftp://mirror.cc.columbia.edu/pub/software/cpan/ http://cpan.belfry.net/ http://cpan.erlbaum.net/ ftp://cpan.erlbaum.net/CPAN/ http://cpan.hexten.net/ ftp://cpan.hexten.net/ http://mirror.nyi.net/CPAN/ ftp://mirror.nyi.net/pub/CPAN/ http://noodle.portalus.net/CPAN/ ftp://noodle.portalus.net/CPAN/ http://mirrors.rit.edu/CPAN/ ftp://mirrors.rit.edu/CPAN/ =item North Carolina http://httpupdate140.cpanel.net/CPAN/ http://mirrors.ibiblio.org/CPAN/ =item Oregon http://ftp.osuosl.org/pub/CPAN/ ftp://ftp.osuosl.org/pub/CPAN/ http://mirror.uoregon.edu/CPAN/ =item Pennsylvania http://cpan.pair.com/ ftp://cpan.pair.com/pub/CPAN/ 
http://cpan.mirrors.ionfish.org/ =item South Carolina http://cpan.mirror.clemson.edu/ =item Texas http://mirror.uta.edu/CPAN/ =item Utah http://cpan.cs.utah.edu/ ftp://cpan.cs.utah.edu/CPAN/ ftp://mirror.xmission.com/CPAN/ =item Virginia http://mirror.cogentco.com/pub/CPAN/ ftp://mirror.cogentco.com/pub/CPAN/ http://mirror.jmu.edu/pub/CPAN/ ftp://mirror.jmu.edu/pub/CPAN/ http://mirror.us.leaseweb.net/CPAN/ ftp://mirror.us.leaseweb.net/CPAN/ =item Washington http://cpan.llarian.net/ ftp://cpan.llarian.net/pub/CPAN/ =item Wisconsin http://cpan.mirrors.tds.net/ ftp://cpan.mirrors.tds.net/pub/CPAN/ =back =back =head2 Oceania =over 4 =item Australia http://mirror.as24220.net/pub/cpan/ ftp://mirror.as24220.net/pub/cpan/ http://cpan.mirrors.ilisys.com.au/ http://cpan.mirror.digitalpacific.com.au/ ftp://mirror.internode.on.net/pub/cpan/ http://mirror.optusnet.com.au/CPAN/ http://cpan.mirror.serversaustralia.com.au/ http://cpan.uberglobalmirror.com/ http://mirror.waia.asn.au/pub/cpan/ =item New Caledonia http://cpan.lagoon.nc/pub/CPAN/ ftp://cpan.lagoon.nc/pub/CPAN/ http://cpan.nautile.nc/CPAN/ ftp://cpan.nautile.nc/CPAN/ =item New Zealand ftp://ftp.auckland.ac.nz/pub/perl/CPAN/ http://cpan.catalyst.net.nz/CPAN/ ftp://cpan.catalyst.net.nz/pub/CPAN/ http://cpan.inspire.net.nz/ ftp://cpan.inspire.net.nz/cpan/ http://mirror.webtastix.net/CPAN/ ftp://mirror.webtastix.net/CPAN/ =back =head2 South America =over 4 =item Argentina http://cpan.mmgdesigns.com.ar/ =item Brazil http://cpan.kinghost.net/ http://linorg.usp.br/CPAN/ http://mirror.nbtelecom.com.br/CPAN/ =item Chile http://cpan.dcc.uchile.cl/ ftp://cpan.dcc.uchile.cl/pub/lang/cpan/ =back =head2 RSYNC Mirrors rsync://ftp.is.co.za/IS-Mirror/ftp.cpan.org/ rsync://mirror.ac.za/CPAN/ rsync://mirror.zol.co.zw/CPAN/ rsync://mirror.dhakacom.com/CPAN/ rsync://mirrors.ustc.edu.cn/CPAN/ rsync://mirrors.xmu.edu.cn/CPAN/ rsync://kambing.ui.ac.id/CPAN/ rsync://ftp.jaist.ac.jp/pub/CPAN/ rsync://mirror.jre655.com/CPAN/ 
rsync://ftp.kddilabs.jp/cpan/ rsync://ftp.nara.wide.ad.jp/cpan/ rsync://ftp.riken.jp/cpan/ rsync://mirror.neolabs.kz/CPAN/ rsync://mirror.qnren.qa/CPAN/ rsync://ftp.neowiz.com/CPAN/ rsync://mirror.0x.sg/CPAN/ rsync://ftp.yzu.edu.tw/pub/CPAN/ rsync://ftp.ubuntu-tw.org/CPAN/ rsync://mirrors.digipower.vn/CPAN/ rsync://cpan.inode.at/CPAN/ rsync://ftp.byfly.by/CPAN/ rsync://mirror.datacenter.by/CPAN/ rsync://ftp.belnet.be/cpan/ rsync://cpan.mirror.ba/CPAN/ rsync://mirrors.neterra.net/CPAN/ rsync://mirrors.netix.net/CPAN/ rsync://mirror.dkm.cz/cpan/ rsync://mirrors.nic.cz/CPAN/ rsync://cpan.mirror.vutbr.cz/cpan/ rsync://rsync.nic.funet.fi/CPAN/ rsync://ftp.ciril.fr/pub/cpan/ rsync://distrib-coffee.ipsl.jussieu.fr/pub/mirrors/cpan/ rsync://cpan.mirrors.ovh.net/CPAN/ rsync://mirror.de.leaseweb.net/CPAN/ rsync://mirror.euserv.net/cpan/ rsync://ftp-stud.hs-esslingen.de/CPAN/ rsync://ftp.gwdg.de/pub/languages/perl/CPAN/ rsync://ftp.hawo.stw.uni-erlangen.de/CPAN/ rsync://cpan.mirror.iphh.net/CPAN/ rsync://mirror.netcologne.de/cpan/ rsync://ftp.halifax.rwth-aachen.de/cpan/ rsync://ftp.ntua.gr/CPAN/ rsync://mirror.met.hu/CPAN/ rsync://ftp.heanet.ie/mirrors/ftp.perl.org/pub/CPAN/ rsync://rsync.panu.it/CPAN/ rsync://mirror.as43289.net/CPAN/ rsync://rsync.cs.uu.nl/CPAN/ rsync://mirror.nl.leaseweb.net/CPAN/ rsync://ftp.nluug.nl/CPAN/ rsync://mirror.transip.net/CPAN/ rsync://cpan.uib.no/cpan/ rsync://cpan.vianett.no/CPAN/ rsync://cpan.perl-hackers.net/CPAN/ rsync://cpan.perl.pt/cpan/ rsync://mirrors.m247.ro/CPAN/ rsync://mirrors.teentelecom.net/CPAN/ rsync://cpan.webdesk.ru/CPAN/ rsync://mirror.yandex.ru/mirrors/cpan/ rsync://mirror.sbb.rs/CPAN/ rsync://ftp.acc.umu.se/mirror/CPAN/ rsync://rsync.pirbot.com/ftp/cpan/ rsync://cpan.ip-connect.vn.ua/CPAN/ rsync://rsync.mirror.anlx.net/CPAN/ rsync://mirror.bytemark.co.uk/CPAN/ rsync://mirror.sax.uk.as61049.net/CPAN/ rsync://rsync.mirrorservice.org/cpan.perl.org/CPAN/ rsync://ftp.ticklers.org/CPAN/ rsync://mirrors.uk2.net/CPAN/ 
rsync://CPAN.mirror.rafal.ca/CPAN/ rsync://mirror.csclub.uwaterloo.ca/CPAN/ rsync://mirrors.namecheap.com/CPAN/ rsync://mirrors.syringanetworks.net/CPAN/ rsync://mirror.team-cymru.org/CPAN/ rsync://debian.cse.msu.edu/cpan/ rsync://mirrors-usa.go-parts.com/mirrors/cpan/ rsync://rsync.hoovism.com/CPAN/ rsync://mirror.cc.columbia.edu/cpan/ rsync://noodle.portalus.net/CPAN/ rsync://mirrors.rit.edu/cpan/ rsync://mirrors.ibiblio.org/CPAN/ rsync://cpan.pair.com/CPAN/ rsync://cpan.cs.utah.edu/CPAN/ rsync://mirror.cogentco.com/CPAN/ rsync://mirror.jmu.edu/CPAN/ rsync://mirror.us.leaseweb.net/CPAN/ rsync://cpan.mirror.digitalpacific.com.au/cpan/ rsync://mirror.internode.on.net/cpan/ rsync://uberglobalmirror.com/cpan/ rsync://cpan.lagoon.nc/cpan/ rsync://mirrors.mmgdesigns.com.ar/CPAN/ For an up-to-date listing of CPAN sites, see L<https://www.cpan.org/SITES> or L<ftp://www.cpan.org/SITES>. =head1 Modules: Creation, Use, and Abuse (The following section is borrowed directly from Tim Bunce's modules file, available at your nearest CPAN site.) Perl implements a class using a package, but the presence of a package doesn't imply the presence of a class. A package is just a namespace. A class is a package that provides subroutines that can be used as methods. A method is just a subroutine that expects, as its first argument, either the name of a package (for "static" methods), or a reference to something (for "virtual" methods). A module is a file that (by convention) provides a class of the same name (sans the .pm), plus an import method in that class that can be called to fetch exported symbols. This module may implement some of its methods by loading dynamic C or C++ objects, but that should be totally transparent to the user of the module. Likewise, the module might set up an AUTOLOAD function to slurp in subroutine definitions on demand, but this is also transparent. Only the F<.pm> file is required to exist. 
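The distinction drawn above between a package, a class, and the two kinds of method can be sketched in a few lines. This is only an illustration: the class name C<Counter>, its C<start> argument, and its C<increment> method are hypothetical and do not come from any real module. In practice the package would live in its own F<Counter.pm> file.

```perl
use strict;
use warnings;

# Hypothetical class, for illustration only.
package Counter;

# "Static" (class) method: the first argument is the package name.
sub new {
    my ($class, %args) = @_;
    my $self = { count => $args{start} // 0 };
    return bless $self, $class;
}

# "Virtual" (instance) method: the first argument is a blessed reference.
sub increment {
    my ($self) = @_;
    return ++$self->{count};
}

package main;

my $counter = Counter->new(start => 5);  # same as Counter::new('Counter', start => 5)
$counter->increment;                     # same as Counter::increment($counter)
print ref($counter), " holds ", $counter->{count}, "\n";  # prints "Counter holds 6"
```

The arrow syntax does nothing more than look up the subroutine in the package (or its C<@ISA> ancestors) and pass the invocant as the first argument, which is why the same subroutine body can serve as a method without any special declaration.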
See L<perlsub>, L<perlobj>, and L<AutoLoader> for details about the AUTOLOAD mechanism. =head2 Guidelines for Module Creation =over 4 =item * Do similar modules already exist in some form? If so, please try to reuse the existing modules either in whole or by inheriting useful features into a new class. If this is not practical, try to get together with the module authors to work on extending or enhancing the functionality of the existing modules. A perfect example is the plethora of packages in perl4 for dealing with command line options. If you are writing a module to expand an already existing set of modules, please coordinate with the author of the package. It helps if you follow the same naming scheme and module interaction scheme as the original author. =item * Try to design the new module to be easy to extend and reuse. Try to C<use warnings;> (or C<use warnings qw(...);>). Remember that you can add C<no warnings qw(...);> to individual blocks of code that need fewer warnings. Use blessed references. Use the two-argument form of bless to bless into the class name given as the first parameter of the constructor, e.g.: sub new { my $class = shift; return bless {}, $class; } or even this if you'd like it to be used as either a static or a virtual method: sub new { my $self = shift; my $class = ref($self) || $self; return bless {}, $class; } Pass arrays as references so more parameters can be added later (it's also faster). Convert functions into methods where appropriate. Split large methods into smaller, more flexible ones. Inherit methods from other modules if appropriate. Avoid class name tests like: C<die "Invalid" unless ref $ref eq 'FOO'>. Generally you can delete the C<eq 'FOO'> part with no harm at all. Let the objects look after themselves! Generally, avoid hard-wired class names as far as possible. Avoid C<< $r->Class::func() >> where using C<@ISA=qw(... Class ...)> and C<< $r->func() >> would work.
Use autosplit so little used or newly added functions won't be a burden to programs that don't use them. Add test functions to the module after __END__ either using AutoSplit or by saying: eval join('',<main::DATA>) || die $@ unless caller(); Does your module pass the 'empty subclass' test? If you say C<@SUBCLASS::ISA = qw(YOURCLASS);> your applications should be able to use SUBCLASS in exactly the same way as YOURCLASS. For example, does your application still work if you change: C<< $obj = YOURCLASS->new(); >> into: C<< $obj = SUBCLASS->new(); >> ? Avoid keeping any state information in your packages. It makes it difficult for multiple other packages to use yours. Keep state information in objects. Always use B<-w>. Try to C<use strict;> (or C<use strict qw(...);>). Remember that you can add C<no strict qw(...);> to individual blocks of code that need less strictness. Always use B<-w>. Follow the guidelines in L<perlstyle>. Always use B<-w>. =item * Some simple style guidelines The perlstyle manual supplied with Perl has many helpful points. Coding style is a matter of personal taste. Many people evolve their style over several years as they learn what helps them write and maintain good code. Here's one set of assorted suggestions that seem to be widely used by experienced developers: Use underscores to separate words. It is generally easier to read $var_names_like_this than $VarNamesLikeThis, especially for non-native speakers of English. It's also a simple rule that works consistently with VAR_NAMES_LIKE_THIS. Package/Module names are an exception to this rule. Perl informally reserves lowercase module names for 'pragma' modules like integer and strict. Other modules normally begin with a capital letter and use mixed case with no underscores (need to be short and portable). You may find it helpful to use letter case to indicate the scope or nature of a variable. 
For example: $ALL_CAPS_HERE constants only (beware clashes with Perl vars) $Some_Caps_Here package-wide global/static $no_caps_here function scope my() or local() variables Function and method names seem to work best as all lowercase. e.g., C<< $obj->as_string() >>. You can use a leading underscore to indicate that a variable or function should not be used outside the package that defined it. =item * Select what to export. Do NOT export method names! Do NOT export anything else by default without a good reason! Exports pollute the namespace of the module user. If you must export try to use @EXPORT_OK in preference to @EXPORT and avoid short or common names to reduce the risk of name clashes. Generally anything not exported is still accessible from outside the module using the ModuleName::item_name (or C<< $blessed_ref->method >>) syntax. By convention you can use a leading underscore on names to indicate informally that they are 'internal' and not for public use. (It is actually possible to get private functions by saying: C<my $subref = sub { ... }; &$subref;>. But there's no way to call that directly as a method, because a method must have a name in the symbol table.) As a general rule, if the module is trying to be object oriented then export nothing. If it's just a collection of functions then @EXPORT_OK anything but use @EXPORT with caution. =item * Select a name for the module. This name should be as descriptive, accurate, and complete as possible. Avoid any risk of ambiguity. Always try to use two or more whole words. Generally the name should reflect what is special about what the module does rather than how it does it. Please use nested module names to group informally or categorize a module. There should be a very good reason for a module not to have a nested name. Module names should begin with a capital letter. Having 57 modules all called Sort will not make life easy for anyone (though having 23 called Sort::Quick is only marginally better :-). 
Imagine someone trying to install your module alongside many others. If you are developing a suite of related modules/classes it's good practice to use nested classes with a common prefix as this will avoid namespace clashes. For example: Xyz::Control, Xyz::View, Xyz::Model etc. Use the modules in this list as a naming guide. If adding a new module to a set, follow the original author's standards for naming modules and the interface to methods in those modules. If developing modules for private internal or project specific use, that will never be released to the public, then you should ensure that their names will not clash with any future public module. You can do this either by using the reserved Local::* category or by using a category name that includes an underscore like Foo_Corp::*. To be portable each component of a module name should be limited to 11 characters. If it might be used on MS-DOS then try to ensure each is unique in the first 8 characters. Nested modules make this easier. For additional guidance on the naming of modules, please consult: https://pause.perl.org/pause/query?ACTION=pause_namingmodules or send mail to the <module-authors@perl.org> mailing list. =item * Have you got it right? How do you know that you've made the right decisions? Have you picked an interface design that will cause problems later? Have you picked the most appropriate name? Do you have any questions? The best way to know for sure, and pick up many helpful suggestions, is to ask someone who knows. The <module-authors@perl.org> mailing list is useful for this purpose; it's also accessible via news interface as perl.module-authors at nntp.perl.org. All you need to do is post a short summary of the module, its purpose and interfaces. A few lines on each of the main methods is probably enough. (If you post the whole module it might be ignored by busy people - generally the very people you want to read it!) 
Don't worry about posting if you can't say when the module will be ready - just say so in the message. It might be worth inviting others to help you, they may be able to complete it for you! =item * README and other Additional Files. It's well known that software developers usually fully document the software they write. If, however, the world is in urgent need of your software and there is not enough time to write the full documentation please at least provide a README file containing: =over 10 =item * A description of the module/package/extension etc. =item * A copyright notice - see below. =item * Prerequisites - what else you may need to have. =item * How to build it - possible changes to Makefile.PL etc. =item * How to install it. =item * Recent changes in this release, especially incompatibilities =item * Changes / enhancements you plan to make in the future. =back If the README file seems to be getting too large you may wish to split out some of the sections into separate files: INSTALL, Copying, ToDo etc. =over 4 =item * Adding a Copyright Notice. How you choose to license your work is a personal decision. The general mechanism is to assert your Copyright and then make a declaration of how others may copy/use/modify your work. Perl, for example, is supplied with two types of licence: The GNU GPL and The Artistic Licence (see the files README, Copying, and Artistic, or L<perlgpl> and L<perlartistic>). Larry has good reasons for NOT just using the GNU GPL. My personal recommendation, out of respect for Larry, Perl, and the Perl community at large is to state something simply like: Copyright (c) 1995 Your Name. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself. This statement should at least appear in the README file. You may also wish to include it in a Copying file and your source files. Remember to include the other words in addition to the Copyright. 
=item * Give the module a version/issue/release number. To be fully compatible with the Exporter and MakeMaker modules you should store your module's version number in a non-my package variable called $VERSION. This should be a positive floating-point number with at least two digits after the decimal (i.e., hundredths, e.g., C<$VERSION = "0.01">). Don't use a "1.3.2" style version. See L<Exporter> for details. It may be handy to add a function or method to retrieve the number. Use the number in announcements and archive file names when releasing the module (ModuleName-1.02.tar.Z). See C<perldoc ExtUtils::MakeMaker> for details. =item * How to release and distribute a module. If possible, register the module with CPAN. Follow the instructions and links on: https://www.cpan.org/modules/04pause.html and upload to: https://pause.perl.org/ and notify <modules@perl.org>. This will allow anyone to install your module using the C<cpan> tool distributed with Perl. By using the WWW interface you can ask the Upload Server to mirror your modules from your ftp or WWW site into your own directory on CPAN! =item * Take care when changing a released module. Always strive to remain compatible with previously released versions. Otherwise try to add a mechanism to revert to the old behavior if people rely on it. Document incompatible changes. =back =back =head2 Guidelines for Converting Perl 4 Library Scripts into Modules =over 4 =item * There is no requirement to convert anything. If it ain't broke, don't fix it! Perl 4 library scripts should continue to work with no problems. You may need to make some minor changes (like escaping non-array @'s in double-quoted strings) but there is no need to convert a .pl file into a Module for just that. =item * Consider the implications. All Perl applications that make use of the script will need to be changed (slightly) if the script is converted into a module. Is it worth it unless you plan to make other changes at the same time?
=item * Make the most of the opportunity. If you are going to convert the script to a module you can use the opportunity to redesign the interface. The guidelines for module creation above include many of the issues you should consider. =item * The pl2pm utility will get you started. This utility will read *.pl files (given as parameters) and write corresponding *.pm files. The pl2pm utility does the following: =over 10 =item * Adds the standard Module prologue lines =item * Converts package specifiers from ' to :: =item * Converts die(...) to croak(...) =item * Several other minor changes =back Being a mechanical process, pl2pm is not bulletproof. The converted code will need careful checking, especially any package statements. Don't delete the original .pl file till the new .pm one works! =back =head2 Guidelines for Reusing Application Code =over 4 =item * Complete applications rarely belong in the Perl Module Library. =item * Many applications contain some Perl code that could be reused. Help save the world! Share your code in a form that makes it easy to reuse. =item * Break out the reusable code into one or more separate module files. =item * Take the opportunity to reconsider and redesign the interfaces. =item * In some cases the 'application' can then be reduced to a small fragment of code built on top of the reusable modules. In these cases the application could be invoked as: % perl -e 'use Module::Name; method(@ARGV)' ... or % perl -mModule::Name ... (in perl5.002 or higher) =back =head1 NOTE Perl does not enforce private and public parts of its modules as you may have been used to in other languages like C++, Ada, or Modula-17. Perl doesn't have an infatuation with enforced privacy. It would prefer that you stayed out of its living room because you weren't invited, not because it has a shotgun. The module and its user have a contract, part of which is common law, and part of which is "written".
Part of the common law contract is that a module doesn't pollute any namespace it wasn't asked to. The written contract for the module (A.K.A. documentation) may make other provisions. But then you know when you C<use RedefineTheWorld> that you're redefining the world and willing to take the consequences. =cut =encoding utf8 =head1 NAME perl5244delta - what is new for perl v5.24.4 =head1 DESCRIPTION This document describes differences between the 5.24.3 release and the 5.24.4 release. If you are upgrading from an earlier release such as 5.24.2, first read L<perl5243delta>, which describes differences between 5.24.2 and 5.24.3. =head1 Security =head2 [CVE-2018-6797] heap-buffer-overflow (WRITE of size 1) in S_regatom (regcomp.c) A crafted regular expression could cause a heap buffer write overflow, with control over the bytes written. L<[perl #132227]|https://rt.perl.org/Public/Bug/Display.html?id=132227> =head2 [CVE-2018-6798] Heap-buffer-overflow in Perl__byte_dump_string (utf8.c) Matching a crafted locale dependent regular expression could cause a heap buffer read overflow and potentially information disclosure. L<[perl #132063]|https://rt.perl.org/Public/Bug/Display.html?id=132063> =head2 [CVE-2018-6913] heap-buffer-overflow in S_pack_rec C<pack()> could cause a heap buffer write overflow with a large item count. L<[perl #131844]|https://rt.perl.org/Public/Bug/Display.html?id=131844> =head2 Assertion failure in Perl__core_swash_init (utf8.c) Control characters in a supposed Unicode property name could cause perl to crash. This has been fixed. L<[perl #132055]|https://rt.perl.org/Public/Bug/Display.html?id=132055> L<[perl #132553]|https://rt.perl.org/Public/Bug/Display.html?id=132553> L<[perl #132658]|https://rt.perl.org/Public/Bug/Display.html?id=132658> =head1 Incompatible Changes There are no changes intentionally incompatible with 5.24.3. If any exist, they are bugs, and we request that you submit a report.
See L</Reporting Bugs> below. =head1 Modules and Pragmata =head2 Updated Modules and Pragmata =over 4 =item * L<Module::CoreList> has been upgraded from version 5.20170922_24 to 5.20180414_24. =back =head1 Selected Bug Fixes =over 4 =item * The C<readpipe()> built-in function now checks at compile time that it has only one parameter expression, and puts it in scalar context, thus ensuring that it doesn't corrupt the stack at runtime. L<[perl #4574]|https://rt.perl.org/Public/Bug/Display.html?id=4574> =back =head1 Acknowledgements Perl 5.24.4 represents approximately 7 months of development since Perl 5.24.3 and contains approximately 2,400 lines of changes across 49 files from 12 authors. Excluding auto-generated files, documentation and release tools, there were approximately 1,300 lines of changes to 12 .pm, .t, .c and .h files. Perl continues to flourish into its third decade thanks to a vibrant community of users and developers. The following people are known to have contributed the improvements that became Perl 5.24.4: Abigail, Chris 'BinGOs' Williams, John SJ Anderson, Karen Etheridge, Karl Williamson, Renee Baecker, Sawyer X, Steve Hay, Todd Rinaldo, Tony Cook, Yves Orton, Zefram. The list above is almost certainly incomplete as it is automatically generated from version control history. In particular, it does not include the names of the (very much appreciated) contributors who reported issues to the Perl bug tracker. Many of the changes included in this version originated in the CPAN modules included in Perl's core. We're grateful to the entire CPAN community for helping Perl to flourish. For a more complete list of all of Perl's historical contributors, please see the F<AUTHORS> file in the Perl source distribution. =head1 Reporting Bugs If you find what you think is a bug, you might check the articles recently posted to the comp.lang.perl.misc newsgroup and the perl bug database at L<https://rt.perl.org/> . 
There may also be information at L<http://www.perl.org/>, the Perl Home Page. If you believe you have an unreported bug, please run the L<perlbug> program included with your release. Be sure to trim your bug down to a tiny but sufficient test case. Your bug report, along with the output of C<perl -V>, will be sent off to perlbug@perl.org to be analysed by the Perl porting team. If the bug you are reporting has security implications which make it inappropriate to send to a publicly archived mailing list, then see L<perlsec/SECURITY VULNERABILITY CONTACT INFORMATION> for details of how to report the issue. =head1 SEE ALSO The F<Changes> file for an explanation of how to view exhaustive details on what changed. The F<INSTALL> file for how to build Perl. The F<README> file for general stuff. The F<Artistic> and F<Copying> files for copyright information. =cut =encoding utf8 =head1 NAME perlpolicy - Various and sundry policies and commitments related to the Perl core =head1 DESCRIPTION This document is the master document which records all written policies about how the Perl 5 Porters collectively develop and maintain the Perl core. =head1 GOVERNANCE =head2 Perl 5 Porters Subscribers to perl5-porters (the porters themselves) come in several flavours. Some are quiet curious lurkers, who rarely pitch in and instead watch the ongoing development to ensure they're forewarned of new changes or features in Perl. Some are representatives of vendors, who are there to make sure that Perl continues to compile and work on their platforms. Some patch any reported bug that they know how to fix, some are actively patching their pet area (threads, Win32, the regexp engine), while others seem to do nothing but complain. In other words, it's your usual mix of technical people. Among these people are the core Perl team. These are trusted volunteers involved in the ongoing development of the Perl language and interpreter.
They are not required to be language developers or committers. Over this group of porters presides Larry Wall. He has the final word in what does and does not change in any of the Perl programming languages. These days, Larry spends most of his time on Raku, while Perl 5 is shepherded by a steering council of porters responsible for deciding what goes into each release and ensuring that releases happen on a regular basis. Larry sees Perl development along the lines of the US government: there's the Legislature (the porters, represented by the core team), the Executive branch (the steering council), and the Supreme Court (Larry). The legislature can discuss and submit patches to the executive branch all they like, but the executive branch is free to veto them. Rarely, the Supreme Court will side with the executive branch over the legislature, or the legislature over the executive branch. Mostly, however, the legislature and the executive branch are supposed to get along and work out their differences without impeachment or court cases. You might sometimes see reference to Rule 1 and Rule 2. Larry's power as Supreme Court is expressed in The Rules: =over 4 =item 1 Larry is always by definition right about how Perl should behave. This means he has final veto power on the core functionality. =item 2 Larry is allowed to change his mind about any matter at a later date, regardless of whether he previously invoked Rule 1. =back Got that? Larry is always right, even when he was wrong. It's rare to see either Rule exercised, but they are often alluded to. For the specifics on how the members of the core team and steering council are elected or rotated, consult L<perlgov>, which spells it all out in detail. =head1 MAINTENANCE AND SUPPORT Perl 5 is developed by a community, not a corporate entity. Every change contributed to the Perl core is the result of a donation. Typically, these donations are contributions of code or time by individual members of our community. 
On occasion, these donations come in the form of corporate or organizational sponsorship of a particular individual or project. As a volunteer organization, the commitments we make are heavily dependent on the goodwill and hard work of individuals who have no obligation to contribute to Perl. That being said, we value Perl's stability and security and have long had an unwritten covenant with the broader Perl community to support and maintain releases of Perl. This document codifies the support and maintenance commitments that the Perl community should expect from Perl's developers: =over =item * We "officially" support the two most recent stable release series. 5.26.x and earlier are now out of support. As of the release of 5.32.0, we will "officially" end support for Perl 5.28.x, other than providing security updates as described below. =item * To the best of our ability, we will attempt to fix critical issues in the two most recent stable 5.x release series. Fixes for the current release series take precedence over fixes for the previous release series. =item * To the best of our ability, we will provide "critical" security patches / releases for any major version of Perl whose 5.x.0 release was within the past three years. We can only commit to providing these for the most recent .y release in any 5.x.y series. =item * We will not provide security updates or bug fixes for development releases of Perl. =item * We encourage vendors to ship the most recent supported release of Perl at the time of their code freeze. =item * As a vendor, you may have a requirement to backport security fixes beyond our 3 year support commitment. We can provide limited support and advice to you as you do so and, where possible will try to apply those patches to the relevant -maint branches in git, though we may or may not choose to make numbered releases or "official" patches available. See L<perlsec/SECURITY VULNERABILITY CONTACT INFORMATION> for details on how to begin that process. 
=back =head1 BACKWARD COMPATIBILITY AND DEPRECATION Our community has a long-held belief that backward-compatibility is a virtue, even when the functionality in question is a design flaw. We would all love to unmake some mistakes we've made over the past decades. Living with every design error we've ever made can lead to painful stagnation. Unwinding our mistakes is very, very difficult. Doing so without actively harming our users is nearly impossible. Lately, ignoring or actively opposing compatibility with earlier versions of Perl has come into vogue. Sometimes, a change is proposed which wants to usurp syntax which previously had another meaning. Sometimes, a change wants to improve previously-crazy semantics. Down this road lies madness. Requiring end-user programmers to change just a few language constructs, even language constructs which no well-educated developer would ever intentionally use is tantamount to saying "you should not upgrade to a new release of Perl unless you have 100% test coverage and can do a full manual audit of your codebase." If we were to have tools capable of reliably upgrading Perl source code from one version of Perl to another, this concern could be significantly mitigated. We want to ensure that Perl continues to grow and flourish in the coming years and decades, but not at the expense of our user community. Existing syntax and semantics should only be marked for destruction in very limited circumstances. If they are believed to be very rarely used, stand in the way of actual improvement to the Perl language or perl interpreter, and if affected code can be easily updated to continue working, they may be considered for removal. When in doubt, caution dictates that we will favor backward compatibility. When a feature is deprecated, a statement of reasoning describing the decision process will be posted, and a link to it will be provided in the relevant perldelta documents. 
Using a lexical pragma to enable or disable legacy behavior should be considered when appropriate, and in the absence of any pragma, legacy behavior should be enabled. Which backward-incompatible changes are controlled implicitly by a 'use v5.x.y' is a decision which should be made by the steering council in consultation with the community. Historically, we've held ourselves to a far higher standard than backward-compatibility -- bugward-compatibility. Any accident of implementation or unintentional side-effect of running some bit of code has been considered to be a feature of the language to be defended with the same zeal as any other feature or functionality. No matter how frustrating these unintentional features may be to us as we continue to improve Perl, these unintentional features often deserve our protection. It is very important that existing software written in Perl continue to work correctly. If end-user developers have adopted a bug as a feature, we need to treat it as such. New syntax and semantics which don't break existing language constructs and syntax have a much lower bar. They merely need to prove themselves to be useful, elegant, well designed, and well tested. In most cases, these additions will be marked as I<experimental> for some time. See below for more on that. =head2 Terminology To make sure we're talking about the same thing when we discuss the removal of features or functionality from the Perl core, we have specific definitions for a few words and phrases. =over =item experimental If something in the Perl core is marked as B<experimental>, we may change its behaviour, deprecate or remove it without notice. While we'll always do our best to smooth the transition path for users of experimental features, you should contact the perl5-porters mailing list if you find an experimental feature useful and want to help shape its future. Experimental features must be experimental in two stable releases before being marked non-experimental.
Experimental features will only have their experimental status revoked when they no longer have any design-changing bugs open against them and when they have remained unchanged in behavior for the entire length of a development cycle. In other words, a feature present in v5.20.0 may be marked no longer experimental in v5.22.0 if and only if its behavior is unchanged throughout all of v5.21. =item deprecated If something in the Perl core is marked as B<deprecated>, we may remove it from the core in the future, though we might not. Generally, backward-incompatible changes will have deprecation warnings for two release cycles before being removed, but may be removed after just one cycle if the risk seems quite low or the benefits quite high. As of Perl 5.12, deprecated features and modules warn the user as they're used. When a module is deprecated, it will also be made available on CPAN. Installing it from CPAN will silence deprecation warnings for that module. If you use a deprecated feature or module and believe that its removal from the Perl core would be a mistake, please contact the perl5-porters mailing list and plead your case. We don't deprecate things without a good reason, but sometimes there's a counterargument we haven't considered. Historically, we did not distinguish between "deprecated" and "discouraged" features. =item discouraged From time to time, we may mark language constructs and features which we consider to have been mistakes as B<discouraged>. Discouraged features aren't currently candidates for removal, but we may later deprecate them if they're found to stand in the way of a significant improvement to the Perl core. =item removed Once a feature, construct or module has been marked as deprecated, we may remove it from the Perl core. Unsurprisingly, we say we've B<removed> these things. When a module is removed, it will no longer ship with Perl, but will continue to be available on CPAN.
=back

=head1 MAINTENANCE BRANCHES

New releases of maintenance branches should only contain changes that fall into one of the "acceptable" categories set out below, but must not contain any changes that fall into one of the "unacceptable" categories. (For example, a fix for a crashing bug must not be included if it breaks binary compatibility.)

It is not necessary to include every change meeting these criteria, and in general the focus should be on addressing security issues, crashing bugs, regressions and serious installation issues. The temptation to include a plethora of minor changes that don't affect the installation or execution of perl (e.g. spelling corrections in documentation) should be resisted in order to reduce the overall risk of overlooking something. The intention is to create maintenance releases which are both worthwhile and stable enough that users can have full confidence in them. (A secondary concern is to avoid burning out the maint-release manager or overwhelming other committers voting on changes to be included (see L</"Getting changes into a maint branch"> below).)

The following types of change may be considered acceptable, as long as they do not also fall into any of the "unacceptable" categories set out below:

=over

=item *

Patches that fix CVEs or security issues. These changes should be passed using the security reporting mechanism rather than applied directly; see L<perlsec/SECURITY VULNERABILITY CONTACT INFORMATION>.

=item *

Patches that fix crashing bugs, assertion failures and memory corruption but which do not otherwise change perl's functionality or negatively impact performance.

=item *

Patches that fix regressions in perl's behavior relative to previous releases, no matter how old the regression, since some people may upgrade from very old versions of perl to the latest version.

=item *

Patches that fix bugs in features that were new in the corresponding 5.x.0 stable release.
=item *

Patches that fix anything which prevents or seriously impacts the build or installation of perl.

=item *

Portability fixes, such as changes to Configure and the files in the hints/ folder.

=item *

Minimal patches that fix platform-specific test failures.

=item *

Documentation updates that correct factual errors, explain significant bugs or deficiencies in the current implementation, or fix broken markup.

=item *

Updates to dual-life modules should consist of minimal patches to fix crashing bugs or security issues (as above). Any changes made to dual-life modules for which CPAN is canonical should be coordinated with the upstream author.

=back

The following types of change are NOT acceptable:

=over

=item *

Patches that break binary compatibility. (Please talk to the steering council.)

=item *

Patches that add or remove features.

=item *

Patches that add new warnings or errors or deprecate features.

=item *

Ports of Perl to a new platform, architecture or OS release that involve changes to the implementation.

=item *

New versions of dual-life modules should NOT be imported into maint. Those belong in the next stable series.

=back

If there is any question about whether a given patch might merit inclusion in a maint release, then it almost certainly should not be included.

=head2 Getting changes into a maint branch

Historically, only the single-person project manager cherry-picked changes from bleadperl into maintperl. This has scaling problems. At the same time, maintenance branches of stable versions of Perl need to be treated with great care. To that end, as of Perl 5.12, we have a new process for maint branches.

Any committer may cherry-pick any commit from blead to a maint branch by first adding an entry to the relevant voting file in the maint-votes branch announcing the commit as a candidate for back-porting, and then waiting for at least two other committers to add their votes in support of this (i.e.
a total of at least three votes is required before a commit may be back-ported).

Most of the work involved in both rounding up a suitable set of candidate commits and cherry-picking those for which three votes have been cast will be done by the maint branch release manager, but anyone else is free to add other proposals if they're keen to ensure certain fixes don't get overlooked or fear they already have been.

Other voting mechanisms may also be used instead (e.g. sending mail to perl5-porters and at least two other committers responding to the list giving their assent), as long as the same number of votes is gathered in a transparent manner. Specifically, proposals of which changes to cherry-pick must be visible to everyone on perl5-porters so that the views of everyone interested may be heard.

It is not necessary for voting to be held on cherry-picking perldelta entries associated with changes that have already been cherry-picked, nor for the maint-release manager to obtain votes on changes required by the F<Porting/release_managers_guide.pod> where such changes can be applied by the means of cherry-picking from blead.

=head1 CONTRIBUTED MODULES

=head2 A Social Contract about Artistic Control

What follows is a statement about artistic control, defined as the ability of authors of packages to guide the future of their code and maintain control over their work. It is a recognition that authors should have control over their work, and that it is a responsibility of the rest of the Perl community to ensure that they retain this control. It is an attempt to document the standards to which we, as Perl developers, intend to hold ourselves. It is an attempt to write down rough guidelines about the respect we owe each other as Perl developers.

This statement is not a legal contract. This statement is not a legal document in any way, shape, or form. Perl is distributed under the GNU General Public License and under the Artistic License; those are the precise legal terms.
This statement isn't about the law or licenses. It's about community, mutual respect, trust, and good-faith cooperation. We recognize that the Perl core, defined as the software distributed with the heart of Perl itself, is a joint project on the part of all of us. From time to time, a script, module, or set of modules (hereafter referred to simply as a "module") will prove so widely useful and/or so integral to the correct functioning of Perl itself that it should be distributed with the Perl core. This should never be done without the author's explicit consent, and a clear recognition on all parts that this means the module is being distributed under the same terms as Perl itself. A module author should realize that inclusion of a module into the Perl core will necessarily mean some loss of control over it, since changes may occasionally have to be made on short notice or for consistency with the rest of Perl. Once a module has been included in the Perl core, however, everyone involved in maintaining Perl should be aware that the module is still the property of the original author unless the original author explicitly gives up their ownership of it. In particular: =over =item * The version of the module in the Perl core should still be considered the work of the original author. All patches, bug reports, and so forth should be fed back to them. Their development directions should be respected whenever possible. =item * Patches may be applied by the steering council without the explicit cooperation of the module author if and only if they are very minor, time-critical in some fashion (such as urgent security fixes), or if the module author cannot be reached. Those patches must still be given back to the author when possible, and if the author decides on an alternate fix in their version, that fix should be strongly preferred unless there is a serious problem with it. 
Any changes not endorsed by the author should be marked as such, and the contributor of the change acknowledged. =item * The version of the module distributed with Perl should, whenever possible, be the latest version of the module as distributed by the author (the latest non-beta version in the case of public Perl releases), although the steering council may hold off on upgrading the version of the module distributed with Perl to the latest version until the latest version has had sufficient testing. =back In other words, the author of a module should be considered to have final say on modifications to their module whenever possible (bearing in mind that it's expected that everyone involved will work together and arrive at reasonable compromises when there are disagreements). As a last resort, however: If the author's vision of the future of their module is sufficiently different from the vision of the steering council and perl5-porters as a whole so as to cause serious problems for Perl, the steering council may choose to formally fork the version of the module in the Perl core from the one maintained by the author. This should not be done lightly and should B<always> if at all possible be done only after direct input from Larry. If this is done, it must then be made explicit in the module as distributed with the Perl core that it is a forked version and that while it is based on the original author's work, it is no longer maintained by them. This must be noted in both the documentation and in the comments in the source of the module. Again, this should be a last resort only. Ideally, this should never happen, and every possible effort at cooperation and compromise should be made before doing this. If it does prove necessary to fork a module for the overall health of Perl, proper credit must be given to the original author in perpetuity and the decision should be constantly re-evaluated to see if a remerging of the two branches is possible down the road. 
In all dealings with contributed modules, everyone maintaining Perl should keep in mind that the code belongs to the original author, that they may not be on perl5-porters at any given time, and that a patch is not official unless it has been integrated into the author's copy of the module. To aid with this, and with points #1, #2, and #3 above, contact information for the authors of all contributed modules should be kept with the Perl distribution. Finally, the Perl community as a whole recognizes that respect for ownership of code, respect for artistic control, proper credit, and active effort to prevent unintentional code skew or communication gaps is vital to the health of the community and Perl itself. Members of a community should not normally have to resort to rules and laws to deal with each other, and this document, although it contains rules so as to be clear, is about an attitude and general approach. The first step in any dispute should be open communication, respect for opposing views, and an attempt at a compromise. In nearly every circumstance nothing more will be necessary, and certainly no more drastic measure should be used until every avenue of communication and discussion has failed. =head1 DOCUMENTATION Perl's documentation is an important resource for our users. It's incredibly important for Perl's documentation to be reasonably coherent and to accurately reflect the current implementation. Just as P5P collectively maintains the codebase, we collectively maintain the documentation. Writing a particular bit of documentation doesn't give an author control of the future of that documentation. At the same time, just as source code changes should match the style of their surrounding blocks, so should documentation changes. Examples in documentation should be illustrative of the concept they're explaining. Sometimes, the best way to show how a language feature works is with a small program the reader can run without modification. 
More often, examples will consist of a snippet of code containing only the "important" bits. The definition of "important" varies from snippet to snippet. Sometimes it's important to declare C<use strict> and C<use warnings>, initialize all variables and fully catch every error condition. More often than not, though, those things obscure the lesson the example was intended to teach. As Perl is developed by a global team of volunteers, our documentation often contains spellings which look funny to I<somebody>. Choice of American/British/Other spellings is left as an exercise for the author of each bit of documentation. When patching documentation, try to emulate the documentation around you, rather than changing the existing prose. In general, documentation should describe what Perl does "now" rather than what it used to do. It's perfectly reasonable to include notes in documentation about how behaviour has changed from previous releases, but, with very few exceptions, documentation isn't "dual-life" -- it doesn't need to fully describe how all old versions used to work. =head1 STANDARDS OF CONDUCT The official forum for the development of perl is the perl5-porters mailing list, mentioned above, and its bugtracker at GitHub. Posting to the list and the bugtracker is not a right: all participants in discussion are expected to adhere to a standard of conduct. =over 4 =item * Always be civil. =item * Heed the moderators. =back Civility is simple: stick to the facts while avoiding demeaning remarks, belittling other individuals, sarcasm, or a presumption of bad faith. It is not enough to be factual. You must also be civil. Responding in kind to incivility is not acceptable. If you relay otherwise-unposted comments to the list from a third party, you take responsibility for the content of those comments, and you must therefore ensure that they are civil. 
While civility is required, kindness is encouraged; if you have any doubt about whether you are being civil, simply ask yourself, "Am I being kind?" and aspire to that.

If the list moderators tell you that you are not being civil, carefully consider how your words have appeared before responding in any way. Were they kind? You may protest, but repeated protest in the face of a repeatedly reaffirmed decision is not acceptable. Repeatedly protesting about the moderators' decisions regarding a third party is also unacceptable, as is continuing to initiate off-list contact with the moderators about their decisions.

Unacceptable behavior will result in a public and clearly identified warning. A second instance of unacceptable behavior from the same individual will result in removal from the mailing list and GitHub issue tracker, for a period of one calendar month. The rationale for this is to provide an opportunity for the person to change the way they act.

After the time-limited ban has been lifted, a third instance of unacceptable behavior will result in a further public warning. A fourth or subsequent instance will result in an indefinite ban. The rationale is that, in the face of an apparent refusal to change behavior, we must protect other community members from future unacceptable actions. The moderators may choose to lift an indefinite ban if the person in question affirms they will not transgress again.

Removals, like warnings, are public.

The list of moderators will be public knowledge. At present, it is: Karen Etheridge, Ricardo Signes, Sawyer X, Steffen Müller, Todd Rinaldo, Aaron Crane.

=head1 CREDITS

"Social Contract about Contributed Modules" originally by Russ Allbery E<lt>rra@stanford.eduE<gt> and the perl5-porters.

=encoding utf8

=head1 NAME

perl5241delta - what is new for perl v5.24.1

=head1 DESCRIPTION

This document describes differences between the 5.24.0 release and the 5.24.1 release.
If you are upgrading from an earlier release such as 5.22.0, first read L<perl5240delta>, which describes differences between 5.22.0 and 5.24.0.

=head1 Security

=head2 B<-Di> switch is now required for PerlIO debugging output

Previously PerlIO debugging output would be sent to the file specified by the C<PERLIO_DEBUG> environment variable if perl wasn't running setuid and the B<-T> or B<-t> switches hadn't been parsed yet.

If perl performed output at a point where it hadn't yet parsed its switches, this could result in perl creating or overwriting the file named by C<PERLIO_DEBUG> even when the B<-T> switch had been supplied.

Perl now requires the B<-Di> switch to produce PerlIO debugging output. By default this is written to C<stderr>, but can optionally be redirected to a file by setting the C<PERLIO_DEBUG> environment variable.

If perl is running setuid or the B<-T> switch was supplied, C<PERLIO_DEBUG> is ignored and the debugging output is sent to C<stderr> as for any other B<-D> switch.

=head2 Core modules and tools no longer search F<"."> for optional modules

The tools and many modules supplied in core no longer search the default current directory entry in L<C<@INC>|perlvar/@INC> for optional modules. For example, L<Storable> will remove the final F<"."> from C<@INC> before trying to load L<Log::Agent>.

This prevents an attacker injecting an optional module into a process run by another user where the current directory is writable by the attacker, e.g. the F</tmp> directory.

In most cases this removal should not cause problems, but difficulties were encountered with L<base>, which treats every module name supplied as optional. These difficulties have not yet been resolved, so for this release there are no changes to L<base>. We hope to have a fix for L<base> in Perl 5.24.2.

To protect your own code from this attack, either remove the default F<"."> entry from C<@INC> at the start of your script, so:

    #!/usr/bin/perl
    use strict;
    ...

becomes:

    #!/usr/bin/perl
    BEGIN { pop @INC if $INC[-1] eq '.' }
    use strict;
    ...

or for modules, remove F<"."> from a localized C<@INC>, so:

    my $can_foo = eval { require Foo; };

becomes:

    my $can_foo = eval {
        local @INC = @INC;
        pop @INC if $INC[-1] eq '.';
        require Foo;
    };

=head1 Incompatible Changes

Other than the security changes above there are no changes intentionally incompatible with Perl 5.24.0. If any exist, they are bugs, and we request that you submit a report. See L</Reporting Bugs> below.

=head1 Modules and Pragmata

=head2 Updated Modules and Pragmata

=over 4

=item *

L<Archive::Tar> has been upgraded from version 2.04 to 2.04_01.

=item *

L<bignum> has been upgraded from version 0.42 to 0.42_01.

=item *

L<CPAN> has been upgraded from version 2.11 to 2.11_01.

=item *

L<Digest> has been upgraded from version 1.17 to 1.17_01.

=item *

L<Digest::SHA> has been upgraded from version 5.95 to 5.95_01.

=item *

L<Encode> has been upgraded from version 2.80 to 2.80_01.

=item *

L<ExtUtils::MakeMaker> has been upgraded from version 7.10_01 to 7.10_02.

=item *

L<File::Fetch> has been upgraded from version 0.48 to 0.48_01.

=item *

L<File::Spec> has been upgraded from version 3.63 to 3.63_01.

=item *

L<HTTP::Tiny> has been upgraded from version 0.056 to 0.056_001.

=item *

L<IO> has been upgraded from version 1.36 to 1.36_01.

=item *

The IO-Compress modules have been upgraded from version 2.069 to 2.069_001.

=item *

L<IPC::Cmd> has been upgraded from version 0.92 to 0.92_01.

=item *

L<JSON::PP> has been upgraded from version 2.27300 to 2.27300_01.

=item *

L<Locale::Maketext> has been upgraded from version 1.26 to 1.26_01.

=item *

L<Locale::Maketext::Simple> has been upgraded from version 0.21 to 0.21_01.

=item *

L<Memoize> has been upgraded from version 1.03 to 1.03_01.

=item *

L<Module::CoreList> has been upgraded from version 5.20160506 to 5.20170114_24.

=item *

L<Net::Ping> has been upgraded from version 2.43 to 2.43_01.
=item * L<Parse::CPAN::Meta> has been upgraded from version 1.4417 to 1.4417_001. =item * L<Pod::Html> has been upgraded from version 1.22 to 1.2201. =item * L<Pod::Perldoc> has been upgraded from version 3.25_02 to 3.25_03. =item * L<Storable> has been upgraded from version 2.56 to 2.56_01. =item * L<Sys::Syslog> has been upgraded from version 0.33 to 0.33_01. =item * L<Test> has been upgraded from version 1.28 to 1.28_01. =item * L<Test::Harness> has been upgraded from version 3.36 to 3.36_01. =item * L<XSLoader> has been upgraded from version 0.21 to 0.22, fixing a security hole in which binary files could be loaded from a path outside of C<@INC>. L<[perl #128528]|https://rt.perl.org/Public/Bug/Display.html?id=128528> =back =head1 Documentation =head2 Changes to Existing Documentation =head3 L<perlapio> =over 4 =item * The documentation of C<PERLIO_DEBUG> has been updated. =back =head3 L<perlrun> =over 4 =item * The new B<-Di> switch has been documented, and the documentation of C<PERLIO_DEBUG> has been updated. =back =head1 Testing =over 4 =item * A new test script, F<t/run/switchDx.t>, has been added to test that the new B<-Di> switch is working correctly. =back =head1 Selected Bug Fixes =over 4 =item * The change to hashbang redirection introduced in Perl 5.24.0, whereby perl would redirect to another interpreter (Perl 6) if it found a hashbang path which contains "perl" followed by "6", has been reverted because it broke in cases such as C<#!/opt/perl64/bin/perl>. =back =head1 Acknowledgements Perl 5.24.1 represents approximately 8 months of development since Perl 5.24.0 and contains approximately 8,100 lines of changes across 240 files from 18 authors. Excluding auto-generated files, documentation and release tools, there were approximately 2,200 lines of changes to 170 .pm, .t, .c and .h files. Perl continues to flourish into its third decade thanks to a vibrant community of users and developers. 
The following people are known to have contributed the improvements that became Perl 5.24.1: Aaron Crane, Alex Vandiver, Aristotle Pagaltzis, Chad Granum, Chris 'BinGOs' Williams, Craig A. Berry, Father Chrysostomos, James E Keenan, Jarkko Hietaniemi, Karen Etheridge, Leon Timmermans, Matthew Horsfall, Ricardo Signes, Sawyer X, Sébastien Aperghis-Tramoni, Stevan Little, Steve Hay, Tony Cook. The list above is almost certainly incomplete as it is automatically generated from version control history. In particular, it does not include the names of the (very much appreciated) contributors who reported issues to the Perl bug tracker. Many of the changes included in this version originated in the CPAN modules included in Perl's core. We're grateful to the entire CPAN community for helping Perl to flourish. For a more complete list of all of Perl's historical contributors, please see the F<AUTHORS> file in the Perl source distribution. =head1 Reporting Bugs If you find what you think is a bug, you might check the articles recently posted to the comp.lang.perl.misc newsgroup and the Perl bug database at L<https://rt.perl.org/> . There may also be information at L<http://www.perl.org/> , the Perl Home Page. If you believe you have an unreported bug, please run the L<perlbug> program included with your release. Be sure to trim your bug down to a tiny but sufficient test case. Your bug report, along with the output of C<perl -V>, will be sent off to perlbug@perl.org to be analysed by the Perl porting team. If the bug you are reporting has security implications which make it inappropriate to send to a publicly archived mailing list, then see L<perlsec/SECURITY VULNERABILITY CONTACT INFORMATION> for details of how to report the issue. =head1 SEE ALSO The F<Changes> file for an explanation of how to view exhaustive details on what changed. The F<INSTALL> file for how to build Perl. The F<README> file for general stuff. 
The F<Artistic> and F<Copying> files for copyright information.

=cut

=encoding utf8

=head1 NAME

perl5160delta - what is new for perl v5.16.0

=head1 DESCRIPTION

This document describes differences between the 5.14.0 release and the 5.16.0 release.

If you are upgrading from an earlier release such as 5.12.0, first read L<perl5140delta>, which describes differences between 5.12.0 and 5.14.0.

Some bug fixes in this release have been backported to later releases of 5.14.x. Those are indicated with the 5.14.x version in parentheses.

=head1 Notice

With the release of Perl 5.16.0, the 5.12.x series of releases is now out of its support period. There may be future 5.12.x releases, but only in the event of a critical security issue. Users of Perl 5.12 or earlier should consider upgrading to a more recent release of Perl.

This policy is described in greater detail in L<perlpolicy|perlpolicy/MAINTENANCE AND SUPPORT>.

=head1 Core Enhancements

=head2 C<use I<VERSION>>

As of this release, version declarations like C<use v5.16> now disable all features before enabling the new feature bundle. This means that the following holds true:

    use 5.016;
    # only 5.16 features enabled here
    use 5.014;
    # only 5.14 features enabled here (not 5.16)

C<use v5.12> and higher continue to enable strict, but explicit C<use strict> and C<no strict> now override the version declaration, even when they come first:

    no strict;
    use 5.012;
    # no strict here

There is a new ":default" feature bundle that represents the set of features enabled before any version declaration or C<use feature> has been seen. Version declarations below 5.10 now enable the ":default" feature set. This does not actually change the behavior of C<use v5.8>, because features added to the ":default" set are those that were traditionally enabled by default, before they could be turned off.

C<< no feature >> now resets to the default feature set.
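As a rough illustration of the reset behaviour described above, the following sketch probes the feature state from inside string evals. The helper name C<compiles> is hypothetical, and using C<__SUB__> as the probe is only a convenient choice; the sketch assumes perl 5.16 or later.

```perl
use strict;
use warnings;

# Hypothetical helper: compile and run a snippet in its own string-eval
# scope, returning 1 on success and 0 if it failed to compile or died.
sub compiles {
    my ($src) = @_;
    return eval("$src; 1") ? 1 : 0;
}

# __SUB__ belongs to the 5.16 feature bundle, so this compiles.
print compiles('use v5.16; my $f = sub { __SUB__ }; $f->()'), "\n";

# A bare "no feature" resets to the default set. __SUB__ is then an
# ordinary bareword, which the strictures enabled by "use v5.16" reject.
print compiles('use v5.16; no feature; my $f = sub { __SUB__ }; $f->()'), "\n";
```

On a 5.16 or later perl the first probe should print 1 and the second 0. The string evals keep each version declaration confined to its own lexical scope, so the probes do not affect one another.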
To disable all features (which is likely to be a pretty special-purpose request, since it presumably won't match any named set of semantics) you can now write C<< no feature ':all' >>.

C<$[> is now disabled under C<use v5.16>. It is part of the default feature set and can be turned on or off explicitly with C<use feature 'array_base'>.

=head2 C<__SUB__>

The new C<__SUB__> token, available under the C<current_sub> feature (see L<feature>) or C<use v5.16>, returns a reference to the current subroutine, making it easier to write recursive closures.

=head2 New and Improved Built-ins

=head3 More consistent C<eval>

The C<eval> operator sometimes treats a string argument as a sequence of characters and sometimes as a sequence of bytes, depending on the internal encoding. The internal encoding is not supposed to make any difference, but there is code that relies on this inconsistency.

The new C<unicode_eval> and C<evalbytes> features (enabled under C<use 5.16.0>) resolve this. The C<unicode_eval> feature causes C<eval $string> to treat the string always as Unicode. The C<evalbytes> feature provides a function, itself called C<evalbytes>, which evaluates its argument always as a string of bytes.

These features also fix oddities with source filters leaking to outer dynamic scopes.

See L<feature> for more detail.

=head3 C<substr> lvalue revamp

=for comment Does this belong here, or under Incompatible Changes?

When C<substr> is called in lvalue or potential lvalue context with two or three arguments, a special lvalue scalar is returned that modifies the original string (the first argument) when assigned to.

Previously, the offsets (the second and third arguments) passed to C<substr> would be converted immediately to match the string, negative offsets being translated to positive and offsets beyond the end of the string being truncated.
Now, the offsets are recorded without modification in the special lvalue scalar that is returned, and the original string is not even looked at by C<substr> itself, but only when the returned lvalue is read or modified.

These changes result in an incompatible change:

If the original string changes length after the call to C<substr> but before assignment to its return value, negative offsets will remember their position from the end of the string, affecting code like this:

    my $string = "string";
    my $lvalue = \substr $string, -4, 2;
    print $$lvalue, "\n"; # prints "ri"
    $string = "bailing twine";
    print $$lvalue, "\n"; # prints "wi"; used to print "il"

The same thing happens with an omitted third argument. The returned lvalue will always extend to the end of the string, even if the string becomes longer.

Since this change also allowed many bugs to be fixed (see L</The C<substr> operator>), and since the behavior of negative offsets has never been specified, the change was deemed acceptable.

=head3 Return value of C<tied>

The value returned by C<tied> on a tied variable is now the actual scalar that holds the object to which the variable is tied. This lets ties be weakened with C<Scalar::Util::weaken(tied $tied_variable)>.

=head2 Unicode Support

=head3 Supports (I<almost>) Unicode 6.1

Besides the addition of whole new scripts, and new characters in existing scripts, this new version of Unicode, as always, makes some changes to existing characters. One change that may trip up some applications is that the General Category of two characters in the Latin-1 range, PILCROW SIGN and SECTION SIGN, has been changed from Other_Symbol to Other_Punctuation. The same change has been made for a character in each of Tibetan, Ethiopic, and Aegean. The code points U+3248..U+324F (CIRCLED NUMBER TEN ON BLACK SQUARE through CIRCLED NUMBER EIGHTY ON BLACK SQUARE) have had their General Category changed from Other_Symbol to Other_Numeric.
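One way to see how a given perl classifies the code points mentioned above is to match them against the short General Category property names. This is only a sketch: the helper name C<category_of> is hypothetical, the sample code points are our own pick, and the result depends on the Unicode version your perl was built with.

```perl
use strict;
use warnings;

# Hypothetical helper: report which of the three General Categories
# discussed above a code point falls into on this perl.
sub category_of {
    my ($cp) = @_;
    my $ch = chr $cp;
    return 'Other_Punctuation' if $ch =~ /\p{Po}/;
    return 'Other_Numeric'     if $ch =~ /\p{No}/;
    return 'Other_Symbol'      if $ch =~ /\p{So}/;
    return 'other';
}

# PILCROW SIGN, SECTION SIGN, CIRCLED NUMBER TEN ON BLACK SQUARE
printf "U+%04X %s\n", $_, category_of($_) for 0x00B6, 0x00A7, 0x3248;
```

With Unicode 6.1 or later data, the first two should be reported as Other_Punctuation and the third as Other_Numeric; a perl built with older Unicode data would report Other_Symbol for all three.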
The Line Break property has changes for Hebrew and Japanese; and because of other changes in 6.1, the Perl regular expression construct C<\X> now works differently for some characters in Thai and Lao.

New aliases (synonyms) have been defined for many property values; these, along with the previously existing ones, are all cross-indexed in L<perluniprops>.

The return value of C<charnames::viacode()> is affected by other changes:

 Code point  Old Name                New Name
   U+000A    LINE FEED (LF)          LINE FEED
   U+000C    FORM FEED (FF)          FORM FEED
   U+000D    CARRIAGE RETURN (CR)    CARRIAGE RETURN
   U+0085    NEXT LINE (NEL)         NEXT LINE
   U+008E    SINGLE-SHIFT 2          SINGLE-SHIFT-2
   U+008F    SINGLE-SHIFT 3          SINGLE-SHIFT-3
   U+0091    PRIVATE USE 1           PRIVATE USE-1
   U+0092    PRIVATE USE 2           PRIVATE USE-2
   U+2118    SCRIPT CAPITAL P        WEIERSTRASS ELLIPTIC FUNCTION

Perl will accept any of these names as input, but C<charnames::viacode()> now returns the new name of each pair. The change for U+2118 is considered by Unicode to be a correction, that is the original name was a mistake (but again, it will remain forever valid to use it to refer to U+2118). But most of these changes are the fallout of the mistake Unicode 6.0 made in naming a character used in Japanese cell phones to be "BELL", which conflicts with the longstanding industry use of (and Unicode's recommendation to use) that name to mean the ASCII control character at U+0007. Therefore, that name has been deprecated in Perl since v5.14, and any use of it will raise a warning message (unless turned off). The name "ALERT" is now the preferred name for this code point, with "BEL" an acceptable short form. The name for the new cell phone character, at code point U+1F514, remains undefined in this version of Perl (hence we don't implement quite all of Unicode 6.1), but starting in v5.18, BELL will mean this character, and not U+0007.

Unicode has taken steps to make sure that this sort of mistake does not happen again.
The Standard now includes all generally accepted names and abbreviations for control characters, whereas previously it didn't (though there were recommended names for most of them, which Perl used). This means that most of those recommended names are now officially in the Standard. Unicode did not recommend names for the four code points listed above between U+008E and U+0092, and in standardizing them Unicode subtly changed the names that Perl had previously given them, by replacing the final blank in each name by a hyphen. Unicode also officially accepts names that Perl had deprecated, such as FILE SEPARATOR. Now the only deprecated name is BELL. Finally, Perl now uses the new official names instead of the old (now considered obsolete) names for the first four code points in the list above (the ones which have the parentheses in them).

Now that the names have been placed in the Unicode standard, these kinds of changes should not happen again, though corrections, such as to U+2118, are still possible.

Unicode also added some name abbreviations, which Perl now accepts: SP for SPACE; TAB for CHARACTER TABULATION; NEW LINE, END OF LINE, NL, and EOL for LINE FEED; LOCKING-SHIFT ONE for SHIFT OUT; LOCKING-SHIFT ZERO for SHIFT IN; and ZWNBSP for ZERO WIDTH NO-BREAK SPACE.

More details on this version of Unicode are provided in L<http://www.unicode.org/versions/Unicode6.1.0/>.

=head3 C<use charnames> is no longer needed for C<\N{I<name>}>

When C<\N{I<name>}> is encountered, the C<charnames> module is now automatically loaded when needed as if the C<:full> and C<:short> options had been specified. See L<charnames> for more information.

=head3 C<\N{...}> can now have Unicode loose name matching

This is described in the C<charnames> item in L</Updated Modules and Pragmata> below.

=head3 Unicode Symbol Names

Perl now has proper support for Unicode in symbol names.
It used to be that C<*{$foo}> would ignore the internal UTF8 flag and use the bytes of the underlying representation to look up the symbol. That meant that C<*{"\x{100}"}> and C<*{"\xc4\x80"}> would return the same thing. All these parts of Perl have been fixed to account for Unicode: =over =item * Method names (including those passed to C<use overload>) =item * Typeglob names (including names of variables, subroutines, and filehandles) =item * Package names =item * C<goto> =item * Symbolic dereferencing =item * Second argument to C<bless()> and C<tie()> =item * Return value of C<ref()> =item * Subroutine prototypes =item * Attributes =item * Various warnings and error messages that mention variable names or values, methods, etc. =back In addition, a parsing bug has been fixed that prevented C<*{é}> from implicitly quoting the name, but instead interpreted it as C<*{+é}>, which would cause a strict violation. C<*{"*a::b"}> automatically strips off the * if it is followed by an ASCII letter. That has been extended to all Unicode identifier characters. One-character non-ASCII non-punctuation variables (like C<$é>) are now subject to "Used only once" warnings. They used to be exempt, as they were treated as punctuation variables. Also, single-character Unicode punctuation variables (like C<$‰>) are now supported [perl #69032]. =head3 Improved ability to mix locales and Unicode, including UTF-8 locales An optional parameter has been added to C<use locale> use locale ':not_characters'; which tells Perl to use all but the C<LC_CTYPE> and C<LC_COLLATE> portions of the current locale. Instead, the character set is assumed to be Unicode. This lets locales and Unicode be seamlessly mixed, including the increasingly frequent UTF-8 locales. When using this hybrid form of locales, the C<:locale> layer to the L<open> pragma can be used to interface with the file system, and there are CPAN modules available for ARGV and environment variable conversions. 
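The hybrid form described above might be used as in the following sketch. It assumes Perl v5.16 or newer and whatever locale the environment provides; the sample values are illustrative and do not come from the release notes. Case mapping follows Unicode rules, while the decimal separator printed by C<printf> follows the locale's C<LC_NUMERIC> setting, so the last line of output varies from system to system.

```perl
use strict;
use warnings;
use POSIX qw(setlocale LC_ALL);
use locale ':not_characters';    # needs Perl v5.16 or newer

# Adopt the locale named by the environment (LANG, LC_ALL, ...).
setlocale(LC_ALL, "");

# Character handling is Unicode, whatever LC_CTYPE says.
my $lower = "\x{101}";           # LATIN SMALL LETTER A WITH MACRON
my $upper = uc $lower;           # LATIN CAPITAL LETTER A WITH MACRON
printf "U+%04X -> U+%04X\n", ord $lower, ord $upper;

# Number formatting still follows the locale (LC_NUMERIC).
printf "%.2f\n", 1234.5;
```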
Full details are in L<perllocale>. =head3 New function C<fc> and corresponding escape sequence C<\F> for Unicode foldcase Unicode foldcase is an extension to lowercase that gives better results when comparing two strings case-insensitively. It has long been used internally in regular expression C</i> matching. Now it is available explicitly through the new C<fc> function call (enabled by S<C<"use feature 'fc'">>, or C<use v5.16>, or explicitly callable via C<CORE::fc>) or through the new C<\F> sequence in double-quotish strings. Full details are in L<perlfunc/fc>. =head3 The Unicode C<Script_Extensions> property is now supported. New in Unicode 6.0, this is an improved C<Script> property. Details are in L<perlunicode/Scripts>. =head2 XS Changes =head3 Improved typemaps for Some Builtin Types Most XS authors will know there is a longstanding bug in the OUTPUT typemap for T_AVREF (C<AV*>), T_HVREF (C<HV*>), T_CVREF (C<CV*>), and T_SVREF (C<SVREF> or C<\$foo>) that requires manually decrementing the reference count of the return value instead of the typemap taking care of this. For backwards-compatibility, this cannot be changed in the default typemaps. But we now provide additional typemaps C<T_AVREF_REFCOUNT_FIXED>, etc. that do not exhibit this bug. Using them in your extension is as simple as having one line in your C<TYPEMAP> section: HV* T_HVREF_REFCOUNT_FIXED =head3 C<is_utf8_char()> The XS-callable function C<is_utf8_char()>, when presented with malformed UTF-8 input, can read up to 12 bytes beyond the end of the string. This cannot be fixed without changing its API, and so its use is now deprecated. Use C<is_utf8_char_buf()> (described just below) instead. =head3 Added C<is_utf8_char_buf()> This function is designed to replace the deprecated L</is_utf8_char()> function. It includes an extra parameter to make sure it doesn't read past the end of the input buffer. =head3 Other C<is_utf8_foo()> functions, as well as C<utf8_to_foo()>, etc. 
Most other XS-callable functions that take UTF-8 encoded input implicitly assume that the UTF-8 is valid (not malformed) with respect to buffer length. Do not do things such as change a character's case or see if it is alphanumeric without first being sure that it is valid UTF-8. This can be safely done for a whole string by using one of the functions C<is_utf8_string()>, C<is_utf8_string_loc()>, and C<is_utf8_string_loclen()>. =head3 New Pad API Many new functions have been added to the API for manipulating lexical pads. See L<perlapi/Pad Data Structures> for more information. =head2 Changes to Special Variables =head3 C<$$> can be assigned to C<$$> was made read-only in Perl 5.8.0. But only sometimes: C<local $$> would make it writable again. Some CPAN modules were using C<local $$> or XS code to bypass the read-only check, so there is no reason to keep C<$$> read-only. (This change also allowed a bug to be fixed while maintaining backward compatibility.) =head3 C<$^X> converted to an absolute path on FreeBSD, OS X and Solaris C<$^X> is now converted to an absolute path on OS X, FreeBSD (without needing F</proc> mounted) and Solaris 10 and 11. This augments the previous approach of using F</proc> on Linux, FreeBSD, and NetBSD (in all cases, where mounted). This makes relocatable perl installations more useful on these platforms. (See "Relocatable @INC" in F<INSTALL>) =head2 Debugger Changes =head3 Features inside the debugger The current Perl's L<feature> bundle is now enabled for commands entered in the interactive debugger. =head3 New option for the debugger's B<t> command The B<t> command in the debugger, which toggles tracing mode, now accepts a numeric argument that determines how many levels of subroutine calls to trace. =head3 C<enable> and C<disable> The debugger now has C<disable> and C<enable> commands for disabling existing breakpoints and re-enabling them. See L<perldebug>. 
=head3 Breakpoints with file names The debugger's "b" command for setting breakpoints now lets a line number be prefixed with a file name. See L<perldebug/"b [file]:[line] [condition]">. =head2 The C<CORE> Namespace =head3 The C<CORE::> prefix The C<CORE::> prefix can now be used on keywords enabled by L<feature.pm|feature>, even outside the scope of C<use feature>. =head3 Subroutines in the C<CORE> namespace Many Perl keywords are now available as subroutines in the CORE namespace. This lets them be aliased: BEGIN { *entangle = \&CORE::tie } entangle $variable, $package, @args; And for prototypes to be bypassed: sub mytie(\[%$*@]$@) { my ($ref, $pack, @args) = @_; ... do something ... goto &CORE::tie; } Some of these cannot be called through references or via C<&foo> syntax, but must be called as barewords. See L<CORE> for details. =head2 Other Changes =head3 Anonymous handles Automatically generated file handles are now named __ANONIO__ when the variable name cannot be determined, rather than $__ANONIO__. =head3 Autoloaded sort Subroutines Custom sort subroutines can now be autoloaded [perl #30661]: sub AUTOLOAD { ... } @sorted = sort foo @list; # uses AUTOLOAD =head3 C<continue> no longer requires the "switch" feature The C<continue> keyword has two meanings. It can introduce a C<continue> block after a loop, or it can exit the current C<when> block. Up to now, the latter meaning was valid only with the "switch" feature enabled, and was a syntax error otherwise. Since the main purpose of feature.pm is to avoid conflicts with user-defined subroutines, there is no reason for C<continue> to depend on it. =head3 DTrace probes for interpreter phase change The C<phase-change> probes will fire when the interpreter's phase changes, which tracks the C<${^GLOBAL_PHASE}> variable. C<arg0> is the new phase name; C<arg1> is the old one. This is useful for limiting your instrumentation to one or more of: compile time, run time, or destruct time. 
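The phase names reported by the probes are the values of C<${^GLOBAL_PHASE}>. The following is a small Perl-side sketch (it does not use DTrace, and the C<@phases> array is purely illustrative) that records the variable at different points of a program's life when the file is run as the main script:

```perl
use strict;
use warnings;

our @phases;

# Compile time: the phase is START while BEGIN blocks of the main script run.
BEGIN { push @phases, ${^GLOBAL_PHASE} }

# Ordinary run time: the phase is RUN.
push @phases, ${^GLOBAL_PHASE};

# END blocks run in the END phase, after the main program has finished.
END { print "seen: @phases ${^GLOBAL_PHASE}\n" }
```

A DTrace script watching C<phase-change> would see the same names passed as C<arg0> and C<arg1> at each transition.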
=head3 C<__FILE__()> Syntax The C<__FILE__>, C<__LINE__> and C<__PACKAGE__> tokens can now be written with an empty pair of parentheses after them. This makes them parse the same way as C<time>, C<fork> and other built-in functions. =head3 The C<\$> prototype accepts any scalar lvalue The C<\$> and C<\[$]> subroutine prototypes now accept any scalar lvalue argument. Previously they accepted only scalars beginning with C<$> and hash and array elements. This change makes them consistent with the way the built-in C<read> and C<recv> functions (among others) parse their arguments. This means that one can override the built-in functions with custom subroutines that parse their arguments the same way. =head3 C<_> in subroutine prototypes The C<_> character in subroutine prototypes is now allowed before C<@> or C<%>. =head1 Security =head2 Use C<is_utf8_char_buf()> and not C<is_utf8_char()> The latter function is now deprecated because its API is insufficient to guarantee that it doesn't read (up to 12 bytes in the worst case) beyond the end of its input string. See L<is_utf8_char_buf()|/Added is_utf8_char_buf()>. =head2 Malformed UTF-8 input could cause attempts to read beyond the end of the buffer Two new XS-accessible functions, C<utf8_to_uvchr_buf()> and C<utf8_to_uvuni_buf()> are now available to prevent this, and the Perl core has been converted to use them. See L</Internal Changes>. =head2 C<File::Glob::bsd_glob()> memory error with GLOB_ALTDIRFUNC (CVE-2011-2728). Calling C<File::Glob::bsd_glob> with the unsupported flag GLOB_ALTDIRFUNC would cause an access violation / segfault. A Perl program that accepts a flags value from an external source could expose itself to denial of service or arbitrary code execution attacks. There are no known exploits in the wild. The problem has been corrected by explicitly disabling all unsupported flags and setting unused function pointers to null. Bug reported by Clément Lecigne. 
(5.14.2) =head2 Privileges are now set correctly when assigning to C<$(> A hypothetical bug (probably unexploitable in practice) due to the incorrect setting of the effective group ID while setting C<$(> has been fixed. The bug would have affected only systems that have C<setresgid()> but not C<setregid()>, but no such systems are known to exist. =head1 Deprecations =head2 Don't read the Unicode data base files in F<lib/unicore> It is now deprecated to directly read the Unicode data base files. These are stored in the F<lib/unicore> directory. Instead, you should use the new functions in L<Unicode::UCD>. These provide a stable API, and give complete information. Perl may at some point in the future change or remove these files. The file which applications were most likely to have used is F<lib/unicore/ToDigit.pl>. L<Unicode::UCD/prop_invmap()> can be used to get at its data instead. =head2 XS functions C<is_utf8_char()>, C<utf8_to_uvchr()> and C<utf8_to_uvuni()> These functions are deprecated because they could read beyond the end of the input string. Use the new L<is_utf8_char_buf()|/Added is_utf8_char_buf()>, C<utf8_to_uvchr_buf()> and C<utf8_to_uvuni_buf()> instead. =head1 Future Deprecations This section serves as a notice of features that are I<likely> to be removed or L<deprecated|perlpolicy/deprecated> in the next release of perl (5.18.0). If your code depends on these features, you should contact the Perl 5 Porters via the L<mailing list|http://lists.perl.org/list/perl5-porters.html> or L<perlbug> to explain your use case and inform the deprecation process. =head2 Core Modules These modules may be marked as deprecated I<from the core>. This only means that they will no longer be installed by default with the core distribution, but will remain available on the CPAN.
=over =item * CPANPLUS =item * Filter::Simple =item * PerlIO::mmap =item * Pod::LaTeX =item * Pod::Parser =item * SelfLoader =item * Text::Soundex =item * Thread.pm =back =head2 Platforms with no supporting programmers These platforms will probably have their special build support removed during the 5.17.0 development series. =over =item * BeOS =item * djgpp =item * dgux =item * EPOC =item * MPE/iX =item * Rhapsody =item * UTS =item * VM/ESA =back =head2 Other Future Deprecations =over =item * Swapping of $< and $> For more information about this future deprecation, see L<the relevant RT ticket|https://rt.perl.org/rt3/Ticket/Display.html?id=96212>. =item * sfio, stdio Perl supports being built without PerlIO proper, using a stdio or sfio wrapper instead. A perl build like this will not support IO layers and thus Unicode IO, making it rather handicapped. PerlIO supports a C<stdio> layer if stdio use is desired, and similarly a sfio layer could be produced. =item * Unescaped literal C<< "{" >> in regular expressions. Starting with v5.20, it is planned to require a literal C<"{"> to be escaped, for example by preceding it with a backslash. In v5.18, a deprecation warning message will be emitted for all such uses. This affects only patterns that are to match a literal C<"{">. Other uses of this character, such as part of a quantifier or sequence as in those below, are completely unaffected: /foo{3,5}/ /\p{Alphabetic}/ /\N{DIGIT ZERO}/ Removing this will permit extensions to Perl's pattern syntax and better error checking for existing syntax. See L<perlre/Quantifiers> for an example. =item * Revamping C<< "\Q" >> semantics in double-quotish strings when combined with other escapes. There are several bugs and inconsistencies involving combinations of C<\Q> and escapes like C<\x>, C<\L>, etc., within a C<\Q...\E> pair. These need to be fixed, and doing so will necessarily change current behavior. The changes have not yet been settled.
=back =head1 Incompatible Changes =head2 Special blocks called in void context Special blocks (C<BEGIN>, C<CHECK>, C<INIT>, C<UNITCHECK>, C<END>) are now called in void context. This avoids wasteful copying of the result of the last statement [perl #108794]. =head2 The C<overloading> pragma and regexp objects With C<no overloading>, regular expression objects returned by C<qr//> are now stringified as "Regexp=REGEXP(0xbe600d)" instead of the regular expression itself [perl #108780]. =head2 Two XS typemap Entries removed Two presumably unused XS typemap entries have been removed from the core typemap: T_DATAUNIT and T_CALLBACK. If you are, against all odds, a user of these, please see the instructions on how to restore them in L<perlxstypemap>. =head2 Unicode 6.1 has incompatibilities with Unicode 6.0 These are detailed in L</Supports (almost) Unicode 6.1> above. You can compile this version of Perl to use Unicode 6.0. See L<perlunicode/Hacking Perl to work on earlier Unicode versions (for very serious hackers only)>. =head2 Borland compiler All support for the Borland compiler has been dropped. The code had not worked for a long time anyway. =head2 Certain deprecated Unicode properties are no longer supported by default Perl should never have exposed certain Unicode properties that are used by Unicode internally and not meant to be publicly available. Use of these has generated deprecated warning messages since Perl 5.12. The removed properties are Other_Alphabetic, Other_Default_Ignorable_Code_Point, Other_Grapheme_Extend, Other_ID_Continue, Other_ID_Start, Other_Lowercase, Other_Math, and Other_Uppercase. Perl may be recompiled to include any or all of them; instructions are given in L<perluniprops/Unicode character properties that are NOT accepted by Perl>. =head2 Dereferencing IO thingies as typeglobs The C<*{...}> operator, when passed a reference to an IO thingy (as in C<*{*STDIN{IO}}>), creates a new typeglob containing just that IO object. 
Previously, it would stringify as an empty string, but some operators would treat it as undefined, producing an "uninitialized" warning. Now it stringifies as __ANONIO__ [perl #96326]. =head2 User-defined case-changing operations This feature was deprecated in Perl 5.14, and has now been removed. The CPAN module L<Unicode::Casing> provides better functionality without the drawbacks that this feature had, as are detailed in the 5.14 documentation: L<http://perldoc.perl.org/5.14.0/perlunicode.html#User-Defined-Case-Mappings-%28for-serious-hackers-only%29> =head2 XSUBs are now 'static' XSUB C functions are now 'static', that is, they are not visible from outside the compilation unit. Users can use the new C<XS_EXTERNAL(name)> and C<XS_INTERNAL(name)> macros to pick the desired linking behavior. The ordinary C<XS(name)> declaration for XSUBs will continue to declare non-'static' XSUBs for compatibility, but the XS compiler, L<ExtUtils::ParseXS> (C<xsubpp>) will emit 'static' XSUBs by default. L<ExtUtils::ParseXS>'s behavior can be reconfigured from XS using the C<EXPORT_XSUB_SYMBOLS> keyword. See L<perlxs> for details. =head2 Weakening read-only references Weakening read-only references is no longer permitted. It should never have worked anyway, and could sometimes result in crashes. =head2 Tying scalars that hold typeglobs Attempting to tie a scalar after a typeglob was assigned to it would instead tie the handle in the typeglob's IO slot. This meant that it was impossible to tie the scalar itself. Similar problems affected C<tied> and C<untie>: C<tied $scalar> would return false on a tied scalar if the last thing returned was a typeglob, and C<untie $scalar> on such a tied scalar would do nothing. We fixed this problem before Perl 5.14.0, but it caused problems with some CPAN modules, so we put in a deprecation cycle instead. Now the deprecation has been removed and this bug has been fixed. So C<tie $scalar> will always tie the scalar, not the handle it holds. 
To tie the handle, use C<tie *$scalar> (with an explicit asterisk). The same applies to C<tied *$scalar> and C<untie *$scalar>. =head2 IPC::Open3 no longer provides C<xfork()>, C<xclose_on_exec()> and C<xpipe_anon()> All three functions were private, undocumented, and unexported. They do not appear to be used by any code on CPAN. Two have been inlined and one deleted entirely. =head2 C<$$> no longer caches PID Previously, if one called fork(3) from C, Perl's notion of C<$$> could go out of sync with what getpid() returns. By always fetching the value of C<$$> via getpid(), this potential bug is eliminated. Code that depends on the caching behavior will break. As described in L<Core Enhancements|/C<$$> can be assigned to>, C<$$> is now writable, but it will be reset during a fork. =head2 C<$$> and C<getppid()> no longer emulate POSIX semantics under LinuxThreads The POSIX emulation of C<$$> and C<getppid()> under the obsolete LinuxThreads implementation has been removed. This only impacts users of Linux 2.4 and users of Debian GNU/kFreeBSD up to and including 6.0, not the vast majority of Linux installations that use NPTL threads. This means that C<getppid()>, like C<$$>, is now always guaranteed to return the OS's idea of the current state of the process, not perl's cached version of it. See the documentation for L<$$|perlvar/$$> for details. =head2 C<< $< >>, C<< $> >>, C<$(> and C<$)> are no longer cached Similarly to the changes to C<$$> and C<getppid()>, the internal caching of C<< $< >>, C<< $> >>, C<$(> and C<$)> has been removed. When we cached these values our idea of what they were would drift out of sync with reality if someone (e.g., someone embedding perl) called C<sete?[ug]id()> without updating C<PL_e?[ug]id>. Having to deal with this complexity wasn't worth it given how cheap the C<gete?[ug]id()> system call is. This change will break a handful of CPAN modules that use the XS-level C<PL_uid>, C<PL_gid>, C<PL_euid> or C<PL_egid> variables. 
The fix for those breakages is to use C<PerlProc_gete?[ug]id()> to retrieve them (e.g., C<PerlProc_getuid()>), and not to assign to C<PL_e?[ug]id> if you change the UID/GID/EUID/EGID. There is no longer any need to do so since perl will always retrieve the up-to-date version of those values from the OS. =head2 Which Non-ASCII characters get quoted by C<quotemeta> and C<\Q> has changed This is unlikely to result in a real problem, as Perl does not attach special meaning to any non-ASCII character, so it is currently irrelevant which are quoted or not. This change fixes bug [perl #77654] and brings Perl's behavior more into line with Unicode's recommendations. See L<perlfunc/quotemeta>. =head1 Performance Enhancements =over =item * Improved performance for Unicode properties in regular expressions =for comment Can this be compacted some? -- rjbs, 2012-02-20 Matching a code point against a Unicode property is now done via a binary search instead of linear. This means for example that the worst case for a 1000 item property is 10 probes instead of 1000. This inefficiency has been compensated for in the past by permanently storing in a hash the results of a given probe plus the results for the adjacent 64 code points, under the theory that near-by code points are likely to be searched for. A separate hash was used for each mention of a Unicode property in each regular expression. Thus, C<qr/\p{foo}abc\p{foo}/> would generate two hashes. Any probes in one instance would be unknown to the other, and the hashes could expand separately to be quite large if the regular expression were used on many different widely-separated code points. Now, however, there is just one hash shared by all instances of a given property. This means that if C<\p{foo}> is matched against "A" in one regular expression in a thread, the result will be known immediately to all regular expressions, and the relentless march of using up memory is slowed considerably. 
=item * Version declarations with the C<use> keyword (e.g., C<use 5.012>) are now faster, as they enable features without loading F<feature.pm>. =item * C<local $_> is faster now, as it no longer iterates through magic that it is not going to copy anyway. =item * Perl 5.12.0 sped up the destruction of objects whose classes define empty C<DESTROY> methods (to prevent autoloading), by simply not calling such empty methods. This release takes this optimization a step further, by not calling any C<DESTROY> method that begins with a C<return> statement. This can be useful for destructors that are only used for debugging: use constant DEBUG => 1; sub DESTROY { return unless DEBUG; ... } Constant-folding will reduce the first statement to C<return;> if DEBUG is set to 0, triggering this optimization. =item * Assigning to a variable that holds a typeglob or copy-on-write scalar is now much faster. Previously the typeglob would be stringified or the copy-on-write scalar would be copied before being clobbered. =item * Assignment to C<substr> in void context is now more than twice its previous speed. Instead of creating and returning a special lvalue scalar that is then assigned to, C<substr> modifies the original string itself. =item * C<substr> no longer calculates a value to return when called in void context. =item * Due to changes in L<File::Glob>, Perl's C<glob> function and its C<< <...> >> equivalent are now much faster. The splitting of the pattern into words has been rewritten in C, resulting in speed-ups of 20% for some cases. This does not affect C<glob> on VMS, as it does not use File::Glob. =item * The short-circuiting operators C<&&>, C<||>, and C<//>, when chained (such as C<$a || $b || $c>), are now considerably faster to short-circuit, due to reduced optree traversal. =item * The implementation of C<s///r> makes one fewer copy of the scalar's value. =item * Recursive calls to lvalue subroutines in lvalue scalar context use less memory. 
=back =head1 Modules and Pragmata =head2 Deprecated Modules =over =item L<Version::Requirements> Version::Requirements is now DEPRECATED; use L<CPAN::Meta::Requirements>, which is a drop-in replacement. It will be deleted from perl.git blead in v5.17.0. =back =head2 New Modules and Pragmata =over 4 =item * L<arybase> -- this new module implements the C<$[> variable. =item * L<PerlIO::mmap> 0.010 has been added to the Perl core. The C<mmap> PerlIO layer is no longer implemented by perl itself, but has been moved out into the new L<PerlIO::mmap> module. =back =head2 Updated Modules and Pragmata This is only an overview of selected module updates. For a complete list of updates, run: $ corelist --diff 5.14.0 5.16.0 You can substitute your favorite version in place of 5.14.0, too. =over 4 =item * L<Archive::Extract> has been upgraded from version 0.48 to 0.58. Includes a fix for FreeBSD to only use C<unzip> if it is located in C</usr/local/bin>, as FreeBSD 9.0 will ship with a limited C<unzip> in C</usr/bin>. =item * L<Archive::Tar> has been upgraded from version 1.76 to 1.82. Adjustments to handle files >8GB (>0777777777777 octal) and a feature to return the MD5SUM of files in the archive. =item * L<base> has been upgraded from version 2.16 to 2.18. C<base> no longer sets a module's C<$VERSION> to "-1" when a module it loads does not define a C<$VERSION>. This change has been made because "-1" is not a valid version number under the new "lax" criteria used internally by C<UNIVERSAL::VERSION>. (See L<version> for more on "lax" version criteria.) C<base> no longer internally skips loading modules it has already loaded and instead relies on C<require> to inspect C<%INC>. This fixes a bug when C<base> is used with code that clears C<%INC> to force a module to be reloaded. =item * L<Carp> has been upgraded from version 1.20 to 1.26. It now includes last read filehandle info and puts a dot after the file and line number, just like errors from C<die> [perl #106538].
=item * L<charnames> has been updated from version 1.18 to 1.30. C<charnames> can now be invoked with a new option, C<:loose>, which is like the existing C<:full> option, but enables Unicode loose name matching. Details are in L<charnames/LOOSE MATCHES>. =item * L<B::Deparse> has been upgraded from version 1.03 to 1.14. This fixes numerous deparsing bugs. =item * L<CGI> has been upgraded from version 3.52 to 3.59. It uses the public and documented FCGI.pm API in CGI::Fast. CGI::Fast was using an FCGI API that was deprecated and removed from documentation more than ten years ago. Usage of this deprecated API with FCGI E<gt>= 0.70 and FCGI E<lt>= 0.73 introduces a security issue. L<https://rt.cpan.org/Public/Bug/Display.html?id=68380> L<http://web.nvd.nist.gov/view/vuln/detail?vulnId=CVE-2011-2766> Things that may break your code: C<url()> was fixed to return C<PATH_INFO> when it is explicitly requested with either the C<path=E<gt>1> or C<path_info=E<gt>1> flag. If your code is running under mod_rewrite (or compatible) and you are calling C<self_url()> or you are calling C<url()> and passing C<path_info=E<gt>1>, these methods will actually be returning C<PATH_INFO> now, as you have explicitly requested or C<self_url()> has requested on your behalf. The C<PATH_INFO> has been omitted in such URLs since the issue was introduced in the 3.12 release in December, 2005. This bug is so old that your application may have come to depend on it or work around it. Check your application before upgrading to this release. Examples of affected method calls: $q->url(-absolute => 1, -query => 1, -path_info => 1); $q->url(-path=>1); $q->url(-full=>1,-path=>1); $q->url(-rewrite=>1,-path=>1); $q->self_url(); We no longer read from STDIN when the Content-Length is not set, preventing requests with no Content-Length from sometimes freezing. This is consistent with the CGI RFC 3875, and is also consistent with CGI::Simple.
However, the old behavior may have been expected by some command-line uses of CGI.pm. In addition, the DELETE HTTP verb is now supported. =item * L<Compress::Zlib> has been upgraded from version 2.035 to 2.048. IO::Compress::Zip and IO::Uncompress::Unzip now have support for LZMA (method 14). There is a fix for a CRC issue in IO::Uncompress::Unzip, which now also supports Streamed Stored content. A Zip64 issue in IO::Compress::Zip, which occurred when the content size was exactly 0xFFFFFFFF, has also been fixed. =item * L<Digest::SHA> has been upgraded from version 5.61 to 5.71. Added BITS mode to the addfile method and shasum. This makes partial-byte inputs possible via files/STDIN and lets shasum check all 8074 NIST Msg vectors, where previously special programming was required to do this. =item * L<Encode> has been upgraded from version 2.42 to 2.44. Missing aliases added, a deep recursion error fixed and various documentation updates. Addressed 'decode_xs n-byte heap-overflow' security bug in Unicode.xs (CVE-2011-2939). (5.14.2) =item * L<ExtUtils::CBuilder> updated from version 0.280203 to 0.280206. The new version appends CFLAGS and LDFLAGS to their Config.pm counterparts. =item * L<ExtUtils::ParseXS> has been upgraded from version 2.2210 to 3.16. Much of L<ExtUtils::ParseXS>, the module behind the XS compiler C<xsubpp>, was rewritten and cleaned up. It has been made somewhat more extensible and now finally uses strictures. The typemap logic has been moved into a separate module, L<ExtUtils::Typemaps>. See L</New Modules and Pragmata>, above. For a complete set of changes, please see the ExtUtils::ParseXS changelog, available on the CPAN. =item * L<File::Glob> has been upgraded from version 1.12 to 1.17. On Windows, tilde (~) expansion now checks the C<USERPROFILE> environment variable, after checking C<HOME>. It has a new C<:bsd_glob> export tag, intended to replace C<:glob>.
Like C<:glob> it overrides C<glob> with a function that does not split the glob pattern into words, but, unlike C<:glob>, it iterates properly in scalar context, instead of returning the last file. There are other changes affecting Perl's own C<glob> operator (which uses File::Glob internally, except on VMS). See L</Performance Enhancements> and L</Selected Bug Fixes>. =item * L<FindBin> updated from version 1.50 to 1.51. It no longer returns a wrong result if a script of the same name as the current one exists in the path and is executable. =item * L<HTTP::Tiny> has been upgraded from version 0.012 to 0.017. Added support for using C<$ENV{http_proxy}> to set the default proxy host. Adds additional shorthand methods for all common HTTP verbs, a C<post_form()> method for POST-ing x-www-form-urlencoded data and a C<www_form_urlencode()> utility method. =item * L<IO> has been upgraded from version 1.25_04 to 1.25_06, and L<IO::Handle> from version 1.31 to 1.33. Together, these upgrades fix a problem with IO::Handle's C<getline> and C<getlines> methods. When these methods are called on the special ARGV handle, the next file is automatically opened, as happens with the built-in C<E<lt>E<gt>> and C<readline> functions. But, unlike the built-ins, these methods were not respecting the caller's use of the L<open> pragma and applying the appropriate I/O layers to the newly-opened file [rt.cpan.org #66474]. =item * L<IPC::Cmd> has been upgraded from version 0.70 to 0.76. Capturing of command output (both C<STDOUT> and C<STDERR>) is now supported using L<IPC::Open3> on MSWin32 without requiring L<IPC::Run>. =item * L<IPC::Open3> has been upgraded from version 1.09 to 1.12. Fixes a bug which prevented use of C<open3> on Windows when C<*STDIN>, C<*STDOUT> or C<*STDERR> had been localized. Fixes a bug which prevented duplicating numeric file descriptors on Windows. C<open3> with "-" for the program name works once more. 
This was broken in version 1.06 (and hence in Perl 5.14.0) [perl #95748]. =item * L<Locale::Codes> has been upgraded from version 3.16 to 3.21. Added Language Extension codes (langext) and Language Variation codes (langvar) as defined in the IANA language registry. Added language codes from ISO 639-5. Added language/script codes from the IANA language subtag registry. Fixed an uninitialized value warning [rt.cpan.org #67438]. Fixed the return value for the all_XXX_codes and all_XXX_names functions [rt.cpan.org #69100]. Reorganized modules to move Locale::MODULE to Locale::Codes::MODULE to allow for cleaner future additions. The original four modules (Locale::Language, Locale::Currency, Locale::Country, Locale::Script) will continue to work, but all new sets of codes will be added in the Locale::Codes namespace. The code2XXX, XXX2code, all_XXX_codes, and all_XXX_names functions now support retired codes. All codesets may be specified by a constant or by their name now. Previously, they were specified only by a constant. The alias_code function exists for backward compatibility. It has been replaced by rename_country_code. The alias_code function will be removed some time after September, 2013. All work is now done in the central module (Locale::Codes). Previously, some was still done in the wrapper modules (Locale::Codes::*). Added Language Family codes (langfam) as defined in ISO 639-5. =item * L<Math::BigFloat> has been upgraded from version 1.993 to 1.997. The C<numify> method has been corrected to return a normalized Perl number (the result of C<0 + $thing>), instead of a string [rt.cpan.org #66732]. =item * L<Math::BigInt> has been upgraded from version 1.994 to 1.998. It provides a new C<bsgn> method that complements the C<babs> method. It fixes the internal C<objectify> function's handling of "foreign objects" so they are converted to the appropriate class (Math::BigInt or Math::BigFloat). =item * L<Math::BigRat> has been upgraded from version 0.2602 to 0.2603.
C<int()> on a Math::BigRat object containing -1/2 now creates a Math::BigInt containing 0, rather than -0. L<Math::BigInt> does not even support negative zero, so the resulting object was actually malformed [perl #95530]. =item * L<Math::Complex> has been upgraded from version 1.56 to 1.59 and L<Math::Trig> from version 1.2 to 1.22. Fixes include: correct copy constructor usage; fix polarwise formatting with numeric format specifier; and more stable C<great_circle_direction> algorithm. =item * L<Module::CoreList> has been upgraded from version 2.51 to 2.66. The C<corelist> utility now understands the C<-r> option for displaying Perl release dates and the C<--diff> option to print the set of modlib changes between two perl distributions. =item * L<Module::Metadata> has been upgraded from version 1.000004 to 1.000009. Adds C<provides> method to generate a CPAN META provides data structure correctly; use of C<package_versions_from_directory> is discouraged. =item * L<ODBM_File> has been upgraded from version 1.10 to 1.12. The XS code is now compiled with C<PERL_NO_GET_CONTEXT>, which will aid performance under ithreads. =item * L<open> has been upgraded from version 1.08 to 1.10. It no longer turns off layers on standard handles when invoked without the ":std" directive. Similarly, when invoked I<with> the ":std" directive, it now clears layers on STDERR before applying the new ones, and not just on STDIN and STDOUT [perl #92728]. =item * L<overload> has been upgraded from version 1.13 to 1.18. C<overload::Overloaded> no longer calls C<can> on the class, but uses another means to determine whether the object has overloading. It was never correct for it to call C<can>, as overloading does not respect AUTOLOAD. So classes that autoload methods and implement C<can> no longer have to account for overloading [perl #40333]. A warning is now produced for invalid arguments. See L</New Diagnostics>. =item * L<PerlIO::scalar> has been upgraded from version 0.11 to 0.14. 
(This is the module that implements C<< open $fh, '>', \$scalar >>.) It fixes a problem with C<< open my $fh, ">", \$scalar >> not working if C<$scalar> is a copy-on-write scalar. (5.14.2) It also fixes a hang that occurs with C<readline> or C<< <$fh> >> if a typeglob has been assigned to $scalar [perl #92258]. It no longer assumes during C<seek> that $scalar is a string internally. If it didn't crash, it was close to doing so [perl #92706]. Also, the internal print routine no longer assumes that the position set by C<seek> is valid, but extends the string to that position, filling the intervening bytes (between the old length and the seek position) with nulls [perl #78980]. Printing to an in-memory handle now works if the $scalar holds a reference, stringifying the reference before modifying it. References used to be treated as empty strings. Printing to an in-memory handle no longer crashes if the $scalar happens to hold a number internally, but no string buffer. Printing to an in-memory handle no longer creates scalars that confuse the regular expression engine [perl #108398]. =item * L<Pod::Functions> has been upgraded from version 1.04 to 1.05. F<Functions.pm> is now generated at perl build time from annotations in F<perlfunc.pod>. This will ensure that L<Pod::Functions> and L<perlfunc> remain in synchronisation. =item * L<Pod::Html> has been upgraded from version 1.11 to 1.1502. This is an extensive rewrite of Pod::Html to use L<Pod::Simple> under the hood. The output has changed significantly. =item * L<Pod::Perldoc> has been upgraded from version 3.15_03 to 3.17. It corrects the search paths on VMS [perl #90640]. (5.14.1) The B<-v> option now fetches the right section for C<$0>. This upgrade has numerous significant fixes. Consult its changelog on the CPAN for more information. =item * L<POSIX> has been upgraded from version 1.24 to 1.30. L<POSIX> no longer uses L<AutoLoader>. 
Any code which was relying on this implementation detail was buggy, and may fail because of this change. The module's Perl code has been considerably simplified, roughly halving the number of lines, with no change in functionality. The XS code has been refactored to reduce the size of the shared object by about 12%, with no change in functionality. More POSIX functions now have tests. C<sigsuspend> and C<pause> now run signal handlers before returning, as the whole point of these two functions is to wait until a signal has arrived, and then return I<after> it has been triggered. Delayed, or "safe", signals were preventing that from happening, possibly resulting in race conditions [perl #107216]. C<POSIX::sleep> is now a direct call into the underlying OS C<sleep> function, instead of being a Perl wrapper on C<CORE::sleep>. C<POSIX::dup2> now returns the correct value on Win32 (I<i.e.>, the file descriptor). C<POSIX::SigSet> C<sigsuspend> and C<sigpending> and C<POSIX::pause> now dispatch safe signals immediately before returning to their caller. C<POSIX::Termios::setattr> now defaults the third argument to C<TCSANOW>, instead of 0. On most platforms C<TCSANOW> is defined to be 0, but on some 0 is not a valid parameter, which caused a call with defaults to fail. =item * L<Socket> has been upgraded from version 1.94 to 2.001. It has new functions and constants for handling IPv6 sockets: pack_ipv6_mreq unpack_ipv6_mreq IPV6_ADD_MEMBERSHIP IPV6_DROP_MEMBERSHIP IPV6_MTU IPV6_MTU_DISCOVER IPV6_MULTICAST_HOPS IPV6_MULTICAST_IF IPV6_MULTICAST_LOOP IPV6_UNICAST_HOPS IPV6_V6ONLY =item * L<Storable> has been upgraded from version 2.27 to 2.34. It no longer turns copy-on-write scalars into read-only scalars when freezing and thawing. =item * L<Sys::Syslog> has been upgraded from version 0.27 to 0.29. This upgrade closes many outstanding bugs. =item * L<Term::ANSIColor> has been upgraded from version 3.00 to 3.01. 
Only an initial array reference is now interpreted as a list of colors, not any initial reference, allowing the colored function to work properly on objects with stringification defined. =item * L<Term::ReadLine> has been upgraded from version 1.07 to 1.09. Term::ReadLine now supports any event loop, including unpublished ones and simple L<IO::Select> loops, without the need to rewrite existing code for any particular framework [perl #108470]. =item * L<threads::shared> has been upgraded from version 1.37 to 1.40. Destructors on shared objects used to be ignored sometimes if the objects were referenced only by shared data structures. This has been mostly fixed, but destructors may still be ignored if the objects still exist at global destruction time [perl #98204]. =item * L<Unicode::Collate> has been upgraded from version 0.73 to 0.89. Updated to CLDR 1.9.1. Locales updated to CLDR 2.0: mk, mt, nb, nn, ro, ru, sk, sr, sv, uk, zh__pinyin, zh__stroke. Newly supported locales: bn, fa, ml, mr, or, pa, sa, si, si__dictionary, sr_Latn, sv__reformed, ta, te, th, ur, wae. Tailored compatibility ideographs as well as unified ideographs for the locales: ja, ko, zh__big5han, zh__gb2312han, zh__pinyin, zh__stroke. Locale/*.pl files are now searched for in @INC. =item * L<Unicode::Normalize> has been upgraded from version 1.10 to 1.14. Fixes for the removal of F<unicore/CompositionExclusions.txt> from core. =item * L<Unicode::UCD> has been upgraded from version 0.32 to 0.43. This adds four new functions: C<prop_aliases()> and C<prop_value_aliases()>, which are used to find all Unicode-approved synonyms for property names, or to convert from one name to another; C<prop_invlist> which returns all code points matching a given Unicode binary property; and C<prop_invmap> which returns the complete specification of a given Unicode property. =item * L<Win32API::File> has been upgraded from version 0.1101 to 0.1200.
Added SetStdHandle and GetStdHandle functions. =back =head2 Removed Modules and Pragmata As promised in Perl 5.14.0's release notes, the following modules have been removed from the core distribution, and if needed should be installed from CPAN instead. =over =item * L<Devel::DProf> has been removed from the Perl core. Prior version was 20110228.00. =item * L<Shell> has been removed from the Perl core. Prior version was 0.72_01. =item * Several old perl4-style libraries which have been deprecated with 5.14 are now removed: abbrev.pl assert.pl bigfloat.pl bigint.pl bigrat.pl cacheout.pl complete.pl ctime.pl dotsh.pl exceptions.pl fastcwd.pl flush.pl getcwd.pl getopt.pl getopts.pl hostname.pl importenv.pl lib/find{,depth}.pl look.pl newgetopt.pl open2.pl open3.pl pwd.pl shellwords.pl stat.pl tainted.pl termcap.pl timelocal.pl They can be found on CPAN as L<Perl4::CoreLibs>. =back =head1 Documentation =head2 New Documentation =head3 L<perldtrace> L<perldtrace> describes Perl's DTrace support, lists the provided probes, and gives examples of their use. =head3 L<perlexperiment> This document is intended to provide a list of experimental features in Perl. It is still a work in progress. =head3 L<perlootut> This is a new OO tutorial. It focuses on basic OO concepts, and then recommends that readers choose an OO framework from CPAN. =head3 L<perlxstypemap> The new manual describes the XS typemapping mechanism in unprecedented detail and combines new documentation with information extracted from L<perlxs> and the previously unofficial list of all core typemaps. =head2 Changes to Existing Documentation =head3 L<perlapi> =over 4 =item * The HV API has long accepted negative lengths to show that the key is in UTF8. This is now documented. =item * The C<boolSV()> macro is now documented. =back =head3 L<perlfunc> =over 4 =item * C<dbmopen> treats a 0 mode as a special case, that prevents a nonexistent file from being created.
This has been the case since Perl 5.000, but was never documented anywhere. Now the perlfunc entry mentions it [perl #90064]. =item * As an accident of history, C<open $fh, '<:', ...> applies the default layers for the platform (C<:raw> on Unix, C<:crlf> on Windows), ignoring whatever is declared by L<open.pm|open>. This seems such a useful feature it has been documented in L<perlfunc|perlfunc/open> and L<open>. =item * The entry for C<split> has been rewritten. It is now far clearer than before. =back =head3 L<perlguts> =over 4 =item * A new section, L<Autoloading with XSUBs|perlguts/Autoloading with XSUBs>, has been added, which explains the two APIs for accessing the name of the autoloaded sub. =item * Some function descriptions in L<perlguts> were confusing, as it was not clear whether they referred to the function above or below the description. This has been clarified [perl #91790]. =back =head3 L<perlobj> =over 4 =item * This document has been rewritten from scratch, and its coverage of various OO concepts has been expanded. =back =head3 L<perlop> =over 4 =item * Documentation of the smartmatch operator has been reworked and moved from perlsyn to perlop where it belongs. It has also been corrected for the case of C<undef> on the left-hand side. The list of different smart match behaviors had an item in the wrong place. =item * Documentation of the ellipsis statement (C<...>) has been reworked and moved from perlop to perlsyn. =item * The explanation of bitwise operators has been expanded to explain how they work on Unicode strings (5.14.1). =item * More examples for C<m//g> have been added (5.14.1). =item * The C<<< <<\FOO >>> here-doc syntax has been documented (5.14.1). =back =head3 L<perlpragma> =over 4 =item * There is now a standard convention for naming keys in the C<%^H>, documented under L<Key naming|perlpragma/Key naming>. 
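As a rough illustration of that naming convention, the following sketch defines a hypothetical pragma, C<MyPragma>, inline. The package name, the C<in_effect> helper and the key C<MyPragma/enabled> are all assumptions made for this example; the only point being shown is that the key stored in C<%^H> begins with the module's own name followed by a slash, as L<perlpragma> suggests, so that unrelated pragmata do not collide.

```perl
use strict;
use warnings;

# Hypothetical pragma, defined inline so the example is self-contained.
BEGIN {
    package MyPragma;

    # Keys in %^H are prefixed with the package name and a "/",
    # so that unrelated pragmata do not step on each other.
    sub import   { $^H{"MyPragma/enabled"} = 1 }
    sub unimport { $^H{"MyPragma/enabled"} = 0 }

    # Look up the hint hash that was in effect where the caller was compiled.
    sub in_effect {
        my $level = shift // 0;
        my $hints = (caller($level))[10];
        return $hints && $hints->{"MyPragma/enabled"} ? 1 : 0;
    }

    $INC{"MyPragma.pm"} = __FILE__;    # let "use MyPragma" succeed
}

my @seen;
{
    use MyPragma;
    push @seen, MyPragma::in_effect();        # enabled in this block
    {
        no MyPragma;
        push @seen, MyPragma::in_effect();    # disabled in the inner block
    }
    push @seen, MyPragma::in_effect();        # enabled again
}
push @seen, MyPragma::in_effect();            # never enabled at file scope

print "@seen\n";    # prints "1 0 1 0"
```

Because C<%^H> is lexically scoped at compile time, the setting made by C<no MyPragma> disappears again at the end of the inner block.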
=back =head3 L<perlsec/Laundering and Detecting Tainted Data> =over 4 =item * The example function for checking for taintedness contained a subtle error. C<$@> needs to be localized to prevent its changing this global's value outside the function. The preferred method to check for this remains L<Scalar::Util/tainted>. =back =head3 L<perllol> =over =item * L<perllol> has been expanded with examples using the new C<push $scalar> syntax introduced in Perl 5.14.0 (5.14.1). =back =head3 L<perlmod> =over =item * L<perlmod> now states explicitly that some types of explicit symbol table manipulation are not supported. This codifies what was effectively already the case [perl #78074]. =back =head3 L<perlpodstyle> =over 4 =item * The tips on which formatting codes to use have been corrected and greatly expanded. =item * There are now a couple of example one-liners for previewing POD files after they have been edited. =back =head3 L<perlre> =over =item * The C<(*COMMIT)> directive is now listed in the right section (L<Verbs without an argument|perlre/Verbs without an argument>). =back =head3 L<perlrun> =over =item * L<perlrun> has undergone a significant clean-up. Most notably, the B<-0x...> form of the B<-0> flag has been clarified, and the final section on environment variables has been corrected and expanded (5.14.1). =back =head3 L<perlsub> =over =item * The ($;) prototype syntax, which has existed for rather a long time, is now documented in L<perlsub>. It lets a unary function have the same precedence as a list operator. =back =head3 L<perltie> =over =item * The required syntax for tying handles has been documented. =back =head3 L<perlvar> =over =item * The documentation for L<$!|perlvar/$!> has been corrected and clarified. It used to state that $! could be C<undef>, which is not the case. It was also unclear whether system calls set C's C<errno> or Perl's C<$!> [perl #91614]. 
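A small sketch of the behavior described for C<$!>, under the assumption that the path used below does not exist on the system running it (the path itself is made up for the example). It copies C<$!> immediately after the failed call, since a later system call may change it, and checks that the variable is defined.

```perl
use strict;
use warnings;

# Hypothetical path that is assumed not to exist.
my $path = "/nonexistent-dir-for-example/file.txt";

my ($errno, $message, $is_defined);
if (!open(my $fh, '<', $path)) {
    # Copy $! at once: later system calls may overwrite it.
    $errno      = $! + 0;       # numeric value, taken from C's errno
    $message    = "$!";         # the corresponding message string
    $is_defined = defined $!;   # $! is a defined value here
}

printf "errno=%d defined=%s\n", $errno // 0, $is_defined ? "yes" : "no";
```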
=item * Documentation for L<$$|perlvar/$$> has been amended with additional cautions regarding changing the process ID. =back =head3 Other Changes =over 4 =item * L<perlxs> was extended with documentation on inline typemaps. =item * L<perlref> has a new L<Circular References|perlref/Circular References> section explaining how circularities may not be freed and how to solve that with weak references. =item * Parts of L<perlapi> were clarified, and Perl equivalents of some C functions have been added as an additional mode of exposition. =item * A few parts of L<perlre> and L<perlrecharclass> were clarified. =back =head2 Removed Documentation =head3 Old OO Documentation The old OO tutorials, perltoot, perltooc, and perlboot, have been removed. The perlbot (bag of object tricks) document has been removed as well. =head3 Development Deltas The perldelta files for development releases are no longer packaged with perl. These can still be found in the perl source code repository. =head1 Diagnostics The following additions or changes have been made to diagnostic output, including warnings and fatal error messages. For the complete list of diagnostic messages, see L<perldiag>. =head2 New Diagnostics =head3 New Errors =over 4 =item * L<Cannot set tied @DB::args|perldiag/"Cannot set tied @DB::args"> This error occurs when C<caller> tries to set C<@DB::args> but finds it tied. Before this error was added, it used to crash instead. =item * L<Cannot tie unreifiable array|perldiag/"Cannot tie unreifiable array"> This error is part of a safety check that the C<tie> operator does before tying a special array like C<@_>. You should never see this message. =item * L<&CORE::%s cannot be called directly|perldiag/"&CORE::%s cannot be called directly"> This occurs when a subroutine in the C<CORE::> namespace is called with C<&foo> syntax or through a reference. Some subroutines in this package cannot yet be called that way, but must be called as barewords. 
See L</Subroutines in the C<CORE> namespace>, above. =item * L<Source filters apply only to byte streams|perldiag/"Source filters apply only to byte streams"> This new error occurs when you try to activate a source filter (usually by loading a source filter module) within a string passed to C<eval> under the C<unicode_eval> feature. =back =head3 New Warnings =over 4 =item * L<defined(@array) is deprecated|perldiag/"defined(@array) is deprecated"> The long-deprecated C<defined(@array)> now also warns for package variables. Previously it issued a warning for lexical variables only. =item * L<length() used on %s|perldiag/length() used on %s> This new warning occurs when C<length> is used on an array or hash, instead of C<scalar(@array)> or C<scalar(keys %hash)>. =item * L<lvalue attribute %s already-defined subroutine|perldiag/"lvalue attribute %s already-defined subroutine"> L<attributes.pm|attributes> now emits this warning when the :lvalue attribute is applied to a Perl subroutine that has already been defined, as doing so can have unexpected side-effects. =item * L<overload arg '%s' is invalid|perldiag/"overload arg '%s' is invalid"> This warning, in the "overload" category, is produced when the overload pragma is given an argument it doesn't recognize, presumably a mistyped operator. =item * L<$[ used in %s (did you mean $] ?)|perldiag/"$[ used in %s (did you mean $] ?)"> This new warning exists to catch the mistaken use of C<$[> in version checks. C<$]>, not C<$[>, contains the version number. =item * L<Useless assignment to a temporary|perldiag/"Useless assignment to a temporary"> Assigning to a temporary scalar returned from an lvalue subroutine now produces this warning [perl #31946]. =item * L<Useless use of \E|perldiag/"Useless use of \E"> C<\E> does nothing unless preceded by C<\Q>, C<\L> or C<\U>. 
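The following sketch shows one way to observe the C<\E> warning. It compiles two string literals at run time with C<eval> so that a C<$SIG{__WARN__}> handler can collect any compile-time warnings; the variable names are illustrative only, and the number of warnings collected depends on the perl version running the code.

```perl
use strict;
use warnings;

my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };

# \E with no preceding \Q, \L or \U: it has nothing to terminate.
my $plain = eval q{ "abc\Edef" };

# \E terminating \Q: this is the intended use.
my $quoted = eval q{ "\Qa.b\E.c" };

print "plain:  $plain\n";     # prints "plain:  abcdef"
print "quoted: $quoted\n";    # prints "quoted: a\.b.c"
print "warning: $_" for @warnings;
```

In both cases the resulting string is unaffected by the warning; only the first literal is a candidate for it.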
=back =head2 Removed Errors =over =item * "sort is now a reserved word" This error used to occur when C<sort> was called without arguments, followed by C<;> or C<)>. (E.g., C<sort;> would die, but C<{sort}> was OK.) This error message was added in Perl 3 to catch code like C<close(sort)> which would no longer work. More than two decades later, this message is no longer appropriate. Now C<sort> without arguments is always allowed, and returns an empty list, as it did in those cases where it was already allowed [perl #90030]. =back =head2 Changes to Existing Diagnostics =over 4 =item * The "Applying pattern match..." or similar warning produced when an array or hash is on the left-hand side of the C<=~> operator now mentions the name of the variable. =item * The "Attempt to free non-existent shared string" has had the spelling of "non-existent" corrected to "nonexistent". It was already listed with the correct spelling in L<perldiag>. =item * The error messages for using C<default> and C<when> outside a topicalizer have been standardized to match the messages for C<continue> and loop controls. They now read 'Can't "default" outside a topicalizer' and 'Can't "when" outside a topicalizer'. They both used to be 'Can't use when() outside a topicalizer' [perl #91514]. =item * The message, "Code point 0x%X is not Unicode, no properties match it; all inverse properties do" has been changed to "Code point 0x%X is not Unicode, all \p{} matches fail; all \P{} matches succeed". =item * Redefinition warnings for constant subroutines used to be mandatory, even occurring under C<no warnings>. Now they respect the L<warnings> pragma. =item * The "glob failed" warning message is now suppressible via C<no warnings> [perl #111656]. =item * The L<Invalid version format|perldiag/"Invalid version format (%s)"> error message now says "negative version number" within the parentheses, rather than "non-numeric data", for negative numbers. 
=item * The two warnings L<Possible attempt to put comments in qw() list|perldiag/"Possible attempt to put comments in qw() list"> and L<Possible attempt to separate words with commas|perldiag/"Possible attempt to separate words with commas"> are no longer mutually exclusive: the same C<qw> construct may produce both. =item * The uninitialized warning for C<y///r> when C<$_> is implicit and undefined now mentions the variable name, just like the non-/r variation of the operator. =item * The 'Use of "foo" without parentheses is ambiguous' warning has been extended to apply also to user-defined subroutines with a (;$) prototype, and not just to built-in functions. =item * Warnings that mention the names of lexical (C<my>) variables with Unicode characters in them now respect the presence or absence of the C<:utf8> layer on the output handle, instead of outputting UTF8 regardless. Also, the correct names are included in the strings passed to C<$SIG{__WARN__}> handlers, rather than the raw UTF8 bytes. =back =head1 Utility Changes =head3 L<h2ph> =over 4 =item * L<h2ph> used to generate code of the form unless(defined(&FOO)) { sub FOO () {42;} } But the subroutine is a compile-time declaration, and is hence unaffected by the condition. It has now been corrected to emit a string C<eval> around the subroutine [perl #99368]. =back =head3 L<splain> =over 4 =item * F<splain> no longer emits backtraces with the first line number repeated. This: Uncaught exception from user code: Cannot fwiddle the fwuddle at -e line 1. at -e line 1 main::baz() called at -e line 1 main::bar() called at -e line 1 main::foo() called at -e line 1 has become this: Uncaught exception from user code: Cannot fwiddle the fwuddle at -e line 1. main::baz() called at -e line 1 main::bar() called at -e line 1 main::foo() called at -e line 1 =item * Some error messages consist of multiple lines that are listed as separate entries in L<perldiag>. 
splain has been taught to find the separate entries in these cases, instead of simply failing to find the message. =back =head3 L<zipdetails> =over 4 =item * This is a new utility, included as part of an L<IO::Compress::Base> upgrade. L<zipdetails> displays information about the internal record structure of the zip file. It is not concerned with displaying any details of the compressed data stored in the zip file. =back =head1 Configuration and Compilation =over 4 =item * F<regexp.h> has been modified for compatibility with GCC's B<-Werror> option, as used by some projects that include perl's header files (5.14.1). =item * C<USE_LOCALE{,_COLLATE,_CTYPE,_NUMERIC}> have been added to the output of perl -V as they affect the behavior of the interpreter binary (albeit in only a small area). =item * The code and tests for L<IPC::Open2> have been moved from F<ext/IPC-Open2> into F<ext/IPC-Open3>, as C<IPC::Open2::open2()> is implemented as a thin wrapper around C<IPC::Open3::_open3()>, and hence is very tightly coupled to it. =item * The magic types and magic vtables are now generated from data in a new script F<regen/mg_vtable.pl>, instead of being maintained by hand. As different EBCDIC variants can't agree on the code point for '~', the character to code point conversion is done at build time by F<generate_uudmap> to a new generated header F<mg_data.h>. C<PL_vtbl_bm> and C<PL_vtbl_fm> are now defined by the pre-processor as C<PL_vtbl_regexp>, instead of being distinct C variables. C<PL_vtbl_sig> has been removed. =item * Building with C<-DPERL_GLOBAL_STRUCT> works again. This configuration is not generally used. =item * Perl configured with I<MAD> now correctly frees C<MADPROP> structures when OPs are freed. C<MADPROP>s are now allocated with C<PerlMemShared_malloc()>. =item * F<makedef.pl> has been refactored. This should have no noticeable effect on any of the platforms that use it as part of their build (AIX, VMS, Win32).
=item * C<useperlio> can no longer be disabled. =item * The file F<global.sym> is no longer needed, and has been removed. It contained a list of all exported functions, one of the files generated by F<regen/embed.pl> from data in F<embed.fnc> and F<regen/opcodes>. The code has been refactored so that the only user of F<global.sym>, F<makedef.pl>, now reads F<embed.fnc> and F<regen/opcodes> directly, removing the need to store the list of exported functions in an intermediate file. As F<global.sym> was never installed, this change should not be visible outside the build process. =item * F<pod/buildtoc>, used by the build process to build L<perltoc>, has been refactored and simplified. It now contains only code to build L<perltoc>; the code to regenerate Makefiles has been moved to F<Porting/pod_rules.pl>. It's a bug if this change has any material effect on the build process. =item * F<pod/roffitall> is now built by F<pod/buildtoc>, instead of being shipped with the distribution. Its list of manpages is now generated (and therefore current). See also RT #103202 for an unresolved related issue. =item * The man page for C<XS::Typemap> is no longer installed. C<XS::Typemap> is a test module which is not installed, hence installing its documentation makes no sense. =item * The -Dusesitecustomize and -Duserelocatableinc options now work together properly. =back =head1 Platform Support =head2 Platform-Specific Notes =head3 Cygwin =over 4 =item * Since version 1.7, Cygwin supports native UTF-8 paths. If Perl is built under that environment, directory and filenames will be UTF-8 encoded. =item * Cygwin does not initialize all original Win32 environment variables. See F<README.cygwin> for a discussion of the newly-added C<Cygwin::sync_winenv()> function [perl #110190] and for further links. =back =head3 HP-UX =over 4 =item * HP-UX PA-RISC/64 now supports gcc-4.x A fix to correct the socketsize now makes the test suite pass on HP-UX PA-RISC for 64bitall builds. 
(5.14.2) =back =head3 VMS =over 4 =item * Remove unnecessary includes, fix miscellaneous compiler warnings and close some unclosed comments on F<vms/vms.c>. =item * Remove sockadapt layer from the VMS build. =item * Explicit support for VMS versions before v7.0 and DEC C versions before v6.0 has been removed. =item * Since Perl 5.10.1, the home-grown C<stat> wrapper has been unable to distinguish between a directory name containing an underscore and an otherwise-identical filename containing a dot in the same position (e.g., t/test_pl as a directory and t/test.pl as a file). This problem has been corrected. =item * The build on VMS now permits names of the resulting symbols in C code for Perl longer than 31 characters. Symbols like C<Perl__it_was_the_best_of_times_it_was_the_worst_of_times> can now be created freely without causing the VMS linker to seize up. =back =head3 GNU/Hurd =over 4 =item * Numerous build and test failures on GNU/Hurd have been resolved with hints for building DBM modules, detection of the library search path, and enabling of large file support. =back =head3 OpenVOS =over 4 =item * Perl is now built with dynamic linking on OpenVOS, the minimum supported version of which is now Release 17.1.0. =back =head3 SunOS The CC workshop C++ compiler is now detected and used on systems that ship without cc. =head1 Internal Changes =over 4 =item * The compiled representation of formats is now stored via the C<mg_ptr> of their C<PERL_MAGIC_fm>. Previously it was stored in the string buffer, beyond C<SvLEN()>, the regular end of the string. C<SvCOMPILED()> and C<SvCOMPILED_{on,off}()> now exist solely for compatibility for XS code. The first is always 0, the other two now no-ops. (5.14.1) =item * Some global variables have been marked C<const>, members in the interpreter structure have been re-ordered, and the opcodes have been re-ordered. The op C<OP_AELEMFAST> has been split into C<OP_AELEMFAST> and C<OP_AELEMFAST_LEX>. 
=item * When emptying a hash of its elements (e.g., via undef(%h), or %h=()), the HvARRAY field is no longer temporarily zeroed. Any destructors called on the freed elements see the remaining elements. Thus, %h=() becomes more like C<delete $h{$_} for keys %h>. =item * Boyer-Moore compiled scalars are now PVMGs, and the Boyer-Moore tables are now stored via the mg_ptr of their C<PERL_MAGIC_bm>. Previously they were PVGVs, with the tables stored in the string buffer, beyond C<SvLEN()>. This eliminates the last place where the core stores data beyond C<SvLEN()>. =item * Simplified logic in C<Perl_sv_magic()> introduces a small change of behavior for error cases involving unknown magic types. Previously, if C<Perl_sv_magic()> was passed a magic type unknown to it, it would =over =item 1. Croak "Modification of a read-only value attempted" if read only =item 2. Return without error if the SV happened to already have this magic =item 3. otherwise croak "Don't know how to handle magic of type \\%o" =back Now it will always croak "Don't know how to handle magic of type \\%o", even on read-only values, or SVs which already have the unknown magic type. =item * The experimental C<fetch_cop_label> function has been renamed to C<cop_fetch_label>. =item * The C<cop_store_label> function has been added to the API, but is experimental. =item * F<embedvar.h> has been simplified, and one level of macro indirection for PL_* variables has been removed for the default (non-multiplicity) configuration. PERLVAR*() macros now directly expand their arguments to tokens such as C<PL_defgv>, instead of expanding to C<PL_Idefgv>, with F<embedvar.h> defining a macro to map C<PL_Idefgv> to C<PL_defgv>. XS code which has unwarranted chumminess with the implementation may need updating. =item * An API has been added to explicitly choose whether to export XSUB symbols. More detail can be found in the comments for commit e64345f8.
=item * The C<is_gv_magical_sv> function has been eliminated and merged with C<gv_fetchpvn_flags>. It used to be called to determine whether a GV should be autovivified in rvalue context. Now it has been replaced with a new C<GV_ADDMG> flag (not part of the API). =item * The returned code point from the function C<utf8n_to_uvuni()> when the input is malformed UTF-8, malformations are allowed, and C<utf8> warnings are off is now the Unicode REPLACEMENT CHARACTER whenever the malformation is such that no well-defined code point can be computed. Previously the returned value was essentially garbage. The only malformations that have well-defined values are a zero-length string (0 is the return), and overlong UTF-8 sequences. =item * Padlists are now marked C<AvREAL>; i.e., reference-counted. They have always been reference-counted, but were not marked real, because F<pad.c> did its own clean-up, instead of using the usual clean-up code in F<sv.c>. That caused problems in thread cloning, so now the C<AvREAL> flag is on, but is turned off in F<pad.c> right before the padlist is freed (after F<pad.c> has done its custom freeing of the pads). =item * All C files that make up the Perl core have been converted to UTF-8. =item * These new functions have been added as part of the work on Unicode symbols: HvNAMELEN HvNAMEUTF8 HvENAMELEN HvENAMEUTF8 gv_init_pv gv_init_pvn gv_init_pvsv gv_fetchmeth_pv gv_fetchmeth_pvn gv_fetchmeth_sv gv_fetchmeth_pv_autoload gv_fetchmeth_pvn_autoload gv_fetchmeth_sv_autoload gv_fetchmethod_pv_flags gv_fetchmethod_pvn_flags gv_fetchmethod_sv_flags gv_autoload_pv gv_autoload_pvn gv_autoload_sv newGVgen_flags sv_derived_from_pv sv_derived_from_pvn sv_derived_from_sv sv_does_pv sv_does_pvn sv_does_sv whichsig_pv whichsig_pvn whichsig_sv newCONSTSUB_flags The gv_fetchmethod_*_flags functions, like gv_fetchmethod_flags, are experimental and may change in a future release. =item * The following functions were added. 
These are I<not> part of the API: GvNAMEUTF8 GvENAMELEN GvENAME_HEK CopSTASH_flags CopSTASH_flags_set PmopSTASH_flags PmopSTASH_flags_set sv_sethek HEKfARG There is also a C<HEKf> macro corresponding to C<SVf>, for interpolating HEKs in formatted strings. =item * C<sv_catpvn_flags> takes a couple of new internal-only flags, C<SV_CATBYTES> and C<SV_CATUTF8>, which tell it whether the char array to be concatenated is UTF8. This allows for more efficient concatenation than creating temporary SVs to pass to C<sv_catsv>. =item * For XS AUTOLOAD subs, $AUTOLOAD is set once more, as it was in 5.6.0. This is in addition to setting C<SvPVX(cv)>, for compatibility with 5.8 to 5.14. See L<perlguts/Autoloading with XSUBs>. =item * Perl now checks whether the array (the linearized isa) returned by a MRO plugin begins with the name of the class itself, for which the array was created, instead of assuming that it does. This prevents the first element from being skipped during method lookup. It also means that C<mro::get_linear_isa> may return an array with one more element than the MRO plugin provided [perl #94306]. =item * C<PL_curstash> is now reference-counted. =item * There are now feature bundle hints in C<PL_hints> (C<$^H>) that version declarations use, to avoid having to load F<feature.pm>. One setting of the hint bits indicates a "custom" feature bundle, which means that the entries in C<%^H> still apply. F<feature.pm> uses that. The C<HINT_FEATURE_MASK> macro is defined in F<perl.h> along with other hints. Other macros for setting and testing features and bundles are in the new F<feature.h>. C<FEATURE_IS_ENABLED> (which has moved to F<feature.h>) is no longer used throughout the codebase, but more specific macros, e.g., C<FEATURE_SAY_IS_ENABLED>, that are defined in F<feature.h>. =item * F<lib/feature.pm> is now a generated file, created by the new F<regen/feature.pl> script, which also generates F<feature.h>. =item * Tied arrays are now always C<AvREAL>. 
If C<@_> or C<DB::args> is tied, it is reified first, to make sure this is always the case. =item * Two new functions C<utf8_to_uvchr_buf()> and C<utf8_to_uvuni_buf()> have been added. These are the same as C<utf8_to_uvchr> and C<utf8_to_uvuni> (which are now deprecated), but take an extra parameter that is used to guard against reading beyond the end of the input string. See L<perlapi/utf8_to_uvchr_buf> and L<perlapi/utf8_to_uvuni_buf>. =item * The regular expression engine now does TRIE case insensitive matches under Unicode. This may change the output of C<< use re 'debug'; >>, and will speed up various things. =item * There is a new C<wrap_op_checker()> function, which provides a thread-safe alternative to writing to C<PL_check> directly. =back =head1 Selected Bug Fixes =head2 Array and hash =over =item * A bug has been fixed that would cause a "Use of freed value in iteration" error if the next two hash elements that would be iterated over are deleted [perl #85026]. (5.14.1) =item * Deleting the current hash iterator (the hash element that would be returned by the next call to C<each>) in void context used not to free it [perl #85026]. =item * Deletion of methods via C<delete $Class::{method}> syntax used to update method caches if called in void context, but not scalar or list context. =item * When hash elements are deleted in void context, the internal hash entry is now freed before the value is freed, to prevent destructors called by that latter freeing from seeing the hash in an inconsistent state. It was possible to cause double-frees if the destructor freed the hash itself [perl #100340]. =item * A C<keys> optimization in Perl 5.12.0 to make it faster on empty hashes caused C<each> not to reset the iterator if called after the last element was deleted. =item * Freeing deeply nested hashes no longer crashes [perl #44225]. =item * It is possible from XS code to create hashes with elements that have no values. 
The hash element and slice operators used to crash when handling these in lvalue context. They now produce a "Modification of non-creatable hash value attempted" error message. =item * If list assignment to a hash or array triggered destructors that freed the hash or array itself, a crash would ensue. This is no longer the case [perl #107440]. =item * It used to be possible to free the typeglob of a localized array or hash (e.g., C<local @{"x"}; delete $::{x}>), resulting in a crash on scope exit. =item * Some core bugs affecting L<Hash::Util> have been fixed: locking a hash element that is a glob copy no longer causes the next assignment to it to corrupt the glob (5.14.2), and unlocking a hash element that holds a copy-on-write scalar no longer causes modifications to that scalar to modify other scalars that were sharing the same string buffer. =back =head2 C API fixes =over =item * The C<newHVhv> XS function now works on tied hashes, instead of crashing or returning an empty hash. =item * The C<SvIsCOW> C macro now returns false for read-only copies of typeglobs, such as those created by: $hash{elem} = *foo; Hash::Util::lock_value %hash, 'elem'; It used to return true. =item * The C<SvPVutf8> C function no longer tries to modify its argument, which used to result in errors [perl #108994]. =item * C<SvPVutf8> now works properly with magical variables. =item * C<SvPVbyte> now works properly on non-PVs. =item * When presented with malformed UTF-8 input, the XS-callable functions C<is_utf8_string()>, C<is_utf8_string_loc()>, and C<is_utf8_string_loclen()> could read beyond the end of the input string by up to 12 bytes. This no longer happens [perl #32080]. However, C<is_utf8_char()> currently still has this defect; see L</is_utf8_char()> above. =item * The C-level C<pregcomp> function could become confused about whether the pattern was in UTF8 if the pattern was an overloaded, tied, or otherwise magical scalar [perl #101940].
=back =head2 Compile-time hints =over =item * Tying C<%^H> no longer causes perl to crash or ignore the contents of C<%^H> when entering a compilation scope [perl #106282]. =item * C<eval $string> and C<require> used not to localize C<%^H> during compilation if it was empty at the time the C<eval> call itself was compiled. This could lead to scary side effects, like C<use re "/m"> enabling other flags that the surrounding code was trying to enable for its caller [perl #68750]. =item * C<eval $string> and C<require> no longer localize hints (C<$^H> and C<%^H>) at run time, but only during compilation of the $string or required file. This makes C<BEGIN { $^H{foo}=7 }> equivalent to C<BEGIN { eval '$^H{foo}=7' }> [perl #70151]. =item * Creating a BEGIN block from XS code (via C<newXS> or C<newATTRSUB>) would, on completion, make the hints of the current compiling code the current hints. This could cause warnings to occur in a non-warning scope. =back =head2 Copy-on-write scalars Copy-on-write or shared hash key scalars were introduced in 5.8.0, but most Perl code did not encounter them (they were used mostly internally). Perl 5.10.0 extended them, such that assigning C<__PACKAGE__> or a hash key to a scalar would make it copy-on-write. Several parts of Perl were not updated to account for them, but have now been fixed. =over =item * C<utf8::decode> had a nasty bug that would modify copy-on-write scalars' string buffers in place (i.e., skipping the copy). This could result in hashes having two elements with the same key [perl #91834]. (5.14.2) =item * Lvalue subroutines were not allowing COW scalars to be returned. This was fixed for lvalue scalar context in Perl 5.12.3 and 5.14.0, but list context was not fixed until this release. =item * Elements of restricted hashes (see the L<fields> pragma) containing copy-on-write values couldn't be deleted, nor could such hashes be cleared (C<%hash = ()>). 
(5.14.2) =item * Localizing a tied variable used to make it read-only if it contained a copy-on-write string. (5.14.2) =item * Assigning a copy-on-write string to a stash element no longer causes a double free. Regardless of this change, the results of such assignments are still undefined. =item * Assigning a copy-on-write string to a tied variable no longer stops that variable from being tied if it happens to be a PVMG or PVLV internally. =item * Doing a substitution on a tied variable returning a copy-on-write scalar used to cause an assertion failure or an "Attempt to free nonexistent shared string" warning. =item * This one is a regression from 5.12: In 5.14.0, the bitwise assignment operators C<|=>, C<^=> and C<&=> started leaving the left-hand side undefined if it happened to be a copy-on-write string [perl #108480]. =item * L<Storable>, L<Devel::Peek> and L<PerlIO::scalar> had similar problems. See L</Updated Modules and Pragmata>, above. =back =head2 The debugger =over =item * F<dumpvar.pl>, and therefore the C<x> command in the debugger, have been fixed to handle objects blessed into classes whose names contain "=". The contents of such objects used not to be dumped [perl #101814]. =item * The "R" command for restarting a debugger session has been fixed to work on Windows, or any other system lacking a C<POSIX::_SC_OPEN_MAX> constant [perl #87740]. =item * The C<#line 42 foo> directive used not to update the arrays of lines used by the debugger if it occurred in a string eval. This was partially fixed in 5.14, but it worked only for a single C<#line 42 foo> in each eval. Now it works for multiple. =item * When subroutine calls are intercepted by the debugger, the name of the subroutine or a reference to it is stored in C<$DB::sub>, for the debugger to access. Sometimes (such as C<$foo = *bar; undef *bar; &$foo>) C<$DB::sub> would be set to a name that could not be used to find the subroutine, and so the debugger's attempt to call it would fail. 
Now the check to see whether a reference is needed is more robust, so those problems should not happen anymore [rt.cpan.org #69862]. =item * Every subroutine has a filename associated with it that the debugger uses. The one associated with constant subroutines used to be misallocated when cloned under threads. Consequently, debugging threaded applications could result in memory corruption [perl #96126]. =back =head2 Dereferencing operators =over =item * C<defined(${"..."})>, C<defined(*{"..."})>, etc., used to return true for most, but not all, built-in variables, if they had not been used yet. This bug affected C<${^GLOBAL_PHASE}> and C<${^UTF8CACHE}>, among others. It also used to return false if the package name was given as well (C<${"::!"}>) [perl #97978, #97492]. =item * Perl 5.10.0 introduced a similar bug: C<defined(*{"foo"})> where "foo" represents the name of a built-in global variable used to return false if the variable had never been used before, but only on the I<first> call. This, too, has been fixed. =item * Since 5.6.0, C<*{ ... }> has been inconsistent in how it treats undefined values. It would die in strict mode or lvalue context for most undefined values, but would treat the specific scalar returned by C<undef()> (C<&PL_sv_undef> internally) as the empty string (with a warning). This has been corrected. C<undef()> is now treated like other undefined scalars, as in Perl 5.005. =back =head2 Filehandle, last-accessed Perl has an internal variable that stores the last filehandle to be accessed. It is used by C<$.> and by C<tell> and C<eof> without arguments.
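As a rough illustration of the last-accessed filehandle described above, the following sketch reads one line from an in-memory handle and then consults C<$.>, C<tell> and C<eof> without naming the handle. The variable names and the in-memory file are assumptions made for this example; they do not come from the Perl source or its test suite.

```perl
use strict;
use warnings;

# Hypothetical data: an in-memory file, so the sketch needs nothing on disk.
my $text = "one\ntwo\nthree\n";
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";

my $line = <$fh>;    # reading makes $fh the last-accessed filehandle

# None of these name $fh; each consults the last-accessed handle.
my $lineno = $.;     # line number of the last-accessed handle
my $pos    = tell;   # byte offset in the last-accessed handle
my $at_eof = eof;    # true once the last-accessed handle is exhausted

printf "line %d, offset %d, eof: %s\n", $lineno, $pos, $at_eof ? 'yes' : 'no';

close $fh;
```

Note that C<eof> is written without parentheses here: C<eof()> with empty parentheses refers to the files named on the command line rather than to the last-accessed handle.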
=over =item * It used to be possible to set this internal variable to a glob copy and then modify that glob copy to be something other than a glob, and still have the last-accessed filehandle associated with the variable after assigning a glob to it again: my $foo = *STDOUT; # $foo is a glob copy <$foo>; # $foo is now the last-accessed handle $foo = 3; # no longer a glob $foo = *STDERR; # still the last-accessed handle Now the C<$foo = 3> assignment unsets that internal variable, so there is no last-accessed filehandle, just as if C<< <$foo> >> had never happened. This also prevents some unrelated handle from becoming the last-accessed handle if $foo falls out of scope and the same internal SV gets used for another handle [perl #97988]. =item * A regression in 5.14 caused these statements not to set that internal variable: my $fh = *STDOUT; tell $fh; eof $fh; seek $fh, 0,0; tell *$fh; eof *$fh; seek *$fh, 0,0; readline *$fh; This is now fixed, but C<tell *{ *$fh }> still has the problem, and it is not clear how to fix it [perl #106536]. =back =head2 Filetests and C<stat> The term "filetests" refers to the operators that consist of a hyphen followed by a single letter: C<-r>, C<-x>, C<-M>, etc. The term "stacked" when applied to filetests means followed by another filetest operator sharing the same operand, as in C<-r -x -w $foo>. =over =item * C<stat> produces more consistent warnings. It no longer warns for "_" [perl #71002] and no longer skips the warning at times for other unopened handles. It no longer warns about an unopened handle when the operating system's C<fstat> function fails. =item * C<stat> would sometimes return negative numbers for large inode numbers, because it was using the wrong internal C type [perl #84590]. =item * C<lstat> is documented to fall back to C<stat> (with a warning) when given a filehandle. When passed an IO reference, it was actually doing the equivalent of S<C<stat _>> and ignoring the handle.
=item * C<-T _> with no preceding C<stat> used to produce a confusing "uninitialized" warning, even though there is no visible uninitialized value to speak of. =item * C<-T>, C<-B>, C<-l> and C<-t> now work when stacked with other filetest operators [perl #77388]. =item * In 5.14.0, filetest ops (C<-r>, C<-x>, etc.) started calling FETCH on a tied variable that was the previous argument to a list operator, if called with a bareword argument or no argument at all. This has been fixed, so C<push @foo, $tied, -r> no longer calls FETCH on C<$tied>. =item * In Perl 5.6, C<-l> followed by anything other than a bareword would treat its argument as a file name. That was changed in 5.8 for glob references (C<\*foo>), but not for globs themselves (C<*foo>). C<-l> started returning C<undef> for glob references without setting the last stat buffer that the "_" handle uses, but only if warnings were turned on. With warnings off, it was the same as 5.6. In other words, it was simply buggy and inconsistent. Now the 5.6 behavior has been restored. =item * C<-l> followed by a bareword no longer "eats" the previous argument to the list operator in whose argument list it resides. Hence, C<print "bar", -l foo> now actually prints "bar", because C<-l> no longer eats it. =item * Perl keeps several internal variables to keep track of the last stat buffer, from which file(handle) it originated, what type it was, and whether the last stat succeeded. There were various cases where these could get out of sync, resulting in inconsistent or erratic behavior in edge cases (every mention of C<-T> applies to C<-B> as well): =over =item * C<-T I<HANDLE>>, even though it does a C<stat>, was not resetting the last stat type, so an C<lstat _> following it would merrily return the wrong results. Also, it was not setting the success status. =item * Freeing the handle last used by C<stat> or a filetest could result in S<C<-T _>> using an unrelated handle.
=item * C<stat> with an IO reference would not reset the stat type or record the filehandle for S<C<-T _>> to use. =item * Fatal warnings could cause the stat buffer not to be reset for a filetest operator on an unopened filehandle or C<-l> on any handle. Fatal warnings also stopped C<-T> from setting C<$!>. =item * When the last stat was on an unreadable file, C<-T _> is supposed to return C<undef>, leaving the last stat buffer unchanged. But it was setting the stat type, causing C<lstat _> to stop working. =item * C<-T I<FILENAME>> was not resetting the internal stat buffers for unreadable files. =back These have all been fixed. =back =head2 Formats =over =item * Several edge cases have been fixed with formats and C<formline>; in particular, where the format itself is potentially variable (such as with ties and overloading), and where the format and data differ in their encoding. In both these cases, it used to be possible for the output to be corrupted [perl #91032]. =item * C<formline> no longer converts its argument into a string in-place. So passing a reference to C<formline> no longer destroys the reference [perl #79532]. =item * Assignment to C<$^A> (the format output accumulator) now recalculates the number of lines output. =back =head2 C<given> and C<when> =over =item * C<given> was not scoping its implicit $_ properly, resulting in memory leaks or "Variable is not available" warnings [perl #94682]. =item * C<given> was not calling set-magic on the implicit lexical C<$_> that it uses. This meant, for example, that C<pos> would be remembered from one execution of the same C<given> block to the next, even if the input were a different variable [perl #84526]. =item * C<when> blocks are now capable of returning variables declared inside the enclosing C<given> block [perl #93548]. =back =head2 The C<glob> operator =over =item * On OSes other than VMS, Perl's C<glob> operator (and the C<< <...> >> form) uses L<File::Glob> underneath.
L<File::Glob> splits the pattern into words, before feeding each word to its C<bsd_glob> function. There were several inconsistencies in the way the split was done. Now quotation marks (' and ") are always treated as shell-style word delimiters (that allow whitespace as part of a word) and backslashes are always preserved, unless they exist to escape quotation marks. Before, those would only sometimes be the case, depending on whether the pattern contained whitespace. Also, escaped whitespace at the end of the pattern is no longer stripped [perl #40470]. =item * C<CORE::glob> now works as a way to call the default globbing function. It used to respect overrides, despite the C<CORE::> prefix. =item * Under miniperl (used to configure modules when perl itself is built), C<glob> now clears %ENV before calling csh, since the latter croaks on some systems if it does not like the contents of the LS_COLORS environment variable [perl #98662]. =back =head2 Lvalue subroutines =over =item * Explicit return now returns the actual argument passed to return, instead of copying it [perl #72724, #72706]. =item * Lvalue subroutines used to enforce lvalue syntax (i.e., whatever can go on the left-hand side of C<=>) for the last statement and the arguments to return. Since lvalue subroutines are not always called in lvalue context, this restriction has been lifted. =item * Lvalue subroutines are less restrictive about what values can be returned. It used to croak on values returned by C<shift> and C<delete> and from other subroutines, but no longer does so [perl #71172]. =item * Empty lvalue subroutines (C<sub :lvalue {}>) used to return C<@_> in list context. All subroutines used to do this, but regular subs were fixed in Perl 5.8.2. Now lvalue subroutines have been likewise fixed. =item * Autovivification now works on values returned from lvalue subroutines [perl #7946], as does returning C<keys> in lvalue context. 
=item * Lvalue subroutines used to copy their return values in rvalue context. Not only was this a waste of CPU cycles, but it also caused bugs. A C<($)> prototype would cause an lvalue sub to copy its return value [perl #51408], and C<while(lvalue_sub() =~ m/.../g) { ... }> would loop endlessly [perl #78680]. =item * When called in potential lvalue context (e.g., subroutine arguments or a list passed to C<for>), lvalue subroutines used to copy any read-only value that was returned. E.g., C< sub :lvalue { $] } > would not return C<$]>, but a copy of it. =item * When called in potential lvalue context, an lvalue subroutine returning arrays or hashes used to bind the arrays or hashes to scalar variables, resulting in bugs. This was fixed in 5.14.0 if an array were the first thing returned from the subroutine (but not for C<$scalar, @array> or hashes being returned). Now a more general fix has been applied [perl #23790]. =item * Method calls whose arguments were all surrounded with C<my()> or C<our()> (as in C<< $object->method(my($a,$b)) >>) used to force lvalue context on the subroutine. This would prevent lvalue methods from returning certain values. =item * Lvalue sub calls that are not determined to be such at compile time (C<&$name> or &{"name"}) are no longer exempt from strict refs if they occur in the last statement of an lvalue subroutine [perl #102486]. =item * Sub calls whose subs are not visible at compile time, if they occurred in the last statement of an lvalue subroutine, would reject non-lvalue subroutines and die with "Can't modify non-lvalue subroutine call" [perl #102486]. Non-lvalue sub calls whose subs I<are> visible at compile time exhibited the opposite bug. If the call occurred in the last statement of an lvalue subroutine, there would be no error when the lvalue sub was called in lvalue context. Perl would blindly assign to the temporary value returned by the non-lvalue subroutine. 
=item * C<AUTOLOAD> routines used to take precedence over the actual sub being called (i.e., when autoloading wasn't needed), for sub calls in lvalue or potential lvalue context, if the subroutine was not visible at compile time. =item * Applying the C<:lvalue> attribute to an XSUB or to an aliased subroutine stub with C<< sub foo :lvalue; >> syntax stopped working in Perl 5.12. This has been fixed. =item * Applying the C<:lvalue> attribute to a subroutine that is already defined does not work properly, as the attribute changes the way the sub is compiled. Hence, Perl 5.12 began warning when an attempt is made to apply the attribute to an already defined sub. In such cases, the attribute is discarded. But the change in 5.12 missed the case where custom attributes are also present: that case still silently and ineffectively applied the attribute. That omission has now been corrected. C<sub foo :lvalue :Whatever> (when C<foo> is already defined) now warns about the C<:lvalue> attribute, and does not apply it. =item * A bug affecting lvalue context propagation through nested lvalue subroutine calls has been fixed. Previously, returning a value in nested rvalue context would be treated as lvalue context by the inner subroutine call, resulting in some values (such as read-only values) being rejected. =back =head2 Overloading =over =item * Arithmetic assignment (C<$left += $right>) involving overloaded objects that rely on the 'nomethod' override no longer segfaults when the left operand is not overloaded. =item * Errors that occur when methods cannot be found during overloading now mention the correct package name, as they did in 5.8.x, instead of erroneously mentioning the "overload" package, as they have since 5.10.0. =item * Undefining C<%overload::> no longer causes a crash. =back =head2 Prototypes of built-in keywords =over =item * The C<prototype> function no longer dies for the C<__FILE__>, C<__LINE__> and C<__PACKAGE__> directives.
It now returns an empty-string prototype for them, because they are syntactically indistinguishable from nullary functions like C<time>. =item * C<prototype> now returns C<undef> for all overridable infix operators, such as C<eq>, which are not callable in any way resembling functions. It used to return incorrect prototypes for some and die for others [perl #94984]. =item * The prototypes of several built-in functions--C<getprotobynumber>, C<lock>, C<not> and C<select>--have been corrected, or at least are now closer to reality than before. =back =head2 Regular expressions =for comment Is it possible to merge some of these items? =over 4 =item * C</[[:ascii:]]/> and C</[[:blank:]]/> now use locale rules under C<use locale> when the platform supports that. Previously, they used the platform's native character set. =item * C<m/[[:ascii:]]/i> and C</\p{ASCII}/i> now match identically (when not under a differing locale). This fixes a regression introduced in 5.14 in which the first expression could match characters outside of ASCII, such as the KELVIN SIGN. =item * C</.*/g> would sometimes refuse to match at the end of a string that ends with "\n". This has been fixed [perl #109206]. =item * Starting with 5.12.0, Perl used to get its internal bookkeeping muddled up after assigning C<${ qr// }> to a hash element and locking it with L<Hash::Util>. This could result in double frees, crashes, or erratic behavior. =item * The new (in 5.14.0) regular expression modifier C</a> when repeated like C</aa> forbids the characters outside the ASCII range that match characters inside that range from matching under C</i>. This did not work under some circumstances, all involving alternation, such as: "\N{KELVIN SIGN}" =~ /k|foo/iaa; succeeded inappropriately. This is now fixed. =item * 5.14.0 introduced some memory leaks in regular expression character classes such as C<[\w\s]>, which have now been fixed. 
(5.14.1) =item * An edge case in regular expression matching could potentially loop. This happened only under C</i> in bracketed character classes that have characters with multi-character folds, and the target string to match against includes the first portion of the fold, followed by another character that has a multi-character fold that begins with the remaining portion of the fold, plus some more. "s\N{U+DF}" =~ /[\x{DF}foo]/i is one such case. C<\xDF> folds to C<"ss">. (5.14.1) =item * A few characters in regular expression pattern matches did not match correctly in some circumstances, all involving C</i>. The affected characters are: COMBINING GREEK YPOGEGRAMMENI, GREEK CAPITAL LETTER IOTA, GREEK CAPITAL LETTER UPSILON, GREEK PROSGEGRAMMENI, GREEK SMALL LETTER IOTA WITH DIALYTIKA AND OXIA, GREEK SMALL LETTER IOTA WITH DIALYTIKA AND TONOS, GREEK SMALL LETTER UPSILON WITH DIALYTIKA AND OXIA, GREEK SMALL LETTER UPSILON WITH DIALYTIKA AND TONOS, LATIN SMALL LETTER LONG S, LATIN SMALL LIGATURE LONG S T, and LATIN SMALL LIGATURE ST. =item * A memory leak regression in regular expression compilation under threading has been fixed. =item * A regression introduced in 5.14.0 has been fixed. This involved an inverted bracketed character class in a regular expression that consisted solely of a Unicode property. That property wasn't getting inverted outside the Latin1 range. =item * Three problematic Unicode characters now work better in regex pattern matching under C</i>. In the past, three Unicode characters: LATIN SMALL LETTER SHARP S, GREEK SMALL LETTER IOTA WITH DIALYTIKA AND TONOS, and GREEK SMALL LETTER UPSILON WITH DIALYTIKA AND TONOS, along with the sequences that they fold to (including "ss" for LATIN SMALL LETTER SHARP S), did not properly match under C</i>. 5.14.0 fixed some of these cases, but introduced others, including a panic when one of the characters or sequences was used in the C<(?(DEFINE)> regular expression predicate. 
The known bugs that were introduced in 5.14 have now been fixed, as well as some other edge cases that have never worked until now. These all involve using the characters and sequences outside bracketed character classes under C</i>. This closes [perl #98546]. There remain known problems when using certain characters with multi-character folds inside bracketed character classes, including such constructs as C<qr/[\N{LATIN SMALL LETTER SHARP S}a-z]/i>. These remaining bugs are addressed in [perl #89774]. =item * RT #78266: The regex engine has been leaking memory when accessing named captures that weren't matched as part of a regex ever since 5.10 when they were introduced; e.g., this would consume over a hundred MB of memory: for (1..10_000_000) { if ("foo" =~ /(foo|(?<capture>bar))?/) { my $capture = $+{capture} } } system "ps -o rss $$" =item * In 5.14, C</[[:lower:]]/i> and C</[[:upper:]]/i> no longer matched the opposite case. This has been fixed [perl #101970]. =item * A regular expression match with an overloaded object on the right-hand side would sometimes stringify the object too many times. =item * A regression has been fixed that was introduced in 5.14, in C</i> regular expression matching, in which a match improperly fails if the pattern is in UTF-8, the target string is not, and a Latin-1 character precedes a character in the string that should match the pattern [perl #101710]. =item * In case-insensitive regular expression pattern matching on UTF-8 encoded strings, the scan for the start of a match no longer looks only at the first possible position. This caused matches such as C<"f\x{FB00}" =~ /ff/i> to fail. =item * The regexp optimizer no longer crashes on debugging builds when merging fixed-string nodes with inconvenient contents. =item * A panic involving the combination of the regular expression modifiers C</aa> and the C<\b> escape sequence introduced in 5.14.0 has been fixed [perl #95964].
(5.14.2) =item * The combination of the regular expression modifiers C</aa> and the C<\b> and C<\B> escape sequences did not work properly on UTF-8 encoded strings. All non-ASCII characters under C</aa> should be treated as non-word characters, but Unicode rules were being used to determine wordness/non-wordness for non-ASCII characters. This is now fixed [perl #95968]. =item * C<< (?foo: ...) >> no longer loses the passed-in character set. =item * The trie optimization used to have problems with alternations containing an empty C<(?:)>, causing C<< "x" =~ /\A(?>(?:(?:)A|B|C?x))\z/ >> not to match, whereas it should [perl #111842]. =item * Use of lexical (C<my>) variables in code blocks embedded in regular expressions will no longer result in memory corruption or crashes. Nevertheless, these code blocks are still experimental, as there are still problems with the wrong variables being closed over (in loops for instance) and with abnormal exiting (e.g., C<die>) causing memory corruption. =item * The C<\h>, C<\H>, C<\v> and C<\V> regular expression metacharacters used to cause a panic error message when trying to match at the end of the string [perl #96354]. =item * The abbreviations for four C1 control characters, C<MW>, C<PM>, C<RI>, and C<ST>, were previously unrecognized by C<\N{}>, vianame(), and string_vianame(). =item * Mentioning a variable named "&" other than C<$&> (i.e., C<@&> or C<%&>) no longer stops C<$&> from working. The same applies to variables named "'" and "`" [perl #24237]. =item * Creating a C<UNIVERSAL::AUTOLOAD> sub no longer stops C<%+>, C<%-> and C<%!> from working some of the time [perl #105024]. =back =head2 Smartmatching =over =item * C<~~> now correctly handles the precedence of Any~~Object, and is not tricked by an overloaded object on the left-hand side. =item * In Perl 5.14.0, C<$tainted ~~ @array> stopped working properly.
Sometimes it would erroneously fail (when C<$tainted> contained a string that occurs in the array I<after> the first element) or erroneously succeed (when C<undef> occurred after the first element) [perl #93590]. =back =head2 The C<sort> operator =over =item * C<sort> was not treating C<sub {}> and C<sub {()}> as equivalent when such a sub was provided as the comparison routine. It used to croak on C<sub {()}>. =item * C<sort> now works once more with custom sort routines that are XSUBs. It stopped working in 5.10.0. =item * C<sort> with a constant for a custom sort routine, although it produces unsorted results, no longer crashes. It started crashing in 5.10.0. =item * Warnings emitted by C<sort> when a custom comparison routine returns a non-numeric value now contain "in sort" and show the line number of the C<sort> operator, rather than the last line of the comparison routine. The warnings also now occur only if warnings are enabled in the scope where C<sort> occurs. Previously the warnings would occur if enabled in the comparison routine's scope. =item * C<< sort { $a <=> $b } >>, which is optimized internally, now produces "uninitialized" warnings for NaNs (not-a-number values), since C<< <=> >> returns C<undef> for those. This brings it in line with S<C<< sort { 1; $a <=> $b } >>> and other more complex cases, which are not optimized [perl #94390]. =back =head2 The C<substr> operator =over =item * Tied (and otherwise magical) variables are no longer exempt from the "Attempt to use reference as lvalue in substr" warning. =item * That warning now occurs when the returned lvalue is assigned to, not when C<substr> itself is called. This makes a difference only if the return value of C<substr> is referenced and later assigned to. =item * Passing a substring of a read-only value or a typeglob to a function (potential lvalue context) no longer causes an immediate "Can't coerce" or "Modification of a read-only value" error. 
That error occurs only if the passed value is assigned to. The same thing happens with the "substr outside of string" error. If the lvalue is only read from, not written to, it is now just a warning, as with rvalue C<substr>. =item * C<substr> assignments no longer call FETCH twice if the first argument is a tied variable, just once. =back =head2 Support for embedded nulls Some parts of Perl did not work correctly with nulls (C<chr 0>) embedded in strings. That meant that, for instance, C<< $m = "a\0b"; foo->$m >> would call the "a" method, instead of the actual method name contained in $m. These parts of perl have been fixed to support nulls: =over =item * Method names =item * Typeglob names (including filehandle and subroutine names) =item * Package names, including the return value of C<ref()> =item * Typeglob elements (C<*foo{"THING\0stuff"}>) =item * Signal names =item * Various warnings and error messages that mention variable names or values, methods, etc. =back One side effect of these changes is that blessing into "\0" no longer causes C<ref()> to return false. =head2 Threading bugs =over =item * Typeglobs returned from threads are no longer cloned if the parent thread already has a glob with the same name. This means that returned subroutines will now assign to the right package variables [perl #107366]. =item * Some cases of threads crashing due to memory allocation during cloning have been fixed [perl #90006]. =item * Thread joining would sometimes emit "Attempt to free unreferenced scalar" warnings if C<caller> had been used from the C<DB> package before thread creation [perl #98092]. =item * Locking a subroutine (via C<lock &sub>) is no longer a compile-time error for regular subs. For lvalue subroutines, it no longer tries to return the sub as a scalar, resulting in strange side effects like C<ref \$_> returning "CODE" in some instances. 
C<lock &sub> is now a run-time error if L<threads::shared> is loaded (a no-op otherwise), but that may be rectified in a future version. =back =head2 Tied variables =over =item * Various cases in which FETCH was being ignored or called too many times have been fixed: =over =item * C<PerlIO::get_layers> [perl #97956] =item * C<$tied =~ y/a/b/>, C<chop $tied> and C<chomp $tied> when $tied holds a reference. =item * When calling C<local $_> [perl #105912] =item * Four-argument C<select> =item * A tied buffer passed to C<sysread> =item * C<< $tied .= <> >> =item * Three-argument C<open>, the third being a tied file handle (as in C<< open $fh, ">&", $tied >>) =item * C<sort> with a reference to a tied glob for the comparison routine. =item * C<..> and C<...> in list context [perl #53554]. =item * C<${$tied}>, C<@{$tied}>, C<%{$tied}> and C<*{$tied}> where the tied variable returns a string (C<&{}> was unaffected) =item * C<defined ${ $tied_variable }> =item * Various functions that take a filehandle argument in rvalue context (C<close>, C<readline>, etc.) [perl #97482] =item * Some cases of dereferencing a complex expression, such as C<${ (), $tied } = 1>, used to call C<FETCH> multiple times, but now call it once. =item * C<$tied-E<gt>method> where $tied returns a package name--even resulting in a failure to call the method, due to memory corruption =item * Assignments like C<*$tied = \&{"..."}> and C<*glob = $tied> =item * C<chdir>, C<chmod>, C<chown>, C<utime>, C<truncate>, C<stat>, C<lstat> and the filetest ops (C<-r>, C<-x>, etc.) =back =item * C<caller> sets C<@DB::args> to the subroutine arguments when called from the DB package. It used to crash when doing so if C<@DB::args> happened to be tied. Now it croaks instead. 
=item * Tying an element of %ENV or C<%^H> and then deleting that element would result in a call to the tie object's DELETE method, even though tying the element itself is supposed to be equivalent to tying a scalar (the element is, of course, a scalar) [perl #67490]. =item * When Perl autovivifies an element of a tied array or hash (which entails calling STORE with a new reference), it now calls FETCH immediately after the STORE, instead of assuming that FETCH would have returned the same reference. This can make it easier to implement tied objects [perl #35865, #43011]. =item * Four-argument C<select> no longer produces its "Non-string passed as bitmask" warning on tied or tainted variables that are strings. =item * Localizing a tied scalar that returns a typeglob no longer stops it from being tied till the end of the scope. =item * Attempting to C<goto> out of a tied handle method used to cause memory corruption or crashes. Now it produces an error message instead [perl #8611]. =item * A bug has been fixed that occurs when a tied variable is used as a subroutine reference: if the last thing assigned to or returned from the variable was a reference or typeglob, the C<\&$tied> could either crash or return the wrong subroutine. The reference case is a regression introduced in Perl 5.10.0. For typeglobs, it has probably never worked till now. =back =head2 Version objects and vstrings =over =item * The bitwise complement operator (and possibly other operators, too) when passed a vstring would leave vstring magic attached to the return value, even though the string had changed. This meant that C<< version->new(~v1.2.3) >> would create a version looking like "v1.2.3" even though the string passed to C<< version->new >> was actually "\376\375\374". This also caused L<B::Deparse> to deparse C<~v1.2.3> incorrectly, without the C<~> [perl #29070]. 
=item *

Assigning a vstring to a magic (e.g., tied, C<$!>) variable and then
assigning something else used to blow away all magic. This meant that
tied variables would come undone, C<$!> would stop getting updated on
failed system calls, C<$|> would stop setting autoflush, and other
mischief would take place. This has been fixed.

=item *

C<< version->new("version") >> and C<printf "%vd", "version"> no longer
crash [perl #102586].

=item *

Version comparisons, such as those that happen implicitly with C<use
v5.43>, no longer cause locale settings to change [perl #105784].

=item *

Version objects no longer cause memory leaks in boolean context
[perl #109762].

=back

=head2 Warnings, redefinition

=over

=item *

Subroutines from the C<autouse> namespace are once more exempt from
redefinition warnings. This used to work in 5.005, but was broken in
5.6 for most subroutines. For subs created via XS that redefine
subroutines from the C<autouse> package, this stopped working in 5.10.

=item *

New XSUBs now produce redefinition warnings if they overwrite existing
subs, as they did in 5.8.x. (The C<autouse> logic was reversed in
5.10-14. Only subroutines from the C<autouse> namespace would warn
when clobbered.)

=item *

C<newCONSTSUB> used to use compile-time warning hints, instead of
run-time hints. The following code should never produce a redefinition
warning, but it used to, if C<newCONSTSUB> redefined an existing
subroutine:

    use warnings;
    BEGIN {
        no warnings;
        some_XS_function_that_calls_new_CONSTSUB();
    }

=item *

Redefinition warnings for constant subroutines are on by default (what
are known as severe warnings in L<perldiag>). This occurred only
when it was a glob assignment or declaration of a Perl subroutine that
caused the warning. If the creation of XSUBs triggered the warning, it
was not a default warning. This has been corrected.
=item *

The internal check to see whether a redefinition warning should occur
used to emit "uninitialized" warnings in cases like this:

    use warnings "uninitialized";
    use constant {u => undef, v => undef};
    sub foo(){u}
    sub foo(){v}

=back

=head2 Warnings, "Uninitialized"

=over

=item *

Various functions that take a filehandle argument in rvalue context
(C<close>, C<readline>, etc.) used to warn twice for an undefined handle
[perl #97482].

=item *

C<dbmopen> now only warns once, rather than three times, if the mode
argument is C<undef> [perl #90064].

=item *

The C<+=> operator does not usually warn when the left-hand side is
C<undef>, but it was doing so for tied variables. This has been fixed
[perl #44895].

=item *

A bug fix in Perl 5.14 introduced a new bug, causing "uninitialized"
warnings to report the wrong variable if the operator in question had
two operands and one was C<%{...}> or C<@{...}>. This has been fixed
[perl #103766].

=item *

C<..> and C<...> in list context now mention the name of the variable in
"uninitialized" warnings for string (as opposed to numeric) ranges.

=back

=head2 Weak references

=over

=item *

Weakening the first argument to an automatically-invoked C<DESTROY> method
could result in erroneous "DESTROY created new reference" errors or
crashes. Now it is an error to weaken a read-only reference.

=item *

Weak references to lexical hashes going out of scope were not going stale
(becoming undefined), but continued to point to the hash.

=item *

Weak references to lexical variables going out of scope are now broken
before any magical methods (e.g., DESTROY on a tie object) are called.
This prevents such methods from modifying the variable that will be seen
the next time the scope is entered.

=item *

Creating a weak reference to an @ISA array or accessing the array index
(C<$#ISA>) could result in confused internal bookkeeping for elements
later added to the @ISA array.
For instance, creating a weak reference to the element itself could push that weak reference on to @ISA; and elements added after use of C<$#ISA> would be ignored by method lookup [perl #85670]. =back =head2 Other notable fixes =over =item * C<quotemeta> now quotes consistently the same non-ASCII characters under C<use feature 'unicode_strings'>, regardless of whether the string is encoded in UTF-8 or not, hence fixing the last vestiges (we hope) of the notorious L<perlunicode/The "Unicode Bug">. [perl #77654]. Which of these code points is quoted has changed, based on Unicode's recommendations. See L<perlfunc/quotemeta> for details. =item * C<study> is now a no-op, presumably fixing all outstanding bugs related to study causing regex matches to behave incorrectly! =item * When one writes C<open foo || die>, which used to work in Perl 4, a "Precedence problem" warning is produced. This warning used erroneously to apply to fully-qualified bareword handle names not followed by C<||>. This has been corrected. =item * After package aliasing (C<*foo:: = *bar::>), C<select> with 0 or 1 argument would sometimes return a name that could not be used to refer to the filehandle, or sometimes it would return C<undef> even when a filehandle was selected. Now it returns a typeglob reference in such cases. =item * C<PerlIO::get_layers> no longer ignores some arguments that it thinks are numeric, while treating others as filehandle names. It is now consistent for flat scalars (i.e., not references). =item * Unrecognized switches on C<#!> line If a switch, such as B<-x>, that cannot occur on the C<#!> line is used there, perl dies with "Can't emulate...". It used to produce the same message for switches that perl did not recognize at all, whether on the command line or the C<#!> line. Now it produces the "Unrecognized switch" error message [perl #104288]. 
=item * C<system> now temporarily blocks the SIGCHLD signal handler, to prevent the signal handler from stealing the exit status [perl #105700]. =item * The %n formatting code for C<printf> and C<sprintf>, which causes the number of characters to be assigned to the next argument, now actually assigns the number of characters, instead of the number of bytes. It also works now with special lvalue functions like C<substr> and with nonexistent hash and array elements [perl #3471, #103492]. =item * Perl skips copying values returned from a subroutine, for the sake of speed, if doing so would make no observable difference. Because of faulty logic, this would happen with the result of C<delete>, C<shift> or C<splice>, even if the result was referenced elsewhere. It also did so with tied variables about to be freed [perl #91844, #95548]. =item * C<utf8::decode> now refuses to modify read-only scalars [perl #91850]. =item * Freeing $_ inside a C<grep> or C<map> block, a code block embedded in a regular expression, or an @INC filter (a subroutine returned by a subroutine in @INC) used to result in double frees or crashes [perl #91880, #92254, #92256]. =item * C<eval> returns C<undef> in scalar context or an empty list in list context when there is a run-time error. When C<eval> was passed a string in list context and a syntax error occurred, it used to return a list containing a single undefined element. Now it returns an empty list in list context for all errors [perl #80630]. =item * C<goto &func> no longer crashes, but produces an error message, when the unwinding of the current subroutine's scope fires a destructor that undefines the subroutine being "goneto" [perl #99850]. =item * Perl now holds an extra reference count on the package that code is currently compiling in. 
This means that the following code no longer crashes [perl #101486]:

    package Foo;
    BEGIN {*Foo:: = *Bar::}
    sub foo;

=item *

The C<x> repetition operator no longer crashes on 64-bit builds with large
repeat counts [perl #94560].

=item *

Calling C<require> on an implicit C<$_> when C<*CORE::GLOBAL::require> has
been overridden does not segfault anymore, and C<$_> is now passed to the
overriding subroutine [perl #78260].

=item *

C<use> and C<require> are no longer affected by the I/O layers active in
the caller's scope (enabled by L<open.pm|open>) [perl #96008].

=item *

C<our $::é; $é> (which is invalid) no longer produces the "Compilation
error at lib/utf8_heavy.pl..." error message, which it started emitting in
5.10.0 [perl #99984].

=item *

On 64-bit systems, C<read()> now understands large string offsets beyond
the 32-bit range.

=item *

Errors that occur when processing subroutine attributes no longer cause the
subroutine's op tree to leak.

=item *

Passing the same constant subroutine to both C<index> and C<formline> no
longer causes one or the other to fail [perl #89218]. (5.14.1)

=item *

List assignment to lexical variables declared with attributes in the same
statement (C<my ($x,@y) : blimp = (72,94)>) stopped working in Perl 5.8.0.
It has now been fixed.

=item *

Perl 5.10.0 introduced some faulty logic that made "U*" in the middle of
a pack template equivalent to "U0" if the input string was empty. This has
been fixed [perl #90160]. (5.14.2)

=item *

Destructors on objects were not called during global destruction on objects
that were not referenced by any scalars. This could happen if an array
element were blessed (e.g., C<bless \$a[0]>) or if a closure referenced a
blessed variable (C<bless \my @a; sub foo { @a }>).

Now there is an extra pass during global destruction to fire destructors on
any objects that might be left after the usual passes that check for
objects referenced by scalars [perl #36347].
=item *

Fixed a case where it was possible that a freed buffer may have been read
from when parsing a here document [perl #90128]. (5.14.1)

=item *

C<each(I<ARRAY>)> is now wrapped in C<defined(...)>, like C<each(I<HASH>)>,
inside a C<while> condition [perl #90888].

=item *

A problem with context propagation when a C<do> block is an argument to
C<return> has been fixed. It used to cause C<undef> to be returned in
certain cases of a C<return> inside an C<if> block which itself is followed by
another C<return>.

=item *

Calling C<index> with a tainted constant no longer causes constants in
subsequently compiled code to become tainted [perl #64804].

=item *

Infinite loops like C<1 while 1> used to stop C<strict 'subs'> mode from
working for the rest of the block.

=item *

For list assignments like C<($a,$b) = ($b,$a)>, Perl has to make a copy of
the items on the right-hand side before assigning them to the left. For
efficiency's sake, it assigns the values on the right straight to the items
on the left if no one variable is mentioned on both sides, as in C<($a,$b) =
($c,$d)>. The logic for determining when it can cheat was faulty, in that
C<&&> and C<||> on the right-hand side could fool it. So C<($a,$b) =
$some_true_value && ($b,$a)> would end up assigning the value of C<$b> to
both scalars.

=item *

Perl no longer tries to apply lvalue context to the string in
C<("string", $variable) ||= 1> (which used to be an error). Since the
left-hand side of C<||=> is evaluated in scalar context, that's a scalar
comma operator, which gives all but the last item void context. There is
no such thing as void lvalue context, so it was a mistake for Perl to try
to force it [perl #96942].

=item *

C<caller> no longer leaks memory when called from the DB package if
C<@DB::args> was assigned to after the first call to C<caller>. L<Carp>
was triggering this bug [perl #97010].
(5.14.2) =item * C<close> and similar filehandle functions, when called on built-in global variables (like C<$+>), used to die if the variable happened to hold the undefined value, instead of producing the usual "Use of uninitialized value" warning. =item * When autovivified file handles were introduced in Perl 5.6.0, C<readline> was inadvertently made to autovivify when called as C<readline($foo)> (but not as C<E<lt>$fooE<gt>>). It has now been fixed never to autovivify. =item * Calling an undefined anonymous subroutine (e.g., what $x holds after C<undef &{$x = sub{}}>) used to cause a "Not a CODE reference" error, which has been corrected to "Undefined subroutine called" [perl #71154]. =item * Causing C<@DB::args> to be freed between uses of C<caller> no longer results in a crash [perl #93320]. =item * C<setpgrp($foo)> used to be equivalent to C<($foo, setpgrp)>, because C<setpgrp> was ignoring its argument if there was just one. Now it is equivalent to C<setpgrp($foo,0)>. =item * C<shmread> was not setting the scalar flags correctly when reading from shared memory, causing the existing cached numeric representation in the scalar to persist [perl #98480]. =item * C<++> and C<--> now work on copies of globs, instead of dying. =item * C<splice()> doesn't warn when truncating You can now limit the size of an array using C<splice(@a,MAX_LEN)> without worrying about warnings. =item * C<< $$ >> is no longer tainted. Since this value comes directly from C<< getpid() >>, it is always safe. =item * The parser no longer leaks a filehandle if STDIN was closed before parsing started [perl #37033]. =item * C<< die; >> with a non-reference, non-string, or magical (e.g., tainted) value in $@ now properly propagates that value [perl #111654]. =back =head1 Known Problems =over 4 =item * On Solaris, we have two kinds of failure. If F<make> is Sun's F<make>, we get an error about a badly formed macro assignment in the F<Makefile>. 
That happens when F<./Configure> tries to make depends. F<Configure>
then exits 0, but further F<make>-ing fails.

If F<make> is F<gmake>, F<Configure> completes, then we get errors related
to F</usr/include/stdbool.h>

=item *

On Win32, a number of tests hang unless STDERR is redirected. The cause of
this is still under investigation.

=item *

When building as root with a umask that prevents files from being
other-readable, F<t/op/filetest.t> will fail. This is a test bug, not a
bug in perl's behavior.

=item *

Configuring with a recent gcc and link-time-optimization, such as
C<Configure -Doptimize='-O2 -flto'> fails
because the optimizer optimizes away some of Configure's tests. A
workaround is to omit the C<-flto> flag when running Configure, but add
it back in while actually building, something like

    sh Configure -Doptimize=-O2
    make OPTIMIZE='-O2 -flto'

=item *

The following CPAN modules have test failures with perl 5.16. Patches have
been submitted for all of these, so hopefully there will be new releases
soon:

=over

=item *

L<Date::Pcalc> version 6.1

=item *

L<Module::CPANTS::Analyse> version 0.85

This fails due to problems in L<Module::Find> 0.10 and L<File::MMagic>
1.27.

=item *

L<PerlIO::Util> version 0.72

=back

=back

=head1 Acknowledgements

Perl 5.16.0 represents approximately 12 months of development since Perl
5.14.0 and contains approximately 590,000 lines of changes across 2,500
files from 139 authors.

Perl continues to flourish into its third decade thanks to a vibrant
community of users and developers. The following people are known to
have contributed the improvements that became Perl 5.16.0:

Aaron Crane, Abhijit Menon-Sen, Abigail, Alan Haggai Alavi, Alberto
Simões, Alexandr Ciornii, Andreas König, Andy Dougherty, Aristotle
Pagaltzis, Bo Johansson, Bo Lindbergh, Breno G. de Oliveira, brian d
foy, Brian Fraser, Brian Greenfield, Carl Hayter, Chas. Owens,
Chia-liang Kao, Chip Salzenberg, Chris 'BinGOs' Williams, Christian
Hansen, Christopher J.
Madsen, chromatic, Claes Jacobsson, Claudio Ramirez, Craig A. Berry, Damian Conway, Daniel Kahn Gillmor, Darin McBride, Dave Rolsky, David Cantrell, David Golden, David Leadbeater, David Mitchell, Dee Newcum, Dennis Kaarsemaker, Dominic Hargreaves, Douglas Christopher Wilson, Eric Brine, Father Chrysostomos, Florian Ragwitz, Frederic Briere, George Greer, Gerard Goossen, Gisle Aas, H.Merijn Brand, Hojung Youn, Ian Goodacre, James E Keenan, Jan Dubois, Jerry D. Hedden, Jesse Luehrs, Jesse Vincent, Jilles Tjoelker, Jim Cromie, Jim Meyering, Joel Berger, Johan Vromans, Johannes Plunien, John Hawkinson, John P. Linderman, John Peacock, Joshua ben Jore, Juerd Waalboer, Karl Williamson, Karthik Rajagopalan, Keith Thompson, Kevin J. Woolley, Kevin Ryde, Laurent Dami, Leo Lapworth, Leon Brocard, Leon Timmermans, Louis Strous, Lukas Mai, Marc Green, Marcel Grünauer, Mark A. Stratman, Mark Dootson, Mark Jason Dominus, Martin Hasch, Matthew Horsfall, Max Maischein, Michael G Schwern, Michael Witten, Mike Sheldrake, Moritz Lenz, Nicholas Clark, Niko Tyni, Nuno Carvalho, Pau Amma, Paul Evans, Paul Green, Paul Johnson, Perlover, Peter John Acklam, Peter Martini, Peter Scott, Phil Monsen, Pino Toscano, Rafael Garcia-Suarez, Rainer Tammer, Reini Urban, Ricardo Signes, Robin Barker, Rodolfo Carvalho, Salvador Fandiño, Sam Kimbrel, Samuel Thibault, Shawn M Moore, Shigeya Suzuki, Shirakata Kentaro, Shlomi Fish, Sisyphus, Slaven Rezic, Spiros Denaxas, Steffen Müller, Steffen Schwigon, Stephen Bennett, Stephen Oberholtzer, Stevan Little, Steve Hay, Steve Peters, Thomas Sibley, Thorsten Glaser, Timothe Litt, Todd Rinaldo, Tom Christiansen, Tom Hukins, Tony Cook, Vadim Konovalov, Vincent Pit, Vladimir Timofeev, Walt Mankowski, Yves Orton, Zefram, Zsbán Ambrus, Ævar Arnfjörð Bjarmason. The list above is almost certainly incomplete as it is automatically generated from version control history. 
In particular, it does not include the names of the (very much appreciated) contributors who reported issues to the Perl bug tracker. Many of the changes included in this version originated in the CPAN modules included in Perl's core. We're grateful to the entire CPAN community for helping Perl to flourish. For a more complete list of all of Perl's historical contributors, please see the F<AUTHORS> file in the Perl source distribution. =head1 Reporting Bugs If you find what you think is a bug, you might check the articles recently posted to the comp.lang.perl.misc newsgroup and the perl bug database at L<http://rt.perl.org/perlbug/>. There may also be information at L<http://www.perl.org/>, the Perl Home Page. If you believe you have an unreported bug, please run the L<perlbug> program included with your release. Be sure to trim your bug down to a tiny but sufficient test case. Your bug report, along with the output of C<perl -V>, will be sent off to perlbug@perl.org to be analysed by the Perl porting team. If the bug you are reporting has security implications, which make it inappropriate to send to a publicly archived mailing list, then please send it to perl5-security-report@perl.org. This points to a closed subscription unarchived mailing list, which includes all core committers, who will be able to help assess the impact of issues, figure out a resolution, and help co-ordinate the release of patches to mitigate or fix the problem across all platforms on which Perl is supported. Please use this address only for security issues in the Perl core, not for modules independently distributed on CPAN. =head1 SEE ALSO The F<Changes> file for an explanation of how to view exhaustive details on what changed. The F<INSTALL> file for how to build Perl. The F<README> file for general stuff. The F<Artistic> and F<Copying> files for copyright information. 
=cut

perl5321delta.pod

=encoding utf8

=head1 NAME

perldelta - what is new for perl v5.32.1

=head1 DESCRIPTION

This document describes differences between the 5.32.0 release and the 5.32.1
release.

If you are upgrading from an earlier release such as 5.31.0, first read
L<perl5320delta>, which describes differences between 5.31.0 and 5.32.0.

=head1 Incompatible Changes

There are no changes intentionally incompatible with Perl 5.32.0. If any
exist, they are bugs, and we request that you submit a report. See
L</Reporting Bugs> below.

=head1 Modules and Pragmata

=head2 Updated Modules and Pragmata

=over 4

=item *

L<Data::Dumper> has been upgraded from version 2.174 to 2.174_01. A number
of memory leaks have been fixed.

=item *

L<DynaLoader> has been upgraded from version 1.47 to 1.47_01.

=item *

L<Module::CoreList> has been upgraded from version 5.20200620 to 5.20210123.

=item *

L<Opcode> has been upgraded from version 1.47 to 1.48. A warning has been
added about evaluating untrusted code with the perl interpreter.

=item *

L<Safe> has been upgraded from version 2.41 to 2.41_01. A warning has been
added about evaluating untrusted code with the perl interpreter.

=back

=head1 Documentation

=head2 New Documentation

=head3 L<perlgov>

Documentation of the newly formed rules of governance for Perl.

=head3 L<perlsecpolicy>

Documentation of how the Perl security team operates and how the team
evaluates new security reports.

=head2 Changes to Existing Documentation

We have attempted to update the documentation to reflect the changes listed in
this document. If you find any we have missed, open an issue at
L<https://github.com/Perl/perl5/issues>.

Additionally, the following selected changes have been made:

=head3 L<perlop>

=over 4

=item *

Document range op behaviour change.

=back

=head1 Diagnostics

The following additions or changes have been made to diagnostic output,
including warnings and fatal error messages.
For the complete list of diagnostic messages, see L<perldiag>. =head2 Changes to Existing Diagnostics =over 4 =item * L<\K not permitted in lookahead/lookbehind in regex; marked by <-- HERE in mE<sol>%sE<sol>|perldiag/"\K not permitted in lookahead/lookbehind in regex; marked by <-- HERE in m/%s/"> This error was incorrectly produced in some cases involving nested lookarounds. This has been fixed. [L<GH #18123|https://github.com/Perl/perl5/issues/18123>] =back =head1 Configuration and Compilation =over 4 =item * Newer 64-bit versions of the Intel C/C++ compiler are now recognized and have the correct flags set. =item * We now trap SIGBUS when F<Configure> checks for C<va_copy>. On several systems the attempt to determine if we need C<va_copy> or similar results in a SIGBUS instead of the expected SIGSEGV, which previously caused a core dump. [L<GH #18148|https://github.com/Perl/perl5/issues/18148>] =back =head1 Testing Tests were added and changed to reflect the other additions and changes in this release. =head1 Platform Support =head2 Platform-Specific Notes =over 4 =item MacOS (Darwin) The hints file for darwin has been updated to handle future macOS versions beyond 10. Perl can now be built on macOS Big Sur. [L<GH #17946|https://github.com/Perl/perl5/issues/17946>, L<GH #18406|https://github.com/Perl/perl5/issues/18406>] =item Minix Build errors on Minix have been fixed. [L<GH #17908|https://github.com/Perl/perl5/issues/17908>] =back =head1 Selected Bug Fixes =over 4 =item * Some list assignments involving C<undef> on the left-hand side were over-optimized and produced incorrect results. [L<GH #16685|https://github.com/Perl/perl5/issues/16685>, L<GH #17816|https://github.com/Perl/perl5/issues/17816>] =item * Fixed a bug in which some regexps with recursive subpatterns matched incorrectly. 
[L<GH #18096|https://github.com/Perl/perl5/issues/18096>] =item * Fixed a deadlock that hung the build when Perl is compiled for debugging memory problems and has PERL_MEM_LOG enabled. [L<GH #18341|https://github.com/Perl/perl5/issues/18341>] =item * Fixed a crash in the use of chained comparison operators when run under "no warnings 'uninitialized'". [L<GH #17917|https://github.com/Perl/perl5/issues/17917>, L<GH #18380|https://github.com/Perl/perl5/issues/18380>] =item * Exceptions thrown from destructors during global destruction are no longer swallowed. [L<GH #18063|https://github.com/Perl/perl5/issues/18063>] =back =head1 Acknowledgements Perl 5.32.1 represents approximately 7 months of development since Perl 5.32.0 and contains approximately 7,000 lines of changes across 80 files from 23 authors. Excluding auto-generated files, documentation and release tools, there were approximately 1,300 lines of changes to 23 .pm, .t, .c and .h files. Perl continues to flourish into its fourth decade thanks to a vibrant community of users and developers. The following people are known to have contributed the improvements that became Perl 5.32.1: Adam Hartley, Andy Dougherty, Dagfinn Ilmari Mannsåker, Dan Book, David Mitchell, Graham Knop, Graham Ollis, Hauke D, H.Merijn Brand, Hugo van der Sanden, John Lightsey, Karen Etheridge, Karl Williamson, Leon Timmermans, Max Maischein, Nicolas R., Ricardo Signes, Richard Leach, Sawyer X, Sevan Janiyan, Steve Hay, Tom Hukins, Tony Cook. The list above is almost certainly incomplete as it is automatically generated from version control history. In particular, it does not include the names of the (very much appreciated) contributors who reported issues to the Perl bug tracker. Many of the changes included in this version originated in the CPAN modules included in Perl's core. We're grateful to the entire CPAN community for helping Perl to flourish. 
For a more complete list of all of Perl's historical contributors, please see
the F<AUTHORS> file in the Perl source distribution.

=head1 Reporting Bugs

If you find what you think is a bug, you might check the perl bug database at
L<https://github.com/Perl/perl5/issues>. There may also be information at
L<http://www.perl.org/>, the Perl Home Page.

If you believe you have an unreported bug, please open an issue at
L<https://github.com/Perl/perl5/issues>. Be sure to trim your bug down to a
tiny but sufficient test case.

If the bug you are reporting has security implications which make it
inappropriate to send to a public issue tracker, then see
L<perlsec/SECURITY VULNERABILITY CONTACT INFORMATION> for details of how to
report the issue.

=head1 Give Thanks

If you wish to thank the Perl 5 Porters for the work we had done in Perl 5, you
can do so by running the C<perlthanks> program:

    perlthanks

This will send an email to the Perl 5 Porters list with your show of thanks.

=head1 SEE ALSO

The F<Changes> file for an explanation of how to view exhaustive details on
what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=cut

perlrecharclass.pod

=head1 NAME
X<character class>

perlrecharclass - Perl Regular Expression Character Classes

=head1 DESCRIPTION

The top level documentation about Perl regular expressions
is found in L<perlre>.

This manual page discusses the syntax and use of character
classes in Perl regular expressions.

A character class is a way of denoting a set of characters
in such a way that one character of the set is matched.
It's important to remember that: matching a character class
consumes exactly one character in the source string. (The source
string is the string the regular expression is matched against.)
There are three types of character classes in Perl regular
expressions: the dot, backslash sequences, and the form enclosed in square
brackets. Keep in mind, though, that often the term "character class" is used
to mean just the bracketed form. Certainly, most Perl documentation does that.

=head2 The dot

The dot (or period), C<.> is probably the most used, and certainly
the most well-known character class. By default, a dot matches any
character, except for the newline. That default can be changed to
add matching the newline by using the I<single line> modifier:
for the entire regular expression with the C</s> modifier, or
locally with C<(?s)> (and even globally within the scope of
L<C<use re '/s'>|re/'E<sol>flags' mode>). (The C<L</\N>> backslash
sequence, described
below, matches any character except newline without regard to the
I<single line> modifier.)

Here are some examples:

 "a"  =~  /./       # Match
 "."  =~  /./       # Match
 ""   =~  /./       # No match (dot has to match a character)
 "\n" =~  /./       # No match (dot does not match a newline)
 "\n" =~  /./s      # Match (global 'single line' modifier)
 "\n" =~  /(?s:.)/  # Match (local 'single line' modifier)
 "ab" =~  /^.$/     # No match (dot matches one character)

=head2 Backslash sequences
X<\w> X<\W> X<\s> X<\S> X<\d> X<\D> X<\p> X<\P>
X<\N> X<\v> X<\V> X<\h> X<\H>
X<word> X<whitespace>

A backslash sequence is a sequence of characters, the first one of which is a
backslash. Perl ascribes special meaning to many such sequences, and some of
these are character classes. That is, they match a single character each,
provided that the character belongs to the specific set of characters defined
by the sequence.

Here's a list of the backslash sequences that are character classes. They
are discussed in more detail below. (For the backslash sequences that aren't
character classes, see L<perlrebackslash>.)

 \d             Match a decimal digit character.
 \D             Match a non-decimal-digit character.
 \w             Match a "word" character.
 \W             Match a non-"word" character.
\s Match a whitespace character. \S Match a non-whitespace character. \h Match a horizontal whitespace character. \H Match a character that isn't horizontal whitespace. \v Match a vertical whitespace character. \V Match a character that isn't vertical whitespace. \N Match a character that isn't a newline. \pP, \p{Prop} Match a character that has the given Unicode property. \PP, \P{Prop} Match a character that doesn't have the Unicode property =head3 \N C<\N>, available starting in v5.12, like the dot, matches any character that is not a newline. The difference is that C<\N> is not influenced by the I<single line> regular expression modifier (see L</The dot> above). Note that the form C<\N{...}> may mean something completely different. When the C<{...}> is a L<quantifier|perlre/Quantifiers>, it means to match a non-newline character that many times. For example, C<\N{3}> means to match 3 non-newlines; C<\N{5,}> means to match 5 or more non-newlines. But if C<{...}> is not a legal quantifier, it is presumed to be a named character. See L<charnames> for those. For example, none of C<\N{COLON}>, C<\N{4F}>, and C<\N{F4}> contain legal quantifiers, so Perl will try to find characters whose names are respectively C<COLON>, C<4F>, and C<F4>. =head3 Digits C<\d> matches a single character considered to be a decimal I<digit>. If the C</a> regular expression modifier is in effect, it matches [0-9]. Otherwise, it matches anything that is matched by C<\p{Digit}>, which includes [0-9]. (An unlikely possible exception is that under locale matching rules, the current locale might not have C<[0-9]> matched by C<\d>, and/or might match other characters whose code point is less than 256. The only such locale definitions that are legal would be to match C<[0-9]> plus another set of 10 consecutive digit characters; anything else would be in violation of the C language standard, but Perl doesn't currently assume anything in regard to this.) 
What this means is that unless the C</a> modifier is in effect C<\d> not only matches the digits '0' - '9', but also Arabic, Devanagari, and digits from other languages. This may cause some confusion, and some security issues. Some digits that C<\d> matches look like some of the [0-9] ones, but have different values. For example, BENGALI DIGIT FOUR (U+09EA) looks very much like an ASCII DIGIT EIGHT (U+0038), and LEPCHA DIGIT SIX (U+1C46) looks very much like an ASCII DIGIT FIVE (U+0035). An application that is expecting only the ASCII digits might be misled, or if the match is C<\d+>, the matched string might contain a mixture of digits from different writing systems that look like they signify a number different than they actually do. L<Unicode::UCD/num()> can be used to safely calculate the value, returning C<undef> if the input string contains such a mixture. Otherwise, for example, a displayed price might be deliberately different than it appears. What C<\p{Digit}> means (and hence C<\d> except under the C</a> modifier) is C<\p{General_Category=Decimal_Number}>, or synonymously, C<\p{General_Category=Digit}>. Starting with Unicode version 4.1, this is the same set of characters matched by C<\p{Numeric_Type=Decimal}>. But Unicode also has a different property with a similar name, C<\p{Numeric_Type=Digit}>, which matches a completely different set of characters. These characters are things such as C<CIRCLED DIGIT ONE> or subscripts, or are from writing systems that lack all ten digits. The design intent is for C<\d> to exactly match the set of characters that can safely be used with "normal" big-endian positional decimal syntax, where, for example 123 means one 'hundred', plus two 'tens', plus three 'ones'. This positional notation does not necessarily apply to characters that match the other type of "digit", C<\p{Numeric_Type=Digit}>, and so C<\d> doesn't match them. 
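The effect of the C</a> modifier on C<\d> can be seen in a minimal sketch like the following. It is an illustration only: the variable names are made up for this example, and it uses ARABIC-INDIC DIGIT THREE (U+0663) as a sample non-ASCII digit.

```perl
use strict;
use warnings;

# U+0663 is ARABIC-INDIC DIGIT THREE, a decimal digit outside ASCII.
my $arabic_three = "\x{0663}";

# Without /a, \d follows \p{Digit} for this code point (above 255).
my $digit_default = $arabic_three =~ /\A\d\z/     ? 1 : 0;
# With /a, \d is restricted to [0-9].
my $digit_ascii   = $arabic_three =~ /\A\d\z/a    ? 1 : 0;
# An explicit ASCII range never matches it either.
my $digit_range   = $arabic_three =~ /\A[0-9]\z/  ? 1 : 0;

# A string that mixes digits from two scripts still satisfies \d+
# unless /a is used.
my $mixed         = "1\x{0663}";
my $mixed_default = $mixed =~ /\A\d+\z/  ? 1 : 0;
my $mixed_ascii   = $mixed =~ /\A\d+\z/a ? 1 : 0;

print "default: $digit_default, /a: $digit_ascii, [0-9]: $digit_range\n";
print "mixed default: $mixed_default, mixed /a: $mixed_ascii\n";
```

When only ASCII digits are acceptable, either C</a> or an explicit C<[0-9]> avoids the mixed-script case.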
The Tamil digits (U+0BE6 - U+0BEF) can also legally be used in old-style Tamil numbers in which they would appear no more than one in a row, separated by characters that mean "times 10", "times 100", etc. (See L<https://www.unicode.org/notes/tn21>.) Any character not matched by C<\d> is matched by C<\D>. =head3 Word characters A C<\w> matches a single alphanumeric character (an alphabetic character, or a decimal digit); or a connecting punctuation character, such as an underscore ("_"); or a "mark" character (like some sort of accent) that attaches to one of those. It does not match a whole word. To match a whole word, use C<\w+>. This isn't the same thing as matching an English word, but in the ASCII range it is the same as a string of Perl-identifier characters. =over =item If the C</a> modifier is in effect ... C<\w> matches the 63 characters [a-zA-Z0-9_]. =item otherwise ... =over =item For code points above 255 ... C<\w> matches the same as C<\p{Word}> matches in this range. That is, it matches Thai letters, Greek letters, etc. This includes connector punctuation (like the underscore) which connect two words together, or diacritics, such as a C<COMBINING TILDE> and the modifier letters, which are generally used to add auxiliary markings to letters. =item For code points below 256 ... =over =item if locale rules are in effect ... C<\w> matches the platform's native underscore character plus whatever the locale considers to be alphanumeric. =item if, instead, Unicode rules are in effect ... C<\w> matches exactly what C<\p{Word}> matches. =item otherwise ... C<\w> matches [a-zA-Z0-9_]. =back =back =back Which rules apply are determined as described in L<perlre/Which character set modifier is in effect?>. There are a number of security issues with the full Unicode list of word characters. See L<http://unicode.org/reports/tr36>. 
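As a rough illustration of how the rules in effect change what C<\w> matches, the sketch below tests a few single characters. It assumes an ASCII platform (so that code point 0xE9 is LATIN SMALL LETTER E WITH ACUTE), and the variable names are invented for the example.

```perl
use strict;
use warnings;

# Assumption: ASCII platform, where 0xE9 is LATIN SMALL LETTER E WITH ACUTE.
my $e_acute = "\x{E9}";
# U+0E0B is THAI CHARACTER SO SO, a code point above 255.
my $thai    = "\x{0E0B}";

my $e_ascii      = $e_acute =~ /\A\w\z/a ? 1 : 0;  # /a: only [a-zA-Z0-9_]
my $e_unicode    = $e_acute =~ /\A\w\z/u ? 1 : 0;  # /u: same as \p{Word}
my $thai_default = $thai    =~ /\A\w\z/  ? 1 : 0;  # above 255: \p{Word}
my $thai_ascii   = $thai    =~ /\A\w\z/a ? 1 : 0;  # /a excludes it
my $underscore   = "_"      =~ /\A\w\z/a ? 1 : 0;  # underscore always counts

print "e-acute: /a=$e_ascii /u=$e_unicode\n";
print "thai: default=$thai_default /a=$thai_ascii\n";
print "underscore under /a: $underscore\n";
```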
Also, for a somewhat finer-grained set of characters that are in programming language identifiers beyond the ASCII range, you may wish to instead use the more customized L</Unicode Properties>, C<\p{ID_Start}>, C<\p{ID_Continue}>, C<\p{XID_Start}>, and C<\p{XID_Continue}>. See L<http://unicode.org/reports/tr31>. Any character not matched by C<\w> is matched by C<\W>. =head3 Whitespace C<\s> matches any single character considered whitespace. =over =item If the C</a> modifier is in effect ... In all Perl versions, C<\s> matches the 5 characters [\t\n\f\r ]; that is, the horizontal tab, the newline, the form feed, the carriage return, and the space. Starting in Perl v5.18, it also matches the vertical tab, C<\cK>. See note C<[1]> below for a discussion of this. =item otherwise ... =over =item For code points above 255 ... C<\s> matches exactly the code points above 255 shown with an "s" column in the table below. =item For code points below 256 ... =over =item if locale rules are in effect ... C<\s> matches whatever the locale considers to be whitespace. =item if, instead, Unicode rules are in effect ... C<\s> matches exactly the characters shown with an "s" column in the table below. =item otherwise ... C<\s> matches [\t\n\f\r ] and, starting in Perl v5.18, the vertical tab, C<\cK>. (See note C<[1]> below for a discussion of this.) Note that this list doesn't include the non-breaking space. =back =back =back Which rules apply are determined as described in L<perlre/Which character set modifier is in effect?>. Any character not matched by C<\s> is matched by C<\S>. C<\h> matches any character considered horizontal whitespace; this includes the platform's space and tab characters and several others listed in the table below. C<\H> matches any character not considered horizontal whitespace. They use the platform's native character set, and do not consider any locale that may otherwise be in use. 
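The difference between C<\s>, which depends on the rules in effect, and C<\h>, which does not, shows up with the non-breaking space. The following is a small sketch under the assumption of an ASCII platform, where code point 0xA0 is NO-BREAK SPACE; the variable names are illustrative only.

```perl
use strict;
use warnings;

# Assumption: ASCII platform, where 0xA0 is NO-BREAK SPACE.
my $nbsp = "\x{A0}";

my $s_under_a = $nbsp =~ /\A\s\z/a ? 1 : 0;  # /a: \s is limited to ASCII
my $s_under_u = $nbsp =~ /\A\s\z/u ? 1 : 0;  # /u: Unicode rules apply
my $h_plain   = $nbsp =~ /\A\h\z/  ? 1 : 0;  # \h matches it regardless

# The ordinary tab is both whitespace and horizontal whitespace.
my $tab_s = "\t" =~ /\A\s\z/a ? 1 : 0;
my $tab_h = "\t" =~ /\A\h\z/  ? 1 : 0;

print "NBSP: \\s/a=$s_under_a \\s/u=$s_under_u \\h=$h_plain\n";
print "TAB:  \\s/a=$tab_s \\h=$tab_h\n";
```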
C<\v> matches any character considered vertical whitespace; this includes the platform's carriage return and line feed characters (newline) plus several other characters, all listed in the table below. C<\V> matches any character not considered vertical whitespace. They use the platform's native character set, and do not consider any locale that may otherwise be in use. C<\R> matches anything that can be considered a newline under Unicode rules. It can match a multi-character sequence. It cannot be used inside a bracketed character class; use C<\v> instead (vertical whitespace). It uses the platform's native character set, and does not consider any locale that may otherwise be in use. Details are discussed in L<perlrebackslash>. Note that unlike C<\s> (and C<\d> and C<\w>), C<\h> and C<\v> always match the same characters, without regard to other factors, such as the active locale or whether the source string is in UTF-8 format. One might think that C<\s> is equivalent to C<[\h\v]>. This is indeed true starting in Perl v5.18, but prior to that, the sole difference was that the vertical tab (C<"\cK">) was not matched by C<\s>. The following table is a complete listing of characters matched by C<\s>, C<\h> and C<\v> as of Unicode 6.3. The first column gives the Unicode code point of the character (in hex format), the second column gives the (Unicode) name. The third column indicates by which class(es) the character is matched (assuming no locale is in effect that changes the C<\s> matching). 
0x0009 CHARACTER TABULATION h s 0x000a LINE FEED (LF) vs 0x000b LINE TABULATION vs [1] 0x000c FORM FEED (FF) vs 0x000d CARRIAGE RETURN (CR) vs 0x0020 SPACE h s 0x0085 NEXT LINE (NEL) vs [2] 0x00a0 NO-BREAK SPACE h s [2] 0x1680 OGHAM SPACE MARK h s 0x2000 EN QUAD h s 0x2001 EM QUAD h s 0x2002 EN SPACE h s 0x2003 EM SPACE h s 0x2004 THREE-PER-EM SPACE h s 0x2005 FOUR-PER-EM SPACE h s 0x2006 SIX-PER-EM SPACE h s 0x2007 FIGURE SPACE h s 0x2008 PUNCTUATION SPACE h s 0x2009 THIN SPACE h s 0x200a HAIR SPACE h s 0x2028 LINE SEPARATOR vs 0x2029 PARAGRAPH SEPARATOR vs 0x202f NARROW NO-BREAK SPACE h s 0x205f MEDIUM MATHEMATICAL SPACE h s 0x3000 IDEOGRAPHIC SPACE h s =over 4 =item [1] Prior to Perl v5.18, C<\s> did not match the vertical tab. C<[^\S\cK]> (obscurely) matches what C<\s> traditionally did. =item [2] NEXT LINE and NO-BREAK SPACE may or may not match C<\s> depending on the rules in effect. See L<the beginning of this section|/Whitespace>. =back =head3 Unicode Properties C<\pP> and C<\p{Prop}> are character classes to match characters that fit given Unicode properties. One letter property names can be used in the C<\pP> form, with the property name following the C<\p>, otherwise, braces are required. When using braces, there is a single form, which is just the property name enclosed in the braces, and a compound form which looks like C<\p{name=value}>, which means to match if the property "name" for the character has that particular "value". For instance, a match for a number can be written as C</\pN/> or as C</\p{Number}/>, or as C</\p{Number=True}/>. Lowercase letters are matched by the property I<Lowercase_Letter> which has the short form I<Ll>. They need the braces, so are written as C</\p{Ll}/> or C</\p{Lowercase_Letter}/>, or C</\p{General_Category=Lowercase_Letter}/> (the underscores are optional). C</\pLl/> is valid, but means something different. It matches a two character string: a letter (Unicode property C<\pL>), followed by a lowercase C<l>. 
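The different spellings of a property, and the C<\pLl> pitfall, can be checked with a short sketch such as this one. The variable names are hypothetical and chosen only for the example.

```perl
use strict;
use warnings;

# Four spellings of the same property: one-letter form, braced short
# form, braced long form, and the compound name=value form.
my @letter_forms = (
    qr/\A\pL\z/,
    qr/\A\p{L}\z/,
    qr/\A\p{Letter}\z/,
    qr/\A\p{General_Category=Letter}\z/,
);

my $letter_hits = 0;
for my $re (@letter_forms) {
    $letter_hits++ if "a" =~ $re;
}

# Two-letter property names need braces.
my $ll_lower = "a" =~ /\A\p{Ll}\z/ ? 1 : 0;
my $ll_upper = "A" =~ /\A\p{Ll}\z/ ? 1 : 0;

# Without braces, \pLl is \pL followed by a literal "l".
my $pll_two_chars = "Al" =~ /\A\pLl\z/ ? 1 : 0;
my $pll_one_char  = "a"  =~ /\A\pLl\z/ ? 1 : 0;

print "forms matching 'a': $letter_hits of ", scalar(@letter_forms), "\n";
print "\\p{Ll}: a=$ll_lower A=$ll_upper\n";
print "\\pLl: 'Al'=$pll_two_chars 'a'=$pll_one_char\n";
```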
What a Unicode property matches is never subject to locale rules, and if locale rules are not otherwise in effect, the use of a Unicode property will force the regular expression into using Unicode rules, if it isn't already. Note that almost all properties are immune to case-insensitive matching. That is, adding a C</i> regular expression modifier does not change what they match. But there are two sets that are affected. The first set is C<Uppercase_Letter>, C<Lowercase_Letter>, and C<Titlecase_Letter>, all of which match C<Cased_Letter> under C</i> matching. The second set is C<Uppercase>, C<Lowercase>, and C<Titlecase>, all of which match C<Cased> under C</i> matching. (The difference between these sets is that some things, such as Roman numerals, come in both upper and lower case, so they are C<Cased>, but aren't considered to be letters, so they aren't C<Cased_Letter>s. They're actually C<Letter_Number>s.) This set also includes its subsets C<PosixUpper> and C<PosixLower>, both of which under C</i> match C<PosixAlpha>. For more details on Unicode properties, see L<perlunicode/Unicode Character Properties>; for a complete list of possible properties, see L<perluniprops/Properties accessible through \p{} and \P{}>, which notes all forms that have C</i> differences. It is also possible to define your own properties. This is discussed in L<perlunicode/User-Defined Character Properties>. Unicode properties are defined (surprise!) only on Unicode code points. Starting in v5.20, when matching against C<\p> and C<\P>, Perl treats non-Unicode code points (those above the legal Unicode maximum of 0x10FFFF) as if they were typical unassigned Unicode code points. Prior to v5.20, Perl raised a warning and made all matches fail on non-Unicode code points. This could be somewhat surprising: chr(0x110000) =~ /\p{ASCII_Hex_Digit=True}/ # Fails on Perls < v5.20. 
chr(0x110000) =~ /\p{ASCII_Hex_Digit=False}/ # Also fails on Perls # < v5.20 Even though these two matches might be thought of as complements, until v5.20 they were so only on Unicode code points. Starting in perl v5.30, wildcards are allowed in Unicode property values. See L<perlunicode/Wildcards in Property Values>. =head4 Examples "a" =~ /\w/ # Match, "a" is a 'word' character. "7" =~ /\w/ # Match, "7" is a 'word' character as well. "a" =~ /\d/ # No match, "a" isn't a digit. "7" =~ /\d/ # Match, "7" is a digit. " " =~ /\s/ # Match, a space is whitespace. "a" =~ /\D/ # Match, "a" is a non-digit. "7" =~ /\D/ # No match, "7" is not a non-digit. " " =~ /\S/ # No match, a space is not non-whitespace. " " =~ /\h/ # Match, space is horizontal whitespace. " " =~ /\v/ # No match, space is not vertical whitespace. "\r" =~ /\v/ # Match, a return is vertical whitespace. "a" =~ /\pL/ # Match, "a" is a letter. "a" =~ /\p{Lu}/ # No match, /\p{Lu}/ matches upper case letters. "\x{0e0b}" =~ /\p{Thai}/ # Match, \x{0e0b} is the character # 'THAI CHARACTER SO SO', and that's in # Thai Unicode class. "a" =~ /\P{Lao}/ # Match, as "a" is not a Laotian character. It is worth emphasizing that C<\d>, C<\w>, etc, match single characters, not complete numbers or words. To match a number (that consists of digits), use C<\d+>; to match a word, use C<\w+>. But be aware of the security considerations in doing so, as mentioned above. =head2 Bracketed Character Classes The third form of character class you can use in Perl regular expressions is the bracketed character class. In its simplest form, it lists the characters that may be matched, surrounded by square brackets, like this: C<[aeiou]>. This matches one of C<a>, C<e>, C<i>, C<o> or C<u>. Like the other character classes, exactly one character is matched.* To match a longer string consisting of characters mentioned in the character class, follow the character class with a L<quantifier|perlre/Quantifiers>. 
For instance, C<[aeiou]+> matches one or more lowercase English vowels. Repeating a character in a character class has no effect; it's considered to be in the set only once. Examples: "e" =~ /[aeiou]/ # Match, as "e" is listed in the class. "p" =~ /[aeiou]/ # No match, "p" is not listed in the class. "ae" =~ /^[aeiou]$/ # No match, a character class only matches # a single character. "ae" =~ /^[aeiou]+$/ # Match, due to the quantifier. ------- * There are two exceptions to a bracketed character class matching a single character only. Each requires special handling by Perl to make things work: =over =item * When the class is to match caselessly under C</i> matching rules, and a character that is explicitly mentioned inside the class matches a multiple-character sequence caselessly under Unicode rules, the class will also match that sequence. For example, Unicode says that the letter C<LATIN SMALL LETTER SHARP S> should match the sequence C<ss> under C</i> rules. Thus, 'ss' =~ /\A\N{LATIN SMALL LETTER SHARP S}\z/i # Matches 'ss' =~ /\A[aeioust\N{LATIN SMALL LETTER SHARP S}]\z/i # Matches For this to happen, the class must not be inverted (see L</Negation>) and the character must be explicitly specified, and not be part of a multi-character range (not even as one of its endpoints). (L</Character Ranges> will be explained shortly.) Therefore, 'ss' =~ /\A[\0-\x{ff}]\z/ui # Doesn't match 'ss' =~ /\A[\0-\N{LATIN SMALL LETTER SHARP S}]\z/ui # No match 'ss' =~ /\A[\xDF-\xDF]\z/ui # Matches on ASCII platforms, since # \xDF is LATIN SMALL LETTER SHARP S, # and the range is just a single # element Note that it isn't a good idea to specify these types of ranges anyway. =item * Some names known to C<\N{...}> refer to a sequence of multiple characters, instead of the usual single character. When one of these is included in the class, the entire sequence is matched. 
For example, "\N{TAMIL LETTER KA}\N{TAMIL VOWEL SIGN AU}" =~ / ^ [\N{TAMIL SYLLABLE KAU}] $ /x; matches, because C<\N{TAMIL SYLLABLE KAU}> is a named sequence consisting of the two characters matched against. Like the other instance where a bracketed class can match multiple characters, and for similar reasons, the class must not be inverted, and the named sequence may not appear in a range, even one where it is both endpoints. If these happen, it is a fatal error if the character class is within the scope of L<C<use re 'strict'>|re/'strict' mode>, or within an extended L<C<(?[...])>|/Extended Bracketed Character Classes> class; otherwise only the first code point is used (with a C<regexp>-type warning raised). =back =head3 Special Characters Inside a Bracketed Character Class Most characters that are meta characters in regular expressions (that is, characters that carry a special meaning like C<.>, C<*>, or C<(>) lose their special meaning and can be used inside a character class without the need to escape them. For instance, C<[()]> matches either an opening parenthesis, or a closing parenthesis, and the parens inside the character class don't group or capture. Be aware that, unless the pattern is evaluated in single-quotish context, variable interpolation will take place before the bracketed class is parsed: $, = "\t| "; $a =~ m'[$,]'; # single-quotish: matches '$' or ',' $a =~ q{[$,]}; # same $a =~ m/[$,]/; # double-quotish: matches "\t", "|", or " " Characters that may carry a special meaning inside a character class are: C<\>, C<^>, C<->, C<[> and C<]>, and are discussed below. They can be escaped with a backslash, although this is sometimes not needed, in which case the backslash may be omitted. The sequence C<\b> is special inside a bracketed character class. 
While outside the character class, C<\b> is an assertion indicating a point that does not have either two word characters or two non-word characters on either side, inside a bracketed character class, C<\b> matches a backspace character. The sequences C<\a>, C<\c>, C<\e>, C<\f>, C<\n>, C<\N{I<NAME>}>, C<\N{U+I<hex char>}>, C<\r>, C<\t>, and C<\x> are also special and have the same meanings as they do outside a bracketed character class. Also, a backslash followed by two or three octal digits is considered an octal number. A C<[> is not special inside a character class, unless it's the start of a POSIX character class (see L</POSIX Character Classes> below). It normally does not need escaping. A C<]> is normally either the end of a POSIX character class (see L</POSIX Character Classes> below), or it signals the end of the bracketed character class. If you want to include a C<]> in the set of characters, you must generally escape it. However, if the C<]> is the I<first> (or the second if the first character is a caret) character of a bracketed character class, it does not denote the end of the class (as you cannot have an empty class) and is considered part of the set of characters that can be matched without escaping. Examples: "+" =~ /[+?*]/ # Match, "+" in a character class is not special. "\cH" =~ /[\b]/ # Match, \b inside a character class # is equivalent to a backspace. "]" =~ /[][]/ # Match, as the character class contains # both [ and ]. "[]" =~ /[[]]/ # Match, the pattern contains a character class # containing just [, and the character class is # followed by a ]. =head3 Bracketed Character Classes and the C</xx> pattern modifier Normally SPACE and TAB characters have no special meaning inside a bracketed character class; they are just added to the list of characters matched by the class. But if the L<C</xx>|perlre/E<sol>x and E<sol>xx> pattern modifier is in effect, they are generally ignored and can be added to improve readability. 
They can't be added in the middle of a single construct: / [ \x{10 FFFF} ] /xx # WRONG! The SPACE in the middle of the hex constant is illegal. To specify a literal SPACE character, you can escape it with a backslash, like: /[ a e i o u \ ]/xx This matches the English vowels plus the SPACE character. For clarity, you should already have been using C<\t> to specify a literal tab, and C<\t> is unaffected by C</xx>. =head3 Character Ranges It is not uncommon to want to match a range of characters. Luckily, instead of listing all characters in the range, one may use the hyphen (C<->). If inside a bracketed character class you have two characters separated by a hyphen, it's treated as if all characters between the two were in the class. For instance, C<[0-9]> matches any ASCII digit, and C<[a-m]> matches any lowercase letter from the first half of the ASCII alphabet. Note that the two characters on either side of the hyphen are not necessarily both letters or both digits. Any character is possible, although not advisable. C<['-?]> contains a range of characters, but most people will not know which characters that means. Furthermore, such ranges may lead to portability problems if the code has to run on a platform that uses a different character set, such as EBCDIC. If a hyphen in a character class cannot syntactically be part of a range, for instance because it is the first or the last character of the character class, or if it immediately follows a range, the hyphen isn't special, and so is considered a character to be matched literally. If you want a hyphen in your set of characters to be matched and its position in the class is such that it could be considered part of a range, you must escape that hyphen with a backslash. Examples: [a-z] # Matches a character that is a lower case ASCII letter. [a-fz] # Matches any letter between 'a' and 'f' (inclusive) or # the letter 'z'. [-z] # Matches either a hyphen ('-') or the letter 'z'. 
[a-f-m] # Matches any letter between 'a' and 'f' (inclusive), the # hyphen ('-'), or the letter 'm'. ['-?] # Matches any of the characters '()*+,-./0123456789:;<=>? # (But not on an EBCDIC platform). [\N{APOSTROPHE}-\N{QUESTION MARK}] # Matches any of the characters '()*+,-./0123456789:;<=>? # even on an EBCDIC platform. [\N{U+27}-\N{U+3F}] # Same. (U+27 is "'", and U+3F is "?") As the final two examples above show, you can achieve portability to non-ASCII platforms by using the C<\N{...}> form for the range endpoints. These indicate that the specified range is to be interpreted using Unicode values, so C<[\N{U+27}-\N{U+3F}]> means to match C<\N{U+27}>, C<\N{U+28}>, C<\N{U+29}>, ..., C<\N{U+3D}>, C<\N{U+3E}>, and C<\N{U+3F}>, whatever the native code point versions for those are. These are called "Unicode" ranges. If either end is of the C<\N{...}> form, the range is considered Unicode. A C<regexp> warning is raised under C<S<"use re 'strict'">> if the other endpoint is specified non-portably: [\N{U+00}-\x09] # Warning under re 'strict'; \x09 is non-portable [\N{U+00}-\t] # No warning; Both of the above match the characters C<\N{U+00}> C<\N{U+01}>, ... C<\N{U+08}>, C<\N{U+09}>, but the C<\x09> looks like it could be a mistake so the warning is raised (under C<re 'strict'>) for it. Perl also guarantees that the ranges C<A-Z>, C<a-z>, C<0-9>, and any subranges of these match what an English-only speaker would expect them to match on any platform. That is, C<[A-Z]> matches the 26 ASCII uppercase letters; C<[a-z]> matches the 26 lowercase letters; and C<[0-9]> matches the 10 digits. Subranges, like C<[h-k]>, match correspondingly, in this case just the four letters C<"h">, C<"i">, C<"j">, and C<"k">. This is the natural behavior on ASCII platforms where the code points (ordinal values) for C<"h"> through C<"k"> are consecutive integers (0x68 through 0x6B). But special handling to achieve this may be needed on platforms with a non-ASCII native character set. 
For example, on EBCDIC platforms, the code point for C<"h"> is 0x88, C<"i"> is 0x89, C<"j"> is 0x91, and C<"k"> is 0x92. Perl specially treats C<[h-k]> to exclude the seven code points in the gap: 0x8A through 0x90. This special handling is only invoked when the range is a subrange of one of the ASCII uppercase, lowercase, and digit ranges, AND each end of the range is expressed either as a literal, like C<"A">, or as a named character (C<\N{...}>, including the C<\N{U+...}> form). EBCDIC Examples: [i-j] # Matches either "i" or "j" [i-\N{LATIN SMALL LETTER J}] # Same [i-\N{U+6A}] # Same [\N{U+69}-\N{U+6A}] # Same [\x{89}-\x{91}] # Matches 0x89 ("i"), 0x8A .. 0x90, 0x91 ("j") [i-\x{91}] # Same [\x{89}-j] # Same [i-J] # Matches 0x89 ("i") .. 0xD1 ("J"); special # handling doesn't apply because range is mixed # case =head3 Negation It is also possible to instead list the characters you do not want to match. You can do so by using a caret (C<^>) as the first character in the character class. For instance, C<[^a-z]> matches any character that is not a lowercase ASCII letter, which therefore includes more than a million Unicode code points. The class is said to be "negated" or "inverted". This syntax makes the caret a special character inside a bracketed character class, but only if it is the first character of the class. So if you want the caret as one of the characters to match, either escape the caret or else don't list it first. In inverted bracketed character classes, Perl ignores the Unicode rules that normally say that named sequences, and certain characters under caseless C</i> matching, should match a sequence of multiple characters. Following those rules could lead to highly confusing situations: "ss" =~ /^[^\xDF]+$/ui; # Matches! This should match any sequence of characters that are neither C<\xDF> nor what C<\xDF> matches under C</i>. C<"s"> isn't C<\xDF>, but Unicode says that C<"ss"> is what C<\xDF> matches under C</i>. So which one "wins"? 
Do you fail the match because the string has C<ss> or accept it because it has an C<s> followed by another C<s>? Perl has chosen the latter. (See note in L</Bracketed Character Classes> above.) Examples: "e" =~ /[^aeiou]/ # No match, the 'e' is listed. "x" =~ /[^aeiou]/ # Match, as 'x' isn't a lowercase vowel. "^" =~ /[^^]/ # No match, matches anything that isn't a caret. "^" =~ /[x^]/ # Match, caret is not special here. =head3 Backslash Sequences You can put any backslash sequence character class (with the exception of C<\N> and C<\R>) inside a bracketed character class, and it will act just as if you had put all characters matched by the backslash sequence inside the character class. For instance, C<[a-f\d]> matches any decimal digit, or any of the lowercase letters between 'a' and 'f' inclusive. C<\N> within a bracketed character class must be of the forms C<\N{I<name>}> or C<\N{U+I<hex char>}>, and NOT be the form that matches non-newlines, for the same reason that a dot C<.> inside a bracketed character class loses its special meaning: it matches nearly anything, which generally isn't what you want to happen. Examples: /[\p{Thai}\d]/ # Matches a character that is either a Thai # character, or a digit. /[^\p{Arabic}()]/ # Matches a character that is neither an Arabic # character, nor a parenthesis. Backslash sequence character classes cannot form one of the endpoints of a range. Thus, you can't say: /[\p{Thai}-\d]/ # Wrong! =head3 POSIX Character Classes X<character class> X<\p> X<\p{}> X<alpha> X<alnum> X<ascii> X<blank> X<cntrl> X<digit> X<graph> X<lower> X<print> X<punct> X<space> X<upper> X<word> X<xdigit> POSIX character classes have the form C<[:class:]>, where I<class> is the name, and the C<[:> and C<:]> delimiters. POSIX character classes only appear I<inside> bracketed character classes, and are a convenient and descriptive way of listing a group of characters. 
Be careful about the syntax, # Correct: $string =~ /[[:alpha:]]/ # Incorrect (will warn): $string =~ /[:alpha:]/ The latter pattern would be a character class consisting of a colon, and the letters C<a>, C<l>, C<p> and C<h>. POSIX character classes can be part of a larger bracketed character class. For example, [01[:alpha:]%] is valid and matches '0', '1', any alphabetic character, and the percent sign. Perl recognizes the following POSIX character classes: alpha Any alphabetical character (e.g., [A-Za-z]). alnum Any alphanumeric character (e.g., [A-Za-z0-9]). ascii Any character in the ASCII character set. blank A GNU extension, equal to a space or a horizontal tab ("\t"). cntrl Any control character. See Note [2] below. digit Any decimal digit (e.g., [0-9]), equivalent to "\d". graph Any printable character, excluding a space. See Note [3] below. lower Any lowercase character (e.g., [a-z]). print Any printable character, including a space. See Note [4] below. punct Any graphical character excluding "word" characters. Note [5]. space Any whitespace character. "\s" including the vertical tab ("\cK"). upper Any uppercase character (e.g., [A-Z]). word A Perl extension (e.g., [A-Za-z0-9_]), equivalent to "\w". xdigit Any hexadecimal digit (e.g., [0-9a-fA-F]). Note [7]. Like the L<Unicode properties|/Unicode Properties>, most of the POSIX properties match the same regardless of whether case-insensitive (C</i>) matching is in effect or not. The two exceptions are C<[:upper:]> and C<[:lower:]>. Under C</i>, they each match the union of C<[:upper:]> and C<[:lower:]>. Most POSIX character classes have two Unicode-style C<\p> property counterparts. (They are not official Unicode properties, but Perl extensions derived from official Unicode properties.) The table below shows the relation between POSIX character classes and these counterparts. One counterpart, in the column labelled "ASCII-range Unicode" in the table, matches only characters in the ASCII character set. 
The other counterpart, in the column labelled "Full-range Unicode", matches any appropriate characters in the full Unicode character set. For example, C<\p{Alpha}> matches not just the ASCII alphabetic characters, but any character in the entire Unicode character set considered alphabetic. An entry in the column labelled "backslash sequence" is a (short) equivalent. [[:...:]] ASCII-range Full-range backslash Note Unicode Unicode sequence ----------------------------------------------------- alpha \p{PosixAlpha} \p{XPosixAlpha} alnum \p{PosixAlnum} \p{XPosixAlnum} ascii \p{ASCII} blank \p{PosixBlank} \p{XPosixBlank} \h [1] or \p{HorizSpace} [1] cntrl \p{PosixCntrl} \p{XPosixCntrl} [2] digit \p{PosixDigit} \p{XPosixDigit} \d graph \p{PosixGraph} \p{XPosixGraph} [3] lower \p{PosixLower} \p{XPosixLower} print \p{PosixPrint} \p{XPosixPrint} [4] punct \p{PosixPunct} \p{XPosixPunct} [5] \p{PerlSpace} \p{XPerlSpace} \s [6] space \p{PosixSpace} \p{XPosixSpace} [6] upper \p{PosixUpper} \p{XPosixUpper} word \p{PosixWord} \p{XPosixWord} \w xdigit \p{PosixXDigit} \p{XPosixXDigit} [7] =over 4 =item [1] C<\p{Blank}> and C<\p{HorizSpace}> are synonyms. =item [2] Control characters don't produce output as such, but instead usually control the terminal somehow: for example, newline and backspace are control characters. On ASCII platforms, in the ASCII range, characters whose code points are between 0 and 31 inclusive, plus 127 (C<DEL>) are control characters; on EBCDIC platforms, their counterparts are control characters. =item [3] Any character that is I<graphical>, that is, visible. This class consists of all alphanumeric characters and all punctuation characters. =item [4] All printable characters, which is the set of all graphical characters plus those whitespace characters which are not also controls. 
=item [5] C<\p{PosixPunct}> and C<[[:punct:]]> in the ASCII range match all non-controls, non-alphanumeric, non-space characters: C<[-!"#$%&'()*+,./:;<=E<gt>?@[\\\]^_`{|}~]> (although if a locale is in effect, it could alter the behavior of C<[[:punct:]]>). The similarly named property, C<\p{Punct}>, matches a somewhat different set in the ASCII range, namely C<[-!"#%&'()*,./:;?@[\\\]_{}]>. That is, it is missing the nine characters C<[$+E<lt>=E<gt>^`|~]>. This is because Unicode splits what POSIX considers to be punctuation into two categories, Punctuation and Symbols. C<\p{XPosixPunct}> and (under Unicode rules) C<[[:punct:]]>, match what C<\p{PosixPunct}> matches in the ASCII range, plus what C<\p{Punct}> matches. This is different than strictly matching according to C<\p{Punct}>. Another way to say it is that if Unicode rules are in effect, C<[[:punct:]]> matches all characters that Unicode considers punctuation, plus all ASCII-range characters that Unicode considers symbols. =item [6] C<\p{XPerlSpace}> and C<\p{Space}> match identically starting with Perl v5.18. In earlier versions, these differ only in that in non-locale matching, C<\p{XPerlSpace}> did not match the vertical tab, C<\cK>. Same for the two ASCII-only range forms. =item [7] Unlike C<[[:digit:]]> which matches digits in many writing systems, such as Thai and Devanagari, there are currently only two sets of hexadecimal digits, and it is unlikely that more will be added. This is because you not only need the ten digits, but also the six C<[A-F]> (and C<[a-f]>) to correspond. That means only the Latin script is suitable for these, and Unicode has only two sets of these, the familiar ASCII set, and the fullwidth forms starting at U+FF10 (FULLWIDTH DIGIT ZERO). =back There are various other synonyms that can be used besides the names listed in the table. For example, C<\p{XPosixAlpha}> can be written as C<\p{Alpha}>. All are listed in L<perluniprops/Properties accessible through \p{} and \P{}>. 
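The difference between C<[[:punct:]]> and C<\p{Punct}> described in Note [5] can be checked directly. The following is only a minimal sketch: the helper C<ascii_matches> and the variable names are made up for this illustration, and it assumes no locale is in effect.

```perl
use strict;
use warnings;

# Hypothetical helper: return, as one string, every ASCII character
# (code points 0..127) that matches the given pattern.
sub ascii_matches {
    my ($re) = @_;
    return join '', grep { /$re/ } map { chr($_) } 0 .. 127;
}

my $posix_punct   = ascii_matches(qr/[[:punct:]]/);
my $unicode_punct = ascii_matches(qr/\p{Punct}/);

# Characters that POSIX calls punctuation but that \p{Punct} does not
# match, because Unicode classifies them as symbols.
my %in_unicode = map { $_ => 1 } split //, $unicode_punct;
my $symbols    = join '', grep { !$in_unicode{$_} } split //, $posix_punct;

print "[[:punct:]] : $posix_punct\n";
print "\\p{Punct}   : $unicode_punct\n";
print "only POSIX  : $symbols\n";
```

The last line printed should list the nine characters that Note [5] says are missing from C<\p{Punct}>.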
Both the C<\p> counterparts always assume Unicode rules are in effect.
On ASCII platforms, this means they assume that the code points from 128
to 255 are Latin-1, and that means that using them under locale rules is
unwise unless the locale is guaranteed to be Latin-1 or UTF-8.  In contrast,
the POSIX character classes are useful under locale rules.  They are
affected by the actual rules in effect, as follows:

=over

=item If the C</a> modifier is in effect ...

Each of the POSIX classes matches exactly the same as its ASCII-range
counterpart.

=item otherwise ...

=over

=item For code points above 255 ...

The POSIX class matches the same as its Full-range counterpart.

=item For code points below 256 ...

=over

=item if locale rules are in effect ...

The POSIX class matches according to the locale, except:

=over

=item C<word>

also includes the platform's native underscore character, no matter what
the locale is.

=item C<ascii>

on platforms that don't have the POSIX C<ascii> extension, this matches just
the platform's native ASCII-range characters.

=item C<blank>

on platforms that don't have the POSIX C<blank> extension, this matches just
the platform's native tab and space characters.

=back

=item if, instead, Unicode rules are in effect ...

The POSIX class matches the same as the Full-range counterpart.

=item otherwise ...

The POSIX class matches the same as the ASCII range counterpart.

=back

=back

=back

Which rules apply is determined as described in
L<perlre/Which character set modifier is in effect?>.

=head4 Negation of POSIX character classes
X<character class, negation>

A Perl extension to the POSIX character class is the ability to
negate it.  This is done by prefixing the class name with a caret (C<^>).
Some examples:

     POSIX         ASCII-range     Full-range  backslash
                    Unicode         Unicode    sequence
 -----------------------------------------------------
 [[:^digit:]]   \P{PosixDigit}  \P{XPosixDigit}   \D
 [[:^space:]]   \P{PosixSpace}  \P{XPosixSpace}
                \P{PerlSpace}   \P{XPerlSpace}    \S
 [[:^word:]]    \P{PerlWord}    \P{XPosixWord}    \W

The backslash sequence can mean either ASCII- or Full-range Unicode,
depending on various factors as described in
L<perlre/Which character set modifier is in effect?>.

=head4 [= =] and [. .]

Perl recognizes the POSIX character classes C<[=class=]> and
C<[.class.]>, but does not (yet?) support them.  Any attempt to use
either construct raises an exception.

=head4 Examples

 /[[:digit:]]/            # Matches a character that is a digit.
 /[01[:lower:]]/          # Matches a character that is either a
                          # lowercase letter, or '0' or '1'.
 /[[:digit:][:^xdigit:]]/ # Matches a character that can be anything
                          # except the letters 'a' to 'f' and 'A' to
                          # 'F'.  This is because the main character
                          # class is composed of two POSIX character
                          # classes that are ORed together, one that
                          # matches any digit, and the other that
                          # matches anything that isn't a hex digit.
                          # The OR adds the digits, leaving only the
                          # letters 'a' to 'f' and 'A' to 'F' excluded.

=head3 Extended Bracketed Character Classes
X<character class>
X<set operations>

This is a fancy bracketed character class that can be used for more
readable and less error-prone classes, and to perform set operations,
such as intersection.  An example is

 /(?[ \p{Thai} & \p{Digit} ])/

This will match all the digit characters that are in the Thai script.

This is an experimental feature available starting in 5.18, and is
subject to change as we gain field experience with it.  Any attempt to
use it will raise a warning, unless disabled via

 no warnings "experimental::regex_sets";

Comments on this feature are welcome; send email to
C<perl5-porters@perl.org>.

The rules used by L<C<use re 'strict'>|re/'strict' mode> apply to this
construct.
We can extend the example above: /(?[ ( \p{Thai} + \p{Lao} ) & \p{Digit} ])/ This matches digits that are in either the Thai or Laotian scripts. Notice the white space in these examples. This construct always has the C<E<sol>xx> modifier turned on within it. The available binary operators are: & intersection + union | another name for '+', hence means union - subtraction (the result matches the set consisting of those code points matched by the first operand, excluding any that are also matched by the second operand) ^ symmetric difference (the union minus the intersection). This is like an exclusive or, in that the result is the set of code points that are matched by either, but not both, of the operands. There is one unary operator: ! complement All the binary operators left associate; C<"&"> is higher precedence than the others, which all have equal precedence. The unary operator right associates, and has highest precedence. Thus this follows the normal Perl precedence rules for logical operators. Use parentheses to override the default precedence and associativity. The main restriction is that everything is a metacharacter. Thus, you cannot refer to single characters by doing something like this: /(?[ a + b ])/ # Syntax error! The easiest way to specify an individual typable character is to enclose it in brackets: /(?[ [a] + [b] ])/ (This is the same thing as C<[ab]>.) You could also have said the equivalent: /(?[[ a b ]])/ (You can, of course, specify single characters by using, C<\x{...}>, C<\N{...}>, etc.) This last example shows the use of this construct to specify an ordinary bracketed character class without additional set operations. Note the white space within it. This is allowed because C<E<sol>xx> is automatically turned on within this construct. All the other escapes accepted by normal bracketed character classes are accepted here as well. 
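The binary operators and their precedence can be tried out on small ASCII sets. This is a sketch rather than anything authoritative: the helper C<members> and the variable names are invented for the example, and the C<no warnings> line is there only because the construct warns as experimental on the perls this document describes.

```perl
use strict;
use warnings;
no warnings 'experimental::regex_sets';

# Hypothetical helper: list the printable ASCII characters that a
# single-character pattern matches.
sub members {
    my ($re) = @_;
    return join '', grep { /\A$re\z/ } map { chr($_) } 32 .. 126;
}

# Subtraction: hex digits that are not decimal digits.
my $hex_letters = members(qr/(?[ [0-9A-Fa-f] - [0-9] ])/);

# Symmetric difference: in exactly one of the two ranges.
my $one_side = members(qr/(?[ [a-m] ^ [h-z] ])/);

# '&' binds more tightly than '+', so this is [x] + ( [a-z] & [aeiou] ).
my $x_or_vowel = members(qr/(?[ [x] + [a-z] & [aeiou] ])/);

print "subtraction          : $hex_letters\n";
print "symmetric difference : $one_side\n";
print "union of x and vowels: $x_or_vowel\n";
```

If the last pattern were evaluated strictly left to right, it would match only the vowels; because intersection has the higher precedence, C<x> survives as well.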
Because this construct compiles under L<C<use re 'strict>|re/'strict' mode>, unrecognized escapes that generate warnings in normal classes are fatal errors here, as well as all other warnings from these class elements, as well as some practices that don't currently warn outside C<re 'strict'>. For example you cannot say /(?[ [ \xF ] ])/ # Syntax error! You have to have two hex digits after a braceless C<\x> (use a leading zero to make two). These restrictions are to lower the incidence of typos causing the class to not match what you thought it would. If a regular bracketed character class contains a C<\p{}> or C<\P{}> and is matched against a non-Unicode code point, a warning may be raised, as the result is not Unicode-defined. No such warning will come when using this extended form. The final difference between regular bracketed character classes and these, is that it is not possible to get these to match a multi-character fold. Thus, /(?[ [\xDF] ])/iu does not match the string C<ss>. You don't have to enclose POSIX class names inside double brackets, hence both of the following work: /(?[ [:word:] - [:lower:] ])/ /(?[ [[:word:]] - [[:lower:]] ])/ Any contained POSIX character classes, including things like C<\w> and C<\D> respect the C<E<sol>a> (and C<E<sol>aa>) modifiers. Note that C<< (?[ ]) >> is a regex-compile-time construct. Any attempt to use something which isn't knowable at the time the containing regular expression is compiled is a fatal error. In practice, this means just three limitations: =over 4 =item 1 When compiled within the scope of C<use locale> (or the C<E<sol>l> regex modifier), this construct assumes that the execution-time locale will be a UTF-8 one, and the generated pattern always uses Unicode rules. What gets matched or not thus isn't dependent on the actual runtime locale, so tainting is not enabled. But a C<locale> category warning is raised if the runtime locale turns out to not be UTF-8. 
=item 2

Any L<user-defined property|perlunicode/"User-Defined Character Properties">
used must be already defined by the time the regular expression is
compiled (but note that this construct can be used instead of such
properties).

=item 3

A regular expression that otherwise would compile
using C<E<sol>d> rules, and which uses this construct will instead
use C<E<sol>u>.  Thus this construct tells Perl that you don't want
C<E<sol>d> rules for the entire regular expression containing it.

=back

Note that skipping white space applies only to the interior of this
construct.  There must not be any space between any of the characters
that form the initial C<(?[>.  Nor may there be space between the
closing C<])> characters.

Just as in all regular expressions, the pattern can be built up by
including variables that are interpolated at regex compilation time.
But it's best to compile each sub-component.

 my $thai_or_lao = qr/(?[ \p{Thai} + \p{Lao} ])/;
 my $lower = qr/(?[ \p{Lower} + \p{Digit} ])/;

When these are embedded in another pattern, what they match does not
change, regardless of parenthesization or what modifiers are in effect
in that outer pattern.  If you fail to compile the subcomponents, you
can get some nasty surprises.  For example:

 my $thai_or_lao = '\p{Thai} + \p{Lao}';
 ...
 qr/(?[ \p{Digit} & $thai_or_lao ])/;

compiles to

 qr/(?[ \p{Digit} & \p{Thai} + \p{Lao} ])/;

But this does not have the effect that someone reading the source code
would likely expect, as the intersection applies only to C<\p{Thai}>;
the Laotian characters are then added without being restricted to digits.

It's best to compile the subcomponents, but you could also parenthesize
the component pieces:

 my $thai_or_lao = '( \p{Thai} + \p{Lao} )';

But any modifiers will still apply to all the components:

 my $lower = '\p{Lower} + \p{Digit}';
 qr/(?[ \p{Greek} & $lower ])/i;

matches upper case things.  So just compile the subcomponents, as
illustrated above.
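The same surprise can be reproduced with plain ASCII sets, which makes it easy to see which characters slip through. The sketch below uses made-up variable names and is not taken from the Perl sources; it only contrasts an interpolated string with an interpolated, already-compiled C<qr//> object.

```perl
use strict;
use warnings;
no warnings 'experimental::regex_sets';

# The same union, once as a plain string and once compiled on its own.
my $union_string   = '[a-f] + [0-9]';
my $union_compiled = qr/(?[ [a-f] + [0-9] ])/;

# With the string, '&' applies to [a-f] only:
#     ( [d-z] & [a-f] ) + [0-9]
my $from_string = qr/(?[ [d-z] & $union_string ])/;

# With the compiled object, the union stays a unit:
#     [d-z] & ( [a-f] + [0-9] )
my $from_compiled = qr/(?[ [d-z] & $union_compiled ])/;

for my $char (qw(e 7 b)) {
    printf "%s: string form %s, compiled form %s\n",
        $char,
        ($char =~ $from_string   ? 'matches' : 'does not match'),
        ($char =~ $from_compiled ? 'matches' : 'does not match');
}
```

Only the string form lets the digit C<7> through, which is the ASCII analogue of the Laotian characters escaping the intersection in the example above.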
Due to the way that Perl parses things, your parentheses and brackets
may need to be balanced, even including comments.  If you run into any
examples, please submit them to L<https://github.com/Perl/perl5/issues>,
so that we can have a concrete example for this man page.

We may change this construct so that usages which remain legal in normal
bracketed character classes become illegal within this experimental
construct.  One proposal, for example, is to forbid adjacent uses of the
same character, as in C<(?[ [aa] ])>.  The motivation for such a change
is that this usage is likely a typo, as the second "a" adds nothing.

=head1 NAME

perlunitut - Perl Unicode Tutorial

=head1 DESCRIPTION

The days of just flinging strings around are over. It's well established that
modern programs need to be capable of communicating funny accented letters, and
things like euro symbols. This means that programmers need new habits. It's
easy to program Unicode-capable software, but it does require discipline to do
it right.

There's a lot to know about character sets, and text encodings. It's probably
best to spend a full day learning all this, but the basics can be learned in
minutes.

These are not the very basics, though. It is assumed that you already
know the difference between bytes and characters, and realise (and accept!)
that there are many different character sets and encodings, and that your
program has to be explicit about them. Recommended reading is "The Absolute
Minimum Every Software Developer Absolutely, Positively Must Know About Unicode
and Character Sets (No Excuses!)" by Joel Spolsky, at
L<http://joelonsoftware.com/articles/Unicode.html>.

This tutorial speaks in rather absolute terms, and provides only a limited view
of the wealth of character string related features that Perl has to offer. For
most projects, this information will probably suffice.

=head2 Definitions

It's important to set a few things straight first.
This is the most important part of this tutorial. This view may conflict with other information that you may have found on the web, but that's mostly because many sources are wrong. You may have to re-read this entire section a few times... =head3 Unicode B<Unicode> is a character set with room for lots of characters. The ordinal value of a character is called a B<code point>. (But in practice, the distinction between code point and character is blurred, so the terms often are used interchangeably.) There are many, many code points, but computers work with bytes, and a byte has room for only 256 values. Unicode has many more characters than that, so you need a method to make these accessible. Unicode is encoded using several competing encodings, of which UTF-8 is the most used. In a Unicode encoding, multiple subsequent bytes can be used to store a single code point, or simply: character. =head3 UTF-8 B<UTF-8> is a Unicode encoding. Many people think that Unicode and UTF-8 are the same thing, but they're not. There are more Unicode encodings, but much of the world has standardized on UTF-8. UTF-8 treats the first 128 codepoints, 0..127, the same as ASCII. They take only one byte per character. All other characters are encoded as two to four bytes using a complex scheme. Fortunately, Perl handles this for us, so we don't have to worry about this. =head3 Text strings (character strings) B<Text strings>, or B<character strings> are made of characters. Bytes are irrelevant here, and so are encodings. Each character is just that: the character. On a text string, you would do things like: $text =~ s/foo/bar/; if ($string =~ /^\d+$/) { ... } $text = ucfirst $text; my $character_count = length $text; The value of a character (C<ord>, C<chr>) is the corresponding Unicode code point. =head3 Binary strings (byte strings) B<Binary strings>, or B<byte strings> are made of bytes. Here, you don't have characters, just bytes. 
All communication with the outside world (anything outside of your current Perl process) is done in binary. On a binary string, you would do things like: my (@length_content) = unpack "(V/a)*", $binary; $binary =~ s/\x00\x0F/\xFF\xF0/; # for the brave :) print {$fh} $binary; my $byte_count = length $binary; =head3 Encoding B<Encoding> (as a verb) is the conversion from I<text> to I<binary>. To encode, you have to supply the target encoding, for example C<iso-8859-1> or C<UTF-8>. Some encodings, like the C<iso-8859> ("latin") range, do not support the full Unicode standard; characters that can't be represented are lost in the conversion. =head3 Decoding B<Decoding> is the conversion from I<binary> to I<text>. To decode, you have to know what encoding was used during the encoding phase. And most of all, it must be something decodable. It doesn't make much sense to decode a PNG image into a text string. =head3 Internal format Perl has an B<internal format>, an encoding that it uses to encode text strings so it can store them in memory. All text strings are in this internal format. In fact, text strings are never in any other format! You shouldn't worry about what this format is, because conversion is automatically done when you decode or encode. =head2 Your new toolkit Add to your standard heading the following line: use Encode qw(encode decode); Or, if you're lazy, just: use Encode; =head2 I/O flow (the actual 5 minute tutorial) The typical input/output flow of a program is: 1. Receive and decode 2. Process 3. Encode and output If your input is binary, and is supposed to remain binary, you shouldn't decode it to a text string, of course. But in all other cases, you should decode it. Decoding can't happen reliably if you don't know how the data was encoded. If you get to choose, it's a good idea to standardize on UTF-8. 
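The three steps listed above can be sketched end to end. This is only an illustration under stated assumptions: the input bytes are written out literally instead of being read from a file or socket, and the variable names are invented for the example.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# 1. Receive and decode.  These five bytes are the UTF-8 encoding of
#    "cafe" with an acute accent on the final letter.
my $bytes = "caf\xC3\xA9";
my $text  = decode('UTF-8', $bytes);    # now a text string

# 2. Process, working in characters rather than bytes.
my $char_count = length $text;          # counts characters
my $shout      = uc $text;              # uppercases the accented letter too

# 3. Encode and output.
my $out        = encode('UTF-8', $shout);
my $byte_count = length $out;           # counts bytes again

binmode STDOUT;                         # we are printing a byte string
print $out, "\n";
print "characters: $char_count, bytes: $byte_count\n";
```

The character count and the byte count differ because the accented letter is one character but takes two bytes in UTF-8.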
my $foo = decode('UTF-8', get 'http://example.com/'); my $bar = decode('ISO-8859-1', readline STDIN); my $xyzzy = decode('Windows-1251', $cgi->param('foo')); Processing happens as you knew before. The only difference is that you're now using characters instead of bytes. That's very useful if you use things like C<substr>, or C<length>. It's important to realize that there are no bytes in a text string. Of course, Perl has its internal encoding to store the string in memory, but ignore that. If you have to do anything with the number of bytes, it's probably best to move that part to step 3, just after you've encoded the string. Then you know exactly how many bytes it will be in the destination string. The syntax for encoding text strings to binary strings is as simple as decoding: $body = encode('UTF-8', $body); If you needed to know the length of the string in bytes, now's the perfect time for that. Because C<$body> is now a byte string, C<length> will report the number of bytes, instead of the number of characters. The number of characters is no longer known, because characters only exist in text strings. my $byte_count = length $body; And if the protocol you're using supports a way of letting the recipient know which character encoding you used, please help the receiving end by using that feature! For example, E-mail and HTTP support MIME headers, so you can use the C<Content-Type> header. They can also have C<Content-Length> to indicate the number of I<bytes>, which is always a good idea to supply if the number is known. "Content-Type: text/plain; charset=UTF-8", "Content-Length: $byte_count" =head1 SUMMARY Decode everything you receive, encode everything you send out. (If it's text data.) =head1 Q and A (or FAQ) After reading this document, you ought to read L<perlunifaq> too, then L<perluniintro>. =head1 ACKNOWLEDGEMENTS Thanks to Johan Vromans from Squirrel Consultancy. 
His UTF-8 rants during the Amsterdam Perl Mongers meetings got me interested and
determined to find out how to use character encodings in Perl in ways that
don't break easily.

Thanks to Gerard Goossen from TTY. His presentation "UTF-8 in the wild" (Dutch
Perl Workshop 2006) inspired me to publish my thoughts and write this tutorial.

Thanks to the people who asked about this kind of stuff in several Perl IRC
channels, and have constantly reminded me that a simpler explanation was
needed.

Thanks to the people who reviewed this document for me, before it went public.
They are: Benjamin Smith, Jan-Pieter Cornet, Johan Vromans, Lukas Mai, Nathan
Gray.

=head1 AUTHOR

Juerd Waalboer <#####@juerd.nl>

=head1 SEE ALSO

L<perlunifaq>, L<perlunicode>, L<perluniintro>, L<Encode>

If you read this file _as_is_, just ignore the funny characters you see.
It is written in the POD format (see pod/perlpod.pod) which is specifically
designed to be readable as is.

=head1 NAME

perlopenbsd - Perl version 5 on OpenBSD systems

=head1 DESCRIPTION

This document describes various features of OpenBSD that will affect how Perl
version 5 (hereafter just Perl) is compiled and/or runs.

=head2 OpenBSD core dumps from getprotobyname_r and getservbyname_r with ithreads

When Perl is configured to use ithreads, it will use re-entrant library calls
in preference to non-re-entrant versions.  There is an incompatibility in
OpenBSD's C<getprotobyname_r> and C<getservbyname_r> functions in versions 3.7
and later that will cause a SEGV when called without doing a C<bzero> on
their return structs prior to calling these functions.  Current Perls
should handle this problem correctly.  Older threaded Perls (5.8.6 or earlier)
will run into this problem.  If you want to run a threaded Perl on OpenBSD 3.7
or higher, you will need to upgrade to at least Perl 5.8.7.
=head1 AUTHOR

Steve Peters <steve@fisharerojo.org>

Please report any errors, updates, or suggestions to
L<https://github.com/Perl/perl5/issues>.

=encoding utf8

=for comment
Consistent formatting of this file is achieved with:
  perl ./Porting/podtidy pod/perlgit.pod

=head1 NAME

perlgit - Detailed information about git and the Perl repository

=head1 DESCRIPTION

This document provides details on using git to develop Perl. If you are
just interested in working on a quick patch, see L<perlhack> first.
This document is intended for people who are regular contributors to
Perl, including those with write access to the git repository.

=head1 CLONING THE REPOSITORY

All of Perl's source code is kept centrally in a Git repository at
I<github.com>.

You can make a read-only clone of the repository by running:

  % git clone git://github.com/Perl/perl5.git perl

This uses the git protocol (port 9418).

If you cannot use the git protocol for firewall reasons, you can also
clone via https:

  % git clone https://github.com/Perl/perl5.git perl

=head1 WORKING WITH THE REPOSITORY

Once you have changed into the repository directory, you can inspect
it. After a clone the repository will contain a single local branch,
which will be the current branch as well, as indicated by the asterisk.

  % git branch
  * blead

Using the -a switch to C<branch> will also show the remote tracking
branches in the repository:

  % git branch -a
  * blead
    origin/HEAD
    origin/blead
  ...

The branches that begin with "origin" correspond to the "git remote"
that you cloned from (which is named "origin"). Each branch on the
remote will be exactly tracked by these branches. You should NEVER do
work on these remote tracking branches. You only ever do work in a
local branch. Local branches can be configured to automerge (on pull)
from a designated remote tracking branch.
This is the case with the default branch C<blead> which will be configured to merge from the remote tracking branch C<origin/blead>. You can see recent commits: % git log And pull new changes from the repository, and update your local repository (must be clean first) % git pull Assuming we are on the branch C<blead> immediately after a pull, this command would be more or less equivalent to: % git fetch % git merge origin/blead In fact if you want to update your local repository without touching your working directory you do: % git fetch And if you want to update your remote-tracking branches for all defined remotes simultaneously you can do % git remote update Neither of these last two commands will update your working directory, however both will update the remote-tracking branches in your repository. To make a local branch of a remote branch: % git checkout -b maint-5.10 origin/maint-5.10 To switch back to blead: % git checkout blead =head2 Finding out your status The most common git command you will use will probably be % git status This command will produce as output a description of the current state of the repository, including modified files and unignored untracked files, and in addition it will show things like what files have been staged for the next commit, and usually some useful information about how to change things. For instance the following: % git status On branch blead Your branch is ahead of 'origin/blead' by 1 commit. Changes to be committed: (use "git reset HEAD <file>..." to unstage) modified: pod/perlgit.pod Changes not staged for commit: (use "git add <file>..." to update what will be committed) (use "git checkout -- <file>..." to discard changes in working directory) modified: pod/perlgit.pod Untracked files: (use "git add <file>..." to include in what will be committed) deliberate.untracked This shows that there were changes to this document staged for commit, and that there were further changes in the working directory not yet staged. 
It also shows that there was an untracked file in the working directory, and as you can see shows how to change all of this. It also shows that there is one commit on the working branch C<blead> which has not been pushed to the C<origin> remote yet. B<NOTE>: This output is also what you see as a template if you do not provide a message to C<git commit>. =head2 Patch workflow First, please read L<perlhack> for details on hacking the Perl core. That document covers many details on how to create a good patch. If you already have a Perl repository, you should ensure that you're on the I<blead> branch, and your repository is up to date: % git checkout blead % git pull It's preferable to patch against the latest blead version, since this is where new development occurs for all changes other than critical bug fixes. Critical bug fix patches should be made against the relevant maint branches, or should be submitted with a note indicating all the branches where the fix should be applied. Now that we have everything up to date, we need to create a temporary new branch for these changes and switch into it: % git checkout -b orange which is the short form of % git branch orange % git checkout orange Creating a topic branch makes it easier for the maintainers to rebase or merge back into the master blead for a more linear history. If you don't work on a topic branch the maintainer has to manually cherry pick your changes onto blead before they can be applied. That'll get you scolded on perl5-porters, so don't do that. Be Awesome. Then make your changes. For example, if Leon Brocard changes his name to Orange Brocard, we should change his name in the AUTHORS file: % perl -pi -e 's{Leon Brocard}{Orange Brocard}' AUTHORS You can see what files are changed: % git status On branch orange Changes to be committed: (use "git reset HEAD <file>..." 
to unstage) modified: AUTHORS And you can see the changes: % git diff diff --git a/AUTHORS b/AUTHORS index 293dd70..722c93e 100644 --- a/AUTHORS +++ b/AUTHORS @@ -541,7 +541,7 @@ Lars Hecking <lhecking@nmrc.ucc.ie> Laszlo Molnar <laszlo.molnar@eth.ericsson.se> Leif Huhn <leif@hale.dkstat.com> Len Johnson <lenjay@ibm.net> -Leon Brocard <acme@astray.com> +Orange Brocard <acme@astray.com> Les Peters <lpeters@aol.net> Lesley Binks <lesley.binks@gmail.com> Lincoln D. Stein <lstein@cshl.org> Now commit your change locally: % git commit -a -m 'Rename Leon Brocard to Orange Brocard' Created commit 6196c1d: Rename Leon Brocard to Orange Brocard 1 files changed, 1 insertions(+), 1 deletions(-) The C<-a> option is used to include all files that git tracks that you have changed. If at this time, you only want to commit some of the files you have worked on, you can omit the C<-a> and use the command C<S<git add I<FILE ...>>> before doing the commit. C<S<git add --interactive>> allows you to even just commit portions of files instead of all the changes in them. The C<-m> option is used to specify the commit message. If you omit it, git will open a text editor for you to compose the message interactively. This is useful when the changes are more complex than the sample given here, and, depending on the editor, to know that the first line of the commit message doesn't exceed the 50 character legal maximum. See L<perlhack/Commit message> for more information about what makes a good commit message. Once you've finished writing your commit message and exited your editor, git will write your change to disk and tell you something like this: Created commit daf8e63: explain git status and stuff about remotes 1 files changed, 83 insertions(+), 3 deletions(-) If you re-run C<git status>, you should see something like this: % git status On branch orange Untracked files: (use "git add <file>..." 
to include in what will be committed) deliberate.untracked nothing added to commit but untracked files present (use "git add" to track) When in doubt, before you do anything else, check your status and read it carefully, many questions are answered directly by the git status output. You can examine your last commit with: % git show HEAD and if you are not happy with either the description or the patch itself you can fix it up by editing the files once more and then issue: % git commit -a --amend Now, create a fork on GitHub to push your branch to, and add it as a remote if you haven't already, as described in the GitHub documentation at L<https://help.github.com/en/articles/working-with-forks>: % git remote add fork git@github.com:MyUser/perl5.git And push the branch to your fork: % git push -u fork orange You should now submit a Pull Request (PR) on GitHub from the new branch to blead. For more information, see the GitHub documentation at L<https://help.github.com/en/articles/creating-a-pull-request-from-a-fork>. You can also send patch files to L<perl5-porters@perl.org|mailto:perl5-porters@perl.org> directly if the patch is not ready to be applied, but intended for discussion. To create a patch file for all your local changes: % git format-patch -M blead.. 0001-Rename-Leon-Brocard-to-Orange-Brocard.patch Or for a lot of changes, e.g. from a topic branch: % git format-patch --stdout -M blead.. > topic-branch-changes.patch If you want to delete your temporary branch, you may do so with: % git checkout blead % git branch -d orange error: The branch 'orange' is not an ancestor of your current HEAD. If you are sure you want to delete it, run 'git branch -D orange'. % git branch -D orange Deleted branch orange. =head2 A note on derived files Be aware that many files in the distribution are derivative--avoid patching them, because git won't see the changes to them, and the build process will overwrite them. Patch the originals instead. 
Most utilities (like perldoc) are in this category, i.e. patch F<utils/perldoc.PL> rather than F<utils/perldoc>. Similarly, don't create patches for files under F<$src_root/ext> from their copies found in F<$install_root/lib>. If you are unsure about the proper location of a file that may have gotten copied while building the source distribution, consult the F<MANIFEST>. =head2 Cleaning a working directory The command C<git clean> can with varying arguments be used as a replacement for C<make clean>. To reset your working directory to a pristine condition you can do: % git clean -dxf However, be aware this will delete ALL untracked content. You can use % git clean -Xf to remove all ignored untracked files, such as build and test byproduct, but leave any manually created files alone. If you only want to cancel some uncommitted edits, you can use C<git checkout> and give it a list of files to be reverted, or C<git checkout -f> to revert them all. If you want to cancel one or several commits, you can use C<git reset>. =head2 Bisecting C<git> provides a built-in way to determine which commit should be blamed for introducing a given bug. C<git bisect> performs a binary search of history to locate the first failing commit. It is fast, powerful and flexible, but requires some setup and to automate the process an auxiliary shell script is needed. The core provides a wrapper program, F<Porting/bisect.pl>, which attempts to simplify as much as possible, making bisecting as simple as running a Perl one-liner. For example, if you want to know when this became an error: perl -e 'my $a := 2' you simply run this: .../Porting/bisect.pl -e 'my $a := 2;' Using F<Porting/bisect.pl>, with one command (and no other files) it's easy to find out =over 4 =item * Which commit caused this example code to break? =item * Which commit caused this example code to start working? =item * Which commit added the first file to match this regex? 
=item * Which commit removed the last file to match this regex? =back usually without needing to know which versions of perl to use as start and end revisions, as F<Porting/bisect.pl> automatically searches to find the earliest stable version for which the test case passes. Run C<Porting/bisect.pl --help> for the full documentation, including how to set the C<Configure> and build-time options. If you require more flexibility than F<Porting/bisect.pl> has to offer, you'll need to run C<git bisect> yourself. It's most useful to use C<git bisect run> to automate the building and testing of perl revisions. For this you'll need a shell script for C<git> to call to test a particular revision. An example script is F<Porting/bisect-example.sh>, which you should copy B<outside> of the repository, as the bisect process will reset the state to a clean checkout as it runs. The instructions below assume that you copied it as F<~/run> and then edited it as appropriate. You first enter bisect mode with: % git bisect start For example, if the bug is present on C<HEAD> but wasn't in 5.10.0, C<git> will learn about this when you enter: % git bisect bad % git bisect good perl-5.10.0 Bisecting: 853 revisions left to test after this This results in checking out the median commit between C<HEAD> and C<perl-5.10.0>. You can then run the bisecting process with: % git bisect run ~/run When the first bad commit is isolated, C<git bisect> will tell you so: ca4cfd28534303b82a216cfe83a1c80cbc3b9dc5 is first bad commit commit ca4cfd28534303b82a216cfe83a1c80cbc3b9dc5 Author: Dave Mitchell <davem@fdisolutions.com> Date: Sat Feb 9 14:56:23 2008 +0000 [perl #49472] Attributes + Unknown Error ... bisect run success You can peek into the bisecting process with C<git bisect log> and C<git bisect visualize>. C<git bisect reset> will get you out of bisect mode. Please note that the first C<good> state must be an ancestor of the first C<bad> state.
If you want to search for the commit that I<solved> some bug, you have to negate your test case (i.e. exit with C<1> if OK and C<0> if not) and still mark the lower bound as C<good> and the upper as C<bad>. The "first bad commit" has then to be understood as the "first commit where the bug is solved". C<git help bisect> has much more information on how you can tweak your binary searches. Following bisection you may wish to configure, build and test perl at commits identified by the bisection process. Sometimes, particularly with older perls, C<make> may fail during this process. In this case you may be able to patch the source code at the older commit point. To do so, please follow the suggestions provided in L<perlhack/Building perl at older commits>. =head2 Topic branches and rewriting history Individual committers should create topic branches under B<yourname>/B<some_descriptive_name>: % branch="$yourname/$some_descriptive_name" % git checkout -b $branch ... do local edits, commits etc ... % git push origin -u $branch Should you be stuck with an ancient version of git (prior to 1.7), then C<git push> will not have the C<-u> switch, and you have to replace the last step with the following sequence: % git push origin $branch:refs/heads/$branch % git config branch.$branch.remote origin % git config branch.$branch.merge refs/heads/$branch If you want to make changes to someone else's topic branch, you should check with its creator before making any change to it. You might sometimes find that the original author has edited the branch's history. There are lots of good reasons for this. Sometimes, an author might simply be rebasing the branch onto a newer source point. Sometimes, an author might have found an error in an early commit which they wanted to fix before merging the branch to blead. Currently the master repository is configured to forbid non-fast-forward merges. This means that the branches within can not be rebased and pushed as a single step. 
The only way you will ever be allowed to rebase or modify the history of a pushed branch is to delete it and push it as a new branch under the same name. Please think carefully about doing this. It may be better to sequentially rename your branches so that it is easier for others working with you to cherry-pick their local changes onto the new version. (XXX: needs explanation). If you want to rebase a personal topic branch, you will have to delete your existing topic branch and push as a new version of it. You can do this via the following formula (see the explanation about C<refspec>'s in the git push documentation for details) after you have rebased your branch: # first rebase % git checkout $user/$topic % git fetch % git rebase origin/blead # then "delete-and-push" % git push origin :$user/$topic % git push origin $user/$topic B<NOTE:> it is forbidden at the repository level to delete any of the "primary" branches. That is any branch matching C<m!^(blead|maint|perl)!>. Any attempt to do so will result in git producing an error like this: % git push origin :blead *** It is forbidden to delete blead/maint branches in this repository error: hooks/update exited with error code 1 error: hook declined to update refs/heads/blead To ssh://perl5.git.perl.org/perl ! [remote rejected] blead (hook declined) error: failed to push some refs to 'ssh://perl5.git.perl.org/perl' As a matter of policy we do B<not> edit the history of the blead and maint-* branches. If a typo (or worse) sneaks into a commit to blead or maint-*, we'll fix it in another commit. The only types of updates allowed on these branches are "fast-forwards", where all history is preserved. Annotated tags in the canonical perl.git repository will never be deleted or modified. Think long and hard about whether you want to push a local tag to perl.git before doing so. (Pushing simple tags is not allowed.) 
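The protected-branch pattern quoted above, C<m!^(blead|maint|perl)!>, is anchored only at the start of the name, so it is worth checking a proposed topic branch name against it before pushing. The following is a minimal sketch only: the branch names are hypothetical, and how the server-side hook actually applies the pattern is an assumption.

```perl
use strict;
use warnings;

# Pattern quoted in the text above. How the repository hook applies it
# is an assumption of this sketch.
my $primary = qr!^(blead|maint|perl)!;

# Hypothetical branch names, used only for illustration.
my @branches = qw(
    blead
    maint-5.10
    perlish-idea
    yourname/some_descriptive_name
    smoke-me/tonyc/win32stat
);

# Record which names the pattern would treat as "primary".
my %is_primary = map { $_ => ($_ =~ $primary ? 1 : 0) } @branches;

for my $name (@branches) {
    printf "%-32s %s\n", $name,
        $is_primary{$name} ? "matches the primary pattern" : "does not match";
}
```

Because the pattern has no trailing anchor, a name such as C<perlish-idea> matches it as well, which is one more reason to keep personal work under a B<yourname>/ prefix.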
=head2 Grafts The perl history contains one mistake which was not caught in the conversion: a merge was recorded in the history between blead and maint-5.10 where no merge actually occurred. Due to the nature of git, this is now impossible to fix in the public repository. You can remove this mis-merge locally by adding the following line to your C<.git/info/grafts> file: 296f12bbbbaa06de9be9d09d3dcf8f4528898a49 434946e0cb7a32589ed92d18008aaa1d88515930 It is particularly important to have this graft line if any bisecting is done in the area of the "merge" in question. =head1 WRITE ACCESS TO THE GIT REPOSITORY Once you have write access, you will need to modify the URL for the origin remote to enable pushing. Edit F<.git/config> with the git-config(1) command: % git config remote.origin.url git@github.com:Perl/perl5.git You can also set up your user name and e-mail address. Most people do this once globally in their F<~/.gitconfig> by doing something like: % git config --global user.name "Ævar Arnfjörð Bjarmason" % git config --global user.email avarab@gmail.com However, if you'd like to override that just for perl, execute something like the following in F<perl>: % git config user.email avar@cpan.org It is also possible to keep C<origin> as a git remote, and add a new remote for ssh access: % git remote add camel git@github.com:Perl/perl5.git This allows you to update your local repository by pulling from C<origin>, which is faster and doesn't require you to authenticate, and to push your changes back with the C<camel> remote: % git fetch camel % git push camel The C<fetch> command just updates the C<camel> refs, as the objects themselves should have been fetched when pulling from C<origin>. =head2 Accepting a patch If you have received a patch file generated using the above section, you should try out the patch. 
First we need to create a temporary new branch for these changes and switch into it: % git checkout -b experimental Patches that were formatted by C<git format-patch> are applied with C<git am>: % git am 0001-Rename-Leon-Brocard-to-Orange-Brocard.patch Applying Rename Leon Brocard to Orange Brocard Note that some UNIX mail systems can mess with text attachments containing 'From '. This will fix them up: % perl -pi -e's/^>From /From /' \ 0001-Rename-Leon-Brocard-to-Orange-Brocard.patch If just a raw diff is provided, it is also possible to use this two-step process: % git apply bugfix.diff % git commit -a -m "Some fixing" \ --author="That Guy <that.guy@internets.com>" Now we can inspect the change: % git show HEAD commit b1b3dab48344cff6de4087efca3dbd63548ab5e2 Author: Leon Brocard <acme@astray.com> Date: Fri Dec 19 17:02:59 2008 +0000 Rename Leon Brocard to Orange Brocard diff --git a/AUTHORS b/AUTHORS index 293dd70..722c93e 100644 --- a/AUTHORS +++ b/AUTHORS @@ -541,7 +541,7 @@ Lars Hecking <lhecking@nmrc.ucc.ie> Laszlo Molnar <laszlo.molnar@eth.ericsson.se> Leif Huhn <leif@hale.dkstat.com> Len Johnson <lenjay@ibm.net> -Leon Brocard <acme@astray.com> +Orange Brocard <acme@astray.com> Les Peters <lpeters@aol.net> Lesley Binks <lesley.binks@gmail.com> Lincoln D. Stein <lstein@cshl.org> If you are a committer to Perl and you think the patch is good, you can then merge it into blead and push it out to the main repository: % git checkout blead % git merge experimental % git push origin blead If you want to delete your temporary branch, you may do so with: % git checkout blead % git branch -d experimental error: The branch 'experimental' is not an ancestor of your current HEAD. If you are sure you want to delete it, run 'git branch -D experimental'. % git branch -D experimental Deleted branch experimental. =head2 Committing to blead The 'blead' branch will become the next production release of Perl.
Before pushing I<any> local change to blead, it's incredibly important that you do a few things, lest other committers come after you with pitchforks and torches: =over =item * Make sure you have a good commit message. See L<perlhack/Commit message> for details. =item * Run the test suite. You might not think that one typo fix would break a test file. You'd be wrong. Here's an example of where not running the suite caused problems. A patch was submitted that added a couple of tests to an existing F<.t>. It couldn't possibly affect anything else, so no need to test beyond the single affected F<.t>, right? But, the submitter's email address had changed since the last of their submissions, and this caused other tests to fail. Running the test target given in the next item would have caught this problem. =item * If you don't run the full test suite, at least C<make test_porting>. This will run basic sanity checks. To see which sanity checks, have a look in F<t/porting>. =item * If you make any changes that affect miniperl or core routines that have different code paths for miniperl, be sure to run C<make minitest>. This will catch problems that even the full test suite will not catch because it runs a subset of tests under miniperl rather than perl. =back =head2 On merging and rebasing Simple, one-off commits pushed to the 'blead' branch should be simple commits that apply cleanly. In other words, you should make sure your work is committed against the current position of blead, so that you can push back to the master repository without merging. Sometimes, blead will move while you're building or testing your changes. When this happens, your push will be rejected with a message like this: To ssh://perl5.git.perl.org/perl.git ! [rejected] blead -> blead (non-fast-forward) error: failed to push some refs to 'ssh://perl5.git.perl.org/perl.git' To prevent you from losing history, non-fast-forward updates were rejected Merge the remote changes (e.g. 
'git pull') before pushing again. See the 'Note about fast-forwards' section of 'git push --help' for details. When this happens, you can just I<rebase> your work against the new position of blead, like this (assuming your remote for the master repository is "p5p"): % git fetch p5p % git rebase p5p/blead You will see your commits being re-applied, and you will then be able to push safely. More information about rebasing can be found in the documentation for the git-rebase(1) command. For larger sets of commits that only make sense together, or that would benefit from a summary of the set's purpose, you should use a merge commit. You should perform your work on a L<topic branch|/Topic branches and rewriting history>, which you should regularly rebase against blead to ensure that your code is not broken by blead moving. When you have finished your work, please perform a final rebase and test. Linear history is something that gets lost with every commit on blead, but a final rebase makes the history linear again, making it easier for future maintainers to see what has happened. Rebase as follows (assuming your work was on the branch C<< committer/somework >>): % git checkout committer/somework % git rebase blead Then you can merge it into blead like this: % git checkout blead % git merge --no-ff --no-commit committer/somework % git commit -a The switches above deserve explanation. C<--no-ff> indicates that even if all your work can be applied linearly against blead, a merge commit should still be prepared. This ensures that all your work will be shown as a side branch, with all its commits merged into the mainstream blead by the merge commit. C<--no-commit> means that the merge commit will be I<prepared> but not I<committed>. The commit is then actually performed when you run the next command, which will bring up your editor to describe the commit.
Without C<--no-commit>, the commit would be made with nearly no useful message, which would greatly diminish the value of the merge commit as a placeholder for the work's description. When describing the merge commit, explain the purpose of the branch, and keep in mind that this description will probably be used by the eventual release engineer when reviewing the next perldelta document. =head2 Committing to maintenance versions Maintenance versions should only be altered to add critical bug fixes, see L<perlpolicy>. To commit to a maintenance version of perl, you need to create a local tracking branch: % git checkout --track -b maint-5.005 origin/maint-5.005 This creates a local branch named C<maint-5.005>, which tracks the remote branch C<origin/maint-5.005>. Then you can pull, commit, merge and push as before. You can also cherry-pick commits from blead and another branch, by using the C<git cherry-pick> command. It is recommended to use the B<-x> option to C<git cherry-pick> in order to record the SHA1 of the original commit in the new commit message. Before pushing any change to a maint version, make sure you've satisfied the steps in L</Committing to blead> above. =head2 Using a smoke-me branch to test changes Sometimes a change affects code paths which you cannot test on the OSes which are directly available to you and it would be wise to have users on other OSes test the change before you commit it to blead. Fortunately, there is a way to get your change smoke-tested on various OSes: push it to a "smoke-me" branch and wait for certain automated smoke-testers to report the results from their OSes. A "smoke-me" branch is identified by the branch name: specifically, as seen on github.com it must be a local branch whose first name component is precisely C<smoke-me>. 
The procedure for doing this is roughly as follows (using the example of tonyc's smoke-me branch called win32stat): First, make a local branch and switch to it: % git checkout -b win32stat Make some changes, build perl and test your changes, then commit them to your local branch. Then push your local branch to a remote smoke-me branch: % git push origin win32stat:smoke-me/tonyc/win32stat Now you can switch back to blead locally: % git checkout blead and continue working on other things while you wait a day or two, keeping an eye on the results reported for your smoke-me branch at L<http://perl.develop-help.com/?b=smoke-me/tonyc/win32state>. If all is well then update your blead branch: % git pull then checkout your smoke-me branch once more and rebase it on blead: % git rebase blead win32stat Now switch back to blead and merge your smoke-me branch into it: % git checkout blead % git merge win32stat As described earlier, if there are many changes on your smoke-me branch then you should prepare a merge commit in which to give an overview of those changes by using the following command instead of the last command above: % git merge win32stat --no-ff --no-commit You should now build perl and test your (merged) changes one last time (ideally run the whole test suite, but failing that at least run the F<t/porting/*.t> tests) before pushing your changes as usual: % git push origin blead Finally, you should then delete the remote smoke-me branch: % git push origin :smoke-me/tonyc/win32stat (which is likely to produce a warning like this, which can be ignored: remote: fatal: ambiguous argument 'refs/heads/smoke-me/tonyc/win32stat': unknown revision or path not in the working tree. 
remote: Use '--' to separate paths from revisions ) and then delete your local branch: % git branch -d win32stat
=head1 NAME perl56delta - what's new for perl v5.6.0 =head1 DESCRIPTION This document describes differences between the 5.005 release and the 5.6.0 release. =head1 Core Enhancements =head2 Interpreter cloning, threads, and concurrency Perl 5.6.0 introduces the beginnings of support for running multiple interpreters concurrently in different threads. In conjunction with the perl_clone() API call, which can be used to selectively duplicate the state of any given interpreter, it is possible to compile a piece of code once in an interpreter, clone that interpreter one or more times, and run all the resulting interpreters in distinct threads. On the Windows platform, this feature is used to emulate fork() at the interpreter level. See L<perlfork> for details about that. This feature is still in evolution. It is eventually meant to be used to selectively clone a subroutine and data reachable from that subroutine in a separate interpreter and run the cloned subroutine in a separate thread. Since there is no shared data between the interpreters, little or no locking will be needed (unless parts of the symbol table are explicitly shared). This is obviously intended to be an easy-to-use replacement for the existing threads support. Support for cloning interpreters and interpreter concurrency can be enabled using the -Dusethreads Configure option (see win32/Makefile for how to enable it on Windows.) The resulting perl executable will be functionally identical to one that was built with -Dmultiplicity, but the perl_clone() API call will only be available in the former. -Dusethreads enables the cpp macro USE_ITHREADS by default, which in turn enables Perl source code changes that provide a clear separation between the op tree and the data it operates with.
The former is immutable, and can therefore be shared between an interpreter and all of its clones, while the latter is considered local to each interpreter, and is therefore copied for each clone. Note that building Perl with the -Dusemultiplicity Configure option is adequate if you wish to run multiple B<independent> interpreters concurrently in different threads. -Dusethreads only provides the additional functionality of the perl_clone() API call and other support for running B<cloned> interpreters concurrently. NOTE: This is an experimental feature. Implementation details are subject to change. =head2 Lexically scoped warning categories You can now control the granularity of warnings emitted by perl at a finer level using the C<use warnings> pragma. L<warnings> and L<perllexwarn> have copious documentation on this feature. =head2 Unicode and UTF-8 support Perl now uses UTF-8 as its internal representation for character strings. The C<utf8> and C<bytes> pragmas are used to control this support in the current lexical scope. See L<perlunicode>, L<utf8> and L<bytes> for more information. This feature is expected to evolve quickly to support some form of I/O disciplines that can be used to specify the kind of input and output data (bytes or characters). Until that happens, additional modules from CPAN will be needed to complete the toolkit for dealing with Unicode. NOTE: This should be considered an experimental feature. Implementation details are subject to change. =head2 Support for interpolating named characters The new C<\N> escape interpolates named characters within strings. For example, C<"Hi! \N{WHITE SMILING FACE}"> evaluates to a string with a unicode smiley face at the end. =head2 "our" declarations An "our" declaration introduces a value that can be best understood as a lexically scoped symbolic alias to a global variable in the package that was current where the variable was declared. 
This is mostly useful as an alternative to the C<vars> pragma, but also provides the opportunity to introduce typing and other attributes for such variables. See L<perlfunc/our>. =head2 Support for strings represented as a vector of ordinals Literals of the form C<v1.2.3.4> are now parsed as a string composed of characters with the specified ordinals. This is an alternative, more readable way to construct (possibly unicode) strings instead of interpolating characters, as in C<"\x{1}\x{2}\x{3}\x{4}">. The leading C<v> may be omitted if there are more than two ordinals, so C<1.2.3> is parsed the same as C<v1.2.3>. Strings written in this form are also useful to represent version "numbers". It is easy to compare such version "numbers" (which are really just plain strings) using any of the usual string comparison operators C<eq>, C<ne>, C<lt>, C<gt>, etc., or perform bitwise string operations on them using C<|>, C<&>, etc. In conjunction with the new C<$^V> magic variable (which contains the perl version as a string), such literals can be used as a readable way to check if you're running a particular version of Perl: # this will parse in older versions of Perl also if ($^V and $^V gt v5.6.0) { # new features supported } C<require> and C<use> also have some special magic to support such literals, but this particular usage should be avoided because it leads to misleading error messages under versions of Perl which don't support vector strings. Using a true version number will ensure correct behavior in all versions of Perl: require 5.006; # run time check for v5.6 use 5.006_001; # compile time check for v5.6.1 Also, C<sprintf> and C<printf> support the Perl-specific format flag C<%v> to print ordinals of characters in arbitrary strings: printf "v%vd", $^V; # prints current version, such as "v5.5.650" printf "%*vX", ":", $addr; # formats IPv6 address printf "%*vb", " ", $bits; # displays bitstring See L<perldata/"Scalar value constructors"> for additional information. 
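The behaviour of v-string literals and the C<%v> format flag described above can be tried out with a short script. This is only an illustrative sketch; the variable names and sample values are made up for the example.

```perl
use strict;
use warnings;

# A v-string literal is a string whose characters have the given ordinals.
my $vstr = v1.22.333;

# "%vd" prints the ordinal of each character, joined with ".".
my $dotted = sprintf "%vd", $vstr;            # "1.22.333"

# The flag works on any string, here showing the ordinals in "Perl".
my $ordinals = sprintf "%vd", "Perl";         # "80.101.114.108"

# "%*vX" takes the join string as an extra argument before the data.
my $hex = sprintf "%*vX", ":", "\x{1}\x{ab}\x{ff}";   # "1:AB:FF"

# v-strings compare character by character, so chr(10) sorts after chr(6).
my $vstring_newer = (v5.10.0 gt v5.6.0) ? 1 : 0;      # 1

# Plain strings compare digit characters instead: "1" sorts before "6".
my $plain_newer = ("5.10.0" gt "5.6.0") ? 1 : 0;      # 0

print "$dotted\n$ordinals\n$hex\n";
print "v-string comparison: $vstring_newer, plain string comparison: $plain_newer\n";
```

The last two comparisons show why the v-string form is the one suited to version checks: the ordinary string C<"5.10.0"> sorts before C<"5.6.0">.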
=head2 Improved Perl version numbering system Beginning with Perl version 5.6.0, the version number convention has been changed to a "dotted integer" scheme that is more commonly found in open source projects. Maintenance versions of v5.6.0 will be released as v5.6.1, v5.6.2 etc. The next development series following v5.6.0 will be numbered v5.7.x, beginning with v5.7.0, and the next major production release following v5.6.0 will be v5.8.0. The English module now sets $PERL_VERSION to $^V (a string value) rather than C<$]> (a numeric value). (This is a potential incompatibility. Send us a report via perlbug if you are affected by this.) The v1.2.3 syntax is also now legal in Perl. See L</Support for strings represented as a vector of ordinals> for more on that. To cope with the new versioning system's use of at least three significant digits for each version component, the method used for incrementing the subversion number has also changed slightly. We assume that versions older than v5.6.0 have been incrementing the subversion component in multiples of 10. Versions after v5.6.0 will increment them by 1. Thus, using the new notation, 5.005_03 is the "same" as v5.5.30, and the first maintenance version following v5.6.0 will be v5.6.1 (which should be read as being equivalent to a floating point value of 5.006_001 in the older format, stored in C<$]>). =head2 New syntax for declaring subroutine attributes Formerly, if you wanted to mark a subroutine as being a method call or as requiring an automatic lock() when it is entered, you had to declare that with a C<use attrs> pragma in the body of the subroutine. That can now be accomplished with declaration syntax, like this: sub mymethod : locked method; ... sub mymethod : locked method { ... } sub othermethod :locked :method; ... sub othermethod :locked :method { ... } (Note how only the first C<:> is mandatory, and whitespace surrounding the C<:> is optional.) 
F<AutoSplit.pm> and F<SelfLoader.pm> have been updated to keep the attributes with the stubs they provide. See L<attributes>. =head2 File and directory handles can be autovivified Similar to how constructs such as C<< $x->[0] >> autovivify a reference, handle constructors (open(), opendir(), pipe(), socketpair(), sysopen(), socket(), and accept()) now autovivify a file or directory handle if the handle passed to them is an uninitialized scalar variable. This allows the constructs such as C<open(my $fh, ...)> and C<open(local $fh,...)> to be used to create filehandles that will conveniently be closed automatically when the scope ends, provided there are no other references to them. This largely eliminates the need for typeglobs when opening filehandles that must be passed around, as in the following example: sub myopen { open my $fh, "@_" or die "Can't open '@_': $!"; return $fh; } { my $f = myopen("</etc/motd"); print <$f>; # $f implicitly closed here } =head2 open() with more than two arguments If open() is passed three arguments instead of two, the second argument is used as the mode and the third argument is taken to be the file name. This is primarily useful for protecting against unintended magic behavior of the traditional two-argument form. See L<perlfunc/open>. 
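As a minimal sketch of the three-argument form, the script below writes and reads a file whose name ends in a space, a case the two-argument form handles differently because it strips surrounding whitespace from the name. The file name is hypothetical, the example assumes a filesystem that allows trailing spaces in names, and it uses the core File::Temp and File::Spec modules to stay self-contained.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Scratch directory that is removed when the program exits.
my $dir  = tempdir(CLEANUP => 1);

# Hypothetical awkward name: note the trailing space.
my $path = File::Spec->catfile($dir, "notes.txt ");

# Three-argument form: the mode and the file name are separate
# arguments, so the name is used exactly as given.
open(my $out, '>', $path) or die "Can't write '$path': $!";
print {$out} "hello\n";
close($out) or die "Can't close '$path': $!";

open(my $in, '<', $path) or die "Can't read '$path': $!";
my $line = <$in>;
close($in);

# Two-argument form: whitespace around the name is stripped, so this
# looks for "notes.txt" without the trailing space.
my $two_arg_ok = open(my $legacy, $path) ? 1 : 0;

print "three-arg read: $line";
print "two-arg open ", ($two_arg_ok ? "succeeded" : "failed"), "\n";
```

Keeping the mode in its own argument also means that a file name beginning with characters such as C<< > >> or ending in C<|> is never interpreted as a mode or a command.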
=head2 64-bit support Any platform that has 64-bit integers either (1) natively as longs or ints (2) via special compiler flags (3) using long long or int64_t is able to use "quads" (64-bit integers) as follows: =over 4 =item * constants (decimal, hexadecimal, octal, binary) in the code =item * arguments to oct() and hex() =item * arguments to print(), printf() and sprintf() (flag prefixes ll, L, q) =item * printed as such =item * pack() and unpack() "q" and "Q" formats =item * in basic arithmetic: + - * / % (NOTE: operating close to the limits of the integer values may produce surprising results) =item * in bit arithmetic: & | ^ ~ << >> (NOTE: these used to be forced to be 32 bits wide but now operate on the full native width.) =item * vec() =back Note that unless you have case (1) you will have to configure and compile Perl using the -Duse64bitint Configure flag. NOTE: The Configure flags -Duselonglong and -Duse64bits have been deprecated. Use -Duse64bitint instead. There are actually two modes of 64-bitness: the first one is achieved using Configure -Duse64bitint and the second one using Configure -Duse64bitall. The difference is that the first one is minimal and the second one maximal. The first works in more places than the second. The C<use64bitint> does only as much as is required to get 64-bit integers into Perl (this may mean, for example, using "long longs") while your memory may still be limited to 2 gigabytes (because your pointers could still be 32-bit). Note that the name C<64bitint> does not imply that your C compiler will be using 64-bit C<int>s (it might, but it doesn't have to): the C<use64bitint> means that you will be able to have 64-bit-wide scalar values. The C<use64bitall> goes all the way by attempting to switch integers (if it can), longs and pointers to being 64-bit as well.
This may create an even more binary incompatible Perl than -Duse64bitint: the resulting executable may not run at all in a 32-bit box, or you may have to reboot/reconfigure/rebuild your operating system to be 64-bit aware. Natively 64-bit systems like Alpha and Cray need neither -Duse64bitint nor -Duse64bitall. Last but not least: note that due to Perl's habit of always using floating point numbers, the quads are still not true integers. When quads overflow their limits (0...18_446_744_073_709_551_615 unsigned, -9_223_372_036_854_775_808...9_223_372_036_854_775_807 signed), they are silently promoted to floating point numbers, after which they will start losing precision (in their lower digits). NOTE: 64-bit support is still experimental on most platforms. Existing support only covers the LP64 data model. In particular, the LLP64 data model is not yet supported. 64-bit libraries and system APIs on many platforms have not stabilized--your mileage may vary. =head2 Large file support If you have filesystems that support "large files" (files larger than 2 gigabytes), you may now also be able to create and access them from Perl. NOTE: The default action is to enable large file support, if available on the platform. If the large file support is on, and you have a Fcntl constant O_LARGEFILE, the O_LARGEFILE is automatically added to the flags of sysopen(). Beware that unless your filesystem also supports "sparse files" seeking to umpteen petabytes may be inadvisable. Note that in addition to requiring a proper file system to do large files you may also need to adjust your per-process (or your per-system, or per-process-group, or per-user-group) maximum filesize limits before running Perl scripts that try to handle large files, especially if you intend to write such files. Finally, in addition to your process/process group maximum filesize limits, you may have quota limits on your filesystems that stop you (your user id or your user group id) from using large files. 
Adjusting your process/user/group/file system/operating system limits is outside the scope of Perl core language. For process limits, you may try increasing the limits using your shell's limits/limit/ulimit command before running Perl. The BSD::Resource extension (not included with the standard Perl distribution) may also be of use, it offers the getrlimit/setrlimit interface that can be used to adjust process resource usage limits, including the maximum filesize limit. =head2 Long doubles In some systems you may be able to use long doubles to enhance the range and precision of your double precision floating point numbers (that is, Perl's numbers). Use Configure -Duselongdouble to enable this support (if it is available). =head2 "more bits" You can "Configure -Dusemorebits" to turn on both the 64-bit support and the long double support. =head2 Enhanced support for sort() subroutines Perl subroutines with a prototype of C<($$)>, and XSUBs in general, can now be used as sort subroutines. In either case, the two elements to be compared are passed as normal parameters in @_. See L<perlfunc/sort>. For unprototyped sort subroutines, the historical behavior of passing the elements to be compared as the global variables $a and $b remains unchanged. =head2 C<sort $coderef @foo> allowed sort() did not accept a subroutine reference as the comparison function in earlier versions. This is now permitted. =head2 File globbing implemented internally Perl now uses the File::Glob implementation of the glob() operator automatically. This avoids using an external csh process and the problems associated with it. NOTE: This is currently an experimental feature. Interfaces and implementation are subject to change. =head2 Support for CHECK blocks In addition to C<BEGIN>, C<INIT>, C<END>, C<DESTROY> and C<AUTOLOAD>, subroutines named C<CHECK> are now special. 
These are queued up during compilation and behave similar to END blocks, except they are called at the end of compilation rather than at the end of execution. They cannot be called directly. =head2 POSIX character class syntax [: :] supported For example to match alphabetic characters use /[[:alpha:]]/. See L<perlre> for details. =head2 Better pseudo-random number generator In 5.005_0x and earlier, perl's rand() function used the C library rand(3) function. As of 5.005_52, Configure tests for drand48(), random(), and rand() (in that order) and picks the first one it finds. These changes should result in better random numbers from rand(). =head2 Improved C<qw//> operator The C<qw//> operator is now evaluated at compile time into a true list instead of being replaced with a run time call to C<split()>. This removes the confusing misbehaviour of C<qw//> in scalar context, which had inherited that behaviour from split(). Thus: $foo = ($bar) = qw(a b c); print "$foo|$bar\n"; now correctly prints "3|a", instead of "2|a". =head2 Better worst-case behavior of hashes Small changes in the hashing algorithm have been implemented in order to improve the distribution of lower order bits in the hashed value. This is expected to yield better performance on keys that are repeated sequences. =head2 pack() format 'Z' supported The new format type 'Z' is useful for packing and unpacking null-terminated strings. See L<perlfunc/"pack">. =head2 pack() format modifier '!' supported The new format type modifier '!' is useful for packing and unpacking native shorts, ints, and longs. See L<perlfunc/"pack">. =head2 pack() and unpack() support counted strings The template character '/' can be used to specify a counted string type to be packed or unpacked. See L<perlfunc/"pack">. =head2 Comments in pack() templates The '#' character in a template introduces a comment up to end of the line. This facilitates documentation of pack() templates. 
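The pack() additions described above ('Z', the '!' modifier, counted strings with '/', and template comments) can be seen together in a short sketch. The values and variable names are illustrative only, and the core Config module is used just to report the native long size.

```perl
use strict;
use warnings;
use Config;

# 'Z*' packs a null-terminated string: a NUL byte is appended.
my $packed_z = pack("Z*", "perl");               # "perl\0"

# Unpacking with 'Z*' stops at the first NUL byte.
my ($unpacked_z) = unpack("Z*", "perl\0junk");   # "perl"

# '!' selects the native size, so 'l!' is as wide as a C long here.
my $native_long_bytes = length pack("l!", 0);
my $config_long_bytes = $Config{longsize};

# '/' builds a counted string: a length item, then the string itself.
my $counted = pack("C/a*", "hello");             # "\x05hello"
my ($text)  = unpack("C/a*", $counted);          # "hello"

# A template may carry comments, each running to the end of its line.
my $record = pack(<<'TEMPLATE', 7, "abc");
    C       # one unsigned byte
    n/a*    # 16-bit big-endian length, then the string
TEMPLATE

printf "Z*: %vd\n", $packed_z;
printf "native long: %d bytes\n", $native_long_bytes;
printf "counted: %vd\n", $counted;
printf "record: %vd\n", $record;
```

Printing with C<%vd> shows the packed bytes as ordinals, which makes the added NUL byte and the length prefixes easy to see.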
=head2 Weak references In previous versions of Perl, you couldn't cache objects so as to allow them to be deleted if the last reference from outside the cache is deleted. The reference in the cache would hold a reference count on the object and the objects would never be destroyed. Another familiar problem is with circular references. When an object references itself, its reference count would never go down to zero, and it would not get destroyed until the program is about to exit. Weak references solve this by allowing you to "weaken" any reference, that is, make it not count towards the reference count. When the last non-weak reference to an object is deleted, the object is destroyed and all the weak references to the object are automatically undef-ed. To use this feature, you need the Devel::WeakRef package from CPAN, which contains additional documentation. NOTE: This is an experimental feature. Details are subject to change. =head2 Binary numbers supported Binary numbers are now supported as literals, in s?printf formats, and C<oct()>: $answer = 0b101010; printf "The answer is: %b\n", oct("0b101010"); =head2 Lvalue subroutines Subroutines can now return modifiable lvalues. See L<perlsub/"Lvalue subroutines">. NOTE: This is an experimental feature. Details are subject to change. =head2 Some arrows may be omitted in calls through references Perl now allows the arrow to be omitted in many constructs involving subroutine calls through references. For example, C<< $foo[10]->('foo') >> may now be written C<$foo[10]('foo')>. This is rather similar to how the arrow may be omitted from C<< $foo[10]->{'foo'} >>. Note however, that the arrow is still required for C<< foo(10)->('bar') >>. =head2 Boolean assignment operators are legal lvalues Constructs such as C<($a ||= 2) += 1> are now allowed. =head2 exists() is supported on subroutine names The exists() builtin now works on subroutine names. 
A subroutine is considered to exist if it has been declared (even if implicitly). See L<perlfunc/exists> for examples. =head2 exists() and delete() are supported on array elements The exists() and delete() builtins now work on simple arrays as well. The behavior is similar to that on hash elements. exists() can be used to check whether an array element has been initialized. This avoids autovivifying array elements that don't exist. If the array is tied, the EXISTS() method in the corresponding tied package will be invoked. delete() may be used to remove an element from the array and return it. The array element at that position returns to its uninitialized state, so that testing for the same element with exists() will return false. If the element happens to be the one at the end, the size of the array also shrinks up to the highest element that tests true for exists(), or 0 if none such is found. If the array is tied, the DELETE() method in the corresponding tied package will be invoked. See L<perlfunc/exists> and L<perlfunc/delete> for examples. =head2 Pseudo-hashes work better Dereferencing some types of reference values in a pseudo-hash, such as C<< $ph->{foo}[1] >>, was accidentally disallowed. This has been corrected. When applied to a pseudo-hash element, exists() now reports whether the specified value exists, not merely if the key is valid. delete() now works on pseudo-hashes. When given a pseudo-hash element or slice it deletes the values corresponding to the keys (but not the keys themselves). See L<perlref/"Pseudo-hashes: Using an array as a hash">. Pseudo-hash slices with constant keys are now optimized to array lookups at compile-time. List assignments to pseudo-hash slices are now supported. The C<fields> pragma now provides ways to create pseudo-hashes, via fields::new() and fields::phash(). See L<fields>. NOTE: The pseudo-hash data type continues to be experimental. 
Limiting oneself to the interface elements provided by the fields pragma will provide protection from any future changes. =head2 Automatic flushing of output buffers fork(), exec(), system(), qx//, and pipe open()s now flush buffers of all files opened for output when the operation was attempted. This mostly eliminates confusing buffering mishaps suffered by users unaware of how Perl internally handles I/O. This is not supported on some platforms like Solaris where a suitably correct implementation of fflush(NULL) isn't available. =head2 Better diagnostics on meaningless filehandle operations Constructs such as C<< open(<FH>) >> and C<< close(<FH>) >> are now compile-time errors. Attempting to read from filehandles that were opened only for writing will now produce warnings (just as writing to read-only filehandles does). =head2 Where possible, buffered data discarded from duped input filehandle C<< open(NEW, "<&OLD") >> now attempts to discard any data that was previously read and buffered in C<OLD> before duping the handle. On platforms where doing this is allowed, the next read operation on C<NEW> will return the same data as the corresponding operation on C<OLD>. Formerly, it would have returned the data from the start of the following disk block instead. =head2 eof() has the same old magic as <> C<eof()> used to return true if no attempt to read from C<< <> >> had yet been made. C<eof()> has been changed to have a little magic of its own: it now opens the C<< <> >> files. =head2 binmode() can be used to set :crlf and :raw modes binmode() now accepts a second argument that specifies a discipline for the handle in question. The two pseudo-disciplines ":raw" and ":crlf" are currently supported on DOS-derivative platforms. See L<perlfunc/"binmode"> and L<open>. =head2 C<-T> filetest recognizes UTF-8 encoded files as "text" The algorithm used for the C<-T> filetest has been enhanced to correctly identify UTF-8 content as "text".
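As a rough illustration of the C<-T> change, the sketch below writes a temporary file containing raw UTF-8 bytes and then applies the filetest to it. The file contents and variable names are invented for the example, and File::Temp is used only as a convenient way to obtain a scratch file.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a scratch file and fill it with valid UTF-8 that includes
# non-ASCII characters (the bytes below encode the Greek letters
# alpha, beta and gamma).
my ($fh, $path) = tempfile(UNLINK => 1);
binmode($fh, ':raw');
print {$fh} "greek: \xce\xb1\xce\xb2\xce\xb3\n" x 20;
close($fh) or die "close failed: $!";

# -T inspects the start of the file and guesses whether it is text.
my $is_text = (-T $path) ? 1 : 0;
print "looks like text: $is_text\n";
```

Because C<-T> is a heuristic over the first block of the file, it is best treated as a hint rather than a guarantee about the file's encoding.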
=head2 system(), backticks and pipe open now reflect exec() failure On Unix and similar platforms, system(), qx() and open(FOO, "cmd |") etc., are implemented via fork() and exec(). When the underlying exec() fails, earlier versions did not report the error properly, since the exec() happened to be in a different process. The child process now communicates with the parent about the error in launching the external command, which allows these constructs to return with their usual error value and set $!. =head2 Improved diagnostics Line numbers are no longer suppressed (under most likely circumstances) during the global destruction phase. Diagnostics emitted from code running in threads other than the main thread are now accompanied by the thread ID. Embedded null characters in diagnostics now actually show up. They used to truncate the message in prior versions. $foo::a and $foo::b are now exempt from "possible typo" warnings only if sort() is encountered in package C<foo>. Unrecognized alphabetic escapes encountered when parsing quote constructs now generate a warning, since they may take on new semantics in later versions of Perl. Many diagnostics now report the internal operation in which the warning was provoked, like so: Use of uninitialized value in concatenation (.) at (eval 1) line 1. Use of uninitialized value in print at (eval 1) line 1. Diagnostics that occur within eval may also report the file and line number where the eval is located, in addition to the eval sequence number and the line number within the evaluated text itself. For example: Not enough arguments for scalar at (eval 4)[newlib/perl5db.pl:1411] line 2, at EOF =head2 Diagnostics follow STDERR Diagnostic output now goes to whichever file the C<STDERR> handle is pointing at, instead of always going to the underlying C runtime library's C<stderr>. 
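The effect of diagnostics following C<STDERR> can be seen by reopening the handle onto a file before triggering a warning. The following is a minimal sketch under that assumption; the log file, the message text, and the variable names are all hypothetical.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# A scratch file that will receive the diagnostic.
my ($log_fh, $log_path) = tempfile(UNLINK => 1);
close($log_fh);

# Keep a duplicate of the original STDERR so it can be restored later,
# then point the STDERR handle at the scratch file.
open(my $saved_err, '>&', \*STDERR) or die "cannot dup STDERR: $!";
open(STDERR, '>', $log_path)        or die "cannot redirect STDERR: $!";

warn "redirected diagnostic\n";     # written to wherever STDERR points now

# Put the original STDERR back.
close(STDERR);
open(STDERR, '>&', $saved_err) or die "cannot restore STDERR: $!";

# Read back what the warning produced.
open(my $in, '<', $log_path) or die "cannot read $log_path: $!";
my $captured = do { local $/; <$in> };
close($in);
print "captured: $captured";
```

Duplicating the handle first, rather than simply reopening it, is what lets the program return to its original error stream afterwards.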
=head2 More consistent close-on-exec behavior On systems that support a close-on-exec flag on filehandles, the flag is now set for any handles created by pipe(), socketpair(), socket(), and accept(), if that is warranted by the value of $^F that may be in effect. Earlier versions neglected to set the flag for handles created with these operators. See L<perlfunc/pipe>, L<perlfunc/socketpair>, L<perlfunc/socket>, L<perlfunc/accept>, and L<perlvar/$^F>. =head2 syswrite() ease-of-use The length argument of C<syswrite()> has become optional. =head2 Better syntax checks on parenthesized unary operators Expressions such as: print defined(&foo,&bar,&baz); print uc("foo","bar","baz"); undef($foo,&bar); used to be accidentally allowed in earlier versions, and produced unpredictable behaviour. Some produced ancillary warnings when used in this way; others silently did the wrong thing. The parenthesized forms of most unary operators that expect a single argument now ensure that they are not called with more than one argument, making the cases shown above syntax errors. The usual behaviour of: print defined &foo, &bar, &baz; print uc "foo", "bar", "baz"; undef $foo, &bar; remains unchanged. See L<perlop>. =head2 Bit operators support full native integer width The bit operators (& | ^ ~ << >>) now operate on the full native integral width (the exact size of which is available in $Config{ivsize}). For example, if your platform is either natively 64-bit or if Perl has been configured to use 64-bit integers, these operations apply to 8 bytes (as opposed to 4 bytes on 32-bit platforms). For portability, be sure to mask off the excess bits in the result of unary C<~>, e.g., C<~$x & 0xffffffff>. =head2 Improved security features More potentially unsafe operations taint their results for improved security. The C<passwd> and C<shell> fields returned by the getpwent(), getpwnam(), and getpwuid() are now tainted, because the user can affect their own encrypted password and login shell. 
The variable modified by shmread(), and messages returned by msgrcv() (and its object-oriented interface IPC::SysV::Msg::rcv) are also tainted, because other untrusted processes can modify messages and shared memory segments for their own nefarious purposes. =head2 More functional bareword prototype (*) Bareword prototypes have been rationalized to enable them to be used to override builtins that accept barewords and interpret them in a special way, such as C<require> or C<do>. Arguments prototyped as C<*> will now be visible within the subroutine as either a simple scalar or as a reference to a typeglob. See L<perlsub/Prototypes>. =head2 C<require> and C<do> may be overridden C<require> and C<do 'file'> operations may be overridden locally by importing subroutines of the same name into the current package (or globally by importing them into the CORE::GLOBAL:: namespace). Overriding C<require> will also affect C<use>, provided the override is visible at compile-time. See L<perlsub/"Overriding Built-in Functions">. =head2 $^X variables may now have names longer than one character Formerly, $^X was synonymous with ${"\cX"}, but $^XY was a syntax error. Now variable names that begin with a control character may be arbitrarily long. However, for compatibility reasons, these variables I<must> be written with explicit braces, as C<${^XY}> for example. C<${^XYZ}> is synonymous with ${"\cXYZ"}. Variable names with more than one control character, such as C<${^XY^Z}>, are illegal. The old syntax has not changed. As before, `^X' may be either a literal control-X character or the two-character sequence `caret' plus `X'. When braces are omitted, the variable name stops after the control character. Thus C<"$^XYZ"> continues to be synonymous with C<$^X . "YZ"> as before. As before, lexical variables may not have names beginning with control characters. As before, variables whose names begin with a control character are always forced to be in package `main'. 
All such variables are reserved for future extensions, except those that begin with C<^_>, which may be used by user programs and are guaranteed not to acquire special meaning in any future version of Perl. =head2 New variable $^C reflects C<-c> switch C<$^C> has a boolean value that reflects whether perl is being run in compile-only mode (i.e. via the C<-c> switch). Since BEGIN blocks are executed under such conditions, this variable enables perl code to determine whether actions that make sense only during normal running are warranted. See L<perlvar>. =head2 New variable $^V contains Perl version as a string C<$^V> contains the Perl version number as a string composed of characters whose ordinals match the version numbers, i.e. v5.6.0. This may be used in string comparisons. See C<Support for strings represented as a vector of ordinals> for an example. =head2 Optional Y2K warnings If Perl is built with the cpp macro C<PERL_Y2KWARN> defined, it emits optional warnings when concatenating the number 19 with another number. This behavior must be specifically enabled when running Configure. See F<INSTALL> and F<README.Y2K>. =head2 Arrays now always interpolate into double-quoted strings In double-quoted strings, arrays now interpolate, no matter what. The behavior in earlier versions of perl 5 was that arrays would interpolate into strings if the array had been mentioned before the string was compiled, and otherwise Perl would raise a fatal compile-time error. In versions 5.000 through 5.003, the error was Literal @example now requires backslash In versions 5.004_01 through 5.6.0, the error was In string, @example now must be written as \@example The idea here was to get people into the habit of writing C<"fred\@example.com"> when they wanted a literal C<@> sign, just as they have always written C<"Give me back my \$5"> when they wanted a literal C<$> sign. 
Starting with 5.6.1, when Perl now sees an C<@> sign in a double-quoted string, it I<always> attempts to interpolate an array, regardless of whether or not the array has been used or declared already. The fatal error has been downgraded to an optional warning: Possible unintended interpolation of @example in string This warns you that C<"fred@example.com"> is going to turn into C<fred.com> if you don't backslash the C<@>. See http://perl.plover.com/at-error.html for more details about the history here. =head2 @- and @+ provide starting/ending offsets of regex matches The new magic variables @- and @+ provide the starting and ending offsets, respectively, of $&, $1, $2, etc. See L<perlvar> for details. =head1 Modules and Pragmata =head2 Modules =over 4 =item attributes While used internally by Perl as a pragma, this module also provides a way to fetch subroutine and variable attributes. See L<attributes>. =item B The Perl Compiler suite has been extensively reworked for this release. More of the standard Perl test suite passes when run under the Compiler, but there is still a significant way to go to achieve production quality compiled executables. NOTE: The Compiler suite remains highly experimental. The generated code may not be correct, even when it manages to execute without errors. =item Benchmark Overall, Benchmark results exhibit lower average error and better timing accuracy. You can now run tests for I<n> seconds instead of guessing the right number of tests to run: e.g., timethese(-5, ...) will run each code for at least 5 CPU seconds. Zero as the "number of repetitions" means "for at least 3 CPU seconds". The output format has also changed. For example: use Benchmark;$x=3;timethese(-5,{a=>sub{$x*$x},b=>sub{$x**2}}) will now output something like this: Benchmark: running a, b, each for at least 5 CPU seconds... 
a: 5 wallclock secs ( 5.77 usr + 0.00 sys = 5.77 CPU) @ 200551.91/s (n=1156516) b: 4 wallclock secs ( 5.00 usr + 0.02 sys = 5.02 CPU) @ 159605.18/s (n=800686) New features: "each for at least N CPU seconds...", "wallclock secs", and the "@ operations/CPU second (n=operations)". timethese() now returns a reference to a hash of Benchmark objects containing the test results, keyed on the names of the tests. timethis() now returns the iterations field in the Benchmark result object instead of 0. timethese(), timethis(), and the new cmpthese() (see below) can also take a format specifier of 'none' to suppress output. A new function countit() is just like timeit() except that it takes a TIME instead of a COUNT. A new function cmpthese() prints a chart comparing the results of each test returned from a timethese() call. For each possible pair of tests, the percentage speed difference (iters/sec or seconds/iter) is shown. For other details, see L<Benchmark>. =item ByteLoader The ByteLoader is a dedicated extension to generate and run Perl bytecode. See L<ByteLoader>. =item constant References can now be used. The new version also allows a leading underscore in constant names, but disallows a double leading underscore (as in "__LINE__"). Some other names are disallowed or warned against, including BEGIN, END, etc. Some names which were forced into main:: used to fail silently in some cases; now they're fatal (outside of main::) and an optional warning (inside of main::). The ability to detect whether a constant had been set with a given name has been added. See L<constant>. =item charnames This pragma implements the C<\N> string escape. See L<charnames>. =item Data::Dumper A C<Maxdepth> setting can be specified to avoid venturing too deeply into deep data structures. See L<Data::Dumper>. The XSUB implementation of Dump() is now automatically called if the C<Useqq> setting is not in use. Dumping C<qr//> objects works correctly. 
=item DB C<DB> is an experimental module that exposes a clean abstraction to Perl's debugging API. =item DB_File DB_File can now be built with Berkeley DB versions 1, 2 or 3. See C<ext/DB_File/Changes>. =item Devel::DProf Devel::DProf, a Perl source code profiler, has been added. See L<Devel::DProf> and L<dprofpp>. =item Devel::Peek The Devel::Peek module provides access to the internal representation of Perl variables and data. It is a data debugging tool for the XS programmer. =item Dumpvalue The Dumpvalue module provides screen dumps of Perl data. =item DynaLoader DynaLoader now supports a dl_unload_file() function on platforms that support unloading shared objects using dlclose(). Perl can also optionally arrange to unload all extension shared objects loaded by Perl. To enable this, build Perl with the Configure option C<-Accflags=-DDL_UNLOAD_ALL_AT_EXIT>. (This may be useful if you are using Apache with mod_perl.) =item English $PERL_VERSION now stands for C<$^V> (a string value) rather than for C<$]> (a numeric value). =item Env Env now supports accessing environment variables like PATH as array variables. =item Fcntl More Fcntl constants added: F_SETLK64, F_SETLKW64, O_LARGEFILE for large file (more than 4GB) access (NOTE: the O_LARGEFILE is automatically added to sysopen() flags if large file support has been configured, as is the default), Free/Net/OpenBSD locking behaviour flags F_FLOCK, F_POSIX, Linux F_SHLCK, and O_ACCMODE: the combined mask of O_RDONLY, O_WRONLY, and O_RDWR. The seek()/sysseek() constants SEEK_SET, SEEK_CUR, and SEEK_END are available via the C<:seek> tag. The chmod()/stat() S_IF* constants and S_IS* functions are available via the C<:mode> tag. =item File::Compare A compare_text() function has been added, which allows custom comparison functions. See L<File::Compare>. =item File::Find File::Find now works correctly when the wanted() function is either autoloaded or is a symbolic reference. 
A bug that caused File::Find to lose track of the working directory when pruning top-level directories has been fixed. File::Find now also supports several other options to control its behavior. It can follow symbolic links if the C<follow> option is specified. Enabling the C<no_chdir> option will make File::Find skip changing the current directory when walking directories. The C<untaint> flag can be useful when running with taint checks enabled. See L<File::Find>. =item File::Glob This extension implements BSD-style file globbing. By default, it will also be used for the internal implementation of the glob() operator. See L<File::Glob>. =item File::Spec New methods have been added to the File::Spec module: devnull() returns the name of the null device (/dev/null on Unix) and tmpdir() the name of the temp directory (normally /tmp on Unix). There are now also methods to convert between absolute and relative filenames: abs2rel() and rel2abs(). For compatibility with operating systems that specify volume names in file paths, the splitpath(), splitdir(), and catdir() methods have been added. =item File::Spec::Functions The new File::Spec::Functions module provides a function interface to the File::Spec module. It allows the shorthand $fullname = catfile($dir1, $dir2, $file); instead of $fullname = File::Spec->catfile($dir1, $dir2, $file); =item Getopt::Long Getopt::Long licensing has changed to allow the Perl Artistic License as well as the GPL. It used to be GPL only, which got in the way of non-GPL applications that wanted to use Getopt::Long. Getopt::Long encourages the use of Pod::Usage to produce help messages. For example: use Getopt::Long; use Pod::Usage; my $man = 0; my $help = 0; GetOptions('help|?' => \$help, man => \$man) or pod2usage(2); pod2usage(1) if $help; pod2usage(-exitstatus => 0, -verbose => 2) if $man; __END__ =head1 NAME sample - Using Getopt::Long and Pod::Usage =head1 SYNOPSIS sample [options] [file ...] 
Options: -help brief help message -man full documentation =head1 OPTIONS =over 8 =item B<-help> Print a brief help message and exits. =item B<-man> Prints the manual page and exits. =back =head1 DESCRIPTION B<This program> will read the given input file(s) and do something useful with the contents thereof. =cut See L<Pod::Usage> for details. A bug that prevented the non-option call-back <> from being specified as the first argument has been fixed. To specify the characters < and > as option starters, use ><. Note, however, that changing option starters is strongly deprecated. =item IO write() and syswrite() will now accept a single-argument form of the call, for consistency with Perl's syswrite(). You can now create a TCP-based IO::Socket::INET without forcing a connect attempt. This allows you to configure its options (like making it non-blocking) and then call connect() manually. A bug that prevented the IO::Socket::protocol() accessor from ever returning the correct value has been corrected. IO::Socket::connect now uses non-blocking IO instead of alarm() to do connect timeouts. IO::Socket::accept now uses select() instead of alarm() for doing timeouts. IO::Socket::INET->new now sets $! correctly on failure. $@ is still set for backwards compatibility. =item JPL Java Perl Lingo is now distributed with Perl. See jpl/README for more information. =item lib C<use lib> now weeds out any trailing duplicate entries. C<no lib> removes all named entries. =item Math::BigInt The bitwise operations C<<< << >>>, C<<< >> >>>, C<&>, C<|>, and C<~> are now supported on bigints. =item Math::Complex The accessor methods Re, Im, arg, abs, rho, and theta can now also act as mutators (accessor $z->Re(), mutator $z->Re(3)). The class method C<display_format> and the corresponding object method C<display_format>, in addition to accepting just one argument, now can also accept a parameter hash. 
Recognized keys of a parameter hash are C<"style">, which corresponds to the old one parameter case, and two new parameters: C<"format">, which is a printf()-style format string (defaults usually to C<"%.15g">, you can revert to the default by setting the format string to C<undef>) used for both parts of a complex number, and C<"polar_pretty_print"> (defaults to true), which controls whether an attempt is made to try to recognize small multiples and rationals of pi (2pi, pi/2) at the argument (angle) of a polar complex number. The potentially disruptive change is that in list context both methods now I<return the parameter hash>, instead of only the value of the C<"style"> parameter. =item Math::Trig A little bit of radial trigonometry (cylindrical and spherical), radial coordinate conversions, and the great circle distance were added. =item Pod::Parser, Pod::InputObjects Pod::Parser is a base class for parsing and selecting sections of pod documentation from an input stream. This module takes care of identifying pod paragraphs and commands in the input and hands off the parsed paragraphs and commands to user-defined methods which are free to interpret or translate them as they see fit. Pod::InputObjects defines some input objects needed by Pod::Parser, and for advanced users of Pod::Parser that need more about a command besides its name and text. As of release 5.6.0 of Perl, Pod::Parser is now the officially sanctioned "base parser code" recommended for use by all pod2xxx translators. Pod::Text (pod2text) and Pod::Man (pod2man) have already been converted to use Pod::Parser and efforts to convert Pod::HTML (pod2html) are already underway. For any questions or comments about pod parsing and translating issues and utilities, please use the pod-people@perl.org mailing list. For further information, please see L<Pod::Parser> and L<Pod::InputObjects>. =item Pod::Checker, podchecker This utility checks pod files for correct syntax, according to L<perlpod>. 
Obvious errors are flagged as such, while warnings are printed for mistakes that can be handled gracefully. The checklist is not complete yet. See L<Pod::Checker>. =item Pod::ParseUtils, Pod::Find These modules provide a set of gizmos that are useful mainly for pod translators. L<Pod::Find|Pod::Find> traverses directory structures and returns found pod files, along with their canonical names (like C<File::Spec::Unix>). L<Pod::ParseUtils|Pod::ParseUtils> contains B<Pod::List> (useful for storing pod list information), B<Pod::Hyperlink> (for parsing the contents of C<LE<lt>E<gt>> sequences) and B<Pod::Cache> (for caching information about pod files, e.g., link nodes). =item Pod::Select, podselect Pod::Select is a subclass of Pod::Parser which provides a function named "podselect()" to filter out user-specified sections of raw pod documentation from an input stream. podselect is a script that provides access to Pod::Select from other scripts to be used as a filter. See L<Pod::Select>. =item Pod::Usage, pod2usage Pod::Usage provides the function "pod2usage()" to print usage messages for a Perl script based on its embedded pod documentation. The pod2usage() function is generally useful to all script authors since it lets them write and maintain a single source (the pods) for documentation, thus removing the need to create and maintain redundant usage message text consisting of information already in the pods. There is also a pod2usage script which can be used from other kinds of scripts to print usage messages from pods (even for non-Perl scripts with pods embedded in comments). For details and examples, please see L<Pod::Usage>. =item Pod::Text and Pod::Man Pod::Text has been rewritten to use Pod::Parser. While pod2text() is still available for backwards compatibility, the module now has a new preferred interface. See L<Pod::Text> for the details. 
The new Pod::Text module is easily subclassed for tweaks to the output, and two such subclasses (Pod::Text::Termcap for man-page-style bold and underlining using termcap information, and Pod::Text::Color for markup with ANSI color sequences) are now standard. pod2man has been turned into a module, Pod::Man, which also uses Pod::Parser. In the process, several outstanding bugs related to quotes in section headers, quoting of code escapes, and nested lists have been fixed. pod2man is now a wrapper script around this module. =item SDBM_File An EXISTS method has been added to this module (and sdbm_exists() has been added to the underlying sdbm library), so one can now call exists on an SDBM_File tied hash and get the correct result, rather than a runtime error. A bug that may have caused data loss when more than one disk block happens to be read from the database in a single FETCH() has been fixed. =item Sys::Syslog Sys::Syslog now uses XSUBs to access facilities from syslog.h so it no longer requires syslog.ph to exist. =item Sys::Hostname Sys::Hostname now uses XSUBs to call the C library's gethostname() or uname() if they exist. =item Term::ANSIColor Term::ANSIColor is a very simple module to provide easy and readable access to the ANSI color and highlighting escape sequences, supported by most ANSI terminal emulators. It is now included standard. =item Time::Local The timelocal() and timegm() functions used to silently return bogus results when the date fell outside the machine's integer range. They now consistently croak() if the date falls in an unsupported range. =item Win32 The error return value in list context has been changed for all functions that return a list of values. Previously these functions returned a list with a single element C<undef> if an error occurred. Now these functions return the empty list in these situations. 
This applies to the following functions: Win32::FsType Win32::GetOSVersion The remaining functions are unchanged and continue to return C<undef> on error even in list context. The Win32::SetLastError(ERROR) function has been added as a complement to the Win32::GetLastError() function. The new Win32::GetFullPathName(FILENAME) returns the full absolute pathname for FILENAME in scalar context. In list context it returns a two-element list containing the fully qualified directory name and the filename. See L<Win32>. =item XSLoader The XSLoader extension is a simpler alternative to DynaLoader. See L<XSLoader>. =item DBM Filters A new feature called "DBM Filters" has been added to all the DBM modules--DB_File, GDBM_File, NDBM_File, ODBM_File, and SDBM_File. DBM Filters add four new methods to each DBM module: filter_store_key filter_store_value filter_fetch_key filter_fetch_value These can be used to filter key-value pairs before the pairs are written to the database or just after they are read from the database. See L<perldbmfilter> for further information. =back =head2 Pragmata C<use attrs> is now obsolete, and is only provided for backward-compatibility. It's been replaced by the C<sub : attributes> syntax. See L<perlsub/"Subroutine Attributes"> and L<attributes>. Lexical warnings pragma, C<use warnings;>, to control optional warnings. See L<perllexwarn>. C<use filetest> to control the behaviour of filetests (C<-r> C<-w> ...). Currently only one subpragma implemented, "use filetest 'access';", that uses access(2) or equivalent to check permissions instead of using stat(2) as usual. This matters in filesystems where there are ACLs (access control lists): the stat(2) might lie, but access(2) knows better. The C<open> pragma can be used to specify default disciplines for handle constructors (e.g. open()) and for qx//. The two pseudo-disciplines C<:raw> and C<:crlf> are currently supported on DOS-derivative platforms (i.e. where binmode is not a no-op). 
See also L</"binmode() can be used to set :crlf and :raw modes">. =head1 Utility Changes =head2 dprofpp C<dprofpp> is used to display profile data generated using C<Devel::DProf>. See L<dprofpp>. =head2 find2perl The C<find2perl> utility now uses the enhanced features of the File::Find module. The -depth and -follow options are supported. Pod documentation is also included in the script. =head2 h2xs The C<h2xs> tool can now work in conjunction with C<C::Scan> (available from CPAN) to automatically parse real-life header files. The C<-M>, C<-a>, C<-k>, and C<-o> options are new. =head2 perlcc C<perlcc> now supports the C and Bytecode backends. By default, it generates output from the simple C backend rather than the optimized C backend. Support for non-Unix platforms has been improved. =head2 perldoc C<perldoc> has been reworked to avoid possible security holes. It will not by default let itself be run as the superuser, but you may still use the B<-U> switch to try to make it drop privileges first. =head2 The Perl Debugger Many bug fixes and enhancements were added to F<perl5db.pl>, the Perl debugger. The help documentation was rearranged. New commands include C<< < ? >>, C<< > ? >>, and C<< { ? >> to list out current actions, C<man I<docpage>> to run your doc viewer on some perl docset, and support for quoted options. The help information was rearranged, and should be viewable once again if you're using B<less> as your pager. A serious security hole was plugged--you should immediately remove all older versions of the Perl debugger as installed in previous releases, all the way back to perl3, from your system to avoid being bitten by this. =head1 Improved Documentation Many of the platform-specific README files are now part of the perl installation. See L<perl> for the complete list. =over 4 =item perlapi.pod The official list of public Perl API functions. =item perlboot.pod A tutorial for beginners on object-oriented Perl. 
=item perlcompile.pod An introduction to using the Perl Compiler suite. =item perldbmfilter.pod A howto document on using the DBM filter facility. =item perldebug.pod All material unrelated to running the Perl debugger, plus all low-level guts-like details that risked crushing the casual user of the debugger, have been relocated from the old manpage to the next entry below. =item perldebguts.pod This new manpage contains excessively low-level material not related to the Perl debugger, but slightly related to debugging Perl itself. It also contains some arcane internal details of how the debugging process works that may only be of interest to developers of Perl debuggers. =item perlfork.pod Notes on the fork() emulation currently available for the Windows platform. =item perlfilter.pod An introduction to writing Perl source filters. =item perlhack.pod Some guidelines for hacking the Perl source code. =item perlintern.pod A list of internal functions in the Perl source code. (List is currently empty.) =item perllexwarn.pod Introduction and reference information about lexically scoped warning categories. =item perlnumber.pod Detailed information about numbers as they are represented in Perl. =item perlopentut.pod A tutorial on using open() effectively. =item perlreftut.pod A tutorial that introduces the essentials of references. =item perltootc.pod A tutorial on managing class data for object modules. =item perltodo.pod Discussion of the most often wanted features that may someday be supported in Perl. =item perlunicode.pod An introduction to Unicode support features in Perl. =back =head1 Performance enhancements =head2 Simple sort() using { $a <=> $b } and the like are optimized Many common sort() operations using a simple inlined block are now optimized for faster performance. 
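As a minimal sketch of the kind of call this refers to, the comparison blocks below are the simple inlined forms. The data and variable names are illustrative and do not come from the release notes; whether a given block takes the optimized path is decided by perl itself, so a more elaborate block or a named comparison subroutine should not be assumed to benefit in the same way.

```perl
use strict;
use warnings;

my @nums  = (10, 2, 33, 4);
my @words = qw(pear apple fig);

# Simple inlined comparison blocks of the kind described above
my @asc   = sort { $a <=> $b } @nums;    # numeric, ascending
my @desc  = sort { $b <=> $a } @nums;    # numeric, descending
my @alpha = sort { $a cmp $b } @words;   # string order

print "@asc\n";     # 2 4 10 33
print "@desc\n";    # 33 10 4 2
print "@alpha\n";   # apple fig pear
```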
=head2 Optimized assignments to lexical variables Certain operations in the RHS of assignment statements have been optimized to directly set the lexical variable on the LHS, eliminating redundant copying overheads. =head2 Faster subroutine calls Minor changes in how subroutine calls are handled internally provide marginal improvements in performance. =head2 delete(), each(), values() and hash iteration are faster The hash values returned by delete(), each(), values() and hashes in a list context are the actual values in the hash, instead of copies. This results in significantly better performance, because it eliminates needless copying in most situations. =head1 Installation and Configuration Improvements =head2 -Dusethreads means something different The -Dusethreads flag now enables the experimental interpreter-based thread support by default. To get the flavor of experimental threads that was in 5.005 instead, you need to run Configure with "-Dusethreads -Duse5005threads". As of v5.6.0, interpreter-threads support is still lacking a way to create new threads from Perl (i.e., C<use Thread;> will not work with interpreter threads). C<use Thread;> continues to be available when you specify the -Duse5005threads option to Configure, bugs and all. NOTE: Support for threads continues to be an experimental feature. Interfaces and implementation are subject to sudden and drastic changes. =head2 New Configure flags The following new flags may be enabled on the Configure command line by running Configure with C<-Dflag>. 
usemultiplicity usethreads useithreads (new interpreter threads: no Perl API yet) usethreads use5005threads (threads as they were in 5.005) use64bitint (equal to now deprecated 'use64bits') use64bitall uselongdouble usemorebits uselargefiles usesocks (only SOCKS v5 supported) =head2 Threadedness and 64-bitness now more daring The Configure options enabling the use of threads and the use of 64-bitness are now more daring in the sense that they no longer rely on an explicit list of operating systems with known threads/64-bit capabilities. In other words: if your operating system has the necessary APIs and datatypes, you should be able to go ahead and use them, for threads by Configure -Dusethreads, and for 64 bits either explicitly by Configure -Duse64bitint or implicitly if your system has 64-bit wide datatypes. See also L</"64-bit support">. =head2 Long Doubles Some platforms have "long doubles", floating point numbers of even larger range than ordinary "doubles". To enable using long doubles for Perl's scalars, use -Duselongdouble. =head2 -Dusemorebits You can enable both -Duse64bitint and -Duselongdouble with -Dusemorebits. See also L</"64-bit support">. =head2 -Duselargefiles Some platforms support system APIs that are capable of handling large files (typically, files larger than two gigabytes). Perl will try to use these APIs if you ask for -Duselargefiles. See L</"Large file support"> for more information. =head2 installusrbinperl You can use "Configure -Uinstallusrbinperl", which causes installperl to skip installing perl as /usr/bin/perl as well. This is useful if you prefer not to modify /usr/bin, but harmful because many scripts assume that Perl can be found at /usr/bin/perl. =head2 SOCKS support You can use "Configure -Dusesocks" which causes Perl to probe for the SOCKS proxy protocol library (v5, not v4).
For more information on SOCKS, see: http://www.socks.nec.com/ =head2 C<-A> flag You can "post-edit" the Configure variables using the Configure C<-A> switch. The editing happens immediately after the platform specific hints files have been processed but before the actual configuration process starts. Run C<Configure -h> to find out the full C<-A> syntax. =head2 Enhanced Installation Directories The installation structure has been enriched to improve the support for maintaining multiple versions of perl, to provide locations for vendor-supplied modules, scripts, and manpages, and to ease maintenance of locally-added modules, scripts, and manpages. See the section on Installation Directories in the INSTALL file for complete details. For most users building and installing from source, the defaults should be fine. If you previously used C<Configure -Dsitelib> or C<-Dsitearch> to set special values for library directories, you might wish to consider using the new C<-Dsiteprefix> setting instead. Also, if you wish to re-use a config.sh file from an earlier version of perl, you should be sure to check that Configure makes sensible choices for the new directories. See INSTALL for complete details. =head1 Platform specific changes =head2 Supported platforms =over 4 =item * The Mach CThreads (NEXTSTEP, OPENSTEP) are now supported by the Thread extension. =item * GNU/Hurd is now supported. =item * Rhapsody/Darwin is now supported. =item * EPOC is now supported (on Psion 5). =item * The cygwin port (formerly cygwin32) has been greatly improved. =back =head2 DOS =over 4 =item * Perl now works with djgpp 2.02 (and 2.03 alpha). =item * Environment variable names are not converted to uppercase any more. =item * Incorrect exit codes from backticks have been fixed. =item * This port continues to use its own builtin globbing (not File::Glob). =back =head2 OS390 (OpenEdition MVS) Support for this EBCDIC platform has not been renewed in this release. 
There are difficulties in reconciling Perl's standardization on UTF-8 as its internal representation for characters with the EBCDIC character set, because the two are incompatible. It is unclear whether future versions will renew support for this platform, but the possibility exists. =head2 VMS Numerous revisions and extensions to configuration, build, testing, and installation process to accommodate core changes and VMS-specific options. Expand %ENV-handling code to allow runtime mapping to logical names, CLI symbols, and CRTL environ array. Extension of subprocess invocation code to accept filespecs as command "verbs". Add to Perl command line processing the ability to use default file types and to recognize Unix-style C<2E<gt>&1>. Expansion of File::Spec::VMS routines, and integration into ExtUtils::MM_VMS. Extension of ExtUtils::MM_VMS to handle complex extensions more flexibly. Barewords at start of Unix-syntax paths may be treated as text rather than only as logical names. Optional secure translation of several logical names used internally by Perl. Miscellaneous bugfixing and porting of new core code to VMS. Thanks are gladly extended to the many people who have contributed VMS patches, testing, and ideas. =head2 Win32 Perl can now emulate fork() internally, using multiple interpreters running in different concurrent threads. This support must be enabled at build time. See L<perlfork> for detailed information. When given a pathname that consists only of a drivename, such as C<A:>, opendir() and stat() now use the current working directory for the drive rather than the drive root. The builtin XSUB functions in the Win32:: namespace are documented. See L<Win32>. $^X now contains the full path name of the running executable. A Win32::GetLongPathName() function is provided to complement Win32::GetFullPathName() and Win32::GetShortPathName(). See L<Win32>. POSIX::uname() is supported. system(1,...) now returns true process IDs rather than process handles. 
kill() accepts any real process id, rather than strictly return values from system(1,...). For better compatibility with Unix, C<kill(0, $pid)> can now be used to test whether a process exists. The C<Shell> module is supported. Better support for building Perl under command.com in Windows 95 has been added. Scripts are read in binary mode by default to allow ByteLoader (and the filter mechanism in general) to work properly. For compatibility, the DATA filehandle will be set to text mode if a carriage return is detected at the end of the line containing the __END__ or __DATA__ token; if not, the DATA filehandle will be left open in binary mode. Earlier versions always opened the DATA filehandle in text mode. The glob() operator is implemented via the C<File::Glob> extension, which supports glob syntax of the C shell. This increases the flexibility of the glob() operator, but there may be compatibility issues for programs that relied on the older globbing syntax. If you want to preserve compatibility with the older syntax, you might want to run perl with C<-MFile::DosGlob>. For details and compatibility information, see L<File::Glob>. =head1 Significant bug fixes =head2 <HANDLE> on empty files With C<$/> set to C<undef>, "slurping" an empty file returns a string of zero length (instead of C<undef>, as it used to) the first time the HANDLE is read after C<$/> is set to C<undef>. Further reads yield C<undef>. This means that the following will append "foo" to an empty file (it used to do nothing): perl -0777 -pi -e 's/^/foo/' empty_file The behaviour of: perl -pi -e 's/^/foo/' empty_file is unchanged (it continues to leave the file empty). =head2 C<eval '...'> improvements Line numbers (as reflected by caller() and most diagnostics) within C<eval '...'> were often incorrect where here documents were involved. This has been corrected. 
Lexical lookups for variables appearing in C<eval '...'> within functions that were themselves called within an C<eval '...'> were searching the wrong place for lexicals. The lexical search now correctly ends at the subroutine's block boundary. The use of C<return> within C<eval {...}> caused $@ not to be reset correctly when no exception occurred within the eval. This has been fixed. Parsing of here documents used to be flawed when they appeared as the replacement expression in C<eval 's/.../.../e'>. This has been fixed. =head2 All compilation errors are true errors Some "errors" encountered at compile time were by necessity generated as warnings followed by eventual termination of the program. This enabled more such errors to be reported in a single run, rather than causing a hard stop at the first error that was encountered. The mechanism for reporting such errors has been reimplemented to queue compile-time errors and report them at the end of the compilation as true errors rather than as warnings. This fixes cases where error messages leaked through in the form of warnings when code was compiled at run time using C<eval STRING>, and also allows such errors to be reliably trapped using C<eval "...">. =head2 Implicitly closed filehandles are safer Sometimes implicitly closed filehandles (as when they are localized, and Perl automatically closes them on exiting the scope) could inadvertently set $? or $!. This has been corrected. =head2 Behavior of list slices is more consistent When taking a slice of a literal list (as opposed to a slice of an array or hash), Perl used to return an empty list if the result happened to be composed of all undef values. The new behavior is to produce an empty list if (and only if) the original list was empty. Consider the following example: @a = (1,undef,undef,2)[2,1,2]; The old behavior would have resulted in @a having no elements. The new behavior ensures it has three undefined elements. 
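The new rule can be checked by counting the elements each slice produces. This is a small sketch assuming a perl that implements the behavior described here; the variable names are illustrative only.

```perl
use strict;
use warnings;

# Slice of a non-empty literal list: one element per index,
# even though every selected value is undef.
my @a = (1, undef, undef, 2)[2, 1, 2];
print scalar(@a), "\n";    # 3

# Slice of an empty list: the result is still an empty list.
my @b = ()[1, 2];
print scalar(@b), "\n";    # 0
```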
Note in particular that the behavior of slices of the following cases remains unchanged: @a = ()[1,2]; @a = (getpwent)[7,0]; @a = (anything_returning_empty_list())[2,1,2]; @a = @b[2,1,2]; @a = @c{'a','b','c'}; See L<perldata>. =head2 C<(\$)> prototype and C<$foo{a}> A scalar reference prototype now correctly allows a hash or array element in that slot. =head2 C<goto &sub> and AUTOLOAD The C<goto &sub> construct works correctly when C<&sub> happens to be autoloaded. =head2 C<-bareword> allowed under C<use integer> The autoquoting of barewords preceded by C<-> did not work in prior versions when the C<integer> pragma was enabled. This has been fixed. =head2 Failures in DESTROY() When code in a destructor threw an exception, it went unnoticed in earlier versions of Perl, unless someone happened to be looking in $@ just after the point the destructor happened to run. Such failures are now visible as warnings when warnings are enabled. =head2 Locale bugs fixed printf() and sprintf() previously reset the numeric locale back to the default "C" locale. This has been fixed. Numbers formatted according to the local numeric locale (such as using a decimal comma instead of a decimal dot) caused "isn't numeric" warnings, even while the operations accessing those numbers produced correct results. These warnings have been discontinued. =head2 Memory leaks The C<eval 'return sub {...}'> construct could sometimes leak memory. This has been fixed. Operations that aren't filehandle constructors used to leak memory when used on invalid filehandles. This has been fixed. Constructs that modified C<@_> could fail to deallocate values in C<@_> and thus leak memory. This has been corrected. =head2 Spurious subroutine stubs after failed subroutine calls Perl could sometimes create empty subroutine stubs when a subroutine was not found in the package. Such cases stopped later method lookups from progressing into base packages. This has been corrected. 
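One way to picture the stub problem is a failed direct call followed by a method call of the same name. The sketch below is an assumption about the kind of code affected, not a reproduction of the original bug report; the package names C<Parent> and C<Child> and the method C<hello> are hypothetical. On a perl with the fix, the failed call does not prevent the later lookup from reaching the base package.

```perl
use strict;
use warnings;

package Parent;
sub hello { return "hello from Parent" }

package Child;
our @ISA = ('Parent');

package main;

# Direct (non-method) call: Child defines no hello() of its own,
# so this call dies with "Undefined subroutine".
my $ok = eval { Child::hello(); 1 };
print "direct call failed\n" unless $ok;

# The method call is still resolved through @ISA to Parent::hello.
my $greeting = Child->hello;
print "$greeting\n";
```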
=head2 Taint failures under C<-U> When running in unsafe mode, taint violations could sometimes cause silent failures. This has been fixed. =head2 END blocks and the C<-c> switch Prior versions used to run BEGIN B<and> END blocks when Perl was run in compile-only mode. Since this is typically not the expected behavior, END blocks are not executed anymore when the C<-c> switch is used, or if compilation fails. See L</"Support for CHECK blocks"> for how to run things when the compile phase ends. =head2 Potential to leak DATA filehandles Using the C<__DATA__> token creates an implicit filehandle to the file that contains the token. It is the program's responsibility to close it when it is done reading from it. This caveat is now better explained in the documentation. See L<perldata>. =head1 New or Changed Diagnostics =over 4 =item "%s" variable %s masks earlier declaration in same %s (W misc) A "my" or "our" variable has been redeclared in the current scope or statement, effectively eliminating all access to the previous instance. This is almost always a typographical error. Note that the earlier variable will still exist until the end of the scope or until all closure referents to it are destroyed. =item "my sub" not yet implemented (F) Lexically scoped subroutines are not yet implemented. Don't try that yet. =item "our" variable %s redeclared (W misc) You seem to have already declared the same global once before in the current lexical scope. =item '!' allowed only after types %s (F) The '!' is allowed in pack() and unpack() only after certain types. See L<perlfunc/pack>. =item / cannot take a count (F) You had an unpack template indicating a counted-length string, but you have also specified an explicit size for the string. See L<perlfunc/pack>. =item / must be followed by a, A or Z (F) You had an unpack template indicating a counted-length string, which must be followed by one of the letters a, A or Z to indicate what sort of string is to be unpacked. 
See L<perlfunc/pack>. =item / must be followed by a*, A* or Z* (F) You had a pack template indicating a counted-length string. Currently the only things that can have their length counted are a*, A* or Z*. See L<perlfunc/pack>. =item / must follow a numeric type (F) You had an unpack template that contained a '/', but this did not follow some numeric unpack specification. See L<perlfunc/pack>. =item /%s/: Unrecognized escape \\%c passed through (W regexp) You used a backslash-character combination which is not recognized by Perl. This combination appears in an interpolated variable or a C<'>-delimited regular expression. The character was understood literally. =item /%s/: Unrecognized escape \\%c in character class passed through (W regexp) You used a backslash-character combination which is not recognized by Perl inside character classes. The character was understood literally. =item /%s/ should probably be written as "%s" (W syntax) You have used a pattern where Perl expected to find a string, as in the first argument to C<join>. Perl will treat the true or false result of matching the pattern against $_ as the string, which is probably not what you had in mind. =item %s() called too early to check prototype (W prototype) You've called a function that has a prototype before the parser saw a definition or declaration for it, and Perl could not check that the call conforms to the prototype. You need to either add an early prototype declaration for the subroutine in question, or move the subroutine definition ahead of the call to get proper prototype checking. Alternatively, if you are certain that you're calling the function correctly, you may put an ampersand before the name to avoid the warning. See L<perlsub>.
=item %s argument is not a HASH or ARRAY element (F) The argument to exists() must be a hash or array element, such as: $foo{$bar} $ref->{"susie"}[12] =item %s argument is not a HASH or ARRAY element or slice (F) The argument to delete() must be either a hash or array element, such as: $foo{$bar} $ref->{"susie"}[12] or a hash or array slice, such as: @foo[$bar, $baz, $xyzzy] @{$ref->[12]}{"susie", "queue"} =item %s argument is not a subroutine name (F) The argument to exists() for C<exists &sub> must be a subroutine name, and not a subroutine call. C<exists &sub()> will generate this error. =item %s package attribute may clash with future reserved word: %s (W reserved) A lowercase attribute name was used that had a package-specific handler. That name might have a meaning to Perl itself some day, even though it doesn't yet. Perhaps you should use a mixed-case attribute name, instead. See L<attributes>. =item (in cleanup) %s (W misc) This prefix usually indicates that a DESTROY() method raised the indicated exception. Since destructors are usually called by the system at arbitrary points during execution, and often a vast number of times, the warning is issued only once for any number of failures that would otherwise result in the same message being repeated. Failure of user callbacks dispatched using the C<G_KEEPERR> flag could also result in this warning. See L<perlcall/G_KEEPERR>. =item <> should be quotes (F) You wrote C<< require <file> >> when you should have written C<require 'file'>. =item Attempt to join self (F) You tried to join a thread from within itself, which is an impossible task. You may be joining the wrong thread, or you may need to move the join() to some other thread. =item Bad evalled substitution pattern (F) You've used the /e switch to evaluate the replacement for a substitution, but perl found a syntax error in the code to evaluate, most likely an unexpected right brace '}'. 
=item Bad realloc() ignored (S) An internal routine called realloc() on something that had never been malloc()ed in the first place. Mandatory, but can be disabled by setting environment variable C<PERL_BADFREE> to 1. =item Bareword found in conditional (W bareword) The compiler found a bareword where it expected a conditional, which often indicates that an || or && was parsed as part of the last argument of the previous construct, for example: open FOO || die; It may also indicate a misspelled constant that has been interpreted as a bareword: use constant TYPO => 1; if (TYOP) { print "foo" } The C<strict> pragma is useful in avoiding such errors. =item Binary number > 0b11111111111111111111111111111111 non-portable (W portable) The binary number you specified is larger than 2**32-1 (4294967295) and therefore non-portable between systems. See L<perlport> for more on portability concerns. =item Bit vector size > 32 non-portable (W portable) Using bit vector sizes larger than 32 is non-portable. =item Buffer overflow in prime_env_iter: %s (W internal) A warning peculiar to VMS. While Perl was preparing to iterate over %ENV, it encountered a logical name or symbol definition which was too long, so it was truncated to the string shown. =item Can't check filesystem of script "%s" (P) For some reason you can't check the filesystem of the script for nosuid. =item Can't declare class for non-scalar %s in "%s" (S) Currently, only scalar variables can be declared with a specific class qualifier in a "my" or "our" declaration. The semantics may be extended for other types of variables in future. =item Can't declare %s in "%s" (F) Only scalar, array, and hash variables may be declared as "my" or "our" variables. They must have ordinary identifiers as names. =item Can't ignore signal CHLD, forcing to default (W signal) Perl has detected that it is being run with the SIGCHLD signal (sometimes known as SIGCLD) disabled.
Since disabling this signal will interfere with proper determination of exit status of child processes, Perl has reset the signal to its default value. This situation typically indicates that the parent program under which Perl may be running (e.g., cron) is being very careless. =item Can't modify non-lvalue subroutine call (F) Subroutines meant to be used in lvalue context should be declared as such, see L<perlsub/"Lvalue subroutines">. =item Can't read CRTL environ (S) A warning peculiar to VMS. Perl tried to read an element of %ENV from the CRTL's internal environment array and discovered the array was missing. You need to figure out where your CRTL misplaced its environ or define F<PERL_ENV_TABLES> (see L<perlvms>) so that environ is not searched. =item Can't remove %s: %s, skipping file (S) You requested an inplace edit without creating a backup file. Perl was unable to remove the original file to replace it with the modified file. The file was left unmodified. =item Can't return %s from lvalue subroutine (F) Perl detected an attempt to return illegal lvalues (such as temporary or readonly values) from a subroutine used as an lvalue. This is not allowed. =item Can't weaken a nonreference (F) You attempted to weaken something that was not a reference. Only references can be weakened. =item Character class [:%s:] unknown (F) The class in the character class [: :] syntax is unknown. See L<perlre>. =item Character class syntax [%s] belongs inside character classes (W unsafe) The character class constructs [: :], [= =], and [. .] go I<inside> character classes, the [] are part of the construct, for example: /[012[:alpha:]345]/. Note that [= =] and [. .] are not currently implemented; they are simply placeholders for future extensions. =item Constant is not %s reference (F) A constant value (perhaps declared using the C<use constant> pragma) is being dereferenced, but it amounts to the wrong type of reference. 
The message indicates the type of reference that was expected. This usually indicates a syntax error in dereferencing the constant value. See L<perlsub/"Constant Functions"> and L<constant>. =item constant(%s): %s (F) The parser found inconsistencies either while attempting to define an overloaded constant, or when trying to find the character name specified in the C<\N{...}> escape. Perhaps you forgot to load the corresponding C<overload> or C<charnames> pragma? See L<charnames> and L<overload>. =item CORE::%s is not a keyword (F) The CORE:: namespace is reserved for Perl keywords. =item defined(@array) is deprecated (D) defined() is not usually useful on arrays because it checks for an undefined I<scalar> value. If you want to see if the array is empty, just use C<if (@array) { # not empty }> for example. =item defined(%hash) is deprecated (D) defined() is not usually useful on hashes because it checks for an undefined I<scalar> value. If you want to see if the hash is empty, just use C<if (%hash) { # not empty }> for example. =item Did not produce a valid header See Server error. =item (Did you mean "local" instead of "our"?) (W misc) Remember that "our" does not localize the declared global variable. You have declared it again in the same lexical scope, which seems superfluous. =item Document contains no data See Server error. =item entering effective %s failed (F) While under the C<use filetest> pragma, switching the real and effective uids or gids failed. =item false [] range "%s" in regexp (W regexp) A character class range must start and end at a literal character, not another character class like C<\d> or C<[:alpha:]>. The "-" in your false range is interpreted as a literal "-". Consider quoting the "-", "\-". See L<perlre>. =item Filehandle %s opened only for output (W io) You tried to read from a filehandle opened only for writing. 
If you intended it to be a read/write filehandle, you needed to open it with "+<" or "+>" or "+>>" instead of with "<" or nothing. If you intended only to read from the file, use "<". See L<perlfunc/open>. =item flock() on closed filehandle %s (W closed) The filehandle you're attempting to flock() got itself closed some time before now. Check your logic flow. flock() operates on filehandles. Are you attempting to call flock() on a dirhandle by the same name? =item Global symbol "%s" requires explicit package name (F) You've said "use strict vars", which indicates that all variables must either be lexically scoped (using "my"), declared beforehand using "our", or explicitly qualified to say which package the global variable is in (using "::"). =item Hexadecimal number > 0xffffffff non-portable (W portable) The hexadecimal number you specified is larger than 2**32-1 (4294967295) and therefore non-portable between systems. See L<perlport> for more on portability concerns. =item Ill-formed CRTL environ value "%s" (W internal) A warning peculiar to VMS. Perl tried to read the CRTL's internal environ array, and encountered an element without the C<=> delimiter used to separate keys from values. The element is ignored. =item Ill-formed message in prime_env_iter: |%s| (W internal) A warning peculiar to VMS. Perl tried to read a logical name or CLI symbol definition when preparing to iterate over %ENV, and didn't see the expected delimiter between key and value, so the line was ignored. =item Illegal binary digit %s (F) You used a digit other than 0 or 1 in a binary number. =item Illegal binary digit %s ignored (W digit) You may have tried to use a digit other than 0 or 1 in a binary number. Interpretation of the binary number stopped before the offending digit. =item Illegal number of bits in vec (F) The number of bits in vec() (the third argument) must be a power of two from 1 to 32 (or 64, if your platform supports that). 
=item Integer overflow in %s number (W overflow) The hexadecimal, octal or binary number you have specified either as a literal or as an argument to hex() or oct() is too big for your architecture, and has been converted to a floating point number. On a 32-bit architecture the largest hexadecimal, octal or binary number representable without overflow is 0xFFFFFFFF, 037777777777, or 0b11111111111111111111111111111111 respectively. Note that Perl transparently promotes all numbers to a floating point representation internally--subject to loss of precision errors in subsequent operations. =item Invalid %s attribute: %s The indicated attribute for a subroutine or variable was not recognized by Perl or by a user-supplied handler. See L<attributes>. =item Invalid %s attributes: %s The indicated attributes for a subroutine or variable were not recognized by Perl or by a user-supplied handler. See L<attributes>. =item invalid [] range "%s" in regexp The offending range is now explicitly displayed. =item Invalid separator character %s in attribute list (F) Something other than a colon or whitespace was seen between the elements of an attribute list. If the previous attribute had a parenthesised parameter list, perhaps that list was terminated too soon. See L<attributes>. =item Invalid separator character %s in subroutine attribute list (F) Something other than a colon or whitespace was seen between the elements of a subroutine attribute list. If the previous attribute had a parenthesised parameter list, perhaps that list was terminated too soon. =item leaving effective %s failed (F) While under the C<use filetest> pragma, switching the real and effective uids or gids failed. =item Lvalue subs returning %s not implemented yet (F) Due to limitations in the current implementation, array and hash values cannot be returned in subroutines used in lvalue context. See L<perlsub/"Lvalue subroutines">. =item Method %s not permitted See Server error. 
=item Missing %sbrace%s on \N{} (F) Wrong syntax of character name literal C<\N{charname}> within double-quotish context. =item Missing command in piped open (W pipe) You used the C<open(FH, "| command")> or C<open(FH, "command |")> construction, but the command was missing or blank. =item Missing name in "my sub" (F) The reserved syntax for lexically scoped subroutines requires that they have a name with which they can be found. =item No %s specified for -%c (F) The indicated command line switch needs a mandatory argument, but you haven't specified one. =item No package name allowed for variable %s in "our" (F) Fully qualified variable names are not allowed in "our" declarations, because that doesn't make much sense under existing semantics. Such syntax is reserved for future extensions. =item No space allowed after -%c (F) The argument to the indicated command line switch must follow immediately after the switch, without intervening spaces. =item no UTC offset information; assuming local time is UTC (S) A warning peculiar to VMS. Perl was unable to find the local timezone offset, so it's assuming that local system time is equivalent to UTC. If it's not, define the logical name F<SYS$TIMEZONE_DIFFERENTIAL> to translate to the number of seconds which need to be added to UTC to get local time. =item Octal number > 037777777777 non-portable (W portable) The octal number you specified is larger than 2**32-1 (4294967295) and therefore non-portable between systems. See L<perlport> for more on portability concerns. =item panic: del_backref (P) Failed an internal consistency check while trying to reset a weak reference. =item panic: kid popen errno read (F) A forked child returned an incomprehensible message about its errno. =item panic: magic_killbackrefs (P) Failed an internal consistency check while trying to reset all weak references to an object.
=item Parentheses missing around "%s" list (W parenthesis) You said something like my $foo, $bar = @_; when you meant my ($foo, $bar) = @_; Remember that "my", "our", and "local" bind tighter than comma. =item Possible unintended interpolation of %s in string (W ambiguous) It used to be that Perl would try to guess whether you wanted an array interpolated or a literal @. It no longer does this; arrays are now I<always> interpolated into strings. This means that if you try something like: print "fred@example.com"; and the array C<@example> doesn't exist, Perl is going to print C<fred.com>, which is probably not what you wanted. To get a literal C<@> sign in a string, put a backslash before it, just as you would to get a literal C<$> sign. =item Possible Y2K bug: %s (W y2k) You are concatenating the number 19 with another number, which could be a potential Year 2000 problem. =item pragma "attrs" is deprecated, use "sub NAME : ATTRS" instead (W deprecated) You have written something like this: sub doit { use attrs qw(locked); } You should use the new declaration syntax instead. sub doit : locked { ... The C<use attrs> pragma is now obsolete, and is only provided for backward-compatibility. See L<perlsub/"Subroutine Attributes">. =item Premature end of script headers See Server error. =item Repeat count in pack overflows (F) You can't specify a repeat count so large that it overflows your signed integers. See L<perlfunc/pack>. =item Repeat count in unpack overflows (F) You can't specify a repeat count so large that it overflows your signed integers. See L<perlfunc/unpack>. =item realloc() of freed memory ignored (S) An internal routine called realloc() on something that had already been freed. =item Reference is already weak (W misc) You have attempted to weaken a reference that is already weak. Doing so has no effect. 
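A minimal sketch of how the last warning above can arise, assuming the C<weaken> and C<isweak> functions from Scalar::Util. The variable names are illustrative only, and the warning is captured with a C<__WARN__> handler purely so that it can be inspected:

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken isweak);

# Collect warnings instead of letting them go to STDERR.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };

my $target = { name => 'example' };   # strong reference keeps the hash alive
my $ref    = $target;                 # second reference, weakened below

weaken($ref);   # $ref becomes a weak reference
weaken($ref);   # weakening again has no effect, but warns under "misc"

print "still weak\n" if isweak($ref);
print "warning: $_" for @warnings;
```

The second call leaves C<$ref> unchanged; only the diagnostic is produced.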
=item setpgrp can't take arguments (F) Your system has the setpgrp() from BSD 4.2, which takes no arguments, unlike POSIX setpgid(), which takes a process ID and process group ID. =item Strange *+?{} on zero-length expression (W regexp) You applied a regular expression quantifier in a place where it makes no sense, such as on a zero-width assertion. Try putting the quantifier inside the assertion instead. For example, the way to match "abc" provided that it is followed by three repetitions of "xyz" is C</abc(?=(?:xyz){3})/>, not C</abc(?=xyz){3}/>. =item switching effective %s is not implemented (F) While under the C<use filetest> pragma, we cannot switch the real and effective uids or gids. =item This Perl can't reset CRTL environ elements (%s) =item This Perl can't set CRTL environ elements (%s=%s) (W internal) Warnings peculiar to VMS. You tried to change or delete an element of the CRTL's internal environ array, but your copy of Perl wasn't built with a CRTL that contained the setenv() function. You'll need to rebuild Perl with a CRTL that does, or redefine F<PERL_ENV_TABLES> (see L<perlvms>) so that the environ array isn't the target of the change to %ENV which produced the warning. =item Too late to run %s block (W void) A CHECK or INIT block is being defined during run time proper, when the opportunity to run them has already passed. Perhaps you are loading a file with C<require> or C<do> when you should be using C<use> instead. Or perhaps you should put the C<require> or C<do> inside a BEGIN block. =item Unknown open() mode '%s' (F) The second argument of 3-argument open() is not among the list of valid modes: C<< < >>, C<< > >>, C<<< >> >>>, C<< +< >>, C<< +> >>, C<<< +>> >>>, C<-|>, C<|->. =item Unknown process %x sent message to prime_env_iter: %s (P) An error peculiar to VMS. Perl was reading values for %ENV before iterating over it, and someone else stuck a message in the stream of data Perl expected. 
Someone's very confused, or perhaps trying to subvert Perl's population of %ENV for nefarious purposes. =item Unrecognized escape \\%c passed through (W misc) You used a backslash-character combination which is not recognized by Perl. The character was understood literally. =item Unterminated attribute parameter in attribute list (F) The lexer saw an opening (left) parenthesis character while parsing an attribute list, but the matching closing (right) parenthesis character was not found. You may need to add (or remove) a backslash character to get your parentheses to balance. See L<attributes>. =item Unterminated attribute list (F) The lexer found something other than a simple identifier at the start of an attribute, and it wasn't a semicolon or the start of a block. Perhaps you terminated the parameter list of the previous attribute too soon. See L<attributes>. =item Unterminated attribute parameter in subroutine attribute list (F) The lexer saw an opening (left) parenthesis character while parsing a subroutine attribute list, but the matching closing (right) parenthesis character was not found. You may need to add (or remove) a backslash character to get your parentheses to balance. =item Unterminated subroutine attribute list (F) The lexer found something other than a simple identifier at the start of a subroutine attribute, and it wasn't a semicolon or the start of a block. Perhaps you terminated the parameter list of the previous attribute too soon. =item Value of CLI symbol "%s" too long (W misc) A warning peculiar to VMS. Perl tried to read the value of an %ENV element from a CLI symbol table, and found a resultant string longer than 1024 characters. The return value has been truncated to 1024 characters. =item Version number must be a constant number (P) The attempt to translate a C<use Module n.n LIST> statement into its equivalent C<BEGIN> block found an internal inconsistency with the version number. 
=back =head1 New tests =over 4 =item lib/attrs Compatibility tests for C<sub : attrs> vs the older C<use attrs>. =item lib/env Tests for new environment scalar capability (e.g., C<use Env qw($BAR);>). =item lib/env-array Tests for new environment array capability (e.g., C<use Env qw(@PATH);>). =item lib/io_const IO constants (SEEK_*, _IO*). =item lib/io_dir Directory-related IO methods (new, read, close, rewind, tied delete). =item lib/io_multihomed INET sockets with multi-homed hosts. =item lib/io_poll IO poll(). =item lib/io_unix UNIX sockets. =item op/attrs Regression tests for C<my ($x,@y,%z) : attrs> and C<sub : attrs>. =item op/filetest File test operators. =item op/lex_assign Verify operations that access pad objects (lexicals and temporaries). =item op/exists_sub Verify C<exists &sub> operations. =back =head1 Incompatible Changes =head2 Perl Source Incompatibilities Beware that any new warnings that have been added or old ones that have been enhanced are B<not> considered incompatible changes. Since all new warnings must be explicitly requested via the C<-w> switch or the C<warnings> pragma, it is ultimately the programmer's responsibility to ensure that warnings are enabled judiciously. =over 4 =item CHECK is a new keyword All subroutine definitions named CHECK are now special. See L</"Support for CHECK blocks"> for more information. =item Treatment of list slices of undef has changed There is a potential incompatibility in the behavior of list slices that consist entirely of undefined values. See L</"Behavior of list slices is more consistent">. =item Format of $English::PERL_VERSION is different The English module now sets $PERL_VERSION to $^V (a string value) rather than C<$]> (a numeric value). This is a potential incompatibility. Send us a report via perlbug if you are affected by this. See L</"Improved Perl version numbering system"> for the reasons for this change.
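As a rough illustration of the difference between the two forms, the following sketch prints both. It assumes the English module is loaded with C<-no_match_vars>, and the C<%vd> format is used only to render $PERL_VERSION as dotted integers; the variable names are illustrative:

```perl
use strict;
use warnings;
use English qw(-no_match_vars);

# $PERL_VERSION is the English name for $^V; $] is the older numeric form.
my $dotted  = sprintf "%vd", $PERL_VERSION;   # dotted integers
my $numeric = $];                             # a single decimal number

print "dotted:  $dotted\n";
print "numeric: $numeric\n";
```

Code that compared $PERL_VERSION numerically is the kind most likely to notice the change.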
=item Literals of the form C<1.2.3> parse differently Previously, numeric literals with more than one dot in them were interpreted as a floating point number concatenated with one or more numbers. Such "numbers" are now parsed as strings composed of the specified ordinals. For example, C<print 97.98.99> used to output C<97.9899> in earlier versions, but now prints C<abc>. See L</"Support for strings represented as a vector of ordinals">. =item Possibly changed pseudo-random number generator Perl programs that depend on reproducing a specific set of pseudo-random numbers may now produce different output due to improvements made to the rand() builtin. You can use C<sh Configure -Drandfunc=rand> to obtain the old behavior. See L</"Better pseudo-random number generator">. =item Hashing function for hash keys has changed Even though Perl hashes are not order preserving, the apparently random order encountered when iterating on the contents of a hash is actually determined by the hashing algorithm used. Improvements in the algorithm may yield a random order that is B<different> from that of previous versions, especially when iterating on hashes. See L</"Better worst-case behavior of hashes"> for additional information. =item C<undef> fails on read only values Using the C<undef> operator on a readonly value (such as $1) has the same effect as assigning C<undef> to the readonly value--it throws an exception. =item Close-on-exec bit may be set on pipe and socket handles Pipe and socket handles are also now subject to the close-on-exec behavior determined by the special variable $^F. See L</"More consistent close-on-exec behavior">. =item Writing C<"$$1"> to mean C<"${$}1"> is unsupported Perl 5.004 deprecated the interpretation of C<$$1> and similar within interpolated strings to mean C<$$ . "1">, but still allowed it. In Perl 5.6.0 and later, C<"$$1"> always means C<"${$1}">. 
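A small sketch of the two spellings, assuming a package variable named C<$greeting> that exists only for this illustration. Symbolic references have to be allowed locally because C<$1> holds a plain string:

```perl
use strict;
use warnings;

our $greeting = "hello";        # package variable looked up by name below

my ($deref, $pid_form);
if ("greeting" =~ /(\w+)/) {    # $1 now holds the string "greeting"
    no strict 'refs';           # needed to dereference a string as a name
    $deref    = "$$1";          # parsed as "${$1}", i.e. $greeting
    $pid_form = "${$}1";        # the process ID followed by a literal 1
}

print "deref:    $deref\n";
print "pid form: $pid_form\n";
```

Writing C<"${$}1"> explicitly is the way to get the process ID followed by a digit.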
=item delete(), each(), values() and C<\(%h)> operate on aliases to values, not copies delete(), each(), values() and hashes (e.g. C<\(%h)>) in a list context return the actual values in the hash, instead of copies (as they used to in earlier versions). Typical idioms for using these constructs copy the returned values, but this can make a significant difference when creating references to the returned values. Keys in the hash are still returned as copies when iterating on a hash. See also L</"delete(), each(), values() and hash iteration are faster">. =item vec(EXPR,OFFSET,BITS) enforces powers-of-two BITS vec() generates a run-time error if the BITS argument is not a valid power-of-two integer. =item Text of some diagnostic output has changed Most references to internal Perl operations in diagnostics have been changed to be more descriptive. This may be an issue for programs that may incorrectly rely on the exact text of diagnostics for proper functioning. =item C<%@> has been removed The undocumented special variable C<%@> that used to accumulate "background" errors (such as those that happen in DESTROY()) has been removed, because it could potentially result in memory leaks. =item Parenthesized not() behaves like a list operator The C<not> operator now falls under the "if it looks like a function, it behaves like a function" rule. As a result, the parenthesized form can be used with C<grep> and C<map>. The following construct used to be a syntax error before, but it works as expected now: grep not($_), @things; On the other hand, using C<not> with a literal list slice may not work. The following previously allowed construct: print not (1,2,3)[0]; needs to be written with additional parentheses now: print not((1,2,3)[0]); The behavior remains unaffected when C<not> is not followed by parentheses. =item Semantics of bareword prototype C<(*)> have changed The semantics of the bareword prototype C<*> have changed. 
Perl 5.005 always coerced simple scalar arguments to a typeglob, which wasn't useful in situations where the subroutine must distinguish between a simple scalar and a typeglob. The new behavior is to not coerce bareword arguments to a typeglob. The value will always be visible as either a simple scalar or as a reference to a typeglob. See L</"More functional bareword prototype (*)">. =item Semantics of bit operators may have changed on 64-bit platforms If your platform is natively 64-bit, or if Perl has been configured to use 64-bit integers (that is, $Config{ivsize} is 8), there is a potential incompatibility in the behavior of bitwise numeric operators (& | ^ ~ << >>). These operators used to strictly operate on the lower 32 bits of integers in previous versions, but now operate over the entire native integral width. In particular, note that unary C<~> will produce different results on platforms that have different $Config{ivsize}. For portability, be sure to mask off the excess bits in the result of unary C<~>, e.g., C<~$x & 0xffffffff>. See L</"Bit operators support full native integer width">. =item More builtins taint their results As described in L</"Improved security features">, there may be more sources of taint in a Perl program. To avoid these new tainting behaviors, you can build Perl with the Configure option C<-Accflags=-DINCOMPLETE_TAINTS>. Beware that the ensuing perl binary may be insecure. =back =head2 C Source Incompatibilities =over 4 =item C<PERL_POLLUTE> Release 5.005 grandfathered old global symbol names by providing preprocessor macros for extension source compatibility. As of release 5.6.0, these preprocessor definitions are not available by default. You need to explicitly compile perl with C<-DPERL_POLLUTE> to get these definitions.
For extensions still using the old symbols, this option can be specified via MakeMaker: perl Makefile.PL POLLUTE=1 =item C<PERL_IMPLICIT_CONTEXT> This new build option provides a set of macros for all API functions such that an implicit interpreter/thread context argument is passed to every API function. As a result of this, something like C<sv_setsv(foo,bar)> amounts to a macro invocation that actually translates to something like C<Perl_sv_setsv(my_perl,foo,bar)>. While this is generally expected to not have any significant source compatibility issues, the difference between a macro and a real function call will need to be considered. This means that there B<is> a source compatibility issue as a result of this if your extensions attempt to use pointers to any of the Perl API functions. Note that the above issue is not relevant to the default build of Perl, whose interfaces continue to match those of prior versions (but subject to the other options described here). See L<perlguts/Background and PERL_IMPLICIT_CONTEXT> for detailed information on the ramifications of building Perl with this option. NOTE: PERL_IMPLICIT_CONTEXT is automatically enabled whenever Perl is built with one of -Dusethreads, -Dusemultiplicity, or both. It is not intended to be enabled by users at this time. =item C<PERL_POLLUTE_MALLOC> Enabling Perl's malloc in release 5.005 and earlier caused the namespace of the system's malloc family of functions to be usurped by the Perl versions, since by default they used the same names. Besides causing problems on platforms that do not allow these functions to be cleanly replaced, this also meant that the system versions could not be called in programs that used Perl's malloc. Previous versions of Perl have allowed this behaviour to be suppressed with the HIDEMYMALLOC and EMBEDMYMALLOC preprocessor definitions. As of release 5.6.0, Perl's malloc family of functions have default names distinct from the system versions. 
You need to explicitly compile perl with C<-DPERL_POLLUTE_MALLOC> to get the older behaviour. HIDEMYMALLOC and EMBEDMYMALLOC have no effect, since the behaviour they enabled is now the default. Note that these functions do B<not> constitute Perl's memory allocation API. See L<perlguts/"Memory Allocation"> for further information about that. =back =head2 Compatible C Source API Changes =over 4 =item C<PATCHLEVEL> is now C<PERL_VERSION> The cpp macros C<PERL_REVISION>, C<PERL_VERSION>, and C<PERL_SUBVERSION> are now available by default from perl.h, and reflect the base revision, patchlevel, and subversion respectively. C<PERL_REVISION> had no prior equivalent, while C<PERL_VERSION> and C<PERL_SUBVERSION> were previously available as C<PATCHLEVEL> and C<SUBVERSION>. The new names cause less pollution of the B<cpp> namespace and reflect what the numbers have come to stand for in common practice. For compatibility, the old names are still supported when F<patchlevel.h> is explicitly included (as required before), so there is no source incompatibility from the change. =back =head2 Binary Incompatibilities In general, the default build of this release is expected to be binary compatible for extensions built with the 5.005 release or its maintenance versions. However, specific platforms may have broken binary compatibility due to changes in the defaults used in hints files. Therefore, please be sure to always check the platform-specific README files for any notes to the contrary. The usethreads or usemultiplicity builds are B<not> binary compatible with the corresponding builds in 5.005. On platforms that require an explicit list of exports (AIX, OS/2 and Windows, among others), purely internal symbols such as parser functions and the run time opcodes are not exported by default. Perl 5.005 used to export all functions irrespective of whether they were considered part of the public API or not. For the full list of public API functions, see L<perlapi>. 
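Returning to the source-level note on bit operators under L</"Perl Source Incompatibilities">, the masking advice for unary C<~> can be sketched as follows. The variable names are illustrative, and the Config module is used only to report the integer size of the running perl:

```perl
use strict;
use warnings;
use Config;

my $x = 0;

my $native = ~$x;                # all bits set across the native integer width
my $masked = ~$x & 0xffffffff;   # only the low 32 bits are kept

printf "ivsize=%d native=%u masked=%u\n",
    $Config{ivsize}, $native, $masked;
```

The masked value is the same whatever $Config{ivsize} is, whereas the unmasked one depends on the build.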
=head1 Known Problems =head2 Thread test failures Subtests 19 and 20 of the lib/thr5005.t test are known to fail due to fundamental problems in the 5.005 threading implementation. These are not new failures--Perl 5.005_0x has the same bugs, but didn't have these tests. =head2 EBCDIC platforms not supported In earlier releases of Perl, EBCDIC environments like OS390 (also known as Open Edition MVS) and VM-ESA were supported. Due to changes required by the UTF-8 (Unicode) support, the EBCDIC platforms are not supported in Perl 5.6.0. =head2 In 64-bit HP-UX the lib/io_multihomed test may hang The lib/io_multihomed test may hang in HP-UX if Perl has been configured to be 64-bit. Because other 64-bit platforms do not hang in this test, HP-UX is suspect. All other tests pass in 64-bit HP-UX. The test attempts to create and connect to "multihomed" sockets (sockets which have multiple IP addresses). =head2 NEXTSTEP 3.3 POSIX test failure In NEXTSTEP 3.3p2 the implementation of strftime(3) in the operating system libraries is buggy: the %j format numbers the days of a month starting from zero, which, while logical to programmers, may cause subtests 19 to 27 of the lib/posix test to fail. =head2 Tru64 (aka Digital UNIX, aka DEC OSF/1) lib/sdbm test failure with gcc If compiled with gcc 2.95 the lib/sdbm test will fail (dump core). The cure is to use the vendor cc, which comes with the operating system and produces good code. =head2 UNICOS/mk CC failures during Configure run In UNICOS/mk the following errors may appear during the Configure run: Guessing which symbols your C compiler and preprocessor define... CC-20 cc: ERROR File = try.c, Line = 3 ... bad switch yylook 79bad switch yylook 79bad switch yylook 79bad switch yylook 79#ifdef A29K ... 4 errors detected in the compilation of "try.c". The culprit is the broken awk of UNICOS/mk.
The effect is fortunately rather mild: Perl itself is not adversely affected by the error, only the h2ph utility coming with Perl, and that is rather rarely needed these days. =head2 Arrow operator and arrays When the left argument to the arrow operator C<< -> >> is an array, or the C<scalar> operator operating on an array, the result of the operation must be considered erroneous. For example: @x->[2] scalar(@x)->[2] These expressions will get run-time errors in some future release of Perl. =head2 Experimental features As discussed above, many features are still experimental. Interfaces and implementation of these features are subject to change, and in extreme cases, even subject to removal in some future release of Perl. These features include the following: =over 4 =item Threads =item Unicode =item 64-bit support =item Lvalue subroutines =item Weak references =item The pseudo-hash data type =item The Compiler suite =item Internal implementation of file globbing =item The DB module =item The regular expression code constructs: C<(?{ code })> and C<(??{ code })> =back =head1 Obsolete Diagnostics =over 4 =item Character class syntax [: :] is reserved for future extensions (W) Within regular expression character classes ([]) the syntax beginning with "[:" and ending with ":]" is reserved for future extensions. If you need to represent those character sequences inside a regular expression character class, just quote the square brackets with the backslash: "\[:" and ":\]". =item Ill-formed logical name |%s| in prime_env_iter (W) A warning peculiar to VMS. A logical name was encountered when preparing to iterate over %ENV which violates the syntactic rules governing logical names. Because it cannot be translated normally, it is skipped, and will not appear in %ENV. This may be a benign occurrence, as some software packages might directly modify logical name tables and introduce nonstandard names, or it may indicate that a logical name table has been corrupted. 
=item In string, @%s now must be written as \@%s The description of this error used to say: (Someday it will simply assume that an unbackslashed @ interpolates an array.) That day has come, and this fatal error has been removed. It has been replaced by a non-fatal warning instead. See L</Arrays now always interpolate into double-quoted strings> for details. =item Probable precedence problem on %s (W) The compiler found a bareword where it expected a conditional, which often indicates that an || or && was parsed as part of the last argument of the previous construct, for example: open FOO || die; =item regexp too big (F) The current implementation of regular expressions uses shorts as address offsets within a string. Unfortunately this means that if the regular expression compiles to longer than 32767, it'll blow up. Usually when you want a regular expression this big, there is a better way to do it with multiple statements. See L<perlre>. =item Use of "$$<digit>" to mean "${$}<digit>" is deprecated (D) Perl versions before 5.004 misinterpreted any type marker followed by "$" and a digit. For example, "$$0" was incorrectly taken to mean "${$}0" instead of "${$0}". This bug is (mostly) fixed in Perl 5.004. However, the developers of Perl 5.004 could not fix this bug completely, because at least two widely-used modules depend on the old meaning of "$$0" in a string. So Perl 5.004 still interprets "$$<digit>" in the old (broken) way inside strings; but it generates this message as a warning. And in Perl 5.005, this special treatment will cease. =back =head1 Reporting Bugs If you find what you think is a bug, you might check the articles recently posted to the comp.lang.perl.misc newsgroup. There may also be information at http://www.perl.com/perl/ , the Perl Home Page. If you believe you have an unreported bug, please run the B<perlbug> program included with your release. Be sure to trim your bug down to a tiny but sufficient test case. 
Your bug report, along with the output of C<perl -V>, will be sent off to perlbug@perl.org to be analysed by the Perl porting team. =head1 SEE ALSO The F<Changes> file for exhaustive details on what changed. The F<INSTALL> file for how to build Perl. The F<README> file for general stuff. The F<Artistic> and F<Copying> files for copyright information. =head1 HISTORY Written by Gurusamy Sarathy <F<gsar@activestate.com>>, with many contributions from The Perl Porters. Send omissions or corrections to <F<perlbug@perl.org>>. =cut =encoding utf8 =head1 NAME perldelta - what is new for perl v5.32.1 =head1 DESCRIPTION This document describes differences between the 5.32.0 release and the 5.32.1 release. If you are upgrading from an earlier release such as 5.31.0, first read L<perl5320delta>, which describes differences between 5.31.0 and 5.32.0. =head1 Incompatible Changes There are no changes intentionally incompatible with Perl 5.32.0. If any exist, they are bugs, and we request that you submit a report. See L</Reporting Bugs> below. =head1 Modules and Pragmata =head2 Updated Modules and Pragmata =over 4 =item * L<Data::Dumper> has been upgraded from version 2.174 to 2.174_01. A number of memory leaks have been fixed. =item * L<DynaLoader> has been upgraded from version 1.47 to 1.47_01. =item * L<Module::CoreList> has been upgraded from version 5.20200620 to 5.20210123. =item * L<Opcode> has been upgraded from version 1.47 to 1.48. A warning has been added about evaluating untrusted code with the perl interpreter. =item * L<Safe> has been upgraded from version 2.41 to 2.41_01. A warning has been added about evaluating untrusted code with the perl interpreter. =back =head1 Documentation =head2 New Documentation =head3 L<perlgov> Documentation of the newly formed rules of governance for Perl. =head3 L<perlsecpolicy> Documentation of how the Perl security team operates and how the team evaluates new security reports.
=head2 Changes to Existing Documentation We have attempted to update the documentation to reflect the changes listed in this document. If you find any we have missed, open an issue at L<https://github.com/Perl/perl5/issues>. Additionally, the following selected changes have been made: =head3 L<perlop> =over 4 =item * Document range op behaviour change. =back =head1 Diagnostics The following additions or changes have been made to diagnostic output, including warnings and fatal error messages. For the complete list of diagnostic messages, see L<perldiag>. =head2 Changes to Existing Diagnostics =over 4 =item * L<\K not permitted in lookahead/lookbehind in regex; marked by <-- HERE in mE<sol>%sE<sol>|perldiag/"\K not permitted in lookahead/lookbehind in regex; marked by <-- HERE in m/%s/"> This error was incorrectly produced in some cases involving nested lookarounds. This has been fixed. [L<GH #18123|https://github.com/Perl/perl5/issues/18123>] =back =head1 Configuration and Compilation =over 4 =item * Newer 64-bit versions of the Intel C/C++ compiler are now recognized and have the correct flags set. =item * We now trap SIGBUS when F<Configure> checks for C<va_copy>. On several systems the attempt to determine if we need C<va_copy> or similar results in a SIGBUS instead of the expected SIGSEGV, which previously caused a core dump. [L<GH #18148|https://github.com/Perl/perl5/issues/18148>] =back =head1 Testing Tests were added and changed to reflect the other additions and changes in this release. =head1 Platform Support =head2 Platform-Specific Notes =over 4 =item MacOS (Darwin) The hints file for darwin has been updated to handle future macOS versions beyond 10. Perl can now be built on macOS Big Sur. [L<GH #17946|https://github.com/Perl/perl5/issues/17946>, L<GH #18406|https://github.com/Perl/perl5/issues/18406>] =item Minix Build errors on Minix have been fixed. 
[L<GH #17908|https://github.com/Perl/perl5/issues/17908>] =back =head1 Selected Bug Fixes =over 4 =item * Some list assignments involving C<undef> on the left-hand side were over-optimized and produced incorrect results. [L<GH #16685|https://github.com/Perl/perl5/issues/16685>, L<GH #17816|https://github.com/Perl/perl5/issues/17816>] =item * Fixed a bug in which some regexps with recursive subpatterns matched incorrectly. [L<GH #18096|https://github.com/Perl/perl5/issues/18096>] =item * Fixed a deadlock that hung the build when Perl is compiled for debugging memory problems and has PERL_MEM_LOG enabled. [L<GH #18341|https://github.com/Perl/perl5/issues/18341>] =item * Fixed a crash in the use of chained comparison operators when run under "no warnings 'uninitialized'". [L<GH #17917|https://github.com/Perl/perl5/issues/17917>, L<GH #18380|https://github.com/Perl/perl5/issues/18380>] =item * Exceptions thrown from destructors during global destruction are no longer swallowed. [L<GH #18063|https://github.com/Perl/perl5/issues/18063>] =back =head1 Acknowledgements Perl 5.32.1 represents approximately 7 months of development since Perl 5.32.0 and contains approximately 7,000 lines of changes across 80 files from 23 authors. Excluding auto-generated files, documentation and release tools, there were approximately 1,300 lines of changes to 23 .pm, .t, .c and .h files. Perl continues to flourish into its fourth decade thanks to a vibrant community of users and developers. The following people are known to have contributed the improvements that became Perl 5.32.1: Adam Hartley, Andy Dougherty, Dagfinn Ilmari Mannsåker, Dan Book, David Mitchell, Graham Knop, Graham Ollis, Hauke D, H.Merijn Brand, Hugo van der Sanden, John Lightsey, Karen Etheridge, Karl Williamson, Leon Timmermans, Max Maischein, Nicolas R., Ricardo Signes, Richard Leach, Sawyer X, Sevan Janiyan, Steve Hay, Tom Hukins, Tony Cook. 
The list above is almost certainly incomplete as it is automatically generated from version control history. In particular, it does not include the names of the (very much appreciated) contributors who reported issues to the Perl bug tracker. Many of the changes included in this version originated in the CPAN modules included in Perl's core. We're grateful to the entire CPAN community for helping Perl to flourish. For a more complete list of all of Perl's historical contributors, please see the F<AUTHORS> file in the Perl source distribution. =head1 Reporting Bugs If you find what you think is a bug, you might check the perl bug database at L<https://github.com/Perl/perl5/issues>. There may also be information at L<http://www.perl.org/>, the Perl Home Page. If you believe you have an unreported bug, please open an issue at L<https://github.com/Perl/perl5/issues>. Be sure to trim your bug down to a tiny but sufficient test case. If the bug you are reporting has security implications which make it inappropriate to send to a public issue tracker, then see L<perlsec/SECURITY VULNERABILITY CONTACT INFORMATION> for details of how to report the issue. =head1 Give Thanks If you wish to thank the Perl 5 Porters for the work we had done in Perl 5, you can do so by running the C<perlthanks> program: perlthanks This will send an email to the Perl 5 Porters list with your show of thanks. =head1 SEE ALSO The F<Changes> file for an explanation of how to view exhaustive details on what changed. The F<INSTALL> file for how to build Perl. The F<README> file for general stuff. The F<Artistic> and F<Copying> files for copyright information. =cut =encoding utf8 =for comment Consistent formatting of this file is achieved with: perl ./Porting/podtidy pod/perlsource.pod =head1 NAME perlsource - A guide to the Perl source tree =head1 DESCRIPTION This document describes the layout of the Perl source tree.
If you're hacking on the Perl core, this will help you find what you're looking for. =head1 FINDING YOUR WAY AROUND The Perl source tree is big. Here are some of the things you'll find in it: =head2 C code The C source code and header files mostly live in the root of the source tree. There are a few platform-specific directories which contain C code. In addition, some of the modules shipped with Perl include C or XS code. See L<perlinterp> for more details on the files that make up the Perl interpreter, as well as details on how it works. =head2 Core modules Modules shipped as part of the Perl core live in four subdirectories. Two of these directories contain modules that live in the core, and two contain modules that can also be released separately on CPAN. Modules which can be released on CPAN are known as "dual-life" modules. =over 4 =item * F<lib/> This directory contains pure-Perl modules which are only released as part of the core. This directory contains I<all> of the modules and their tests, unlike other core modules. =item * F<ext/> Like F<lib/>, this directory contains modules which are only released as part of the core. Unlike F<lib/>, however, a module under F<ext/> generally has a CPAN-style directory- and file-layout and its own F<Makefile.PL>. There is no expectation that a module under F<ext/> will work with earlier versions of Perl 5. Hence, such a module may take full advantage of syntactical and other improvements in Perl 5 blead. =item * F<dist/> This directory is for dual-life modules where the blead source is canonical. Note that some modules in this directory may not yet have been released separately on CPAN. Modules under F<dist/> should make an effort to work with earlier versions of Perl 5. =item * F<cpan/> This directory contains dual-life modules where the CPAN module is canonical. Do not patch these modules directly! Changes to these modules should be submitted to the maintainer of the CPAN module.
Once those changes are applied and released, the new version of the module will be incorporated into the core.

=back

For some dual-life modules, it has not yet been determined if the CPAN version or the blead source is canonical. Until that is done, those modules should be in F<cpan/>.

=head2 Tests

The Perl core has an extensive test suite. If you add new tests (or new modules with tests), you may need to update the F<t/TEST> file so that the tests are run.

=over 4

=item * Module tests

Tests for core modules in the F<lib/> directory are right next to the module itself. For example, we have F<lib/strict.pm> and F<lib/strict.t>.

Tests for modules in F<ext/> and the dual-life modules are in F<t/> subdirectories for each module, like a standard CPAN distribution.

=item * F<t/base/>

Tests for the absolute basic functionality of Perl. This includes C<if>, basic file reads and writes, simple regexes, etc. These are run first in the test suite and if any of them fail, something is I<really> broken.

=item * F<t/cmd/>

Tests for basic control structures, C<if>/C<else>, C<while>, subroutines, etc.

=item * F<t/comp/>

Tests for basic issues of how Perl parses and compiles itself.

=item * F<t/io/>

Tests for built-in IO functions, including command line arguments.

=item * F<t/mro/>

Tests for perl's method resolution order implementations (see L<mro>).

=item * F<t/op/>

Tests for perl's built-in functions that don't fit into any of the other directories.

=item * F<t/opbasic/>

Tests for perl's built-in functions which, like those in F<t/op/>, do not fit into any of the other directories, but which, in addition, cannot use F<t/test.pl>, as that program depends on functionality which the test file itself is testing.

=item * F<t/re/>

Tests for regex related functions or behaviour. (These used to live in F<t/op/>.)

=item * F<t/run/>

Tests for features of how perl actually runs, including exit codes and handling of PERL* environment variables.

=item * F<t/uni/>

Tests for the core support of Unicode.

=item * F<t/win32/>

Windows-specific tests.

=item * F<t/porting/>

Tests the state of the source tree for various common errors. For example, it tests that everyone who is listed in the git log has a corresponding entry in the F<AUTHORS> file.

=item * F<t/lib/>

The old home for the module tests; you shouldn't put anything new in here. There are still some bits and pieces hanging around in here that need to be moved. Perhaps you could move them? Thanks!

=back

=head2 Documentation

All of the core documentation intended for end users lives in F<pod/>. Individual modules in F<lib/>, F<ext/>, F<dist/>, and F<cpan/> usually have their own documentation, either in the F<Module.pm> file or an accompanying F<Module.pod> file.

Finally, documentation intended for core Perl developers lives in the F<Porting/> directory.

=head2 Hacking tools and documentation

The F<Porting> directory contains a grab bag of code and documentation intended to help porters work on Perl. Some of the highlights include:

=over 4

=item * F<check*>

These are scripts which will check the source for things like ANSI C violations, POD encoding issues, etc.

=item * F<Maintainers>, F<Maintainers.pl>, and F<Maintainers.pm>

These files contain information on who maintains which modules. Run C<perl Porting/Maintainers -M Module::Name> to find out more information about a dual-life module.

=item * F<podtidy>

Tidies a pod file. It's a good idea to run this on a pod file you've patched.

=back

=head2 Build system

The Perl build system on *nix-like systems starts with the F<Configure> script in the root directory.

Platform-specific pieces of the build system also live in platform-specific directories like F<win32/>, F<vms/>, etc. Windows and VMS have their own Configure-like scripts, in their respective directories.

The F<Configure> script (or a platform-specific similar script) is ultimately responsible for generating a F<Makefile> from F<Makefile.SH>.

The build system that Perl uses is called metaconfig. This system is maintained separately from the Perl core, and knows about the platform-specific Configure-like scripts, as well as F<Configure> itself.

The metaconfig system has its own git repository. Please see its README file in L<https://github.com/Perl/metaconfig> for more details.

The F<Cross> directory contains various files related to cross-compiling Perl. See F<Cross/README> for more details.

=head2 F<AUTHORS>

This file lists everyone who's contributed to Perl. If you submit a patch, you should add your name to this file as part of the patch.

=head2 F<MANIFEST>

The F<MANIFEST> file in the root of the source tree contains a list of every file in the Perl core, as well as a brief description of each file.

You can get an overview of all the files with this command:

  % perl -lne 'print if /^[^\/]+\.[ch]\s+/' MANIFEST

=head1 NAME

perlmodinstall - Installing CPAN Modules

=head1 DESCRIPTION

You can think of a module as the fundamental unit of reusable Perl code; see L<perlmod> for details. Whenever anyone creates a chunk of Perl code that they think will be useful to the world, they register as a Perl developer at L<https://www.cpan.org/modules/04pause.html> so that they can then upload their code to the CPAN. The CPAN is the Comprehensive Perl Archive Network and can be accessed at L<https://www.cpan.org/>, and searched at L<https://metacpan.org/>.

This documentation is for people who want to download CPAN modules and install them on their own computer.

=head2 PREAMBLE

First, are you sure that the module isn't already on your system? Try C<perl -MFoo -e 1>. (Replace "Foo" with the name of the module; for instance, C<perl -MCGI::Carp -e 1>.)

If you don't see an error message, you have the module. (If you do see an error message, it's still possible you have the module, but that it's not in your path, which you can display with C<perl -e "print qq(@INC)">.)
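
The same check can be made from inside a program. The following is a minimal sketch, not part of the original instructions: the helper name C<module_installed> and the module names passed to it are illustrative assumptions.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if the named module can be loaded
# from the current @INC, 0 otherwise.
sub module_installed {
    my ($module) = @_;
    (my $file = "$module.pm") =~ s{::}{/}g;    # Foo::Bar -> Foo/Bar.pm
    return eval { require $file; 1 } ? 1 : 0;
}

# The names below are only examples; substitute the module you need.
for my $module (qw(strict CGI::Carp No::Such::Module::Hypothetical)) {
    printf "%-32s %s\n", $module,
        module_installed($module) ? "installed" : "not found in \@INC";
}
```

Like the one-liner, this only tells you whether the module can be found through the current C<@INC>; a module installed elsewhere will still be reported as missing.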

For the remainder of this document, we'll assume that you really honestly truly lack an installed module, but have found it on the CPAN.

So now you have a file ending in .tar.gz (or, less often, .zip). You know there's a tasty module inside. There are four steps you must now take:

=over 5

=item B<DECOMPRESS> the file

=item B<UNPACK> the file into a directory

=item B<BUILD> the module (sometimes unnecessary)

=item B<INSTALL> the module.

=back

Here's how to perform each step for each operating system. This is I<not> a substitute for reading the README and INSTALL files that might have come with your module!

Also note that these instructions are tailored for installing the module into your system's repository of Perl modules, but you can install modules into any directory you wish. For instance, where I say C<perl Makefile.PL>, you can substitute C<perl Makefile.PL PREFIX=/my/perl_directory> to install the modules into F</my/perl_directory>. Then you can use the modules from your Perl programs with C<use lib "/my/perl_directory/lib/site_perl";> or sometimes just C<use lib "/my/perl_directory";>. If you're on a system that requires superuser/root access to install modules into the directories you see when you type C<perl -e "print qq(@INC)">, you'll want to install them into a local directory (such as your home directory) and use this approach.

=over 4

=item *

B<If you're on a Unix or Unix-like system,>

You can use Andreas Koenig's CPAN module ( L<https://metacpan.org/release/CPAN> ) to automate the following steps, from DECOMPRESS through INSTALL.

   A. DECOMPRESS

Decompress the file with C<gzip -d yourmodule.tar.gz>

You can get gzip from L<ftp://prep.ai.mit.edu/pub/gnu/>

Or, you can combine this step with the next to save disk space:

     gzip -dc yourmodule.tar.gz | tar -xof -

   B. UNPACK

Unpack the result with C<tar -xof yourmodule.tar>

   C. BUILD

Go into the newly-created directory and type:

      perl Makefile.PL
      make test

or

      perl Makefile.PL PREFIX=/my/perl_directory

to install it locally. (Remember that if you do this, you'll have to put C<use lib "/my/perl_directory";> near the top of the program that is to use this module.)

   D. INSTALL

While still in that directory, type:

      make install

Make sure you have the appropriate permissions to install the module in your Perl 5 library directory. Often, you'll need to be root.

That's all you need to do on Unix systems with dynamic linking. Most Unix systems have dynamic linking. If yours doesn't, or if for another reason you have a statically-linked perl, B<and> the module requires compilation, you'll need to build a new Perl binary that includes the module. Again, you'll probably need to be root.

=item *

B<If you're running ActivePerl (Win95/98/2K/NT/XP, Linux, Solaris),>

First, type C<ppm> from a shell and see whether ActiveState's PPM repository has your module. If so, you can install it with C<ppm> and you won't have to bother with any of the other steps here. You might be able to use the CPAN instructions from the "Unix or Unix-like" section above as well; give it a try. Otherwise, you'll have to follow the steps below.

   A. DECOMPRESS

You can use the open source 7-zip ( L<https://www.7-zip.org/> ) or the shareware Winzip ( L<https://www.winzip.com> ) to decompress and unpack modules.

   B. UNPACK

If you used WinZip, this was already done for you.

   C. BUILD

You'll need the C<nmake> utility, available at L<http://download.microsoft.com/download/vc15/Patch/1.52/W95/EN-US/nmake15.exe> or dmake, available on CPAN. L<https://metacpan.org/release/dmake>

Does the module require compilation (i.e. does it have files that end in .xs, .c, .h, .y, .cc, .cxx, or .C)? If it does, life is now officially tough for you, because you have to compile the module yourself (no easy feat on Windows). You'll need a compiler such as Visual C++.
Alternatively, you can download a pre-built PPM package from ActiveState. L<http://aspn.activestate.com/ASPN/Downloads/ActivePerl/PPM/>

Go into the newly-created directory and type:

      perl Makefile.PL
      nmake test

   D. INSTALL

While still in that directory, type:

      nmake install

=item *

B<If you're on the DJGPP port of DOS,>

   A. DECOMPRESS

djtarx ( L<ftp://ftp.delorie.com/pub/djgpp/current/v2/> ) will both uncompress and unpack.

   B. UNPACK

See above.

   C. BUILD

Go into the newly-created directory and type:

      perl Makefile.PL
      make test

You will need the packages mentioned in F<README.dos> in the Perl distribution.

   D. INSTALL

While still in that directory, type:

     make install

You will need the packages mentioned in F<README.dos> in the Perl distribution.

=item *

B<If you're on OS/2,>

Get the EMX development suite and gzip/tar from Hobbes ( L<http://hobbes.nmsu.edu/h-browse.php?dir=/pub/os2/dev/emx/v0.9d> ), and then follow the instructions for Unix.

=item *

B<If you're on VMS,>

When downloading from CPAN, save your file with a C<.tgz> extension instead of C<.tar.gz>. All other periods in the filename should be replaced with underscores. For example, C<Your-Module-1.33.tar.gz> should be downloaded as C<Your-Module-1_33.tgz>.

   A. DECOMPRESS

Type

    gzip -d Your-Module.tgz

or, for zipped modules, type

    unzip Your-Module.zip

Executables for gzip, zip, and VMStar:

    http://www.hp.com/go/openvms/freeware/

and their source code:

    http://www.fsf.org/order/ftp.html

Note that GNU's gzip/gunzip is not the same as Info-ZIP's zip/unzip package. The former is a simple compression tool; the latter permits creation of multi-file archives.

   B. UNPACK

If you're using VMStar:

     VMStar xf Your-Module.tar

Or, if you're fond of VMS command syntax:

     tar/extract/verbose Your_Module.tar

   C. BUILD

Make sure you have MMS (from Digital) or the freeware MMK ( available from MadGoat at L<http://www.madgoat.com> ).
Then type this to create the DESCRIP.MMS for the module:

    perl Makefile.PL

Now you're ready to build:

    mms test

Substitute C<mmk> for C<mms> above if you're using MMK.

   D. INSTALL

Type

    mms install

Substitute C<mmk> for C<mms> above if you're using MMK.

=item *

B<If you're on MVS>,

Introduce the F<.tar.gz> file into an HFS as binary; don't translate from ASCII to EBCDIC.

   A. DECOMPRESS

Decompress the file with C<gzip -d yourmodule.tar.gz>

You can get gzip from L<http://www.s390.ibm.com/products/oe/bpxqp1.html>

   B. UNPACK

Unpack the result with

     pax -o to=IBM-1047,from=ISO8859-1 -r < yourmodule.tar

The BUILD and INSTALL steps are identical to those for Unix. Some modules generate Makefiles that work better with GNU make, which is available from L<http://www.mks.com/s390/gnu/>

=back

=head1 PORTABILITY

Note that not all modules will work on all platforms. See L<perlport> for more information on portability issues. Read the documentation to see if the module will work on your system. There are basically three categories of modules that will not work "out of the box" with all platforms (with some possibility of overlap):

=over 4

=item *

B<Those that should, but don't.> These need to be fixed; consider contacting the author and possibly writing a patch.

=item *

B<Those that need to be compiled, where the target platform doesn't have compilers readily available.> (These modules contain F<.xs> or F<.c> files, usually.) You might be able to find existing binaries on the CPAN or elsewhere, or you might want to try getting compilers and building it yourself, and then release the binary for other poor souls to use.

=item *

B<Those that are targeted at a specific platform.> (Such as the Win32:: modules.) If the module is targeted specifically at a platform other than yours, you're out of luck, most likely.

=back

Check the CPAN Testers if a module should work with your platform but it doesn't behave as you'd expect, or you aren't sure whether or not a module will work under your platform. If the module you want isn't listed there, you can test it yourself and let CPAN Testers know, you can join CPAN Testers, or you can request it be tested.

    https://cpantesters.org/

=head1 HEY

If you have any suggested changes for this page, let me know. Please don't send me mail asking for help on how to install your modules. There are too many modules, and too few Orwants, for me to be able to answer or even acknowledge all your questions. Contact the module author instead, ask someone familiar with Perl on your operating system, or if all else fails, file a ticket at L<https://rt.cpan.org/>.

=head1 AUTHOR

Jon Orwant

orwant@medita.mit.edu

with invaluable help from Chris Nandor, and valuable help from Brandon Allbery, Charles Bailey, Graham Barr, Dominic Dunlop, Jarkko Hietaniemi, Ben Holzman, Tom Horsley, Nick Ing-Simmons, Tuomas J. Lukka, Laszlo Molnar, Alan Olsen, Peter Prymmer, Gurusamy Sarathy, Christoph Spalinger, Dan Sugalski, Larry Virden, and Ilya Zakharevich.

First version July 22, 1998; last revised November 21, 2001.

=head1 COPYRIGHT

Copyright (C) 1998, 2002, 2003 Jon Orwant. All Rights Reserved.

This document may be distributed under the same terms as Perl itself.

=encoding utf8

=head1 NAME

perl5140delta - what is new for perl v5.14.0

=head1 DESCRIPTION

This document describes differences between the 5.12.0 release and the 5.14.0 release.

If you are upgrading from an earlier release such as 5.10.0, first read L<perl5120delta>, which describes differences between 5.10.0 and 5.12.0.

Some of the bug fixes in this release have been backported to subsequent releases of 5.12.x. Those are indicated with the 5.12.x version in parentheses.

=head1 Notice

As described in L<perlpolicy>, the release of Perl 5.14.0 marks the official end of support for Perl 5.10. Users of Perl 5.10 or earlier should consider upgrading to a more recent release of Perl.

=head1 Core Enhancements

=head2 Unicode

=head3 Unicode Version 6.0 is now supported (mostly)

Perl comes with the Unicode 6.0 data base updated with L<Corrigendum #8|http://www.unicode.org/versions/corrigendum8.html>, with one exception noted below. See L<http://unicode.org/versions/Unicode6.0.0/> for details on the new release. Perl does not support any Unicode provisional properties, including the new ones for this release.

Unicode 6.0 has chosen to use the name C<BELL> for the character at U+1F514, which is a symbol that looks like a bell, and is used in Japanese cell phones. This conflicts with the long-standing Perl usage of having C<BELL> mean the ASCII C<BEL> character, U+0007. In Perl 5.14, C<\N{BELL}> continues to mean U+0007, but its use generates a deprecation warning message unless such warnings are turned off. The new name for U+0007 in Perl is C<ALERT>, which corresponds nicely with the existing shorthand sequence for it, C<"\a">. C<\N{BEL}> means U+0007, with no warning given. The character at U+1F514 has no name in 5.14, but can be referred to by C<\N{U+1F514}>. In Perl 5.16, C<\N{BELL}> will refer to U+1F514; all code that uses C<\N{BELL}> should be converted to use C<\N{ALERT}>, C<\N{BEL}>, or C<"\a"> before upgrading.

=head3 Full functionality for C<use feature 'unicode_strings'>

This release provides full functionality for C<use feature 'unicode_strings'>. Under its scope, all string operations executed and regular expressions compiled (even if executed outside its scope) have Unicode semantics. See L<feature/"the 'unicode_strings' feature">. However, see L</Inverted bracketed character classes and multi-character folds>, below.

This feature avoids most forms of the "Unicode Bug" (see L<perlunicode/The "Unicode Bug"> for details).
If there is any possibility that your code will process Unicode strings, you are I<strongly> encouraged to use this subpragma to avoid nasty surprises.

=head3 C<\N{I<NAME>}> and C<charnames> enhancements

=over

=item *

C<\N{I<NAME>}> and C<charnames::vianame> now know about the abbreviated character names listed by Unicode, such as NBSP, SHY, LRO, ZWJ, etc.; all customary abbreviations for the C0 and C1 control characters (such as ACK, BEL, CAN, etc.); and a few new variants of some C1 full names that are in common usage.

=item *

Unicode has several I<named character sequences>, in which particular sequences of code points are given names. C<\N{I<NAME>}> now recognizes these.

=item *

C<\N{I<NAME>}>, C<charnames::vianame>, and C<charnames::viacode> now know about every character in Unicode. In earlier releases of Perl, they didn't know about the Hangul syllables nor several CJK (Chinese/Japanese/Korean) characters.

=item *

It is now possible to override Perl's abbreviations with your own custom aliases.

=item *

You can now create a custom alias of the ordinal of a character, known by C<\N{I<NAME>}>, C<charnames::vianame()>, and C<charnames::viacode()>. Previously, aliases had to be to official Unicode character names. This made it impossible to create an alias for unnamed code points, such as those reserved for private use.

=item *

The new function charnames::string_vianame() is a run-time version of C<\N{I<NAME>}>, returning the string of characters whose Unicode name is its parameter. It can handle Unicode named character sequences, whereas the pre-existing charnames::vianame() cannot, as the latter returns a single code point.

=back

See L<charnames> for details on all these changes.

=head3 New warnings categories for problematic (non-)Unicode code points.

Three new warnings subcategories of "utf8" have been added. These allow you to turn off some "utf8" warnings, while allowing other warnings to remain on.
The three categories are: C<surrogate> when UTF-16 surrogates are encountered; C<nonchar> when Unicode non-character code points are encountered; and C<non_unicode> when code points above the legal Unicode maximum of 0x10FFFF are encountered.

=head3 Any unsigned value can be encoded as a character

With this release, Perl is adopting a model that any unsigned value can be treated as a code point and encoded internally (as utf8) without warnings, not just the code points that are legal in Unicode. However, unless utf8 or the corresponding sub-category (see previous item) of lexical warnings have been explicitly turned off, outputting or executing a Unicode-defined operation such as upper-casing on such a code point generates a warning. Attempting to input these using strict rules (such as with the C<:encoding(UTF-8)> layer) will continue to fail. Prior to this release, handling was inconsistent and in places, incorrect.

Unicode non-characters, some of which previously were erroneously considered illegal in places by Perl, contrary to the Unicode Standard, are now always legal internally. Inputting or outputting them works the same as with the non-legal Unicode code points, because the Unicode Standard says they are (only) illegal for "open interchange".

=head3 Unicode database files not installed

The Unicode database files are no longer installed with Perl. This doesn't affect any functionality in Perl and saves significant disk space. If you need these files, you can download them from L<http://www.unicode.org/Public/zipped/6.0.0/>.

=head2 Regular Expressions

=head3 C<(?^...)> construct signifies default modifiers

An ASCII caret C<"^"> immediately following a C<"(?"> in a regular expression now means that the subexpression does not inherit surrounding modifiers such as C</i>, but reverts to the Perl defaults. Any modifiers following the caret override the defaults.

Stringification of regular expressions now uses this notation.
For example, C<qr/hlagh/i> would previously be stringified as C<(?i-xsm:hlagh)>, but now it's stringified as C<(?^i:hlagh)>.

The main purpose of this change is to allow tests that rely on the stringification I<not> to have to change whenever new modifiers are added. See L<perlre/Extended Patterns>.

This change is likely to break code that compares stringified regular expressions with fixed strings containing C<?-xism>.

=head3 C</d>, C</l>, C</u>, and C</a> modifiers

Four new regular expression modifiers have been added. These are mutually exclusive: only one can be turned on at a time.

=over

=item *

The C</l> modifier says to compile the regular expression as if it were in the scope of C<use locale>, even if it is not.

=item *

The C</u> modifier says to compile the regular expression as if it were in the scope of a C<use feature 'unicode_strings'> pragma.

=item *

The C</d> (default) modifier is used to override any C<use locale> and C<use feature 'unicode_strings'> pragmas in effect at the time of compiling the regular expression.

=item *

The C</a> regular expression modifier restricts C<\s>, C<\d> and C<\w> and the POSIX (C<[[:posix:]]>) character classes to the ASCII range. Their complements and C<\b> and C<\B> are correspondingly affected. Otherwise, C</a> behaves like the C</u> modifier, in that case-insensitive matching uses Unicode semantics.

If the C</a> modifier is repeated, then additionally in case-insensitive matching, no ASCII character can match a non-ASCII character. For example,

    "k"    =~ /\N{KELVIN SIGN}/ai
    "\xDF" =~ /ss/ai

match but

    "k"    =~ /\N{KELVIN SIGN}/aai
    "\xDF" =~ /ss/aai

do not match.

=back

See L<perlre/Modifiers> for more detail.

=head3 Non-destructive substitution

The substitution (C<s///>) and transliteration (C<y///>) operators now support an C</r> option that copies the input variable, carries out the substitution on the copy, and returns the result. The original remains unmodified.

  my $old = "cat";
  my $new = $old =~ s/cat/dog/r;
  # $old is "cat" and $new is "dog"

This is particularly useful with C<map>. See L<perlop> for more examples.

=head3 Re-entrant regular expression engine

It is now safe to use regular expressions within C<(?{...})> and C<(??{...})> code blocks inside regular expressions.

These blocks are still experimental, however, and still have problems with lexical (C<my>) variables and abnormal exiting.

=head3 C<use re '/flags'>

The C<re> pragma now has the ability to turn on regular expression flags till the end of the lexical scope:

    use re "/x";
    "foo" =~ / (.+) /;  # /x implied

See L<re/"'/flags' mode"> for details.

=head3 \o{...} for octals

There is a new octal escape sequence, C<"\o">, in doublequote-like contexts. This construct allows large octal ordinals beyond the current max of 0777 to be represented. It also allows you to specify a character in octal which can safely be concatenated with other regex snippets and which won't be confused with being a backreference to a regex capture group. See L<perlre/Capture groups>.

=head3 Add C<\p{Titlecase}> as a synonym for C<\p{Title}>

This synonym is added for symmetry with the Unicode property names C<\p{Uppercase}> and C<\p{Lowercase}>.

=head3 Regular expression debugging output improvement

Regular expression debugging output (turned on by C<use re 'debug'>) now uses hexadecimal when escaping non-ASCII characters, instead of octal.

=head3 Return value of C<delete $+{...}>

Custom regular expression engines can now determine the return value of C<delete> on an entry of C<%+> or C<%->.

=head2 Syntactical Enhancements

=head3 Array and hash container functions accept references

B<Warning:> This feature is considered experimental, as the exact behaviour may change in a future version of Perl.

All builtin functions that operate directly on array or hash containers now also accept unblessed hard references to arrays or hashes:

  |----------------------------+---------------------------|
  | Traditional syntax         | Terse syntax              |
  |----------------------------+---------------------------|
  | push @$arrayref, @stuff    | push $arrayref, @stuff    |
  | unshift @$arrayref, @stuff | unshift $arrayref, @stuff |
  | pop @$arrayref             | pop $arrayref             |
  | shift @$arrayref           | shift $arrayref           |
  | splice @$arrayref, 0, 2    | splice $arrayref, 0, 2    |
  | keys %$hashref             | keys $hashref             |
  | keys @$arrayref            | keys $arrayref            |
  | values %$hashref           | values $hashref           |
  | values @$arrayref          | values $arrayref          |
  | ($k,$v) = each %$hashref   | ($k,$v) = each $hashref   |
  | ($k,$v) = each @$arrayref  | ($k,$v) = each $arrayref  |
  |----------------------------+---------------------------|

This allows these builtin functions to act on long dereferencing chains or on the return value of subroutines without needing to wrap them in C<@{}> or C<%{}>:

  push @{$obj->tags}, $new_tag;  # old way
  push $obj->tags,    $new_tag;  # new way

  for ( keys %{$hoh->{genres}{artists}} ) {...} # old way
  for ( keys $hoh->{genres}{artists}    ) {...} # new way

=head3 Single term prototype

The C<+> prototype is a special alternative to C<$> that acts like C<\[@%]> when given a literal array or hash variable, but will otherwise force scalar context on the argument. See L<perlsub/Prototypes>.

=head3 C<package> block syntax

A package declaration can now contain a code block, in which case the declaration is in scope inside that block only. So C<package Foo { ... }> is precisely equivalent to C<{ package Foo; ... }>. It also works with a version number in the declaration, as in C<package Foo 1.2 { ... }>, which is its most attractive feature. See L<perlfunc>.

=head3 Statement labels can appear in more places

Statement labels can now occur before any type of statement or declaration, such as C<package>.

=head3 Stacked labels

Multiple statement labels can now appear before a single statement.

=head3 Uppercase X/B allowed in hexadecimal/binary literals

Literals may now use either upper case C<0X...> or C<0B...> prefixes, in addition to the already supported C<0x...> and C<0b...> syntax [perl #76296].

C, Ruby, Python, and PHP already support this syntax, and it makes Perl more internally consistent: a round-trip with C<eval sprintf "%#X", 0x10> now returns C<16>, just like C<eval sprintf "%#x", 0x10>.

=head3 Overridable tie functions

C<tie>, C<tied> and C<untie> can now be overridden [perl #75902].

=head2 Exception Handling

To make them more reliable and consistent, several changes have been made to how C<die>, C<warn>, and C<$@> behave.

=over

=item *

When an exception is thrown inside an C<eval>, the exception is no longer at risk of being clobbered by destructor code running during unwinding. Previously, the exception was written into C<$@> early in the throwing process, and would be overwritten if C<eval> was used internally in the destructor for an object that had to be freed while exiting from the outer C<eval>. Now the exception is written into C<$@> last thing before exiting the outer C<eval>, so the code running immediately thereafter can rely on the value in C<$@> correctly corresponding to that C<eval>. (C<$@> is still also set before exiting the C<eval>, for the sake of destructors that rely on this.)

Likewise, a C<local $@> inside an C<eval> no longer clobbers any exception thrown in its scope. Previously, the restoration of C<$@> upon unwinding would overwrite any exception being thrown. Now the exception gets to the C<eval> anyway. So C<local $@> is safe before a C<die>.

Exceptions thrown from object destructors no longer modify the C<$@> of the surrounding context. (If the surrounding context was exception unwinding, this used to be another way to clobber the exception being thrown.)
Previously such an exception was sometimes emitted as a warning, and then either was string-appended to the surrounding C<$@> or completely replaced the surrounding C<$@>, depending on whether that exception and the surrounding C<$@> were strings or objects. Now, an exception in this situation is always emitted as a warning, leaving the surrounding C<$@> untouched. In addition to object destructors, this also affects any function call run by XS code using the C<G_KEEPERR> flag.

=item *

Warnings for C<warn> can now be objects in the same way as exceptions for C<die>. If an object-based warning gets the default handling of writing to standard error, it is stringified as before with the filename and line number appended. But a C<$SIG{__WARN__}> handler now receives an object-based warning as an object, where previously it was passed the result of stringifying the object.

=back

=head2 Other Enhancements

=head3 Assignment to C<$0> sets the legacy process name with prctl() on Linux

On Linux the legacy process name is now set with L<prctl(2)>, in addition to altering the POSIX name via C<argv[0]>, as Perl has done since version 4.000. Now system utilities that read the legacy process name such as I<ps>, I<top>, and I<killall> recognize the name you set when assigning to C<$0>. The string you supply is truncated at 16 bytes; this limitation is imposed by Linux.

=head3 srand() now returns the seed

This allows programs that need to have repeatable results not to have to come up with their own seed-generating mechanism. Instead, they can use srand() and stash the return value for future use. One example is a test program with too many combinations to test comprehensively in the time available for each run. It can test a random subset each time and, should there be a failure, log the seed used for that run so this can later be used to produce the same results.
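
A minimal sketch of that log-and-replay pattern follows. The variable names and the choice of five draws below 1000 are illustrative assumptions, not taken from the release notes.

```perl
use 5.014;
use strict;
use warnings;

# srand() with no argument picks a seed and, from Perl 5.14 on, returns it.
my $seed = srand();
my @first_run = map { int rand 1000 } 1 .. 5;

# Re-seeding with the logged value replays the same sequence.
srand($seed);
my @second_run = map { int rand 1000 } 1 .. 5;

print "seed: $seed\n";
print "replayed: ", ( "@first_run" eq "@second_run" ? "yes" : "no" ), "\n";
```

A test program would normally write C<$seed> to its log and accept a previously logged seed as input on a later run, instead of replaying within the same process as shown here.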

=head3 printf-like functions understand post-1980 size modifiers

Perl's printf and sprintf operators, and Perl's internal printf replacement function, now understand the C90 size modifiers "hh" (C<char>), "z" (C<size_t>), and "t" (C<ptrdiff_t>). Also, when compiled with a C99 compiler, Perl now understands the size modifier "j" (C<intmax_t>) (but this is not portable).

So, for example, on any modern machine, C<sprintf("%hhd", 257)> returns "1".

=head3 New global variable C<${^GLOBAL_PHASE}>

A new global variable, C<${^GLOBAL_PHASE}>, has been added to allow introspection of the current phase of the Perl interpreter. It's explained in detail in L<perlvar/"${^GLOBAL_PHASE}"> and in L<perlmod/"BEGIN, UNITCHECK, CHECK, INIT and END">.

=head3 C<-d:-foo> calls C<Devel::foo::unimport>

The syntax B<-d:foo> was extended in 5.6.1 to make B<-d:foo=bar> equivalent to B<-MDevel::foo=bar>, which expands internally to C<use Devel::foo 'bar'>. Perl now allows prefixing the module name with B<->, with the same semantics as B<-M>; that is:

=over 4

=item C<-d:-foo>

Equivalent to B<-M-Devel::foo>: expands to C<no Devel::foo> and calls C<< Devel::foo->unimport() >> if that method exists.

=item C<-d:-foo=bar>

Equivalent to B<-M-Devel::foo=bar>: expands to C<no Devel::foo 'bar'>, and calls C<< Devel::foo->unimport("bar") >> if that method exists.

=back

This is particularly useful for suppressing the default actions of a C<Devel::*> module's C<import> method whilst still loading it for debugging.
=head3 Filehandle method calls load L<IO::File> on demand When a method call on a filehandle would die because the method cannot be resolved and L<IO::File> has not been loaded, Perl now loads L<IO::File> via C<require> and attempts method resolution again: open my $fh, ">", $file; $fh->binmode(":raw"); # loads IO::File and succeeds This also works for globs like C<STDOUT>, C<STDERR>, and C<STDIN>: STDOUT->autoflush(1); Because this on-demand load happens only if method resolution fails, the legacy approach of manually loading an L<IO::File> parent class for partial method support still works as expected: use IO::Handle; open my $fh, ">", $file; $fh->autoflush(1); # IO::File not loaded =head3 Improved IPv6 support The C<Socket> module provides new affordances for IPv6, including implementations of the C<Socket::getaddrinfo()> and C<Socket::getnameinfo()> functions, along with related constants and a handful of new functions. See L<Socket>. =head3 DTrace probes now include package name The C<DTrace> probes now include an additional argument, C<arg3>, which contains the package the subroutine being entered or left was compiled in. For example, using the following DTrace script: perl$target:::sub-entry { printf("%s::%s\n", copyinstr(arg0), copyinstr(arg3)); } and then running: $ perl -e 'sub test { }; test' C<DTrace> will print: main::test =head2 New C APIs See L</Internal Changes>. =head1 Security =head2 User-defined regular expression properties L<perlunicode/"User-Defined Character Properties"> documented that you can create custom properties by defining subroutines whose names begin with "In" or "Is". However, Perl did not actually enforce that naming restriction, so C<\p{foo::bar}> could call foo::bar() if it existed. The documented convention is now enforced. Also, Perl no longer allows tainted regular expressions to invoke a user-defined property. It simply dies instead [perl #82616]. 
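As a hedged illustration of the naming rule, the sketch below defines a property whose name begins with "In". The name C<InKana> and its ranges follow the style of the example in L<perlunicode> and are used here only for illustration:

```perl
use strict;
use warnings;

# Hypothetical property, named with the required "In" prefix.
# The body lists inclusive code point ranges in hex, one per line,
# with the start and end separated by a tab.
sub InKana {
    return <<"END";
3040\t309F
30A0\t30FF
END
}

my $hiragana_a = "\x{3042}";    # HIRAGANA LETTER A

print "kana\n"     if $hiragana_a =~ /\p{InKana}/;
print "not kana\n" if "a"         =~ /\P{InKana}/;
```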
=head1 Incompatible Changes Perl 5.14.0 is not binary-compatible with any previous stable release. In addition to the sections that follow, see L</C API Changes>. =head2 Regular Expressions and String Escapes =head3 Inverted bracketed character classes and multi-character folds Some characters match a sequence of two or three characters in C</i> regular expression matching under Unicode rules. One example is C<LATIN SMALL LETTER SHARP S> which matches the sequence C<ss>. 'ss' =~ /\A[\N{LATIN SMALL LETTER SHARP S}]\z/i # Matches This, however, can lead to very counter-intuitive results, especially when inverted. Because of this, Perl 5.14 does not use multi-character C</i> matching in inverted character classes. 'ss' =~ /\A[^\N{LATIN SMALL LETTER SHARP S}]+\z/i # ??? This should match any sequence of characters that are neither C<SHARP S> nor what C<SHARP S> matches under C</i>. C<"s"> isn't C<SHARP S>, but Unicode says that C<"ss"> is what C<SHARP S> matches under C</i>. So which one "wins"? Do you fail the match because the string has C<ss> or accept it because it has an C<s> followed by another C<s>? Earlier releases of Perl did allow this multi-character matching, but due to bugs, it mostly did not work. =head3 \400-\777 In certain circumstances, C<\400>-C<\777> in regexes have behaved differently than they behave in all other doublequote-like contexts. Since 5.10.1, Perl has issued a deprecation warning when this happens. Now, these literals behave the same in all doublequote-like contexts: they are equivalent to C<\x{100}>-C<\x{1FF}>, with no deprecation warning. In the command-line option B<-0>, C<\400>-C<\777> retain their conventional meaning. They slurp whole input files; previously, this was documented only for B<-0777>. Because of various ambiguities, you should use the new C<\o{...}> construct to represent characters in octal instead. 
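A short sketch of the C<\o{...}> construct recommended above. The variable names are illustrative only:

```perl
use strict;
use warnings;

# \o{...} delimits the octal digits explicitly, so there is no ambiguity
# about where the escape ends or whether it is a backreference.
my $e_acute  = "\o{351}";    # same code point as \x{E9}
my $a_macron = "\o{400}";    # same code point as \x{100}

printf "U+%04X U+%04X\n", ord($e_acute), ord($a_macron);
# prints "U+00E9 U+0100"
```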
=head3 Most C<\p{}> properties are now immune to case-insensitive matching For most Unicode properties, it doesn't make sense to have them match differently under C</i> case-insensitive matching. Doing so can lead to unexpected results and potential security holes. For example m/\p{ASCII_Hex_Digit}+/i could previously match non-ASCII characters because of the Unicode matching rules (although there were several bugs with this). Now matching under C</i> gives the same results as non-C</i> matching except for those few properties where people have come to expect differences, namely the ones where casing is an integral part of their meaning, such as C<m/\p{Uppercase}/i> and C<m/\p{Lowercase}/i>, both of which match the same code points as matched by C<m/\p{Cased}/i>. Details are in L<perlrecharclass/Unicode Properties>. User-defined property handlers that need to match differently under C</i> must be changed to read the new boolean parameter passed to them, which is non-zero if case-insensitive matching is in effect and 0 otherwise. See L<perlunicode/User-Defined Character Properties>. =head3 \p{} implies Unicode semantics Specifying a Unicode property in the pattern indicates that the pattern is meant for matching according to Unicode rules, the way C<\N{I<NAME>}> does. =head3 Regular expressions retain their localeness when interpolated Regular expressions compiled under C<use locale> now retain this when interpolated into a new regular expression compiled outside a C<use locale>, and vice-versa. Previously, one regular expression interpolated into another inherited the localeness of the surrounding regex, losing whatever state it originally had. This is considered a bug fix, but may trip up code that has come to rely on the incorrect behaviour. =head3 Stringification of regexes has changed Default regular expression modifiers are now notated using C<(?^...)>. Code relying on the old stringification will fail. 
This is so that when new modifiers are added, such code won't have to keep changing each time this happens, because the stringification will automatically incorporate the new modifiers. Code that needs to work properly with both old- and new-style regexes can avoid the whole issue by using (for perls since 5.9.5; see L<re>): use re qw(regexp_pattern); my ($pat, $mods) = regexp_pattern($re_ref); If the actual stringification is important or older Perls need to be supported, you can use something like the following: # Accept both old and new-style stringification my $modifiers = (qr/foobar/ =~ /\Q(?^/) ? "^" : "-xism"; And then use C<$modifiers> instead of C<-xism>. =head3 Run-time code blocks in regular expressions inherit pragmata Code blocks in regular expressions (C<(?{...})> and C<(??{...})>) previously did not inherit pragmata (strict, warnings, etc.) if the regular expression was compiled at run time as happens in cases like these two: use re "eval"; $foo =~ $bar; # when $bar contains (?{...}) $foo =~ /$bar(?{ $finished = 1 })/; This bug has now been fixed, but code that relied on the buggy behaviour may need to be fixed to account for the correct behaviour. =head2 Stashes and Package Variables =head3 Localised tied hashes and arrays are no longer tied In the following: tie @a, ...; { local @a; # here, @a is now a new, untied array } # here, @a refers again to the old, tied array Earlier versions of Perl incorrectly tied the new local array. This has now been fixed. This fix could however potentially cause a change in behaviour of some code. =head3 Stashes are now always defined C<defined %Foo::> now always returns true, even when no symbols have yet been defined in that package. This is a side-effect of removing a special-case kludge in the tokeniser, added for 5.10.0, to hide side-effects of changes to the internal storage of hashes. The fix drastically reduces hashes' memory overhead. 
Calling defined on a stash has been deprecated since 5.6.0, warned on lexicals since 5.6.0, and warned for stashes and other package variables since 5.12.0. C<defined %hash> has always exposed an implementation detail: emptying a hash by deleting all entries from it does not make C<defined %hash> false. Hence C<defined %hash> is not valid code to determine whether an arbitrary hash is empty. Instead, use the behaviour of an empty C<%hash> always returning false in scalar context. =head3 Clearing stashes Stash list assignment C<%foo:: = ()> used to make the stash temporarily anonymous while it was being emptied. Consequently, any of its subroutines referenced elsewhere would become anonymous, showing up as "(unknown)" in C<caller>. They now retain their package names such that C<caller> returns the original sub name if there is still a reference to its typeglob and "foo::__ANON__" otherwise [perl #79208]. =head3 Dereferencing typeglobs If you assign a typeglob to a scalar variable: $glob = *foo; the glob that is copied to C<$glob> is marked with a special flag indicating that the glob is just a copy. This allows subsequent assignments to C<$glob> to overwrite the glob. The original glob, however, is immutable. Some Perl operators did not distinguish between these two types of globs. This would result in strange behaviour in edge cases: C<untie $scalar> would not untie the scalar if the last thing assigned to it was a glob (because it treated it as C<untie *$scalar>, which unties a handle). Assignment to a glob slot (such as C<*$glob = \@some_array>) would simply assign C<\@some_array> to C<$glob>. To fix this, the C<*{}> operator (including its C<*foo> and C<*$foo> forms) has been modified to make a new immutable glob if its operand is a glob copy. This allows operators that make a distinction between globs and scalars to be modified to treat only immutable globs as globs. 
(C<tie>, C<tied> and C<untie> have been left as they are for compatibility's sake, but will warn. See L</Deprecations>.) This causes an incompatible change in code that assigns a glob to the return value of C<*{}> when that operator was passed a glob copy. Take the following code, for instance: $glob = *foo; *$glob = *bar; The C<*$glob> on the second line returns a new immutable glob. That new glob is made an alias to C<*bar>. Then it is discarded. So the second assignment has no effect. See L<http://rt.perl.org/rt3/Public/Bug/Display.html?id=77810> for more detail. =head3 Magic variables outside the main package In previous versions of Perl, magic variables like C<$!>, C<%SIG>, etc. would "leak" into other packages. So C<%foo::SIG> could be used to access signals, C<${"foo::!"}> (with strict mode off) to access C's C<errno>, etc. This was a bug, or an "unintentional" feature, which caused various ill effects, such as signal handlers being wiped when modules were loaded, etc. This has been fixed (or the feature has been removed, depending on how you see it). =head3 local($_) strips all magic from $_ local() on scalar variables gives them a new value but keeps all their magic intact. This has proven problematic for the default scalar variable $_, where L<perlsub> recommends that any subroutine that assigns to $_ should first localize it. This would throw an exception if $_ is aliased to a read-only variable, and could in general have various unintentional side-effects. Therefore, as an exception to the general rule, local($_) will not only assign a new value to $_, but also remove all existing magic from it as well. =head3 Parsing of package and variable names Parsing the names of packages and package variables has changed: multiple adjacent pairs of colons, as in C<foo::::bar>, are now all treated as package separators. Regardless of this change, the exact parsing of package separators has never been guaranteed and is subject to change in future Perl versions. 
=head2 Changes to Syntax or to Perl Operators =head3 C<given> return values C<given> blocks now return the last evaluated expression, or an empty list if the block was exited by C<break>. Thus you can now write: my $type = do { given ($num) { break when undef; "integer" when /^[+-]?[0-9]+$/; "float" when /^[+-]?[0-9]+(?:\.[0-9]+)?$/; "unknown"; } }; See L<perlsyn/Return value> for details. =head3 Change in parsing of certain prototypes Functions declared with the following prototypes now behave correctly as unary functions: * \$ \% \@ \* \& \[...] ;$ ;* ;\$ ;\% etc. ;\[...] Due to this bug fix [perl #75904], functions using the C<(*)>, C<(;$)> and C<(;*)> prototypes are parsed with higher precedence than before. So in the following example: sub foo(;$); foo $a < $b; the second line is now parsed correctly as C<< foo($a) < $b >>, rather than C<< foo($a < $b) >>. This happens when one of these operators is used in an unparenthesised argument: < > <= >= lt gt le ge == != <=> eq ne cmp ~~ & | ^ && || // .. ... ?: = += -= *= etc. , => =head3 Smart-matching against array slices Previously, the following code resulted in a successful match: my @a = qw(a y0 z); my @b = qw(a x0 z); @a[0 .. $#b] ~~ @b; This odd behaviour has now been fixed [perl #77468]. =head3 Negation treats strings differently from before The unary negation operator, C<->, now treats strings that look like numbers as numbers [perl #57706]. =head3 Negative zero Negative zero (-0.0), when converted to a string, now becomes "0" on all platforms. It used to become "-0" on some, but "0" on others. If you still need to determine whether a zero is negative, use C<sprintf("%g", $zero) =~ /^-/> or the L<Data::Float> module on CPAN. =head3 C<:=> is now a syntax error Previously C<my $pi := 4> was exactly equivalent to C<my $pi : = 4>, with the C<:> being treated as the start of an attribute list, ending before the C<=>. The use of C<:=> to mean C<: => was deprecated in 5.12.0, and is now a syntax error. 
This allows future use of C<:=> as a new token. Outside the core's tests for it, we find no Perl 5 code on CPAN using this construction, so we believe that this change will have little impact on real-world codebases. If it is absolutely necessary to have empty attribute lists (for example, because of a code generator), simply avoid the error by adding a space before the C<=>. =head3 Change in the parsing of identifiers Characters outside the Unicode "XIDStart" set are no longer allowed at the beginning of an identifier. This means that certain accents and marks that normally follow an alphabetic character may no longer be the first character of an identifier. =head2 Threads and Processes =head3 Directory handles not copied to threads On systems other than Windows that do not have a C<fchdir> function, newly-created threads no longer inherit directory handles from their parent threads. Such programs would usually have crashed anyway [perl #75154]. =head3 C<close> on shared pipes To avoid deadlocks, the C<close> function no longer waits for the child process to exit if the underlying file descriptor is still in use by another thread. It returns true in such cases. =head3 fork() emulation will not wait for signalled children On Windows parent processes would not terminate until all forked children had terminated first. However, C<kill("KILL", ...)> is inherently unstable on pseudo-processes, and C<kill("TERM", ...)> might not get delivered if the child is blocked in a system call. To avoid the deadlock and still provide a safe mechanism to terminate the hosting process, Perl now no longer waits for children that have been sent a SIGTERM signal. It is up to the parent process to waitpid() for these children if child-cleanup processing must be allowed to finish. However, it is also then the responsibility of the parent to avoid the deadlock by making sure the child process can't be blocked on I/O. 
See L<perlfork> for more information about the fork() emulation on Windows. =head2 Configuration =head3 Naming fixes in Policy_sh.SH may invalidate Policy.sh Several long-standing typos and naming confusions in F<Policy_sh.SH> have been fixed, standardizing on the variable names used in F<config.sh>. This will change the behaviour of F<Policy.sh> if you happen to have been accidentally relying on its incorrect behaviour. =head3 Perl source code is read in text mode on Windows Perl scripts used to be read in binary mode on Windows for the benefit of the L<ByteLoader> module (which is no longer part of core Perl). This had the side-effect of breaking various operations on the C<DATA> filehandle, including seek()/tell(), and even simply reading from C<DATA> after filehandles have been flushed by a call to system(), backticks, fork() etc. The default build options for Windows have been changed to read Perl source code on Windows in text mode now. L<ByteLoader> will (hopefully) be updated on CPAN to automatically handle this situation [perl #28106]. =head1 Deprecations See also L</Deprecated C APIs>. =head2 Omitting a space between a regular expression and subsequent word Omitting the space between a regular expression operator or its modifiers and the following word is deprecated. For example, C<< m/foo/sand $bar >> is for now still parsed as C<< m/foo/s and $bar >>, but will now issue a warning. =head2 C<\cI<X>> The backslash-c construct was designed as a way of specifying non-printable characters, but there were no restrictions (on ASCII platforms) on what the character following the C<c> could be. Now, a deprecation warning is raised if that character isn't an ASCII character. Also, a deprecation warning is raised for C<"\c{"> (which is the same as simply saying C<";">). 
=head2 C<"\b{"> and C<"\B{"> In regular expressions, a literal C<"{"> immediately following a C<"\b"> (not in a bracketed character class) or a C<"\B"> is now deprecated to allow for its future use by Perl itself. =head2 Perl 4-era .pl libraries Perl bundles a handful of library files that predate Perl 5. This bundling is now deprecated for most of these files, which are now available from CPAN. The affected files now warn when run, if they were installed as part of the core. This is a mandatory warning, not obeying B<-X> or lexical warning bits. The warning is modelled on that supplied by F<deprecate.pm> for deprecated-in-core F<.pm> libraries. It points to the specific CPAN distribution that contains the F<.pl> libraries. The CPAN versions, of course, do not generate the warning. =head2 List assignment to C<$[> Assignment to C<$[> was deprecated and started to give warnings in Perl version 5.12.0. This version of Perl (5.14) now also emits a warning when assigning to C<$[> in list context. This fixes an oversight in 5.12.0. =head2 Use of qw(...) as parentheses Historically the parser fooled itself into thinking that C<qw(...)> literals were always enclosed in parentheses, and as a result you could sometimes omit parentheses around them: for $x qw(a b c) { ... } The parser no longer lies to itself in this way. Wrap the list literal in parentheses like this: for $x (qw(a b c)) { ... } This is being deprecated because the parentheses in C<for $i (1,2,3) { ... }> are not part of expression syntax. They are part of the statement syntax, with the C<for> statement wanting literal parentheses. The synthetic parentheses that a C<qw> expression acquired were only intended to be treated as part of expression syntax. Note that this does not change the behaviour of cases like: use POSIX qw(setlocale localeconv); our @EXPORT = qw(foo bar baz); where parentheses were never required around the expression. 
=head2 C<\N{BELL}> This name is deprecated because Unicode is using it for a different character. See L</Unicode Version 6.0 is now supported (mostly)> for more explanation. =head2 C<?PATTERN?> C<?PATTERN?> (without the initial C<m>) has been deprecated and now produces a warning. This is to allow future use of C<?> in new operators. The match-once functionality is still available as C<m?PATTERN?>. =head2 Tie functions on scalars holding typeglobs Calling a tie function (C<tie>, C<tied>, C<untie>) with a scalar argument acts on a filehandle if the scalar happens to hold a typeglob. This is a long-standing bug that will be removed in Perl 5.16, as there is currently no way to tie the scalar itself when it holds a typeglob, and no way to untie a scalar that has had a typeglob assigned to it. Now there is a deprecation warning whenever a tie function is used on a handle without an explicit C<*>. =head2 User-defined case-mapping This feature is being deprecated due to its many issues, as documented in L<perlunicode/User-Defined Case Mappings (for serious hackers only)>. This feature will be removed in Perl 5.16. Instead use the CPAN module L<Unicode::Casing>, which provides improved functionality. =head2 Deprecated modules The following module will be removed from the core distribution in a future release, and should be installed from CPAN instead. Distributions on CPAN that require this should add it to their prerequisites. The core version of this module now issues a deprecation warning. If you ship a packaged version of Perl, either alone or as part of a larger system, then you should carefully consider the repercussions of core module deprecations. You may want to consider shipping your default build of Perl with a package for the deprecated module that installs into C<vendor> or C<site> Perl library directories. This will inhibit the deprecation warnings. 
Alternatively, you may want to consider patching F<lib/deprecate.pm> to provide deprecation warnings specific to your packaging system or distribution of Perl, consistent with how your packaging system or distribution manages a staged transition from a release where the installation of a single package provides the given functionality, to a later release where the system administrator needs to know to install multiple packages to get that same functionality. You can silence these deprecation warnings by installing the module in question from CPAN. To install the latest version of it by role rather than by name, just install C<Task::Deprecations::5_14>. =over =item L<Devel::DProf> We strongly recommend that you install and use L<Devel::NYTProf> instead of L<Devel::DProf>, as L<Devel::NYTProf> offers significantly improved profiling and reporting. =back =head1 Performance Enhancements =head2 "Safe signals" optimisation Signal dispatch has been moved from the runloop into control ops. This should give a few percent speed increase, and eliminates nearly all the speed penalty caused by the introduction of "safe signals" in 5.8.0. Signals should still be dispatched within the same statement as they were previously. If this does I<not> happen, or if you find it possible to create uninterruptible loops, this is a bug, and reports are encouraged of how to recreate such issues. =head2 Optimisation of shift() and pop() calls without arguments Two fewer OPs are used for shift() and pop() calls with no argument (with implicit C<@_>). This change makes shift() 5% faster than C<shift @_> on non-threaded perls, and 25% faster on threaded ones. =head2 Optimisation of regexp engine string comparison work The C<foldEQ_utf8> API function for case-insensitive comparison of strings (which is used heavily by the regexp engine) was substantially refactored and optimised -- and its documentation much improved as a free bonus. 
=head2 Regular expression compilation speed-up Compiling regular expressions has been made faster when upgrading the regex to utf8 is necessary but this isn't known when the compilation begins. =head2 String appending is 100 times faster When doing a lot of string appending, perls built to use the system's C<malloc> could end up allocating a lot more memory than needed in an inefficient way. C<sv_grow>, the function used to allocate more memory if necessary when appending to a string, has been taught to round up the memory it requests to a certain geometric progression, making it much faster on certain platforms and configurations. On Win32, it's now about 100 times faster. =head2 Eliminate C<PL_*> accessor functions under ithreads When C<MULTIPLICITY> was first developed, and interpreter state moved into an interpreter struct, thread- and interpreter-local C<PL_*> variables were defined as macros that called accessor functions (returning the address of the value) outside the Perl core. The intent was to allow members within the interpreter struct to change size without breaking binary compatibility, so that bug fixes could be merged to a maintenance branch that necessitated such a size change. This mechanism was redundant and penalised well-behaved code. It has been removed. =head2 Freeing weak references When there are many weak references to an object, freeing that object can under some circumstances take O(I<N*N>) time to free, where I<N> is the number of references. The circumstances in which this can happen have been reduced [perl #75254]. =head2 Lexical array and hash assignments An earlier optimisation to speed up C<my @array = ...> and C<my %hash = ...> assignments caused a bug and was disabled in Perl 5.12.0. Now we have found another way to speed up these assignments [perl #82110]. =head2 C<@_> uses less memory Previously, C<@_> was allocated for every subroutine at compile time with enough space for four entries. 
Now this allocation is done on demand when the subroutine is called [perl #72416]. =head2 Size optimisations to SV and HV structures C<xhv_fill> has been eliminated from C<struct xpvhv>, saving 1 IV per hash and on some systems will cause C<struct xpvhv> to become cache-aligned. To avoid this memory saving causing a slowdown elsewhere, boolean use of C<HvFILL> now calls C<HvTOTALKEYS> instead (which is equivalent), so while the fill data when actually required are now calculated on demand, cases when this needs to be done should be rare. The order of structure elements in SV bodies has changed. Effectively, the NV slot has swapped location with STASH and MAGIC. As all access to SV members is via macros, this should be completely transparent. This change allows the space saving for PVHVs documented above, and may reduce the memory allocation needed for PVIVs on some architectures. C<XPV>, C<XPVIV>, and C<XPVNV> now allocate only the parts of the C<SV> body they actually use, saving some space. Scalars containing regular expressions now allocate only the part of the C<SV> body they actually use, saving some space. =head2 Memory consumption improvements to Exporter The C<@EXPORT_FAIL> AV is no longer created unless needed, hence neither is the typeglob backing it. This saves about 200 bytes for every package that uses Exporter but doesn't use this functionality. =head2 Memory savings for weak references For weak references, the common case of just a single weak reference per referent has been optimised to reduce the storage required. In this case it saves the equivalent of one small Perl array per referent. =head2 C<%+> and C<%-> use less memory The bulk of the C<Tie::Hash::NamedCapture> module used to be in the Perl core. It has now been moved to an XS module to reduce overhead for programs that do not use C<%+> or C<%->. 
=head2 Multiple small improvements to threads The internal structures of threading now make fewer API calls and fewer allocations, resulting in noticeably smaller object code. Additionally, many thread context checks have been deferred so they're done only as needed (although this is only possible for non-debugging builds). =head2 Adjacent pairs of nextstate opcodes are now optimized away Previously, in code such as use constant DEBUG => 0; sub GAK { warn if DEBUG; print "stuff\n"; } the ops for C<warn if DEBUG> would be folded to a C<null> op (C<ex-const>), but the C<nextstate> op would remain, resulting in a runtime op dispatch of C<nextstate>, C<nextstate>, etc. The execution of a sequence of C<nextstate> ops is indistinguishable from just the last C<nextstate> op so the peephole optimizer now eliminates the first of a pair of C<nextstate> ops except when the first carries a label, since labels must not be eliminated by the optimizer, and label usage isn't conclusively known at compile time. =head1 Modules and Pragmata =head2 New Modules and Pragmata =over 4 =item * L<CPAN::Meta::YAML> 0.003 has been added as a dual-life module. It supports a subset of YAML sufficient for reading and writing F<META.yml> and F<MYMETA.yml> files included with CPAN distributions or generated by the module installation toolchain. It should not be used for any other general YAML parsing or generation task. =item * L<CPAN::Meta> version 2.110440 has been added as a dual-life module. It provides a standard library to read, interpret and write CPAN distribution metadata files (like F<META.json> and F<META.yml>) that describe a distribution, its contents, and the requirements for building it and installing it. The latest CPAN distribution metadata specification is included as L<CPAN::Meta::Spec> and notes on changes in the specification over time are given in L<CPAN::Meta::History>. =item * L<HTTP::Tiny> 0.012 has been added as a dual-life module. 
It is a very small, simple HTTP/1.1 client designed for simple GET requests and file mirroring. It has been added so that F<CPAN.pm> and L<CPANPLUS> can "bootstrap" HTTP access to CPAN using pure Perl without relying on external binaries like L<curl(1)> or L<wget(1)>. =item * L<JSON::PP> 2.27105 has been added as a dual-life module to allow CPAN clients to read F<META.json> files in CPAN distributions. =item * L<Module::Metadata> 1.000004 has been added as a dual-life module. It gathers package and POD information from Perl module files. It is a standalone module based on L<Module::Build::ModuleInfo> for use by other module installation toolchain components. L<Module::Build::ModuleInfo> has been deprecated in favor of this module. =item * L<Perl::OSType> 1.002 has been added as a dual-life module. It maps Perl operating system names (like "dragonfly" or "MSWin32") to more generic types with standardized names (like "Unix" or "Windows"). It has been refactored out of L<Module::Build> and L<ExtUtils::CBuilder> and consolidates such mappings into a single location for easier maintenance. =item * The following modules were added by the L<Unicode::Collate> upgrade. See below for details. L<Unicode::Collate::CJK::Big5> L<Unicode::Collate::CJK::GB2312> L<Unicode::Collate::CJK::JISX0208> L<Unicode::Collate::CJK::Korean> L<Unicode::Collate::CJK::Pinyin> L<Unicode::Collate::CJK::Stroke> =item * L<Version::Requirements> version 0.101020 has been added as a dual-life module. It provides a standard library to model and manipulate module prerequisites and version constraints defined in L<CPAN::Meta::Spec>. =back =head2 Updated Modules and Pragma =over 4 =item * L<attributes> has been upgraded from version 0.12 to 0.14. =item * L<Archive::Extract> has been upgraded from version 0.38 to 0.48. 
Updates since 0.38 include: a safe print method that guards L<Archive::Extract> from changes to C<$\>; a fix to the tests when run in core Perl; support for TZ files; a modification for the lzma logic to favour L<IO::Uncompress::Unlzma>; and a fix for an issue with NetBSD-current and its new L<unzip(1)> executable. =item * L<Archive::Tar> has been upgraded from version 1.54 to 1.76. Important changes since 1.54 include the following: =over =item * Compatibility with busybox implementations of L<tar(1)>. =item * A fix so that write() and create_archive() close only filehandles they themselves opened. =item * A bug was fixed regarding the exit code of extract_archive. =item * The L<ptar(1)> utility has a new option to allow safe creation of tarballs without world-writable files on Windows, allowing those archives to be uploaded to CPAN. =item * A new L<ptargrep(1)> utility for using regular expressions against the contents of files in a tar archive. =item * L<pax> extended headers are now skipped. =back =item * L<Attribute::Handlers> has been upgraded from version 0.87 to 0.89. =item * L<autodie> has been upgraded from version 2.06_01 to 2.1001. =item * L<AutoLoader> has been upgraded from version 5.70 to 5.71. =item * The L<B> module has been upgraded from version 1.23 to 1.29. It no longer crashes when taking apart a C<y///> containing characters outside the octet range or compiled in a C<use utf8> scope. The size of the shared object has been reduced by about 40%, with no reduction in functionality. =item * L<B::Concise> has been upgraded from version 0.78 to 0.83. L<B::Concise> marks rv2sv(), rv2av(), and rv2hv() ops with the new C<OPpDEREF> flag as "DREFed". It no longer produces mangled output with the B<-tree> option [perl #80632]. =item * L<B::Debug> has been upgraded from version 1.12 to 1.16. =item * L<B::Deparse> has been upgraded from version 0.96 to 1.03. 
The deparsing of a C<nextstate> op has changed when it has both a label and either a change of package relative to the previous nextstate or a change of C<%^H> or other state. The label was previously emitted first, but is now emitted last (5.12.1). The C<no 5.13.2> or similar form is now correctly handled by L<B::Deparse> (5.12.3). L<B::Deparse> now properly handles the code that applies a conditional pattern match against implicit C<$_> as it was fixed in [perl #20444]. Deparsing of C<our> followed by a variable with funny characters (as permitted under the C<use utf8> pragma) has also been fixed [perl #33752]. =item * L<B::Lint> has been upgraded from version 1.11_01 to 1.13. =item * L<base> has been upgraded from version 2.15 to 2.16. =item * L<Benchmark> has been upgraded from version 1.11 to 1.12. =item * L<bignum> has been upgraded from version 0.23 to 0.27. =item * L<Carp> has been upgraded from version 1.15 to 1.20. L<Carp> now detects incomplete L<caller()|perlfunc/"caller EXPR"> overrides and avoids using bogus C<@DB::args>. To provide backtraces, Carp relies on particular behaviour of the caller() builtin. L<Carp> now detects if other code has overridden this with an incomplete implementation, and modifies its backtrace accordingly. Previously incomplete overrides would cause incorrect values in backtraces (best case), or obscure fatal errors (worst case). This fixes certain cases of "Bizarre copy of ARRAY" caused by modules overriding caller() incorrectly (5.12.2). It now also avoids using regular expressions that cause Perl to load its Unicode tables, so as to avoid the "BEGIN not safe after errors" error that ensues if there has been a syntax error [perl #82854]. =item * L<CGI> has been upgraded from version 3.48 to 3.52. This provides the following security fixes: the MIME boundary in multipart_init() is now random and the handling of newlines embedded in header values has been improved.
=item * L<Compress::Raw::Bzip2> has been upgraded from version 2.024 to 2.033. It has been updated to use L<bzip2(1)> 1.0.6. =item * L<Compress::Raw::Zlib> has been upgraded from version 2.024 to 2.033. =item * L<constant> has been upgraded from version 1.20 to 1.21. Unicode constants work once more. They have been broken since Perl 5.10.0 [CPAN RT #67525]. =item * L<CPAN> has been upgraded from version 1.94_56 to 1.9600. Major highlights: =over 4 =item * much less configuration dialog hassle =item * support for F<META/MYMETA.json> =item * support for L<local::lib> =item * support for L<HTTP::Tiny> to reduce the dependency on FTP sites =item * automatic mirror selection =item * iron out all known bugs in configure_requires =item * support for distributions compressed with L<bzip2(1)> =item * allow F<Foo/Bar.pm> on the command line to mean C<Foo::Bar> =back =item * L<CPANPLUS> has been upgraded from version 0.90 to 0.9103. A change to F<cpanp-run-perl> resolves L<RT #55964|http://rt.cpan.org/Public/Bug/Display.html?id=55964> and L<RT #57106|http://rt.cpan.org/Public/Bug/Display.html?id=57106>, both of which related to failures to install distributions that use C<Module::Install::DSL> (5.12.2). A dependency on L<Config> was not recognised as a core module dependency. This has been fixed. L<CPANPLUS> now includes support for F<META.json> and F<MYMETA.json>. =item * L<CPANPLUS::Dist::Build> has been upgraded from version 0.46 to 0.54. =item * L<Data::Dumper> has been upgraded from version 2.125 to 2.130_02. The indentation used to be off when C<$Data::Dumper::Terse> was set. This has been fixed [perl #73604]. This upgrade also fixes a crash when using custom sort functions that might cause the stack to change [perl #74170]. L<Dumpxs> no longer crashes with globs returned by C<*$io_ref> [perl #72332]. =item * L<DB_File> has been upgraded from version 1.820 to 1.821. =item * L<DBM_Filter> has been upgraded from version 0.03 to 0.04. 
=item * L<Devel::DProf> has been upgraded from version 20080331.00 to 20110228.00. Merely loading L<Devel::DProf> now no longer triggers profiling to start. Both C<use Devel::DProf> and C<perl -d:DProf ...> behave as before and start the profiler. B<NOTE>: L<Devel::DProf> is deprecated and will be removed from a future version of Perl. We strongly recommend that you install and use L<Devel::NYTProf> instead, as it offers significantly improved profiling and reporting. =item * L<Devel::Peek> has been upgraded from version 1.04 to 1.07. =item * L<Devel::SelfStubber> has been upgraded from version 1.03 to 1.05. =item * L<diagnostics> has been upgraded from version 1.19 to 1.22. It now renders pod links slightly better, and has been taught to find descriptions for messages that share their descriptions with other messages. =item * L<Digest::MD5> has been upgraded from version 2.39 to 2.51. It is now safe to use this module in combination with threads. =item * L<Digest::SHA> has been upgraded from version 5.47 to 5.61. C<shasum> now more closely mimics L<sha1sum(1)>/L<md5sum(1)>. C<addfile> accepts all POSIX filenames. New SHA-512/224 and SHA-512/256 transforms (ref. NIST Draft FIPS 180-4 [February 2011]) =item * L<DirHandle> has been upgraded from version 1.03 to 1.04. =item * L<Dumpvalue> has been upgraded from version 1.13 to 1.16. =item * L<DynaLoader> has been upgraded from version 1.10 to 1.13. It fixes a buffer overflow when passed a very long file name. It no longer inherits from L<AutoLoader>; hence it no longer produces weird error messages for unsuccessful method calls on classes that inherit from L<DynaLoader> [perl #84358]. =item * L<Encode> has been upgraded from version 2.39 to 2.42. Now, all 66 Unicode non-characters are treated the same way U+FFFF has always been treated: in cases when it was disallowed, all 66 are disallowed, and in cases where it warned, all 66 warn. =item * L<Env> has been upgraded from version 1.01 to 1.02. 
=item * L<Errno> has been upgraded from version 1.11 to 1.13. The implementation of L<Errno> has been refactored to use about 55% less memory. On some platforms with unusual header files, like Win32 L<gcc(1)> using C<mingw64> headers, some constants that weren't actually error numbers have been exposed by L<Errno>. This has been fixed [perl #77416]. =item * L<Exporter> has been upgraded from version 5.64_01 to 5.64_03. Exporter no longer overrides C<$SIG{__WARN__}> [perl #74472] =item * L<ExtUtils::CBuilder> has been upgraded from version 0.27 to 0.280203. =item * L<ExtUtils::Command> has been upgraded from version 1.16 to 1.17. =item * L<ExtUtils::Constant> has been upgraded from 0.22 to 0.23. The L<AUTOLOAD> helper code generated by C<ExtUtils::Constant::ProxySubs> can now croak() for missing constants, or generate a complete C<AUTOLOAD> subroutine in XS, allowing simplification of many modules that use it (L<Fcntl>, L<File::Glob>, L<GDBM_File>, L<I18N::Langinfo>, L<POSIX>, L<Socket>). L<ExtUtils::Constant::ProxySubs> can now optionally push the names of all constants onto the package's C<@EXPORT_OK>. =item * L<ExtUtils::Install> has been upgraded from version 1.55 to 1.56. =item * L<ExtUtils::MakeMaker> has been upgraded from version 6.56 to 6.57_05. =item * L<ExtUtils::Manifest> has been upgraded from version 1.57 to 1.58. =item * L<ExtUtils::ParseXS> has been upgraded from version 2.21 to 2.2210. =item * L<Fcntl> has been upgraded from version 1.06 to 1.11. =item * L<File::Basename> has been upgraded from version 2.78 to 2.82. =item * L<File::CheckTree> has been upgraded from version 4.4 to 4.41. =item * L<File::Copy> has been upgraded from version 2.17 to 2.21. =item * L<File::DosGlob> has been upgraded from version 1.01 to 1.04. It allows patterns containing literal parentheses: they no longer need to be escaped. 
On Windows, it no longer adds an extra F<./> to file names returned when the pattern is a relative glob with a drive specification, like F<C:*.pl> [perl #71712]. =item * L<File::Fetch> has been upgraded from version 0.24 to 0.32. L<HTTP::Lite> is now supported for the "http" scheme. The L<fetch(1)> utility is supported on FreeBSD, NetBSD, and Dragonfly BSD for the C<http> and C<ftp> schemes. =item * L<File::Find> has been upgraded from version 1.15 to 1.19. It improves handling of backslashes on Windows, so that paths like F<C:\dir\/file> are no longer generated [perl #71710]. =item * L<File::Glob> has been upgraded from version 1.07 to 1.12. =item * L<File::Spec> has been upgraded from version 3.31 to 3.33. Several portability fixes were made in L<File::Spec::VMS>: a colon is now recognized as a delimiter in native filespecs; caret-escaped delimiters are recognized for better handling of extended filespecs; catpath() returns an empty directory rather than the current directory if the input directory name is empty; and abs2rel() properly handles Unix-style input (5.12.2). =item * L<File::stat> has been upgraded from 1.02 to 1.05. The C<-x> and C<-X> file test operators now work correctly when run by the superuser. =item * L<Filter::Simple> has been upgraded from version 0.84 to 0.86. =item * L<GDBM_File> has been upgraded from 1.10 to 1.14. This fixes a memory leak when DBM filters are used. =item * L<Hash::Util> has been upgraded from 0.07 to 0.11. L<Hash::Util> no longer emits spurious "uninitialized" warnings when recursively locking hashes that have undefined values [perl #74280]. =item * L<Hash::Util::FieldHash> has been upgraded from version 1.04 to 1.09. =item * L<I18N::Collate> has been upgraded from version 1.01 to 1.02. =item * L<I18N::Langinfo> has been upgraded from version 0.03 to 0.08. langinfo() now defaults to using C<$_> if there is no argument given, just as the documentation has always claimed. 
=item * L<I18N::LangTags> has been upgraded from version 0.35 to 0.35_01. =item * L<if> has been upgraded from version 0.05 to 0.0601. =item * L<IO> has been upgraded from version 1.25_02 to 1.25_04. This version of L<IO> includes a new L<IO::Select>, which now allows L<IO::Handle> objects (and objects in derived classes) to be removed from an L<IO::Select> set even if the underlying file descriptor is closed or invalid. =item * L<IPC::Cmd> has been upgraded from version 0.54 to 0.70. Resolves an issue with splitting Win32 command lines. An argument consisting of the single character "0" used to be omitted (CPAN RT #62961). =item * L<IPC::Open3> has been upgraded from 1.05 to 1.09. open3() now produces an error if the C<exec> call fails, allowing this condition to be distinguished from a child process that exited with a non-zero status [perl #72016]. The internal xclose() routine now knows how to handle file descriptors as documented, so duplicating C<STDIN> in a child process using its file descriptor now works [perl #76474]. =item * L<IPC::SysV> has been upgraded from version 2.01 to 2.03. =item * L<lib> has been upgraded from version 0.62 to 0.63. =item * L<Locale::Maketext> has been upgraded from version 1.14 to 1.19. L<Locale::Maketext> now supports external caches. This upgrade also fixes an infinite loop in C<Locale::Maketext::Guts::_compile()> when working with tainted values (CPAN RT #40727). C<< ->maketext >> calls now back up and restore C<$@> so error messages are not suppressed (CPAN RT #34182). =item * L<Log::Message> has been upgraded from version 0.02 to 0.04. =item * L<Log::Message::Simple> has been upgraded from version 0.06 to 0.08. =item * L<Math::BigInt> has been upgraded from version 1.89_01 to 1.994. This fixes, among other things, incorrect results when computing binomial coefficients [perl #77640]. It also prevents C<sqrt($int)> from crashing under C<use bigrat>. [perl #73534]. 
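As a small illustration of the binomial-coefficient computation mentioned for L<Math::BigInt> above, the following sketch uses the C<bnok()> method ("n over k"). The values are chosen only for illustration, and C<copy()> is used on the assumption that the caller wants to keep the original number, since C<bnok()> modifies its invocant in place.

```perl
use strict;
use warnings;
use Math::BigInt;

# Illustrative values only: 10 choose 3.
my $n = Math::BigInt->new(10);

# copy() first, because bnok() modifies the object it is called on.
my $c = $n->copy->bnok(3);

print "C(10,3) = $c\n";
```

Without the C<copy()>, C<$n> itself would hold the result after the call.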
=item * L<Math::BigInt::FastCalc> has been upgraded from version 0.19 to 0.28. =item * L<Math::BigRat> has been upgraded from version 0.24 to 0.26_02. =item * L<Memoize> has been upgraded from version 1.01_03 to 1.02. =item * L<MIME::Base64> has been upgraded from 3.08 to 3.13. Includes new functions to calculate the length of encoded and decoded base64 strings. Now provides encode_base64url() and decode_base64url() functions to process the base64 scheme for "URL applications". =item * L<Module::Build> has been upgraded from version 0.3603 to 0.3800. A notable change is the deprecation of several modules. L<Module::Build::Version> has been deprecated and L<Module::Build> now relies on the L<version> pragma directly. L<Module::Build::ModuleInfo> has been deprecated in favor of a standalone copy called L<Module::Metadata>. L<Module::Build::YAML> has been deprecated in favor of L<CPAN::Meta::YAML>. L<Module::Build> now also generates F<META.json> and F<MYMETA.json> files in accordance with version 2 of the CPAN distribution metadata specification, L<CPAN::Meta::Spec>. The older format F<META.yml> and F<MYMETA.yml> files are still generated. =item * L<Module::CoreList> has been upgraded from version 2.29 to 2.47. Besides listing the updated core modules of this release, it also stops listing the C<Filespec> module. That module never existed in core. The scripts generating L<Module::CoreList> confused it with L<VMS::Filespec>, which actually is a core module as of Perl 5.8.7. =item * L<Module::Load> has been upgraded from version 0.16 to 0.18. =item * L<Module::Load::Conditional> has been upgraded from version 0.34 to 0.44. =item * The L<mro> pragma has been upgraded from version 1.02 to 1.07. =item * L<NDBM_File> has been upgraded from version 1.08 to 1.12. This fixes a memory leak when DBM filters are used. =item * L<Net::Ping> has been upgraded from version 2.36 to 2.38. =item * L<NEXT> has been upgraded from version 0.64 to 0.65. 
=item * L<Object::Accessor> has been upgraded from version 0.36 to 0.38. =item * L<ODBM_File> has been upgraded from version 1.07 to 1.10. This fixes a memory leak when DBM filters are used. =item * L<Opcode> has been upgraded from version 1.15 to 1.18. =item * The L<overload> pragma has been upgraded from 1.10 to 1.13. C<overload::Method> can now handle subroutines that are themselves blessed into overloaded classes [perl #71998]. The documentation has greatly improved. See L</Documentation> below. =item * L<Params::Check> has been upgraded from version 0.26 to 0.28. =item * The L<parent> pragma has been upgraded from version 0.223 to 0.225. =item * L<Parse::CPAN::Meta> has been upgraded from version 1.40 to 1.4401. The latest Parse::CPAN::Meta can now read YAML and JSON files using L<CPAN::Meta::YAML> and L<JSON::PP>, which are now part of the Perl core. =item * L<PerlIO::encoding> has been upgraded from version 0.12 to 0.14. =item * L<PerlIO::scalar> has been upgraded from 0.07 to 0.11. A read() after a seek() beyond the end of the string no longer thinks it has data to read [perl #78716]. =item * L<PerlIO::via> has been upgraded from version 0.09 to 0.11. =item * L<Pod::Html> has been upgraded from version 1.09 to 1.11. =item * L<Pod::LaTeX> has been upgraded from version 0.58 to 0.59. =item * L<Pod::Perldoc> has been upgraded from version 3.15_02 to 3.15_03. =item * L<Pod::Simple> has been upgraded from version 3.13 to 3.16. =item * L<POSIX> has been upgraded from 1.19 to 1.24. It now includes the POSIX signal constants. =item * The L<re> pragma has been upgraded from version 0.11 to 0.18. The C<use re '/flags'> subpragma is new. The regmust() function used to crash when called on a regular expression belonging to a pluggable engine. Now it croaks instead. regmust() no longer leaks memory. =item * L<Safe> has been upgraded from version 2.25 to 2.29. Coderefs returned by reval() and rdo() are now wrapped via wrap_code_refs() (5.12.1).
This fixes a possible infinite loop when looking for coderefs. It adds several C<version::vxs::*> routines to the default share. =item * L<SDBM_File> has been upgraded from version 1.06 to 1.09. =item * L<SelfLoader> has been upgraded from 1.17 to 1.18. It now works in taint mode [perl #72062]. =item * The L<sigtrap> pragma has been upgraded from version 1.04 to 1.05. It no longer tries to modify read-only arguments when generating a backtrace [perl #72340]. =item * L<Socket> has been upgraded from version 1.87 to 1.94. See L</Improved IPv6 support> above. =item * L<Storable> has been upgraded from version 2.22 to 2.27. Includes performance improvement for overloaded classes. This adds support for serialising code references that contain UTF-8 strings correctly. The L<Storable> minor version number changed as a result, meaning that L<Storable> users who set C<$Storable::accept_future_minor> to a C<FALSE> value will see errors (see L<Storable/FORWARD COMPATIBILITY> for more details). Freezing no longer gets confused if the Perl stack gets reallocated during freezing [perl #80074]. =item * L<Sys::Hostname> has been upgraded from version 1.11 to 1.16. =item * L<Term::ANSIColor> has been upgraded from version 2.02 to 3.00. =item * L<Term::UI> has been upgraded from version 0.20 to 0.26. =item * L<Test::Harness> has been upgraded from version 3.17 to 3.23. =item * L<Test::Simple> has been upgraded from version 0.94 to 0.98. Among many other things, subtests without a C<plan> or C<no_plan> now have an implicit done_testing() added to them. =item * L<Thread::Semaphore> has been upgraded from version 2.09 to 2.12. It provides two new methods that give more control over the decrementing of semaphores: C<down_nb> and C<down_force>. =item * L<Thread::Queue> has been upgraded from version 2.11 to 2.12. =item * The L<threads> pragma has been upgraded from version 1.75 to 1.83. =item * The L<threads::shared> pragma has been upgraded from version 1.32 to 1.37. 
=item * L<Tie::Hash> has been upgraded from version 1.03 to 1.04. Calling C<< Tie::Hash->TIEHASH() >> used to loop forever. Now it C<croak>s. =item * L<Tie::Hash::NamedCapture> has been upgraded from version 0.06 to 0.08. =item * L<Tie::RefHash> has been upgraded from version 1.38 to 1.39. =item * L<Time::HiRes> has been upgraded from version 1.9719 to 1.9721_01. =item * L<Time::Local> has been upgraded from version 1.1901_01 to 1.2000. =item * L<Time::Piece> has been upgraded from version 1.15_01 to 1.20_01. =item * L<Unicode::Collate> has been upgraded from version 0.52_01 to 0.73. L<Unicode::Collate> has been updated to use Unicode 6.0.0. L<Unicode::Collate::Locale> now supports a plethora of new locales: I<ar, be, bg, de__phonebook, hu, hy, kk, mk, nso, om, tn, vi, hr, ig, ja, ko, ru, sq, se, sr, to, uk, zh, zh__big5han, zh__gb2312han, zh__pinyin>, and I<zh__stroke>. The following modules have been added: L<Unicode::Collate::CJK::Big5> for C<zh__big5han> which makes tailoring of CJK Unified Ideographs in the order of CLDR's big5han ordering. L<Unicode::Collate::CJK::GB2312> for C<zh__gb2312han> which makes tailoring of CJK Unified Ideographs in the order of CLDR's gb2312han ordering. L<Unicode::Collate::CJK::JISX0208> which makes tailoring of 6355 kanji (CJK Unified Ideographs) in the JIS X 0208 order. L<Unicode::Collate::CJK::Korean> which makes tailoring of CJK Unified Ideographs in the order of CLDR's Korean ordering. L<Unicode::Collate::CJK::Pinyin> for C<zh__pinyin> which makes tailoring of CJK Unified Ideographs in the order of CLDR's pinyin ordering. L<Unicode::Collate::CJK::Stroke> for C<zh__stroke> which makes tailoring of CJK Unified Ideographs in the order of CLDR's stroke ordering. This also sees the switch from using the pure-Perl version of this module to the XS version. =item * L<Unicode::Normalize> has been upgraded from version 1.03 to 1.10. =item * L<Unicode::UCD> has been upgraded from version 0.27 to 0.32. 
A new function, Unicode::UCD::num(), has been added. This function returns the numeric value of the string passed it or C<undef> if the string in its entirety has no "safe" numeric value. (For more detail, and for the definition of "safe", see L<Unicode::UCD/num()>.) This upgrade also includes several bug fixes: =over 4 =item charinfo() =over 4 =item * It is now updated to Unicode Version 6.0.0 with I<Corrigendum #8>, excepting that, just as with Perl 5.14, the code point at U+1F514 has no name. =item * Hangul syllable code points have the correct names, and their decompositions are always output without requiring L<Lingua::KO::Hangul::Util> to be installed. =item * CJK (Chinese-Japanese-Korean) code points U+2A700 to U+2B734 and U+2B740 to U+2B81D are now properly handled. =item * Numeric values are now output for those CJK code points that have them. =item * Names output for code points with multiple aliases are now the corrected ones. =back =item charscript() This now correctly returns "Unknown" instead of C<undef> for the script of a code point that hasn't been assigned another one. =item charblock() This now correctly returns "No_Block" instead of C<undef> for the block of a code point that hasn't been assigned to another one. =back =item * The L<version> pragma has been upgraded from 0.82 to 0.88. Because of a bug, now fixed, the is_strict() and is_lax() functions did not work when exported (5.12.1). =item * The L<warnings> pragma has been upgraded from version 1.09 to 1.12. Calling C<use warnings> without arguments is now significantly more efficient. =item * The L<warnings::register> pragma has been upgraded from version 1.01 to 1.02. It is now possible to register warning categories other than the names of packages using L<warnings::register>. See L<perllexwarn(1)> for more information. =item * L<XSLoader> has been upgraded from version 0.10 to 0.13. =item * L<VMS::DCLsym> has been upgraded from version 1.03 to 1.05. 
Two bugs have been fixed [perl #84086]: The symbol table name was lost when tying a hash, due to a thinko in C<TIEHASH>. The result was that all tied hashes interacted with the local symbol table. Unless a symbol table name had been explicitly specified in the call to the constructor, querying the special key C<:LOCAL> failed to identify objects connected to the local symbol table. =item * The L<Win32> module has been upgraded from version 0.39 to 0.44. This release has several new functions: Win32::GetSystemMetrics(), Win32::GetProductInfo(), Win32::GetOSDisplayName(). The names returned by Win32::GetOSName() and Win32::GetOSDisplayName() have been corrected. =item * L<XS::Typemap> has been upgraded from version 0.03 to 0.05. =back =head2 Removed Modules and Pragmata As promised in Perl 5.12.0's release notes, the following modules have been removed from the core distribution, and if needed should be installed from CPAN instead. =over =item * L<Class::ISA> has been removed from the Perl core. Prior version was 0.36. =item * L<Pod::Plainer> has been removed from the Perl core. Prior version was 1.02. =item * L<Switch> has been removed from the Perl core. Prior version was 2.16. =back The removal of L<Shell> has been deferred until after 5.14, as the implementation of L<Shell> shipped with 5.12.0 did not correctly issue the warning that it was to be removed from core. =head1 Documentation =head2 New Documentation =head3 L<perlgpl> L<perlgpl> has been updated to contain GPL version 1, as is included in the F<README> distributed with Perl (5.12.1). =head3 Perl 5.12.x delta files The perldelta files for Perl 5.12.1 to 5.12.3 have been added from the maintenance branch: L<perl5121delta>, L<perl5122delta>, L<perl5123delta>. =head3 L<perlpodstyle> New style guide for POD documentation, split mostly from the NOTES section of the L<pod2man(1)> manpage. =head3 L<perlsource>, L<perlinterp>, L<perlhacktut>, and L<perlhacktips> See L</perlhack and perlrepository revamp>, below. 
=head2 Changes to Existing Documentation =head3 L<perlmodlib> is now complete The L<perlmodlib> manpage that came with Perl 5.12.0 was missing several modules due to a bug in the script that generates the list. This has been fixed [perl #74332] (5.12.1). =head3 Replace incorrect tr/// table in L<perlebcdic> L<perlebcdic> contains a helpful table to use in C<tr///> to convert between EBCDIC and Latin1/ASCII. The table was the inverse of the one it describes, though the code that used the table worked correctly for the specific example given. The table has been corrected and the sample code changed to correspond. The table has also been changed to hex from octal, and the recipes in the pod have been altered to print out leading zeros to make all values the same length. =head3 Tricks for user-defined casing L<perlunicode> now contains an explanation of how to override, mangle and otherwise tweak the way Perl handles upper-, lower- and other-case conversions on Unicode data, and how to provide scoped changes to alter one's own code's behaviour without stomping on anybody else's. =head3 INSTALL explicitly states that Perl requires a C89 compiler This was already true, but it's now Officially Stated For The Record (5.12.2). =head3 Explanation of C<\xI<HH>> and C<\oI<OOO>> escapes L<perlop> has been updated with more detailed explanation of these two character escapes. =head3 B<-0I<NNN>> switch In L<perlrun>, the behaviour of the B<-0NNN> switch for B<-0400> or higher has been clarified (5.12.2). =head3 Maintenance policy L<perlpolicy> now contains the policy on what patches are acceptable for maintenance branches (5.12.1). =head3 Deprecation policy L<perlpolicy> now contains the policy on compatibility and deprecation along with definitions of terms like "deprecation" (5.12.2). 
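To illustrate the C<\x> and C<\o> escapes covered by the L<perlop> entry above, here is a minimal sketch. The code point used (U+0041, "A") is only an example, and the braced C<\o{...}> form assumes Perl 5.14 or later.

```perl
use strict;
use warnings;

# The same character written three ways (the code point is illustrative).
my $hex   = "\x41";      # two-digit hexadecimal escape
my $brace = "\x{41}";    # braced hexadecimal escape
my $oct   = "\o{101}";   # braced octal escape (Perl 5.14 or later)

print "all three are 'A'\n"
    if $hex eq 'A' && $brace eq 'A' && $oct eq 'A';
```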
=head3 New descriptions in L<perldiag> The following existing diagnostics are now documented: =over 4 =item * L<Ambiguous use of %c resolved as operator %c|perldiag/"Ambiguous use of %c resolved as operator %c"> =item * L<Ambiguous use of %c{%s} resolved to %c%s|perldiag/"Ambiguous use of %c{%s} resolved to %c%s"> =item * L<Ambiguous use of %c{%s[...]} resolved to %c%s[...]|perldiag/"Ambiguous use of %c{%s[...]} resolved to %c%s[...]"> =item * L<Ambiguous use of %c{%s{...}} resolved to %c%s{...}|perldiag/"Ambiguous use of %c{%s{...}} resolved to %c%s{...}"> =item * L<Ambiguous use of -%s resolved as -&%s()|perldiag/"Ambiguous use of -%s resolved as -&%s()"> =item * L<Invalid strict version format (%s)|perldiag/"Invalid strict version format (%s)"> =item * L<Invalid version format (%s)|perldiag/"Invalid version format (%s)"> =item * L<Invalid version object|perldiag/"Invalid version object"> =back =head3 L<perlbook> L<perlbook> has been expanded to cover many more popular books. =head3 C<SvTRUE> macro The documentation for the C<SvTRUE> macro in L<perlapi> was simply wrong in stating that get-magic is not processed. It has been corrected. =head3 op manipulation functions Several API functions that process optrees have been newly documented. =head3 L<perlvar> revamp L<perlvar> reorders the variables and groups them by topic. Each variable introduced after Perl 5.000 notes the first version in which it is available. L<perlvar> also has a new section for deprecated variables to note when they were removed. =head3 Array and hash slices in scalar context These are now documented in L<perldata>. =head3 C<use locale> and formats L<perlform> and L<perllocale> have been corrected to state that C<use locale> affects formats. =head3 L<overload> L<overload>'s documentation has practically undergone a rewrite. It is now much more straightforward and clear. 
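For readers unfamiliar with what the pragma does, the following is a minimal sketch of operator overloading. The class C<Point2D> and its methods are hypothetical names invented for this illustration; they are not taken from the L<overload> documentation.

```perl
use strict;
use warnings;

# Hypothetical class, used only for illustration.
package Point2D;

use overload
    '+'  => \&add,          # called for $a + $b
    '""' => \&to_string;    # called when the object is used as a string

sub new {
    my ($class, $x, $y) = @_;
    return bless { x => $x, y => $y }, $class;
}

sub add {
    my ($self, $other) = @_;
    return Point2D->new($self->{x} + $other->{x}, $self->{y} + $other->{y});
}

sub to_string {
    my ($self) = @_;
    return "($self->{x}, $self->{y})";
}

package main;

my $sum = Point2D->new(1, 2) + Point2D->new(3, 4);
print "$sum\n";
```

Interpolating C<$sum> into a string triggers the C<""> handler, so the object prints as a coordinate pair rather than as a hash reference.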
=head3 perlhack and perlrepository revamp The L<perlhack> document is now much shorter, and focuses on the Perl 5 development process and submitting patches to Perl. The technical content has been moved to several new documents, L<perlsource>, L<perlinterp>, L<perlhacktut>, and L<perlhacktips>. This technical content has been only lightly edited. The perlrepository document has been renamed to L<perlgit>. This new document is just a how-to on using git with the Perl source code. Any other content that used to be in perlrepository has been moved to L<perlhack>. =head3 Time::Piece examples Examples in L<perlfaq4> have been updated to show the use of L<Time::Piece>. =head1 Diagnostics The following additions or changes have been made to diagnostic output, including warnings and fatal error messages. For the complete list of diagnostic messages, see L<perldiag>. =head2 New Diagnostics =head3 New Errors =over =item Closure prototype called This error occurs when a subroutine reference passed to an attribute handler is called, if the subroutine is a closure [perl #68560]. =item Insecure user-defined property %s Perl detected tainted data when trying to compile a regular expression that contains a call to a user-defined character property function, meaning C<\p{IsFoo}> or C<\p{InFoo}>. See L<perlunicode/User-Defined Character Properties> and L<perlsec>. =item panic: gp_free failed to free glob pointer - something is repeatedly re-creating entries This new error is triggered if a destructor called on an object in a typeglob that is being freed creates a new typeglob entry containing an object with a destructor that creates a new entry containing an object etc. =item Parsing code internal error (%s) This new fatal error is produced when parsing code supplied by an extension violates the parser's API in a detectable way. =item refcnt: fd %d%s This new error only occurs if an internal consistency check fails when a pipe is about to be closed. 
=item Regexp modifier "/%c" may not appear twice The regular expression pattern has one of the mutually exclusive modifiers repeated. =item Regexp modifiers "/%c" and "/%c" are mutually exclusive The regular expression pattern has more than one of the mutually exclusive modifiers. =item Using !~ with %s doesn't make sense This error occurs when C<!~> is used with C<s///r> or C<y///r>. =back =head3 New Warnings =over =item "\b{" is deprecated; use "\b\{" instead =item "\B{" is deprecated; use "\B\{" instead Use of an unescaped "{" immediately following a C<\b> or C<\B> is now deprecated in order to reserve its use for Perl itself in a future release. =item Operation "%s" returns its argument for ... Performing an operation requiring Unicode semantics (such as case-folding) on a Unicode surrogate or a non-Unicode character now triggers this warning. =item Use of qw(...) as parentheses is deprecated See L</"Use of qw(...) as parentheses">, above, for details. =back =head2 Changes to Existing Diagnostics =over 4 =item * The "Variable $foo is not imported" warning that precedes a C<strict 'vars'> error has now been assigned the "misc" category, so that C<no warnings> will suppress it [perl #73712]. =item * warn() and die() now produce "Wide character" warnings when fed a character outside the byte range if C<STDERR> is a byte-sized handle. =item * The "Layer does not match this perl" error message has been replaced with these more helpful messages [perl #73754]: =over 4 =item * PerlIO layer function table size (%d) does not match size expected by this perl (%d) =item * PerlIO layer instance size (%d) does not match size expected by this perl (%d) =back =item * The "Found = in conditional" warning that is emitted when a constant is assigned to a variable in a condition is now withheld if the constant is actually a subroutine or one generated by C<use constant>, since the value of the constant may not be known at the time the program is written [perl #77762]. 
=item * Previously, if none of the gethostbyaddr(), gethostbyname() and gethostent() functions were implemented on a given platform, they would all die with the message "Unsupported socket function 'gethostent' called", with analogous messages for getnet*() and getserv*(). This has been corrected. =item * The warning message about unrecognized regular expression escapes passed through has been changed to include any literal "{" following the two-character escape. For example, "\q{" is now emitted instead of "\q". =back =head1 Utility Changes =head3 L<perlbug(1)> =over 4 =item * L<perlbug> now looks in the EMAIL environment variable for a return address if the REPLY-TO and REPLYTO variables are empty. =item * L<perlbug> did not previously generate a "From:" header, potentially resulting in dropped mail; it now includes that header. =item * The user's address is now used as the Return-Path. Many systems these days don't have a valid Internet domain name, and perlbug@perl.org does not accept email with a return-path that does not resolve. So the user's address is now passed to sendmail so it's less likely to get stuck in a mail queue somewhere [perl #82996]. =item * L<perlbug> now always gives the reporter a chance to change the email address it guesses for them (5.12.2). =item * L<perlbug> should no longer warn about uninitialized values when using the B<-d> and B<-v> options (5.12.2). =back =head3 L<perl5db.pl> =over =item * The remote terminal works after forking and spawns new sessions, one per forked process. =back =head3 L<ptargrep> =over 4 =item * L<ptargrep> is a new utility to apply pattern matching to the contents of files in a tar archive. It comes with C<Archive::Tar>. =back =head1 Configuration and Compilation See also L</"Naming fixes in Policy_sh.SH may invalidate Policy.sh">, above. 
=over 4 =item * CCINCDIR and CCLIBDIR for the mingw64 cross-compiler are now correctly under F<$(CCHOME)\mingw\include> and F<\lib> rather than immediately below F<$(CCHOME)>. This means the "incpath", "libpth", "ldflags", "lddlflags" and "ldflags_nolargefiles" values in F<Config.pm> and F<Config_heavy.pl> are now set correctly. =item * C<make test.valgrind> has been adjusted to account for F<cpan/dist/ext> separation. =item * On compilers that support it, B<-Wwrite-strings> is now added to cflags by default. =item * The L<Encode> module can now (once again) be included in a static Perl build. The special-case handling for this situation got broken in Perl 5.11.0, and has now been repaired. =item * The previous default size of a PerlIO buffer (4096 bytes) has been increased to the larger of 8192 bytes and your local BUFSIZ. Benchmarks show that doubling this decade-old default increases read and write performance by around 25% to 50% when using the default layers of perlio on top of unix. To choose a non-default size, such as to get back the old value or to obtain an even larger value, configure with: ./Configure -Accflags=-DPERLIOBUF_DEFAULT_BUFSIZ=N where N is the desired size in bytes; it should probably be a multiple of your page size. =item * An "incompatible operand types" error in ternary expressions when building with C<clang> has been fixed (5.12.2). =item * Perl now skips setuid L<File::Copy> tests on partitions it detects mounted as C<nosuid> (5.12.2). =back =head1 Platform Support =head2 New Platforms =over 4 =item AIX Perl now builds on AIX 4.2 (5.12.1). =back =head2 Discontinued Platforms =over 4 =item Apollo DomainOS The last vestiges of support for this platform have been excised from the Perl distribution. It was officially discontinued in version 5.12.0. It had not worked for years before that. =item MacOS Classic The last vestiges of support for this platform have been excised from the Perl distribution. 
It was officially discontinued in an earlier version. =back =head2 Platform-Specific Notes =head3 AIX =over =item * F<README.aix> has been updated with information about the XL C/C++ V11 compiler suite (5.12.2). =back =head3 ARM =over =item * The C<d_u32align> configuration probe on ARM has been fixed (5.12.2). =back =head3 Cygwin =over 4 =item * L<MakeMaker> has been updated to build manpages on cygwin. =item * Improved rebase behaviour If a DLL is updated on cygwin, the old imagebase address is reused. This solves most rebase errors, especially when updating core DLLs. See L<http://www.tishler.net/jason/software/rebase/rebase-2.4.2.README> for more information. =item * Support for the standard cygwin dll prefix (needed for FFIs) =item * Updated build hints file =back =head3 FreeBSD 7 =over =item * FreeBSD 7 no longer contains F</usr/bin/objformat>. At build time, Perl now skips the F<objformat> check for versions 7 and higher and assumes ELF (5.12.1). =back =head3 HP-UX =over =item * Perl now allows B<-Duse64bitint> without promoting to C<use64bitall> on HP-UX (5.12.1). =back =head3 IRIX =over =item * Conversion of strings to floating-point numbers is now more accurate on IRIX systems [perl #32380]. =back =head3 Mac OS X =over =item * Early versions of Mac OS X (Darwin) had buggy implementations of the setregid(), setreuid(), setrgid(), and setruid() functions, so Perl would pretend they did not exist. These functions are now recognised on Mac OS 10.5 (Leopard; Darwin 9) and higher, as they have been fixed [perl #72990]. =back =head3 MirBSD =over =item * Previously, if you built Perl with a shared F<libperl.so> on MirBSD (the default config), it would work up to the installation; however, once installed, it would be unable to find F<libperl>. Path handling is now treated as in the other BSD dialects. =back =head3 NetBSD =over =item * The NetBSD hints file has been changed to make the system malloc the default.
=back =head3 OpenBSD =over =item * OpenBSD E<gt> 3.7 has a new malloc implementation which is I<mmap>-based, and as such can release memory back to the OS; however, Perl's use of this malloc causes a substantial slowdown, so we now default to using Perl's malloc instead [perl #75742]. =back =head3 OpenVOS =over =item * Perl now builds again with OpenVOS (formerly known as Stratus VOS) [perl #78132] (5.12.3). =back =head3 Solaris =over =item * DTrace is now supported on Solaris. There used to be build failures, but these have been fixed [perl #73630] (5.12.3). =back =head3 VMS =over =item * Extension building on older (pre 7.3-2) VMS systems was broken because configure.com hit the DCL symbol length limit of 1K. We now work within this limit when assembling the list of extensions in the core build (5.12.1). =item * We fixed configuring and building Perl with B<-Uuseperlio> (5.12.1). =item * C<PerlIOUnix_open> now honours the default permissions on VMS. When C<perlio> became the default and C<unix> became the default bottom layer, the most common path for creating files from Perl became C<PerlIOUnix_open>, which has always explicitly used C<0666> as the permission mask. This prevents inheriting permissions from RMS defaults and ACLs, so to avoid that problem, we now pass C<0777> to open(). In the VMS CRTL, C<0777> has a special meaning over and above intersecting with the current umask; specifically, it allows Unix syscalls to preserve native default permissions (5.12.3). =item * The shortening of symbols longer than 31 characters in the core C sources and in extensions is now by default done by the C compiler rather than by xsubpp (which could only do so for generated symbols in XS code). You can reenable xsubpp's symbol shortening by configuring with -Uuseshortenedsymbols, but you'll have some work to do to get the core sources to compile. 
=item * Record-oriented files (record format variable or variable with fixed control) opened for write by the C<perlio> layer will now be line-buffered to prevent the introduction of spurious line breaks whenever the perlio buffer fills up. =item * F<git_version.h> is now installed on VMS. This was an oversight in v5.12.0 which caused some extensions to fail to build (5.12.2). =item * Several memory leaks in L<stat()|perlfunc/"stat FILEHANDLE"> have been fixed (5.12.2). =item * A memory leak in Perl_rename() due to a double allocation has been fixed (5.12.2). =item * A memory leak in vms_fid_to_name() (used by realpath() and realname()) has been fixed (5.12.2). =back =head3 Windows See also L</"fork() emulation will not wait for signalled children"> and L</"Perl source code is read in text mode on Windows">, above. =over 4 =item * Fixed build process for SDK2003SP1 compilers. =item * Compilation with Visual Studio 2010 is now supported. =item * When using old 32-bit compilers, the define C<_USE_32BIT_TIME_T> is now set in C<$Config{ccflags}>. This improves portability when compiling XS extensions using new compilers, but for a Perl compiled with old 32-bit compilers. =item * C<$Config{gccversion}> is now set correctly when Perl is built using the mingw64 compiler from L<http://mingw64.org> [perl #73754]. =item * When building Perl with the mingw64 x64 cross-compiler, the C<incpath>, C<libpth>, C<ldflags>, C<lddlflags> and C<ldflags_nolargefiles> values in F<Config.pm> and F<Config_heavy.pl> were not previously being set correctly because, with that compiler, the include and lib directories are not immediately below C<$(CCHOME)> (5.12.2). =item * The build process proceeds more smoothly with mingw and dmake when F<C:\MSYS\bin> is in the PATH, due to a C<Cwd> fix. =item * Support for building with Visual C++ 2010 is now underway, but is not yet complete. See F<README.win32> or L<perlwin32> for more details.
=item * The option to use an externally-supplied crypt(), or to build with no crypt() at all, has been removed. Perl supplies its own crypt() implementation for Windows, and the political situation that required this part of the distribution to sometimes be omitted is long gone. =back =head1 Internal Changes =head2 New APIs =head3 CLONE_PARAMS structure added to ease correct thread creation Modules that create threads should now create C<CLONE_PARAMS> structures by calling the new function Perl_clone_params_new(), and free them with Perl_clone_params_del(). This will ensure compatibility with any future changes to the internals of the C<CLONE_PARAMS> structure layout, and that it is correctly allocated and initialised. =head3 New parsing functions Several functions have been added for parsing Perl statements and expressions. These functions are meant to be used by XS code invoked during Perl parsing, in a recursive-descent manner, to allow modules to augment the standard Perl syntax. =over =item * L<parse_stmtseq()|perlapi/parse_stmtseq> parses a sequence of statements, up to closing brace or EOF. =item * L<parse_fullstmt()|perlapi/parse_fullstmt> parses a complete Perl statement, including optional label. =item * L<parse_barestmt()|perlapi/parse_barestmt> parses a statement without a label. =item * L<parse_block()|perlapi/parse_block> parses a code block. =item * L<parse_label()|perlapi/parse_label> parses a statement label, separate from statements. =item * L<C<parse_fullexpr()>|perlapi/parse_fullexpr>, L<C<parse_listexpr()>|perlapi/parse_listexpr>, L<C<parse_termexpr()>|perlapi/parse_termexpr>, and L<C<parse_arithexpr()>|perlapi/parse_arithexpr> parse expressions at various precedence levels. =back =head3 Hints hash API A new C API for introspecting the hinthash C<%^H> at runtime has been added. See C<cop_hints_2hv>, C<cop_hints_fetchpvn>, C<cop_hints_fetchpvs>, C<cop_hints_fetchsv>, and C<hv_copy_hints_hv> in L<perlapi> for details. 
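As a Perl-level illustration of the data these functions read, the sketch below stores a key in C<%^H> at compile time and retrieves it at run time through element 10 of the list returned by caller(), the mechanism described in L<perlpragma>. The C<MyHint> package and its C<MyHint/on> key are hypothetical names invented for this example, and the sketch does not call the new C functions themselves.

```perl
use strict;
use warnings;

# Hypothetical lexical pragma; the package name and hint key are made up.
package MyHint;

# Called at compile time: record a hint in the scope being compiled.
sub import   { $^H{"MyHint/on"} = 1 }
sub unimport { $^H{"MyHint/on"} = 0 }

# Called at run time: element 10 of caller() is the hints hash that was
# in effect where the calling statement was compiled (undef if empty).
sub in_effect {
    my $hints = (caller(0))[10];
    return $hints && $hints->{"MyHint/on"} ? 1 : 0;
}

package main;

{
    BEGIN { MyHint->import }    # stands in for "use MyHint"
    print "inside block:  ", MyHint::in_effect(), "\n";
}
print "outside block: ", MyHint::in_effect(), "\n";
```

The hint is visible only to code compiled inside the block, since C<%^H> is lexically scoped. XS code would presumably obtain the same values through the functions listed above rather than through caller().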
A new, experimental API has been added for accessing the internal structure that Perl uses for C<%^H>. See the functions beginning with C<cophh_> in L<perlapi>. =head3 C interface to caller() The C<caller_cx> function has been added as an XSUB-writer's equivalent of caller(). See L<perlapi> for details. =head3 Custom per-subroutine check hooks XS code in an extension module can now annotate a subroutine (whether implemented in XS or in Perl) so that nominated XS code will be called at compile time (specifically as part of op checking) to change the op tree of that subroutine. The compile-time check function (supplied by the extension module) can implement argument processing that can't be expressed as a prototype, generate customised compile-time warnings, perform constant folding for a pure function, inline a subroutine consisting of sufficiently simple ops, replace the whole call with a custom op, and so on. This was previously all possible by hooking the C<entersub> op checker, but the new mechanism makes it easy to tie the hook to a specific subroutine. See L<perlapi/cv_set_call_checker>. To help in writing custom check hooks, several subtasks within standard C<entersub> op checking have been separated out and exposed in the API. =head3 Improved support for custom OPs Custom ops can now be registered with the new C<custom_op_register> C function and the C<XOP> structure. This will make it easier to add new properties of custom ops in the future. Two new properties have been added already, C<xop_class> and C<xop_peep>. C<xop_class> is one of the OA_*OP constants. It allows L<B> and other introspection mechanisms to work with custom ops that aren't BASEOPs. C<xop_peep> is a pointer to a function that will be called for ops of this type from C<Perl_rpeep>. See L<perlguts/Custom Operators> and L<perlapi/Custom Operators> for more detail. The old C<PL_custom_op_names>/C<PL_custom_op_descs> interface is still supported but discouraged. 
=head3 Scope hooks It is now possible for XS code to hook into Perl's lexical scope mechanism at compile time, using the new C<Perl_blockhook_register> function. See L<perlguts/"Compile-time scope hooks">. =head3 The recursive part of the peephole optimizer is now hookable In addition to C<PL_peepp>, for hooking into the toplevel peephole optimizer, a C<PL_rpeepp> is now available to hook into the optimizer recursing into side-chains of the optree. =head3 New non-magical variants of existing functions The following functions/macros have been added to the API. The C<*_nomg> macros are equivalent to their non-C<_nomg> variants, except that they ignore get-magic. Those ending in C<_flags> allow one to specify whether get-magic is processed. sv_2bool_flags SvTRUE_nomg sv_2nv_flags SvNV_nomg sv_cmp_flags sv_cmp_locale_flags sv_eq_flags sv_collxfrm_flags In some of these cases, the non-C<_flags> functions have been replaced with wrappers around the new functions. =head3 pv/pvs/sv versions of existing functions Many functions ending with pvn now have equivalent C<pv/pvs/sv> versions. =head3 List op-building functions List op-building functions have been added to the API. See L<op_append_elem|perlapi/op_append_elem>, L<op_append_list|perlapi/op_append_list>, and L<op_prepend_elem|perlapi/op_prepend_elem> in L<perlapi>. =head3 C<LINKLIST> The L<LINKLIST|perlapi/LINKLIST> macro, part of op building that constructs the execution-order op chain, has been added to the API. =head3 Localisation functions The C<save_freeop>, C<save_op>, C<save_pushi32ptr> and C<save_pushptrptr> functions have been added to the API. =head3 Stash names A stash can now have a list of effective names in addition to its usual name. The first effective name can be accessed via the C<HvENAME> macro, which is now the recommended name to use in MRO linearisations (C<HvNAME> being a fallback if there is no C<HvENAME>). These names are added and deleted via C<hv_ename_add> and C<hv_ename_delete>. 
These two functions are I<not> part of the API. =head3 New functions for finding and removing magic The L<C<mg_findext()>|perlapi/mg_findext> and L<C<sv_unmagicext()>|perlapi/sv_unmagicext> functions have been added to the API. They allow extension authors to find and remove magic attached to scalars based on both the magic type and the magic virtual table, similar to how sv_magicext() attaches magic of a certain type and with a given virtual table to a scalar. This eliminates the need for extensions to walk the list of C<MAGIC> pointers of an C<SV> to find the magic that belongs to them. =head3 C<find_rundefsv> This function returns the SV representing C<$_>, whether it's lexical or dynamic. =head3 C<Perl_croak_no_modify> Perl_croak_no_modify() is short-hand for C<Perl_croak("%s", PL_no_modify)>. =head3 C<PERL_STATIC_INLINE> define The C<PERL_STATIC_INLINE> define has been added to provide the best-guess incantation to use for static inline functions, if the C compiler supports C99-style static inline. If it doesn't, it'll give a plain C<static>. C<HAS_STATIC_INLINE> can be used to check if the compiler actually supports inline functions. =head3 New C<pv_escape> option for hexadecimal escapes A new option, C<PERL_PV_ESCAPE_NONASCII>, has been added to C<pv_escape> to dump all characters above ASCII in hexadecimal. Before, one could get all characters as hexadecimal or the Latin1 non-ASCII as octal. =head3 C<lex_start> C<lex_start> has been added to the API, but is considered experimental. =head3 op_scope() and op_lvalue() The op_scope() and op_lvalue() functions have been added to the API, but are considered experimental. =head2 C API Changes =head3 C<PERL_POLLUTE> has been removed The option to define C<PERL_POLLUTE> to expose older 5.005 symbols for backwards compatibility has been removed. 
Its use was always discouraged, and MakeMaker contains a more specific escape hatch: perl Makefile.PL POLLUTE=1 This can be used for modules that have not been upgraded to 5.6 naming conventions (and really should be completely obsolete by now). =head3 Check API compatibility when loading XS modules When Perl's API changes in incompatible ways (which usually happens between major releases), XS modules compiled for previous versions of Perl will no longer work. They need to be recompiled against the new Perl. The C<XS_APIVERSION_BOOTCHECK> macro has been added to ensure that modules are recompiled and to prevent users from accidentally loading modules compiled for old perls into newer perls. That macro, which is called when loading every newly compiled extension, compares the API version of the running perl with the version a module has been compiled for and raises an exception if they don't match. =head3 Perl_fetch_cop_label The first argument of the C API function C<Perl_fetch_cop_label> has changed from C<struct refcounted_he *> to C<COP *>, to insulate the user from implementation details. This API function was marked as "may change", and likely isn't in use outside the core. (Neither an unpacked CPAN nor Google's codesearch finds any other references to it.) =head3 GvCV() and GvGP() are no longer lvalues The new GvCV_set() and GvGP_set() macros are now provided to replace assignment to those two macros. This allows a future commit to eliminate some backref magic between GV and CVs, which will require complete control over assignment to the C<gp_cv> slot. =head3 CvGV() is no longer an lvalue Under some circumstances, the CvGV() field of a CV is now reference-counted. To ensure consistent behaviour, direct assignment to it, for example C<CvGV(cv) = gv> is now a compile-time error. A new macro, C<CvGV_set(cv,gv)> has been introduced to run this operation safely. 
Note that modification of this field is not part of the public API, regardless of this new macro (and despite its being listed in this section). =head3 CvSTASH() is no longer an lvalue The CvSTASH() macro can now only be used as an rvalue. CvSTASH_set() has been added to replace assignment to CvSTASH(). This is to ensure that backreferences are handled properly. These macros are not part of the API. =head3 Calling conventions for C<newFOROP> and C<newWHILEOP> The way the parser handles labels has been cleaned up and refactored. As a result, the newFOROP() constructor function no longer takes a parameter stating what label is to go in the state op. The newWHILEOP() and newFOROP() functions no longer accept a line number as a parameter. =head3 Flags passed to C<uvuni_to_utf8_flags> and C<utf8n_to_uvuni> Some of the flags parameters to uvuni_to_utf8_flags() and utf8n_to_uvuni() have changed. This is a result of Perl's now allowing internal storage and manipulation of code points that are problematic in some situations. Hence, the default actions for these functions have been complemented to allow these code points. The new flags are documented in L<perlapi>. Code that requires the problematic code points to be rejected needs to change to use the new flags. Some flag names are retained for backward source compatibility, though they do nothing, as they are now the default. However, the flags C<UNICODE_ALLOW_FDD0>, C<UNICODE_ALLOW_FFFF>, C<UNICODE_ILLEGAL>, and C<UNICODE_IS_ILLEGAL> have been removed, as they stem from a fundamentally broken model of how the Unicode non-character code points should be handled, which is now described in L<perlunicode/Non-character code points>. See also the Unicode section under L</Selected Bug Fixes>. =head2 Deprecated C APIs =over =item C<Perl_ptr_table_clear> C<Perl_ptr_table_clear> is no longer part of Perl's public API. Calling it now generates a deprecation warning, and it will be removed in a future release.
=item C<sv_compile_2op> The sv_compile_2op() API function is now deprecated. Searches suggest that nothing on CPAN is using it, so this should have zero impact. It attempted to provide an API to compile code down to an optree, but failed to bind correctly to lexicals in the enclosing scope. It's not possible to fix this problem within the constraints of its parameters and return value. =item C<find_rundefsvoffset> The C<find_rundefsvoffset> function has been deprecated. It appeared that its design was insufficient for reliably getting the lexical C<$_> at run-time. Use the new C<find_rundefsv> function or the C<UNDERBAR> macro instead. They directly return the right SV representing C<$_>, whether it's lexical or dynamic. =item C<CALL_FPTR> and C<CPERLscope> Those are left from an old implementation of C<MULTIPLICITY> using C++ objects, which was removed in Perl 5.8. Nowadays these macros do exactly nothing, so they shouldn't be used anymore. For compatibility, they are still defined for external C<XS> code. Only extensions defining C<PERL_CORE> must be updated now. =back =head2 Other Internal Changes =head3 Stack unwinding The protocol for unwinding the C stack at the last stage of a C<die> has changed how it identifies the target stack frame. This now uses a separate variable C<PL_restartjmpenv>, where previously it relied on the C<blk_eval.cur_top_env> pointer in the C<eval> context frame that has nominally just been discarded. This change means that code running during various stages of Perl-level unwinding no longer needs to take care to avoid destroying the ghost frame. =head3 Scope stack entries The format of entries on the scope stack has been changed, resulting in a reduction of memory usage of about 10%. In particular, the memory used by the scope stack to record each active lexical variable has been halved. =head3 Memory allocation for pointer tables Memory allocation for pointer tables has been changed. 
Previously C<Perl_ptr_table_store> allocated memory from the same arena system as C<SV> bodies and C<HE>s, with freed memory remaining bound to those arenas until interpreter exit. Now it allocates memory from arenas private to the specific pointer table, and that memory is returned to the system when C<Perl_ptr_table_free> is called. Additionally, allocation and release are both less CPU intensive. =head3 C<UNDERBAR> The C<UNDERBAR> macro now calls C<find_rundefsv>. C<dUNDERBAR> is now a noop but should still be used to ensure past and future compatibility. =head3 String comparison routines renamed The C<ibcmp_*> functions have been renamed and are now called C<foldEQ>, C<foldEQ_locale>, and C<foldEQ_utf8>. The old names are still available as macros. =head3 C<chop> and C<chomp> implementations merged The opcode bodies for C<chop> and C<chomp> and for C<schop> and C<schomp> have been merged. The implementation functions Perl_do_chop() and Perl_do_chomp(), never part of the public API, have been merged and moved to a static function in F<pp.c>. This shrinks the Perl binary slightly, and should not affect any code outside the core (unless it is relying on the order of side-effects when C<chomp> is passed a I<list> of values). =head1 Selected Bug Fixes =head2 I/O =over 4 =item * Perl no longer produces this warning: $ perl -we 'open(my $f, ">", \my $x); binmode($f, "scalar")' Use of uninitialized value in binmode at -e line 1. =item * Opening a glob reference via C<< open($fh, ">", \*glob) >> no longer causes the glob to be corrupted when the filehandle is printed to. This would cause Perl to crash whenever the glob's contents were accessed [perl #77492]. =item * PerlIO no longer crashes when called recursively, such as from a signal handler. Now it just leaks memory [perl #75556]. =item * Most I/O functions were not warning for unopened handles unless the "closed" and "unopened" warnings categories were both enabled. 
Now only C<use warnings 'unopened'> is necessary to trigger these warnings, as had always been the intention. =item * There have been several fixes to PerlIO layers: When C<binmode(FH, ":crlf")> pushes the C<:crlf> layer on top of the stack, it no longer enables crlf layers lower in the stack so as to avoid unexpected results [perl #38456]. Opening a file in C<:raw> mode now does what it advertises to do (first open the file, then C<binmode> it), instead of simply leaving off the top layer [perl #80764]. The three layers C<:pop>, C<:utf8>, and C<:bytes> didn't allow stacking when opening a file. For example this: open(FH, ">:pop:perlio", "some.file") or die $!; would throw an "Invalid argument" error. This has been fixed in this release [perl #82484]. =back =head2 Regular Expression Bug Fixes =over =item * The regular expression engine no longer loops when matching C<"\N{LATIN SMALL LIGATURE FF}" =~ /f+/i> and similar expressions [perl #72998] (5.12.1). =item * The trie runtime code should no longer allocate massive amounts of memory, fixing #74484. =item * Syntax errors in C<< (?{...}) >> blocks no longer cause panic messages [perl #2353]. =item * A pattern like C<(?:(o){2})?> no longer causes a "panic" error [perl #39233]. =item * A fatal error in regular expressions containing C<(.*?)> when processing UTF-8 data has been fixed [perl #75680] (5.12.2). =item * An erroneous regular expression engine optimisation that caused regex verbs like C<*COMMIT> sometimes to be ignored has been removed. =item * The regular expression bracketed character class C<[\8\9]> was effectively the same as C<[89\000]>, incorrectly matching a NULL character. It also gave incorrect warnings that the C<8> and C<9> were ignored. Now C<[\8\9]> is the same as C<[89]> and gives legitimate warnings that C<\8> and C<\9> are unrecognized escape sequences, passed-through. 
=item * A regular expression match in the right-hand side of a global substitution (C<s///g>) that is in the same scope will no longer cause match variables to have the wrong values on subsequent iterations. This can happen when an array or hash subscript is interpolated in the right-hand side, as in C<s|(.)|@a{ print($1), /./ }|g> [perl #19078]. =item * Several cases in which characters in the Latin-1 non-ASCII range (0x80 to 0xFF) used not to match themselves, or used to match both a character class and its complement, have been fixed. For instance, U+00E2 could match both C<\w> and C<\W> [perl #78464] [perl #18281] [perl #60156]. =item * Matching a Unicode character against an alternation containing characters that happened to match continuation bytes in the former's UTF8 representation (like C<qq{\x{30ab}} =~ /\xab|\xa9/>) would cause erroneous warnings [perl #70998]. =item * The trie optimisation was not taking empty groups into account, preventing "foo" from matching C</\A(?:(?:)foo|bar|zot)\z/> [perl #78356]. =item * A pattern containing a C<+> inside a lookahead would sometimes cause an incorrect match failure in a global match (for example, C</(?=(\S+))/g>) [perl #68564]. =item * A regular expression optimisation would sometimes cause a match with a C<{n,m}> quantifier to fail when it should have matched [perl #79152]. =item * Case-insensitive matching in regular expressions compiled under C<use locale> now works much more sanely when the pattern or target string is internally encoded in UTF8. Previously, under these conditions the localeness was completely lost. Now, code points above 255 are treated as Unicode, but code points between 0 and 255 are treated using the current locale rules, regardless of whether the pattern or the string is encoded in UTF8. The few case-insensitive matches that cross the 255/256 boundary are not allowed. 
For example, 0xFF does not caselessly match the character at 0x178, LATIN CAPITAL LETTER Y WITH DIAERESIS, because 0xFF may not be LATIN SMALL LETTER Y in the current locale, and Perl has no way of knowing if that character even exists in the locale, much less what code point it is. =item * The C<(?|...)> regular expression construct no longer crashes if the final branch has more sets of capturing parentheses than any other branch. This was fixed in Perl 5.10.1 for the case of a single branch, but that fix did not take multiple branches into account [perl #84746]. =item * A bug has been fixed in the implementation of C<{...}> quantifiers in regular expressions that prevented the code block in C</((\w+)(?{ print $2 })){2}/> from seeing the C<$2> sometimes [perl #84294]. =back =head2 Syntax/Parsing Bugs =over =item * C<when (scalar) {...}> no longer crashes, but produces a syntax error [perl #74114] (5.12.1). =item * A label right before a string eval (C<foo: eval $string>) no longer causes the label to be associated also with the first statement inside the eval [perl #74290] (5.12.1). =item * The C<no 5.13.2> form of C<no> no longer tries to turn on features or pragmata (like L<strict>) [perl #70075] (5.12.2). =item * C<BEGIN {require 5.12.0}> now behaves as documented, rather than behaving identically to C<use 5.12.0>. Previously, C<require> in a C<BEGIN> block was erroneously executing the C<use feature ':5.12.0'> and C<use strict> behaviour, which only C<use> was documented to provide [perl #69050]. =item * A regression introduced in Perl 5.12.0, making C<< my $x = 3; $x = length(undef) >> result in C<$x> set to C<3> has been fixed. C<$x> will now be C<undef> [perl #85508] (5.12.2). =item * When strict "refs" mode is off, C<%{...}> in rvalue context returns C<undef> if its argument is undefined. 
An optimisation introduced in Perl 5.12.0 to make C<keys %{...}> faster when used as a boolean did not take this into account, causing C<keys %{+undef}> (and C<keys %$foo> when C<$foo> is undefined) to be an error, which it should be in strict mode only [perl #81750]. =item * Constant-folding used to cause $text =~ ( 1 ? /phoo/ : /bear/) to turn into $text =~ /phoo/ at compile time. Now it correctly matches against C<$_> [perl #20444]. =item * Parsing Perl code (either with string C<eval> or by loading modules) from within a C<UNITCHECK> block no longer causes the interpreter to crash [perl #70614]. =item * String C<eval>s no longer fail after 2 billion scopes have been compiled [perl #83364]. =item * The parser no longer hangs when encountering certain Unicode characters, such as U+387 [perl #74022]. =item * Defining a constant with the same name as one of Perl's special blocks (like C<INIT>) stopped working in 5.12.0, but has now been fixed [perl #78634]. =item * A reference to a literal value used as a hash key (C<$hash{\"foo"}>) used to be stringified, even if the hash was tied [perl #79178]. =item * A closure containing an C<if> statement followed by a constant or variable is no longer treated as a constant [perl #63540]. =item * C<state> can now be used with attributes. It used to mean the same thing as C<my> if any attributes were present [perl #68658]. =item * Expressions like C<< @$a > 3 >> no longer cause C<$a> to be mentioned in the "Use of uninitialized value in numeric gt" warning when C<$a> is undefined (since it is not part of the C<< > >> expression, but the operand of the C<@>) [perl #72090]. =item * Accessing an element of a package array with a hard-coded number (as opposed to an arbitrary expression) would crash if the array did not exist.
Usually the array would be autovivified during compilation, but typeglob manipulation could remove it, as in these two cases which used to crash: *d = *a; print $d[0]; undef *d; print $d[0]; =item * The B<-C> command-line option, when used on the shebang line, can now be followed by other options [perl #72434]. =item * The C<B> module was returning C<B::OP>s instead of C<B::LOGOP>s for C<entertry> [perl #80622]. This was due to a bug in the Perl core, not in C<B> itself. =back =head2 Stashes, Globs and Method Lookup Perl 5.10.0 introduced a new internal mechanism for caching MROs (method resolution orders, or lists of parent classes; aka "isa" caches) to make method lookup faster (so C<@ISA> arrays would not have to be searched repeatedly). Unfortunately, this brought with it quite a few bugs. Almost all of these have been fixed now, along with a few MRO-related bugs that existed before 5.10.0: =over =item * The following used to have erratic effects on method resolution, because the "isa" caches were not reset or otherwise ended up listing the wrong classes. These have been fixed. =over =item Aliasing packages by assigning to globs [perl #77358] =item Deleting packages by deleting their containing stash elements =item Undefining the glob containing a package (C<undef *Foo::>) =item Undefining an ISA glob (C<undef *Foo::ISA>) =item Deleting an ISA stash element (C<delete $Foo::{ISA}>) =item Sharing @ISA arrays between classes (via C<*Foo::ISA = \@Bar::ISA> or C<*Foo::ISA = *Bar::ISA>) [perl #77238] =back C<undef *Foo::ISA> would even stop a new C<@Foo::ISA> array from updating caches. =item * Typeglob assignments would crash if the glob's stash no longer existed, so long as the glob assigned to were named C<ISA> or the glob on either side of the assignment contained a subroutine. =item * C<PL_isarev>, which is accessible to Perl via C<mro::get_isarev> is now updated properly when packages are deleted or removed from the C<@ISA> of other classes. 
This allows many packages to be created and deleted without causing a memory leak [perl #75176]. =back In addition, various other bugs related to typeglobs and stashes have been fixed: =over =item * Some work has been done on the internal pointers that link between symbol tables (stashes), typeglobs, and subroutines. This has the effect that various edge cases related to deleting stashes or stash entries (for example, C<%FOO:: = ()>), and complex typeglob or code-reference aliasing, will no longer crash the interpreter. =item * Assigning a reference to a glob copy now assigns to a glob slot instead of overwriting the glob with a scalar [perl #1804] [perl #77508]. =item * A bug when replacing the glob of a loop variable within the loop has been fixed [perl #21469]. This means the following code will no longer crash: for $x (...) { *x = *y; } =item * Assigning a glob to a PVLV used to convert it to a plain string. Now it works correctly, and a PVLV can hold a glob. This would happen when a nonexistent hash or array element was passed to a subroutine: sub { $_[0] = *foo }->($hash{key}); # $_[0] would have been the string "*main::foo" It also happened when a glob was assigned to, or returned from, an element of a tied array or hash [perl #36051]. =item * When trying to report C<Use of uninitialized value $Foo::BAR>, crashes could occur if the glob holding the global variable in question had been detached from its original stash by, for example, C<delete $::{"Foo::"}>. This has been fixed by disabling the reporting of variable names in those cases. =item * During the restoration of a localised typeglob on scope exit, any destructors called as a result would be able to see the typeglob in an inconsistent state, containing freed entries, which could result in a crash. This would affect code like this: local *@; eval { die bless [] }; # puts an object in $@ sub DESTROY { local $@; # boom } Now the glob entries are cleared before any destructors are called.
This also means that destructors can vivify entries in the glob. So Perl tries again and, if the entries are re-created too many times, dies with a "panic: gp_free ..." error message. =item * If a typeglob is freed while a subroutine attached to it is still referenced elsewhere, the subroutine is renamed to C<__ANON__> in the same package, unless the package has been undefined, in which case the C<__ANON__> package is used. This could cause packages to be sometimes autovivified, such as if the package had been deleted. Now this no longer occurs. The C<__ANON__> package is also now used when the original package is no longer attached to the symbol table. This avoids memory leaks in some cases [perl #87664]. =item * Subroutines and package variables inside a package whose name ends with C<::> can now be accessed with a fully qualified name. =back =head2 Unicode =over =item * What has become known as "the Unicode Bug" is almost completely resolved in this release. Under C<use feature 'unicode_strings'> (which is automatically selected by C<use 5.012> and above), the internal storage format of a string no longer affects the external semantics [perl #58182]. There are two known exceptions: =over =item 1 The now-deprecated, user-defined case-changing functions require utf8-encoded strings to operate. The CPAN module L<Unicode::Casing> has been written to replace this feature without its drawbacks, and the feature is scheduled to be removed in 5.16. =item 2 quotemeta() (and its in-line equivalent C<\Q>) can also give different results depending on whether a string is encoded in UTF-8. See L<perlunicode/The "Unicode Bug">. =back =item * Handling of Unicode non-character code points has changed. Previously they were mostly considered illegal, except that in some places only one of the 66 of them was known. The Unicode Standard considers them all legal, but forbids their "open interchange".
This is part of the change to allow internal use of any code point (see L</Core Enhancements>). Together, these changes resolve [perl #38722], [perl #51918], [perl #51936], and [perl #63446]. =item * Case-insensitive C<"/i"> regular expression matching of Unicode characters that match multiple characters now works much more as intended. For example "\N{LATIN SMALL LIGATURE FFI}" =~ /ffi/ui and "ffi" =~ /\N{LATIN SMALL LIGATURE FFI}/ui are both true. Previously, there were many bugs with this feature. What hasn't been fixed are the places where the pattern contains the multiple characters, but the characters are split up by other things, such as in "\N{LATIN SMALL LIGATURE FFI}" =~ /(f)(f)i/ui or "\N{LATIN SMALL LIGATURE FFI}" =~ /ffi*/ui or "\N{LATIN SMALL LIGATURE FFI}" =~ /[a-f][f-m][g-z]/ui None of these match. Also, this matching doesn't fully conform to the current Unicode Standard, which asks that the matching be made upon the NFD (Normalization Form Decomposed) of the text. However, as of this writing (April 2010), the Unicode Standard is currently in flux about what they will recommend doing with regard to such scenarios. It may be that they will throw out the whole concept of multi-character matches [perl #71736]. =item * Naming a deprecated character in C<\N{I<NAME>}> no longer leaks memory. =item * We fixed a bug that could cause C<\N{I<NAME>}> constructs followed by a single C<"."> to be parsed incorrectly [perl #74978] (5.12.1). =item * C<chop> now correctly handles characters above C<"\x{7fffffff}"> [perl #73246]. =item * Passing to C<index> an offset beyond the end of the string when the string is encoded internally in UTF8 no longer causes panics [perl #75898]. =item * warn() and die() now respect utf8-encoded scalars [perl #45549]. =item * Sometimes the UTF8 length cache would not be reset on a value returned by substr, causing C<length(substr($uni_string, ...))> to give wrong answers.
With C<${^UTF8CACHE}> set to -1, it would also produce a "panic" error message [perl #77692]. =back =head2 Ties, Overloading and Other Magic =over =item * Overloading now works properly in conjunction with tied variables. What formerly happened was that most ops checked their arguments for overloading I<before> checking for magic, so for example an overloaded object returned by a tied array access would usually be treated as not overloaded [RT #57012]. =item * Various instances of magic (like tie methods) being called on tied variables too many or too few times have been fixed: =over =item * C<< $tied->() >> did not always call FETCH [perl #8438]. =item * Filetest operators and C<y///> and C<tr///> were calling FETCH too many times. =item * The C<=> operator used to ignore magic on its right-hand side if the scalar happened to hold a typeglob (if a typeglob was the last thing returned from or assigned to a tied scalar) [perl #77498]. =item * Dereference operators used to ignore magic if the argument was a reference already (such as from a previous FETCH) [perl #72144]. =item * C<splice> now calls set-magic (so changes made by C<splice @ISA> are respected by method calls) [perl #78400]. =item * In-memory files created by C<< open($fh, ">", \$buffer) >> were not calling FETCH/STORE at all [perl #43789] (5.12.2). =item * utf8::is_utf8() now respects get-magic (like C<$1>) (5.12.1). =back =item * Non-commutative binary operators used to swap their operands if the same tied scalar was used for both operands and returned a different value for each FETCH. For instance, if C<$t> returned 2 the first time and 3 the second, then C<$t/$t> would evaluate to 1.5. This has been fixed [perl #87708]. =item * String C<eval> now detects taintedness of overloaded or tied arguments [perl #75716]. =item * String C<eval> and regular expression matches against objects with string overloading no longer cause memory corruption or crashes [perl #77084]. 
=item * L<readline|perlfunc/"readline EXPR"> now honors C<< <> >> overloading on tied arguments. =item * C<< <expr> >> always respects overloading now if the expression is overloaded. Because "S<< <> as >> glob" was parsed differently from "S<< <> as >> filehandle" from 5.6 onwards, something like C<< <$foo[0]> >> did not handle overloading, even if C<$foo[0]> was an overloaded object. This was contrary to the documentation for L<overload>, and meant that C<< <> >> could not be used as a general overloaded iterator operator. =item * The fallback behaviour of overloading on binary operators was asymmetric [perl #71286]. =item * Magic applied to variables in the main package no longer affects other packages. See L</Magic variables outside the main package> above [perl #76138]. =item * Sometimes magic (ties, taintedness, etc.) attached to variables could cause an object to last longer than it should, or cause a crash if a tied variable were freed from within a tie method. These have been fixed [perl #81230]. =item * DESTROY methods of objects implementing ties are no longer able to crash by accessing the tied variable through a weak reference [perl #86328]. =item * Fixed a regression of kill() when a match variable is used for the process ID to kill [perl #75812]. =item * C<$AUTOLOAD> used to remain tainted forever if it ever became tainted. Now it is correctly untainted if an autoloaded method is called and the method name was not tainted. =item * C<sprintf> now dies when passed a tainted scalar for the format. It did already die for arbitrary expressions, but not for simple scalars [perl #82250]. =item * C<lc>, C<uc>, C<lcfirst>, and C<ucfirst> no longer return untainted strings when the argument is tainted. This has been broken since perl 5.8.9 [perl #87336]. =back =head2 The Debugger =over =item * The Perl debugger now also works in taint mode [perl #76872]. =item * Subroutine redefinition works once more in the debugger [perl #48332]. 
=item * When B<-d> is used on the shebang (C<#!>) line, the debugger now has access to the lines of the main program. In the past, this sometimes worked and sometimes did not, depending on the order in which things happened to be arranged in memory [perl #71806]. =item * A possible memory leak when using L<caller()|perlfunc/"caller EXPR"> to set C<@DB::args> has been fixed (5.12.2). =item * Perl no longer stomps on C<$DB::single>, C<$DB::trace>, and C<$DB::signal> if these variables already have values when C<$^P> is assigned to [perl #72422]. =item * C<#line> directives in string evals were not properly updating the arrays of lines of code (C<< @{"_< ..."} >>) that the debugger (or any debugging or profiling module) uses. In threaded builds, they were not being updated at all. In non-threaded builds, the line number was ignored, so any change to the existing line number would cause the lines to be misnumbered [perl #79442]. =back =head2 Threads =over =item * Perl no longer accidentally clones lexicals in scope within active stack frames in the parent when creating a child thread [perl #73086]. =item * Several memory leaks in cloning and freeing threaded Perl interpreters have been fixed [perl #77352]. =item * Creating a new thread when directory handles were open used to cause a crash, because the handles were not cloned, but simply passed to the new thread, resulting in a double free. Now directory handles are cloned properly on Windows and on systems that have a C<fchdir> function. On other systems, new threads simply do not inherit directory handles from their parent threads [perl #75154]. =item * The typeglob C<*,>, which holds the scalar variable C<$,> (output field separator), had the wrong reference count in child threads. =item * [perl #78494] When pipes are shared between threads, the C<close> function (and any implicit close, such as on thread exit) no longer blocks. 
=item * Perl now does a timely cleanup of SVs that are cloned into a new thread but then discovered to be orphaned (that is, their owners are I<not> cloned). This eliminates several "scalars leaked" warnings when joining threads. =back =head2 Scoping and Subroutines =over =item * Lvalue subroutines are again able to return copy-on-write scalars. This had been broken since version 5.10.0 [perl #75656] (5.12.3). =item * C<require> no longer causes C<caller> to return the wrong file name for the scope that called C<require> and other scopes higher up that had the same file name [perl #68712]. =item * C<sort> with a C<($$)>-prototyped comparison routine used to cause the value of C<@_> to leak out of the sort. Taking a reference to C<@_> within the sorting routine could cause a crash [perl #72334]. =item * Match variables (like C<$1>) no longer persist between calls to a sort subroutine [perl #76026]. =item * Iterating with C<foreach> over an array returned by an lvalue sub now works [perl #23790]. =item * C<$@> is now localised during calls to C<binmode> to prevent action at a distance [perl #78844]. =item * Calling a closure prototype (what is passed to an attribute handler for a closure) now results in a "Closure prototype called" error message instead of a crash [perl #68560]. =item * Mentioning a read-only lexical variable from the enclosing scope in a string C<eval> no longer causes the variable to become writable [perl #19135]. =back =head2 Signals =over =item * Within signal handlers, C<$!> is now implicitly localized. =item * CHLD signals are no longer unblocked after a signal handler is called if they were blocked before by C<POSIX::sigprocmask> [perl #82040]. =item * A signal handler called within a signal handler could cause leaks or double-frees. Now fixed [perl #76248]. =back =head2 Miscellaneous Memory Leaks =over =item * Several memory leaks when loading XS modules were fixed (5.12.2). 
=item * L<substr()|perlfunc/"substr EXPR,OFFSET,LENGTH,REPLACEMENT">, L<pos()|perlfunc/"pos SCALAR">, L<keys()|perlfunc/"keys HASH">, and L<vec()|perlfunc/"vec EXPR,OFFSET,BITS"> could, when used in combination with lvalues, result in leaking the scalar value they operate on, and cause its destruction to happen too late. This has now been fixed. =item * The postincrement and postdecrement operators, C<++> and C<-->, used to cause leaks when used on references. This has now been fixed. =item * Nested C<map> and C<grep> blocks no longer leak memory when processing large lists [perl #48004]. =item * C<use I<VERSION>> and C<no I<VERSION>> no longer leak memory [perl #78436] [perl #69050]. =item * C<.=> followed by C<< <> >> or C<readline> would leak memory if C<$/> contained characters beyond the octet range and the scalar assigned to happened to be encoded as UTF8 internally [perl #72246]. =item * C<eval 'BEGIN{die}'> no longer leaks memory on non-threaded builds. =back =head2 Memory Corruption and Crashes =over =item * glob() no longer crashes when C<%File::Glob::> is empty and C<CORE::GLOBAL::glob> isn't present [perl #75464] (5.12.2). =item * readline() has been fixed when interrupted by signals so it no longer returns the "same thing" as before or random memory. =item * When assigning a list with duplicated keys to a hash, the assignment used to return garbage and/or freed values: @a = %h = (list with some duplicate keys); This has now been fixed [perl #31865]. =item * The mechanism for freeing objects in globs used to leave dangling pointers to freed SVs, meaning Perl users could see corrupted state during destruction. Perl now frees only the affected slots of the GV, rather than freeing the GV itself. This makes sure that there are no dangling refs or corrupted state during destruction. =item * The interpreter no longer crashes when freeing deeply-nested arrays of arrays. Hashes have not been fixed yet [perl #44225].
=item * Concatenating long strings under C<use encoding> no longer causes Perl to crash [perl #78674]. =item * Calling C<< ->import >> on a class lacking an import method could corrupt the stack, resulting in strange behaviour. For instance, push @a, "foo", $b = bar->import; would assign "foo" to C<$b> [perl #63790]. =item * The C<recv> function could crash when called with the MSG_TRUNC flag [perl #75082]. =item * C<formline> no longer crashes when passed a tainted format picture. It also taints C<$^A> now if its arguments are tainted [perl #79138]. =item * A bug in how we process filetest operations could cause a segfault. Filetests don't always expect an op on the stack, so we now use TOPs only if we're sure that we're not C<stat>ing the C<_> filehandle. This is indicated by C<OPf_KIDS> (as checked in ck_ftst) [perl #74542] (5.12.1). =item * unpack() now handles scalar context correctly for C<%32H> and C<%32u>, fixing a potential crash. split() would crash because the third item on the stack wasn't the regular expression it expected. C<unpack("%2H", ...)> would return both the unpacked result and the checksum on the stack, as would C<unpack("%2u", ...)> [perl #73814] (5.12.2). =back =head2 Fixes to Various Perl Operators =over =item * The C<&>, C<|>, and C<^> bitwise operators no longer coerce read-only arguments [perl #20661]. =item * Stringifying a scalar containing "-0.0" no longer has the effect of turning false into true [perl #45133]. =item * Some numeric operators were converting integers to floating point, resulting in loss of precision on 64-bit platforms [perl #77456]. =item * sprintf() was ignoring locales when called with constant arguments [perl #78632]. =item * Combining the vector (C<%v>) flag and dynamic precision would cause C<sprintf> to confuse the order of its arguments, making it treat the string as the precision and vice-versa [perl #83194]. 
=back =head2 Bugs Relating to the C API =over =item * The C-level C<lex_stuff_pvn> function would sometimes cause a spurious syntax error on the last line of the file if it lacked a final semicolon [perl #74006] (5.12.1). =item * The C<eval_sv> and C<eval_pv> C functions now set C<$@> correctly when there is a syntax error and no C<G_KEEPERR> flag, and never set it if the C<G_KEEPERR> flag is present [perl #3719]. =item * The XS multicall API no longer causes subroutines to lose reference counts if called via the multicall interface from within those very subroutines. This affects modules like L<List::Util>. Calling one of its functions with an active subroutine as the first argument could cause a crash [perl #78070]. =item * The C<SvPVbyte> function available to XS modules now calls magic before downgrading the SV, to avoid warnings about wide characters [perl #72398]. =item * The ref types in the typemap for XS bindings now support magical variables [perl #72684]. =item * C<sv_catsv_flags> no longer calls C<mg_get> on its second argument (the source string) if the flags passed to it do not include SV_GMAGIC. So it now matches the documentation. =item * C<my_strftime> no longer leaks memory. This fixes a memory leak in C<POSIX::strftime> [perl #73520]. =item * F<XSUB.h> now correctly redefines fgets under PERL_IMPLICIT_SYS [perl #55049] (5.12.1). =item * XS code using fputc() or fputs() on Windows could cause an error due to their arguments being swapped [perl #72704] (5.12.1). =item * A possible segfault in the C<T_PTROBJ> default typemap has been fixed (5.12.2). =item * A bug that could cause "Unknown error" messages when C<call_sv(code, G_EVAL)> is called from an XS destructor has been fixed (5.12.2). =back =head1 Known Problems This is a list of significant unresolved issues which are regressions from earlier versions of Perl or which affect widely-used CPAN modules. 
=over 4 =item * C<List::Util::first> misbehaves in the presence of a lexical C<$_> (typically introduced by C<my $_> or implicitly by C<given>). The variable that gets set for each iteration is the package variable C<$_>, not the lexical C<$_>. A similar issue may occur in other modules that provide functions which take a block as their first argument, like foo { ... $_ ...} list See also: L<http://rt.perl.org/rt3/Public/Bug/Display.html?id=67694> =item * readline() returns an empty string instead of a cached previous value when it is interrupted by a signal =item * The changes in prototype handling break L<Switch>. A patch has been sent upstream and will hopefully appear on CPAN soon. =item * The upgrade to F<ExtUtils-MakeMaker-6.57_05> has caused some tests in the F<Module-Install> distribution on CPAN to fail. (Specifically, F<02_mymeta.t> tests 5 and 21; F<18_all_from.t> tests 6 and 15; F<19_authors.t> tests 5, 13, 21, and 29; and F<20_authors_with_special_characters.t> tests 6, 15, and 23 in version 1.00 of that distribution now fail.) =item * On VMS, C<Time::HiRes> tests will fail due to a bug in the CRTL's implementation of C<setitimer>: previous timer values would be cleared if a timer expired but not if the timer was reset before expiring. HP OpenVMS Engineering have corrected the problem and will release a patch in due course (Quix case # QXCM1001115136). =item * On VMS, there were a handful of C<Module::Build> test failures we didn't get to before the release; please watch CPAN for updates. =back =head1 Errata =head2 keys(), values(), and each() work on arrays You can now use the keys(), values(), and each() builtins on arrays; previously you could use them only on hashes. See L<perlfunc> for details. This is actually a change introduced in perl 5.12.0, but it was missed from that release's L<perl5120delta>. =head2 split() and C<@_> split() no longer modifies C<@_> when called in scalar or void context. 
In void context it now produces a "Useless use of split" warning. This was also a perl 5.12.0 change that missed the perldelta. =head1 Obituary Randy Kobes, creator of http://kobesearch.cpan.org/ and contributor/maintainer to several core Perl toolchain modules, passed away on September 18, 2010 after a battle with lung cancer. The community was richer for his involvement. He will be missed. =head1 Acknowledgements Perl 5.14.0 represents one year of development since Perl 5.12.0 and contains nearly 550,000 lines of changes across nearly 3,000 files from 150 authors and committers. Perl continues to flourish into its third decade thanks to a vibrant community of users and developers. The following people are known to have contributed the improvements that became Perl 5.14.0: Aaron Crane, Abhijit Menon-Sen, Abigail, Ævar Arnfjörð Bjarmason, Alastair Douglas, Alexander Alekseev, Alexander Hartmaier, Alexandr Ciornii, Alex Davies, Alex Vandiver, Ali Polatel, Allen Smith, Andreas König, Andrew Rodland, Andy Armstrong, Andy Dougherty, Aristotle Pagaltzis, Arkturuz, Arvan, A. Sinan Unur, Ben Morrow, Bo Lindbergh, Boris Ratner, Brad Gilbert, Bram, brian d foy, Brian Phillips, Casey West, Charles Bailey, Chas. Owens, Chip Salzenberg, Chris 'BinGOs' Williams, chromatic, Craig A. Berry, Curtis Jewell, Dagfinn Ilmari Mannsåker, Dan Dascalescu, Dave Rolsky, David Caldwell, David Cantrell, David Golden, David Leadbeater, David Mitchell, David Wheeler, Eric Brine, Father Chrysostomos, Fingle Nark, Florian Ragwitz, Frank Wiegand, Franz Fasching, Gene Sullivan, George Greer, Gerard Goossen, Gisle Aas, Goro Fuji, Grant McLean, gregor herrmann, H.Merijn Brand, Hongwen Qiu, Hugo van der Sanden, Ian Goodacre, James E Keenan, James Mastros, Jan Dubois, Jay Hannah, Jerry D. 
Hedden, Jesse Vincent, Jim Cromie, Jirka Hruška, John Peacock, Joshua ben Jore, Joshua Pritikin, Karl Williamson, Kevin Ryde, kmx, Lars Dɪᴇᴄᴋᴏᴡ 迪拉斯, Larwan Berke, Leon Brocard, Leon Timmermans, Lubomir Rintel, Lukas Mai, Maik Hentsche, Marty Pauley, Marvin Humphrey, Matt Johnson, Matt S Trout, Max Maischein, Michael Breen, Michael Fig, Michael G Schwern, Michael Parker, Michael Stevens, Michael Witten, Mike Kelly, Moritz Lenz, Nicholas Clark, Nick Cleaton, Nick Johnston, Nicolas Kaiser, Niko Tyni, Noirin Shirley, Nuno Carvalho, Paul Evans, Paul Green, Paul Johnson, Paul Marquess, Peter J. Holzer, Peter John Acklam, Peter Martini, Philippe Bruhat (BooK), Piotr Fusik, Rafael Garcia-Suarez, Rainer Tammer, Reini Urban, Renee Baecker, Ricardo Signes, Richard Möhn, Richard Soderberg, Rob Hoelz, Robin Barker, Ruslan Zakirov, Salvador Fandiño, Salvador Ortiz Garcia, Shlomi Fish, Sinan Unur, Sisyphus, Slaven Rezic, Steffen Müller, Steve Hay, Steven Schubiger, Steve Peters, Sullivan Beck, Tatsuhiko Miyagawa, Tim Bunce, Todd Rinaldo, Tom Christiansen, Tom Hukins, Tony Cook, Tye McQueen, Vadim Konovalov, Vernon Lyon, Vincent Pit, Walt Mankowski, Wolfram Humann, Yves Orton, Zefram, and Zsbán Ambrus. This is woefully incomplete as it's automatically generated from version control history. In particular, it doesn't include the names of the (very much appreciated) contributors who reported issues in previous versions of Perl that helped make Perl 5.14.0 better. For a more complete list of all of Perl's historical contributors, please see the C<AUTHORS> file in the Perl 5.14.0 distribution. Many of the changes included in this version originated in the CPAN modules included in Perl's core. We're grateful to the entire CPAN community for helping Perl to flourish. =head1 Reporting Bugs If you find what you think is a bug, you might check the articles recently posted to the comp.lang.perl.misc newsgroup and the Perl bug database at http://rt.perl.org/perlbug/ . 
There may also be information at http://www.perl.org/ , the Perl Home Page. If you believe you have an unreported bug, please run the L<perlbug> program included with your release. Be sure to trim your bug down to a tiny but sufficient test case. Your bug report, along with the output of C<perl -V>, will be sent off to perlbug@perl.org to be analysed by the Perl porting team. If the bug you are reporting has security implications, which make it inappropriate to send to a publicly archived mailing list, then please send it to perl5-security-report@perl.org. This points to a closed subscription unarchived mailing list, which includes all the core committers, who are able to help assess the impact of issues, figure out a resolution, and help co-ordinate the release of patches to mitigate or fix the problem across all platforms on which Perl is supported. Please use this address for security issues in the Perl core I<only>, not for modules independently distributed on CPAN. =head1 SEE ALSO The F<Changes> file for an explanation of how to view exhaustive details on what changed. The F<INSTALL> file for how to build Perl. The F<README> file for general stuff. The F<Artistic> and F<Copying> files for copyright information. =cut =encoding utf8 =for comment Consistent formatting of this file is achieved with: perl ./Porting/podtidy pod/perlootut.pod =head1 NAME perlootut - Object-Oriented Programming in Perl Tutorial =head1 DATE This document was created in February, 2011, and the last major revision was in February, 2013. If you are reading this in the future then it's possible that the state of the art has changed. We recommend you start by reading the perlootut document in the latest stable release of Perl, rather than this version. =head1 DESCRIPTION This document provides an introduction to object-oriented programming in Perl. It begins with a brief overview of the concepts behind object oriented design.
Then it introduces several different OO systems from L<CPAN|https://www.cpan.org> which build on top of what Perl provides. By default, Perl's built-in OO system is very minimal, leaving you to do most of the work. This minimalism made a lot of sense in 1994, but in the years since Perl 5.0 we've seen a number of common patterns emerge in Perl OO. Fortunately, Perl's flexibility has allowed a rich ecosystem of Perl OO systems to flourish. If you want to know how Perl OO works under the hood, the L<perlobj> document explains the nitty gritty details. This document assumes that you already understand the basics of Perl syntax, variable types, operators, and subroutine calls. If you don't understand these concepts yet, please read L<perlintro> first. You should also read the L<perlsyn>, L<perlop>, and L<perlsub> documents. =head1 OBJECT-ORIENTED FUNDAMENTALS Most object systems share a number of common concepts. You've probably heard terms like "class", "object", "method", and "attribute" before. Understanding the concepts will make it much easier to read and write object-oriented code. If you're already familiar with these terms, you should still skim this section, since it explains each concept in terms of Perl's OO implementation. Perl's OO system is class-based. Class-based OO is fairly common. It's used by Java, C++, C#, Python, Ruby, and many other languages. There are other object orientation paradigms as well. JavaScript is the most popular language to use another paradigm. JavaScript's OO system is prototype-based. =head2 Object An B<object> is a data structure that bundles together data and subroutines which operate on that data. An object's data is called B<attributes>, and its subroutines are called B<methods>. An object can be thought of as a noun (a person, a web service, a computer). An object represents a single discrete thing. For example, an object might represent a file.
The attributes for a file object might include its path, content, and last modification time. If we created an object to represent F</etc/hostname> on a machine named "foo.example.com", that object's path would be "/etc/hostname", its content would be "foo\n", and its last modification time would be 1304974868 seconds since the beginning of the epoch. The methods associated with a file might include C<rename()> and C<write()>. In Perl most objects are hashes, but the OO systems we recommend keep you from having to worry about this. In practice, it's best to consider an object's internal data structure opaque. =head2 Class A B<class> defines the behavior of a category of objects. A class is a name for a category (like "File"), and a class also defines the behavior of objects in that category. All objects belong to a specific class. For example, our F</etc/hostname> object belongs to the C<File> class. When we want to create a specific object, we start with its class, and B<construct> or B<instantiate> an object. A specific object is often referred to as an B<instance> of a class. In Perl, any package can be a class. The difference between a package which is a class and one which isn't is based on how the package is used. Here's our "class declaration" for the C<File> class: package File; In Perl, there is no special keyword for constructing an object. However, most OO modules on CPAN use a method named C<new()> to construct a new object: my $hostname = File->new( path => '/etc/hostname', content => "foo\n", last_mod_time => 1304974868, ); (Don't worry about that C<< -> >> operator, it will be explained later.) =head3 Blessing As we said earlier, most Perl objects are hashes, but an object can be an instance of any Perl data type (scalar, array, etc.). Turning a plain data structure into an object is done by B<blessing> that data structure using Perl's C<bless> function.
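As a rough sketch of what blessing looks like in practice, the snippet below blesses a plain hash reference into the C<File> class. The hand-written C<new()> and the variable names C<$plain> and C<$hostname> are assumptions made for illustration only; they are not a recommended way to build classes.

```perl
use strict;
use warnings;
use Scalar::Util 'blessed';

package File;

# Hypothetical hand-rolled constructor, shown only to illustrate bless.
sub new {
    my ($class, %args) = @_;
    my $self = { %args };          # a plain hash reference holds the data
    return bless $self, $class;    # bless associates that hash with the class
}

package main;

my $plain    = { path => '/etc/hostname' };           # never blessed
my $hostname = File->new( path => '/etc/hostname' );  # blessed into File

print defined blessed($plain) ? "object\n" : "plain hash\n";   # plain hash
print blessed($hostname), "\n";                                 # File
```

The only difference between the two hash references is the call to C<bless>; the underlying data is the same.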
While we strongly suggest you don't build your objects from scratch, you should know the term B<bless>. A B<blessed> data structure (aka "a referent") is an object. We sometimes say that an object has been "blessed into a class". Once a referent has been blessed, the C<blessed> function from the L<Scalar::Util> core module can tell us its class name. This subroutine returns an object's class when passed an object, and false otherwise. use Scalar::Util 'blessed'; print blessed($hash); # undef print blessed($hostname); # File =head3 Constructor A B<constructor> creates a new object. In Perl, a class's constructor is just another method, unlike some other languages, which provide syntax for constructors. Most Perl classes use C<new> as the name for their constructor: my $file = File->new(...); =head2 Methods You already learned that a B<method> is a subroutine that operates on an object. You can think of a method as the things that an object can I<do>. If an object is a noun, then methods are its verbs (save, print, open). In Perl, methods are simply subroutines that live in a class's package. Methods are always written to receive the object as their first argument: sub print_info { my $self = shift; print "This file is at ", $self->path, "\n"; } $file->print_info; # This file is at /etc/hostname What makes a method special is I<how it's called>. The arrow operator (C<< -> >>) tells Perl that we are calling a method. When we make a method call, Perl arranges for the method's B<invocant> to be passed as the first argument. B<Invocant> is a fancy name for the thing on the left side of the arrow. The invocant can either be a class name or an object. We can also pass additional arguments to the method: sub print_info { my $self = shift; my $prefix = shift // "This file is at "; print $prefix, $self->path, "\n"; } $file->print_info("The file is located at "); # The file is located at /etc/hostname =head2 Attributes Each class can define its B<attributes>.
When we instantiate an object, we assign values to those attributes.
For example, every C<File> object has a path. Attributes are sometimes
called B<properties>.

Perl has no special syntax for attributes. Under the hood, attributes
are often stored as keys in the object's underlying hash, but don't
worry about this.

We recommend that you only access attributes via B<accessor> methods.
These are methods that can get or set the value of each attribute. We
saw this earlier in the C<print_info()> example, which calls
C<< $self->path >>.

You might also see the terms B<getter> and B<setter>. These are two
types of accessors. A getter gets the attribute's value, while a setter
sets it. Another term for a setter is B<mutator>.

Attributes are typically defined as read-only or read-write. Read-only
attributes can only be set when the object is first created, while
read-write attributes can be altered at any time.

The value of an attribute may itself be another object. For example,
instead of returning its last mod time as a number, the C<File> class
could return a L<DateTime> object representing that value.

It's possible to have a class that does not expose any publicly
settable attributes. Not every class has attributes and methods.

=head2 Polymorphism

B<Polymorphism> is a fancy way of saying that objects from two
different classes share an API. For example, we could have C<File> and
C<WebPage> classes which both have a C<print_content()> method. This
method might produce different output for each class, but they share a
common interface.

While the two classes may differ in many ways, when it comes to the
C<print_content()> method, they are the same. This means that we can
try to call the C<print_content()> method on an object of either class,
and B<we don't have to know what class the object belongs to!>

Polymorphism is one of the key concepts of object-oriented design.

=head2 Inheritance

B<Inheritance> lets you create a specialized version of an existing
class.
Inheritance lets the new class reuse the methods and attributes of
another class.

For example, we could create a C<File::MP3> class which B<inherits>
from C<File>. A C<File::MP3> B<is-a> I<more specific> type of C<File>.
All mp3 files are files, but not all files are mp3 files.

We often refer to inheritance relationships as B<parent-child> or
C<superclass>/C<subclass> relationships. Sometimes we say that the
child has an B<is-a> relationship with its parent class.

C<File> is a B<superclass> of C<File::MP3>, and C<File::MP3> is a
B<subclass> of C<File>.

  package File::MP3;

  use parent 'File';

The L<parent> module is one of several ways that Perl lets you define
inheritance relationships.

Perl allows multiple inheritance, which means that a class can inherit
from multiple parents. While this is possible, we strongly recommend
against it. Generally, you can use B<roles> to do everything you can do
with multiple inheritance, but in a cleaner way.

Note that there's nothing wrong with defining multiple subclasses of a
given class. This is both common and safe. For example, we might define
C<File::MP3::FixedBitrate> and C<File::MP3::VariableBitrate> classes to
distinguish between different types of mp3 file.

=head3 Overriding methods and method resolution

Inheritance allows two classes to share code. By default, every method
in the parent class is also available in the child. The child can
explicitly B<override> a parent's method to provide its own
implementation.
For example, if we have a C<File::MP3> object, it has the
C<print_info()> method from C<File>:

  my $cage = File::MP3->new(
      path          => 'mp3s/My-Body-Is-a-Cage.mp3',
      content       => $mp3_data,
      last_mod_time => 1304974868,
      title         => 'My Body Is a Cage',
  );

  $cage->print_info;
  # This file is at mp3s/My-Body-Is-a-Cage.mp3

If we wanted to include the mp3's title in the output, we could
override the method:

  package File::MP3;

  use parent 'File';

  sub print_info {
      my $self = shift;

      print "This file is at ", $self->path, "\n";
      print "Its title is ", $self->title, "\n";
  }

  $cage->print_info;
  # This file is at mp3s/My-Body-Is-a-Cage.mp3
  # Its title is My Body Is a Cage

The process of determining what method should be used is called
B<method resolution>. What Perl does is look at the object's class
first (C<File::MP3> in this case). If that class defines the method,
then that class's version of the method is called. If not, Perl looks
at each parent class in turn. For C<File::MP3>, its only parent is
C<File>. If C<File::MP3> does not define the method, but C<File> does,
then Perl calls the method in C<File>.

If C<File> inherited from C<DataSource>, which inherited from C<Thing>,
then Perl would keep looking "up the chain" if necessary.

It is possible to explicitly call a parent method from a child:

  package File::MP3;

  use parent 'File';

  sub print_info {
      my $self = shift;

      $self->SUPER::print_info();
      print "Its title is ", $self->title, "\n";
  }

The C<SUPER::> bit tells Perl to look for the C<print_info()> in the
C<File::MP3> class's inheritance chain. When it finds the parent class
that implements this method, the method is called.

We mentioned multiple inheritance earlier. The main problem with
multiple inheritance is that it greatly complicates method resolution.
See L<perlobj> for more details.

=head2 Encapsulation

B<Encapsulation> is the idea that an object is opaque.
When another developer uses your class, they don't need to know I<how>
it is implemented, they just need to know I<what> it does.

Encapsulation is important for several reasons. First, it allows you to
separate the public API from the private implementation. This means you
can change that implementation without breaking the API.

Second, when classes are well encapsulated, they become easier to
subclass. Ideally, a subclass uses the same APIs to access object data
that its parent class uses. In reality, subclassing sometimes involves
violating encapsulation, but a good API can minimize the need to do
this.

We mentioned earlier that most Perl objects are implemented as hashes
under the hood. The principle of encapsulation tells us that we should
not rely on this. Instead, we should use accessor methods to access the
data in that hash. The object systems that we recommend below all
automate the generation of accessor methods. If you use one of them,
you should never have to access the object as a hash directly.

=head2 Composition

In object-oriented code, we often find that one object references
another object. This is called B<composition>, or a B<has-a>
relationship.

Earlier, we mentioned that the C<File> class's C<last_mod_time>
accessor could return a L<DateTime> object. This is a perfect example
of composition. We could go even further, and make the C<path> and
C<content> accessors return objects as well. The C<File> class would
then be B<composed> of several other objects.

=head2 Roles

B<Roles> are something that a class I<does>, rather than something that
it I<is>. Roles are relatively new to Perl, but have become rather
popular. Roles are B<applied> to classes. Sometimes we say that classes
B<consume> roles.

Roles are an alternative to inheritance for providing polymorphism.
Let's assume we have two classes, C<Radio> and C<Computer>. Both of
these things have on/off switches. We want to model that in our class
definitions.

We could have both classes inherit from a common parent, like
C<Machine>, but not all machines have on/off switches. We could create
a parent class called C<HasOnOffSwitch>, but that is very artificial.
Radios and computers are not specializations of this parent. This
parent is really a rather ridiculous creation.

This is where roles come in. It makes a lot of sense to create a
C<HasOnOffSwitch> role and apply it to both classes. This role would
define a known API like providing C<turn_on()> and C<turn_off()>
methods.

Perl does not have any built-in way to express roles. In the past,
people just bit the bullet and used multiple inheritance. Nowadays,
there are several good choices on CPAN for using roles.

=head2 When to Use OO

Object Orientation is not the best solution to every problem. In I<Perl
Best Practices> (copyright 2004, Published by O'Reilly Media, Inc.),
Damian Conway provides a list of criteria to use when deciding if OO is
the right fit for your problem:

=over 4

=item *

The system being designed is large, or is likely to become large.

=item *

The data can be aggregated into obvious structures, especially if
there's a large amount of data in each aggregate.

=item *

The various types of data aggregate form a natural hierarchy that
facilitates the use of inheritance and polymorphism.

=item *

You have a piece of data on which many different operations are
applied.

=item *

You need to perform the same general operations on related types of
data, but with slight variations depending on the specific type of data
the operations are applied to.

=item *

It's likely you'll have to add new data types later.

=item *

The typical interactions between pieces of data are best represented by
operators.

=item *

The implementation of individual components of the system is likely to
change over time.

=item *

The system design is already object-oriented.

=item *

Large numbers of other programmers will be using your code modules.

=back

=head1 PERL OO SYSTEMS

As we mentioned before, Perl's built-in OO system is very minimal, but
also quite flexible. Over the years, many people have developed systems
which build on top of Perl's built-in system to provide more features
and convenience.

We strongly recommend that you use one of these systems. Even the most
minimal of them eliminates a lot of repetitive boilerplate. There's
really no good reason to write your classes from scratch in Perl.

If you are interested in the guts underlying these systems, check out
L<perlobj>.

=head2 Moose

L<Moose> bills itself as a "postmodern object system for Perl 5". Don't
be scared, the "postmodern" label is a callback to Larry's description
of Perl as "the first postmodern computer language".

C<Moose> provides a complete, modern OO system. Its biggest influence
is the Common Lisp Object System, but it also borrows ideas from
Smalltalk and several other languages. C<Moose> was created by Stevan
Little, and draws heavily from his work on the Raku OO design.

Here is our C<File> class using C<Moose>:

  package File;
  use Moose;

  has path          => ( is => 'ro' );
  has content       => ( is => 'ro' );
  has last_mod_time => ( is => 'ro' );

  sub print_info {
      my $self = shift;

      print "This file is at ", $self->path, "\n";
  }

C<Moose> provides a number of features:

=over 4

=item * Declarative sugar

C<Moose> provides a layer of declarative "sugar" for defining classes.
That sugar is just a set of exported functions that make declaring how
your class works simpler and more palatable. This lets you describe
I<what> your class is, rather than having to tell Perl I<how> to
implement your class.

The C<has()> subroutine declares an attribute, and C<Moose>
automatically creates accessors for these attributes. It also takes
care of creating a C<new()> method for you. This constructor knows
about the attributes you declared, so you can set them when creating a
new C<File>.

=item * Roles built-in

C<Moose> lets you define roles the same way you define classes:

  package HasOnOffSwitch;
  use Moose::Role;

  has is_on => (
      is  => 'rw',
      isa => 'Bool',
  );

  sub turn_on {
      my $self = shift;
      $self->is_on(1);
  }

  sub turn_off {
      my $self = shift;
      $self->is_on(0);
  }

=item * A miniature type system

In the example above, you can see that we passed
C<< isa => 'Bool' >> to C<has()> when creating our C<is_on> attribute.
This tells C<Moose> that this attribute must be a boolean value. If we
try to set it to an invalid value, our code will throw an error.

=item * Full introspection and manipulation

Perl's built-in introspection features are fairly minimal. C<Moose>
builds on top of them and creates a full introspection layer for your
classes. This lets you ask questions like "what methods does the File
class implement?" It also lets you modify your classes
programmatically.

=item * Self-hosted and extensible

C<Moose> describes itself using its own introspection API. Besides
being a cool trick, this means that you can extend C<Moose> using
C<Moose> itself.

=item * Rich ecosystem

There is a rich ecosystem of C<Moose> extensions on CPAN under the
L<MooseX|https://metacpan.org/search?q=MooseX> namespace. In addition,
many modules on CPAN already use C<Moose>, providing you with lots of
examples to learn from.

=item * Many more features

C<Moose> is a very powerful tool, and we can't cover all of its
features here. We encourage you to learn more by reading the C<Moose>
documentation, starting with
L<Moose::Manual|https://metacpan.org/pod/Moose::Manual>.

=back

Of course, C<Moose> isn't perfect.

C<Moose> can make your code slower to load. C<Moose> itself is not
small, and it does a I<lot> of code generation when you define your
class. This code generation means that your runtime code is as fast as
it can be, but you pay for this when your modules are first loaded.

This load time hit can be a problem when startup speed is important,
such as with a command-line script or a "plain vanilla" CGI script that
must be loaded each time it is executed.

Before you panic, know that many people do use C<Moose> for
command-line tools and other startup-sensitive code. We encourage you
to try C<Moose> out first before worrying about startup speed.

C<Moose> also has several dependencies on other modules. Most of these
are small stand-alone modules, a number of which have been spun off
from C<Moose>. C<Moose> itself, and some of its dependencies, require a
compiler. If you need to install your software on a system without a
compiler, or if having I<any> dependencies is a problem, then C<Moose>
may not be right for you.

=head3 Moo

If you try C<Moose> and find that one of these issues is preventing you
from using C<Moose>, we encourage you to consider L<Moo> next. C<Moo>
implements a subset of C<Moose>'s functionality in a simpler package.
For most features that it does implement, the end-user API is
I<identical> to C<Moose>, meaning you can switch from C<Moo> to
C<Moose> quite easily.

C<Moo> does not implement most of C<Moose>'s introspection API, so it's
often faster when loading your modules. Additionally, none of its
dependencies require XS, so it can be installed on machines without a
compiler.

One of C<Moo>'s most compelling features is its interoperability with
C<Moose>. When someone tries to use C<Moose>'s introspection API on a
C<Moo> class or role, it is transparently inflated into a C<Moose>
class or role. This makes it easier to incorporate C<Moo>-using code
into a C<Moose> code base and vice versa.

For example, a C<Moose> class can subclass a C<Moo> class using
C<extends> or consume a C<Moo> role using C<with>.

The C<Moose> authors hope that one day C<Moo> can be made obsolete by
improving C<Moose> enough, but for now it provides a worthwhile
alternative to C<Moose>.

=head2 Class::Accessor

L<Class::Accessor> is the polar opposite of C<Moose>. It provides very
few features and is not self-hosting.

It is, however, very simple, pure Perl, and it has no non-core
dependencies. It also provides a "Moose-like" API on demand for the
features it supports.

Even though it doesn't do much, it is still preferable to writing your
own classes from scratch.

Here's our C<File> class with C<Class::Accessor>:

  package File;
  use Class::Accessor 'antlers';

  has path          => ( is => 'ro' );
  has content       => ( is => 'ro' );
  has last_mod_time => ( is => 'ro' );

  sub print_info {
      my $self = shift;

      print "This file is at ", $self->path, "\n";
  }

The C<antlers> import flag tells C<Class::Accessor> that you want to
define your attributes using C<Moose>-like syntax. The only parameter
that you can pass to C<has> is C<is>. We recommend that you use this
Moose-like syntax if you choose C<Class::Accessor> since it means you
will have a smoother upgrade path if you later decide to move to
C<Moose>.

Like C<Moose>, C<Class::Accessor> generates accessor methods and a
constructor for your class.

=head2 Class::Tiny

Finally, we have L<Class::Tiny>. This module truly lives up to its
name. It has an incredibly minimal API and, on any recent Perl,
absolutely no dependencies. Still, we think it's a lot easier to use
than writing your own OO code from scratch.

Here's our C<File> class once more:

  package File;
  use Class::Tiny qw( path content last_mod_time );

  sub print_info {
      my $self = shift;

      print "This file is at ", $self->path, "\n";
  }

That's it!

With C<Class::Tiny>, all accessors are read-write. It generates a
constructor for you, as well as the accessors you define.

You can also use L<Class::Tiny::Antlers> for C<Moose>-like syntax.

=head2 Role::Tiny

As we mentioned before, roles provide an alternative to inheritance,
but Perl does not have any built-in role support. If you choose to use
Moose, it comes with a full-fledged role implementation.
However, if you use one of our other recommended OO modules, you can
still use roles with L<Role::Tiny>.

C<Role::Tiny> provides some of the same features as Moose's role
system, but in a much smaller package. Most notably, it doesn't support
any sort of attribute declaration, so you have to do that by hand.
Still, it's useful, and works well with C<Class::Accessor> and
C<Class::Tiny>.

=head2 OO System Summary

Here's a brief recap of the options we covered:

=over 4

=item * L<Moose>

C<Moose> is the maximal option. It has a lot of features, a big
ecosystem, and a thriving user base. We also covered L<Moo> briefly.
C<Moo> is C<Moose> lite, and a reasonable alternative when Moose
doesn't work for your application.

=item * L<Class::Accessor>

C<Class::Accessor> does a lot less than C<Moose>, and is a nice
alternative if you find C<Moose> overwhelming. It's been around a long
time and is well battle-tested. It also has a minimal C<Moose>
compatibility mode which makes moving from C<Class::Accessor> to
C<Moose> easy.

=item * L<Class::Tiny>

C<Class::Tiny> is the absolute minimal option. It has no dependencies,
and almost no syntax to learn. It's a good option for a super minimal
environment and for throwing something together quickly without having
to worry about details.

=item * L<Role::Tiny>

Use C<Role::Tiny> with C<Class::Accessor> or C<Class::Tiny> if you find
yourself considering multiple inheritance. If you go with C<Moose>, it
comes with its own role implementation.

=back

=head2 Other OO Systems

There are literally dozens of other OO-related modules on CPAN besides
those covered here, and you're likely to run across one or more of them
if you work with other people's code.

In addition, plenty of code in the wild does all of its OO "by hand",
using just the Perl built-in OO features. If you need to maintain such
code, you should read L<perlobj> to understand exactly how Perl's
built-in OO works.
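
For readers who meet such "by hand" code, the following is a rough
sketch of what it often looks like, using only core Perl. The class and
method names reuse the C<File> and C<File::MP3> examples from this
tutorial; the C<info()> method (which returns the text instead of
printing it) and the hash-based accessors are assumptions made for the
sketch, not a recommended style.

```perl
use strict;
use warnings;

{
    package File;

    # Constructor: bless a hash of attributes into the class.
    sub new {
        my ($class, %args) = @_;
        return bless {%args}, $class;
    }

    # Read-only accessor written by hand.
    sub path { my $self = shift; return $self->{path} }

    sub info {
        my $self = shift;
        return "This file is at " . $self->path . "\n";
    }
}

{
    package File::MP3;

    # -norequire: File is defined above, not in a separate File.pm.
    use parent -norequire, 'File';

    sub title { my $self = shift; return $self->{title} }

    # Override info(), reusing the parent's version via SUPER::.
    sub info {
        my $self = shift;
        return $self->SUPER::info()
             . "Its title is " . $self->title . "\n";
    }
}

my $cage = File::MP3->new(
    path  => 'mp3s/My-Body-Is-a-Cage.mp3',
    title => 'My Body Is a Cage',
);

print $cage->info;
```

Every accessor and the constructor are repeated by hand here, which is
the boilerplate that the modules above generate for you.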

=head1 CONCLUSION

As we said before, Perl's minimal OO system has led to a profusion of
OO systems on CPAN. While you can still drop down to the bare metal and
write your classes by hand, there's really no reason to do that with
modern Perl.

For small systems, L<Class::Tiny> and L<Class::Accessor> both provide
minimal object systems that take care of basic boilerplate for you.

For bigger projects, L<Moose> provides a rich set of features that will
let you focus on implementing your business logic. L<Moo> provides a
nice alternative to L<Moose> when you want a lot of features but need
faster compile time or to avoid XS.

We encourage you to play with and evaluate L<Moose>, L<Moo>,
L<Class::Accessor>, and L<Class::Tiny> to see which OO system is right
for you.

=cut

=encoding utf8

If you are reading this file as raw text, please ignore the funny
characters. This document is written in POD format (see
F<pod/perlpod.pod>) so that it can be read as POD.

=head1 NAME

perlko - Perl guide in Korean

=head1 DESCRIPTION

Welcome to the world of Perl!

Perl is sometimes called the B<'Practical Extraction and Report
Language'>, but among other well-known expansions it is also called the
B<'Pathologically Eclectic Rubbish Lister'>. In fact these are
backronyms; Perl was not named by taking their initial letters. Larry,
the creator of Perl, came up with the name first and the well-known
expansions later. That is why B<'Perl'> is not written in all capitals.
There is no point in arguing over which expansion is correct. Larry
endorses both.

You will sometimes see B<'perl'> written with a lowercase p. B<'Perl'>
with a capital P refers to the language, while B<'perl'> with a
lowercase p refers to the interpreter that compiles and runs your
programs.

=head1 About Perl

Perl was originally created for text manipulation, but it is now a
general-purpose programming language widely used in many areas,
including system administration, web development, network programming,
and GUI development.

The language aims to be practical (easy to use, efficient, complete)
rather than beautiful (tiny, elegant, minimal). Its most important
features are that it is easy to use, supports both procedural and
object-oriented programming, has powerful built-in text processing, and
has one of the world's most impressive collections of third-party
modules.

The features of the Perl language are introduced in
F<pod/perlintro.pod>. The most important changes in this release are
discussed in F<pod/perldelta.pod>. There are also many Perl books from
various publishers covering a wide range of topics. See
F<pod/perlbook.pod> for details.
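
=head1 Installation

If you are using a relatively modern operating system and want to
install the current version of Perl locally, run the following
commands:

    ./Configure -des -Dprefix=$HOME/localperl
    make test
    make install

These commands configure the build for your platform, compile Perl, run
the regression tests, and install perl into the F<localperl> directory
under your home directory.

If you run into any problems, or need to install a customized version
of Perl, you should read the detailed instructions in the F<INSTALL>
file included in this distribution. In addition, there are a number of
F<README> files with help and hints on building and using Perl on
various less common platforms.

Once Perl is installed, a wealth of documentation is available through
the C<perldoc> tool. To get started, run the following command:

    perldoc perl

=head1 If You Run into Trouble

Perl is a large and complex system that is used for everything from
knitting to rocket science. If you run into a difficulty, it is quite
likely that someone else has already solved the problem. If you have
checked all of the documentation and are sure that you have found a
bug, please report it to us with the C<perlbug> tool. For more
information about C<perlbug>, run C<perldoc perlbug> or C<perlbug> on
the command line.

Even once you have Perl working, Perl keeps evolving, so there may be a
more recent version that fixes bugs you have run into or adds new
features you might find useful.

You can always find the latest version of perl on the CPAN
(Comprehensive Perl Archive Network) site at
L<http://www.cpan.org/src/>.

If you want to submit a simple patch to the perl source, see the
B<"SUPER QUICK PATCH GUIDE"> in F<pod/perlhack.pod>.

Just a personal note: I want you to know that I create nice things like
this because it pleases the B<"Author"> of my story. If this bothers
you, then your notion of B<"Authorship"> may need some revision. But
you can use Perl anyway. :-)

- From the B<"author">.

=head1 Encodings

Since version 5.8.0, Perl has had extensive support for Unicode/ISO
10646. As part of its Unicode support, it supports the many encodings
that were used before Unicode in countries around the world, including
China, Japan, and Korea, and that are still in wide use today. Unicode
aims to accommodate the writing systems of all languages used around
the world (the Latin, Cyrillic, and Greek alphabets of Europe, the
Brahmic scripts of India and Southeast Asia, Arabic script, Hebrew
script, the Han characters of China, Japan, and Korea, Korean Hangul,
Japanese Kana, the writing systems of Native Americans, and so on). It
therefore includes every character available in the character sets and
encodings specific to individual languages, countries, and operating
systems, as well as a great many characters that the older character
sets did not support.

Perl uses Unicode internally to represent characters. More concretely,
you can use UTF-8 strings inside a Perl script, and the various
functions and operators (for example, regular expressions, index, and
substr) operate on Unicode characters rather than on bytes. See
F<pod/perlunicode.pod> for details.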
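
As a small sketch of what "characters rather than bytes" means in
practice, the following core-only example compares the length of a
character string with the length of its UTF-8 encoding. The variable
names are chosen for illustration, and the two Hangul syllables are
written as code point escapes so that the result does not depend on how
the script file itself is saved.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Two Hangul syllables (U+D55C, U+AE00), written as code points.
my $text = "\x{D55C}\x{AE00}";

my $chars = length $text;              # counts characters
my $utf8  = encode('UTF-8', $text);    # byte string
my $bytes = length $utf8;              # counts bytes

print "characters: $chars, UTF-8 bytes: $bytes\n";

# substr() also works per character on the character string.
my $first = substr $text, 0, 1;
printf "first character is U+%04X\n", ord $first;

# Decoding the bytes gives back the original character string.
my $round_trip = decode('UTF-8', $utf8);
print "round trip ok\n" if $round_trip eq $text;
```

The character string has length 2, while its UTF-8 form is 6 bytes
long, because each of these syllables takes three bytes in UTF-8.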
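
The L<Encode> module is used to help with input and output in the
country- and language-specific encodings that were in wide use before
Unicode became widespread and are still widely used, and with handling
data and documents in those encodings. Above all, the L<Encode> module
makes it easy to convert between a large number of encodings.

=head2 The Encode Module

=head3 Supported Encodings

The L<Encode> module supports the following Korean encodings.

=over 4

=item * C<euc-kr>

A multibyte encoding that uses US-ASCII together with KS X 1001,
commonly known as Wansung. See KS X 2901 and RFC 1557.

=item * C<cp949>

The extended Wansung encoding used in MS-Windows 9x/ME. It is euc-kr
plus 8,822 additional Hangul syllables. Its aliases are uhc,
windows-949, x-windows-949, and ks_c_5601-1987. The last name is not an
appropriate one, but it is used in Microsoft products to mean CP949.

=item * C<johab>

The Johab encoding defined in Annex 3 of KS X 1001:1998. Like cp949,
its character repertoire is US-ASCII and KS X 1001 plus 8,822
additional Hangul syllables, but the encoding scheme is completely
different.

=item * C<iso-2022-kr>

The encoding for Korean Internet mail exchange defined in RFC 1557. It
is the same as euc-kr in that its repertoire is US-ASCII and KS X 1001,
but the encoding scheme differs. It was used until around 1997-8 but is
no longer used for mail exchange.

=item * C<ksc5601-raw>

The encoding obtained by placing KS X 1001 (KS C 5601) in GL (that is,
with the MSB set to 0). It is hardly ever used on its own without being
combined with US-ASCII, except as a font encoding in X11 and similar
systems (ksc5601.1987-0, where the '0' means GL). KS C 5601 was renamed
to KS X 1001 in 1997. In 1998, two characters (the euro sign and the
registered trademark sign) were added.

=back

=head3 Conversion Examples

For example, to convert a file in the euc-kr encoding to UTF-8, run the
following on the command line:

    perl -Mencoding=euc-kr,STDOUT,utf8 -pe1 < file.euc-kr > file.utf8

To convert in the opposite direction, run:

    perl -Mencoding=utf8,STDOUT,euc-kr -pe1 < file.utf8 > file.euc-kr

F<piconv>, which makes such conversions more convenient, is included
with Perl by default. It is a pure Perl utility built on the L<Encode>
module and, as its name suggests, it is modeled on the Unix C<iconv>.
It is used as follows:

    piconv -f euc-kr -t utf8 < file.euc-kr > file.utf8
    piconv -f utf8 -t euc-kr < file.utf8 > file.euc-kr

=head3 Best Practices

Perl uses UTF-8 internally by default and supports many encodings
through the Encode module, but we recommend always following the rules
below to reduce the likelihood of the various encoding-related problems
that can occur.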

=over 4

=item * Always save source code in the UTF-8 encoding

=item * Use the C<use utf8;> pragma at the top of the source code

=item * Treat the encodings of the source code, terminal, operating system, and data as separate things

=item * Use an explicit encoding on input and output file handles

=item * Watch out for double encoding

=back

=head3 Unicode and Korean Encoding Resources

=over 4

=item * L<perluniintro>

=item * L<perlunicode>

=item * L<Encode>

=item * L<Encode::KR>

=item * L<encoding>

=item * L<https://www.unicode.org/>

The Unicode Consortium

=item * L<https://std.dkuug.dk/JTC1/SC2/WG2>

The web page of ISO/IEC JTC1/SC2/WG2, which produces ISO/IEC 10646 UCS
(Universal Character Set), an ISO standard that is essentially the same
as Unicode

=item * L<https://www.cl.cam.ac.uk/~mgk25/unicode.html>

UTF-8 and Unicode FAQ for Unix/Linux users

=item * L<http://wiki.kldp.org/Translations/html/UTF8-Unicode-KLDP/UTF8-Unicode-KLDP.html>

Korean translation of the UTF-8 and Unicode FAQ for Unix/Linux users

=back

=head1 Perl Resources

The following are some of the official Perl resources.

=over 4

=item * L<https://www.perl.org/>

The official Perl home page

=item * L<https://www.perl.com/>

O'Reilly's Perl web page

=item * L<https://www.cpan.org/>

CPAN - Comprehensive Perl Archive Network

=item * L<https://metacpan.org>

MetaCPAN

=item * L<https://lists.perl.org/>

Perl mailing lists

=item * L<https://blogs.perl.org/>

Perl meta blog

=item * L<https://www.perlmonks.org/>

The Monastery for Perl monks

=item * L<https://www.pm.org/groups/asia.html>

Perl Mongers groups in Asia

=item * L<http://www.perladvent.org/>

Perl Advent Calendar

=back

The following are Korean-language sites that can help you study Perl in
more depth.

=over 4

=item * L<https://perl.kr/>

Official portal of the Korean Perl community

=item * L<https://doc.perl.kr/>

Perl documentation Korean translation project

=item * L<https://cafe.naver.com/perlstudy.cafe>

Naver Perl cafe

=item * L<http://www.perl.or.kr/>

Korean Perl users group

=item * L<https://advent.perl.kr>

Seoul.pm Perl Advent Calendar (2010 ~ 2012)

=item * L<http://gypark.pe.kr/wiki/Perl>

GYPARK (Geunyoung Park)'s repository of Korean-language Perl documents

=back

=head1 License

See the B<'LICENSING'> section of the F<README> file.

=head1 AUTHORS

=over

=item * Jarkko Hietaniemi E<lt>jhi@iki.fiE<gt>

=item * 신정식 E<lt>jshin@mailaps.orgE<gt>

=item * 김도형 E<lt>keedi@cpan.orgE<gt>

=back

=cut

=head1 NAME

perldebtut - Perl debugging tutorial

=head1 DESCRIPTION

A (very) lightweight introduction in the use of the perl debugger, and
a pointer to existing, deeper sources of information on the subject of
debugging perl programs.

There's an extraordinary number of people out there who don't appear to
know anything about using the perl debugger, though they use the
language every day. This is for them.

=head1 use strict

First of all, there's a few things you can do to make your life a lot
more straightforward when it comes to debugging perl programs, without
using the debugger at all. To demonstrate, here's a simple script,
named "hello", with a problem:

    #!/usr/bin/perl

    $var1 = 'Hello World'; # always wanted to do that :-)
    $var2 = "$varl\n";

    print $var2;
    exit;

While this compiles and runs happily, it probably won't do what's
expected, namely it doesn't print "Hello World\n" at all; it will on
the other hand do exactly what it was told to do, computers being a bit
that way inclined. That is, it will print out a newline character, and
you'll get what looks like a blank line. It looks like there's 2
variables when (because of the typo) there's really 3:

    $var1 = 'Hello World';
    $varl = undef;
    $var2 = "\n";

To catch this kind of problem, we can force each variable to be
declared before use by pulling in the strict module, by putting 'use
strict;' after the first line of the script.

Now when you run it, perl complains about the 3 undeclared variables
and we get four error messages because one variable is referenced
twice:

    Global symbol "$var1" requires explicit package name at ./t1 line 4.
    Global symbol "$var2" requires explicit package name at ./t1 line 5.
    Global symbol "$varl" requires explicit package name at ./t1 line 5.
    Global symbol "$var2" requires explicit package name at ./t1 line 7.
    Execution of ./hello aborted due to compilation errors.

Luvverly! To fix this we declare all variables explicitly, and now our
script looks like this:

    #!/usr/bin/perl
    use strict;

    my $var1 = 'Hello World';
    my $varl = undef;
    my $var2 = "$varl\n";

    print $var2;
    exit;

We then do (always a good idea) a syntax check before we try to run it
again:

    > perl -c hello
    hello syntax OK

And now when we run it, we get "\n" still, but at least we know why.
Just getting this script to compile has exposed the '$varl' (with the
letter 'l') variable, and simply changing $varl to $var1 solves the
problem.

=head1 Looking at data and -w and v

Ok, but how about when you want to really see your data, what's in that
dynamic variable, just before using it?

    #!/usr/bin/perl
    use strict;

    my $key = 'welcome';
    my %data = (
        'this' => qw(that),
        'tom' => qw(and jerry),
        'welcome' => q(Hello World),
        'zip' => q(welcome),
    );
    my @data = keys %data;

    print "$data{$key}\n";
    exit;

Looks OK. After it's been through the syntax check (perl -c
scriptname), we run it and all we get is a blank line again! Hmmmm.

One common debugging approach here would be to liberally sprinkle a few
print statements, to add a check just before we print out our data, and
another just after:

    print "All OK\n" if grep($key, keys %data);
    print "$data{$key}\n";
    print "done: '$data{$key}'\n";

And try again:

    > perl data
    All OK

    done: ''

After much staring at the same piece of code and not seeing the wood
for the trees for some time, we get a cup of coffee and try another
approach. That is, we bring in the cavalry by giving perl the 'B<-d>'
switch on the command line:

    > perl -d data
    Default die handler restored.

    Loading DB routines from perl5db.pl version 1.07
    Editor support available.

    Enter h or `h h' for help, or `man perldebug' for more help.

    main::(./data:4):     my $key = 'welcome';

Now, what we've done here is to launch the built-in perl debugger on
our script.
It's stopped at the first line of executable code and is waiting for
input.

Before we go any further, you'll want to know how to quit the debugger:
use just the letter 'B<q>', not the words 'quit' or 'exit':

    DB<1> q
    >

That's it, you're back on home turf again.

=head1 help

Fire the debugger up again on your script and we'll look at the help
menu. There's a couple of ways of calling help: a simple 'B<h>' will
get the summary help list, 'B<|h>' (pipe-h) will pipe the help through
your pager (which is probably 'more' or 'less'), and finally, 'B<h h>'
(h-space-h) will give you the entire help screen. Here is the summary
page:

    DB<1>h

    List/search source lines:               Control script execution:
      l [ln|sub]  List source code            T           Stack trace
      - or .      List previous/current line  s [expr]    Single step [in expr]
      v [line]    View around line            n [expr]    Next, steps over subs
      f filename  View source in file         <CR/Enter>  Repeat last n or s
      /pattern/ ?patt?   Search forw/backw    r           Return from subroutine
      M           Show module versions        c [ln|sub]  Continue until position
    Debugger controls:                        L           List break/watch/actions
      o [...]     Set debugger options        t [expr]    Toggle trace [trace expr]
      <[<]|{[{]|>[>] [cmd] Do pre/post-prompt b [ln|event|sub] [cnd] Set breakpoint
      ! [N|pat]   Redo a previous command     B ln|*      Delete a/all breakpoints
      H [-num]    Display last num commands   a [ln] cmd  Do cmd before line
      = [a val]   Define/list an alias        A ln|*      Delete a/all actions
      h [db_cmd]  Get help on command         w expr      Add a watch expression
      h h         Complete help page          W expr|*    Delete a/all watch exprs
      |[|]db_cmd  Send output to pager        ![!] syscmd Run cmd in a subprocess
      q or ^D     Quit                        R           Attempt a restart
    Data Examination:     expr     Execute perl code, also see: s,n,t expr
      x|m expr       Evals expr in list context, dumps the result or lists methods.
      p expr         Print expression (uses script's current package).
      S [[!]pat]     List subroutine names [not] matching pattern
      V [Pk [Vars]]  List Variables in Package.  Vars can be ~pattern or !pattern.
      X [Vars]       Same as "V current_package [Vars]".
      y [n [Vars]]   List lexicals in higher scope <n>.  Vars same as V.
    For more help, type h cmd_letter, or run man perldebug for all docs.

More confusing options than you can shake a big stick at!  It's not as bad as
it looks and it's very useful to know more about all of it, and fun too!

There are a couple of useful ones to know about straight away.  You wouldn't
think we're using any libraries at all at the moment, but 'B<M>' will show
which modules are currently loaded, and their version number, while 'B<m>'
will show the methods, and 'B<S>' shows all subroutines (by pattern) as
shown below.  'B<V>' and 'B<X>' show variables in the program by package
scope and can be constrained by pattern.

    DB<2> S str
    dumpvar::stringify
    strict::bits
    strict::import
    strict::unimport

Using 'X' and cousins requires you not to use the type identifiers ($@%), just
the 'name':

    DB<3> X ~err
    FileHandle(stderr) => fileno(2)

Remember we're in our tiny program with a problem; we should have a look at
where we are, and what our data looks like.  First of all let's view some code
at our present position (the first line of code in this case), via 'B<v>':

    DB<4> v
    1       #!/usr/bin/perl
    2:      use strict;
    3
    4==>    my $key = 'welcome';
    5:      my %data = (
    6               'this' => qw(that),
    7               'tom' => qw(and jerry),
    8               'welcome' => q(Hello World),
    9               'zip' => q(welcome),
    10      );

At line number 4 is a helpful pointer that tells you where you are now.  To
see more code, type 'v' again:

    DB<4> v
    8               'welcome' => q(Hello World),
    9               'zip' => q(welcome),
    10      );
    11:     my @data = keys %data;
    12:     print "All OK\n" if grep($key, keys %data);
    13:     print "$data{$key}\n";
    14:     print "done: '$data{$key}'\n";
    15:     exit;

And if you wanted to list line 5 again, type 'l 5' (note the space):

    DB<4> l 5
    5:      my %data = (

In this case, there's not much to see, but of course normally there are pages
of stuff to wade through, and 'l' can be very useful.  To reset your view to
the line we're about to execute, type a lone period '.':

    DB<5> .
    main::(./data_a:4):     my $key = 'welcome';

The line shown is the one that is about to be executed B<next>; it hasn't
happened yet.  So while we can print a variable with the letter 'B<p>', at
this point all we'd get is an empty (undefined) value back.  What we need to
do is to step through the next executable statement with an 'B<s>':

    DB<6> s
    main::(./data_a:5):     my %data = (
    main::(./data_a:6):             'this' => qw(that),
    main::(./data_a:7):             'tom' => qw(and jerry),
    main::(./data_a:8):             'welcome' => q(Hello World),
    main::(./data_a:9):             'zip' => q(welcome),
    main::(./data_a:10):    );

Now we can have a look at that first ($key) variable:

    DB<7> p $key
    welcome

Line 13 is where the action is, so let's continue down to there via the letter
'B<c>', which by the way, inserts a 'one-time-only' breakpoint at the given
line or subroutine:

    DB<8> c 13
    All OK
    main::(./data_a:13):    print "$data{$key}\n";

We've gone past our check (where 'All OK' was printed) and have stopped just
before the meat of our task.  We could try to print out a couple of variables
to see what is happening:

    DB<9> p $data{$key}

Not much in there, let's have a look at our hash:

    DB<10> p %data
    Hello Worldziptomandwelcomejerrywelcomethisthat

    DB<11> p keys %data
    Hello Worldtomwelcomejerrythis

Well, this isn't very easy to read, and using the helpful manual (B<h h>), the
'B<x>' command looks promising:

    DB<12> x %data
    0  'Hello World'
    1  'zip'
    2  'tom'
    3  'and'
    4  'welcome'
    5  undef
    6  'jerry'
    7  'welcome'
    8  'this'
    9  'that'

That's not much help: a couple of welcomes in there, but no indication of
which are keys, and which are values.  It's just a listed array dump and, in
this case, not particularly helpful.  The trick here is to use a B<reference>
to the data structure:

    DB<13> x \%data
    0  HASH(0x8194bc4)
       'Hello World' => 'zip'
       'jerry' => 'welcome'
       'this' => 'that'
       'tom' => 'and'
       'welcome' => undef

The reference is truly dumped and we can finally see what we're dealing with.
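To see why the dump has this shape, it can help to count what the list on the
right-hand side of the assignment really contains.  The following is a minimal
standalone sketch, not part of the tutorial's script: it reuses the same hash
literal, and the names C<@list> and C<%pairs> are made up for illustration.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# The same literal as in the 'data' script, kept as a plain list first.
# @list and %pairs are illustrative names, not part of the original script.
my @list = (
    'this'    => qw(that),
    'tom'     => qw(and jerry),    # qw() yields TWO elements here
    'welcome' => q(Hello World),
    'zip'     => q(welcome),
);

# An odd count cannot be split into whole key/value pairs.
print scalar(@list), " elements\n";

my %pairs;
{
    # Silence "Odd number of elements" for this demonstration only.
    no warnings 'misc';
    %pairs = @list;
}

# Print each key with its value, marking the key left without a partner.
for my $k (sort keys %pairs) {
    my $v = defined $pairs{$k} ? "'$pairs{$k}'" : 'undef';
    print "'$k' => $v\n";
}
```

Every key from 'jerry' onwards is shifted by one position, which is the
misalignment the reference dump shows.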
Our quoting was perfectly valid but wrong for our purposes, with 'and jerry'
being treated as 2 separate words rather than a phrase, thus throwing the
evenly paired hash structure out of alignment.

The 'B<-w>' switch would have told us about this, had we used it at the start,
and saved us a lot of trouble:

    > perl -w data
    Odd number of elements in hash assignment at ./data line 5.

We fix our quoting: 'tom' => q(and jerry), and run it again, this time we get
our expected output:

    > perl -w data
    Hello World

While we're here, take a closer look at the 'B<x>' command, it's really useful
and will merrily dump out nested references, complete objects, partial objects
- just about whatever you throw at it:

Let's make a quick object and x-plode it, first we'll start the debugger:
it wants some form of input from STDIN, so we give it something non-committal,
a zero:

    > perl -de 0
    Default die handler restored.

    Loading DB routines from perl5db.pl version 1.07
    Editor support available.

    Enter h or `h h' for help, or `man perldebug' for more help.

    main::(-e:1):   0

Now build an on-the-fly object over a couple of lines (note the backslash):

    DB<1> $obj = bless({'unique_id'=>'123', 'attr'=> \
    cont:   {'col' => 'black', 'things' => [qw(this that etc)]}}, 'MY_class')

And let's have a look at it:

    DB<2> x $obj
    0  MY_class=HASH(0x828ad98)
       'attr' => HASH(0x828ad68)
          'col' => 'black'
          'things' => ARRAY(0x828abb8)
             0  'this'
             1  'that'
             2  'etc'
       'unique_id' => 123
    DB<3>

Useful, huh?
You can eval nearly anything in there, and experiment with bits of code or
regexes until the cows come home:

    DB<3> @data = qw(this that the other atheism leather theory scythe)

    DB<4> p 'saw -> '.($cnt += map { print "\t:\t$_\n" } grep(/the/, sort @data))
    atheism
    leather
    other
    scythe
    the
    theory
    saw -> 6

If you want to see the command History, type an 'B<H>':

    DB<5> H
    4: p 'saw -> '.($cnt += map { print "\t:\t$_\n" } grep(/the/, sort @data))
    3: @data = qw(this that the other atheism leather theory scythe)
    2: x $obj
    1: $obj = bless({'unique_id'=>'123', 'attr'=>
    {'col' => 'black', 'things' => [qw(this that etc)]}}, 'MY_class')
    DB<5>

And if you want to repeat any previous command, use the exclamation: 'B<!>':

    DB<5> !4
    p 'saw -> '.($cnt += map { print "$_\n" } grep(/the/, sort @data))
    atheism
    leather
    other
    scythe
    the
    theory
    saw -> 12

For more on references see L<perlref> and L<perlreftut>.

=head1 Stepping through code

Here's a simple program which converts between Celsius and Fahrenheit; it too
has a problem:

    #!/usr/bin/perl -w
    use strict;

    my $arg = $ARGV[0] || '-c20';

    if ($arg =~ /^\-(c|f)((\-|\+)*\d+(\.\d+)*)$/) {
        my ($deg, $num) = ($1, $2);
        my ($in, $out) = ($num, $num);
        if ($deg eq 'c') {
            $deg = 'f';
            $out = &c2f($num);
        } else {
            $deg = 'c';
            $out = &f2c($num);
        }
        $out = sprintf('%0.2f', $out);
        $out =~ s/^((\-|\+)*\d+)\.0+$/$1/;
        print "$out $deg\n";
    } else {
        print "Usage: $0 -[c|f] num\n";
    }
    exit;

    sub f2c {
        my $f = shift;
        my $c = 5 * $f - 32 / 9;
        return $c;
    }

    sub c2f {
        my $c = shift;
        my $f = 9 * $c / 5 + 32;
        return $f;
    }

For some reason, the Fahrenheit to Celsius conversion fails to return the
expected output.  This is what it does:

    > temp -c0.72
    33.30 f

    > temp -f33.3
    162.94 c

Not very consistent!  We'll set a breakpoint in the code manually and run it
under the debugger to see what's going on.  A breakpoint is a flag: the
debugger runs without interruption until it reaches the breakpoint, then stops
execution and offers a prompt for further interaction.
In normal use, these debugger commands are completely ignored, and they are
safe, if a little messy, to leave in production code.

    my ($in, $out) = ($num, $num);
    $DB::single=2; # insert at line 9!
    if ($deg eq 'c')
        ...

    > perl -d temp -f33.3
    Default die handler restored.

    Loading DB routines from perl5db.pl version 1.07
    Editor support available.

    Enter h or `h h' for help, or `man perldebug' for more help.

    main::(temp:4): my $arg = $ARGV[0] || '-c20';

We'll simply continue down to our pre-set breakpoint with a 'B<c>':

    DB<1> c
    main::(temp:10):                if ($deg eq 'c') {

Followed by a view command to see where we are:

    DB<1> v
    7:              my ($deg, $num) = ($1, $2);
    8:              my ($in, $out) = ($num, $num);
    9:              $DB::single=2;
    10==>           if ($deg eq 'c') {
    11:                     $deg = 'f';
    12:                     $out = &c2f($num);
    13              } else {
    14:                     $deg = 'c';
    15:                     $out = &f2c($num);
    16              }

And a print to show what values we're currently using:

    DB<1> p $deg, $num
    f33.3

We can put another breakpoint on any line whose number is followed by a
colon.  We'll use line 17 as that's just as we come out of the subroutine, and
we'd like to pause there later on:

    DB<2> b 17

There's no feedback from this, but you can see what breakpoints are set by
using the list 'L' command:

    DB<3> L
    temp:
        17:            print "$out $deg\n";
        break if (1)

Note that to delete a breakpoint you use 'B'.

Now we'll continue down into our subroutine, this time rather than by line
number, we'll use the subroutine name, followed by the now familiar 'v':

    DB<3> c f2c
    main::f2c(temp:27):             my $f = shift;

    DB<4> v
    24:     exit;
    25
    26      sub f2c {
    27==>           my $f = shift;
    28:             my $c = 5 * $f - 32 / 9;
    29:             return $c;
    30      }
    31
    32      sub c2f {
    33:             my $c = shift;

Note that if there was a subroutine call between us and line 29, and we wanted
to B<single-step> through it, we could use the 'B<s>' command, and to step
over it we would use 'B<n>' which would execute the sub, but not descend into
it for inspection.
In this case though, we simply continue down to line 29:

    DB<4> c 29
    main::f2c(temp:29):             return $c;

And have a look at the return value:

    DB<5> p $c
    162.944444444444

This is not the right answer at all, but the sum looks correct.  I wonder if
it's anything to do with operator precedence?  We'll try a couple of other
possibilities with our sum:

    DB<6> p (5 * $f - 32 / 9)
    162.944444444444

    DB<7> p 5 * $f - (32 / 9)
    162.944444444444

    DB<8> p (5 * $f) - 32 / 9
    162.944444444444

    DB<9> p 5 * ($f - 32) / 9
    0.722222222222221

:-) that's more like it!  Ok, now we can set our return variable and we'll
return out of the sub with an 'r':

    DB<10> $c = 5 * ($f - 32) / 9

    DB<11> r
    scalar context return from main::f2c: 0.722222222222221

Looks good, let's just continue off the end of the script:

    DB<12> c
    0.72 c
    Debugged program terminated.  Use q to quit or R to restart,
    use O inhibit_exit to avoid stopping after program termination,
    h q, h R or h O to get additional info.

A quick fix to the offending line (insert the missing parentheses) in the
actual program and we're finished.

=head1 Placeholder for a, w, t, T

Actions, watch variables, stack traces etc.: on the TODO list.

    a

    w

    t

    T

=head1 REGULAR EXPRESSIONS

Ever wanted to know what a regex looked like?  You'll need perl compiled with
the DEBUGGING flag for this one:

    > perl -Dr -e '/^pe(a)*rl$/i'
    Compiling REx `^pe(a)*rl$'
    size 17 first at 2
    rarest char
     at 0
       1: BOL(2)
       2: EXACTF <pe>(4)
       4: CURLYN[1] {0,32767}(14)
       6:   NOTHING(8)
       8:   EXACTF <a>(0)
      12:   WHILEM(0)
      13: NOTHING(14)
      14: EXACTF <rl>(16)
      16: EOL(17)
      17: END(0)
    floating `'$ at 4..2147483647 (checking floating) stclass `EXACTF <pe>'
    anchored(BOL) minlen 4
    Omitting $` $& $' support.

    EXECUTING...

    Freeing REx: `^pe(a)*rl$'

Did you really want to know? :-)
For more gory details on getting regular expressions to work, have a look at
L<perlre>, L<perlretut>, and to decode the mysterious labels (BOL and CURLYN,
etc. above), see L<perldebguts>.
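If your perl was not built with DEBUGGING, the C<re> pragma's C<debug> mode
prints a similar trace to STDERR.  The following is a minimal sketch rather
than part of the original tutorial: the word list and the C<@matched> array
are invented for the example, and the exact trace format varies between perl
versions.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample words; only the pattern comes from the text above.
my @words = qw(perl PEARL peaarl pel);
my @matched;

{
    # Lexically scoped: only regexes compiled inside this block are traced.
    # The compile and match trace is written to STDERR.
    use re 'debug';

    for my $word (@words) {
        push @matched, $word if $word =~ /^pe(a)*rl$/i;
    }
}

print "matched: @matched\n";
```

Redirecting STDERR to a file (for example with C<< 2> trace.log >>) keeps the
trace apart from the program's normal output.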

=head1 OUTPUT TIPS

To get all the output from your error log, and not miss any messages via
helpful operating system buffering, insert a line like this at the start of
your script:

    $|=1;

To watch the tail of a dynamically growing logfile (from the command line):

    tail -f $error_log

Wrapping all die calls in a handler routine can be useful to see how, and from
where, they're being called; L<perlvar> has more information:

    BEGIN { $SIG{__DIE__} = sub { require Carp; Carp::confess(@_) } }

Various useful techniques for the redirection of STDOUT and STDERR filehandles
are explained in L<perlopentut> and L<perlfaq8>.

=head1 CGI

Just a quick hint here for all those CGI programmers who can't figure out how
on earth to get past that 'waiting for input' prompt when running their CGI
script from the command line: try something like this:

    > perl -d my_cgi.pl -nodebug

Of course L<CGI> and L<perlfaq9> will tell you more.

=head1 GUIs

The command line interface is tightly integrated with an B<emacs> extension
and there's a B<vi> interface too.

You don't have to do this all on the command line, though; there are a few GUI
options out there.  The nice thing about these is you can wave a mouse over a
variable and a dump of its data will appear in an appropriate window, or in a
popup balloon, no more tiresome typing of 'x $varname' :-)

In particular have a hunt around for the following:

B<ptkdb> perlTK based wrapper for the built-in debugger

B<ddd> data display debugger

B<PerlDevKit> and B<PerlBuilder> are NT specific

NB. (more info on these and others would be appreciated).

=head1 SUMMARY

We've seen how to encourage good coding practices with B<use strict> and
B<-w>.  We can run the perl debugger with B<perl -d scriptname> and inspect
our data from within it with the B<p> and B<x> commands.  We can walk through
our code, set breakpoints with B<b>, step through that code with B<s> or
B<n>, continue with B<c> and return from a sub with B<r>.
Fairly intuitive stuff when you get down to it.

There is of course lots more to find out about, this has just scratched the
surface.  The best way to learn more is to use perldoc to find out more about
the language, to read the on-line help (L<perldebug> is probably the next
place to go), and of course, experiment.

=head1 SEE ALSO

L<perldebug>,
L<perldebguts>,
L<perldiag>,
L<perlrun>

=head1 AUTHOR

Richard Foley <richard.foley@rfi.net> Copyright (c) 2000

=head1 CONTRIBUTORS

Various people have made helpful suggestions and contributions, in particular:

Ronald J Kimball <rjk@linguist.dartmouth.edu>

Hugo van der Sanden <hv@crypt0.demon.co.uk>

Peter Scott <Peter@PSDT.com>

=encoding utf8

=head1 NAME

perl5201delta - what is new for perl v5.20.1

=head1 DESCRIPTION

This document describes differences between the 5.20.0 release and the 5.20.1
release.

If you are upgrading from an earlier release such as 5.18.0, first read
L<perl5200delta>, which describes differences between 5.18.0 and 5.20.0.

=head1 Incompatible Changes

There are no changes intentionally incompatible with 5.20.0.  If any exist,
they are bugs, and we request that you submit a report.  See L</Reporting Bugs>
below.

=head1 Performance Enhancements

=over 4

=item *

An optimization to avoid problems with COW and deliberately overallocated PVs
has been disabled because it interfered with another, more important,
optimization, causing a slowdown on some platforms.
L<[perl #121975]|https://rt.perl.org/Ticket/Display.html?id=121975>

=item *

Returning a string from a lexical variable could be slow in some cases.  This
has now been fixed.
L<[perl #121977]|https://rt.perl.org/Ticket/Display.html?id=121977>

=back

=head1 Modules and Pragmata

=head2 Updated Modules and Pragmata

=over 4

=item *

L<Config::Perl::V> has been upgraded from version 0.20 to 0.22.

The list of Perl versions covered has been updated and some flaws in the
parsing have been fixed.

=item *

L<Exporter> has been upgraded from version 5.70 to 5.71.

Illegal POD syntax in the documentation has been corrected.

=item *

L<ExtUtils::CBuilder> has been upgraded from version 0.280216 to 0.280217.

Android builds now link to both B<-lperl> and C<$Config::Config{perllibs}>.

=item *

L<File::Copy> has been upgraded from version 2.29 to 2.30.

The documentation now notes that C<copy> will not overwrite read-only files.

=item *

L<Module::CoreList> has been upgraded from version 3.11 to 5.020001.

The list of Perl versions covered has been updated.

=item *

The PathTools module collection has been upgraded from version 3.47 to 3.48.

Fallbacks are now in place when cross-compiling for Android and
C<$Config::Config{sh}> is not yet defined.
L<[perl #121963]|https://rt.perl.org/Ticket/Display.html?id=121963>

=item *

L<PerlIO::via> has been upgraded from version 0.14 to 0.15.

A minor portability improvement has been made to the XS implementation.

=item *

L<Unicode::UCD> has been upgraded from version 0.57 to 0.58.

The documentation includes many clarifications and fixes.

=item *

L<utf8> has been upgraded from version 1.13 to 1.13_01.

The documentation has some minor formatting improvements.

=item *

L<version> has been upgraded from version 0.9908 to 0.9909.

External libraries and Perl may have different ideas of what the locale is.
This is problematic when parsing version strings if the locale's numeric
separator has been changed.  Version parsing has been patched to ensure it
handles the locales correctly.
L<[perl #121930]|https://rt.perl.org/Ticket/Display.html?id=121930>

=back

=head1 Documentation

=head2 Changes to Existing Documentation

=head3 L<perlapi>

=over 4

=item *

C<av_len> - Emphasize that this returns the highest index in the array, not the
size of the array.
L<[perl #120386]|https://rt.perl.org/Ticket/Display.html?id=120386>

=item *

Note that C<SvSetSV> doesn't do set magic.

=item *

C<sv_usepvn_flags> - Fix documentation to mention the use of C<NewX> instead of
C<malloc>.
L<[perl #121869]|https://rt.perl.org/Ticket/Display.html?id=121869>

=item *

Clarify where C<NUL> may be embedded or is required to terminate a string.

=back

=head3 L<perlfunc>

=over 4

=item *

Clarify the meaning of C<-B> and C<-T>.

=item *

C<-l> now notes that it will return false if symlinks aren't supported by the
file system.
L<[perl #121523]|https://rt.perl.org/Ticket/Display.html?id=121523>

=item *

Note that C<each>, C<keys> and C<values> may produce different orderings for
tied hashes compared to other perl hashes.
L<[perl #121404]|https://rt.perl.org/Ticket/Display.html?id=121404>

=item *

Note that C<exec LIST> and C<system LIST> may fall back to the shell on Win32.
Only C<exec PROGRAM LIST> and C<system PROGRAM LIST> indirect object syntax
will reliably avoid using the shell.  This has also been noted in L<perlport>.
L<[perl #122046]|https://rt.perl.org/Ticket/Display.html?id=122046>

=item *

Clarify the meaning of C<our>.
L<[perl #122132]|https://rt.perl.org/Ticket/Display.html?id=122132>

=back

=head3 L<perlguts>

=over 4

=item *

Explain various ways of modifying an existing SV's buffer.
L<[perl #116925]|https://rt.perl.org/Ticket/Display.html?id=116925>

=back

=head3 L<perlpolicy>

=over 4

=item *

We now have a code of conduct for the I<< p5p >> mailing list, as documented in
L<< perlpolicy/STANDARDS OF CONDUCT >>.

=back

=head3 L<perlre>

=over 4

=item *

The C</x> modifier has been clarified to note that comments cannot be continued
onto the next line by escaping them.

=back

=head3 L<perlsyn>

=over 4

=item *

Mention the use of empty conditionals in C<for>/C<while> loops for infinite
loops.

=back

=head3 L<perlxs>

=over 4

=item *

Added a discussion of locale issues in XS code.

=back

=head1 Diagnostics

The following additions or changes have been made to diagnostic output,
including warnings and fatal error messages.
For the complete list of diagnostic messages, see L<perldiag>.

=head2 Changes to Existing Diagnostics

=over 4

=item *

L<Variable length lookbehind not implemented in regex mE<sol>%sE<sol>|perldiag/"Variable length lookbehind not implemented in regex m/%s/">

Information about Unicode behaviour has been added.

=back

=head1 Configuration and Compilation

=over 4

=item *

Building Perl no longer writes to the source tree when configured with
F<Configure>'s B<-Dmksymlinks> option.
L<[perl #121585]|https://rt.perl.org/Ticket/Display.html?id=121585>

=back

=head1 Platform Support

=head2 Platform-Specific Notes

=over 4

=item Android

Build support has been improved for cross-compiling in general and for Android
in particular.

=item OpenBSD

Corrected architectures and version numbers used in configuration hints when
building Perl.

=item Solaris

B<c99> options have been cleaned up, hints look for B<solstudio> as well as
B<SUNWspro>, and support for native C<setenv> has been added.

=item VMS

An old bug in feature checking, mainly affecting pre-7.3 systems, has been
fixed.

=item Windows

C<%I64d> is now being used instead of C<%lld> for MinGW.

=back

=head1 Internal Changes

=over 4

=item *

Added L<perlapi/sync_locale>.
Changing the program's locale should be avoided by XS code.  Nevertheless,
certain non-Perl libraries called from XS, such as C<Gtk> do so.  When this
happens, Perl needs to be told that the locale has changed.  Use this function
to do so, before returning to Perl.

=back

=head1 Selected Bug Fixes

=over 4

=item *

A bug has been fixed where zero-length assertions and code blocks inside of a
regex could cause C<pos> to see an incorrect value.
L<[perl #122460]|https://rt.perl.org/Ticket/Display.html?id=122460>

=item *

Using C<s///e> on tainted utf8 strings could issue bogus "Malformed UTF-8
character (unexpected end of string)" warnings.  This has now been fixed.
L<[perl #122148]|https://rt.perl.org/Ticket/Display.html?id=122148>

=item *

C<system> and friends should now work properly on more Android builds.

Due to an oversight, the value specified through B<-Dtargetsh> to F<Configure>
would end up being ignored by some of the build process.  This caused perls
cross-compiled for Android to end up with defective versions of C<system>,
C<exec> and backticks: the commands would end up looking for F</bin/sh> instead
of F</system/bin/sh>, and so would fail for the vast majority of devices,
leaving C<$!> as C<ENOENT>.

=item *

Many issues have been detected by L<Coverity|http://www.coverity.com/> and
fixed.

=back

=head1 Acknowledgements

Perl 5.20.1 represents approximately 4 months of development since Perl 5.20.0
and contains approximately 12,000 lines of changes across 170 files from 36
authors.

Excluding auto-generated files, documentation and release tools, there were
approximately 2,600 lines of changes to 110 .pm, .t, .c and .h files.

Perl continues to flourish into its third decade thanks to a vibrant community
of users and developers.  The following people are known to have contributed
the improvements that became Perl 5.20.1:

Aaron Crane, Abigail, Alberto Simões, Alexandr Ciornii, Alexandre (Midnite)
Jousset, Andrew Fresh, Andy Dougherty, Brian Fraser, Chris 'BinGOs' Williams,
Craig A. Berry, Daniel Dragan, David Golden, David Mitchell, H.Merijn Brand,
James E Keenan, Jan Dubois, Jarkko Hietaniemi, John Peacock, kafka, Karen
Etheridge, Karl Williamson, Lukas Mai, Matthew Horsfall, Michael Bunk, Peter
Martini, Rafael Garcia-Suarez, Reini Urban, Ricardo Signes, Shirakata Kentaro,
Smylers, Steve Hay, Thomas Sibley, Todd Rinaldo, Tony Cook, Vladimir Marek,
Yves Orton.

The list above is almost certainly incomplete as it is automatically generated
from version control history.  In particular, it does not include the names of
the (very much appreciated) contributors who reported issues to the Perl bug
tracker.

Many of the changes included in this version originated in the CPAN modules
included in Perl's core.  We're grateful to the entire CPAN community for
helping Perl to flourish.

For a more complete list of all of Perl's historical contributors, please see
the F<AUTHORS> file in the Perl source distribution.

=head1 Reporting Bugs

If you find what you think is a bug, you might check the articles recently
posted to the comp.lang.perl.misc newsgroup and the perl bug database at
https://rt.perl.org/ .  There may also be information at http://www.perl.org/ ,
the Perl Home Page.

If you believe you have an unreported bug, please run the L<perlbug> program
included with your release.  Be sure to trim your bug down to a tiny but
sufficient test case.  Your bug report, along with the output of C<perl -V>,
will be sent off to perlbug@perl.org to be analysed by the Perl porting team.

If the bug you are reporting has security implications, which make it
inappropriate to send to a publicly archived mailing list, then please send it
to perl5-security-report@perl.org.  This points to a closed subscription
unarchived mailing list, which includes all the core committers, who will be
able to help assess the impact of issues, figure out a resolution, and help
co-ordinate the release of patches to mitigate or fix the problem across all
platforms on which Perl is supported.  Please only use this address for
security issues in the Perl core, not for modules independently distributed on
CPAN.

=head1 SEE ALSO

The F<Changes> file for an explanation of how to view exhaustive details on
what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=cut

=encoding utf8

=head1 NAME

perl5180delta - what is new for perl v5.18.0

=head1 DESCRIPTION

This document describes differences between the v5.16.0 release and the v5.18.0
release.
If you are upgrading from an earlier release such as v5.14.0, first read
L<perl5160delta>, which describes differences between v5.14.0 and v5.16.0.

=head1 Core Enhancements

=head2 New mechanism for experimental features

Newly-added experimental features will now require this incantation:

    no warnings "experimental::feature_name";
    use feature "feature_name";  # would warn without the prev line

There is a new warnings category, called "experimental", containing
warnings that the L<feature> pragma emits when enabling experimental
features.

Newly-added experimental features will also be given special warning IDs,
which consist of "experimental::" followed by the name of the feature.  (The
plan is to extend this mechanism eventually to all warnings, to allow them
to be enabled or disabled individually, and not just by category.)

By saying

    no warnings "experimental::feature_name";

you are taking responsibility for any breakage that future changes to, or
removal of, the feature may cause.

Since some features (like C<~~> or C<my $_>) now emit experimental warnings,
and you may want to disable them in code that is also run on perls that do not
recognize these warning categories, consider using the C<if> pragma like this:

    no if $] >= 5.018, warnings => "experimental::feature_name";

Existing experimental features may begin emitting these warnings, too.  Please
consult L<perlexperiment> for information on which features are considered
experimental.

=head2 Hash overhaul

Changes to the implementation of hashes in perl v5.18.0 will be one of the most
visible changes to the behavior of existing code.

By default, two distinct hash variables with identical keys and values may now
provide their contents in a different order where it was previously identical.

When encountering these changes, the key to cleaning up from them is to accept
that B<hashes are unordered collections> and to act accordingly.

=head3 Hash randomization

The seed used by Perl's hash function is now random.
This means that the order in which keys/values will be returned from functions
like C<keys()>, C<values()>, and C<each()> will differ from run to run.

This change was introduced to make Perl's hashes more robust to algorithmic
complexity attacks, and also because we discovered that it exposes hash
ordering dependency bugs and makes them easier to track down.

Toolchain maintainers might want to invest in additional infrastructure to
test for things like this.  Running tests several times in a row and then
comparing results will make it easier to spot hash order dependencies in
code.  Authors are strongly encouraged not to expose the key order of
Perl's hashes to insecure audiences.

Further, every hash has its own iteration order, which should make it much
more difficult to determine what the current hash seed is.

=head3 New hash functions

Perl v5.18 includes support for multiple hash functions, and changes
the default (to ONE_AT_A_TIME_HARD); you can choose a different
algorithm by defining a symbol at compile time.  For a current list,
consult the F<INSTALL> document.  Note that as of Perl v5.18 we can
only recommend use of the default or SIPHASH.  All the others are
known to have security issues and are for research purposes only.

=head3 PERL_HASH_SEED environment variable now takes a hex value

C<PERL_HASH_SEED> no longer accepts an integer as a parameter;
instead the value is expected to be a binary value encoded in a hex
string, such as "0xf5867c55039dc724".  This is to make the
infrastructure support hash seeds of arbitrary lengths, which might
exceed that of an integer.  (SipHash uses a 16 byte seed.)

=head3 PERL_PERTURB_KEYS environment variable added

The C<PERL_PERTURB_KEYS> environment variable allows one to control the level
of randomization applied to C<keys> and friends.

When C<PERL_PERTURB_KEYS> is 0, perl will not randomize the key order at all.
The chance that C<keys> changes due to an insert will be the same as in previous
perls, basically only when the bucket size is changed.

When C<PERL_PERTURB_KEYS> is 1, perl will randomize keys in a non-repeatable
way.  The chance that C<keys> changes due to an insert will be very high.  This
is the most secure and default mode.

When C<PERL_PERTURB_KEYS> is 2, perl will randomize keys in a repeatable way.
Repeated runs of the same program should produce the same output every time.

C<PERL_HASH_SEED> implies a non-default C<PERL_PERTURB_KEYS> setting.  Setting
C<PERL_HASH_SEED=0> (exactly one 0) implies C<PERL_PERTURB_KEYS=0> (hash key
randomization disabled); setting C<PERL_HASH_SEED> to any other value implies
C<PERL_PERTURB_KEYS=2> (deterministic and repeatable hash key randomization).
Specifying C<PERL_PERTURB_KEYS> explicitly to a different level overrides this
behavior.

=head3 Hash::Util::hash_seed() now returns a string

Hash::Util::hash_seed() now returns a string instead of an integer.  This
is to make the infrastructure support hash seeds of arbitrary lengths
which might exceed that of an integer.  (SipHash uses a 16 byte seed.)

=head3 Output of PERL_HASH_SEED_DEBUG has been changed

The environment variable PERL_HASH_SEED_DEBUG now makes perl show both the
hash function perl was built with, I<and> the seed, in hex, in use for that
process.  Code parsing this output, should it exist, must change to accommodate
the new format.  Example of the new format:

    $ PERL_HASH_SEED_DEBUG=1 ./perl -e1
    HASH_FUNCTION = MURMUR3 HASH_SEED = 0x1476bb9f

=head2 Upgrade to Unicode 6.2

Perl now supports Unicode 6.2.  A list of changes from Unicode
6.1 is at L<http://www.unicode.org/versions/Unicode6.2.0>.

=head2 Character name aliases may now include non-Latin1-range characters

It is possible to define your own names for characters for use in
C<\N{...}>, C<charnames::vianame()>, etc.  These names can now be
comprised of characters from the whole Unicode range.
This allows for names to be in your native language, and not just English.
Certain restrictions apply to the characters that may be used (you can't
define a name that has punctuation in it, for example).  See
L<charnames/CUSTOM ALIASES>.

=head2 New DTrace probes

The following new DTrace probes have been added:

=over 4

=item *

C<op-entry>

=item *

C<loading-file>

=item *

C<loaded-file>

=back

=head2 C<${^LAST_FH}>

This new variable provides access to the filehandle that was last read.
This is the handle used by C<$.> and by C<tell> and C<eof> without
arguments.

=head2 Regular Expression Set Operations

This is an B<experimental> feature to allow matching against the union,
intersection, etc., of sets of code points, similar to
L<Unicode::Regex::Set>.  It can also be used to extend C</x> processing
to [bracketed] character classes, and as a replacement of user-defined
properties, allowing more complex expressions than they do.  See
L<perlrecharclass/Extended Bracketed Character Classes>.

=head2 Lexical subroutines

This new feature is still considered B<experimental>.  To enable it:

    use 5.018;
    no warnings "experimental::lexical_subs";
    use feature "lexical_subs";

You can now declare subroutines with C<state sub foo>, C<my sub foo>, and
C<our sub foo>.  (C<state sub> requires that the "state" feature be
enabled, unless you write it as C<CORE::state sub foo>.)

C<state sub> creates a subroutine visible within the lexical scope in which
it is declared.  The subroutine is shared between calls to the outer sub.

C<my sub> declares a lexical subroutine that is created each time the
enclosing block is entered.  C<state sub> is generally slightly faster than
C<my sub>.

C<our sub> declares a lexical alias to the package subroutine of the same
name.

For more information, see L<perlsub/Lexical Subroutines>.

=head2 Computed Labels

The loop controls C<next>, C<last> and C<redo>, and the special C<dump>
operator, now allow arbitrary expressions to be used to compute labels at run
time.
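A computed label can be sketched as follows.  This is a minimal illustration
assuming perl v5.18 or later; the loop bounds, the C<$target> variable and the
C<@seen> array are made up for the example.

```perl
#!/usr/bin/perl
use v5.18;
use warnings;

# Records which (i, j) combinations the loops actually visit.
my @seen;

OUTER: for my $i (1 .. 3) {
    INNER: for my $j (1 .. 3) {
        push @seen, "$i$j";

        # Choose the label at run time: once $j reaches 2, move on to the
        # next iteration of the outer loop, otherwise stay in the inner one.
        my $target = $j == 2 ? 'OUTER' : 'INNER';
        next $target;
    }
}

print "@seen\n";
```

Because the label is an ordinary string here, it could equally come from a
lookup table or a function call.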
Previously, any argument that was not a constant was treated as the empty string. =head2 More CORE:: subs Several more built-in functions have been added as subroutines to the CORE:: namespace - namely, those non-overridable keywords that can be implemented without custom parsers: C<defined>, C<delete>, C<exists>, C<glob>, C<pos>, C<prototype>, C<scalar>, C<split>, C<study>, and C<undef>. As some of these have prototypes, C<prototype('CORE::...')> has been changed to not make a distinction between overridable and non-overridable keywords. This is to make C<prototype('CORE::pos')> consistent with C<prototype(&CORE::pos)>. =head2 C<kill> with negative signal names C<kill> has always allowed a negative signal number, which kills the process group instead of a single process. It has also allowed signal names. But it did not behave consistently, because negative signal names were treated as 0. Now negative signal names like C<-INT> are supported and treated the same way as -2 [perl #112990]. =head1 Security =head2 See also: hash overhaul Some of the changes in the L<hash overhaul|/"Hash overhaul"> were made to enhance security. Please read that section. =head2 C<Storable> security warning in documentation The documentation for C<Storable> now includes a section which warns readers of the danger of accepting Storable documents from untrusted sources. The short version is that deserializing certain types of data can lead to loading modules and other code execution. This behavior is documented and intentional, but it opens an attack vector for malicious entities. =head2 C<Locale::Maketext> allowed code injection via a malicious template If users could provide a translation string to Locale::Maketext, this could be used to invoke arbitrary Perl subroutines available in the current process. This has been fixed, but it is still possible to invoke any method provided by C<Locale::Maketext> itself or a subclass that you are using.
One of these methods in turn will invoke the Perl core's C<sprintf> subroutine. In summary, allowing users to provide translation strings without auditing them is a bad idea. This vulnerability is documented in CVE-2012-6329. =head2 Avoid calling memset with a negative count Poorly written perl code that allows an attacker to specify the count to perl's C<x> string repeat operator can already cause a memory exhaustion denial-of-service attack. A flaw in versions of perl before v5.15.5 can escalate that into a heap buffer overrun; coupled with versions of glibc before 2.16, it possibly allows the execution of arbitrary code. The flaw addressed by this commit has been assigned identifier CVE-2012-5195 and was researched by Tim Brown. =head1 Incompatible Changes =head2 See also: hash overhaul Some of the changes in the L<hash overhaul|/"Hash overhaul"> are not fully compatible with previous versions of perl. Please read that section. =head2 An unknown character name in C<\N{...}> is now a syntax error Previously, it warned, and the Unicode REPLACEMENT CHARACTER was substituted. Unicode now recommends that this situation be a syntax error. Also, the previous behavior led to some confusing warnings and behaviors, and since the REPLACEMENT CHARACTER has no use other than as a stand-in for some unknown character, any code that has this problem is buggy. =head2 Formerly deprecated characters in C<\N{}> character name aliases are now errors. Since v5.12.0, it has been deprecated to use certain characters in user-defined C<\N{...}> character names. These now cause a syntax error. For example, it is now an error to begin a name with a digit, such as in my $undraftable = "\N{4F}"; # Syntax error! or to have commas anywhere in the name. See L<charnames/CUSTOM ALIASES>. =head2 C<\N{BELL}> now refers to U+1F514 instead of U+0007 Unicode 6.0 reused the name "BELL" for a different code point than it traditionally had meant.
Since Perl v5.14, use of this name still referred to U+0007, but would raise a deprecation warning. Now, "BELL" refers to U+1F514, and the name for U+0007 is "ALERT". All the functions in L<charnames> have been correspondingly updated. =head2 New Restrictions in Multi-Character Case-Insensitive Matching in Regular Expression Bracketed Character Classes Unicode has now withdrawn their previous recommendation for regular expressions to automatically handle cases where a single character can match multiple characters case-insensitively, for example, the letter LATIN SMALL LETTER SHARP S and the sequence C<ss>. This is because it turns out to be impracticable to do this correctly in all circumstances. Because Perl has tried to do this as best it can, it will continue to do so. (We are considering an option to turn it off.) However, a new restriction is being added on such matches when they occur in [bracketed] character classes. People were specifying things such as C</[\0-\xff]/i>, and being surprised that it matches the two character sequence C<ss> (since LATIN SMALL LETTER SHARP S occurs in this range). This behavior is also inconsistent with using a property instead of a range: C<\p{Block=Latin1}> also includes LATIN SMALL LETTER SHARP S, but C</[\p{Block=Latin1}]/i> does not match C<ss>. The new rule is that for there to be a multi-character case-insensitive match within a bracketed character class, the character must be explicitly listed, and not as an end point of a range. This more closely obeys the Principle of Least Astonishment. See L<perlrecharclass/Bracketed Character Classes>. Note that a bug [perl #89774], now fixed as part of this change, prevented the previous behavior from working fully. =head2 Explicit rules for variable names and identifiers Due to an oversight, single character variable names in v5.16 were completely unrestricted. This opened the door to several kinds of insanity. 
As of v5.18, these now follow the rules of other identifiers, in addition to accepting characters that match the C<\p{POSIX_Punct}> property. There is no longer any difference in the parsing of identifiers specified by using braces versus without braces. For instance, perl used to allow C<${foo:bar}> (with a single colon) but not C<$foo:bar>. Now that both are handled by a single code path, they are both treated the same way: both are forbidden. Note that this change is about the range of permissible literal identifiers, not other expressions. =head2 Vertical tabs are now whitespace No one could recall why C<\s> didn't match C<\cK>, the vertical tab. Now it does. Given the extreme rarity of that character, very little breakage is expected. That said, here's what it means: C<\s> in a regex now matches a vertical tab in all circumstances. Literal vertical tabs in a regex literal are ignored when the C</x> modifier is used. Leading vertical tabs, alone or mixed with other whitespace, are now ignored when interpreting a string as a number. For example: $dec = " \cK \t 123"; $hex = " \cK \t 0xF"; say 0 + $dec; # was 0 with warning, now 123 say int $dec; # was 0, now 123 say oct $hex; # was 0, now 15 =head2 C</(?{})/> and C</(??{})/> have been heavily reworked The implementation of this feature has been almost completely rewritten. Although its main intent is to fix bugs, some behaviors, especially related to the scope of lexical variables, will have changed. This is described more fully in the L</Selected Bug Fixes> section. =head2 Stricter parsing of substitution replacement It is no longer possible to abuse the way the parser parses C<s///e> like this: %_=(_,"Just another "); $_="Perl hacker,\n"; s//_}->{_/e;print =head2 C<given> now aliases the global C<$_> Instead of assigning to an implicit lexical C<$_>, C<given> now makes the global C<$_> an alias for its argument, just like C<foreach>. 
However, it still uses lexical C<$_> if there is lexical C<$_> in scope (again, just like C<foreach>) [perl #114020]. =head2 The smartmatch family of features are now experimental Smart match, added in v5.10.0 and significantly revised in v5.10.1, has been a regular point of complaint. Although there are a number of ways in which it is useful, it has also proven problematic and confusing for both users and implementors of Perl. There have been a number of proposals on how to best address the problem. It is clear that smartmatch is almost certainly either going to change or go away in the future. Relying on its current behavior is not recommended. Warnings will now be issued when the parser sees C<~~>, C<given>, or C<when>. To disable these warnings, you can add this line to the appropriate scope: no if $] >= 5.018, warnings => "experimental::smartmatch"; Consider, though, replacing the use of these features, as they may change behavior again before becoming stable. =head2 Lexical C<$_> is now experimental Since it was introduced in Perl v5.10, it has caused much confusion with no obvious solution: =over =item * Various modules (e.g., List::Util) expect callback routines to use the global C<$_>. C<use List::Util 'first'; my $_; first { $_ == 1 } @list> does not work as one would expect. =item * A C<my $_> declaration earlier in the same file can cause confusing closure warnings. =item * The "_" subroutine prototype character allows called subroutines to access your lexical C<$_>, so it is not really private after all. =item * Nevertheless, subroutines with a "(@)" prototype and methods cannot access the caller's lexical C<$_>, unless they are written in XS. =item * But even XS routines cannot access a lexical C<$_> declared, not in the calling subroutine, but in an outer scope, iff that subroutine happened not to mention C<$_> or use any operators that default to C<$_>. 
=back It is our hope that lexical C<$_> can be rehabilitated, but this may cause changes in its behavior. Please use it with caution until it becomes stable. =head2 readline() with C<$/ = \N> now reads N characters, not N bytes Previously, when reading from a stream with I/O layers such as C<encoding>, the readline() function, otherwise known as the C<< <> >> operator, would read I<N> bytes from the top-most layer. [perl #79960] Now, I<N> characters are read instead. There is no change in behaviour when reading from streams with no extra layers, since bytes map exactly to characters. =head2 Overridden C<glob> is now passed one argument C<glob> overrides used to be passed a magical undocumented second argument that identified the caller. Nothing on CPAN was using this, and it got in the way of a bug fix, so it was removed. If you really need to identify the caller, see L<Devel::Callsite> on CPAN. =head2 Here doc parsing The body of a here document inside a quote-like operator now always begins on the line after the "<<foo" marker. Previously, it was documented to begin on the line following the containing quote-like operator, but that was only sometimes the case [perl #114040]. =head2 Alphanumeric operators must now be separated from the closing delimiter of regular expressions You may no longer write something like: m/a/and 1 Instead you must write m/a/ and 1 with whitespace separating the operator from the closing delimiter of the regular expression. Not having whitespace has resulted in a deprecation warning since Perl v5.14.0. =head2 qw(...) can no longer be used as parentheses C<qw> lists used to fool the parser into thinking they were always surrounded by parentheses. This permitted some surprising constructions such as C<foreach $x qw(a b c) {...}>, which should really be written C<foreach $x (qw(a b c)) {...}>. 
These would sometimes get the lexer into the wrong state, so they didn't fully work, and the similar C<foreach qw(a b c) {...}> that one might expect to be permitted never worked at all. This side effect of C<qw> has now been abolished. It has been deprecated since Perl v5.13.11. It is now necessary to use real parentheses everywhere that the grammar calls for them. =head2 Interaction of lexical and default warnings Turning on any lexical warnings used first to disable all default warnings if lexical warnings were not already enabled: $*; # deprecation warning use warnings "void"; $#; # void warning; no deprecation warning Now, the C<debugging>, C<deprecated>, C<glob>, C<inplace> and C<malloc> warnings categories are left on when turning on lexical warnings (unless they are turned off by C<no warnings>, of course). This may cause deprecation warnings to occur in code that used to be free of warnings. Those are the only categories consisting only of default warnings. Default warnings in other categories are still disabled by C<< use warnings "category" >>, as we do not yet have the infrastructure for controlling individual warnings. =head2 C<state sub> and C<our sub> Due to an accident of history, C<state sub> and C<our sub> were equivalent to a plain C<sub>, so one could even create an anonymous sub with C<our sub { ... }>. These are now disallowed outside of the "lexical_subs" feature. Under the "lexical_subs" feature they have new meanings described in L<perlsub/Lexical Subroutines>. =head2 Defined values stored in environment are forced to byte strings A value stored in an environment variable has always been stringified when inherited by child processes. In this release, when assigning to C<%ENV>, values are immediately stringified, and converted to be only a byte string. First, it is forced to be only a string. 
Then if the string is utf8 and the equivalent of C<utf8::downgrade()> works, that result is used; otherwise, the equivalent of C<utf8::encode()> is used, and a warning is issued about wide characters (L</Diagnostics>). =head2 C<require> dies for unreadable files When C<require> encounters an unreadable file, it now dies. It used to ignore the file and continue searching the directories in C<@INC> [perl #113422]. =head2 C<gv_fetchmeth_*> and SUPER The various C<gv_fetchmeth_*> XS functions used to treat a package whose name ended with C<::SUPER> specially. A method lookup on the C<Foo::SUPER> package would be treated as a C<SUPER> method lookup on the C<Foo> package. This is no longer the case. To do a C<SUPER> lookup, pass the C<Foo> stash and the C<GV_SUPER> flag. =head2 C<split>'s first argument is more consistently interpreted After some changes earlier in v5.17, C<split>'s behavior has been simplified: if the PATTERN argument evaluates to a string containing one space, it is treated the way that a I<literal> string containing one space once was. =head1 Deprecations =head2 Module removals The following modules will be removed from the core distribution in a future release, and will at that time need to be installed from CPAN. Distributions on CPAN which require these modules will need to list them as prerequisites. The core versions of these modules will now issue C<"deprecated">-category warnings to alert you to this fact. To silence these deprecation warnings, install the modules in question from CPAN. Note that these are (with rare exceptions) fine modules that you are encouraged to continue to use. Their disinclusion from core primarily hinges on their necessity to bootstrapping a fully functional, CPAN-capable Perl installation, not usually on concerns over their design. =over =item L<encoding> The use of this pragma is now strongly discouraged.
It conflates the encoding of source text with the encoding of I/O data, reinterprets escape sequences in source text (a questionable choice), and introduces the UTF-8 bug to all runtime handling of character strings. It is broken as designed and beyond repair. For using non-ASCII literal characters in source text, please refer to L<utf8>. For dealing with textual I/O data, please refer to L<Encode> and L<open>. =item L<Archive::Extract> =item L<B::Lint> =item L<B::Lint::Debug> =item L<CPANPLUS> and all included C<CPANPLUS::*> modules =item L<Devel::InnerPackage> =item L<Log::Message> =item L<Log::Message::Config> =item L<Log::Message::Handlers> =item L<Log::Message::Item> =item L<Log::Message::Simple> =item L<Module::Pluggable> =item L<Module::Pluggable::Object> =item L<Object::Accessor> =item L<Pod::LaTeX> =item L<Term::UI> =item L<Term::UI::History> =back =head2 Deprecated Utilities The following utilities will be removed from the core distribution in a future release as their associated modules have been deprecated. They will remain available with the applicable CPAN distribution. =over =item L<cpanp> =item C<cpanp-run-perl> =item L<cpan2dist> These items are part of the C<CPANPLUS> distribution. =item L<pod2latex> This item is part of the C<Pod::LaTeX> distribution. =back =head2 PL_sv_objcount This interpreter-global variable used to track the total number of Perl objects in the interpreter. It is no longer maintained and will be removed altogether in Perl v5.20. =head2 Five additional characters should be escaped in patterns with C</x> When a regular expression pattern is compiled with C</x>, Perl treats 6 characters as white space to ignore, such as SPACE and TAB. However, Unicode recommends 11 characters be treated thusly. We will conform with this in a future Perl version. In the meantime, use of any of the missing characters will raise a deprecation warning, unless turned off. 
The five characters are: U+0085 NEXT LINE U+200E LEFT-TO-RIGHT MARK U+200F RIGHT-TO-LEFT MARK U+2028 LINE SEPARATOR U+2029 PARAGRAPH SEPARATOR =head2 User-defined charnames with surprising whitespace A user-defined character name with trailing or multiple spaces in a row is likely a typo. This now generates a warning when defined, on the assumption that uses of it will be unlikely to include the excess whitespace. =head2 Various XS-callable functions are now deprecated All the functions used to classify characters will be removed from a future version of Perl, and should not be used. With participating C compilers (e.g., gcc), compiling any file that uses any of these will generate a warning. These were not intended for public use; there are equivalent, faster, macros for most of them. See L<perlapi/Character classes>. The complete list is: C<is_uni_alnum>, C<is_uni_alnumc>, C<is_uni_alnumc_lc>, C<is_uni_alnum_lc>, C<is_uni_alpha>, C<is_uni_alpha_lc>, C<is_uni_ascii>, C<is_uni_ascii_lc>, C<is_uni_blank>, C<is_uni_blank_lc>, C<is_uni_cntrl>, C<is_uni_cntrl_lc>, C<is_uni_digit>, C<is_uni_digit_lc>, C<is_uni_graph>, C<is_uni_graph_lc>, C<is_uni_idfirst>, C<is_uni_idfirst_lc>, C<is_uni_lower>, C<is_uni_lower_lc>, C<is_uni_print>, C<is_uni_print_lc>, C<is_uni_punct>, C<is_uni_punct_lc>, C<is_uni_space>, C<is_uni_space_lc>, C<is_uni_upper>, C<is_uni_upper_lc>, C<is_uni_xdigit>, C<is_uni_xdigit_lc>, C<is_utf8_alnum>, C<is_utf8_alnumc>, C<is_utf8_alpha>, C<is_utf8_ascii>, C<is_utf8_blank>, C<is_utf8_char>, C<is_utf8_cntrl>, C<is_utf8_digit>, C<is_utf8_graph>, C<is_utf8_idcont>, C<is_utf8_idfirst>, C<is_utf8_lower>, C<is_utf8_mark>, C<is_utf8_perl_space>, C<is_utf8_perl_word>, C<is_utf8_posix_digit>, C<is_utf8_print>, C<is_utf8_punct>, C<is_utf8_space>, C<is_utf8_upper>, C<is_utf8_xdigit>, C<is_utf8_xidcont>, C<is_utf8_xidfirst>. In addition these three functions that have never worked properly are deprecated: C<to_uni_lower_lc>, C<to_uni_title_lc>, and C<to_uni_upper_lc>. 
=head2 Certain rare uses of backslashes within regexes are now deprecated There are three pairs of characters that Perl recognizes as metacharacters in regular expression patterns: C<{}>, C<[]>, and C<()>. These can be used as well to delimit patterns, as in: m{foo} s(foo)(bar) Since they are metacharacters, they have special meaning to regular expression patterns, and it turns out that you can't turn off that special meaning by the normal means of preceding them with a backslash, if you use them, paired, within a pattern delimited by them. For example, in m{foo\{1,3\}} the backslashes do not change the behavior, and this matches S<C<"f o">> followed by one to three more occurrences of C<"o">. Usages like this, where they are interpreted as metacharacters, are exceedingly rare; we think there are none, for example, in all of CPAN. Hence, this deprecation should affect very little code. It does give notice, however, that any such code needs to change, which will in turn allow us to change the behavior in future Perl versions so that the backslashes do have an effect, and without fear that we are silently breaking any existing code. =head2 Splitting the tokens C<(?> and C<(*> in regular expressions A deprecation warning is now raised if the C<(> and C<?> are separated by white space or comments in C<(?...)> regular expression constructs. Similarly, if the C<(> and C<*> are separated in C<(*VERB...)> constructs. =head2 Pre-PerlIO IO implementations In theory, you can currently build perl without PerlIO. Instead, you'd use a wrapper around stdio or sfio. In practice, this isn't very useful. It's not well tested, and without any support for IO layers or (thus) Unicode, it's not much of a perl. Building without PerlIO will most likely be removed in the next version of perl. PerlIO supports a C<stdio> layer if stdio use is desired. Similarly a sfio layer could be produced in the future, if needed. 
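For code that wants stdio buffering on a particular handle, something along the following lines may serve as a starting point. It is only a sketch, assuming a perl built with PerlIO; the temporary file and the variable names are purely illustrative.

```perl
use strict;
use warnings;
use Config;
use File::Temp qw(tempfile);

# 'useperlio' records whether this perl was built with PerlIO.
my $has_perlio = defined $Config{useperlio} && $Config{useperlio} eq 'define';
print "built with PerlIO: ", ($has_perlio ? "yes" : "no"), "\n";

# Create a small scratch file to read back (illustrative only).
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "hello\n";
close $tmp or die "cannot close $path: $!";

# Ask for the stdio layer explicitly when opening the file.
open(my $fh, '<:stdio', $path) or die "cannot open $path: $!";
my @layers = PerlIO::get_layers($fh);
my $line   = <$fh>;
close $fh;

print "layers: @layers\n";
```

C<PerlIO::get_layers> reports the layers actually in effect on the handle, which is a convenient way to confirm that the C<stdio> layer was applied.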
=head1 Future Deprecations =over =item * Platforms without support infrastructure Both Windows CE and z/OS have been historically under-maintained, and are currently neither successfully building nor regularly being smoke tested. Efforts are underway to change this situation, but it should not be taken for granted that the platforms are safe and supported. If they do not become buildable and regularly smoked, support for them may be actively removed in future releases. If you have an interest in these platforms and you can lend your time, expertise, or hardware to help support these platforms, please let the perl development effort know by emailing C<perl5-porters@perl.org>. Some platforms that appear otherwise entirely dead are also on the short list for removal between now and v5.20.0: =over =item DG/UX =item NeXT =back We also think it likely that current versions of Perl will no longer build on AmigaOS, DJGPP, NetWare (natively), OS/2 and Plan 9. If you are using Perl on such a platform and have an interest in ensuring Perl's future on them, please contact us. We believe that Perl has long been unable to build on mixed endian architectures (such as PDP-11s), and intend to remove any remaining support code. Similarly, code supporting the long unmaintained GNU dld will be removed soon if no one makes themselves known as an active user. =item * Swapping of $< and $> Perl has supported the idiom of swapping $< and $> (and likewise $( and $)) to temporarily drop permissions since 5.0, like this: ($<, $>) = ($>, $<); However, this idiom modifies the real user/group id, which can have undesirable side-effects, is no longer useful on any platform perl supports and complicates the implementation of these variables and list assignment in general. As an alternative, assignment only to C<< $> >> is recommended: local $> = $<; See also: L<Setuid Demystified|http://www.cs.berkeley.edu/~daw/papers/setuid-usenix02.pdf>.
=item * C<microperl>, long broken and of unclear present purpose, will be removed. =item * Revamping C<< "\Q" >> semantics in double-quotish strings when combined with other escapes. There are several bugs and inconsistencies involving combinations of C<\Q> and escapes like C<\x>, C<\L>, etc., within a C<\Q...\E> pair. These need to be fixed, and doing so will necessarily change current behavior. The changes have not yet been settled. =item * Use of C<$x>, where C<x> stands for any actual (non-printing) C0 control character will be disallowed in a future Perl version. Use C<${x}> instead (where again C<x> stands for a control character), or better, C<$^A> , where C<^> is a caret (CIRCUMFLEX ACCENT), and C<A> stands for any of the characters listed at the end of L<perlebcdic/OPERATOR DIFFERENCES>. =back =head1 Performance Enhancements =over 4 =item * Lists of lexical variable declarations (C<my($x, $y)>) are now optimised down to a single op and are hence faster than before. =item * A new C preprocessor define C<NO_TAINT_SUPPORT> was added that, if set, disables Perl's taint support altogether. Using the -T or -t command line flags will cause a fatal error. Beware that both core tests as well as many a CPAN distribution's tests will fail with this change. On the upside, it provides a small performance benefit due to reduced branching. B<Do not enable this unless you know exactly what you are getting yourself into.> =item * C<pack> with constant arguments is now constant folded in most cases [perl #113470]. =item * Speed up in regular expression matching against Unicode properties. The largest gain is for C<\X>, the Unicode "extended grapheme cluster." The gain for it is about 35% - 40%. Bracketed character classes, e.g., C<[0-9\x{100}]> containing code points above 255 are also now faster. =item * On platforms supporting it, several former macros are now implemented as static inline functions. This should speed things up slightly on non-GCC platforms. 
=item * The optimisation of hashes in boolean context has been extended to affect C<scalar(%hash)>, C<%hash ? ... : ...>, and C<sub { %hash || ... }>. =item * Filetest operators manage the stack in a fractionally more efficient manner. =item * Globs used in a numeric context are now numified directly in most cases, rather than being numified via stringification. =item * The C<x> repetition operator is now folded to a single constant at compile time if called in scalar context with constant operands and no parentheses around the left operand. =back =head1 Modules and Pragmata =head2 New Modules and Pragmata =over 4 =item * L<Config::Perl::V> version 0.16 has been added as a dual-lifed module. It provides structured data retrieval of C<perl -V> output including information only known to the C<perl> binary and not available via L<Config>. =back =head2 Updated Modules and Pragmata For a complete list of updates, run: $ corelist --diff 5.16.0 5.18.0 You can substitute your favorite version in place of C<5.16.0>, too. =over =item * L<Archive::Extract> has been upgraded to 0.68. Work around an edge case on Linux with Busybox's unzip. =item * L<Archive::Tar> has been upgraded to 1.90. ptar now supports the -T option as well as dashless options [rt.cpan.org #75473], [rt.cpan.org #75475]. Auto-encode filenames marked as UTF-8 [rt.cpan.org #75474]. Don't use C<tell> on L<IO::Zlib> handles [rt.cpan.org #64339]. Don't try to C<chown> on symlinks. =item * L<autodie> has been upgraded to 2.13. C<autodie> now plays nicely with the 'open' pragma. =item * L<B> has been upgraded to 1.42. The C<stashoff> method of COPs has been added. This provides access to an internal field added in perl 5.16 under threaded builds [perl #113034]. C<B::COP::stashpv> now supports UTF-8 package names and embedded NULs. All C<CVf_*> and C<GVf_*> and more SV-related flag values are now provided as constants in the C<B::> namespace and available for export. The default export list has not changed. 
This makes the module work with the new pad API. =item * L<B::Concise> has been upgraded to 0.95. The C<-nobanner> option has been fixed, and C<format>s can now be dumped. When passed a sub name to dump, it will check also to see whether it is the name of a format. If a sub and a format share the same name, it will dump both. This adds support for the new C<OPpMAYBE_TRUEBOOL> and C<OPpTRUEBOOL> flags. =item * L<B::Debug> has been upgraded to 1.18. This adds support (experimentally) for C<B::PADLIST>, which was added in Perl 5.17.4. =item * L<B::Deparse> has been upgraded to 1.20. Avoid warning when run under C<perl -w>. It now deparses loop controls with the correct precedence, and multiple statements in a C<format> line are also now deparsed correctly. This release suppresses trailing semicolons in formats. This release adds stub deparsing for lexical subroutines. It no longer dies when deparsing C<sort> without arguments. It now correctly omits the comma for C<system $prog @args> and C<exec $prog @args>. =item * L<bignum>, L<bigint> and L<bigrat> have been upgraded to 0.33. The overrides for C<hex> and C<oct> have been rewritten, eliminating several problems, and making one incompatible change: =over =item * Formerly, whichever of C<use bigint> or C<use bigrat> was compiled later would take precedence over the other, causing C<hex> and C<oct> not to respect the other pragma when in scope. =item * Using any of these three pragmata would cause C<hex> and C<oct> anywhere else in the program to evaluate their arguments in list context and prevent them from inferring $_ when called without arguments. =item * Using any of these three pragmata would make C<oct("1234")> return 1234 (for any number not beginning with 0) anywhere in the program. Now "1234" is translated from octal to decimal, whether within the pragma's scope or not.
=item * The global overrides that facilitate lexical use of C<hex> and C<oct> now respect any existing overrides that were in place before the new overrides were installed, falling back to them outside of the scope of C<use bignum>. =item * C<use bignum "hex">, C<use bignum "oct"> and similar invocations for bigint and bigrat now export a C<hex> or C<oct> function, instead of providing a global override. =back =item * L<Carp> has been upgraded to 1.29. Carp is no longer confused when C<caller> returns undef for a package that has been deleted. The C<longmess()> and C<shortmess()> functions are now documented. =item * L<CGI> has been upgraded to 3.63. Unrecognized HTML escape sequences are now handled better, problematic trailing newlines are no longer inserted after E<lt>formE<gt> tags by C<startform()> or C<start_form()>, and bogus "Insecure Dependency" warnings appearing with some versions of perl are now worked around. =item * L<Class::Struct> has been upgraded to 0.64. The constructor now respects overridden accessor methods [perl #29230]. =item * L<Compress::Raw::Bzip2> has been upgraded to 2.060. The misuse of Perl's "magic" API has been fixed. =item * L<Compress::Raw::Zlib> has been upgraded to 2.060. Upgrade bundled zlib to version 1.2.7. Fix build failures on Irix, Solaris, and Win32, and also when building as C++ [rt.cpan.org #69985], [rt.cpan.org #77030], [rt.cpan.org #75222]. The misuse of Perl's "magic" API has been fixed. C<compress()>, C<uncompress()>, C<memGzip()> and C<memGunzip()> have been speeded up by making parameter validation more efficient. =item * L<CPAN::Meta::Requirements> has been upgraded to 2.122. Treat undef requirements to C<from_string_hash> as 0 (with a warning). Added C<requirements_for_module> method. =item * L<CPANPLUS> has been upgraded to 0.9135. Allow adding F<blib/script> to PATH. Save the history between invocations of the shell. Handle multiple C<makemakerargs> and C<makeflags> arguments better. 
This resolves issues with the SQLite source engine. =item * L<Data::Dumper> has been upgraded to 2.145. It has been optimized to only build a seen-scalar hash as necessary, thereby speeding up serialization drastically. Additional tests were added in order to improve statement, branch, condition and subroutine coverage. On the basis of the coverage analysis, some of the internals of Dumper.pm were refactored. Almost all methods are now documented. =item * L<DB_File> has been upgraded to 1.827. The main Perl module no longer uses the C<"@_"> construct. =item * L<Devel::Peek> has been upgraded to 1.11. This fixes compilation with C++ compilers and makes the module work with the new pad API. =item * L<Digest::MD5> has been upgraded to 2.52. Fix C<Digest::Perl::MD5> OO fallback [rt.cpan.org #66634]. =item * L<Digest::SHA> has been upgraded to 5.84. This fixes a double-free bug, which might have caused vulnerabilities in some cases. =item * L<DynaLoader> has been upgraded to 1.18. This is due to a minor code change in the XS for the VMS implementation. This fixes warnings about using C<CODE> sections without an C<OUTPUT> section. =item * L<Encode> has been upgraded to 2.49. The Mac alias x-mac-ce has been added, and various bugs have been fixed in Encode::Unicode, Encode::UTF7 and Encode::GSM0338. =item * L<Env> has been upgraded to 1.04. Its SPLICE implementation no longer misbehaves in list context. =item * L<ExtUtils::CBuilder> has been upgraded to 0.280210. Manifest files are now correctly embedded for those versions of VC++ which make use of them. [perl #111782, #111798]. A list of symbols to export can now be passed to C<link()> when on Windows, as on other OSes [perl #115100]. =item * L<ExtUtils::ParseXS> has been upgraded to 3.18. The generated C code now avoids unnecessarily incrementing C<PL_amagic_generation> on Perl versions where it's done automatically (or on current Perl where the variable no longer exists). 
This avoids a bogus warning for initialised XSUB non-parameters [perl #112776]. =item * L<File::Copy> has been upgraded to 2.26. C<copy()> no longer zeros files when copying into the same directory, and also now fails (as it has long been documented to do) when attempting to copy a file over itself. =item * L<File::DosGlob> has been upgraded to 1.10. The internal cache of file names that it keeps for each caller is now freed when that caller is freed. This means C<< use File::DosGlob 'glob'; eval 'scalar <*>' >> no longer leaks memory. =item * L<File::Fetch> has been upgraded to 0.38. Added the 'file_default' option for URLs that do not have a file component. Use C<File::HomeDir> when available, and provide C<PERL5_CPANPLUS_HOME> to override the autodetection. Always re-fetch F<CHECKSUMS> if C<fetchdir> is set. =item * L<File::Find> has been upgraded to 1.23. This fixes inconsistent unixy path handling on VMS. Individual files may now appear in the list of directories to be searched [perl #59750]. =item * L<File::Glob> has been upgraded to 1.20. File::Glob has had exactly the same fix as File::DosGlob. Since it is what Perl's own C<glob> operator itself uses (except on VMS), this means C<< eval 'scalar <*>' >> no longer leaks. A space-separated list of patterns that returns long lists of results no longer causes memory corruption or crashes. This bug was introduced in Perl 5.16.0. [perl #114984] =item * L<File::Spec::Unix> has been upgraded to 3.40. C<abs2rel> could produce incorrect results when given two relative paths or the root directory twice [perl #111510]. =item * L<File::stat> has been upgraded to 1.07. C<File::stat> ignores the L<filetest> pragma, and warns when used in combination therewith. But it was not warning for C<-r>. This has been fixed [perl #111640]. C<-p> now works, and does not return false for pipes [perl #111638].
Previously C<File::stat>'s overloaded C<-x> and C<-X> operators did not give the correct results for directories or executable files when running as root. They had been treating executable permissions for root just like for any other user, performing group membership tests I<etc> for files not owned by root. They now follow the correct Unix behaviour - for a directory they are always true, and for a file if any of the three execute permission bits are set then they report that root can execute the file. Perl's builtin C<-x> and C<-X> operators have always been correct. =item * L<File::Temp> has been upgraded to 0.23. Fixes various bugs involving directory removal. Defers unlinking tempfiles if the initial unlink fails, which fixes problems on NFS. =item * L<GDBM_File> has been upgraded to 1.15. The undocumented optional fifth parameter to C<TIEHASH> has been removed. This was intended to provide control of the callback used by C<gdbm*> functions in case of fatal errors (such as filesystem problems), but did not work (and could never have worked). No code on CPAN even attempted to use it. The callback is now always the previous default, C<croak>. Problems on some platforms with how the C<C> C<croak> function is called have also been resolved. =item * L<Hash::Util> has been upgraded to 0.15. C<hash_unlocked> and C<hashref_unlocked> now return true if the hash is unlocked, instead of always returning false [perl #112126]. C<hash_unlocked>, C<hashref_unlocked>, C<lock_hash_recurse> and C<unlock_hash_recurse> are now exportable [perl #112126]. Two new functions, C<hash_locked> and C<hashref_locked>, have been added. Oddly enough, these two functions were already exported, even though they did not exist [perl #112126]. =item * L<HTTP::Tiny> has been upgraded to 0.025. Add SSL verification features [github #6], [github #9]. Include the final URL in the response hashref. Add C<local_address> option. This improves SSL support. =item * L<IO> has been upgraded to 1.28.
C<sync()> can now be called on read-only file handles [perl #64772]. L<IO::Socket> tries harder to cache or otherwise fetch socket information. =item * L<IPC::Cmd> has been upgraded to 0.80. Use C<POSIX::_exit> instead of C<exit> in C<run_forked> [rt.cpan.org #76901]. =item * L<IPC::Open3> has been upgraded to 1.13. The C<open3()> function no longer uses C<POSIX::close()> to close file descriptors since that breaks the ref-counting of file descriptors done by PerlIO in cases where the file descriptors are shared by PerlIO streams, leading to attempts to close the file descriptors a second time when any such PerlIO streams are closed later on. =item * L<Locale::Codes> has been upgraded to 3.25. It includes some new codes. =item * L<Memoize> has been upgraded to 1.03. Fix the C<MERGE> cache option. =item * L<Module::Build> has been upgraded to 0.4003. Fixed bug where modules without C<$VERSION> might have a version of '0' listed in 'provides' metadata, which will be rejected by PAUSE. Fixed bug in PodParser to allow numerals in module names. Fixed bug where giving arguments twice led to them becoming arrays, resulting in install paths like F<ARRAY(0xdeadbeef)/lib/Foo.pm>. A minor bug fix allows markup to be used around the leading "Name" in a POD "abstract" line, and some documentation improvements have been made. =item * L<Module::CoreList> has been upgraded to 2.90. Version information is now stored as a delta, which greatly reduces the size of the F<CoreList.pm> file. This restores compatibility with older versions of perl and cleans up the corelist data for various modules. =item * L<Module::Load::Conditional> has been upgraded to 0.54. Fix use of C<requires> on perls installed to a path with spaces. Various enhancements include the new use of Module::Metadata. =item * L<Module::Metadata> has been upgraded to 1.000011.
The creation of a Module::Metadata object for a typical module file has been sped up by about 40%, and some spurious warnings about C<$VERSION>s have been suppressed. =item * L<Module::Pluggable> has been upgraded to 4.7. Amongst other changes, triggers are now allowed on events, which gives a powerful way to modify behaviour. =item * L<Net::Ping> has been upgraded to 2.41. This fixes some test failures on Windows. =item * L<Opcode> has been upgraded to 1.25. Reflect the removal of the boolkeys opcode and the addition of the clonecv, introcv and padcv opcodes. =item * L<overload> has been upgraded to 1.22. C<no overload> now warns for invalid arguments, just like C<use overload>. =item * L<PerlIO::encoding> has been upgraded to 0.16. This is the module implementing the ":encoding(...)" I/O layer. It no longer corrupts memory or crashes when the encoding back-end reallocates the buffer or gives it a typeglob or shared hash key scalar. =item * L<PerlIO::scalar> has been upgraded to 0.16. The buffer scalar supplied may now only contain code points 0xFF or lower. [perl #109828] =item * L<Perl::OSType> has been upgraded to 1.003. This fixes a bug detecting the VOS operating system. =item * L<Pod::Html> has been upgraded to 1.18. The option C<--libpods> has been reinstated. It is deprecated, and its use does nothing other than issue a warning that it is no longer supported. Since the HTML files generated by pod2html claim to have a UTF-8 charset, actually write the files out using UTF-8 [perl #111446]. =item * L<Pod::Simple> has been upgraded to 3.28. Numerous improvements have been made, mostly to Pod::Simple::XHTML, which also has a compatibility change: the C<codes_in_verbatim> option is now disabled by default. See F<cpan/Pod-Simple/ChangeLog> for the full details. =item * L<re> has been upgraded to 0.23. Single character [class]es like C</[s]/> or C</[s]/i> are now optimized as if they did not have the brackets, i.e. C</s/> or C</s/i>.
See note about C<op_comp> in the L</Internal Changes> section below. =item * L<Safe> has been upgraded to 2.35. Fix interactions with C<Devel::Cover>. Don't eval code under C<no strict>. =item * L<Scalar::Util> has been upgraded to version 1.27. Fix an overloading issue with C<sum>. C<first> and C<reduce> now check the callback first (so C<&first(1)> is disallowed). Fix C<tainted> on magical values [rt.cpan.org #55763]. Fix C<sum> on previously magical values [rt.cpan.org #61118]. Fix reading past the end of a fixed buffer [rt.cpan.org #72700]. =item * L<Search::Dict> has been upgraded to 1.07. No longer require C<stat> on filehandles. Use C<fc> for casefolding. =item * L<Socket> has been upgraded to 2.009. Constants and functions required for IP multicast source group membership have been added. C<unpack_sockaddr_in()> and C<unpack_sockaddr_in6()> now return just the IP address in scalar context, and C<inet_ntop()> now guards against incorrect length scalars being passed in. This fixes an uninitialized memory read. =item * L<Storable> has been upgraded to 2.41. Modifying C<$_[0]> within C<STORABLE_freeze> no longer results in crashes [perl #112358]. An object whose class implements C<STORABLE_attach> is now thawed only once when there are multiple references to it in the structure being thawed [perl #111918]. Restricted hashes were not always thawed correctly [perl #73972]. Storable would croak when freezing a blessed REF object with a C<STORABLE_freeze()> method [perl #113880]. It can now freeze and thaw vstrings correctly. This causes a slightly incompatible change in the storage format, so the format version has increased to 2.9. This contains various bugfixes, including compatibility fixes for older versions of Perl and vstring handling. =item * L<Sys::Syslog> has been upgraded to 0.32. This contains several bug fixes relating to C<getservbyname()>, C<setlogsock()> and log levels in C<syslog()>, together with fixes for Windows, Haiku-OS and GNU/kFreeBSD.
See F<cpan/Sys-Syslog/Changes> for the full details. =item * L<Term::ANSIColor> has been upgraded to 4.02. Add support for italics. Improve error handling. =item * L<Term::ReadLine> has been upgraded to 1.10. This fixes the use of the B<cpan> and B<cpanp> shells on Windows in the event that the current drive happens to contain a F<\dev\tty> file. =item * L<Test::Harness> has been upgraded to 3.26. Fix glob semantics on Win32 [rt.cpan.org #49732]. Don't use C<Win32::GetShortPathName> when calling perl [rt.cpan.org #47890]. Ignore -T when reading shebang [rt.cpan.org #64404]. Handle the case where we don't know the wait status of the test more gracefully. Make the test summary 'ok' line overridable so that it can be changed to a plugin to make the output of prove idempotent. Don't run world-writable files. =item * L<Text::Tabs> and L<Text::Wrap> have been upgraded to 2012.0818. Support for Unicode combining characters has been added to them both. =item * L<threads::shared> has been upgraded to 1.31. This adds the option to warn about or ignore attempts to clone structures that can't be cloned, as opposed to just unconditionally dying in that case. This adds support for dual-valued values as created by L<Scalar::Util::dualvar|Scalar::Util/"dualvar NUM, STRING">. =item * L<Tie::StdHandle> has been upgraded to 4.3. C<READ> now respects the offset argument to C<read> [perl #112826]. =item * L<Time::Local> has been upgraded to 1.2300. Seconds values greater than 59 but less than 60 no longer cause C<timegm()> and C<timelocal()> to croak. =item * L<Unicode::UCD> has been upgraded to 0.53. This adds a function L<all_casefolds()|Unicode::UCD/all_casefolds()> that returns all the casefolds. =item * L<Win32> has been upgraded to 0.47. New APIs have been added for getting and setting the current code page. =back =head2 Removed Modules and Pragmata =over =item * L<Version::Requirements> has been removed from the core distribution. 
It is available under a different name: L<CPAN::Meta::Requirements>. =back =head1 Documentation =head2 Changes to Existing Documentation =head3 L<perlcheat> =over 4 =item * L<perlcheat> has been reorganized, and a few new sections were added. =back =head3 L<perldata> =over 4 =item * Now explicitly documents the behaviour of hash initializer lists that contain duplicate keys. =back =head3 L<perldiag> =over 4 =item * The explanation of symbolic references being prevented by "strict refs" now doesn't assume that the reader knows what symbolic references are. =back =head3 L<perlfaq> =over 4 =item * L<perlfaq> has been synchronized with version 5.0150040 from CPAN. =back =head3 L<perlfunc> =over 4 =item * The return value of C<pipe> is now documented. =item * Clarified documentation of C<our>. =back =head3 L<perlop> =over 4 =item * Loop control verbs (C<dump>, C<goto>, C<next>, C<last> and C<redo>) have always had the same precedence as assignment operators, but this was not documented until now. =back =head3 Diagnostics The following additions or changes have been made to diagnostic output, including warnings and fatal error messages. For the complete list of diagnostic messages, see L<perldiag>. =head2 New Diagnostics =head3 New Errors =over 4 =item * L<Unterminated delimiter for here document|perldiag/"Unterminated delimiter for here document"> This message now occurs when a here document label has an initial quotation mark but the final quotation mark is missing. This replaces a bogus and misleading error message about not finding the label itself [perl #114104]. =item * L<panic: child pseudo-process was never scheduled|perldiag/"panic: child pseudo-process was never scheduled"> This error is thrown when a child pseudo-process in the ithreads implementation on Windows was not scheduled within the time period allowed and therefore was not able to initialize properly [perl #88840]. 
=item * L<Group name must start with a non-digit word character in regex; marked by <-- HERE in mE<sol>%sE<sol>|perldiag/"Group name must start with a non-digit word character in regex; marked by <-- HERE in m/%s/"> This error has been added for C<(?&0)>, which is invalid. It used to produce an incomprehensible error message [perl #101666]. =item * L<Can't use an undefined value as a subroutine reference|perldiag/"Can't use an undefined value as %s reference"> Calling an undefined value as a subroutine now produces this error message. It used to, but was accidentally disabled, first in Perl 5.004 for non-magical variables, and then in Perl v5.14 for magical (e.g., tied) variables. It has now been restored. In the mean time, undef was treated as an empty string [perl #113576]. =item * L<Experimental "%s" subs not enabled|perldiag/"Experimental "%s" subs not enabled"> To use lexical subs, you must first enable them: no warnings 'experimental::lexical_subs'; use feature 'lexical_subs'; my sub foo { ... 
} =back =head3 New Warnings =over 4 =item * L<'Strings with code points over 0xFF may not be mapped into in-memory file handles'|perldiag/"Strings with code points over 0xFF may not be mapped into in-memory file handles"> =item * L<'%s' resolved to '\o{%s}%d'|perldiag/"'%s' resolved to '\o{%s}%d'"> =item * L<'Trailing white-space in a charnames alias definition is deprecated'|perldiag/"Trailing white-space in a charnames alias definition is deprecated"> =item * L<'A sequence of multiple spaces in a charnames alias definition is deprecated'|perldiag/"A sequence of multiple spaces in a charnames alias definition is deprecated"> =item * L<'Passing malformed UTF-8 to "%s" is deprecated'|perldiag/"Passing malformed UTF-8 to "%s" is deprecated"> =item * L<Subroutine "&%s" is not available|perldiag/"Subroutine "&%s" is not available"> (W closure) During compilation, an inner named subroutine or eval is attempting to capture an outer lexical subroutine that is not currently available. This can happen for one of two reasons. First, the lexical subroutine may be declared in an outer anonymous subroutine that has not yet been created. (Remember that named subs are created at compile time, while anonymous subs are created at run-time.) For example, sub { my sub a {...} sub f { \&a } } At the time that f is created, it can't capture the current "a" sub, since the anonymous subroutine hasn't been created yet. Conversely, the following won't give a warning since the anonymous subroutine has by now been created and is live: sub { my sub a {...} eval 'sub f { \&a }' }->(); The second situation is caused by an eval accessing a variable that has gone out of scope, for example, sub f { my sub a {...} sub { eval '\&a' } } f()->(); Here, when the '\&a' in the eval is being compiled, f() is not currently being executed, so its &a is not available for capture.
=item * L<"%s" subroutine &%s masks earlier declaration in same %s|perldiag/"%s" subroutine &%s masks earlier declaration in same %s> (W misc) A "my" or "state" subroutine has been redeclared in the current scope or statement, effectively eliminating all access to the previous instance. This is almost always a typographical error. Note that the earlier subroutine will still exist until the end of the scope or until all closure references to it are destroyed. =item * L<The %s feature is experimental|perldiag/"The %s feature is experimental"> (S experimental) This warning is emitted if you enable an experimental feature via C<use feature>. Simply suppress the warning if you want to use the feature, but know that in doing so you are taking the risk of using an experimental feature which may change or be removed in a future Perl version: no warnings "experimental::lexical_subs"; use feature "lexical_subs"; =item * L<sleep(%u) too large|perldiag/"sleep(%u) too large"> (W overflow) You called C<sleep> with a number that was larger than it can reliably handle and C<sleep> probably slept for less time than requested. =item * L<Wide character in setenv|perldiag/"Wide character in %s"> Attempts to put wide characters into environment variables via C<%ENV> now provoke this warning. =item * "L<Invalid negative number (%s) in chr|perldiag/"Invalid negative number (%s) in chr">" C<chr()> now warns when passed a negative value [perl #83048]. =item * "L<Integer overflow in srand|perldiag/"Integer overflow in srand">" C<srand()> now warns when passed a value that doesn't fit in a C<UV> (since the value will be truncated rather than overflowing) [perl #40605]. =item * "L<-i used with no filenames on the command line, reading from STDIN|perldiag/"-i used with no filenames on the command line, reading from STDIN">" Running perl with the C<-i> flag now warns if no input files are provided on the command line [perl #113410]. 
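The new C<chr()> warning listed above can be observed with a short script. This is a minimal sketch, assuming Perl 5.18 or later with C<use warnings> in effect; the variable names and the C<$SIG{__WARN__}> handler are illustrative only.

```perl
use strict;
use warnings;

# Collect warnings in an array instead of letting them go to STDERR.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

# Pass the negative value through a variable so that the call is
# evaluated at run time, while the handler above is installed.
my $code_point = -1;
my $char = chr($code_point);

printf "warnings seen: %d\n", scalar @warnings;
print  "first warning: $warnings[0]" if @warnings;
```

The same capturing technique should work for the other run-time warnings in this list, such as the one for C<srand()>.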
=back =head2 Changes to Existing Diagnostics =over 4 =item * L<$* is no longer supported|perldiag/"$* is no longer supported"> The warning that use of C<$*> and C<$#> is no longer supported is now generated for every location that references them. Previously it would fail to be generated if another variable using the same typeglob was seen first (e.g. C<@*> before C<$*>), and would not be generated for the second and subsequent uses. (It's hard to fix the failure to generate warnings at all without also generating them every time, and warning every time is consistent with the warnings that C<$[> used to generate.) =item * The warnings for C<\b{> and C<\B{> were added. They are a deprecation warning which should be turned off by that category. One should not have to turn off regular regexp warnings as well to get rid of these. =item * L<Constant(%s): Call to &{$^H{%s}} did not return a defined value|perldiag/Constant(%s): Call to &{$^H{%s}} did not return a defined value> Constant overloading that returns C<undef> results in this error message. For numeric constants, it used to say "Constant(undef)". "undef" has been replaced with the number itself. =item * The error produced when a module cannot be loaded now includes a hint that the module may need to be installed: "Can't locate hopping.pm in @INC (you may need to install the hopping module) (@INC contains: ...)" =item * L<vector argument not supported with alpha versions|perldiag/vector argument not supported with alpha versions> This warning was not suppressible, even with C<no warnings>. Now it is suppressible, and has been moved from the "internal" category to the "printf" category. =item * C<< Can't do {n,m} with n > m in regex; marked by <-- HERE in m/%s/ >> This fatal error has been turned into a warning that reads: L<< Quantifier {n,m} with n > m can't match in regex | perldiag/Quantifier {n,m} with n > m can't match in regex >> (W regexp) Minima should be less than or equal to maxima. 
If you really want your regexp to match something 0 times, just put {0}. =item * The "Runaway prototype" warning that occurs in bizarre cases has been removed as being unhelpful and inconsistent. =item * The "Not a format reference" error has been removed, as the only case in which it could be triggered was a bug. =item * The "Unable to create sub named %s" error has been removed for the same reason. =item * The 'Can't use "my %s" in sort comparison' error has been downgraded to a warning, '"my %s" used in sort comparison' (with 'state' instead of 'my' for state variables). In addition, the heuristics for guessing whether lexical $a or $b has been misused have been improved to generate fewer false positives. Lexical $a and $b are no longer disallowed if they are outside the sort block. Also, a named unary or list operator inside the sort block no longer causes the $a or $b to be ignored [perl #86136]. =back =head1 Utility Changes =head3 L<h2xs> =over 4 =item * F<h2xs> no longer produces invalid code for empty defines. [perl #20636] =back =head1 Configuration and Compilation =over 4 =item * Added C<useversionedarchname> option to Configure When set, it includes 'api_versionstring' in 'archname'. E.g. x86_64-linux-5.13.6-thread-multi. It is unset by default. This feature was requested by Tim Bunce, who observed that C<INSTALL_BASE> creates a library structure that does not differentiate by perl version. Instead, it places architecture specific files in "$install_base/lib/perl5/$archname". This makes it difficult to use a common C<INSTALL_BASE> library path with multiple versions of perl. By setting C<-Duseversionedarchname>, the $archname will be distinct for architecture I<and> API version, allowing mixed use of C<INSTALL_BASE>. 
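To check how a given perl was configured, the values involved can be read from the L<Config> module. This is a minimal sketch; it assumes that the C<useversionedarchname> setting is recorded in C<%Config> under that name, and it treats a missing entry (as on older perls) as "not set".

```perl
use strict;
use warnings;
use Config;

my $archname = $Config{archname};
my $api      = $Config{api_versionstring};

# Assumption: a perl built with -Duseversionedarchname records 'define'
# here; an absent or undefined entry is treated as "not set".
my $versioned = ($Config{useversionedarchname} // '') eq 'define';

print "archname:           $archname\n";
print "api_versionstring:  $api\n";
print "versioned archname: ", ($versioned ? "yes" : "no"), "\n";
```

With the option unset, C<archname> is not expected to contain the API version string.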
=item * Add a C<PERL_NO_INLINE_FUNCTIONS> option If C<PERL_NO_INLINE_FUNCTIONS> is defined, don't include "inline.h" This permits test code to include the perl headers for definitions without creating a link dependency on the perl library (which may not exist yet). =item * Configure will honour the external C<MAILDOMAIN> environment variable, if set. =item * C<installman> no longer ignores the silent option =item * Both C<META.yml> and C<META.json> files are now included in the distribution. =item * F<Configure> will now correctly detect C<isblank()> when compiling with a C++ compiler. =item * The pager detection in F<Configure> has been improved to allow responses which specify options after the program name, e.g. B</usr/bin/less -R>, if the user accepts the default value. This helps B<perldoc> when handling ANSI escapes [perl #72156]. =back =head1 Testing =over 4 =item * The test suite now has a section for tests that require very large amounts of memory. These tests won't run by default; they can be enabled by setting the C<PERL_TEST_MEMORY> environment variable to the number of gibibytes of memory that may be safely used. =back =head1 Platform Support =head2 Discontinued Platforms =over 4 =item BeOS BeOS was an operating system for personal computers developed by Be Inc, initially for their BeBox hardware. The OS Haiku was written as an open source replacement for/continuation of BeOS, and its perl port is current and actively maintained. =item UTS Global Support code relating to UTS global has been removed. UTS was a mainframe version of System V created by Amdahl, subsequently sold to UTS Global. The port has not been touched since before Perl v5.8.0, and UTS Global is now defunct. =item VM/ESA Support for VM/ESA has been removed. The port was tested on 2.3.0, which IBM ended service on in March 2002. 2.4.0 ended service in June 2003, and was superseded by Z/VM. The current version of Z/VM is V6.2.0, and scheduled for end of service on 2015/04/30. 
=item MPE/IX Support for MPE/IX has been removed. =item EPOC Support code relating to EPOC has been removed. EPOC was a family of operating systems developed by Psion for mobile devices. It was the predecessor of Symbian. The port was last updated in April 2002. =item Rhapsody Support for Rhapsody has been removed. =back =head2 Platform-Specific Notes =head3 AIX Configure now always adds C<-qlanglvl=extc99> to the CC flags on AIX when using xlC. This will make it easier to compile a number of XS-based modules that assume C99 [perl #113778]. =head3 clang++ There is now a workaround for a compiler bug that prevented compiling with clang++ since Perl v5.15.7 [perl #112786]. =head3 C++ When compiling the Perl core as C++ (which is only semi-supported), the mathom functions are now compiled as C<extern "C">, to ensure proper binary compatibility. (However, binary compatibility isn't generally guaranteed anyway in the situations where this would matter.) =head3 Darwin Stop hardcoding an alignment on 8 byte boundaries to fix builds using -Dusemorebits. =head3 Haiku Perl should now work out of the box on Haiku R1 Alpha 4. =head3 MidnightBSD C<libc_r> was removed from recent versions of MidnightBSD and older versions work better with C<pthread>. Threading is now enabled using C<pthread> which corrects build errors with threading enabled on 0.4-CURRENT. =head3 Solaris In Configure, avoid running sed commands with flags not supported on Solaris. =head3 VMS =over =item * Where possible, the case of filenames and command-line arguments is now preserved by enabling the CRTL features C<DECC$EFS_CASE_PRESERVE> and C<DECC$ARGV_PARSE_STYLE> at start-up time. The latter only takes effect when extended parse is enabled in the process from which Perl is run. =item * The character set for Extended Filename Syntax (EFS) is now enabled by default on VMS. Among other things, this provides better handling of dots in directory names, multiple dots in filenames, and spaces in filenames. 
To obtain the old behavior, set the logical name C<DECC$EFS_CHARSET> to C<DISABLE>. =item * Fixed linking on builds configured with C<-Dusemymalloc=y>. =item * Experimental support for building Perl with the HP C++ compiler is available by configuring with C<-Dusecxx>. =item * All C header files from the top-level directory of the distribution are now installed on VMS, providing consistency with a long-standing practice on other platforms. Previously only a subset were installed, which broke non-core extension builds for extensions that depended on the missing include files. =item * Quotes are now removed from the command verb (but not the parameters) for commands spawned via C<system>, backticks, or a piped C<open>. Previously, quotes on the verb were passed through to DCL, which would fail to recognize the command. Also, if the verb is actually a path to an image or command procedure on an ODS-5 volume, quoting it now allows the path to contain spaces. =item * The B<a2p> build has been fixed for the HP C++ compiler on OpenVMS. =back =head3 Win32 =over =item * Perl can now be built using Microsoft's Visual C++ 2012 compiler by specifying CCTYPE=MSVC110 (or MSVC110FREE if you are using the free Express edition for Windows Desktop) in F<win32/Makefile>. =item * The option to build without C<USE_SOCKETS_AS_HANDLES> has been removed. =item * Fixed a problem where perl could crash while cleaning up threads (including the main thread) in threaded debugging builds on Win32 and possibly other platforms [perl #114496]. =item * A rare race condition that would lead to L<sleep|perlfunc/sleep> taking more time than requested, and possibly even hanging, has been fixed [perl #33096]. =item * C<link> on Win32 now attempts to set C<$!> to more appropriate values based on the Win32 API error code. [perl #112272] Perl no longer mangles the environment block, e.g. when launching a new sub-process, when the environment contains non-ASCII characters. 
Known problems still remain, however, when the environment contains characters outside of the current ANSI codepage (e.g. see the item about Unicode in C<%ENV> in L<http://perl5.git.perl.org/perl.git/blob/HEAD:/Porting/todo.pod>). [perl #113536] =item * Building perl with some Windows compilers used to fail due to a problem with miniperl's C<glob> operator (which uses the C<perlglob> program) deleting the PATH environment variable [perl #113798]. =item * A new makefile option, C<USE_64_BIT_INT>, has been added to the Windows makefiles. Set this to "define" when building a 32-bit perl if you want it to use 64-bit integers. Machine code size reductions, already made to the DLLs of XS modules in Perl v5.17.2, have now been extended to the perl DLL itself. Building with VC++ 6.0 was inadvertently broken in Perl v5.17.2 but has now been fixed again. =back =head3 WinCE Building on WinCE is now possible once again, although more work is required to fully restore a clean build. =head1 Internal Changes =over =item * Synonyms for the misleadingly named C<av_len()> have been created: C<av_top_index()> and C<av_tindex>. All three of these return the number of the highest index in the array, not the number of elements it contains. =item * SvUPGRADE() is no longer an expression. Originally this macro (and its underlying function, sv_upgrade()) were documented as boolean, although in reality they always croaked on error and never returned false. In 2005 the documentation was updated to specify a void return value, but SvUPGRADE() was left always returning 1 for backwards compatibility. This has now been removed, and SvUPGRADE() is now a statement with no return value. 
So this is now a syntax error: if (!SvUPGRADE(sv)) { croak(...); } If you have code like that, simply replace it with SvUPGRADE(sv); or to avoid compiler warnings with older perls, possibly (void)SvUPGRADE(sv); =item * Perl has a new copy-on-write mechanism that allows any SvPOK scalar to be upgraded to a copy-on-write scalar. A reference count on the string buffer is stored in the string buffer itself. This feature is B<not enabled by default>. It can be enabled in a perl build by running F<Configure> with B<-Accflags=-DPERL_NEW_COPY_ON_WRITE>, and we would encourage XS authors to try their code with such an enabled perl, and provide feedback. Unfortunately, there is not yet a good guide to updating XS code to cope with COW. Until such a document is available, consult the perl5-porters mailing list. It breaks a few XS modules by allowing copy-on-write scalars to go through code paths that never encountered them before. =item * Copy-on-write no longer uses the SvFAKE and SvREADONLY flags. Hence, SvREADONLY indicates a true read-only SV. Use the SvIsCOW macro (as before) to identify a copy-on-write scalar. =item * C<PL_glob_index> is gone. =item * The private Perl_croak_no_modify has had its context parameter removed. It now has a void prototype. Users of the public API croak_no_modify remain unaffected. =item * Copy-on-write (shared hash key) scalars are no longer marked read-only. C<SvREADONLY> returns false on such an SV, but C<SvIsCOW> still returns true. =item * A new op type, C<OP_PADRANGE> has been introduced. The perl peephole optimiser will, where possible, substitute a single padrange op for a pushmark followed by one or more pad ops, and possibly also skipping list and nextstate ops. In addition, the op can carry out the tasks associated with the RHS of a C<< my(...) = @_ >> assignment, so those ops may be optimised away too.
=item * Case-insensitive matching inside a [bracketed] character class with a multi-character fold no longer excludes one of the possibilities in the circumstances that it used to. [perl #89774]. =item * C<PL_formfeed> has been removed. =item * The regular expression engine no longer reads one byte past the end of the target string. While for all internally well-formed scalars this should never have been a problem, this change facilitates clever tricks with string buffers in CPAN modules. [perl #73542] =item * Inside a BEGIN block, C<PL_compcv> now points to the currently-compiling subroutine, rather than the BEGIN block itself. =item * C<mg_length> has been deprecated. =item * C<sv_len> now always returns a byte count and C<sv_len_utf8> a character count. Previously, C<sv_len> and C<sv_len_utf8> were both buggy and would sometimes return bytes and sometimes characters. C<sv_len_utf8> no longer assumes that its argument is in UTF-8. Neither of these creates UTF-8 caches for tied or overloaded values or for non-PVs any more. =item * C<sv_mortalcopy> now copies string buffers of shared hash key scalars when called from XS modules [perl #79824]. =item * The new C<RXf_MODIFIES_VARS> flag can be set by custom regular expression engines to indicate that the execution of the regular expression may cause variables to be modified. This lets C<s///> know to skip certain optimisations. Perl's own regular expression engine sets this flag for the special backtracking verbs that set $REGMARK and $REGERROR. =item * The APIs for accessing lexical pads have changed considerably. C<PADLIST>s are no longer C<AV>s, but their own type instead. C<PADLIST>s now contain a C<PAD> and a C<PADNAMELIST> of C<PADNAME>s, rather than C<AV>s for the pad and the list of pad names. C<PAD>s, C<PADNAMELIST>s, and C<PADNAME>s are to be accessed as such through the newly added pad API instead of the plain C<AV> and C<SV> APIs. See L<perlapi> for details.
=item * In the regex API, the numbered capture callbacks are passed an index indicating what match variable is being accessed. There are special index values for the C<$`, $&, $'> variables. Previously the same three values were used to retrieve C<${^PREMATCH}, ${^MATCH}, ${^POSTMATCH}> too, but these have now been assigned three separate values. See L<perlreapi/Numbered capture callbacks>. =item * C<PL_sawampersand> was previously a boolean indicating that any of C<$`, $&, $'> had been seen; it now contains three one-bit flags indicating the presence of each of the variables individually. =item * The C<CV *> typemap entry now supports C<&{}> overloading and typeglobs, just like C<&{...}> [perl #96872]. =item * The C<SVf_AMAGIC> flag to indicate overloading is now on the stash, not the object. It is now set automatically whenever a method or @ISA changes, so its meaning has changed, too. It now means "potentially overloaded". When the overload table is calculated, the flag is automatically turned off if there is no overloading, so there should be no noticeable slowdown. The staleness of the overload tables is now checked when overload methods are invoked, rather than during C<bless>. "A" magic is gone. The changes to the handling of the C<SVf_AMAGIC> flag eliminate the need for it. C<PL_amagic_generation> has been removed as no longer necessary. For XS modules, it is now a macro alias to C<PL_na>. The fallback overload setting is now stored in a stash entry separate from overloadedness itself. =item * The character-processing code has been cleaned up in places. The changes should be operationally invisible. =item * The C<study> function was made a no-op in v5.16. It was simply disabled via a C<return> statement; the code was left in place. Now the code supporting what C<study> used to do has been removed. =item * Under threaded perls, there is no longer a separate PV allocated for every COP to store its package name (C<< cop->stashpv >>).
Instead, there is an offset (C<< cop->stashoff >>) into the new C<PL_stashpad> array, which holds stash pointers. =item * In the pluggable regex API, the C<regexp_engine> struct has acquired a new field C<op_comp>, which is currently just for perl's internal use, and should be initialized to NULL by other regex plugin modules. =item * A new function C<alloccopstash> has been added to the API, but is considered experimental. See L<perlapi>. =item * Perl used to implement get magic in a way that would sometimes hide bugs in code that could call mg_get() too many times on magical values. This hiding of errors no longer occurs, so long-standing bugs may become visible now. If you see magic-related errors in XS code, check to make sure it, together with the Perl API functions it uses, calls mg_get() only once on SvGMAGICAL() values. =item * OP allocation for CVs now uses a slab allocator. This simplifies memory management for OPs allocated to a CV, so cleaning up after a compilation error is simpler and safer [perl #111462][perl #112312]. =item * C<PERL_DEBUG_READONLY_OPS> has been rewritten to work with the new slab allocator, allowing it to catch more violations than before. =item * The old slab allocator for ops, which was only enabled for C<PERL_IMPLICIT_SYS> and C<PERL_DEBUG_READONLY_OPS>, has been retired. =back =head1 Selected Bug Fixes =over 4 =item * Here document terminators no longer require a terminating newline character when they occur at the end of a file. This was already the case at the end of a string eval [perl #65838]. =item * C<-DPERL_GLOBAL_STRUCT> builds now free the global struct B<after> they've finished using it. =item * A trailing '/' on a path in @INC will no longer have an additional '/' appended. =item * The C<:crlf> layer now works when unread data doesn't fit into its own buffer. [perl #112244]. =item * C<ungetc()> now handles UTF-8 encoded data. [perl #116322]. 
=item * A bug in the core typemap caused any C types that map to the T_BOOL core typemap entry to not be set, updated, or modified when the T_BOOL variable was used in an OUTPUT: section with an exception for RETVAL. T_BOOL in an INPUT: section was not affected. Using a T_BOOL return type for an XSUB (RETVAL) was not affected. A side effect of fixing this bug is, if a T_BOOL is specified in the OUTPUT: section (which previously did nothing to the SV), and a read only SV (literal) is passed to the XSUB, croaks like "Modification of a read-only value attempted" will happen. [perl #115796] =item * On many platforms, providing a directory name as the script name caused perl to do nothing and report success. It should now universally report an error and exit nonzero. [perl #61362] =item * C<sort {undef} ...> under fatal warnings no longer crashes. It had begun crashing in Perl v5.16. =item * Stashes blessed into each other (C<bless \%Foo::, 'Bar'; bless \%Bar::, 'Foo'>) no longer result in double frees. This bug started happening in Perl v5.16. =item * Numerous memory leaks have been fixed, mostly involving fatal warnings and syntax errors. =item * Some failed regular expression matches such as C<'f' =~ /../g> were not resetting C<pos>. Also, "match-once" patterns (C<m?...?g>) failed to reset it, too, when invoked a second time [perl #23180]. =item * Several bugs involving C<local *ISA> and C<local *Foo::> causing stale MRO caches have been fixed. =item * Defining a subroutine when its typeglob has been aliased no longer results in stale method caches. This bug was introduced in Perl v5.10. =item * Localising a typeglob containing a subroutine when the typeglob's package has been deleted from its parent stash no longer produces an error. This bug was introduced in Perl v5.14. =item * Under some circumstances, C<local *method=...> would fail to reset method caches upon scope exit.
=item * C</[.foo.]/> is no longer an error, but produces a warning (as before) and is treated as C</[.fo]/> [perl #115818]. =item * C<goto $tied_var> now calls FETCH before deciding what type of goto (subroutine or label) this is. =item * Renaming packages through glob assignment (C<*Foo:: = *Bar::; *Bar:: = *Baz::>) in combination with C<m?...?> and C<reset> no longer makes threaded builds crash. =item * A number of bugs related to assigning a list to hash have been fixed. Many of these involve lists with repeated keys like C<(1, 1, 1, 1)>. =over 4 =item * The expression C<scalar(%h = (1, 1, 1, 1))> now returns C<4>, not C<2>. =item * The return value of C<%h = (1, 1, 1)> in list context was wrong. Previously this would return C<(1, undef, 1)>, now it returns C<(1, undef)>. =item * Perl now issues the same warning on C<($s, %h) = (1, {})> as it does for C<(%h) = ({})>, "Reference found where even-sized list expected". =item * A number of additional edge cases in list assignment to hashes were corrected. For more details see commit 23b7025ebc. =back =item * Attributes applied to lexical variables no longer leak memory. [perl #114764] =item * C<dump>, C<goto>, C<last>, C<next>, C<redo> or C<require> followed by a bareword (or version) and then an infix operator is no longer a syntax error. It used to be for those infix operators (like C<+>) that have a different meaning where a term is expected. [perl #105924] =item * C<require a::b . 1> and C<require a::b + 1> no longer produce erroneous ambiguity warnings. [perl #107002] =item * Class method calls are now allowed on any string, and not just strings beginning with an alphanumeric character. [perl #105922] =item * An empty pattern created with C<qr//> used in C<m///> no longer triggers the "empty pattern reuses last pattern" behaviour. [perl #96230] =item * Tying a hash during iteration no longer results in a memory leak. =item * Freeing a tied hash during iteration no longer results in a memory leak. 
=item * List assignment to a tied array or hash that dies on STORE no longer results in a memory leak. =item * If the hint hash (C<%^H>) is tied, compile-time scope entry (which copies the hint hash) no longer leaks memory if FETCH dies. [perl #107000] =item * Constant folding no longer inappropriately triggers the special C<split " "> behaviour. [perl #94490] =item * C<defined scalar(@array)>, C<defined do { &foo }>, and similar constructs now treat the argument to C<defined> as a simple scalar. [perl #97466] =item * Running a custom debugger that defines no C<*DB::DB> glob or provides a subroutine stub for C<&DB::DB> no longer results in a crash, but an error instead. [perl #114990] =item * C<reset ""> now matches its documentation. C<reset> only resets C<m?...?> patterns when called with no argument. An empty string for an argument now does nothing. (It used to be treated as no argument.) [perl #97958] =item * C<printf> with an argument returning an empty list no longer reads past the end of the stack, resulting in erratic behaviour. [perl #77094] =item * C<--subname> no longer produces erroneous ambiguity warnings. [perl #77240] =item * C<v10> is now allowed as a label or package name. This was inadvertently broken when v-strings were added in Perl v5.6. [perl #56880] =item * C<length>, C<pos>, C<substr> and C<sprintf> could be confused by ties, overloading, references and typeglobs if the stringification of such changed the internal representation to or from UTF-8. [perl #114410] =item * utf8::encode now calls FETCH and STORE on tied variables. utf8::decode now calls STORE (it was already calling FETCH). =item * C<$tied =~ s/$non_utf8/$utf8/> no longer loops infinitely if the tied variable returns a Latin-1 string, shared hash key scalar, or reference or typeglob that stringifies as ASCII or Latin-1. This was a regression from v5.12.
=item * C<s///> without /e is now better at detecting when it needs to forego certain optimisations, fixing some buggy cases: =over =item * Match variables in certain constructs (C<&&>, C<||>, C<..> and others) in the replacement part; e.g., C<s/(.)/$l{$a||$1}/g>. [perl #26986] =item * Aliases to match variables in the replacement. =item * C<$REGERROR> or C<$REGMARK> in the replacement. [perl #49190] =item * An empty pattern (C<s//$foo/>) that causes the last-successful pattern to be used, when that pattern contains code blocks that modify the variables in the replacement. =back =item * The taintedness of the replacement string no longer affects the taintedness of the return value of C<s///e>. =item * The C<$|> autoflush variable is created on-the-fly when needed. If this happened (e.g., if it was mentioned in a module or eval) when the currently-selected filehandle was a typeglob with an empty IO slot, it used to crash. [perl #115206] =item * Line numbers at the end of a string eval are no longer off by one. [perl #114658] =item * @INC filters (subroutines returned by subroutines in @INC) that set $_ to a copy-on-write scalar no longer cause the parser to modify that string buffer in place. =item * C<length($object)> no longer returns the undefined value if the object has string overloading that returns undef. [perl #115260] =item * The use of C<PL_stashcache>, the stash name lookup cache for method calls, has been restored. Commit da6b625f78f5f133 in August 2011 inadvertently broke the code that looks up values in C<PL_stashcache>. As it's only a cache, quite correctly everything carried on working without it. =item * The error "Can't localize through a reference" had disappeared in v5.16.0 when C<local %$ref> appeared on the last line of an lvalue subroutine. This error disappeared for C<\local %$ref> in perl v5.8.1. It has now been restored.
=item * The parsing of here-docs has been improved significantly, fixing several parsing bugs and crashes and one memory leak, and correcting wrong subsequent line numbers under certain conditions. =item * Inside an eval, the error message for an unterminated here-doc no longer has a newline in the middle of it [perl #70836]. =item * A substitution inside a substitution pattern (C<s/${s|||}//>) no longer confuses the parser. =item * It may be an odd place to allow comments, but C<s//"" # hello/e> has always worked, I<unless> there happens to be a null character before the first #. Now it works even in the presence of nulls. =item * An invalid range in C<tr///> or C<y///> no longer results in a memory leak. =item * String eval no longer treats a semicolon-delimited quote-like operator at the very end (C<eval 'q;;'>) as a syntax error. =item * C<< warn {$_ => 1} + 1 >> is no longer a syntax error. The parser used to get confused with certain list operators followed by an anonymous hash and then an infix operator that shares its form with a unary operator. =item * C<(caller $n)[6]> (which gives the text of the eval) used to return the actual parser buffer. Modifying it could result in crashes. Now it always returns a copy. The string returned no longer has "\n;" tacked on to the end. The returned text also includes here-doc bodies, which used to be omitted. =item * The UTF-8 position cache is now reset when accessing magical variables, to avoid the string buffer and the UTF-8 position cache getting out of sync [perl #114410]. =item * Various cases of get magic being called twice for magical UTF-8 strings have been fixed. =item * This code (when not in the presence of C<$&> etc) $_ = 'x' x 1_000_000; 1 while /(.)/; used to skip the buffer copy for performance reasons, but suffered from C<$1> etc changing if the original string changed. That's now been fixed. 
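The behaviour described in the last item above can be sketched as follows: once a match has succeeded, C<$1> keeps the text it matched even if the original string is overwritten afterwards. The variable names are illustrative, and a short string stands in for the million-character one.

```perl
use strict;
use warnings;

my $text = 'x' x 10;      # short stand-in for the long string in the item above
my $first;
if ($text =~ /(.)/) {
    $text  = 'y' x 10;    # overwrite the original string after the match
    $first = $1;          # the capture should still hold the matched text
}
print "$first\n";
```

On a perl with this fix, the capture is expected to stay C<x> rather than follow the new contents of C<$text>.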
=item * Perl doesn't use PerlIO anymore to report out of memory messages, as PerlIO might attempt to allocate more memory. =item * In a regular expression, if something is quantified with C<{n,m}> where C<S<n E<gt> m>>, it can't possibly match. Previously this was a fatal error, but now is merely a warning (and that something won't match). [perl #82954]. =item * It used to be possible for formats defined in subroutines that have subsequently been undefined and redefined to close over variables in the wrong pad (the newly-defined enclosing sub), resulting in crashes or "Bizarre copy" errors. =item * Redefinition of XSUBs at run time could produce warnings with the wrong line number. =item * The %vd sprintf format does not support version objects for alpha versions. It used to output the format itself (%vd) when passed an alpha version, and also emit an "Invalid conversion in printf" warning. It no longer does, but produces the empty string in the output. It also no longer leaks memory in this case. =item * C<< $obj->SUPER::method >> calls in the main package could fail if the SUPER package had already been accessed by other means. =item * Stash aliasing (C<< *foo:: = *bar:: >>) no longer causes SUPER calls to ignore changes to methods or @ISA or use the wrong package. =item * Method calls on packages whose names end in ::SUPER are no longer treated as SUPER method calls, resulting in failure to find the method. Furthermore, defining subroutines in such packages no longer causes them to be found by SUPER method calls on the containing package [perl #114924]. =item * C<\w> now matches the code points U+200C (ZERO WIDTH NON-JOINER) and U+200D (ZERO WIDTH JOINER). C<\W> no longer matches these. This change is because Unicode corrected their definition of what C<\w> should match. =item * C<dump LABEL> no longer leaks its label. =item * Constant folding no longer changes the behaviour of functions like C<stat()> and C<truncate()> that can take either filenames or handles. 
C<stat 1 ? foo : bar> now treats its argument as a file name (since it is an arbitrary expression), rather than the handle "foo". =item * C<truncate FOO, $len> no longer falls back to treating "FOO" as a file name if the filehandle has been deleted. This was broken in Perl v5.16.0. =item * Subroutine redefinitions after sub-to-glob and glob-to-glob assignments no longer cause double frees or panic messages. =item * C<s///> now turns vstrings into plain strings when performing a substitution, even if the resulting string is the same (C<s/a/a/>). =item * Prototype mismatch warnings no longer erroneously treat constant subs as having no prototype when they actually have "". =item * Constant subroutines and forward declarations no longer prevent prototype mismatch warnings from omitting the sub name. =item * C<undef> on a subroutine now clears call checkers. =item * The C<ref> operator started leaking memory on blessed objects in Perl v5.16.0. This has been fixed [perl #114340]. =item * C<use> no longer tries to parse its arguments as a statement, which used to make C<use constant { () };> a syntax error [perl #114222]. =item * On debugging builds, "uninitialized" warnings inside formats no longer cause assertion failures. =item * On debugging builds, subroutines nested inside formats no longer cause assertion failures [perl #78550]. =item * Formats and C<use> statements are now permitted inside formats. =item * C<print $x> and C<sub { print $x }-E<gt>()> now always produce the same output. It was possible for the latter to refuse to close over $x if the variable was not active; e.g., if it was defined outside a currently-running named subroutine. =item * Similarly, C<print $x> and C<print eval '$x'> now produce the same output. This also allows "my $x if 0" variables to be seen in the debugger [perl #114018]. =item * Formats called recursively no longer stomp on their own lexical variables, but each recursive call has its own set of lexicals.
=item * Attempting to free an active format or the handle associated with it no longer results in a crash. =item * Format parsing no longer gets confused by braces, semicolons and low-precedence operators. It used to be possible to use braces as format delimiters (instead of C<=> and C<.>), but only sometimes. Semicolons and low-precedence operators in format argument lines no longer confuse the parser into ignoring the line's return value. In format argument lines, braces can now be used for anonymous hashes, instead of being treated always as C<do> blocks. =item * Formats can now be nested inside code blocks in regular expressions and other quoted constructs (C</(?{...})/> and C<qq/${...}/>) [perl #114040]. =item * Formats are no longer created after compilation errors. =item * Under debugging builds, the B<-DA> command line option started crashing in Perl v5.16.0. It has been fixed [perl #114368]. =item * A potential deadlock scenario involving the premature termination of a pseudo-forked child in a Windows build with ithreads enabled has been fixed. This resolves the common problem of the F<t/op/fork.t> test hanging on Windows [perl #88840]. =item * The code which generates errors from C<require()> could potentially read one or two bytes before the start of the filename for filenames less than three bytes long and ending C</\.p?\z/>. This has now been fixed. Note that it could never have happened with module names given to C<use()> or C<require()> anyway. =item * The handling of pathnames of modules given to C<require()> has been made thread-safe on VMS. =item * Non-blocking sockets have been fixed on VMS. =item * Pod can now be nested in code inside a quoted construct outside of a string eval. This used to work only within string evals [perl #114040]. =item * C<goto ''> now looks for an empty label, producing the "goto must have label" error message, instead of exiting the program [perl #111794].
=item * C<goto "\0"> now dies with "Can't find label" instead of "goto must have label". =item * The C function C<hv_store> used to result in crashes when used on C<%^H> [perl #111000]. =item * A call checker attached to a closure prototype via C<cv_set_call_checker> is now copied to closures cloned from it. So C<cv_set_call_checker> now works inside an attribute handler for a closure. =item * Writing to C<$^N> used to have no effect. Now it croaks with "Modification of a read-only value" by default, but that can be overridden by a custom regular expression engine, as with C<$1> [perl #112184]. =item * C<undef> on a control character glob (C<undef *^H>) no longer emits an erroneous warning about ambiguity [perl #112456]. =item * For efficiency's sake, many operators and built-in functions return the same scalar each time. Lvalue subroutines and subroutines in the CORE:: namespace were allowing this implementation detail to leak through. C<print &CORE::uc("a"), &CORE::uc("b")> used to print "BB". The same thing would happen with an lvalue subroutine returning the return value of C<uc>. Now the value is copied in such cases. =item * C<method {}> syntax with an empty block or a block returning an empty list used to crash or use some random value left on the stack as its invocant. Now it produces an error. =item * C<vec> now works with extremely large offsets (E<gt>2 GB) [perl #111730]. =item * Changes to overload settings now take effect immediately, as do changes to inheritance that affect overloading. They used to take effect only after C<bless>. Objects that were created before a class had any overloading used to remain non-overloaded even if the class gained overloading through C<use overload> or @ISA changes, and even after C<bless>. This has been fixed [perl #112708]. =item * Classes with overloading can now inherit fallback values. 
=item * Overloading was not respecting a fallback value of 0 if there were overloaded objects on both sides of an assignment operator like C<+=> [perl #111856]. =item * C<pos> now croaks with hash and array arguments, instead of producing erroneous warnings. =item * C<while(each %h)> now implies C<while(defined($_ = each %h))>, like C<readline> and C<readdir>. =item * Subs in the CORE:: namespace no longer crash after C<undef *_> when called with no argument list (C<&CORE::time> with no parentheses). =item * C<unpack> no longer produces the "'/' must follow a numeric type in unpack" error when it is the data that are at fault [perl #60204]. =item * C<join> and C<"@array"> now call FETCH only once on a tied C<$"> [perl #8931]. =item * Some subroutine calls generated by compiling core ops affected by a C<CORE::GLOBAL> override had op checking performed twice. The checking is always idempotent for pure Perl code, but the double checking can matter when custom call checkers are involved. =item * A race condition used to exist around fork that could cause a signal sent to the parent to be handled by both parent and child. Signals are now blocked briefly around fork to prevent this from happening [perl #82580]. =item * The implementation of code blocks in regular expressions, such as C<(?{})> and C<(??{})>, has been heavily reworked to eliminate a whole slew of bugs. The main user-visible changes are: =over 4 =item * Code blocks within patterns are now parsed in the same pass as the surrounding code; in particular it is no longer necessary to have balanced braces: this now works: /(?{ $x='{' })/ This means that this error message is no longer generated: Sequence (?{...}) not terminated or not {}-balanced in regex but a new error may be seen: Sequence (?{...}) not terminated with ')' In addition, literal code blocks within run-time patterns are only compiled once, at perl compile-time: for my $p (...) 
{ # this 'FOO' block of code is compiled once, # at the same time as the surrounding 'for' loop /$p{(?{FOO;})/; } =item * Lexical variables are now sane as regards scope, recursion and closure behavior. In particular, C</A(?{B})C/> behaves (from a closure viewpoint) exactly like C</A/ && do { B } && /C/>, while C<qr/A(?{B})C/> is like C<sub {/A/ && do { B } && /C/}>. So this code now works how you might expect, creating three regexes that match 0, 1, and 2: for my $i (0..2) { push @r, qr/^(??{$i})$/; } "1" =~ $r[1]; # matches =item * The C<use re 'eval'> pragma is now only required for code blocks defined at runtime; in particular in the following, the text of the C<$r> pattern is still interpolated into the new pattern and recompiled, but the individual compiled code-blocks within C<$r> are reused rather than being recompiled, and C<use re 'eval'> isn't needed any more: my $r = qr/abc(?{....})def/; /xyz$r/; =item * Flow control operators no longer crash. Each code block runs in a new dynamic scope, so C<next> etc. will not see any enclosing loops. C<return> returns a value from the code block, not from any enclosing subroutine. =item * Perl normally caches the compilation of run-time patterns, and doesn't recompile if the pattern hasn't changed, but this is now disabled if required for the correct behavior of closures. For example: my $code = '(??{$x})'; for my $x (1..3) { # recompile to see fresh value of $x each time $x =~ /$code/; } =item * The C</msix> and C<(?msix)> etc. flags are now propagated into the return value from C<(??{})>; this now works: "AB" =~ /a(??{'b'})/i; =item * Warnings and errors will appear to come from the surrounding code (or for run-time code blocks, from an eval) rather than from an C<re_eval>: use re 'eval'; $c = '(?{ warn "foo" })'; /$c/; /(?{ warn "foo" })/; formerly gave: foo at (re_eval 1) line 1. foo at (re_eval 2) line 1. and now gives: foo at (eval 1) line 1. foo at /some/prog line 2. 
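To make the closure behaviour of code blocks described in the list above concrete, here is a small self-contained sketch built on the same C<qr/^(??{$i})$/> loop. The array names are illustrative; the point is that each compiled pattern is expected to remember the value of C<$i> from its own iteration.

```perl
use strict;
use warnings;

my @r;
for my $i (0..2) {
    # Each qr// is expected to close over the $i of its own iteration.
    push @r, qr/^(??{$i})$/;
}

# Try the string "1" against each of the three patterns in turn.
my @hits;
for my $n (0..2) {
    push @hits, ("1" =~ $r[$n]) ? 1 : 0;
}
print "@hits\n";
```

Because the code block is written literally inside C<qr//>, no C<use re 'eval'> is needed here.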
=back =item * Perl now can be recompiled to use any Unicode version. In v5.16, it worked on Unicodes 6.0 and 6.1, but there were various bugs if earlier releases were used; the older the release the more problems. =item * C<vec> no longer produces "uninitialized" warnings in lvalue context [perl #9423]. =item * An optimization involving fixed strings in regular expressions could cause a severe performance penalty in edge cases. This has been fixed [perl #76546]. =item * In certain cases, including empty subpatterns within a regular expression (such as C<(?:)> or C<(?:|)>) could disable some optimizations. This has been fixed. =item * The "Can't find an opnumber" message that C<prototype> produces when passed a string like "CORE::nonexistent_keyword" now passes UTF-8 and embedded NULs through unchanged [perl #97478]. =item * C<prototype> now treats magical variables like C<$1> the same way as non-magical variables when checking for the CORE:: prefix, instead of treating them as subroutine names. =item * Under threaded perls, a runtime code block in a regular expression could corrupt the package name stored in the op tree, resulting in bad reads in C<caller>, and possibly crashes [perl #113060]. =item * Referencing a closure prototype (C<\&{$_[1]}> in an attribute handler for a closure) no longer results in a copy of the subroutine (or assertion failures on debugging builds). =item * C<eval '__PACKAGE__'> now returns the right answer on threaded builds if the current package has been assigned over (as in C<*ThisPackage:: = *ThatPackage::>) [perl #78742]. =item * If a package is deleted by code that it calls, it is possible for C<caller> to see a stack frame belonging to that deleted package. C<caller> could crash if the stash's memory address was reused for a scalar and a substitution was performed on the same scalar [perl #113486]. =item * C<UNIVERSAL::can> no longer treats its first argument differently depending on whether it is a string or number internally. 
=item * C<open> with C<< <& >> for the mode checks to see whether the third argument is a number, in determining whether to treat it as a file descriptor or a handle name. Magical variables like C<$1> were always failing the numeric check and being treated as handle names. =item * C<warn>'s handling of magical variables (C<$1>, ties) has undergone several fixes. C<FETCH> is only called once now on a tied argument or a tied C<$@> [perl #97480]. Tied variables returning objects that stringify as "" are no longer ignored. A tied C<$@> that happened to return a reference the I<previous> time it was used is no longer ignored. =item * C<warn ""> now treats C<$@> with a number in it the same way, regardless of whether it happened via C<$@=3> or C<$@="3">. It used to ignore the former. Now it appends "\t...caught", as it has always done with C<$@="3">. =item * Numeric operators on magical variables (e.g., S<C<$1 + 1>>) used to use floating point operations even where integer operations were more appropriate, resulting in loss of accuracy on 64-bit platforms [perl #109542]. =item * Unary negation no longer treats a string as a number if the string happened to be used as a number at some point. So, if C<$x> contains the string "dogs", C<-$x> returns "-dogs" even if C<$y=0+$x> has happened at some point. =item * In Perl v5.14, C<-'-10'> was fixed to return "10", not "+10". But magical variables (C<$1>, ties) were not fixed till now [perl #57706]. =item * Unary negation now treats strings consistently, regardless of the internal C<UTF8> flag. =item * A regression introduced in Perl v5.16.0 involving C<tr/I<SEARCHLIST>/I<REPLACEMENTLIST>/> has been fixed. Only the first instance is supposed to be meaningful if a character appears more than once in C<I<SEARCHLIST>>. Under some circumstances, the final instance was overriding all earlier ones. 
[perl #113584] =item * Regular expressions like C<qr/\87/> previously silently inserted a NUL character, thus matching as if it had been written C<qr/\00087/>. Now it matches as if it had been written as C<qr/87/>, with a message that the sequence C<"\8"> is unrecognized. =item * C<__SUB__> now works in special blocks (C<BEGIN>, C<END>, etc.). =item * Thread creation on Windows could theoretically result in a crash if done inside a C<BEGIN> block. It still does not work properly, but it no longer crashes [perl #111610]. =item * C<\&{''}> (with the empty string) now autovivifies a stub like any other sub name, and no longer produces the "Unable to create sub" error [perl #94476]. =item * A regression introduced in v5.14.0 has been fixed, in which some calls to the C<re> module would clobber C<$_> [perl #113750]. =item * C<do FILE> now always either sets or clears C<$@>, even when the file can't be read. This ensures that testing C<$@> first (as recommended by the documentation) always returns the correct result. =item * The array iterator used for the C<each @array> construct is now correctly reset when C<@array> is cleared [perl #75596]. This happens, for example, when the array is globally assigned to, as in C<@array = (...)>, but not when its B<values> are assigned to. In terms of the XS API, it means that C<av_clear()> will now reset the iterator. This mirrors the behaviour of the hash iterator when the hash is cleared. =item * C<< $class->can >>, C<< $class->isa >>, and C<< $class->DOES >> now return correct results, regardless of whether that package referred to by C<$class> exists [perl #47113]. =item * Arriving signals no longer clear C<$@> [perl #45173]. =item * Allow C<my ()> declarations with an empty variable list [perl #113554]. =item * During parsing, subs declared after errors no longer leave stubs [perl #113712]. 
=item * Closures containing no string evals no longer hang on to their containing subroutines, allowing variables closed over by outer subroutines to be freed when the outer sub is freed, even if the inner sub still exists [perl #89544]. =item * Duplication of in-memory filehandles by opening with a "<&=" or ">&=" mode stopped working properly in v5.16.0. It was causing the new handle to reference a different scalar variable. This has been fixed [perl #113764]. =item * C<qr//> expressions no longer crash with custom regular expression engines that do not set C<offs> at regular expression compilation time [perl #112962]. =item * C<delete local> no longer crashes with certain magical arrays and hashes [perl #112966]. =item * C<local> on elements of certain magical arrays and hashes used not to arrange to have the element deleted on scope exit, even if the element did not exist before C<local>. =item * C<scalar(write)> no longer returns multiple items [perl #73690]. =item * String to floating point conversions no longer misparse certain strings under C<use locale> [perl #109318]. =item * C<@INC> filters that die no longer leak memory [perl #92252]. =item * The implementations of overloaded operations are now called in the correct context. This allows, among other things, being able to properly override C<< <> >> [perl #47119]. =item * Specifying only the C<fallback> key when calling C<use overload> now behaves properly [perl #113010]. =item * C<< sub foo { my $a = 0; while ($a) { ... } } >> and C<< sub foo { while (0) { ... } } >> now return the same thing [perl #73618]. =item * String negation now behaves the same under C<use integer;> as it does without [perl #113012]. =item * C<chr> now returns the Unicode replacement character (U+FFFD) for -1, regardless of the internal representation. -1 used to wrap if the argument was tied or a string internally. 
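A minimal check of the C<chr> behaviour described in the last item above. Warnings are switched off locally because C<chr> complains about a negative argument; the variable name is illustrative.

```perl
use strict;
use warnings;

my $char = do {
    no warnings;    # chr(-1) warns about the negative argument
    chr(-1);
};
# Show the code point that came back (outside the bytes pragma).
printf "U+%04X\n", ord $char;
```

This shows U+FFFD, the Unicode replacement character, rather than a wrapped-around value.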
=item * Using a C<format> after its enclosing sub was freed could crash as of perl v5.12.0, if the format referenced lexical variables from the outer sub. =item * Using a C<format> after its enclosing sub was undefined could crash as of perl v5.10.0, if the format referenced lexical variables from the outer sub. =item * Using a C<format> defined inside a closure, which format references lexical variables from outside, never really worked unless the C<write> call was directly inside the closure. In v5.10.0 it even started crashing. Now the copy of that closure nearest the top of the call stack is used to find those variables. =item * Formats that close over variables in special blocks no longer crash if a stub exists with the same name as the special block before the special block is compiled. =item * The parser no longer gets confused, treating C<eval foo ()> as a syntax error if preceded by C<print;> [perl #16249]. =item * The return value of C<syscall> is no longer truncated on 64-bit platforms [perl #113980]. =item * Constant folding no longer causes C<print 1 ? FOO : BAR> to print to the FOO handle [perl #78064]. =item * C<do subname> now calls the named subroutine and uses the file name it returns, instead of opening a file named "subname". =item * Subroutines looked up by rv2cv check hooks (registered by XS modules) are now taken into consideration when determining whether C<foo bar> should be the sub call C<foo(bar)> or the method call C<< "bar"->foo >>. =item * C<CORE::foo::bar> is no longer treated specially, allowing global overrides to be called directly via C<CORE::GLOBAL::uc(...)> [perl #113016]. =item * Calling an undefined sub whose typeglob has been undefined now produces the customary "Undefined subroutine called" error, instead of "Not a CODE reference". =item * Two bugs involving @ISA have been fixed. 
C<*ISA = *glob_without_array> and C<undef *ISA; @{*ISA}> would prevent future modifications to @ISA from updating the internal caches used to look up methods. The *glob_without_array case was a regression from Perl v5.12. =item * Regular expression optimisations sometimes caused C<$> with C</m> to produce failed or incorrect matches [perl #114068]. =item * C<__SUB__> now works in a C<sort> block when the enclosing subroutine is predeclared with C<sub foo;> syntax [perl #113710]. =item * Unicode properties only apply to Unicode code points, which leads to some subtleties when regular expressions are matched against above-Unicode code points. There is a warning generated to draw your attention to this. However, this warning was being generated inappropriately in some cases, such as when a program was being parsed. Non-Unicode matches such as C<\w> and C<[:word:]> should not generate the warning, as their definitions don't limit them to apply to only Unicode code points. Now the message is only generated when matching against C<\p{}> and C<\P{}>. There remains a bug, [perl #114148], for the very few properties in Unicode that match just a single code point. The warning is not generated if they are matched against an above-Unicode code point. =item * Uninitialized warnings mentioning hash elements would only mention the element name if it was not in the first bucket of the hash, due to an off-by-one error. =item * A regular expression optimizer bug could cause multiline "^" to behave incorrectly in the presence of line breaks, such that C<"/\n\n" =~ m#\A(?:^/$)#im> would not match [perl #115242]. =item * Failed C<fork> in list context no longer corrupts the stack. C<@a = (1, 2, fork, 3)> used to gobble up the 2 and assign C<(1, undef, 3)> if the C<fork> call failed. =item * Numerous memory leaks have been fixed, mostly involving tied variables that die, regular expression character classes and code blocks, and syntax errors. 
=item * Assigning a regular expression (C<${qr//}>) to a variable that happens to hold a floating point number no longer causes assertion failures on debugging builds. =item * Assigning a regular expression to a scalar containing a number no longer causes subsequent numification to produce random numbers. =item * Assigning a regular expression to a magic variable no longer wipes away the magic. This was a regression from v5.10. =item * Assigning a regular expression to a blessed scalar no longer results in crashes. This was also a regression from v5.10. =item * Regular expressions can now be assigned to tied hash and array elements without flattening into strings. =item * Numifying a regular expression no longer results in an uninitialized warning. =item * Negative array indices no longer cause EXISTS methods of tied variables to be ignored. This was a regression from v5.12. =item * Negative array indices no longer result in crashes on arrays tied to non-objects. =item * C<$byte_overload .= $utf8> no longer results in doubly-encoded UTF-8 if the left-hand scalar happened to have produced a UTF-8 string the last time overloading was invoked. =item * C<goto &sub> now uses the current value of @_, instead of using the array the subroutine was originally called with. This means C<local @_ = (...); goto &sub> now works [perl #43077]. =item * If a debugger is invoked recursively, it no longer stomps on its own lexical variables. Formerly under recursion all calls would share the same set of lexical variables [perl #115742]. =item * C<*_{ARRAY}> returned from a subroutine no longer spontaneously becomes empty. =item * When using C<say> to print to a tied filehandle, the value of C<$\> is correctly localized, even if it was previously undef [perl #119927]. =back =head1 Known Problems =over 4 =item * UTF8-flagged strings in C<%ENV> on HP-UX 11.00 are buggy The interaction of UTF8-flagged strings and C<%ENV> on HP-UX 11.00 is currently dodgy in some not-yet-fully-diagnosed way.
Expect test failures in F<t/op/magic.t>, followed by unknown behavior when storing wide characters in the environment. =back =head1 Obituary Hojung Yoon (AMORETTE), 24, of Seoul, South Korea, went to his long rest on May 8, 2013 with llama figurine and autographed TIMTOADY card. He was a brilliant young Perl 5 & 6 hacker and a devoted member of Seoul.pm. He programmed Perl, talked Perl, ate Perl, and loved Perl. We believe that he is still programming in Perl with his broken IBM laptop somewhere. He will be missed. =head1 Acknowledgements Perl v5.18.0 represents approximately 12 months of development since Perl v5.16.0 and contains approximately 400,000 lines of changes across 2,100 files from 113 authors. Perl continues to flourish into its third decade thanks to a vibrant community of users and developers. The following people are known to have contributed the improvements that became Perl v5.18.0: Aaron Crane, Aaron Trevena, Abhijit Menon-Sen, Adrian M. Enache, Alan Haggai Alavi, Alexandr Ciornii, Andrew Tam, Andy Dougherty, Anton Nikishaev, Aristotle Pagaltzis, Augustina Blair, Bob Ernst, Brad Gilbert, Breno G. de Oliveira, Brian Carlson, Brian Fraser, Charlie Gonzalez, Chip Salzenberg, Chris 'BinGOs' Williams, Christian Hansen, Colin Kuskie, Craig A. Berry, Dagfinn Ilmari Mannsåker, Daniel Dragan, Daniel Perrett, Darin McBride, Dave Rolsky, David Golden, David Leadbeater, David Mitchell, David Nicol, Dominic Hargreaves, E. Choroba, Eric Brine, Evan Miller, Father Chrysostomos, Florian Ragwitz, François Perrad, George Greer, Goro Fuji, H.Merijn Brand, Herbert Breunung, Hugo van der Sanden, Igor Zaytsev, James E Keenan, Jan Dubois, Jasmine Ahuja, Jerry D. 
Hedden, Jess Robinson, Jesse Luehrs, Joaquin Ferrero, Joel Berger, John Goodyear, John Peacock, Karen Etheridge, Karl Williamson, Karthik Rajagopalan, Kent Fredric, Leon Timmermans, Lucas Holt, Lukas Mai, Marcus Holland-Moritz, Markus Jansen, Martin Hasch, Matthew Horsfall, Max Maischein, Michael G Schwern, Michael Schroeder, Moritz Lenz, Nicholas Clark, Niko Tyni, Oleg Nesterov, Patrik Hägglund, Paul Green, Paul Johnson, Paul Marquess, Peter Martini, Rafael Garcia-Suarez, Reini Urban, Renee Baecker, Rhesa Rozendaal, Ricardo Signes, Robin Barker, Ronald J. Kimball, Ruslan Zakirov, Salvador Fandiño, Sawyer X, Scott Lanning, Sergey Alekseev, Shawn M Moore, Shirakata Kentaro, Shlomi Fish, Sisyphus, Smylers, Steffen Müller, Steve Hay, Steve Peters, Steven Schubiger, Sullivan Beck, Sven Strickroth, Sébastien Aperghis-Tramoni, Thomas Sibley, Tobias Leich, Tom Wyant, Tony Cook, Vadim Konovalov, Vincent Pit, Volker Schatz, Walt Mankowski, Yves Orton, Zefram. The list above is almost certainly incomplete as it is automatically generated from version control history. In particular, it does not include the names of the (very much appreciated) contributors who reported issues to the Perl bug tracker. Many of the changes included in this version originated in the CPAN modules included in Perl's core. We're grateful to the entire CPAN community for helping Perl to flourish. For a more complete list of all of Perl's historical contributors, please see the F<AUTHORS> file in the Perl source distribution. =head1 Reporting Bugs If you find what you think is a bug, you might check the articles recently posted to the comp.lang.perl.misc newsgroup and the perl bug database at http://rt.perl.org/perlbug/ . There may also be information at http://www.perl.org/ , the Perl Home Page. If you believe you have an unreported bug, please run the L<perlbug> program included with your release. Be sure to trim your bug down to a tiny but sufficient test case. 
Your bug report, along with the output of C<perl -V>, will be sent off to perlbug@perl.org to be analysed by the Perl porting team. If the bug you are reporting has security implications, which make it inappropriate to send to a publicly archived mailing list, then please send it to perl5-security-report@perl.org. This points to a closed subscription unarchived mailing list, which includes all the core committers, who will be able to help assess the impact of issues, figure out a resolution, and help co-ordinate the release of patches to mitigate or fix the problem across all platforms on which Perl is supported. Please only use this address for security issues in the Perl core, not for modules independently distributed on CPAN. =head1 SEE ALSO The F<Changes> file for an explanation of how to view exhaustive details on what changed. The F<INSTALL> file for how to build Perl. The F<README> file for general stuff. The F<Artistic> and F<Copying> files for copyright information. =cut If you read this file _as_is_, just ignore the funny characters you see. It is written in the POD format (see pod/perlpod.pod) which is specially designed to be readable as is. =head1 NAME perlhaiku - Perl version 5.10+ on Haiku =head1 DESCRIPTION This file contains instructions how to build Perl for Haiku and lists known problems. =head1 BUILD AND INSTALL The build procedure is completely standard: ./Configure -de make make install Make perl executable and create a symlink for libperl: chmod a+x /boot/common/bin/perl cd /boot/common/lib; ln -s perl5/5.32.1/BePC-haiku/CORE/libperl.so . Replace C<5.32.1> with your respective version of Perl. =head1 KNOWN PROBLEMS The following problems are encountered with Haiku revision 28311: =over 4 =item * Perl cannot be compiled with threading support ATM. =item * The F<cpan/Socket/t/socketpair.t> test fails. More precisely: the subtests using datagram sockets fail.
Unix datagram sockets aren't implemented in Haiku yet. =item * A subtest of the F<cpan/Sys-Syslog/t/syslog.t> test fails. This is due to Haiku not implementing F</dev/log> support yet. =item * The tests F<dist/Net-Ping/t/450_service.t> and F<dist/Net-Ping/t/510_ping_udp.t> fail. This is due to bugs in Haiku's network stack implementation. =back =head1 CONTACT For Haiku specific problems contact the HaikuPorts developers: L<http://ports.haiku-files.org/> The initial Haiku port was done by Ingo Weinhold <ingo_weinhold@gmx.de>. Last update: 2008-10-29 =encoding utf8 =head1 NAME perl5242delta - what is new for perl v5.24.2 =head1 DESCRIPTION This document describes differences between the 5.24.1 release and the 5.24.2 release. If you are upgrading from an earlier release such as 5.24.0, first read L<perl5241delta>, which describes differences between 5.24.0 and 5.24.1. =head1 Security =head2 Improved handling of '.' in @INC in base.pm The handling of (the removal of) C<'.'> in C<@INC> in L<base> has been improved. This resolves some problematic behaviour in the approach taken in Perl 5.24.1, which is probably best described in the following two threads on the Perl 5 Porters mailing list: L<http://www.nntp.perl.org/group/perl.perl5.porters/2016/08/msg238991.html>, L<http://www.nntp.perl.org/group/perl.perl5.porters/2016/10/msg240297.html>. =head2 "Escaped" colons and relative paths in PATH On Unix systems, Perl treats any relative paths in the PATH environment variable as tainted when starting a new process. Previously, it was allowing a backslash to escape a colon (unlike the OS), consequently allowing relative paths to be considered safe if the PATH was set to something like C</\:.>. The check has been fixed to treat C<.> as tainted in that example. =head1 Modules and Pragmata =head2 Updated Modules and Pragmata =over 4 =item * L<base> has been upgraded from version 2.23 to 2.23_01.
=item * L<Module::CoreList> has been upgraded from version 5.20170114_24 to 5.20170715_24. =back =head1 Selected Bug Fixes =over 4 =item * Fixed a crash with C<s///l> where it thought it was dealing with UTF-8 when it wasn't. L<[perl #129038]|https://rt.perl.org/Ticket/Display.html?id=129038> =back =head1 Acknowledgements Perl 5.24.2 represents approximately 6 months of development since Perl 5.24.1 and contains approximately 2,500 lines of changes across 53 files from 18 authors. Excluding auto-generated files, documentation and release tools, there were approximately 960 lines of changes to 17 .pm, .t, .c and .h files. Perl continues to flourish into its third decade thanks to a vibrant community of users and developers. The following people are known to have contributed the improvements that became Perl 5.24.2: Aaron Crane, Abigail, Aristotle Pagaltzis, Chris 'BinGOs' Williams, Dan Collins, David Mitchell, Eric Herman, Father Chrysostomos, James E Keenan, Karl Williamson, Lukas Mai, Renee Baecker, Ricardo Signes, Sawyer X, Stevan Little, Steve Hay, Tony Cook, Yves Orton. The list above is almost certainly incomplete as it is automatically generated from version control history. In particular, it does not include the names of the (very much appreciated) contributors who reported issues to the Perl bug tracker. Many of the changes included in this version originated in the CPAN modules included in Perl's core. We're grateful to the entire CPAN community for helping Perl to flourish. For a more complete list of all of Perl's historical contributors, please see the F<AUTHORS> file in the Perl source distribution. =head1 Reporting Bugs If you find what you think is a bug, you might check the articles recently posted to the comp.lang.perl.misc newsgroup and the perl bug database at L<https://rt.perl.org/> . There may also be information at L<http://www.perl.org/> , the Perl Home Page. 
If you believe you have an unreported bug, please run the L<perlbug> program included with your release. Be sure to trim your bug down to a tiny but sufficient test case. Your bug report, along with the output of C<perl -V>, will be sent off to perlbug@perl.org to be analysed by the Perl porting team. If the bug you are reporting has security implications which make it inappropriate to send to a publicly archived mailing list, then see L<perlsec/SECURITY VULNERABILITY CONTACT INFORMATION> for details of how to report the issue. =head1 SEE ALSO The F<Changes> file for an explanation of how to view exhaustive details on what changed. The F<INSTALL> file for how to build Perl. The F<README> file for general stuff. The F<Artistic> and F<Copying> files for copyright information. =cut =head1 NAME perlretut - Perl regular expressions tutorial =head1 DESCRIPTION This page provides a basic tutorial on understanding, creating and using regular expressions in Perl. It serves as a complement to the reference page on regular expressions L<perlre>. Regular expressions are an integral part of the C<m//>, C<s///>, C<qr//> and C<split> operators and so this tutorial also overlaps with L<perlop/"Regexp Quote-Like Operators"> and L<perlfunc/split>. Perl is widely renowned for excellence in text processing, and regular expressions are one of the big factors behind this fame. Perl regular expressions display an efficiency and flexibility unknown in most other computer languages. Mastering even the basics of regular expressions will allow you to manipulate text with surprising ease. What is a regular expression? At its most basic, a regular expression is a template that is used to determine if a string has certain characteristics. The string is most often some text, such as a line, sentence, web page, or even a whole book, but less commonly it could be some binary data as well.
Suppose we want to determine if the text in the variable C<$var> contains the sequence of characters S<C<m u s h r o o m>> (blanks added for legibility). We can write in Perl $var =~ m/mushroom/ The value of this expression will be TRUE if C<$var> contains that sequence of characters, and FALSE otherwise. The portion enclosed in C<'E<sol>'> characters denotes the characteristic we are looking for. We use the term I<pattern> for it. The process of looking to see if the pattern occurs in the string is called I<matching>, and the C<"=~"> operator along with the C<m//> tell Perl to try to match the pattern against the string. Note that the pattern is also a string, but a very special kind of one, as we will see. Patterns are in common use these days; examples are the patterns typed into a search engine to find web pages and the patterns used to list files in a directory, I<e.g.>, "C<ls *.txt>" or "C<dir *.*>". In Perl, the patterns described by regular expressions are used not only to search strings, but to also extract desired parts of strings, and to do search and replace operations. Regular expressions have the undeserved reputation of being abstract and difficult to understand. This reputation stems simply from the notation used to express them, which tends to be terse and dense, and not from any inherent complexity. We recommend using the C</x> regular expression modifier (described below) along with plenty of white space to make them less dense, and easier to read. Regular expressions are constructed using simple concepts like conditionals and loops and are no more difficult to understand than the corresponding C<if> conditionals and C<while> loops in the Perl language itself. This tutorial flattens the learning curve by discussing regular expression concepts, along with their notation, one at a time and with many examples. The first part of the tutorial will progress from the simplest word searches to the basic regular expression concepts.
If you master the first part, you will have all the tools needed to solve about 98% of your needs. The second part of the tutorial is for those comfortable with the basics and hungry for more power tools. It discusses the more advanced regular expression operators and introduces the latest cutting-edge innovations. A note: to save time, "regular expression" is often abbreviated as regexp or regex. Regexp is a more natural abbreviation than regex, but is harder to pronounce. The Perl pod documentation is evenly split on regexp vs regex; in Perl, there is more than one way to abbreviate it. We'll use regexp in this tutorial. New in v5.22, L<C<use re 'strict'>|re/'strict' mode> applies stricter rules than otherwise when compiling regular expression patterns. It can find things that, while legal, may not be what you intended. =head1 Part 1: The basics =head2 Simple word matching The simplest regexp is simply a word, or more generally, a string of characters. A regexp consisting of just a word matches any string that contains that word: "Hello World" =~ /World/; # matches What is this Perl statement all about? C<"Hello World"> is a simple double-quoted string. C<World> is the regular expression and the C<//> enclosing C</World/> tells Perl to search a string for a match. The operator C<=~> associates the string with the regexp match and produces a true value if the regexp matched, or false if the regexp did not match. In our case, C<World> matches the second word in C<"Hello World">, so the expression is true. Expressions like this are useful in conditionals: if ("Hello World" =~ /World/) { print "It matches\n"; } else { print "It doesn't match\n"; } There are useful variations on this theme. 
The sense of the match can be reversed by using the C<!~> operator: if ("Hello World" !~ /World/) { print "It doesn't match\n"; } else { print "It matches\n"; } The literal string in the regexp can be replaced by a variable: my $greeting = "World"; if ("Hello World" =~ /$greeting/) { print "It matches\n"; } else { print "It doesn't match\n"; } If you're matching against the special default variable C<$_>, the C<$_ =~> part can be omitted: $_ = "Hello World"; if (/World/) { print "It matches\n"; } else { print "It doesn't match\n"; } And finally, the C<//> default delimiters for a match can be changed to arbitrary delimiters by putting an C<'m'> out front: "Hello World" =~ m!World!; # matches, delimited by '!' "Hello World" =~ m{World}; # matches, note the matching '{}' "/usr/bin/perl" =~ m"/perl"; # matches after '/usr/bin', # '/' becomes an ordinary char C</World/>, C<m!World!>, and C<m{World}> all represent the same thing. When, I<e.g.>, the quote (C<'"'>) is used as a delimiter, the forward slash C<'/'> becomes an ordinary character and can be used in this regexp without trouble. Let's consider how different regexps would match C<"Hello World">: "Hello World" =~ /world/; # doesn't match "Hello World" =~ /o W/; # matches "Hello World" =~ /oW/; # doesn't match "Hello World" =~ /World /; # doesn't match The first regexp C<world> doesn't match because regexps are case-sensitive. The second regexp matches because the substring S<C<'o W'>> occurs in the string S<C<"Hello World">>. The space character C<' '> is treated like any other character in a regexp and is needed to match in this case. The lack of a space character is the reason the third regexp C<'oW'> doesn't match. The fourth regexp "C<World >" doesn't match because there is a space at the end of the regexp, but not at the end of the string. The lesson here is that regexps must match a part of the string I<exactly> in order for the statement to be true. 
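The four cases just discussed can also be checked mechanically. The script below is only a sketch: the helper C<report> and the hash C<%result> are names invented for this illustration, and the patterns are built with C<qr//>, which precompiles a regexp so that it can be stored in a variable.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of the tutorial): says whether a
# precompiled pattern matches somewhere in the given string.
sub report {
    my ($string, $pattern) = @_;
    return $string =~ $pattern ? 'matches' : "doesn't match";
}

my $text = "Hello World";

# Each entry pairs a label with a pattern built by qr//.
my @checks = (
    [ 'world',    qr/world/  ],   # wrong case, regexps are case-sensitive
    [ 'o W',      qr/o W/    ],   # the space is an ordinary character
    [ 'oW',       qr/oW/     ],   # the space is missing
    [ 'World ',   qr/World / ],   # trailing space is not in the string
    [ 'm{World}', qr{World}  ],   # alternative delimiters, same pattern
);

my %result;
for my $check (@checks) {
    my ($label, $pattern) = @$check;
    $result{$label} = report($text, $pattern);
    printf "%-12s %s\n", "'$label'", $result{$label};
}
```

Only C<'o W'> and C<m{World}> are reported as matching; the other three fail because some character in the pattern has no exact counterpart in the string.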
If a regexp matches in more than one place in the string, Perl will always match at the earliest possible point in the string: "Hello World" =~ /o/; # matches 'o' in 'Hello' "That hat is red" =~ /hat/; # matches 'hat' in 'That' With respect to character matching, there are a few more points you need to know about. First of all, not all characters can be used "as is" in a match. Some characters, called I<metacharacters>, are generally reserved for use in regexp notation. The metacharacters are {}[]()^$.|*+?-#\ This list is not as definitive as it may appear (or be claimed to be in other documentation). For example, C<"#"> is a metacharacter only when the C</x> pattern modifier (described below) is used, and both C<"}"> and C<"]"> are metacharacters only when paired with opening C<"{"> or C<"["> respectively; other gotchas apply. The significance of each of these will be explained in the rest of the tutorial, but for now, it is important only to know that a metacharacter can be matched as-is by putting a backslash before it: "2+2=4" =~ /2+2/; # doesn't match, + is a metacharacter "2+2=4" =~ /2\+2/; # matches, \+ is treated like an ordinary + "The interval is [0,1)." =~ /[0,1)./ # is a syntax error! "The interval is [0,1)." =~ /\[0,1\)\./ # matches "#!/usr/bin/perl" =~ /#!\/usr\/bin\/perl/; # matches In the last regexp, the forward slash C<'/'> is also backslashed, because it is used to delimit the regexp. This can lead to LTS (leaning toothpick syndrome), however, and it is often more readable to change delimiters. "#!/usr/bin/perl" =~ m!#\!/usr/bin/perl!; # easier to read The backslash character C<'\'> is a metacharacter itself and needs to be backslashed: 'C:\WIN32' =~ /C:\\WIN/; # matches In situations where it doesn't make sense for a particular metacharacter to mean what it normally does, it automatically loses its metacharacter-ness and becomes an ordinary character that is to be matched literally. 
For example, the C<'}'> is a metacharacter only when it is the mate of a C<'{'> metacharacter. Otherwise it is treated as a literal RIGHT CURLY BRACKET. This may lead to unexpected results. L<C<use re 'strict'>|re/'strict' mode> can catch some of these. In addition to the metacharacters, there are some ASCII characters which don't have printable character equivalents and are instead represented by I<escape sequences>. Common examples are C<\t> for a tab, C<\n> for a newline, C<\r> for a carriage return and C<\a> for a bell (or alert). If your string is better thought of as a sequence of arbitrary bytes, the octal escape sequence, I<e.g.>, C<\033>, or hexadecimal escape sequence, I<e.g.>, C<\x1B> may be a more natural representation for your bytes. Here are some examples of escapes: "1000\t2000" =~ m(0\t2) # matches "1000\n2000" =~ /0\n20/ # matches "1000\t2000" =~ /\000\t2/ # doesn't match, "0" ne "\000" "cat" =~ /\o{143}\x61\x74/ # matches in ASCII, but a weird way # to spell cat If you've been around Perl a while, all this talk of escape sequences may seem familiar. Similar escape sequences are used in double-quoted strings and in fact the regexps in Perl are mostly treated as double-quoted strings. This means that variables can be used in regexps as well. Just like double-quoted strings, the values of the variables in the regexp will be substituted in before the regexp is evaluated for matching purposes. So we have: $foo = 'house'; 'housecat' =~ /$foo/; # matches 'cathouse' =~ /cat$foo/; # matches 'housecat' =~ /${foo}cat/; # matches So far, so good. With the knowledge above you can already perform searches with just about any literal string regexp you can dream up. 
Here is a I<very simple> emulation of the Unix grep program: % cat > simple_grep #!/usr/bin/perl $regexp = shift; while (<>) { print if /$regexp/; } ^D % chmod +x simple_grep % simple_grep abba /usr/dict/words Babbage cabbage cabbages sabbath Sabbathize Sabbathizes sabbatical scabbard scabbards This program is easy to understand. C<#!/usr/bin/perl> is the standard way to invoke a perl program from the shell. S<C<$regexp = shift;>> saves the first command line argument as the regexp to be used, leaving the rest of the command line arguments to be treated as files. S<C<< while (<>) >>> loops over all the lines in all the files. For each line, S<C<print if /$regexp/;>> prints the line if the regexp matches the line. In this line, both C<print> and C</$regexp/> use the default variable C<$_> implicitly. With all of the regexps above, if the regexp matched anywhere in the string, it was considered a match. Sometimes, however, we'd like to specify I<where> in the string the regexp should try to match. To do this, we would use the I<anchor> metacharacters C<'^'> and C<'$'>. The anchor C<'^'> means match at the beginning of the string and the anchor C<'$'> means match at the end of the string, or before a newline at the end of the string. Here is how they are used: "housekeeper" =~ /keeper/; # matches "housekeeper" =~ /^keeper/; # doesn't match "housekeeper" =~ /keeper$/; # matches "housekeeper\n" =~ /keeper$/; # matches The second regexp doesn't match because C<'^'> constrains C<keeper> to match only at the beginning of the string, but C<"housekeeper"> has keeper starting in the middle. The third regexp does match, since the C<'$'> constrains C<keeper> to match only at the end of the string. When both C<'^'> and C<'$'> are used at the same time, the regexp has to match both the beginning and the end of the string, I<i.e.>, the regexp matches the whole string. 
Consider "keeper" =~ /^keep$/; # doesn't match "keeper" =~ /^keeper$/; # matches "" =~ /^$/; # ^$ matches an empty string The first regexp doesn't match because the string has more to it than C<keep>. Since the second regexp is exactly the string, it matches. Using both C<'^'> and C<'$'> in a regexp forces the complete string to match, so it gives you complete control over which strings match and which don't. Suppose you are looking for a fellow named bert, off in a string by himself: "dogbert" =~ /bert/; # matches, but not what you want "dilbert" =~ /^bert/; # doesn't match, but .. "bertram" =~ /^bert/; # matches, so still not good enough "bertram" =~ /^bert$/; # doesn't match, good "dilbert" =~ /^bert$/; # doesn't match, good "bert" =~ /^bert$/; # matches, perfect Of course, in the case of a literal string, one could just as easily use the string comparison S<C<$string eq 'bert'>> and it would be more efficient. The C<^...$> regexp really becomes useful when we add in the more powerful regexp tools below. =head2 Using character classes Although one can already do quite a lot with the literal string regexps above, we've only scratched the surface of regular expression technology. In this and subsequent sections we will introduce regexp concepts (and associated metacharacter notations) that will allow a regexp to represent not just a single character sequence, but a I<whole class> of them. One such concept is that of a I<character class>. A character class allows a set of possible characters, rather than just a single character, to match at a particular point in a regexp. You can define your own custom character classes. These are denoted by brackets C<[...]>, with the set of characters to be possibly matched inside. Here are some examples: /cat/; # matches 'cat' /[bcr]at/; # matches 'bat', 'cat', or 'rat' /item[0123456789]/; # matches 'item0' or ...
or 'item9' "abc" =~ /[cab]/; # matches 'a' In the last statement, even though C<'c'> is the first character in the class, C<'a'> matches because the first character position in the string is the earliest point at which the regexp can match. /[yY][eE][sS]/; # match 'yes' in a case-insensitive way # 'yes', 'Yes', 'YES', etc. This regexp displays a common task: perform a case-insensitive match. Perl provides a way of avoiding all those brackets by simply appending an C<'i'> to the end of the match. Then C</[yY][eE][sS]/;> can be rewritten as C</yes/i;>. The C<'i'> stands for case-insensitive and is an example of a I<modifier> of the matching operation. We will meet other modifiers later in the tutorial. We saw in the section above that there were ordinary characters, which represented themselves, and special characters, which needed a backslash C<'\'> to represent themselves. The same is true in a character class, but the sets of ordinary and special characters inside a character class are different than those outside a character class. The special characters for a character class are C<-]\^$> (and the pattern delimiter, whatever it is). C<']'> is special because it denotes the end of a character class. C<'$'> is special because it denotes a scalar variable. C<'\'> is special because it is used in escape sequences, just like above. Here is how the special characters C<]$\> are handled: /[\]c]def/; # matches ']def' or 'cdef' $x = 'bcr'; /[$x]at/; # matches 'bat', 'cat', or 'rat' /[\$x]at/; # matches '$at' or 'xat' /[\\$x]at/; # matches '\at', 'bat', 'cat', or 'rat' The last two are a little tricky. In C<[\$x]>, the backslash protects the dollar sign, so the character class has two members C<'$'> and C<'x'>. In C<[\\$x]>, the backslash is protected, so C<$x> is treated as a variable and substituted in double quote fashion. The special character C<'-'> acts as a range operator within character classes, so that a contiguous set of characters can be written as a range.
With ranges, the unwieldy C<[0123456789]> and C<[abc...xyz]>
become the svelte C<[0-9]> and C<[a-z]>. Some examples are

    /item[0-9]/;    # matches 'item0' or ... or 'item9'
    /[0-9bx-z]aa/;  # matches '0aa', ..., '9aa',
                    # 'baa', 'xaa', 'yaa', or 'zaa'
    /[0-9a-fA-F]/;  # matches a hexadecimal digit
    /[0-9a-zA-Z_]/; # matches a "word" character,
                    # like those in a Perl variable name

If C<'-'> is the first or last character in a character class, it is
treated as an ordinary character; C<[-ab]>, C<[ab-]> and C<[a\-b]> are
all equivalent.

The special character C<'^'> in the first position of a character class
denotes a I<negated character class>, which matches any character but
those in the brackets. Both C<[...]> and C<[^...]> must match a
character, or the match fails. Then

    /[^a]at/;  # doesn't match 'aat' or 'at', but matches
               # all other 'bat', 'cat', '0at', '%at', etc.
    /[^0-9]/;  # matches a non-numeric character
    /[a^]at/;  # matches 'aat' or '^at'; here '^' is ordinary

Now, even C<[0-9]> can be a bother to write multiple times, so in the
interest of saving keystrokes and making regexps more readable, Perl
has several abbreviations for common character classes, as shown below.
Since the introduction of Unicode, unless the C</a> modifier is in
effect, these character classes match more than just a few characters
in the ASCII range.

=over 4

=item *

C<\d> matches a digit, not just C<[0-9]> but also digits from
non-roman scripts

=item *

C<\s> matches a whitespace character, the set C<[\ \t\r\n\f]> and
others

=item *

C<\w> matches a word character (alphanumeric or C<'_'>), not just
C<[0-9a-zA-Z_]> but also digits and characters from non-roman scripts

=item *

C<\D> is a negated C<\d>; it represents any other character than a
digit, or C<[^\d]>

=item *

C<\S> is a negated C<\s>; it represents any non-whitespace character
C<[^\s]>

=item *

C<\W> is a negated C<\w>; it represents any non-word character C<[^\w]>

=item *

The period C<'.'> matches any character but C<"\n"> (unless the
modifier C</s> is in effect, as explained below).

=item *

C<\N>, like the period, matches any character but C<"\n">, but it does
so regardless of whether the modifier C</s> is in effect.

=back

The C</a> modifier, available starting in Perl 5.14, is used to
restrict the matches of C<\d>, C<\s>, and C<\w> to just those in the
ASCII range. It is useful to keep your program from being needlessly
exposed to full Unicode (and its accompanying security considerations)
when all you want is to process English-like text. (The "a" may be
doubled, C</aa>, to provide even more restrictions, preventing
case-insensitive matching of ASCII with non-ASCII characters;
otherwise a Unicode "Kelvin Sign" would caselessly match a "k" or "K".)

The C<\d\s\w\D\S\W> abbreviations can be used both inside and outside
of bracketed character classes. Here are some in use:

    /\d\d:\d\d:\d\d/; # matches a hh:mm:ss time format
    /[\d\s]/;         # matches any digit or whitespace character
    /\w\W\w/;         # matches a word char, followed by a
                      # non-word char, followed by a word char
    /..rt/;           # matches any two chars, followed by 'rt'
    /end\./;          # matches 'end.'
    /end[.]/;         # same thing, matches 'end.'

Because a period is a metacharacter, it needs to be escaped to match
as an ordinary period.
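As a small illustration of the C</a> modifier described above, the following sketch tests one non-ASCII digit against C<\d> with and without C</a>. The variable names are made up for this example, and Perl 5.14 or later is assumed because of C</a>.

```perl
use strict;
use warnings;

# U+0663 is ARABIC-INDIC DIGIT THREE: a Unicode digit outside the ASCII range.
my $arabic_three = "\x{0663}";

# Without /a, \d follows Unicode rules for this string and accepts the digit.
my $unicode_match = ($arabic_three =~ /^\d$/)  ? 1 : 0;

# With /a, \d is restricted to the ASCII digits [0-9], so the match fails.
my $ascii_match   = ($arabic_three =~ /^\d$/a) ? 1 : 0;

print "without /a: $unicode_match, with /a: $ascii_match\n";
```

When the input is expected to contain only ASCII digits, adding C</a> is one way to keep C<\d> from accepting more than intended.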
Because, for example, C<\d> and C<\w> are sets of characters, it is
incorrect to think of C<[^\d\w]> as C<[\D\W]>; in fact C<[^\d\w]> is
the same as C<[^\w]>, which is the same as C<[\W]>. Think De Morgan's
laws.

In actuality, the period and C<\d\s\w\D\S\W> abbreviations are
themselves types of character classes, so the ones surrounded by
brackets are just one type of character class. When we need to make a
distinction, we refer to them as "bracketed character classes."

An anchor useful in basic regexps is the I<word anchor>
C<\b>. This matches a boundary between a word character and a non-word
character C<\w\W> or C<\W\w>:

    $x = "Housecat catenates house and cat";
    $x =~ /cat/;    # matches cat in 'housecat'
    $x =~ /\bcat/;  # matches cat in 'catenates'
    $x =~ /cat\b/;  # matches cat in 'housecat'
    $x =~ /\bcat\b/;  # matches 'cat' at end of string

Note in the last example, the end of the string is considered a word
boundary.

For natural language processing (so that, for example, apostrophes are
included in words), use C<\b{wb}> instead:

    "don't" =~ / .+? \b{wb} /x;  # matches the whole string

You might wonder why C<'.'> matches everything but C<"\n"> - why not
every character? The reason is that often one is matching against
lines and would like to ignore the newline characters. For instance,
while the string C<"\n"> represents one line, we would like to think
of it as empty. Then

    ""   =~ /^$/;    # matches
    "\n" =~ /^$/;    # matches, $ anchors before "\n"

    ""   =~ /./;      # doesn't match; it needs a char
    ""   =~ /^.$/;    # doesn't match; it needs a char
    "\n" =~ /^.$/;    # doesn't match; it needs a char other than "\n"
    "a"  =~ /^.$/;    # matches
    "a\n"  =~ /^.$/;  # matches, $ anchors before "\n"

This behavior is convenient, because we usually want to ignore
newlines when we count and match characters in a line. Sometimes,
however, we want to keep track of newlines.
We might even want C<'^'> and C<'$'> to anchor at the beginning and
end of lines within the string, rather than just the beginning and end
of the string. Perl allows us to choose between ignoring and paying
attention to newlines by using the C</s> and C</m> modifiers. C</s>
and C</m> stand for single line and multi-line and they determine
whether a string is to be treated as one continuous string, or as a
set of lines. The two modifiers affect two aspects of how the regexp
is interpreted: 1) how the C<'.'> character class is defined, and 2)
where the anchors C<'^'> and C<'$'> are able to match. Here are the
four possible combinations:

=over 4

=item *

no modifiers: Default behavior. C<'.'> matches any character
except C<"\n">. C<'^'> matches only at the beginning of the string and
C<'$'> matches only at the end or before a newline at the end.

=item *

s modifier (C</s>): Treat string as a single long line. C<'.'> matches
any character, even C<"\n">. C<'^'> matches only at the beginning of
the string and C<'$'> matches only at the end or before a newline at
the end.

=item *

m modifier (C</m>): Treat string as a set of multiple lines. C<'.'>
matches any character except C<"\n">. C<'^'> and C<'$'> are able to
match at the start or end of I<any> line within the string.

=item *

both s and m modifiers (C</sm>): Treat string as a single long line,
but detect multiple lines. C<'.'> matches any character, even
C<"\n">. C<'^'> and C<'$'>, however, are able to match at the start or
end of I<any> line within the string.

=back

Here are examples of C</s> and C</m> in action:

    $x = "There once was a girl\nWho programmed in Perl\n";

    $x =~ /^Who/;   # doesn't match, "Who" not at start of string
    $x =~ /^Who/s;  # doesn't match, "Who" not at start of string
    $x =~ /^Who/m;  # matches, "Who" at start of second line
    $x =~ /^Who/sm; # matches, "Who" at start of second line

    $x =~ /girl.Who/;   # doesn't match, "." doesn't match "\n"
    $x =~ /girl.Who/s;  # matches, "."
                        # matches "\n"
    $x =~ /girl.Who/m;  # doesn't match, "." doesn't match "\n"
    $x =~ /girl.Who/sm; # matches, "." matches "\n"

Most of the time, the default behavior is what is wanted, but C</s> and
C</m> are occasionally very useful. If C</m> is being used, the start
of the string can still be matched with C<\A> and the end of the string
can still be matched with the anchors C<\Z> (matches both the end and
the newline before, like C<'$'>), and C<\z> (matches only the end):

    $x =~ /^Who/m;   # matches, "Who" at start of second line
    $x =~ /\AWho/m;  # doesn't match, "Who" is not at start of string

    $x =~ /girl$/m;  # matches, "girl" at end of first line
    $x =~ /girl\Z/m; # doesn't match, "girl" is not at end of string

    $x =~ /Perl\Z/m; # matches, "Perl" is at newline before end
    $x =~ /Perl\z/m; # doesn't match, "Perl" is not at end of string

We now know how to create choices among classes of characters in a
regexp. What about choices among words or character strings? Such
choices are described in the next section.

=head2 Matching this or that

Sometimes we would like our regexp to be able to match different
possible words or character strings. This is accomplished by using
the I<alternation> metacharacter C<'|'>. To match C<dog> or C<cat>, we
form the regexp C<dog|cat>. As before, Perl will try to match the
regexp at the earliest possible point in the string. At each
character position, Perl will first try to match the first
alternative, C<dog>. If C<dog> doesn't match, Perl will then try the
next alternative, C<cat>. If C<cat> doesn't match either, then the
match fails and Perl moves to the next position in the string. Some
examples:

    "cats and dogs" =~ /cat|dog|bird/;  # matches "cat"
    "cats and dogs" =~ /dog|cat|bird/;  # matches "cat"

Even though C<dog> is the first alternative in the second regexp,
C<cat> is able to match earlier in the string.

    "cats"          =~ /c|ca|cat|cats/; # matches "c"
    "cats"          =~ /cats|cat|ca|c/; # matches "cats"

Here, all the alternatives match at the first string position, so the
first alternative is the one that matches. If some of the
alternatives are truncations of the others, put the longest ones first
to give them a chance to match.

    "cab" =~ /a|b|c/ # matches "c"
                     # /a|b|c/ == /[abc]/

The last example points out that character classes are like
alternations of characters. At a given character position, the first
alternative that allows the regexp match to succeed will be the one
that matches.

=head2 Grouping things and hierarchical matching

Alternation allows a regexp to choose among alternatives, but by
itself it is unsatisfying. The reason is that each alternative is a
whole regexp, but sometimes we want alternatives for just part of a
regexp. For instance, suppose we want to search for housecats or
housekeepers. The regexp C<housecat|housekeeper> fits the bill, but is
inefficient because we had to type C<house> twice. It would be nice to
have parts of the regexp be constant, like C<house>, and some
parts have alternatives, like C<cat|keeper>.

The I<grouping> metacharacters C<()> solve this problem. Grouping
allows parts of a regexp to be treated as a single unit. Parts of a
regexp are grouped by enclosing them in parentheses. Thus we could
solve the C<housecat|housekeeper> problem by forming the regexp as
C<house(cat|keeper)>. The regexp C<house(cat|keeper)> means match
C<house> followed by either C<cat> or C<keeper>. Some more examples
are

    /(a|b)b/;    # matches 'ab' or 'bb'
    /(ac|b)b/;   # matches 'acb' or 'bb'
    /(^a|b)c/;   # matches 'ac' at start of string or 'bc' anywhere
    /(a|[bc])d/; # matches 'ad', 'bd', or 'cd'

    /house(cat|)/;  # matches either 'housecat' or 'house'
    /house(cat(s|)|)/;  # matches either 'housecats' or 'housecat' or
                        # 'house'.  Note groups can be nested.

    /(19|20|)\d\d/;  # match years 19xx, 20xx, or the Y2K problem, xx
    "20" =~ /(19|20|)\d\d/;  # matches the null alternative '()\d\d',
                             # because '20\d\d' can't match

Alternations behave the same way in groups as out of them: at a given
string position, the leftmost alternative that allows the regexp to
match is taken. So in the last example at the first string position,
C<"20"> matches the second alternative, but there is nothing left over
to match the next two digits C<\d\d>. So Perl moves on to the next
alternative, which is the null alternative and that works, since
C<"20"> is two digits.

The process of trying one alternative, seeing if it matches, and
moving on to the next alternative, while going back in the string
from where the previous alternative was tried, if it doesn't, is
called I<backtracking>. The term "backtracking" comes from the idea
that matching a regexp is like a walk in the woods. Successfully
matching a regexp is like arriving at a destination. There are many
possible trailheads, one for each string position, and each one is
tried in order, left to right. From each trailhead there may be many
paths, some of which get you there, and some of which are dead ends.
When you walk along a trail and hit a dead end, you have to backtrack
along the trail to an earlier point to try another trail. If you hit
your destination, you stop immediately and forget about trying all the
other trails. You are persistent, and only if you have tried all the
trails from all the trailheads and not arrived at your destination, do
you declare failure. To be concrete, here is a step-by-step analysis
of what Perl does when it tries to match the regexp

    "abcde" =~ /(abd|abc)(df|d|de)/;

=over 4

=item Z<>0. Start with the first letter in the string C<'a'>.

E<nbsp>

=item Z<>1. Try the first alternative in the first group C<'abd'>.

E<nbsp>

=item Z<>2. Match C<'a'> followed by C<'b'>. So far so good.

E<nbsp>

=item Z<>3. C<'d'> in the regexp doesn't match C<'c'> in the string - a dead end. So backtrack two characters and pick the second alternative in the first group C<'abc'>.

E<nbsp>

=item Z<>4. Match C<'a'> followed by C<'b'> followed by C<'c'>. We are on a roll and have satisfied the first group. Set C<$1> to C<'abc'>.

E<nbsp>

=item Z<>5. Move on to the second group and pick the first alternative C<'df'>.

E<nbsp>

=item Z<>6. Match the C<'d'>.

E<nbsp>

=item Z<>7. C<'f'> in the regexp doesn't match C<'e'> in the string, so a dead end. Backtrack one character and pick the second alternative in the second group C<'d'>.

E<nbsp>

=item Z<>8. C<'d'> matches. The second grouping is satisfied, so set C<$2> to C<'d'>.

E<nbsp>

=item Z<>9. We are at the end of the regexp, so we are done! We have matched C<'abcd'> out of the string C<"abcde">.

=back

There are a couple of things to note about this analysis. First, the
third alternative in the second group C<'de'> also allows a match, but
we stopped before we got to it - at a given character position,
leftmost wins. Second, we were able to get a match at the first
character position of the string C<'a'>. If there were no matches at
the first position, Perl would move to the second character position
C<'b'> and attempt the match all over again. Only when all possible
paths at all possible character positions have been exhausted does
Perl give up and declare S<C<$string =~ /(abd|abc)(df|d|de)/;>> to be
false.

Even with all this work, regexp matching happens remarkably fast. To
speed things up, Perl compiles the regexp into a compact sequence of
opcodes that can often fit inside a processor cache. When the code is
executed, these opcodes can then run at full throttle and search very
quickly.

=head2 Extracting matches

The grouping metacharacters C<()> also serve another completely
different function: they allow the extraction of the parts of a string
that matched. This is very useful to find out what matched and for
text processing in general.
For each grouping, the part that matched inside goes into the special
variables C<$1>, C<$2>, I<etc>. They can be used just as ordinary
variables:

    # extract hours, minutes, seconds
    if ($time =~ /(\d\d):(\d\d):(\d\d)/) {    # match hh:mm:ss format
        $hours = $1;
        $minutes = $2;
        $seconds = $3;
    }

Now, we know that in scalar context,
S<C<$time =~ /(\d\d):(\d\d):(\d\d)/>> returns a true or false
value. In list context, however, it returns the list of matched values
C<($1,$2,$3)>. So we could write the code more compactly as

    # extract hours, minutes, seconds
    ($hours, $minutes, $seconds) = ($time =~ /(\d\d):(\d\d):(\d\d)/);

If the groupings in a regexp are nested, C<$1> gets the group with the
leftmost opening parenthesis, C<$2> the next opening parenthesis,
I<etc>. Here is a regexp with nested groups:

    /(ab(cd|ef)((gi)|j))/;
     1  2      34

If this regexp matches, C<$1> contains a string starting with
C<'ab'>, C<$2> is either set to C<'cd'> or C<'ef'>, C<$3> equals either
C<'gi'> or C<'j'>, and C<$4> is either set to C<'gi'>, just like C<$3>,
or it remains undefined.

For convenience, Perl sets C<$+> to the string held by the highest
numbered C<$1>, C<$2>,... that got assigned (and, somewhat related,
C<$^N> to the value of the C<$1>, C<$2>,... most-recently assigned;
I<i.e.> the C<$1>, C<$2>,... associated with the rightmost closing
parenthesis used in the match).

=head2 Backreferences

Closely associated with the matching variables C<$1>, C<$2>, ... are
the I<backreferences> C<\g1>, C<\g2>,... Backreferences are simply
matching variables that can be used I<inside> a regexp. This is a
really nice feature; what matches later in a regexp is made to depend
on what matched earlier in the regexp. Suppose we wanted to look
for doubled words in a text, like "the the". The following regexp
finds all 3-letter doubles with a space in between:

    /\b(\w\w\w)\s\g1\b/;

The grouping assigns a value to C<\g1>, so that the same 3-letter
sequence is used for both parts.
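To make the doubled-word regexp above concrete, here is a minimal sketch that applies it to a sample sentence. The sentence and variable names are invented for illustration, and Perl 5.10 or later is assumed for the C<\g1> syntax.

```perl
use strict;
use warnings;

# Sample text containing an accidental doubled 3-letter word.
my $text = "Paris in the the spring";

# (\w\w\w) captures three word characters; \g1 then requires exactly
# the same three characters again after one whitespace character.
my $doubled;
if ($text =~ /\b(\w\w\w)\s\g1\b/) {
    $doubled = $1;
}

print defined $doubled ? "doubled word: $doubled\n" : "no doubled word\n";
```

Outside the regexp the captured text is read from C<$1>; inside it, the same text is referred to as C<\g1>.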

A similar task is to find words consisting of two identical parts:

    % simple_grep '^(\w\w\w\w|\w\w\w|\w\w|\w)\g1$' /usr/dict/words
    beriberi
    booboo
    coco
    mama
    murmur
    papa

The regexp has a single grouping which considers 4-letter
combinations, then 3-letter combinations, I<etc>., and uses C<\g1> to
look for a repeat. Although C<$1> and C<\g1> represent the same thing,
care should be taken to use matched variables C<$1>, C<$2>,... only
I<outside> a regexp and backreferences C<\g1>, C<\g2>,... only
I<inside> a regexp; not doing so may lead to surprising and
unsatisfactory results.

=head2 Relative backreferences

Counting the opening parentheses to get the correct number for a
backreference is error-prone as soon as there is more than one
capturing group. A more convenient technique became available
with Perl 5.10: relative backreferences. To refer to the immediately
preceding capture group one now may write C<\g{-1}>, the next but
last is available via C<\g{-2}>, and so on.

Another good reason in addition to readability and maintainability
for using relative backreferences is illustrated by the following
example, where a simple pattern for matching peculiar strings is used:

    $a99a = '([a-z])(\d)\g2\g1';   # matches a11a, g22g, x33x, etc.

Now that we have this pattern stored as a handy string, we might feel
tempted to use it as a part of some other pattern:

    $line = "code=e99e";
    if ($line =~ /^(\w+)=$a99a$/){   # unexpected behavior!
        print "$1 is valid\n";
    } else {
        print "bad line: '$line'\n";
    }

But this doesn't match, at least not the way one might expect. Only
after inserting the interpolated C<$a99a> and looking at the resulting
full text of the regexp is it obvious that the backreferences have
backfired. The subexpression C<(\w+)> has snatched number 1 and
demoted the groups in C<$a99a> by one rank.
This can be avoided by using relative backreferences:

    $a99a = '([a-z])(\d)\g{-1}\g{-2}';  # safe for being interpolated

=head2 Named backreferences

Perl 5.10 also introduced named capture groups and named
backreferences. To attach a name to a capturing group, you write
either C<< (?<name>...) >> or C<< (?'name'...) >>. The backreference
may then be written as C<\g{name}>. It is permissible to attach the
same name to more than one group, but then only the leftmost one of the
eponymous set can be referenced. Outside of the pattern a named
capture group is accessible through the C<%+> hash.

Assuming that we have to match calendar dates which may be given in
one of the three formats yyyy-mm-dd, mm/dd/yyyy or dd.mm.yyyy, we can
write three suitable patterns where we use C<'d'>, C<'m'> and C<'y'>
respectively as the names of the groups capturing the pertaining
components of a date. The matching operation combines the three
patterns as alternatives:

    $fmt1 = '(?<y>\d\d\d\d)-(?<m>\d\d)-(?<d>\d\d)';
    $fmt2 = '(?<m>\d\d)/(?<d>\d\d)/(?<y>\d\d\d\d)';
    $fmt3 = '(?<d>\d\d)\.(?<m>\d\d)\.(?<y>\d\d\d\d)';
    for my $d (qw(2006-10-21 15.01.2007 10/31/2005)) {
        if ( $d =~ m{$fmt1|$fmt2|$fmt3} ){
            print "day=$+{d} month=$+{m} year=$+{y}\n";
        }
    }

If any of the alternatives matches, the hash C<%+> is bound to contain
the three key-value pairs.

=head2 Alternative capture group numbering

Yet another capturing group numbering technique (also as from Perl
5.10) deals with the problem of referring to groups within a set of
alternatives. Consider a pattern for matching a time of the day,
civil or military style:

    if ( $time =~ /(\d\d|\d):(\d\d)|(\d\d)(\d\d)/ ){
        # process hour and minute
    }

Processing the results requires an additional if statement to
determine whether C<$1> and C<$2> or C<$3> and C<$4> contain the
goodies.
It would be easier if we could use group numbers 1 and 2 in the
second alternative as well, and this is exactly what the parenthesized
construct C<(?|...)>, set around an alternative, achieves. Here is an
extended version of the previous pattern:

    if($time =~ /(?|(\d\d|\d):(\d\d)|(\d\d)(\d\d))\s+([A-Z][A-Z][A-Z])/){
        print "hour=$1 minute=$2 zone=$3\n";
    }

Within the alternative numbering group, group numbers start at the
same position for each alternative. After the group, numbering
continues with one higher than the maximum reached across all the
alternatives.

=head2 Position information

In addition to what was matched, Perl also provides the
positions of what was matched as contents of the C<@-> and C<@+>
arrays. C<$-[0]> is the position of the start of the entire match and
C<$+[0]> is the position of the end. Similarly, C<$-[n]> is the
position of the start of the C<$n> match and C<$+[n]> is the position
of the end. If C<$n> is undefined, so are C<$-[n]> and C<$+[n]>. Then
this code

    $x = "Mmm...donut, thought Homer";
    $x =~ /^(Mmm|Yech)\.\.\.(donut|peas)/; # matches
    foreach $exp (1..$#-) {
        print "Match $exp: '${$exp}' at position ($-[$exp],$+[$exp])\n";
    }

prints

    Match 1: 'Mmm' at position (0,3)
    Match 2: 'donut' at position (6,11)

Even if there are no groupings in a regexp, it is still possible to
find out what exactly matched in a string. If you use them, Perl
will set C<$`> to the part of the string before the match, will set
C<$&> to the part of the string that matched, and will set C<$'> to
the part of the string after the match. An example:

    $x = "the cat caught the mouse";
    $x =~ /cat/;  # $` = 'the ', $& = 'cat', $' = ' caught the mouse'
    $x =~ /the/;  # $` = '', $& = 'the', $' = ' cat caught the mouse'

In the second match, C<$`> equals C<''> because the regexp matched at
the first character position in the string and stopped; it never saw
the second "the".
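The position arrays can also be used directly on the first "cat" example above. The following sketch recovers the matched text from C<$-[0]> and C<$+[0]>; the variable names are illustrative only.

```perl
use strict;
use warnings;

my $x = "the cat caught the mouse";
$x =~ /cat/ or die "pattern did not match";

# $-[0] is the offset where the whole match starts,
# $+[0] is the offset just past its end.
my ($start, $end) = ($-[0], $+[0]);

# The matched text is the substring between those two offsets.
my $matched = substr($x, $start, $end - $start);

print "'$matched' found at offsets ($start,$end)\n";
```

Copying the offsets into ordinary variables right after the match keeps them safe from being overwritten by a later successful match.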

If your code is to run on Perl versions earlier than
5.20, it is worthwhile to note that using C<$`> and C<$'>
slows down regexp matching quite a bit, while C<$&> slows it down to a
lesser extent, because if they are used in one regexp in a program,
they are generated for I<all> regexps in the program. So if raw
performance is a goal of your application, they should be avoided.
If you need to extract the corresponding substrings, use C<@-> and
C<@+> instead:

    $` is the same as substr( $x, 0, $-[0] )
    $& is the same as substr( $x, $-[0], $+[0]-$-[0] )
    $' is the same as substr( $x, $+[0] )

As of Perl 5.10, the C<${^PREMATCH}>, C<${^MATCH}> and C<${^POSTMATCH}>
variables may be used. These are only set if the C</p> modifier is
present. Consequently they do not penalize the rest of the program. In
Perl 5.20, C<${^PREMATCH}>, C<${^MATCH}> and C<${^POSTMATCH}> are
available whether the C</p> has been used or not (the modifier is
ignored), and C<$`>, C<$'> and C<$&> do not cause any speed difference.

=head2 Non-capturing groupings

A group that is required to bundle a set of alternatives may or may not
be useful as a capturing group. If it isn't, it just creates a
superfluous addition to the set of available capture group values,
inside as well as outside the regexp. Non-capturing groupings, denoted
by C<(?:regexp)>, still allow the regexp to be treated as a single
unit, but don't establish a capturing group at the same time. Both
capturing and non-capturing groupings are allowed to co-exist in the
same regexp. Because there is no extraction, non-capturing groupings
are faster than capturing groupings.
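A short sketch of the difference, reusing the housecat/housekeeper alternation from earlier in this tutorial (the array names are invented for illustration): in list context a match returns one value per capturing group, so the non-capturing form returns fewer values.

```perl
use strict;
use warnings;

my $word = "housekeeper";

# Two capturing groups: the list-context match returns both captures.
my @captured = $word =~ /(house)(cat|keeper)/;

# The alternation is still grouped, but (?:...) does not capture,
# so only the first group contributes a value.
my @grouped_only = $word =~ /(house)(?:cat|keeper)/;

print scalar(@captured), " capture(s): @captured\n";
print scalar(@grouped_only), " capture(s): @grouped_only\n";
```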
Non-capturing groupings are also handy for choosing exactly
which parts of a regexp are to be extracted to matching variables:

    # match a number, $1-$4 are set, but we only want $1
    /([+-]?\ *(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?)/;

    # match a number faster, only $1 is set
    /([+-]?\ *(?:\d+(?:\.\d*)?|\.\d+)(?:[eE][+-]?\d+)?)/;

    # match a number, get $1 = whole number, $2 = exponent
    /([+-]?\ *(?:\d+(?:\.\d*)?|\.\d+)(?:[eE]([+-]?\d+))?)/;

Non-capturing groupings are also useful for removing nuisance
elements gathered from a split operation where parentheses are
required for some reason:

    $x = '12aba34ba5';
    @num = split /(a|b)+/, $x;    # @num = ('12','a','34','a','5')
    @num = split /(?:a|b)+/, $x;  # @num = ('12','34','5')

In Perl 5.22 and later, all groups within a regexp can be set to
non-capturing by using the new C</n> flag:

    "hello" =~ /(hi|hello)/n; # $1 is not set!

See L<perlre/"n"> for more information.

=head2 Matching repetitions

The examples in the previous section display an annoying weakness. We
were only matching 3-letter words, or chunks of words of 4 letters or
less. We'd like to be able to match words or, more generally, strings
of any length, without writing out tedious alternatives like
C<\w\w\w\w|\w\w\w|\w\w|\w>.

This is exactly the problem the I<quantifier> metacharacters C<'?'>,
C<'*'>, C<'+'>, and C<{}> were created for. They allow us to delimit the
number of repeats for a portion of a regexp we consider to be a
match. Quantifiers are put immediately after the character, character
class, or grouping that we want to specify. They have the following
meanings:

=over 4

=item *

C<a?> means: match C<'a'> 1 or 0 times

=item *

C<a*> means: match C<'a'> 0 or more times, I<i.e.>, any number of times

=item *

C<a+> means: match C<'a'> 1 or more times, I<i.e.>, at least once

=item *

C<a{n,m}> means: match at least C<n> times, but not more than C<m>
times.

=item *

C<a{n,}> means: match at least C<n> times

=item *

C<a{n}> means: match exactly C<n> times

=back

Here are some examples:

    /[a-z]+\s+\d*/;  # match a lowercase word, at least one space, and
                     # any number of digits
    /(\w+)\s+\g1/;    # match doubled words of arbitrary length
    /y(es)?/i;       # matches 'y', 'Y', or a case-insensitive 'yes'
    $year =~ /^\d{2,4}$/;  # make sure year is at least 2 but not more
                           # than 4 digits
    $year =~ /^\d{4}$|^\d{2}$/; # better match; throw out 3-digit dates
    $year =~ /^\d{2}(\d{2})?$/; # same thing written differently.
                                # However, this captures the last two
                                # digits in $1 and the other does not.

    % simple_grep '^(\w+)\g1$' /usr/dict/words   # isn't this easier?
    beriberi
    booboo
    coco
    mama
    murmur
    papa

For all of these quantifiers, Perl will try to match as much of the
string as possible, while still allowing the regexp to succeed. Thus
with C</a?.../>, Perl will first try to match the regexp with the C<'a'>
present; if that fails, Perl will try to match the regexp without the
C<'a'> present. For the quantifier C<'*'>, we get the following:

    $x = "the cat in the hat";
    $x =~ /^(.*)(cat)(.*)$/; # matches,
                             # $1 = 'the '
                             # $2 = 'cat'
                             # $3 = ' in the hat'

This is what we might expect: the match finds the only C<cat> in the
string and locks onto it. Consider, however, this regexp:

    $x =~ /^(.*)(at)(.*)$/; # matches,
                            # $1 = 'the cat in the h'
                            # $2 = 'at'
                            # $3 = ''   (0 characters match)

One might initially guess that Perl would find the C<at> in C<cat> and
stop there, but that wouldn't give the longest possible string to the
first quantifier C<.*>. Instead, the first quantifier C<.*> grabs as
much of the string as possible while still having the regexp match. In
this example, that means having the C<at> sequence with the final C<at>
in the string.
The other important principle illustrated here is that,
when there are two or more elements in a regexp, the I<leftmost>
quantifier, if there is one, gets to grab as much of the string as
possible, leaving the rest of the regexp to fight over scraps. Thus in
our example, the first quantifier C<.*> grabs most of the string, while
the second quantifier C<.*> gets the empty string. Quantifiers that
grab as much of the string as possible are called I<maximal match> or
I<greedy> quantifiers.

When a regexp can match a string in several different ways, we can use
the principles above to predict which way the regexp will match:

=over 4

=item *

Principle 0: Taken as a whole, any regexp will be matched at the
earliest possible position in the string.

=item *

Principle 1: In an alternation C<a|b|c...>, the leftmost alternative
that allows a match for the whole regexp will be the one used.

=item *

Principle 2: The maximal matching quantifiers C<'?'>, C<'*'>, C<'+'> and
C<{n,m}> will in general match as much of the string as possible while
still allowing the whole regexp to match.

=item *

Principle 3: If there are two or more elements in a regexp, the
leftmost greedy quantifier, if any, will match as much of the string
as possible while still allowing the whole regexp to match. The next
leftmost greedy quantifier, if any, will try to match as much of the
string remaining available to it as possible, while still allowing the
whole regexp to match. And so on, until all the regexp elements are
satisfied.

=back

As we have seen above, Principle 0 overrides the others. The regexp
will be matched as early as possible, with the other principles
determining how the regexp matches at that earliest character
position.

Here is an example of these principles in action:

    $x = "The programming republic of Perl";
    $x =~ /^(.+)(e|r)(.*)$/;  # matches,
                              # $1 = 'The programming republic of Pe'
                              # $2 = 'r'
                              # $3 = 'l'

This regexp matches at the earliest string position, C<'T'>.
One might think that C<'e'>, being leftmost in the alternation, would
be matched, but C<'r'> produces the longest string in the first
quantifier.

    $x =~ /(m{1,2})(.*)$/;  # matches,
                            # $1 = 'mm'
                            # $2 = 'ing republic of Perl'

Here, the earliest possible match is at the first C<'m'> in
C<programming>. C<m{1,2}> is the first quantifier, so it gets to match
a maximal C<mm>.

    $x =~ /.*(m{1,2})(.*)$/;  # matches,
                              # $1 = 'm'
                              # $2 = 'ing republic of Perl'

Here, the regexp matches at the start of the string. The first
quantifier C<.*> grabs as much as possible, leaving just a single
C<'m'> for the second quantifier C<m{1,2}>.

    $x =~ /(.?)(m{1,2})(.*)$/;  # matches,
                                # $1 = 'a'
                                # $2 = 'mm'
                                # $3 = 'ing republic of Perl'

Here, C<.?> eats its maximal one character at the earliest possible
position in the string, C<'a'> in C<programming>, leaving C<m{1,2}>
the opportunity to match both C<'m'>'s. Finally,

    "aXXXb" =~ /(X*)/; # matches with $1 = ''

because it can match zero copies of C<'X'> at the beginning of the
string. If you definitely want to match at least one C<'X'>, use
C<X+>, not C<X*>.

Sometimes greed is not good. At times, we would like quantifiers to
match a I<minimal> piece of string, rather than a maximal piece. For
this purpose, Larry Wall created the I<minimal match> or
I<non-greedy> quantifiers C<??>, C<*?>, C<+?>, and C<{}?>. These are
the usual quantifiers with a C<'?'> appended to them. They have the
following meanings:

=over 4

=item *

C<a??> means: match C<'a'> 0 or 1 times. Try 0 first, then 1.

=item *

C<a*?> means: match C<'a'> 0 or more times, I<i.e.>, any number of
times, but as few times as possible

=item *

C<a+?> means: match C<'a'> 1 or more times, I<i.e.>, at least once,
but as few times as possible

=item *

C<a{n,m}?> means: match at least C<n> times, not more than C<m>
times, as few times as possible

=item *

C<a{n,}?> means: match at least C<n> times, but as few times as
possible

=item *

C<a{n}?> means: match exactly C<n> times.
Because we match exactly C<n> times, C<a{n}?> is equivalent to C<a{n}>
and is just there for notational consistency.

=back

Let's look at the example above, but with minimal quantifiers:

    $x = "The programming republic of Perl";
    $x =~ /^(.+?)(e|r)(.*)$/; # matches,
                              # $1 = 'Th'
                              # $2 = 'e'
                              # $3 = ' programming republic of Perl'

The minimal string that will allow both the start of the string C<'^'>
and the alternation to match is C<Th>, with the alternation C<e|r>
matching C<'e'>. The second quantifier C<.*> is free to gobble up the
rest of the string.

    $x =~ /(m{1,2}?)(.*?)$/;  # matches,
                              # $1 = 'm'
                              # $2 = 'ming republic of Perl'

The first string position that this regexp can match is at the first
C<'m'> in C<programming>. At this position, the minimal C<m{1,2}?>
matches just one C<'m'>. Although the second quantifier C<.*?> would
prefer to match no characters, it is constrained by the end-of-string
anchor C<'$'> to match the rest of the string.

    $x =~ /(.*?)(m{1,2}?)(.*)$/;  # matches,
                                  # $1 = 'The progra'
                                  # $2 = 'm'
                                  # $3 = 'ming republic of Perl'

In this regexp, you might expect the first minimal quantifier C<.*?>
to match the empty string, because it is not constrained by a C<'^'>
anchor to match the beginning of the word. Principle 0 applies here,
however. Because it is possible for the whole regexp to match at the
start of the string, it I<will> match at the start of the string. Thus
the first quantifier has to match everything up to the first C<'m'>.
The second minimal quantifier matches just one C<'m'> and the third
quantifier matches the rest of the string.

    $x =~ /(.??)(m{1,2})(.*)$/;  # matches,
                                 # $1 = 'a'
                                 # $2 = 'mm'
                                 # $3 = 'ing republic of Perl'

Just as in the previous regexp, the first quantifier C<.??> can match
earliest at position C<'a'>, so it does. The second quantifier is
greedy, so it matches C<mm>, and the third matches the rest of the
string.
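The contrast between greedy and minimal quantifiers is often easiest to see on delimited text. The following sketch uses an invented attribute string and variable names to compare C<.*> with C<.*?> between double quotes.

```perl
use strict;
use warnings;

my $attr = 'name="first" other="second"';

# Greedy: .* runs to the LAST double quote that still lets the regexp match.
my ($greedy)  = $attr =~ /"(.*)"/;

# Minimal: .*? stops at the FIRST closing double quote it can.
my ($minimal) = $attr =~ /"(.*?)"/;

print "greedy:  $greedy\n";
print "minimal: $minimal\n";
```

Both regexps start matching at the same position (Principle 0); only the amount of text taken by the quantifier differs.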
We can modify principle 3 above to take into account non-greedy quantifiers: =over 4 =item * Principle 3: If there are two or more elements in a regexp, the leftmost greedy (non-greedy) quantifier, if any, will match as much (little) of the string as possible while still allowing the whole regexp to match. The next leftmost greedy (non-greedy) quantifier, if any, will try to match as much (little) of the string remaining available to it as possible, while still allowing the whole regexp to match. And so on, until all the regexp elements are satisfied. =back Just like alternation, quantifiers are also susceptible to backtracking. Here is a step-by-step analysis of the example $x = "the cat in the hat"; $x =~ /^(.*)(at)(.*)$/; # matches, # $1 = 'the cat in the h' # $2 = 'at' # $3 = '' (0 matches) =over 4 =item Z<>0. Start with the first letter in the string C<'t'>. E<nbsp> =item Z<>1. The first quantifier C<'.*'> starts out by matching the whole string "C<the cat in the hat>". E<nbsp> =item Z<>2. C<'a'> in the regexp element C<'at'> doesn't match the end of the string. Backtrack one character. E<nbsp> =item Z<>3. C<'a'> in the regexp element C<'at'> still doesn't match the last letter of the string C<'t'>, so backtrack one more character. E<nbsp> =item Z<>4. Now we can match the C<'a'> and the C<'t'>. E<nbsp> =item Z<>5. Move on to the third element C<'.*'>. Since we are at the end of the string and C<'.*'> can match 0 times, assign it the empty string. E<nbsp> =item Z<>6. We are done! =back Most of the time, all this moving forward and backtracking happens quickly and searching is fast. There are some pathological regexps, however, whose execution time exponentially grows with the size of the string. A typical structure that blows up in your face is of the form /(a|b+)*/; The problem is the nested indeterminate quantifiers. 
There are many different ways of partitioning a string of length n between the C<'+'> and C<'*'>: one repetition with C<b+> of length n, two repetitions with the first C<b+> length k and the second with length n-k, m repetitions whose bits add up to length n, I<etc>. In fact there are an exponential number of ways to partition a string as a function of its length. A regexp may get lucky and match early in the process, but if there is no match, Perl will try I<every> possibility before giving up. So be careful with nested C<'*'>'s, C<{n,m}>'s, and C<'+'>'s. The book I<Mastering Regular Expressions> by Jeffrey Friedl gives a wonderful discussion of this and other efficiency issues. =head2 Possessive quantifiers Backtracking during the relentless search for a match may be a waste of time, particularly when the match is bound to fail. Consider the simple pattern /^\w+\s+\w+$/; # a word, spaces, a word Whenever this is applied to a string which doesn't quite meet the pattern's expectations such as S<C<"abc ">> or S<C<"abc def ">>, the regexp engine will backtrack, approximately once for each character in the string. But we know that there is no way around taking I<all> of the initial word characters to match the first repetition, that I<all> spaces must be eaten by the middle part, and the same goes for the second word. With the introduction of the I<possessive quantifiers> in Perl 5.10, we have a way of instructing the regexp engine not to backtrack, with the usual quantifiers with a C<'+'> appended to them. This makes them greedy as well as stingy; once they succeed they won't give anything back to permit another solution. They have the following meanings: =over 4 =item * C<a{n,m}+> means: match at least C<n> times, not more than C<m> times, as many times as possible, and don't give anything up. C<a?+> is short for C<a{0,1}+> =item * C<a{n,}+> means: match at least C<n> times, but as many times as possible, and don't give anything up. 
C<a*+> is short for C<a{0,}+> and C<a++> is short for C<a{1,}+>. =item * C<a{n}+> means: match exactly C<n> times. It is just there for notational consistency. =back These possessive quantifiers represent a special case of a more general concept, the I<independent subexpression>, see below. As an example where a possessive quantifier is suitable we consider matching a quoted string, as it appears in several programming languages. The backslash is used as an escape character that indicates that the next character is to be taken literally, as another character for the string. Therefore, after the opening quote, we expect a (possibly empty) sequence of alternatives: either some character except an unescaped quote or backslash or an escaped character. /"(?:[^"\\]++|\\.)*+"/; =head2 Building a regexp At this point, we have all the basic regexp concepts covered, so let's give a more involved example of a regular expression. We will build a regexp that matches numbers. The first task in building a regexp is to decide what we want to match and what we want to exclude. In our case, we want to match both integers and floating point numbers and we want to reject any string that isn't a number. The next task is to break the problem down into smaller problems that are easily converted into a regexp. The simplest case is integers. These consist of a sequence of digits, with an optional sign in front. The digits we can represent with C<\d+> and the sign can be matched with C<[+-]>. Thus the integer regexp is /[+-]?\d+/; # matches integers A floating point number potentially has a sign, an integral part, a decimal point, a fractional part, and an exponent. One or more of these parts is optional, so we need to check out the different possibilities. Floating point numbers which are in proper form include 123., 0.345, .34, -1e6, and 25.4E-72. As with integers, the sign out front is completely optional and can be matched by C<[+-]?>. 
We can see that if there is no exponent, floating point numbers must have a decimal point, otherwise they are integers. We might be tempted to model these with C<\d*\.\d*>, but this would also match just a single decimal point, which is not a number. So the three cases of floating point number without exponent are /[+-]?\d+\./; # 1., 321., etc. /[+-]?\.\d+/; # .1, .234, etc. /[+-]?\d+\.\d+/; # 1.0, 30.56, etc. These can be combined into a single regexp with a three-way alternation: /[+-]?(\d+\.\d+|\d+\.|\.\d+)/; # floating point, no exponent In this alternation, it is important to put C<'\d+\.\d+'> before C<'\d+\.'>. If C<'\d+\.'> were first, the regexp would happily match that and ignore the fractional part of the number. Now consider floating point numbers with exponents. The key observation here is that I<both> integers and numbers with decimal points are allowed in front of an exponent. Then exponents, like the overall sign, are independent of whether we are matching numbers with or without decimal points, and can be "decoupled" from the mantissa. The overall form of the regexp now becomes clear: /^(optional sign)(integer | f.p. mantissa)(optional exponent)$/; The exponent is an C<'e'> or C<'E'>, followed by an integer. So the exponent regexp is /[eE][+-]?\d+/; # exponent Putting all the parts together, we get a regexp that matches numbers: /^[+-]?(\d+\.\d+|\d+\.|\.\d+|\d+)([eE][+-]?\d+)?$/; # Ta da! Long regexps like this may impress your friends, but can be hard to decipher. In complex situations like this, the C</x> modifier for a match is invaluable. It allows one to put nearly arbitrary whitespace and comments into a regexp without affecting their meaning. Using it, we can rewrite our "extended" regexp in the more pleasing form /^ [+-]? # first, match an optional sign ( # then match integers or f.p. mantissas: \d+\.\d+ # mantissa of the form a.b |\d+\. # mantissa of the form a. |\.\d+ # mantissa of the form .b |\d+ # integer of the form a ) ( [eE] [+-]? 
\d+ )? # finally, optionally match an exponent $/x; If whitespace is mostly irrelevant, how does one include space characters in an extended regexp? The answer is to backslash it S<C<'\ '>> or put it in a character class S<C<[ ]>>. The same thing goes for pound signs: use C<\#> or C<[#]>. For instance, Perl allows a space between the sign and the mantissa or integer, and we could add this to our regexp as follows: /^ [+-]?\ * # first, match an optional sign *and space* ( # then match integers or f.p. mantissas: \d+\.\d+ # mantissa of the form a.b |\d+\. # mantissa of the form a. |\.\d+ # mantissa of the form .b |\d+ # integer of the form a ) ( [eE] [+-]? \d+ )? # finally, optionally match an exponent $/x; In this form, it is easier to see a way to simplify the alternation. Alternatives 1, 2, and 4 all start with C<\d+>, so it could be factored out: /^ [+-]?\ * # first, match an optional sign ( # then match integers or f.p. mantissas: \d+ # start out with a ... ( \.\d* # mantissa of the form a.b or a. )? # ? takes care of integers of the form a |\.\d+ # mantissa of the form .b ) ( [eE] [+-]? \d+ )? # finally, optionally match an exponent $/x; Starting in Perl v5.26, specifying C</xx> changes the square-bracketed portions of a pattern to ignore tabs and space characters unless they are escaped by preceding them with a backslash. So, we could write /^ [ + - ]?\ * # first, match an optional sign ( # then match integers or f.p. mantissas: \d+ # start out with a ... ( \.\d* # mantissa of the form a.b or a. )? # ? takes care of integers of the form a |\.\d+ # mantissa of the form .b ) ( [ e E ] [ + - ]? \d+ )? # finally, optionally match an exponent $/xx; This doesn't really improve the legibility of this example, but it's available in case you want it. Squashing the pattern down to the compact form, we have /^[+-]?\ *(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?$/; This is our final regexp. 
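A regexp like this is worth checking against both strings it should accept and strings it should reject. Here is a minimal sketch; C<is_number> is a hypothetical helper written for this illustration, and the lists of test strings are assumptions beyond the "proper form" examples given earlier.

```perl
use strict;
use warnings;

# is_number() is a hypothetical helper wrapping the final regexp from the text.
sub is_number {
    my ($s) = @_;
    return $s =~ /^[+-]?\ *(\d+(\.\d*)?|\.\d+)([eE][+-]?\d+)?$/ ? 1 : 0;
}

# Strings that should be accepted, including a sign followed by a space.
my @accept = ('123.', '0.345', '.34', '-1e6', '25.4E-72', '- 7');

# Strings that should be rejected: a lone point, a bare exponent, and so on.
my @reject = ('.', 'e6', '1.2.3', 'abc', '');

print "'$_' is a number\n"     for grep {  is_number($_) } @accept, @reject;
print "'$_' is not a number\n" for grep { !is_number($_) } @accept, @reject;
```

Checking the rejected strings matters as much as checking the accepted ones, since the earlier attempt C<\d*\.\d*> failed precisely by accepting a lone decimal point.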
To recap, we built a regexp by =over 4 =item * specifying the task in detail, =item * breaking down the problem into smaller parts, =item * translating the small parts into regexps, =item * combining the regexps, =item * and optimizing the final combined regexp. =back These are also the typical steps involved in writing a computer program. This makes perfect sense, because regular expressions are essentially programs written in a little computer language that specifies patterns. =head2 Using regular expressions in Perl The last topic of Part 1 briefly covers how regexps are used in Perl programs. Where do they fit into Perl syntax? We have already introduced the matching operator in its default C</regexp/> and arbitrary delimiter C<m!regexp!> forms. We have used the binding operator C<=~> and its negation C<!~> to test for string matches. Associated with the matching operator, we have discussed the single line C</s>, multi-line C</m>, case-insensitive C</i> and extended C</x> modifiers. There are a few more things you might want to know about matching operators. =head3 Prohibiting substitution If you change C<$pattern> after the first substitution happens, Perl will ignore it. If you don't want any substitutions at all, use the special delimiter C<m''>: @pattern = ('Seuss'); while (<>) { print if m'@pattern'; # matches literal '@pattern', not 'Seuss' } Similar to strings, C<m''> acts like apostrophes on a regexp; all other C<'m'> delimiters act like quotes. If the regexp evaluates to the empty string, the regexp in the I<last successful match> is used instead. So we have "dog" =~ /d/; # 'd' matches "dogbert" =~ //; # this matches the 'd' regexp used before =head3 Global matching The final two modifiers we will discuss here, C</g> and C</c>, concern multiple matches. The modifier C</g> stands for global matching and allows the matching operator to match within a string as many times as possible. 
In scalar context, successive invocations against a string will have C</g> jump from match to match, keeping track of position in the string as it goes along. You can get or set the position with the C<pos()> function. The use of C</g> is shown in the following example. Suppose we have a string that consists of words separated by spaces. If we know how many words there are in advance, we could extract the words using groupings: $x = "cat dog house"; # 3 words $x =~ /^\s*(\w+)\s+(\w+)\s+(\w+)\s*$/; # matches, # $1 = 'cat' # $2 = 'dog' # $3 = 'house' But what if we had an indeterminate number of words? This is the sort of task C</g> was made for. To extract all words, form the simple regexp C<(\w+)> and loop over all matches with C</(\w+)/g>: while ($x =~ /(\w+)/g) { print "Word is $1, ends at position ", pos $x, "\n"; } prints Word is cat, ends at position 3 Word is dog, ends at position 7 Word is house, ends at position 13 A failed match or changing the target string resets the position. If you don't want the position reset after failure to match, add the C</c>, as in C</regexp/gc>. The current position in the string is associated with the string, not the regexp. This means that different strings have different positions and their respective positions can be set or read independently. In list context, C</g> returns a list of matched groupings, or if there are no groupings, a list of matches to the whole regexp. So if we wanted just the words, we could use @words = ($x =~ /(\w+)/g); # matches, # $words[0] = 'cat' # $words[1] = 'dog' # $words[2] = 'house' Closely associated with the C</g> modifier is the C<\G> anchor. The C<\G> anchor matches at the point where the previous C</g> match left off. C<\G> allows us to easily do context-sensitive matching: $metric = 1; # use metric units ... $x = <FILE>; # read in measurement $x =~ /^([+-]?\d+)\s*/g; # get magnitude $weight = $1; if ($metric) { # error checking print "Units error!" 
unless $x =~ /\Gkg\./g; } else { print "Units error!" unless $x =~ /\Glbs\./g; } $x =~ /\G\s+(widget|sprocket)/g; # continue processing The combination of C</g> and C<\G> allows us to process the string a bit at a time and use arbitrary Perl logic to decide what to do next. Currently, the C<\G> anchor is only fully supported when used to anchor to the start of the pattern. C<\G> is also invaluable in processing fixed-length records with regexps. Suppose we have a snippet of coding region DNA, encoded as base pair letters C<ATCGTTGAAT...> and we want to find all the stop codons C<TGA>. In a coding region, codons are 3-letter sequences, so we can think of the DNA snippet as a sequence of 3-letter records. The naive regexp # expanded, this is "ATC GTT GAA TGC AAA TGA CAT GAC" $dna = "ATCGTTGAATGCAAATGACATGAC"; $dna =~ /TGA/; doesn't work; it may match a C<TGA>, but there is no guarantee that the match is aligned with codon boundaries, I<e.g.>, the substring S<C<GTT GAA>> gives a match. A better solution is while ($dna =~ /(\w\w\w)*?TGA/g) { # note the minimal *? print "Got a TGA stop codon at position ", pos $dna, "\n"; } which prints Got a TGA stop codon at position 18 Got a TGA stop codon at position 23 Position 18 is good, but position 23 is bogus. What happened? The answer is that our regexp works well until we get past the last real match. Then the regexp will fail to match a synchronized C<TGA> and start stepping ahead one character position at a time, not what we want. The solution is to use C<\G> to anchor the match to the codon alignment: while ($dna =~ /\G(\w\w\w)*?TGA/g) { print "Got a TGA stop codon at position ", pos $dna, "\n"; } This prints Got a TGA stop codon at position 18 which is the correct answer. This example illustrates that it is important not only to match what is desired, but to reject what is not desired. 
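The same combination of C<\G> and C</g>, with C</c> added so that a failed match does not reset the position, can be used to walk through a string one piece at a time. The sketch below is one possible arrangement, not the only one; the token labels C<NUM>, C<ID> and C<OP> and the input string are made up for this illustration.

```perl
use strict;
use warnings;

my $input = "width = 42 + height";
my @tokens;

pos($input) = 0;                      # start scanning at the beginning
while (pos($input) < length $input) {
    # Each pattern is anchored with \G at the current position; /c keeps
    # that position when a pattern fails, so the next branch can try there.
    if    ($input =~ /\G\s+/gc)            { next }
    elsif ($input =~ /\G(\d+)/gc)          { push @tokens, "NUM($1)" }
    elsif ($input =~ /\G([A-Za-z_]\w*)/gc) { push @tokens, "ID($1)" }
    elsif ($input =~ /\G([=+])/gc)         { push @tokens, "OP($1)" }
    else  { die "Unexpected character at position ", pos($input), "\n" }
}

print join(' ', @tokens), "\n";
```

Because every branch is anchored with C<\G>, the scan cannot silently skip over unexpected input, which is the same property that made the codon example reject the misaligned C<TGA>.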
(There are other regexp modifiers that are available, such as C</o>, but their specialized uses are beyond the scope of this introduction.)

=head3 Search and replace

Regular expressions also play a big role in I<search and replace> operations in Perl. Search and replace is accomplished with the C<s///> operator. The general form is C<s/regexp/replacement/modifiers>, with everything we know about regexps and modifiers applying in this case as well. The I<replacement> is a Perl double-quoted string that replaces in the string whatever is matched with the C<regexp>. The operator C<=~> is also used here to associate a string with C<s///>. If matching against C<$_>, the S<C<$_ =~>> can be dropped. If there is a match, C<s///> returns the number of substitutions made; otherwise it returns false. Here are a few examples:

    $x = "Time to feed the cat!";
    $x =~ s/cat/hacker/;   # $x contains "Time to feed the hacker!"
    if ($x =~ s/^(Time.*hacker)!$/$1 now!/) {
        $more_insistent = 1;
    }
    $y = "'quoted words'";
    $y =~ s/^'(.*)'$/$1/;  # strip single quotes,
                           # $y contains "quoted words"

In the last example, the whole string was matched, but only the part inside the single quotes was grouped. With the C<s///> operator, the matched variables C<$1>, C<$2>, I<etc>. are immediately available for use in the replacement expression, so we use C<$1> to replace the quoted string with just what was quoted.
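As a further small illustration of the return value and of using several capture variables in the replacement, here is a sketch that reorders the parts of a date. The date string and the variable names are assumptions made up for this example.

```perl
use strict;
use warnings;

my $date = "2024-01-15";

# On success, s/// returns the number of substitutions made (1 here);
# $1, $2 and $3 are available inside the replacement string.
my $changed = ($date =~ s/^(\d{4})-(\d\d)-(\d\d)$/$3.$2.$1/);
print "changed=$changed date=$date\n";

# When nothing matches, s/// returns false and the string is left alone.
my $unchanged = ($date =~ s/cat/hacker/);
print "no substitution made\n" unless $unchanged;
```

Testing the return value, as in the C<$more_insistent> example above, is the usual way to find out whether a substitution actually happened.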
With the global modifier, C<s///g> will search and replace all occurrences of the regexp in the string:

    $x = "I batted 4 for 4";
    $x =~ s/4/four/;   # doesn't do it all:
                       # $x contains "I batted four for 4"
    $x = "I batted 4 for 4";
    $x =~ s/4/four/g;  # does it all:
                       # $x contains "I batted four for four"

If you prefer "regex" over "regexp" in this tutorial, you could use the following program to replace it:

    % cat > simple_replace
    #!/usr/bin/perl
    $regexp = shift;
    $replacement = shift;
    while (<>) {
        s/$regexp/$replacement/g;
        print;
    }
    ^D

    % simple_replace regexp regex perlretut.pod

In C<simple_replace> we used the C<s///g> modifier to replace all occurrences of the regexp on each line. (Even though the regular expression appears in a loop, Perl is smart enough to compile it only once.) As with C<simple_grep>, both the C<print> and the C<s/$regexp/$replacement/g> use C<$_> implicitly.

If you don't want C<s///> to change your original variable you can use the non-destructive substitute modifier, C<s///r>. This changes the behavior so that C<s///r> returns the final substituted string (instead of the number of substitutions):

    $x = "I like dogs.";
    $y = $x =~ s/dogs/cats/r;
    print "$x $y\n";

That example will print "I like dogs. I like cats." Notice the original C<$x> variable has not been affected. The overall result of the substitution is instead stored in C<$y>. If the substitution doesn't affect anything then the original string is returned:

    $x = "I like dogs.";
    $y = $x =~ s/elephants/cougars/r;
    print "$x $y\n"; # prints "I like dogs. I like dogs."

One other interesting thing that the C<s///r> flag allows is chaining substitutions:

    $x = "Cats are great.";
    print $x =~ s/Cats/Dogs/r =~ s/Dogs/Frogs/r =~
        s/Frogs/Hedgehogs/r, "\n";
    # prints "Hedgehogs are great."

A modifier available specifically to search and replace is the C<s///e> evaluation modifier. C<s///e> treats the replacement text as Perl code, rather than a double-quoted string.
The value that the code returns is substituted for the matched substring. C<s///e> is useful if you need to do a bit of computation in the process of replacing text. This example counts character frequencies in a line:

    $x = "Bill the cat";
    $x =~ s/(.)/$chars{$1}++;$1/eg; # final $1 replaces char with itself
    print "frequency of '$_' is $chars{$_}\n"
        foreach (sort {$chars{$b} <=> $chars{$a}} keys %chars);

This prints (the order among characters with equal counts may vary)

    frequency of ' ' is 2
    frequency of 't' is 2
    frequency of 'l' is 2
    frequency of 'B' is 1
    frequency of 'c' is 1
    frequency of 'e' is 1
    frequency of 'h' is 1
    frequency of 'i' is 1
    frequency of 'a' is 1

As with the match C<m//> operator, C<s///> can use other delimiters, such as C<s!!!> and C<s{}{}>, and even C<s{}//>. If single quotes are used C<s'''>, then the regexp and replacement are treated as single-quoted strings and there are no variable substitutions. C<s///> in list context returns the same thing as in scalar context, I<i.e.>, the number of matches.

=head3 The split function

The C<split()> function is another place where a regexp is used. C<split /regexp/, string, limit> separates the C<string> operand into a list of substrings and returns that list. The regexp must be designed to match whatever constitutes the separators for the desired substrings. The C<limit>, if present, constrains splitting into no more than C<limit> number of strings. For example, to split a string into words, use

    $x = "Calvin and Hobbes";
    @words = split /\s+/, $x;  # $words[0] = 'Calvin'
                               # $words[1] = 'and'
                               # $words[2] = 'Hobbes'

If the empty regexp C<//> is used, the regexp always matches and the string is split into individual characters. If the regexp has groupings, then the resulting list contains the matched substrings from the groupings as well.
For instance, $x = "/usr/bin/perl"; @dirs = split m!/!, $x; # $dirs[0] = '' # $dirs[1] = 'usr' # $dirs[2] = 'bin' # $dirs[3] = 'perl' @parts = split m!(/)!, $x; # $parts[0] = '' # $parts[1] = '/' # $parts[2] = 'usr' # $parts[3] = '/' # $parts[4] = 'bin' # $parts[5] = '/' # $parts[6] = 'perl' Since the first character of C<$x> matched the regexp, C<split> prepended an empty initial element to the list. If you have read this far, congratulations! You now have all the basic tools needed to use regular expressions to solve a wide range of text processing problems. If this is your first time through the tutorial, why not stop here and play around with regexps a while.... S<Part 2> concerns the more esoteric aspects of regular expressions and those concepts certainly aren't needed right at the start. =head1 Part 2: Power tools OK, you know the basics of regexps and you want to know more. If matching regular expressions is analogous to a walk in the woods, then the tools discussed in Part 1 are analogous to topo maps and a compass, basic tools we use all the time. Most of the tools in part 2 are analogous to flare guns and satellite phones. They aren't used too often on a hike, but when we are stuck, they can be invaluable. What follows are the more advanced, less used, or sometimes esoteric capabilities of Perl regexps. In Part 2, we will assume you are comfortable with the basics and concentrate on the advanced features. =head2 More on characters, strings, and character classes There are a number of escape sequences and character classes that we haven't covered yet. There are several escape sequences that convert characters or strings between upper and lower case, and they are also available within patterns. 
C<\l> and C<\u> convert the next character to lower or upper case, respectively: $x = "perl"; $string =~ /\u$x/; # matches 'Perl' in $string $x = "M(rs?|s)\\."; # note the double backslash $string =~ /\l$x/; # matches 'mr.', 'mrs.', and 'ms.', A C<\L> or C<\U> indicates a lasting conversion of case, until terminated by C<\E> or thrown over by another C<\U> or C<\L>: $x = "This word is in lower case:\L SHOUT\E"; $x =~ /shout/; # matches $x = "I STILL KEYPUNCH CARDS FOR MY 360"; $x =~ /\Ukeypunch/; # matches punch card string If there is no C<\E>, case is converted until the end of the string. The regexps C<\L\u$word> or C<\u\L$word> convert the first character of C<$word> to uppercase and the rest of the characters to lowercase. Control characters can be escaped with C<\c>, so that a control-Z character would be matched with C<\cZ>. The escape sequence C<\Q>...C<\E> quotes, or protects most non-alphabetic characters. For instance, $x = "\QThat !^*&%~& cat!"; $x =~ /\Q!^*&%~&\E/; # check for rough language It does not protect C<'$'> or C<'@'>, so that variables can still be substituted. C<\Q>, C<\L>, C<\l>, C<\U>, C<\u> and C<\E> are actually part of double-quotish syntax, and not part of regexp syntax proper. They will work if they appear in a regular expression embedded directly in a program, but not when contained in a string that is interpolated in a pattern. Perl regexps can handle more than just the standard ASCII character set. Perl supports I<Unicode>, a standard for representing the alphabets from virtually all of the world's written languages, and a host of symbols. Perl's text strings are Unicode strings, so they can contain characters with a value (codepoint or character number) higher than 255. What does this mean for regexps? Well, regexp users don't need to know much about Perl's internal representation of strings. 
But they do need to know 1) how to represent Unicode characters in a regexp and 2) that a matching operation will treat the string to be searched as a sequence of characters, not bytes. The answer to 1) is that Unicode characters greater than C<chr(255)> are represented using the C<\x{hex}> notation, because C<\x>I<XY> (without curly braces, where I<XY> are two hex digits) doesn't go further than 255. (Starting in Perl 5.14, if you're an octal fan, you can also use C<\o{oct}>.)

    /\x{263a}/;   # match a Unicode smiley face :)

B<NOTE>: In Perl 5.6.0 it used to be that one needed to say C<use utf8> to use any Unicode features. This is no longer the case: for almost all Unicode processing, the explicit C<utf8> pragma is not needed. (The only case where it matters is if your Perl script is in Unicode and encoded in UTF-8; then an explicit C<use utf8> is needed.)

Figuring out the hexadecimal sequence of a Unicode character you want or deciphering someone else's hexadecimal Unicode regexp is about as much fun as programming in machine code. So another way to specify Unicode characters is to use the I<named character> escape sequence C<\N{I<name>}>. I<name> is a name for the Unicode character, as specified in the Unicode standard. For instance, if we wanted to represent or match the astrological sign for the planet Mercury, we could use

    $x = "abc\N{MERCURY}def";
    $x =~ /\N{MERCURY}/;   # matches

One can also use "short" names:

    print "\N{GREEK SMALL LETTER SIGMA} is called sigma.\n";
    print "\N{greek:Sigma} is an upper-case sigma.\n";

You can also restrict names to a certain alphabet by specifying the L<charnames> pragma:

    use charnames qw(greek);
    print "\N{sigma} is Greek sigma\n";

An index of character names is available on-line from the Unicode Consortium, L<https://www.unicode.org/charts/charindex.html>; explanatory material with links to other resources at L<https://www.unicode.org/standard/where>.
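The two notations name the same characters, which a short script can confirm. The following is a sketch only: it loads the C<charnames> pragma explicitly (recent perls load it on demand for C<\N{...}>), and it prints code points rather than the characters themselves to avoid any dependence on the terminal's encoding.

```perl
use strict;
use warnings;
use charnames ':full';   # explicit load; recent perls do this on demand

# The same character written by code point and by name.
my $smiley_hex  = "\x{263a}";
my $smiley_name = "\N{WHITE SMILING FACE}";
print "same character\n" if $smiley_hex eq $smiley_name;

# A named character inside a string and inside a regexp.
my $x = "abc\N{MERCURY}def";
if ($x =~ /(\N{MERCURY})/) {
    printf "matched U+%04X (%s)\n", ord($1), charnames::viacode(ord $1);
}
```

C<charnames::viacode> goes the other way, from a code point to its name, which helps when deciphering someone else's C<\x{...}> escapes.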
Starting in Perl v5.32, an alternative to C<\N{...}> for full names is available, and that is to say /\p{Name=greek small letter sigma}/ The casing of the character name is irrelevant when used in C<\p{}>, as are most spaces, underscores and hyphens. (A few outlier characters cause problems with ignoring all of them always. The details (which you can look up when you get more proficient, and if ever needed) are in L<https://www.unicode.org/reports/tr44/tr44-24.html#UAX44-LM2>). The answer to requirement 2) is that a regexp (mostly) uses Unicode characters. The "mostly" is for messy backward compatibility reasons, but starting in Perl 5.14, any regexp compiled in the scope of a C<use feature 'unicode_strings'> (which is automatically turned on within the scope of a C<use 5.012> or higher) will turn that "mostly" into "always". If you want to handle Unicode properly, you should ensure that C<'unicode_strings'> is turned on. Internally, this is encoded to bytes using either UTF-8 or a native 8 bit encoding, depending on the history of the string, but conceptually it is a sequence of characters, not bytes. See L<perlunitut> for a tutorial about that. Let us now discuss Unicode character classes, most usually called "character properties". These are represented by the C<\p{I<name>}> escape sequence. The negation of this is C<\P{I<name>}>. For example, to match lower and uppercase characters, $x = "BOB"; $x =~ /^\p{IsUpper}/; # matches, uppercase char class $x =~ /^\P{IsUpper}/; # doesn't match, char class sans uppercase $x =~ /^\p{IsLower}/; # doesn't match, lowercase char class $x =~ /^\P{IsLower}/; # matches, char class sans lowercase (The "C<Is>" is optional.) There are many, many Unicode character properties. For the full list see L<perluniprops>. Most of them have synonyms with shorter names, also listed there. Some synonyms are a single character. For these, you can drop the braces. 
For instance, C<\pM> is the same thing as C<\p{Mark}>, meaning things like accent marks. The Unicode C<\p{Script}> and C<\p{Script_Extensions}> properties are used to categorize every Unicode character into the language script it is written in. (C<Script_Extensions> is an improved version of C<Script>, which is retained for backward compatibility, and so you should generally use C<Script_Extensions>.) For example, English, French, and a bunch of other European languages are written in the Latin script. But there is also the Greek script, the Thai script, the Katakana script, I<etc>. You can test whether a character is in a particular script (based on C<Script_Extensions>) with, for example C<\p{Latin}>, C<\p{Greek}>, or C<\p{Katakana}>. To test if it isn't in the Balinese script, you would use C<\P{Balinese}>. What we have described so far is the single form of the C<\p{...}> character classes. There is also a compound form which you may run into. These look like C<\p{I<name>=I<value>}> or C<\p{I<name>:I<value>}> (the equals sign and colon can be used interchangeably). These are more general than the single form, and in fact most of the single forms are just Perl-defined shortcuts for common compound forms. For example, the script examples in the previous paragraph could be written equivalently as C<\p{Script_Extensions=Latin}>, C<\p{Script_Extensions:Greek}>, C<\p{script_extensions=katakana}>, and C<\P{script_extensions=balinese}> (case is irrelevant between the C<{}> braces). You may never have to use the compound forms, but sometimes it is necessary, and their use can make your code easier to understand. C<\X> is an abbreviation for a character class that comprises a Unicode I<extended grapheme cluster>. This represents a "logical character": what appears to be a single character, but may be represented internally by more than one. 
For example, using the Unicode full names, "S<A + COMBINING RING>" is a grapheme cluster with base character "A" and combining character "S<COMBINING RING>", which translates in Danish to "A" with the circle atop it, as in the word E<Aring>ngstrom.

For the full and latest information about Unicode see the latest Unicode standard, or the Unicode Consortium's website L<https://www.unicode.org>.

As if all those classes weren't enough, Perl also defines POSIX-style character classes. These have the form C<[:I<name>:]>, with I<name> the name of the POSIX class. The POSIX classes are C<alpha>, C<alnum>, C<ascii>, C<cntrl>, C<digit>, C<graph>, C<lower>, C<print>, C<punct>, C<space>, C<upper>, and C<xdigit>, and two extensions, C<word> (a Perl extension to match C<\w>), and C<blank> (a GNU extension). The C</a> modifier restricts these to matching just in the ASCII range; otherwise they can match the same as their corresponding Perl Unicode classes: C<[:upper:]> is the same as C<\p{IsUpper}>, I<etc>. (There are some exceptions and gotchas with this; see L<perlrecharclass> for a full discussion.) The C<[:digit:]>, C<[:word:]>, and C<[:space:]> correspond to the familiar C<\d>, C<\w>, and C<\s> character classes. To negate a POSIX class, put a C<'^'> in front of the name, so that, I<e.g.>, C<[:^digit:]> corresponds to C<\D> and, under Unicode, C<\P{IsDigit}>. The Unicode and POSIX character classes can be used just like C<\d>, with the exception that POSIX character classes can only be used inside of a character class:

    /\s+[abc[:digit:]xyz]\s*/;  # match a,b,c,x,y,z, or a digit
    /^=item\s[[:digit:]]/;      # match '=item',
                                # followed by a space and a digit
    /\s+[abc\p{IsDigit}xyz]\s+/;  # match a,b,c,x,y,z, or a digit
    /^=item\s\p{IsDigit}/;        # match '=item',
                                  # followed by a space and a digit

Whew! That is all the rest of the characters and character classes.

=head2 Compiling and saving regular expressions

In Part 1 we mentioned that Perl compiles a regexp into a compact sequence of opcodes. Thus, a compiled regexp is a data structure that can be stored once and used again and again. The regexp quote C<qr//> does exactly that: C<qr/string/> compiles the C<string> as a regexp and transforms the result into a form that can be assigned to a variable:

    $reg = qr/foo+bar?/;  # $reg contains a compiled regexp

Then C<$reg> can be used as a regexp:

    $x = "fooooba";
    $x =~ $reg;     # matches, just like /foo+bar?/
    $x =~ /$reg/;   # same thing, alternate form

C<$reg> can also be interpolated into a larger regexp:

    $x =~ /(abc)?$reg/;  # still matches

As with the matching operator, the regexp quote can use different delimiters, I<e.g.>, C<qr!!>, C<qr{}> or C<qr~~>. Apostrophes as delimiters (C<qr''>) inhibit any interpolation.

Pre-compiled regexps are useful for creating dynamic matches that don't need to be recompiled each time they are encountered. Using pre-compiled regexps, we write a C<grep_step> program which greps for a sequence of patterns, advancing to the next pattern as soon as one has been satisfied.

    % cat > grep_step
    #!/usr/bin/perl
    # grep_step - match <number> regexps, one after the other
    # usage: grep_step <number> regexp1 regexp2 ... file1 file2 ...

    $number = shift;
    $regexp[$_] = shift foreach (0..$number-1);
    @compiled = map qr/$_/, @regexp;
    while ($line = <>) {
        if ($line =~ /$compiled[0]/) {
            print $line;
            shift @compiled;
            last unless @compiled;
        }
    }
    ^D

    % grep_step 3 shift print last grep_step
    $number = shift;
            print $line;
            last unless @compiled;

Storing pre-compiled regexps in an array C<@compiled> allows us to simply loop through the regexps without any recompilation, thus gaining flexibility without sacrificing speed.

=head2 Composing regular expressions at runtime

Backtracking is more efficient than repeated tries with different regular expressions.
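To tie the POSIX classes, the Unicode properties and C<\X> together, here is a small sketch. The sample strings and variable names are assumptions for this illustration; the combining character is written by code point (U+030A) so that the script itself stays plain ASCII.

```perl
use strict;
use warnings;

my $item = "=item 3 apples";

# The POSIX class (inside a bracketed class) and the Unicode property
# accept the same digit here.
my $posix   = $item =~ /^=item\s[[:digit:]]/ ? 1 : 0;
my $unicode = $item =~ /^=item\s\p{IsDigit}/ ? 1 : 0;

# A negated POSIX class: a string made up of non-digits only.
my $no_digits = "apples" =~ /^[[:^digit:]]+$/ ? 1 : 0;

# "A" followed by a combining ring: two characters, one grapheme cluster.
my $a_ring   = "A\x{030A}";
my $chars    = length $a_ring;
my $clusters = () = $a_ring =~ /\X/g;   # count the \X matches

print "posix=$posix unicode=$unicode no_digits=$no_digits\n";
print "characters=$chars clusters=$clusters\n";
```

Counting C<\X> matches rather than using C<length> is one way to count what a reader would perceive as characters.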
If there are several regular expressions and a match with any of them is acceptable, then it is possible to combine them into a set of alternatives. If the individual expressions are input data, this can be done by programming a join operation. We'll exploit this idea in an improved version of the C<simple_grep> program: a program that matches multiple patterns: % cat > multi_grep #!/usr/bin/perl # multi_grep - match any of <number> regexps # usage: multi_grep <number> regexp1 regexp2 ... file1 file2 ... $number = shift; $regexp[$_] = shift foreach (0..$number-1); $pattern = join '|', @regexp; while ($line = <>) { print $line if $line =~ /$pattern/; } ^D % multi_grep 2 shift for multi_grep $number = shift; $regexp[$_] = shift foreach (0..$number-1); Sometimes it is advantageous to construct a pattern from the I<input> that is to be analyzed and use the permissible values on the left hand side of the matching operations. As an example for this somewhat paradoxical situation, let's assume that our input contains a command verb which should match one out of a set of available command verbs, with the additional twist that commands may be abbreviated as long as the given string is unique. The program below demonstrates the basic algorithm. % cat > keymatch #!/usr/bin/perl $kwds = 'copy compare list print'; while( $cmd = <> ){ $cmd =~ s/^\s+|\s+$//g; # trim leading and trailing spaces if( ( @matches = $kwds =~ /\b$cmd\w*/g ) == 1 ){ print "command: '@matches'\n"; } elsif( @matches == 0 ){ print "no such command: '$cmd'\n"; } else { print "not unique: '$cmd' (could be one of: @matches)\n"; } } ^D % keymatch li command: 'list' co not unique: 'co' (could be one of: copy compare) printer no such command: 'printer' Rather than trying to match the input against the keywords, we match the combined set of keywords against the input. The pattern matching operation S<C<$kwds =~ /\b($cmd\w*)/g>> does several things at the same time. 
It makes sure that the given command begins where a keyword begins (C<\b>). It tolerates abbreviations due to the added C<\w*>. It tells us the number of matches (C<scalar @matches>) and all the keywords that were actually matched. You could hardly ask for more. =head2 Embedding comments and modifiers in a regular expression Starting with this section, we will be discussing Perl's set of I<extended patterns>. These are extensions to the traditional regular expression syntax that provide powerful new tools for pattern matching. We have already seen extensions in the form of the minimal matching constructs C<??>, C<*?>, C<+?>, C<{n,m}?>, and C<{n,}?>. Most of the extensions below have the form C<(?char...)>, where the C<char> is a character that determines the type of extension. The first extension is an embedded comment C<(?#text)>. This embeds a comment into the regular expression without affecting its meaning. The comment should not have any closing parentheses in the text. An example is /(?# Match an integer:)[+-]?\d+/; This style of commenting has been largely superseded by the raw, freeform commenting that is allowed with the C</x> modifier. Most modifiers, such as C</i>, C</m>, C</s> and C</x> (or any combination thereof) can also be embedded in a regexp using C<(?i)>, C<(?m)>, C<(?s)>, and C<(?x)>. For instance, /(?i)yes/; # match 'yes' case insensitively /yes/i; # same thing /(?x)( # freeform version of an integer regexp [+-]? # match an optional sign \d+ # match a sequence of digits ) /x; Embedded modifiers can have two important advantages over the usual modifiers. Embedded modifiers allow a custom set of modifiers for I<each> regexp pattern. This is great for matching an array of regexps that must have different modifiers: $pattern[0] = '(?i)doctor'; $pattern[1] = 'Johnson'; ... 
while (<>) { foreach $patt (@pattern) { print if /$patt/; } } The second advantage is that embedded modifiers (except C</p>, which modifies the entire regexp) only affect the regexp inside the group the embedded modifier is contained in. So grouping can be used to localize the modifier's effects: /Answer: ((?i)yes)/; # matches 'Answer: yes', 'Answer: YES', etc. Embedded modifiers can also turn off any modifiers already present by using, I<e.g.>, C<(?-i)>. Modifiers can also be combined into a single expression, I<e.g.>, C<(?s-i)> turns on single line mode and turns off case insensitivity. Embedded modifiers may also be added to a non-capturing grouping. C<(?i-m:regexp)> is a non-capturing grouping that matches C<regexp> case insensitively and turns off multi-line mode. =head2 Looking ahead and looking behind This section concerns the lookahead and lookbehind assertions. First, a little background. In Perl regular expressions, most regexp elements "eat up" a certain amount of string when they match. For instance, the regexp element C<[abc]> eats up one character of the string when it matches, in the sense that Perl moves to the next character position in the string after the match. There are some elements, however, that don't eat up characters (advance the character position) if they match. The examples we have seen so far are the anchors. The anchor C<'^'> matches the beginning of the line, but doesn't eat any characters. Similarly, the word boundary anchor C<\b> matches wherever a character matching C<\w> is next to a character that doesn't, but it doesn't eat up any characters itself. Anchors are examples of I<zero-width assertions>: zero-width, because they consume no characters, and assertions, because they test some property of the string. In the context of our walk in the woods analogy to regexp matching, most regexp elements move us along a trail, but anchors have us stop a moment and check our surroundings. 
If the local environment checks out, we can proceed forward. But if the local environment doesn't satisfy us, we must backtrack. Checking the environment entails either looking ahead on the trail, looking behind, or both. C<'^'> looks behind, to see that there are no characters before. C<'$'> looks ahead, to see that there are no characters after. C<\b> looks both ahead and behind, to see if the characters on either side differ in their "word-ness". The lookahead and lookbehind assertions are generalizations of the anchor concept. Lookahead and lookbehind are zero-width assertions that let us specify which characters we want to test for. The lookahead assertion is denoted by C<(?=regexp)> or (starting in 5.32, experimentally in 5.28) C<(*pla:regexp)> or C<(*positive_lookahead:regexp)>; and the lookbehind assertion is denoted by C<< (?<=fixed-regexp) >> or (starting in 5.32, experimentally in 5.28) C<(*plb:fixed-regexp)> or C<(*positive_lookbehind:fixed-regexp)>. Some examples are $x = "I catch the housecat 'Tom-cat' with catnip"; $x =~ /cat(*pla:\s)/; # matches 'cat' in 'housecat' @catwords = ($x =~ /(?<=\s)cat\w+/g); # matches, # $catwords[0] = 'catch' # $catwords[1] = 'catnip' $x =~ /\bcat\b/; # matches 'cat' in 'Tom-cat' $x =~ /(?<=\s)cat(?=\s)/; # doesn't match; no isolated 'cat' in # middle of $x Note that the parentheses in these are non-capturing, since these are zero-width assertions. Thus in the second regexp, the substrings captured are those of the whole regexp itself. Lookahead can match arbitrary regexps, but lookbehind prior to 5.30 C<< (?<=fixed-regexp) >> only works for regexps of fixed width, I<i.e.>, a fixed number of characters long. Thus C<< (?<=(ab|bc)) >> is fine, but C<< (?<=(ab)*) >> prior to 5.30 is not. The negated versions of the lookahead and lookbehind assertions are denoted by C<(?!regexp)> and C<< (?<!fixed-regexp) >> respectively. 
Or, starting in 5.32 (experimentally in 5.28), C<(*nla:regexp)>, C<(*negative_lookahead:regexp)>, C<(*nlb:regexp)>, or C<(*negative_lookbehind:regexp)>. They evaluate true if the regexps do I<not> match: $x = "foobar"; $x =~ /foo(?!bar)/; # doesn't match, 'bar' follows 'foo' $x =~ /foo(?!baz)/; # matches, 'baz' doesn't follow 'foo' $x =~ /(?<!\s)foo/; # matches, there is no \s before 'foo' Here is an example where a string containing blank-separated words, numbers and single dashes is to be split into its components. Using C</\s+/> alone won't work, because spaces are not required between dashes, or a word or a dash. Additional places for a split are established by looking ahead and behind: $str = "one two - --6-8"; @toks = split / \s+ # a run of spaces | (?<=\S) (?=-) # any non-space followed by '-' | (?<=-) (?=\S) # a '-' followed by any non-space /x, $str; # @toks = qw(one two - - - 6 - 8) =head2 Using independent subexpressions to prevent backtracking I<Independent subexpressions> (or atomic subexpressions) are regular expressions, in the context of a larger regular expression, that function independently of the larger regular expression. That is, they consume as much or as little of the string as they wish without regard for the ability of the larger regexp to match. Independent subexpressions are represented by C<< (?>regexp) >> or (starting in 5.32, experimentally in 5.28) C<(*atomic:regexp)>. We can illustrate their behavior by first considering an ordinary regexp: $x = "ab"; $x =~ /a*ab/; # matches This obviously matches, but in the process of matching, the subexpression C<a*> first grabbed the C<'a'>. Doing so, however, wouldn't allow the whole regexp to match, so after backtracking, C<a*> eventually gave back the C<'a'> and matched the empty string. Here, what C<a*> matched was I<dependent> on what the rest of the regexp matched. Contrast that with an independent subexpression: $x =~ /(?>a*)ab/; # doesn't match! 
The independent subexpression C<< (?>a*) >> doesn't care about the rest of the regexp, so it sees an C<'a'> and grabs it. Then the rest of the regexp C<ab> cannot match. Because C<< (?>a*) >> is independent, there is no backtracking and the independent subexpression does not give up its C<'a'>. Thus the match of the regexp as a whole fails. A similar behavior occurs with completely independent regexps: $x = "ab"; $x =~ /a*/g; # matches, eats an 'a' $x =~ /\Gab/g; # doesn't match, no 'a' available Here C</g> and C<\G> create a "tag team" handoff of the string from one regexp to the other. Regexps with an independent subexpression are much like this, with a handoff of the string to the independent subexpression, and a handoff of the string back to the enclosing regexp. The ability of an independent subexpression to prevent backtracking can be quite useful. Suppose we want to match a non-empty string enclosed in parentheses up to two levels deep. Then the following regexp matches: $x = "abc(de(fg)h"; # unbalanced parentheses $x =~ /\( ( [ ^ () ]+ | \( [ ^ () ]* \) )+ \)/xx; The regexp matches an open parenthesis, one or more copies of an alternation, and a close parenthesis. The alternation is two-way, with the first alternative C<[^()]+> matching a substring with no parentheses and the second alternative C<\([^()]*\)> matching a substring delimited by parentheses. The problem with this regexp is that it is pathological: it has nested indeterminate quantifiers of the form C<(a+|b)+>. We discussed in Part 1 how nested quantifiers like this could take an exponentially long time to execute if there was no match possible. To prevent the exponential blowup, we need to prevent useless backtracking at some point. 
This can be done by enclosing the inner quantifier as an independent subexpression: $x =~ /\( ( (?> [ ^ () ]+ ) | \([ ^ () ]* \) )+ \)/xx; Here, C<< (?>[^()]+) >> breaks the degeneracy of string partitioning by gobbling up as much of the string as possible and keeping it. Then match failures fail much more quickly. =head2 Conditional expressions A I<conditional expression> is a form of if-then-else statement that allows one to choose which patterns are to be matched, based on some condition. There are two types of conditional expression: C<(?(I<condition>)I<yes-regexp>)> and C<(?(condition)I<yes-regexp>|I<no-regexp>)>. C<(?(I<condition>)I<yes-regexp>)> is like an S<C<'if () {}'>> statement in Perl. If the I<condition> is true, the I<yes-regexp> will be matched. If the I<condition> is false, the I<yes-regexp> will be skipped and Perl will move onto the next regexp element. The second form is like an S<C<'if () {} else {}'>> statement in Perl. If the I<condition> is true, the I<yes-regexp> will be matched, otherwise the I<no-regexp> will be matched. The I<condition> can have several forms. The first form is simply an integer in parentheses C<(I<integer>)>. It is true if the corresponding backreference C<\I<integer>> matched earlier in the regexp. The same thing can be done with a name associated with a capture group, written as C<<< (E<lt>I<name>E<gt>) >>> or C<< ('I<name>') >>. The second form is a bare zero-width assertion C<(?...)>, either a lookahead, a lookbehind, or a code assertion (discussed in the next section). The third set of forms provides tests that return true if the expression is executed within a recursion (C<(R)>) or is being called from some capturing group, referenced either by number (C<(R1)>, C<(R2)>,...) or by name (C<(R&I<name>)>). The integer or name form of the C<condition> allows us to choose, with more flexibility, what to match based on what matched earlier in the regexp. 
This searches for words of the form C<"$x$x"> or C<"$x$y$y$x">: % simple_grep '^(\w+)(\w+)?(?(2)\g2\g1|\g1)$' /usr/dict/words beriberi coco couscous deed ... toot toto tutu The lookbehind C<condition> allows, along with backreferences, an earlier part of the match to influence a later part of the match. For instance, /[ATGC]+(?(?<=AA)G|C)$/; matches a DNA sequence such that it either ends in C<AAG>, or some other base pair combination and C<'C'>. Note that the form is C<< (?(?<=AA)G|C) >> and not C<< (?((?<=AA))G|C) >>; for the lookahead, lookbehind or code assertions, the parentheses around the conditional are not needed. =head2 Defining named patterns Some regular expressions use identical subpatterns in several places. Starting with Perl 5.10, it is possible to define named subpatterns in a section of the pattern so that they can be called up by name anywhere in the pattern. This syntactic pattern for this definition group is C<< (?(DEFINE)(?<I<name>>I<pattern>)...) >>. An insertion of a named pattern is written as C<(?&I<name>)>. The example below illustrates this feature using the pattern for floating point numbers that was presented earlier on. The three subpatterns that are used more than once are the optional sign, the digit sequence for an integer and the decimal fraction. The C<DEFINE> group at the end of the pattern contains their definition. Notice that the decimal fraction pattern is the first place where we can reuse the integer pattern. /^ (?&osg)\ * ( (?&int)(?&dec)? | (?&dec) ) (?: [eE](?&osg)(?&int) )? $ (?(DEFINE) (?<osg>[-+]?) # optional sign (?<int>\d++) # integer (?<dec>\.(?&int)) # decimal fraction )/x =head2 Recursive patterns This feature (introduced in Perl 5.10) significantly extends the power of Perl's pattern matching. 
By referring to some other capture group anywhere in the pattern with
the construct C<(?I<group-ref>)>, the I<pattern> within the referenced
group is used as an independent subpattern in place of the group
reference itself.  Because the group reference may be contained
I<within> the group it refers to, it is now possible to apply pattern
matching to tasks that hitherto required a recursive parser.

To illustrate this feature, we'll design a pattern that matches if a
string contains a palindrome.  (This is a word or a sentence that,
while ignoring spaces, punctuation and case, reads the same backwards
as forwards.)  We begin by observing that the empty string or a string
containing just one word character is a palindrome.  Otherwise it must
have a word character up front and the same at its end, with another
palindrome in between.

    /(?: (\w) (?...Here be a palindrome...) \g{-1} | \w? )/x

Adding C<\W*> at either end to eliminate what is to be ignored, we
already have the full pattern:

    my $pp = qr/^(\W* (?: (\w) (?1) \g{-1} | \w? ) \W*)$/ix;
    for $s ( "saippuakauppias", "A man, a plan, a canal: Panama!" ){
        print "'$s' is a palindrome\n" if $s =~ /$pp/;
    }

In C<(?...)> both absolute and relative backreferences may be used.
The entire pattern can be reinserted with C<(?R)> or C<(?0)>.
If you prefer to name your groups, you can use C<(?&I<name>)> to
recurse into that group.

=head2 A bit of magic: executing Perl code in a regular expression

Normally, regexps are a part of Perl expressions.
I<Code evaluation> expressions turn that around by allowing
arbitrary Perl code to be a part of a regexp.  A code evaluation
expression is denoted C<(?{I<code>})>, with I<code> a string of Perl
statements.

Code expressions are zero-width assertions, and the value they return
depends on their environment.  There are two possibilities: either the
code expression is used as a conditional in a conditional expression
C<(?(I<condition>)...)>, or it is not.
If the code expression is a conditional, the code is evaluated and the result (I<i.e.>, the result of the last statement) is used to determine truth or falsehood. If the code expression is not used as a conditional, the assertion always evaluates true and the result is put into the special variable C<$^R>. The variable C<$^R> can then be used in code expressions later in the regexp. Here are some silly examples: $x = "abcdef"; $x =~ /abc(?{print "Hi Mom!";})def/; # matches, # prints 'Hi Mom!' $x =~ /aaa(?{print "Hi Mom!";})def/; # doesn't match, # no 'Hi Mom!' Pay careful attention to the next example: $x =~ /abc(?{print "Hi Mom!";})ddd/; # doesn't match, # no 'Hi Mom!' # but why not? At first glance, you'd think that it shouldn't print, because obviously the C<ddd> isn't going to match the target string. But look at this example: $x =~ /abc(?{print "Hi Mom!";})[dD]dd/; # doesn't match, # but _does_ print Hmm. What happened here? If you've been following along, you know that the above pattern should be effectively (almost) the same as the last one; enclosing the C<'d'> in a character class isn't going to change what it matches. So why does the first not print while the second one does? The answer lies in the optimizations the regexp engine makes. In the first case, all the engine sees are plain old characters (aside from the C<?{}> construct). It's smart enough to realize that the string C<'ddd'> doesn't occur in our target string before actually running the pattern through. But in the second case, we've tricked it into thinking that our pattern is more complicated. It takes a look, sees our character class, and decides that it will have to actually run the pattern to determine whether or not it matches, and in the process of running it hits the print statement before it discovers that we don't have a match. To take a closer look at how the engine does optimizations, see the section L</"Pragmas and debugging"> below. 
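The two roles of a code expression described at the start of this
section (a plain assertion whose value lands in C<$^R>, and a condition
inside C<(?(I<condition>)...)>) can be sketched as follows.  The variable
names C<$seen> and C<$want_digit> and the sample strings are assumptions
made up for this illustration; they are package variables so that the
code blocks can see them regardless of how a given Perl version handles
lexicals inside C<(?{...})>.

```perl
use strict;
use warnings;

# Hypothetical names, used only in this sketch.
our ($seen, $want_digit);

# Not a conditional: the assertion always succeeds, and the value of
# the block's last statement is stored in $^R for later blocks.
"abcdef" =~ /abc(?{ 42 })def(?{ $seen = $^R })/;

# Used as a conditional: the block's value picks the branch to try.
$want_digit = 1;
my $digit_branch = ("x7" =~ /^x(?(?{ $want_digit })\d|[a-z])$/) ? 1 : 0;

$want_digit = 0;
my $alpha_branch = ("x7" =~ /^x(?(?{ $want_digit })\d|[a-z])$/) ? 1 : 0;

print "seen=$seen digit_branch=$digit_branch alpha_branch=$alpha_branch\n";
```

With C<$want_digit> true the C<\d> branch is tried and C<"x7"> matches;
with it false only C<[a-z]> is tried, so the same string fails.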
More fun with C<?{}>: $x =~ /(?{print "Hi Mom!";})/; # matches, # prints 'Hi Mom!' $x =~ /(?{$c = 1;})(?{print "$c";})/; # matches, # prints '1' $x =~ /(?{$c = 1;})(?{print "$^R";})/; # matches, # prints '1' The bit of magic mentioned in the section title occurs when the regexp backtracks in the process of searching for a match. If the regexp backtracks over a code expression and if the variables used within are localized using C<local>, the changes in the variables produced by the code expression are undone! Thus, if we wanted to count how many times a character got matched inside a group, we could use, I<e.g.>, $x = "aaaa"; $count = 0; # initialize 'a' count $c = "bob"; # test if $c gets clobbered $x =~ /(?{local $c = 0;}) # initialize count ( a # match 'a' (?{local $c = $c + 1;}) # increment count )* # do this any number of times, aa # but match 'aa' at the end (?{$count = $c;}) # copy local $c var into $count /x; print "'a' count is $count, \$c variable is '$c'\n"; This prints 'a' count is 2, $c variable is 'bob' If we replace the S<C< (?{local $c = $c + 1;})>> with S<C< (?{$c = $c + 1;})>>, the variable changes are I<not> undone during backtracking, and we get 'a' count is 4, $c variable is 'bob' Note that only localized variable changes are undone. Other side effects of code expression execution are permanent. Thus $x = "aaaa"; $x =~ /(a(?{print "Yow\n";}))*aa/; produces Yow Yow Yow Yow The result C<$^R> is automatically localized, so that it will behave properly in the presence of backtracking. This example uses a code expression in a conditional to match a definite article, either C<'the'> in English or C<'der|die|das'> in German: $lang = 'DE'; # use German ... $text = "das"; print "matched\n" if $text =~ /(?(?{ $lang eq 'EN'; # is the language English? }) the | # if so, then match 'the' (der|die|das) # else, match 'der|die|das' ) /xi; Note that the syntax here is C<(?(?{...})I<yes-regexp>|I<no-regexp>)>, not C<(?((?{...}))I<yes-regexp>|I<no-regexp>)>. 
In other words, in the case of a code expression, we don't need the
extra parentheses around the conditional.

If you try to use code expressions where the code text is contained
within an interpolated variable, rather than appearing literally in the
pattern, Perl may surprise you:

    $bar = 5;
    $pat = '(?{ 1 })';
    /foo(?{ $bar })bar/; # compiles ok, $bar not interpolated
    /foo(?{ 1 })$bar/;   # compiles ok, $bar interpolated
    /foo${pat}bar/;      # compile error!

    $pat = qr/(?{ $foo = 1 })/;  # precompile code regexp
    /foo${pat}bar/;      # compiles ok

If a regexp has a variable that interpolates a code expression, Perl
treats the regexp as an error.  If the code expression is precompiled
into a variable, however, interpolating is ok.  The question is, why is
this an error?

The reason is that variable interpolation and code expressions
together pose a security risk.  The combination is dangerous because
many programmers who write search engines often take user input and
plug it directly into a regexp:

    $regexp = <>;       # read user-supplied regexp
    chomp $regexp;      # get rid of possible newline
    $text =~ /$regexp/; # search $text for the $regexp

If the C<$regexp> variable contains a code expression, the user could
then execute arbitrary Perl code.  For instance, some joker could
search for S<C<system('rm -rf *');>> to erase your files.  In this
sense, the combination of interpolation and code expressions I<taints>
your regexp.  So by default, using both interpolation and code
expressions in the same regexp is not allowed.  If you're not
concerned about malicious users, it is possible to bypass this
security check by invoking S<C<use re 'eval'>>:

    use re 'eval';       # throw caution out the door
    $bar = 5;
    $pat = '(?{ 1 })';
    /foo${pat}bar/;      # compiles ok

Another form of code expression is the I<pattern code expression>.
The pattern code expression is like a regular code expression, except
that the result of the code evaluation is treated as a regular
expression and matched immediately.
A simple example is $length = 5; $char = 'a'; $x = 'aaaaabb'; $x =~ /(??{$char x $length})/x; # matches, there are 5 of 'a' This final example contains both ordinary and pattern code expressions. It detects whether a binary string C<1101010010001...> has a Fibonacci spacing 0,1,1,2,3,5,... of the C<'1'>'s: $x = "1101010010001000001"; $z0 = ''; $z1 = '0'; # initial conditions print "It is a Fibonacci sequence\n" if $x =~ /^1 # match an initial '1' (?: ((??{ $z0 })) # match some '0' 1 # and then a '1' (?{ $z0 = $z1; $z1 .= $^N; }) )+ # repeat as needed $ # that is all there is /x; printf "Largest sequence matched was %d\n", length($z1)-length($z0); Remember that C<$^N> is set to whatever was matched by the last completed capture group. This prints It is a Fibonacci sequence Largest sequence matched was 5 Ha! Try that with your garden variety regexp package... Note that the variables C<$z0> and C<$z1> are not substituted when the regexp is compiled, as happens for ordinary variables outside a code expression. Rather, the whole code block is parsed as perl code at the same time as perl is compiling the code containing the literal regexp pattern. This regexp without the C</x> modifier is /^1(?:((??{ $z0 }))1(?{ $z0 = $z1; $z1 .= $^N; }))+$/ which shows that spaces are still possible in the code parts. Nevertheless, when working with code and conditional expressions, the extended form of regexps is almost necessary in creating and debugging regexps. =head2 Backtracking control verbs Perl 5.10 introduced a number of control verbs intended to provide detailed control over the backtracking process, by directly influencing the regexp engine and by providing monitoring techniques. See L<perlre/"Special Backtracking Control Verbs"> for a detailed description. Below is just one example, illustrating the control verb C<(*FAIL)>, which may be abbreviated as C<(*F)>. 
If this is inserted in a regexp it will cause it to fail, just as it
would at some mismatch between the pattern and the string.  Processing
of the regexp continues as it would after any "normal" failure, so
that, for instance, the next position in the string or another
alternative will be tried.  As failing to match doesn't preserve
capture groups or produce results, it may be necessary to use this in
combination with embedded code.

    %count = ();
    "supercalifragilisticexpialidocious" =~
        /([aeiou])(?{ $count{$1}++; })(*FAIL)/i;
    printf "%3d '%s'\n", $count{$_}, $_ for (sort keys %count);

The pattern begins with a class matching a subset of letters.  Whenever
this matches, a statement like C<$count{'a'}++;> is executed, incrementing
the letter's counter.  Then C<(*FAIL)> does what it says, and
the regexp engine proceeds according to the book: as long as the end of
the string hasn't been reached, the position is advanced before looking
for another vowel.  Thus, match or no match makes no difference, and the
regexp engine proceeds until the entire string has been inspected.
(It's remarkable that an alternative solution using something like

    $count{lc($_)}++ for split('', "supercalifragilisticexpialidocious");
    printf "%3d '%s'\n", $count{$_}, $_ for ( qw{ a e i o u } );

is considerably slower.)

=head2 Pragmas and debugging

Speaking of debugging, there are several pragmas available to control
and debug regexps in Perl.  We have already encountered one pragma in
an earlier section, S<C<use re 'eval';>>, that allows variable
interpolation and code expressions to coexist in a regexp.  The other
pragmas are

    use re 'taint';
    $tainted = <>;
    @parts = ($tainted =~ /(\w+)\s+(\w+)/); # @parts is now tainted

The C<taint> pragma causes any substrings from a match with a tainted
variable to be tainted as well.  This is not normally the case, as
regexps are often used to extract the safe bits from a tainted
variable.
Use C<taint> when you are not extracting safe bits, but are performing some other processing. Both C<taint> and C<eval> pragmas are lexically scoped, which means they are in effect only until the end of the block enclosing the pragmas. use re '/m'; # or any other flags $multiline_string =~ /^foo/; # /m is implied The C<re '/flags'> pragma (introduced in Perl 5.14) turns on the given regular expression flags until the end of the lexical scope. See L<re/"'E<sol>flags' mode"> for more detail. use re 'debug'; /^(.*)$/s; # output debugging info use re 'debugcolor'; /^(.*)$/s; # output debugging info in living color The global C<debug> and C<debugcolor> pragmas allow one to get detailed debugging info about regexp compilation and execution. C<debugcolor> is the same as debug, except the debugging information is displayed in color on terminals that can display termcap color sequences. Here is example output: % perl -e 'use re "debug"; "abc" =~ /a*b+c/;' Compiling REx 'a*b+c' size 9 first at 1 1: STAR(4) 2: EXACT <a>(0) 4: PLUS(7) 5: EXACT <b>(0) 7: EXACT <c>(9) 9: END(0) floating 'bc' at 0..2147483647 (checking floating) minlen 2 Guessing start of match, REx 'a*b+c' against 'abc'... Found floating substr 'bc' at offset 1... Guessed: match at offset 0 Matching REx 'a*b+c' against 'abc' Setting an EVAL scope, savestack=3 0 <> <abc> | 1: STAR EXACT <a> can match 1 times out of 32767... Setting an EVAL scope, savestack=3 1 <a> <bc> | 4: PLUS EXACT <b> can match 1 times out of 32767... Setting an EVAL scope, savestack=3 2 <ab> <c> | 7: EXACT <c> 3 <abc> <> | 9: END Match successful! Freeing REx: 'a*b+c' If you have gotten this far into the tutorial, you can probably guess what the different parts of the debugging output tell you. The first part Compiling REx 'a*b+c' size 9 first at 1 1: STAR(4) 2: EXACT <a>(0) 4: PLUS(7) 5: EXACT <b>(0) 7: EXACT <c>(9) 9: END(0) describes the compilation stage. 
C<STAR(4)> means that there is a starred object, in this case C<'a'>, and if it matches, goto line 4, I<i.e.>, C<PLUS(7)>. The middle lines describe some heuristics and optimizations performed before a match: floating 'bc' at 0..2147483647 (checking floating) minlen 2 Guessing start of match, REx 'a*b+c' against 'abc'... Found floating substr 'bc' at offset 1... Guessed: match at offset 0 Then the match is executed and the remaining lines describe the process: Matching REx 'a*b+c' against 'abc' Setting an EVAL scope, savestack=3 0 <> <abc> | 1: STAR EXACT <a> can match 1 times out of 32767... Setting an EVAL scope, savestack=3 1 <a> <bc> | 4: PLUS EXACT <b> can match 1 times out of 32767... Setting an EVAL scope, savestack=3 2 <ab> <c> | 7: EXACT <c> 3 <abc> <> | 9: END Match successful! Freeing REx: 'a*b+c' Each step is of the form S<C<< n <x> <y> >>>, with C<< <x> >> the part of the string matched and C<< <y> >> the part not yet matched. The S<C<< | 1: STAR >>> says that Perl is at line number 1 in the compilation list above. See L<perldebguts/"Debugging Regular Expressions"> for much more detail. An alternative method of debugging regexps is to embed C<print> statements within the regexp. This provides a blow-by-blow account of the backtracking in an alternation: "that this" =~ m@(?{print "Start at position ", pos, "\n";}) t(?{print "t1\n";}) h(?{print "h1\n";}) i(?{print "i1\n";}) s(?{print "s1\n";}) | t(?{print "t2\n";}) h(?{print "h2\n";}) a(?{print "a2\n";}) t(?{print "t2\n";}) (?{print "Done at position ", pos, "\n";}) @x; prints Start at position 0 t1 h1 t2 h2 a2 t2 Done at position 4 =head1 SEE ALSO This is just a tutorial. For the full story on Perl regular expressions, see the L<perlre> regular expressions reference page. For more information on the matching C<m//> and substitution C<s///> operators, see L<perlop/"Regexp Quote-Like Operators">. For information on the C<split> operation, see L<perlfunc/split>. 
For an excellent all-around resource on the care and feeding of regular
expressions, see the book I<Mastering Regular Expressions> by
Jeffrey Friedl (published by O'Reilly, ISBN 1-56592-257-3).

=head1 AUTHOR AND COPYRIGHT

Copyright (c) 2000 Mark Kvale.
All rights reserved.
Now maintained by Perl porters.

This document may be distributed under the same terms as Perl itself.

=head2 Acknowledgments

The inspiration for the stop codon DNA example came from the ZIP
code example in chapter 7 of I<Mastering Regular Expressions>.
The author would like to thank Jeff Pinyan, Andrew Johnson, Peter
Haworth, Ronald J Kimball, and Joe Smith for all their helpful
comments.

=cut

=encoding utf8

=head1 NAME

perl5202delta - what is new for perl v5.20.2

=head1 DESCRIPTION

This document describes differences between the 5.20.1 release and the 5.20.2
release.

If you are upgrading from an earlier release such as 5.20.0, first read
L<perl5201delta>, which describes differences between 5.20.0 and 5.20.1.

=head1 Incompatible Changes

There are no changes intentionally incompatible with 5.20.1.  If any exist,
they are bugs, and we request that you submit a report.  See
L</Reporting Bugs> below.

=head1 Modules and Pragmata

=head2 Updated Modules and Pragmata

=over 4

=item *

L<attributes> has been upgraded from version 0.22 to 0.23.

The usage of C<memEQs> in the XS has been corrected.
L<[perl #122701]|https://rt.perl.org/Ticket/Display.html?id=122701>

=item *

L<Data::Dumper> has been upgraded from version 2.151 to 2.151_01.

Fixes CVE-2014-4330 by adding a configuration variable/option to limit
recursion when dumping deep data structures.

=item *

L<Errno> has been upgraded from version 1.20_03 to 1.20_05.

Warnings when building the XS on Windows with the Visual C++ compiler are now
avoided.

=item *

L<feature> has been upgraded from version 1.36 to 1.36_01.

The C<postderef> feature has now been documented.
This feature was actually added in Perl 5.20.0 but was accidentally omitted from the feature documentation until now. =item * L<IO::Socket> has been upgraded from version 1.37 to 1.38. The limitations of the connected() method are now documented. L<[perl #123096]|https://rt.perl.org/Ticket/Display.html?id=123096> =item * L<Module::CoreList> has been upgraded from version 5.020001 to 5.20150214. The list of Perl versions covered has been updated. =item * PathTools has been upgraded from version 3.48 to 3.48_01. A warning from the B<gcc> compiler is now avoided when building the XS. =item * L<PerlIO::scalar> has been upgraded from version 0.18 to 0.18_01. Reading from a position well past the end of the scalar now correctly returns end of file. L<[perl #123443]|https://rt.perl.org/Ticket/Display.html?id=123443> Seeking to a negative position still fails, but no longer leaves the file position set to a negative location. C<eof()> on a C<PerlIO::scalar> handle now properly returns true when the file position is past the 2GB mark on 32-bit systems. =item * L<Storable> has been upgraded from version 2.49 to 2.49_01. Minor grammatical change to the documentation only. =item * L<VMS::DCLsym> has been upgraded from version 1.05 to 1.05_01. Minor formatting change to the documentation only. =item * L<VMS::Stdio> has been upgraded from version 2.4 to 2.41. Minor formatting change to the documentation only. =back =head1 Documentation =head2 New Documentation =head3 L<perlunicook> This document, by Tom Christiansen, provides examples of handling Unicode in Perl. =head2 Changes to Existing Documentation =head3 L<perlexperiment> =over 4 =item * Added reference to subroutine signatures. This feature was actually added in Perl 5.20.0 but was accidentally omitted from the experimental feature documentation until now. =back =head3 L<perlpolicy> =over 4 =item * The process whereby features may graduate from experimental status has now been formally documented. 
=back =head3 L<perlsyn> =over 4 =item * An ambiguity in the documentation of the ellipsis statement has been corrected. L<[perl #122661]|https://rt.perl.org/Ticket/Display.html?id=122661> =back =head1 Diagnostics The following additions or changes have been made to diagnostic output, including warnings and fatal error messages. For the complete list of diagnostic messages, see L<perldiag>. =head2 Changes to Existing Diagnostics =over 4 =item * L<Bad symbol for scalar|perldiag/"Bad symbol for scalar"> is now documented. This error is not new, but was not previously documented here. =item * L<Missing right brace on \N{}|perldiag/"Missing right brace on \N{}"> is now documented. This error is not new, but was not previously documented here. =back =head1 Testing =over 4 =item * The test script F<re/rt122747.t> has been added to verify that L<perl #122747|https://rt.perl.org/Ticket/Display.html?id=122747> remains fixed. =back =head1 Platform Support =head2 Regained Platforms IRIX and Tru64 platforms are working again. (Some C<make test> failures remain.) =head1 Selected Bug Fixes =over 4 =item * AIX now sets the length in C<< getsockopt >> correctly. L<[perl #120835]|https://rt.perl.org/Ticket/Display.html?id=120835>, L<[cpan #91183]|https://rt.cpan.org/Ticket/Display.html?id=91183>, L<[cpan #85570]|https://rt.cpan.org/Ticket/Display.html?id=85570> =item * In Perl 5.20.0, C<$^N> accidentally had the internal UTF8 flag turned off if accessed from a code block within a regular expression, effectively UTF8-encoding the value. This has been fixed. L<[perl #123135]|https://rt.perl.org/Ticket/Display.html?id=123135> =item * Various cases where the name of a sub is used (autoload, overloading, error messages) used to crash for lexical subs, but have been fixed. =item * An assertion failure when parsing C<sort> with debugging enabled has been fixed. 
L<[perl #122771]|https://rt.perl.org/Ticket/Display.html?id=122771> =item * Loading UTF8 tables during a regular expression match could cause assertion failures under debugging builds if the previous match used the very same regular expression. L<[perl #122747]|https://rt.perl.org/Ticket/Display.html?id=122747> =item * Due to a mistake in the string-copying logic, copying the value of a state variable could instead steal the value and undefine the variable. This bug, introduced in Perl 5.20, would happen mostly for long strings (1250 chars or more), but could happen for any strings under builds with copy-on-write disabled. L<[perl #123029]|https://rt.perl.org/Ticket/Display.html?id=123029> =item * Fixed a bug that could cause perl to execute an infinite loop during compilation. L<[perl #122995]|https://rt.perl.org/Ticket/Display.html?id=122995> =item * On Win32, restoring in a child pseudo-process a variable that was C<local()>ed in a parent pseudo-process before the C<fork> happened caused memory corruption and a crash in the child pseudo-process (and therefore OS process). L<[perl #40565]|https://rt.perl.org/Ticket/Display.html?id=40565> =item * Tainted constants evaluated at compile time no longer cause unrelated statements to become tainted. L<[perl #122669]|https://rt.perl.org/Ticket/Display.html?id=122669> =item * Calling C<write> on a format with a C<^**> field could produce a panic in sv_chop() if there were insufficient arguments or if the variable used to fill the field was empty. L<[perl #123245]|https://rt.perl.org/Ticket/Display.html?id=123245> =item * In Perl 5.20.0, C<sort CORE::fake> where 'fake' is anything other than a keyword started chopping off the last 6 characters and treating the result as a sort sub name. The previous behaviour of treating "CORE::fake" as a sort sub name has been restored. 
L<[perl #123410]|https://rt.perl.org/Ticket/Display.html?id=123410> =item * A bug in regular expression patterns that could lead to segfaults and other crashes has been fixed. This occurred only in patterns compiled with C<"/i">, while taking into account the current POSIX locale (this usually means they have to be compiled within the scope of C<S<"use locale">>), and there must be a string of at least 128 consecutive bytes to match. L<[perl #123539]|https://rt.perl.org/Ticket/Display.html?id=123539> =item * C<qr/@array(?{block})/> no longer dies with "Bizarre copy of ARRAY". L<[perl #123344]|https://rt.perl.org/Ticket/Display.html?id=123344> =item * C<gmtime> no longer crashes with not-a-number values. L<[perl #123495]|https://rt.perl.org/Ticket/Display.html?id=123495> =item * Certain syntax errors in substitutions, such as C<< s/${<>{})// >>, would crash, and had done so since Perl 5.10. (In some cases the crash did not start happening until Perl 5.16.) The crash has, of course, been fixed. L<[perl #123542]|https://rt.perl.org/Ticket/Display.html?id=123542> =item * A memory leak in some regular expressions, introduced in Perl 5.20.1, has been fixed. L<[perl #123198]|https://rt.perl.org/Ticket/Display.html?id=123198> =item * C<< formline("@...", "a"); >> would crash. The C<FF_CHECKNL> case in pp_formline() didn't set the pointer used to mark the chop position, which led to the C<FF_MORE> case crashing with a segmentation fault. This has been fixed. L<[perl #123538]|https://rt.perl.org/Ticket/Display.html?id=123538> L<[perl #123622]|https://rt.perl.org/Ticket/Display.html?id=123622> =item * A possible buffer overrun and crash when parsing a literal pattern during regular expression compilation has been fixed. L<[perl #123604]|https://rt.perl.org/Ticket/Display.html?id=123604> =back =head1 Known Problems =over 4 =item * It is a known bug that lexical subroutines cannot be used as the C<SUBNAME> argument to C<sort>. This will be fixed in a future version of Perl. 
=back =head1 Errata From Previous Releases =over 4 =item * A regression has been fixed that was introduced in Perl 5.20.0 (fixed in Perl 5.20.1 as well as here) in which a UTF-8 encoded regular expression pattern that contains a single ASCII lowercase letter does not match its uppercase counterpart. L<[perl #122655]|https://rt.perl.org/Ticket/Display.html?id=122655> =back =head1 Acknowledgements Perl 5.20.2 represents approximately 5 months of development since Perl 5.20.1 and contains approximately 6,300 lines of changes across 170 files from 34 authors. Excluding auto-generated files, documentation and release tools, there were approximately 1,900 lines of changes to 80 .pm, .t, .c and .h files. Perl continues to flourish into its third decade thanks to a vibrant community of users and developers. The following people are known to have contributed the improvements that became Perl 5.20.2: Aaron Crane, Abigail, Andreas Voegele, Andy Dougherty, Anthony Heading, Aristotle Pagaltzis, Chris 'BinGOs' Williams, Craig A. Berry, Daniel Dragan, Doug Bell, Ed J, Father Chrysostomos, Glenn D. Golden, H.Merijn Brand, Hugo van der Sanden, James E Keenan, Jarkko Hietaniemi, Jim Cromie, Karen Etheridge, Karl Williamson, kmx, Matthew Horsfall, Max Maischein, Peter Martini, Rafael Garcia-Suarez, Ricardo Signes, Shlomi Fish, Slaven Rezic, Steffen Müller, Steve Hay, Tadeusz Sośnierz, Tony Cook, Yves Orton, Ævar Arnfjörð Bjarmason. The list above is almost certainly incomplete as it is automatically generated from version control history. In particular, it does not include the names of the (very much appreciated) contributors who reported issues to the Perl bug tracker. Many of the changes included in this version originated in the CPAN modules included in Perl's core. We're grateful to the entire CPAN community for helping Perl to flourish. For a more complete list of all of Perl's historical contributors, please see the F<AUTHORS> file in the Perl source distribution. 
=head1 Reporting Bugs If you find what you think is a bug, you might check the articles recently posted to the comp.lang.perl.misc newsgroup and the perl bug database at https://rt.perl.org/ . There may also be information at http://www.perl.org/ , the Perl Home Page. If you believe you have an unreported bug, please run the L<perlbug> program included with your release. Be sure to trim your bug down to a tiny but sufficient test case. Your bug report, along with the output of C<perl -V>, will be sent off to perlbug@perl.org to be analysed by the Perl porting team. If the bug you are reporting has security implications, which make it inappropriate to send to a publicly archived mailing list, then please send it to perl5-security-report@perl.org. This points to a closed subscription unarchived mailing list, which includes all the core committers, who will be able to help assess the impact of issues, figure out a resolution, and help co-ordinate the release of patches to mitigate or fix the problem across all platforms on which Perl is supported. Please only use this address for security issues in the Perl core, not for modules independently distributed on CPAN. =head1 SEE ALSO The F<Changes> file for an explanation of how to view exhaustive details on what changed. The F<INSTALL> file for how to build Perl. The F<README> file for general stuff. The F<Artistic> and F<Copying> files for copyright information. =cut =encoding utf8 =head1 NAME perl5184delta - what is new for perl v5.18.4 =head1 DESCRIPTION This document describes differences between the 5.18.4 release and the 5.18.2 release. B<Please note:> This document ignores perl 5.18.3, a broken release which existed for a few hours only. If you are upgrading from an earlier release such as 5.18.1, first read L<perl5182delta>, which describes differences between 5.18.1 and 5.18.2. 
=head1 Modules and Pragmata =head2 Updated Modules and Pragmata =over 4 =item * L<Digest::SHA> has been upgraded from 5.84_01 to 5.84_02. =item * L<perl5db.pl> has been upgraded from version 1.39_10 to 1.39_11. This fixes a crash in tab completion, where available. [perl #120827] Also, filehandle information is properly reset after a pager is run. [perl #121456] =back =head1 Platform Support =head2 Platform-Specific Notes =over 4 =item Win32 =over 4 =item * A memory leak on every call to C<system> and backticks (C< `` >), introduced by L<perl #113536|https://rt.perl.org/Public/Bug/Display.html?id=113536> and present on most Win32 Perls starting from 5.18.0, has been fixed. The memory leak only occurred if you enabled pseudo-fork in your build of Win32 Perl and were running that build on Server 2003 R2 or a newer OS. The leak does not appear on WinXP SP3. [L<perl #121676|https://rt.perl.org/Public/Bug/Display.html?id=121676>] =back =back =head1 Selected Bug Fixes =over 4 =item * The debugger now properly resets filehandles as needed. [perl #121456] =item * A segfault in Digest::SHA has been addressed. [perl #121421] =item * perl can again be built with USE_64_BIT_INT, with Visual C 2003, 32 bit. [perl #120925] =item * A leading { (brace) in formats is properly parsed again. [perl #119973] =item * Copy the values used to perturb hash iteration when cloning an interpreter. This was fairly harmless but caused C<valgrind> to complain. [perl #121336] =item * In Perl v5.18 C<undef *_; goto &sub> and C<local *_; goto &sub> started crashing. This has been fixed. [perl #119949] =back =head1 Acknowledgements Perl 5.18.4 represents approximately 9 months of development since Perl 5.18.2 and contains approximately 2,000 lines of changes across 53 files from 13 authors. Perl continues to flourish into its third decade thanks to a vibrant community of users and developers. 
The following people are known to have contributed the improvements that became Perl 5.18.4: Daniel Dragan, David Mitchell, Doug Bell, Father Chrysostomos, Hiroo Hayashi, James E Keenan, Karl Williamson, Mark Shelor, Ricardo Signes, Shlomi Fish, Smylers, Steve Hay, Tony Cook. The list above is almost certainly incomplete as it is automatically generated from version control history. In particular, it does not include the names of the (very much appreciated) contributors who reported issues to the Perl bug tracker. Many of the changes included in this version originated in the CPAN modules included in Perl's core. We're grateful to the entire CPAN community for helping Perl to flourish. For a more complete list of all of Perl's historical contributors, please see the F<AUTHORS> file in the Perl source distribution. =head1 Reporting Bugs If you find what you think is a bug, you might check the articles recently posted to the comp.lang.perl.misc newsgroup and the perl bug database at http://rt.perl.org/perlbug/ . There may also be information at http://www.perl.org/ , the Perl Home Page. If you believe you have an unreported bug, please run the L<perlbug> program included with your release. Be sure to trim your bug down to a tiny but sufficient test case. Your bug report, along with the output of C<perl -V>, will be sent off to perlbug@perl.org to be analysed by the Perl porting team. If the bug you are reporting has security implications, which make it inappropriate to send to a publicly archived mailing list, then please send it to perl5-security-report@perl.org. This points to a closed subscription unarchived mailing list, which includes all the core committers, who will be able to help assess the impact of issues, figure out a resolution, and help co-ordinate the release of patches to mitigate or fix the problem across all platforms on which Perl is supported. 
Please only use this address for security issues in the Perl core, not for modules independently distributed on CPAN. =head1 SEE ALSO The F<Changes> file for an explanation of how to view exhaustive details on what changed. The F<INSTALL> file for how to build Perl. The F<README> file for general stuff. The F<Artistic> and F<Copying> files for copyright information. =cut =begin comment # !!!!!!! DO NOT EDIT THIS FILE !!!!!!! # This file is machine-generated by lib/unicore/mktables from the Unicode # database, Version 13.0.0. Any changes made here will be lost! To change this file, edit lib/unicore/mktables instead. =end comment =head1 NAME perluniprops - Index of Unicode Version 13.0.0 character properties in Perl =head1 DESCRIPTION This document provides information about the portion of the Unicode database that deals with character properties, that is the portion that is defined on single code points. (L</Other information in the Unicode data base> below briefly mentions other data that Unicode provides.) Perl can provide access to all non-provisional Unicode character properties, though not all are enabled by default. The omitted ones are the Unihan properties (accessible via the CPAN module L<Unicode::Unihan>) and certain deprecated or Unicode-internal properties. (An installation may choose to recompile Perl's tables to change this. See L</Unicode character properties that are NOT accepted by Perl>.) For most purposes, access to Unicode properties from the Perl core is through regular expression matches, as described in the next section. For some special purposes, and to access the properties that are not suitable for regular expression matching, all the Unicode character properties that Perl handles are accessible via the standard L<Unicode::UCD> module, as described in the section L</Properties accessible through Unicode::UCD>. Perl also provides some additional extensions and short-cut synonyms for Unicode properties. 
This document merely lists all available properties and does not attempt to explain what each property really means. There is a brief description of each Perl extension; see L<perlunicode/Other Properties> for more information on these. There is some detail about Blocks, Scripts, General_Category, and Bidi_Class in L<perlunicode>, but to find out about the intricacies of the official Unicode properties, refer to the Unicode standard. A good starting place is L<http://www.unicode.org/reports/tr44/>. Note that you can define your own properties; see L<perlunicode/"User-Defined Character Properties">. =head1 Properties accessible through C<\p{}> and C<\P{}> The Perl regular expression C<\p{}> and C<\P{}> constructs give access to most of the Unicode character properties. The table below shows all these constructs, both single and compound forms. B<Compound forms> consist of two components, separated by an equals sign or a colon. The first component is the property name, and the second component is the particular value of the property to match against, for example, C<\p{Script_Extensions: Greek}> and C<\p{Script_Extensions=Greek}> both mean to match characters whose Script_Extensions property value is Greek. (C<Script_Extensions> is an improved version of the C<Script> property.) B<Single forms>, like C<\p{Greek}>, are mostly Perl-defined shortcuts for their equivalent compound forms. The table shows these equivalences. (In our example, C<\p{Greek}> is just a shortcut for C<\p{Script_Extensions=Greek}>.) There are also a few Perl-defined single forms that are not shortcuts for a compound form. One such is C<\p{Word}>. These are also listed in the table. In parsing these constructs, Perl always ignores upper/lower case differences everywhere within the {braces}. Thus C<\p{Greek}> means the same thing as C<\p{greek}>. 
But note that changing the case of the C<"p"> or C<"P"> before the left brace completely changes the meaning of the construct, from "match" (for C<\p{}>) to "doesn't match" (for C<\P{}>). Casing in this document is for improved legibility. Also, white space, hyphens, and underscores are normally ignored everywhere between the {braces}, and hence can be freely added or removed even if the C</x> modifier hasn't been specified on the regular expression. But in the table below a 'B<T>' at the beginning of an entry means that tighter (stricter) rules are used for that entry: =over 4 =over 4 =item Single form (C<\p{name}>) tighter rules: White space, hyphens, and underscores ARE significant except for: =over 4 =item * white space adjacent to a non-word character =item * underscores separating digits in numbers =back That means, for example, that you can freely add or remove white space adjacent to (but within) the braces without affecting the meaning. =item Compound form (C<\p{name=value}> or C<\p{name:value}>) tighter rules: The tighter rules given above for the single form apply to everything to the right of the colon or equals; the looser rules still apply to everything to the left. That means, for example, that you can freely add or remove white space adjacent to (but within) the braces and the colon or equal sign. =back =back Some properties are considered obsolete by Unicode, but still available. There are several varieties of obsolescence: =over 4 =over 4 =item Stabilized A property may be stabilized. Such a determination does not indicate that the property should or should not be used; instead it is a declaration that the property will not be maintained nor extended for newly encoded characters. Such properties are marked with an 'B<S>' in the table. =item Deprecated A property may be deprecated, perhaps because its original intent has been replaced by another property, or because its specification was somehow defective. 
This means that its use is strongly discouraged, so much so that a warning will be issued if used, unless the regular expression is in the scope of a C<S<no warnings 'deprecated'>> statement. A 'B<D>' flags each such entry in the table, and the entry there for the longest, most descriptive version of the property will give the reason it is deprecated, and perhaps advice. Perl may issue such a warning, even for properties that aren't officially deprecated by Unicode, when there used to be characters or code points that were matched by them, but no longer. This is to warn you that your program may not work like it did on earlier Unicode releases. A deprecated property may be made unavailable in a future Perl version, so it is best to move away from such properties. A deprecated property may also be stabilized, but this fact is not shown. =item Obsolete Properties marked with an 'B<O>' in the table are considered (plain) obsolete. Generally this designation is given to properties that Unicode once used for internal purposes (but not any longer). =item Discouraged This is not actually a Unicode-specified obsolescence, but applies to certain Perl extensions that are present for backwards compatibility, but are discouraged from being used. These are not obsolete, but their meanings are not stable. Future Unicode versions could force any of these extensions to be removed without warning, replaced by another property with the same name that means something different. An 'B<X>' flags each such entry in the table. Use the equivalent shown instead. In particular, matches in the Block property have single forms defined by Perl that begin with C<"In_">, C<"Is_">, or even with no prefix at all. Like all B<DISCOURAGED> forms, these are not stable. For example, C<\p{Block=Deseret}> can currently be written as C<\p{In_Deseret}>, C<\p{Is_Deseret}>, or C<\p{Deseret}>. 
But, a new Unicode version may come along that would force Perl to change the meaning of one or more of these, and your program would no longer be correct. Currently there are no such conflicts with the form that begins C<"In_">, but there are many with the other two shortcuts, and Unicode continues to define new properties that begin with C<"In">, so it's quite possible that a conflict will occur in the future. The compound form is guaranteed to not become obsolete, and its meaning is clearer anyway. See L<perlunicode/"Blocks"> for more information about this. User-defined properties must begin with "In" or "Is". These override any Unicode property of the same name. =back =back The table below has two columns. The left column contains the C<\p{}> constructs to look up, possibly preceded by the flags mentioned above; and the right column contains information about them, like a description, or synonyms. The table shows both the single and compound forms for each property that has them. If the left column is a short name for a property, the right column will give its longer, more descriptive name; and if the left column is the longest name, the right column will show any equivalent shortest name, in both single and compound forms if applicable. If braces are not needed to specify a property (e.g., C<\pL>), the left column contains both forms, with and without braces. The right column will also caution you if a property means something different than what might normally be expected. All single forms are Perl extensions; a few compound forms are as well, and are noted as such. Numbers in (parentheses) indicate the total number of Unicode code points matched by the property. For the entries that give the longest, most descriptive version of the property, the count is followed by a list of some of the code points matched by it. 
The list includes all the matched characters in the 0-255 range, enclosed in the familiar [brackets] the same as a regular expression bracketed character class. Following that, the next few higher matching ranges are also given. To avoid visual ambiguity, the SPACE character is represented as C<\x20>. For emphasis, those properties that match no code points at all are listed as well in a separate section following the table. Most properties match the same code points regardless of whether C<"/i"> case-insensitive matching is specified or not. But a few properties are affected. These are shown with the notation S<C<(/i= I<other_property>)>> in the second column. Under case-insensitive matching they match the same code points as the property I<other_property>. There is no description given for most non-Perl defined properties (See L<http://www.unicode.org/reports/tr44/> for that). For compactness, 'B<*>' is used as a wildcard instead of showing all possible combinations. For example, entries like: \p{Gc: *} \p{General_Category: *} mean that 'Gc' is a synonym for 'General_Category', and anything that is valid for the latter is also valid for the former. Similarly, \p{Is_*} \p{*} means that if and only if, for example, C<\p{Foo}> exists, then C<\p{Is_Foo}> and C<\p{IsFoo}> are also valid and all mean the same thing. And similarly, C<\p{Foo=Bar}> means the same as C<\p{Is_Foo=Bar}> and C<\p{IsFoo=Bar}>. "*" here is restricted to something not beginning with an underscore. Also, in binary properties, 'Yes', 'T', and 'True' are all synonyms for 'Y'. And 'No', 'F', and 'False' are all synonyms for 'N'. The table shows 'Y*' and 'N*' to indicate this, and doesn't have separate entries for the other possibilities. Note that not all properties which have values 'Yes' and 'No' are binary, and they have all their values spelled out without using this wild card, and a C<NOT> clause in their description that highlights their not being binary. 
These also require the compound form to match them, whereas true binary properties have both single and compound forms available. Note that all non-essential underscores are removed in the display of the short names below. B<Legend summary:> =over 4 =item Z<>B<*> is a wild-card =item B<(\d+)> in the info column gives the number of Unicode code points matched by this property. =item B<D> means this is deprecated. =item B<O> means this is obsolete. =item B<S> means this is stabilized. =item B<T> means tighter (stricter) name matching applies. =item B<X> means use of this form is discouraged, and may not be stable. =back NAME INFO \p{Adlam} \p{Script_Extensions=Adlam} (Short: \p{Adlm}; NOT \p{Block=Adlam}) (89) \p{Adlm} \p{Adlam} (= \p{Script_Extensions=Adlam}) (NOT \p{Block=Adlam}) (89) X \p{Aegean_Numbers} \p{Block=Aegean_Numbers} (64) T \p{Age: 1.1} \p{Age=V1_1} (33_979) \p{Age: V1_1} Code point's usage introduced in version 1.1 (33_979: U+0000..01F5, U+01FA..0217, U+0250..02A8, U+02B0..02DE, U+02E0..02E9, U+0300..0345 ...) T \p{Age: 2.0} \p{Age=V2_0} (144_521) \p{Age: V2_0} Code point's usage was introduced in version 2.0; See also Property 'Present_In' (144_521: U+0591..05A1, U+05A3..05AF, U+05C4, U+0F00..0F47, U+0F49..0F69, U+0F71..0F8B ...) T \p{Age: 2.1} \p{Age=V2_1} (2) \p{Age: V2_1} Code point's usage was introduced in version 2.1; See also Property 'Present_In' (2: U+20AC, U+FFFC) T \p{Age: 3.0} \p{Age=V3_0} (10_307) \p{Age: V3_0} Code point's usage was introduced in version 3.0; See also Property 'Present_In' (10_307: U+01F6..01F9, U+0218..021F, U+0222..0233, U+02A9..02AD, U+02DF, U+02EA..02EE ...) T \p{Age: 3.1} \p{Age=V3_1} (44_978) \p{Age: V3_1} Code point's usage was introduced in version 3.1; See also Property 'Present_In' (44_978: U+03F4..03F5, U+FDD0..FDEF, U+10300..1031E, U+10320..10323, U+10330..1034A, U+10400..10425 ...) 
T \p{Age: 3.2} \p{Age=V3_2} (1016) \p{Age: V3_2} Code point's usage was introduced in version 3.2; See also Property 'Present_In' (1016: U+0220, U+034F, U+0363..036F, U+03D8..03D9, U+03F6, U+048A..048B ...) T \p{Age: 4.0} \p{Age=V4_0} (1226) \p{Age: V4_0} Code point's usage was introduced in version 4.0; See also Property 'Present_In' (1226: U+0221, U+0234..0236, U+02AE..02AF, U+02EF..02FF, U+0350..0357, U+035D..035F ...) T \p{Age: 4.1} \p{Age=V4_1} (1273) \p{Age: V4_1} Code point's usage was introduced in version 4.1; See also Property 'Present_In' (1273: U+0237..0241, U+0358..035C, U+03FC..03FF, U+04F6..04F7, U+05A2, U+05C5..05C7 ...) T \p{Age: 5.0} \p{Age=V5_0} (1369) \p{Age: V5_0} Code point's usage was introduced in version 5.0; See also Property 'Present_In' (1369: U+0242..024F, U+037B..037D, U+04CF, U+04FA..04FF, U+0510..0513, U+05BA ...) T \p{Age: 5.1} \p{Age=V5_1} (1624) \p{Age: V5_1} Code point's usage was introduced in version 5.1; See also Property 'Present_In' (1624: U+0370..0373, U+0376..0377, U+03CF, U+0487, U+0514..0523, U+0606..060A ...) T \p{Age: 5.2} \p{Age=V5_2} (6648) \p{Age: V5_2} Code point's usage was introduced in version 5.2; See also Property 'Present_In' (6648: U+0524..0525, U+0800..082D, U+0830..083E, U+0900, U+094E, U+0955 ...) T \p{Age: 6.0} \p{Age=V6_0} (2088) \p{Age: V6_0} Code point's usage was introduced in version 6.0; See also Property 'Present_In' (2088: U+0526..0527, U+0620, U+065F, U+0840..085B, U+085E, U+093A..093B ...) T \p{Age: 6.1} \p{Age=V6_1} (732) \p{Age: V6_1} Code point's usage was introduced in version 6.1; See also Property 'Present_In' (732: U+058F, U+0604, U+08A0, U+08A2..08AC, U+08E4..08FE, U+0AF0 ...) 
T \p{Age: 6.2} \p{Age=V6_2} (1) \p{Age: V6_2} Code point's usage was introduced in version 6.2; See also Property 'Present_In' (1: U+20BA) T \p{Age: 6.3} \p{Age=V6_3} (5) \p{Age: V6_3} Code point's usage was introduced in version 6.3; See also Property 'Present_In' (5: U+061C, U+2066..2069) T \p{Age: 7.0} \p{Age=V7_0} (2834) \p{Age: V7_0} Code point's usage was introduced in version 7.0; See also Property 'Present_In' (2834: U+037F, U+0528..052F, U+058D..058E, U+0605, U+08A1, U+08AD..08B2 ...) T \p{Age: 8.0} \p{Age=V8_0} (7716) \p{Age: V8_0} Code point's usage was introduced in version 8.0; See also Property 'Present_In' (7716: U+08B3..08B4, U+08E3, U+0AF9, U+0C5A, U+0D5F, U+13F5 ...) T \p{Age: 9.0} \p{Age=V9_0} (7500) \p{Age: V9_0} Code point's usage was introduced in version 9.0; See also Property 'Present_In' (7500: U+08B6..08BD, U+08D4..08E2, U+0C80, U+0D4F, U+0D54..0D56, U+0D58..0D5E ...) T \p{Age: 10.0} \p{Age=V10_0} (8518) \p{Age: V10_0} Code point's usage was introduced in version 10.0; See also Property 'Present_In' (8518: U+0860..086A, U+09FC..09FD, U+0AFA..0AFF, U+0D00, U+0D3B..0D3C, U+1CF7 ...) T \p{Age: 11.0} \p{Age=V11_0} (684) \p{Age: V11_0} Code point's usage was introduced in version 11.0; See also Property 'Present_In' (684: U+0560, U+0588, U+05EF, U+07FD..07FF, U+08D3, U+09FE ...) T \p{Age: 12.0} \p{Age=V12_0} (554) \p{Age: V12_0} Code point's usage was introduced in version 12.0; See also Property 'Present_In' (554: U+0C77, U+0E86, U+0E89, U+0E8C, U+0E8E..0E93, U+0E98 ...) T \p{Age: 12.1} \p{Age=V12_1} (1) \p{Age: V12_1} Code point's usage was introduced in version 12.1; See also Property 'Present_In' (1: U+32FF) T \p{Age: 13.0} \p{Age=V13_0} (5930) \p{Age: V13_0} Code point's usage was introduced in version 13.0; See also Property 'Present_In' (5930: U+08BE..08C7, U+0B55, U+0D04, U+0D81, U+1ABF..1AC0, U+2B97 ...) 
  \p{Age: NA}              \p{Age=Unassigned} (830_606 plus all above-Unicode code points)
  \p{Age: Unassigned}      Code point's usage has not been assigned in any Unicode release thus far. (Short: \p{Age=NA}) (830_606 plus all above-Unicode code points: U+0378..0379, U+0380..0383, U+038B, U+038D, U+03A2, U+0530 ...)
  \p{Aghb}                 \p{Caucasian_Albanian} (= \p{Script_Extensions=Caucasian_Albanian}) (NOT \p{Block=Caucasian_Albanian}) (53)
  \p{AHex}                 \p{PosixXDigit} (= \p{ASCII_Hex_Digit=Y}) (22)
  \p{AHex: *}              \p{ASCII_Hex_Digit: *}
  \p{Ahom}                 \p{Script_Extensions=Ahom} (NOT \p{Block=Ahom}) (58)
X \p{Alchemical}           \p{Alchemical_Symbols} (= \p{Block=Alchemical_Symbols}) (128)
X \p{Alchemical_Symbols}   \p{Block=Alchemical_Symbols} (Short: \p{InAlchemical}) (128)
  \p{All}                  All code points, including those above Unicode. Same as qr/./s (1_114_112 plus all above-Unicode code points: U+0000..infinity)
  \p{Alnum}                \p{XPosixAlnum} (133_525)
  \p{Alpha}                \p{XPosixAlpha} (= \p{Alphabetic=Y}) (132_875)
  \p{Alpha: *}             \p{Alphabetic: *}
  \p{Alphabetic}           \p{XPosixAlpha} (= \p{Alphabetic=Y}) (132_875)
  \p{Alphabetic: N*}       (Short: \p{Alpha=N}, \P{Alpha}) (981_237 plus all above-Unicode code points: [\x00-\x20!\"#\$\%&\'\(\)*+,\-.\/0-9:;<=>?\@\[\\\]\^_`\{\|\}~\x7f-\xa9\xab-\xb4\xb6-\xb9\xbb-\xbf\xd7\xf7], U+02C2..02C5, U+02D2..02DF, U+02E5..02EB, U+02ED, U+02EF..0344 ...)
  \p{Alphabetic: Y*}       (Short: \p{Alpha=Y}, \p{Alpha}) (132_875: [A-Za-z\xaa\xb5\xba\xc0-\xd6\xd8-\xf6\xf8-\xff], U+0100..02C1, U+02C6..02D1, U+02E0..02E4, U+02EC, U+02EE ...)
X \p{Alphabetic_PF} \p{Alphabetic_Presentation_Forms} (= \p{Block=Alphabetic_Presentation_Forms}) (80) X \p{Alphabetic_Presentation_Forms} \p{Block= Alphabetic_Presentation_Forms} (Short: \p{InAlphabeticPF}) (80) \p{Anatolian_Hieroglyphs} \p{Script_Extensions= Anatolian_Hieroglyphs} (Short: \p{Hluw}; NOT \p{Block=Anatolian_Hieroglyphs}) (583) X \p{Ancient_Greek_Music} \p{Ancient_Greek_Musical_Notation} (= \p{Block= Ancient_Greek_Musical_Notation}) (80) X \p{Ancient_Greek_Musical_Notation} \p{Block= Ancient_Greek_Musical_Notation} (Short: \p{InAncientGreekMusic}) (80) X \p{Ancient_Greek_Numbers} \p{Block=Ancient_Greek_Numbers} (80) X \p{Ancient_Symbols} \p{Block=Ancient_Symbols} (64) \p{Any} All Unicode code points (1_114_112: U+0000..10FFFF) \p{Arab} \p{Arabic} (= \p{Script_Extensions= Arabic}) (NOT \p{Block=Arabic}) (1335) \p{Arabic} \p{Script_Extensions=Arabic} (Short: \p{Arab}; NOT \p{Block=Arabic}) (1335) X \p{Arabic_Ext_A} \p{Arabic_Extended_A} (= \p{Block= Arabic_Extended_A}) (96) X \p{Arabic_Extended_A} \p{Block=Arabic_Extended_A} (Short: \p{InArabicExtA}) (96) X \p{Arabic_Math} \p{Arabic_Mathematical_Alphabetic_Symbols} (= \p{Block= Arabic_Mathematical_Alphabetic_Symbols}) (256) X \p{Arabic_Mathematical_Alphabetic_Symbols} \p{Block= Arabic_Mathematical_Alphabetic_Symbols} (Short: \p{InArabicMath}) (256) X \p{Arabic_PF_A} \p{Arabic_Presentation_Forms_A} (= \p{Block=Arabic_Presentation_Forms_A}) (688) X \p{Arabic_PF_B} \p{Arabic_Presentation_Forms_B} (= \p{Block=Arabic_Presentation_Forms_B}) (144) X \p{Arabic_Presentation_Forms_A} \p{Block= Arabic_Presentation_Forms_A} (Short: \p{InArabicPFA}) (688) X \p{Arabic_Presentation_Forms_B} \p{Block= Arabic_Presentation_Forms_B} (Short: \p{InArabicPFB}) (144) X \p{Arabic_Sup} \p{Arabic_Supplement} (= \p{Block= Arabic_Supplement}) (48) X \p{Arabic_Supplement} \p{Block=Arabic_Supplement} (Short: \p{InArabicSup}) (48) \p{Armenian} \p{Script_Extensions=Armenian} (Short: \p{Armn}; NOT \p{Block=Armenian}) (96) \p{Armi} 
\p{Imperial_Aramaic} (= \p{Script_Extensions=Imperial_Aramaic}) (NOT \p{Block=Imperial_Aramaic}) (31) \p{Armn} \p{Armenian} (= \p{Script_Extensions= Armenian}) (NOT \p{Block=Armenian}) (96) X \p{Arrows} \p{Block=Arrows} (112) \p{ASCII} \p{Block=Basic_Latin} (128) \p{ASCII_Hex_Digit} \p{PosixXDigit} (= \p{ASCII_Hex_Digit=Y}) (22) \p{ASCII_Hex_Digit: N*} (Short: \p{AHex=N}, \P{AHex}) (1_114_090 plus all above-Unicode code points: [\x00-\x20!\"#\$\%&\'\(\)*+,\-.\/:;<=>? \@G-Z\[\\\]\^_`g-z\{\|\}~\x7f-\xff], U+0100..infinity) \p{ASCII_Hex_Digit: Y*} (Short: \p{AHex=Y}, \p{AHex}) (22: [0-9A- Fa-f]) \p{Assigned} All assigned code points (283_440: U+0000..0377, U+037A..037F, U+0384..038A, U+038C, U+038E..03A1, U+03A3..052F ...) \p{Avestan} \p{Script_Extensions=Avestan} (Short: \p{Avst}; NOT \p{Block=Avestan}) (61) \p{Avst} \p{Avestan} (= \p{Script_Extensions= Avestan}) (NOT \p{Block=Avestan}) (61) \p{Bali} \p{Balinese} (= \p{Script_Extensions= Balinese}) (NOT \p{Block=Balinese}) (121) \p{Balinese} \p{Script_Extensions=Balinese} (Short: \p{Bali}; NOT \p{Block=Balinese}) (121) \p{Bamu} \p{Bamum} (= \p{Script_Extensions=Bamum}) (NOT \p{Block=Bamum}) (657) \p{Bamum} \p{Script_Extensions=Bamum} (Short: \p{Bamu}; NOT \p{Block=Bamum}) (657) X \p{Bamum_Sup} \p{Bamum_Supplement} (= \p{Block= Bamum_Supplement}) (576) X \p{Bamum_Supplement} \p{Block=Bamum_Supplement} (Short: \p{InBamumSup}) (576) X \p{Basic_Latin} \p{ASCII} (= \p{Block=Basic_Latin}) (128) \p{Bass} \p{Bassa_Vah} (= \p{Script_Extensions= Bassa_Vah}) (NOT \p{Block=Bassa_Vah}) (36) \p{Bassa_Vah} \p{Script_Extensions=Bassa_Vah} (Short: \p{Bass}; NOT \p{Block=Bassa_Vah}) (36) \p{Batak} \p{Script_Extensions=Batak} (Short: \p{Batk}; NOT \p{Block=Batak}) (56) \p{Batk} \p{Batak} (= \p{Script_Extensions=Batak}) (NOT \p{Block=Batak}) (56) \p{Bc: *} \p{Bidi_Class: *} \p{Beng} \p{Bengali} (= \p{Script_Extensions= Bengali}) (NOT \p{Block=Bengali}) (113) \p{Bengali} \p{Script_Extensions=Bengali} (Short: \p{Beng}; NOT 
\p{Block=Bengali}) (113) \p{Bhaiksuki} \p{Script_Extensions=Bhaiksuki} (Short: \p{Bhks}; NOT \p{Block=Bhaiksuki}) (97) \p{Bhks} \p{Bhaiksuki} (= \p{Script_Extensions= Bhaiksuki}) (NOT \p{Block=Bhaiksuki}) (97) \p{Bidi_C} \p{Bidi_Control} (= \p{Bidi_Control=Y}) (12) \p{Bidi_C: *} \p{Bidi_Control: *} \p{Bidi_Class: AL} \p{Bidi_Class=Arabic_Letter} (1698) \p{Bidi_Class: AN} \p{Bidi_Class=Arabic_Number} (61) \p{Bidi_Class: Arabic_Letter} (Short: \p{Bc=AL}) (1698: U+0608, U+060B, U+060D, U+061B..064A, U+066D..066F, U+0671..06D5 ...) \p{Bidi_Class: Arabic_Number} (Short: \p{Bc=AN}) (61: U+0600..0605, U+0660..0669, U+066B..066C, U+06DD, U+08E2, U+10D30..10D39 ...) \p{Bidi_Class: B} \p{Bidi_Class=Paragraph_Separator} (7) \p{Bidi_Class: BN} \p{Bidi_Class=Boundary_Neutral} (4016) \p{Bidi_Class: Boundary_Neutral} (Short: \p{Bc=BN}) (4016: [^\t\n \cK\f\r\x1c-\x7e\x85\xa0-\xac\xae-\xff], U+180E, U+200B..200D, U+2060..2065, U+206A..206F, U+FDD0..FDEF ...) \p{Bidi_Class: Common_Separator} (Short: \p{Bc=CS}) (15: [,.\/: \xa0], U+060C, U+202F, U+2044, U+FE50, U+FE52 ...) \p{Bidi_Class: CS} \p{Bidi_Class=Common_Separator} (15) \p{Bidi_Class: EN} \p{Bidi_Class=European_Number} (168) \p{Bidi_Class: ES} \p{Bidi_Class=European_Separator} (12) \p{Bidi_Class: ET} \p{Bidi_Class=European_Terminator} (92) \p{Bidi_Class: European_Number} (Short: \p{Bc=EN}) (168: [0-9\xb2- \xb3\xb9], U+06F0..06F9, U+2070, U+2074..2079, U+2080..2089, U+2488..249B ...) \p{Bidi_Class: European_Separator} (Short: \p{Bc=ES}) (12: [+\-], U+207A..207B, U+208A..208B, U+2212, U+FB29, U+FE62..FE63 ...) \p{Bidi_Class: European_Terminator} (Short: \p{Bc=ET}) (92: [#\$ \%\xa2-\xa5\xb0-\xb1], U+058F, U+0609..060A, U+066A, U+09F2..09F3, U+09FB ...) 
\p{Bidi_Class: First_Strong_Isolate} (Short: \p{Bc=FSI}) (1: U+2068) \p{Bidi_Class: FSI} \p{Bidi_Class=First_Strong_Isolate} (1) \p{Bidi_Class: L} \p{Bidi_Class=Left_To_Right} (1_096_473 plus all above-Unicode code points) \p{Bidi_Class: Left_To_Right} (Short: \p{Bc=L}) (1_096_473 plus all above-Unicode code points: [A-Za-z \xaa\xb5\xba\xc0-\xd6\xd8-\xf6\xf8- \xff], U+0100..02B8, U+02BB..02C1, U+02D0..02D1, U+02E0..02E4, U+02EE ...) \p{Bidi_Class: Left_To_Right_Embedding} (Short: \p{Bc=LRE}) (1: U+202A) \p{Bidi_Class: Left_To_Right_Isolate} (Short: \p{Bc=LRI}) (1: U+2066) \p{Bidi_Class: Left_To_Right_Override} (Short: \p{Bc=LRO}) (1: U+202D) \p{Bidi_Class: LRE} \p{Bidi_Class=Left_To_Right_Embedding} (1) \p{Bidi_Class: LRI} \p{Bidi_Class=Left_To_Right_Isolate} (1) \p{Bidi_Class: LRO} \p{Bidi_Class=Left_To_Right_Override} (1) \p{Bidi_Class: Nonspacing_Mark} (Short: \p{Bc=NSM}) (1847: U+0300..036F, U+0483..0489, U+0591..05BD, U+05BF, U+05C1..05C2, U+05C4..05C5 ...) \p{Bidi_Class: NSM} \p{Bidi_Class=Nonspacing_Mark} (1847) \p{Bidi_Class: ON} \p{Bidi_Class=Other_Neutral} (5931) \p{Bidi_Class: Other_Neutral} (Short: \p{Bc=ON}) (5931: [!\"&\' \(\)*;<=>?\@\[\\\]\^_`\{\|\}~\xa1\xa6- \xa9\xab-\xac\xae-\xaf\xb4\xb6-\xb8\xbb- \xbf\xd7\xf7], U+02B9..02BA, U+02C2..02CF, U+02D2..02DF, U+02E5..02ED, U+02EF..02FF ...) \p{Bidi_Class: Paragraph_Separator} (Short: \p{Bc=B}) (7: [\n\r \x1c-\x1e\x85], U+2029) \p{Bidi_Class: PDF} \p{Bidi_Class=Pop_Directional_Format} (1) \p{Bidi_Class: PDI} \p{Bidi_Class=Pop_Directional_Isolate} (1) \p{Bidi_Class: Pop_Directional_Format} (Short: \p{Bc=PDF}) (1: U+202C) \p{Bidi_Class: Pop_Directional_Isolate} (Short: \p{Bc=PDI}) (1: U+2069) \p{Bidi_Class: R} \p{Bidi_Class=Right_To_Left} (3763) \p{Bidi_Class: Right_To_Left} (Short: \p{Bc=R}) (3763: U+0590, U+05BE, U+05C0, U+05C3, U+05C6, U+05C8..05FF ...) 
\p{Bidi_Class: Right_To_Left_Embedding} (Short: \p{Bc=RLE}) (1: U+202B) \p{Bidi_Class: Right_To_Left_Isolate} (Short: \p{Bc=RLI}) (1: U+2067) \p{Bidi_Class: Right_To_Left_Override} (Short: \p{Bc=RLO}) (1: U+202E) \p{Bidi_Class: RLE} \p{Bidi_Class=Right_To_Left_Embedding} (1) \p{Bidi_Class: RLI} \p{Bidi_Class=Right_To_Left_Isolate} (1) \p{Bidi_Class: RLO} \p{Bidi_Class=Right_To_Left_Override} (1) \p{Bidi_Class: S} \p{Bidi_Class=Segment_Separator} (3) \p{Bidi_Class: Segment_Separator} (Short: \p{Bc=S}) (3: [\t\cK \x1f]) \p{Bidi_Class: White_Space} (Short: \p{Bc=WS}) (17: [\f\x20], U+1680, U+2000..200A, U+2028, U+205F, U+3000) \p{Bidi_Class: WS} \p{Bidi_Class=White_Space} (17) \p{Bidi_Control} \p{Bidi_Control=Y} (Short: \p{BidiC}) (12) \p{Bidi_Control: N*} (Short: \p{BidiC=N}, \P{BidiC}) (1_114_100 plus all above-Unicode code points: U+0000..061B, U+061D..200D, U+2010..2029, U+202F..2065, U+206A..infinity) \p{Bidi_Control: Y*} (Short: \p{BidiC=Y}, \p{BidiC}) (12: U+061C, U+200E..200F, U+202A..202E, U+2066..2069) \p{Bidi_M} \p{Bidi_Mirrored} (= \p{Bidi_Mirrored=Y}) (545) \p{Bidi_M: *} \p{Bidi_Mirrored: *} \p{Bidi_Mirrored} \p{Bidi_Mirrored=Y} (Short: \p{BidiM}) (545) \p{Bidi_Mirrored: N*} (Short: \p{BidiM=N}, \P{BidiM}) (1_113_567 plus all above-Unicode code points: [\x00-\x20!\"#\$\%&\'*+,\-.\/0-9:;=?\@A- Z\\\^_`a-z\|~\x7f-\xaa\xac-\xba\xbc- \xff], U+0100..0F39, U+0F3E..169A, U+169D..2038, U+203B..2044, U+2047..207C ...) \p{Bidi_Mirrored: Y*} (Short: \p{BidiM=Y}, \p{BidiM}) (545: [\(\)<>\[\]\{\}\xab\xbb], U+0F3A..0F3D, U+169B..169C, U+2039..203A, U+2045..2046, U+207D..207E ...) \p{Bidi_Paired_Bracket_Type: C} \p{Bidi_Paired_Bracket_Type=Close} (60) \p{Bidi_Paired_Bracket_Type: Close} (Short: \p{Bpt=C}) (60: [\)\] \}], U+0F3B, U+0F3D, U+169C, U+2046, U+207E ...) 
\p{Bidi_Paired_Bracket_Type: N} \p{Bidi_Paired_Bracket_Type=None} (1_113_992 plus all above-Unicode code points) \p{Bidi_Paired_Bracket_Type: None} (Short: \p{Bpt=N}) (1_113_992 plus all above-Unicode code points: [\x00-\x20!\"#\$\%&\'*+,\-.\/0-9:;<=>? \@A-Z\\\^_`a-z\|~\x7f-\xff], U+0100..0F39, U+0F3E..169A, U+169D..2044, U+2047..207C, U+207F..208C ...) \p{Bidi_Paired_Bracket_Type: O} \p{Bidi_Paired_Bracket_Type=Open} (60) \p{Bidi_Paired_Bracket_Type: Open} (Short: \p{Bpt=O}) (60: [\(\[\{], U+0F3A, U+0F3C, U+169B, U+2045, U+207D ...) \p{Blank} \p{XPosixBlank} (18) \p{Blk: *} \p{Block: *} \p{Block: Adlam} (NOT \p{Adlam} NOR \p{Is_Adlam}) (96: U+1E900..1E95F) \p{Block: Aegean_Numbers} (64: U+10100..1013F) \p{Block: Ahom} (NOT \p{Ahom} NOR \p{Is_Ahom}) (64: U+11700..1173F) \p{Block: Alchemical} \p{Block=Alchemical_Symbols} (128) \p{Block: Alchemical_Symbols} (Short: \p{Blk=Alchemical}) (128: U+1F700..1F77F) \p{Block: Alphabetic_PF} \p{Block=Alphabetic_Presentation_Forms} (80) \p{Block: Alphabetic_Presentation_Forms} (Short: \p{Blk= AlphabeticPF}) (80: U+FB00..FB4F) \p{Block: Anatolian_Hieroglyphs} (NOT \p{Anatolian_Hieroglyphs} NOR \p{Is_Anatolian_Hieroglyphs}) (640: U+14400..1467F) \p{Block: Ancient_Greek_Music} \p{Block= Ancient_Greek_Musical_Notation} (80) \p{Block: Ancient_Greek_Musical_Notation} (Short: \p{Blk= AncientGreekMusic}) (80: U+1D200..1D24F) \p{Block: Ancient_Greek_Numbers} (80: U+10140..1018F) \p{Block: Ancient_Symbols} (64: U+10190..101CF) \p{Block: Arabic} (NOT \p{Arabic} NOR \p{Is_Arabic}) (256: U+0600..06FF) \p{Block: Arabic_Ext_A} \p{Block=Arabic_Extended_A} (96) \p{Block: Arabic_Extended_A} (Short: \p{Blk=ArabicExtA}) (96: U+08A0..08FF) \p{Block: Arabic_Math} \p{Block= Arabic_Mathematical_Alphabetic_Symbols} (256) \p{Block: Arabic_Mathematical_Alphabetic_Symbols} (Short: \p{Blk= ArabicMath}) (256: U+1EE00..1EEFF) \p{Block: Arabic_PF_A} \p{Block=Arabic_Presentation_Forms_A} (688) \p{Block: Arabic_PF_B} \p{Block=Arabic_Presentation_Forms_B} (144) 
\p{Block: Arabic_Presentation_Forms_A} (Short: \p{Blk=ArabicPFA}) (688: U+FB50..FDFF) \p{Block: Arabic_Presentation_Forms_B} (Short: \p{Blk=ArabicPFB}) (144: U+FE70..FEFF) \p{Block: Arabic_Sup} \p{Block=Arabic_Supplement} (48) \p{Block: Arabic_Supplement} (Short: \p{Blk=ArabicSup}) (48: U+0750..077F) \p{Block: Armenian} (NOT \p{Armenian} NOR \p{Is_Armenian}) (96: U+0530..058F) \p{Block: Arrows} (112: U+2190..21FF) \p{Block: ASCII} \p{Block=Basic_Latin} (128) \p{Block: Avestan} (NOT \p{Avestan} NOR \p{Is_Avestan}) (64: U+10B00..10B3F) \p{Block: Balinese} (NOT \p{Balinese} NOR \p{Is_Balinese}) (128: U+1B00..1B7F) \p{Block: Bamum} (NOT \p{Bamum} NOR \p{Is_Bamum}) (96: U+A6A0..A6FF) \p{Block: Bamum_Sup} \p{Block=Bamum_Supplement} (576) \p{Block: Bamum_Supplement} (Short: \p{Blk=BamumSup}) (576: U+16800..16A3F) \p{Block: Basic_Latin} (Short: \p{Blk=ASCII}) (128: [\x00-\x7f]) \p{Block: Bassa_Vah} (NOT \p{Bassa_Vah} NOR \p{Is_Bassa_Vah}) (48: U+16AD0..16AFF) \p{Block: Batak} (NOT \p{Batak} NOR \p{Is_Batak}) (64: U+1BC0..1BFF) \p{Block: Bengali} (NOT \p{Bengali} NOR \p{Is_Bengali}) (128: U+0980..09FF) \p{Block: Bhaiksuki} (NOT \p{Bhaiksuki} NOR \p{Is_Bhaiksuki}) (112: U+11C00..11C6F) \p{Block: Block_Elements} (32: U+2580..259F) \p{Block: Bopomofo} (NOT \p{Bopomofo} NOR \p{Is_Bopomofo}) (48: U+3100..312F) \p{Block: Bopomofo_Ext} \p{Block=Bopomofo_Extended} (32) \p{Block: Bopomofo_Extended} (Short: \p{Blk=BopomofoExt}) (32: U+31A0..31BF) \p{Block: Box_Drawing} (128: U+2500..257F) \p{Block: Brahmi} (NOT \p{Brahmi} NOR \p{Is_Brahmi}) (128: U+11000..1107F) \p{Block: Braille} \p{Block=Braille_Patterns} (256) \p{Block: Braille_Patterns} (Short: \p{Blk=Braille}) (256: U+2800..28FF) \p{Block: Buginese} (NOT \p{Buginese} NOR \p{Is_Buginese}) (32: U+1A00..1A1F) \p{Block: Buhid} (NOT \p{Buhid} NOR \p{Is_Buhid}) (32: U+1740..175F) \p{Block: Byzantine_Music} \p{Block=Byzantine_Musical_Symbols} (256) \p{Block: Byzantine_Musical_Symbols} (Short: \p{Blk= ByzantineMusic}) (256: 
U+1D000..1D0FF) \p{Block: Canadian_Syllabics} \p{Block= Unified_Canadian_Aboriginal_Syllabics} (640) \p{Block: Carian} (NOT \p{Carian} NOR \p{Is_Carian}) (64: U+102A0..102DF) \p{Block: Caucasian_Albanian} (NOT \p{Caucasian_Albanian} NOR \p{Is_Caucasian_Albanian}) (64: U+10530..1056F) \p{Block: Chakma} (NOT \p{Chakma} NOR \p{Is_Chakma}) (80: U+11100..1114F) \p{Block: Cham} (NOT \p{Cham} NOR \p{Is_Cham}) (96: U+AA00..AA5F) \p{Block: Cherokee} (NOT \p{Cherokee} NOR \p{Is_Cherokee}) (96: U+13A0..13FF) \p{Block: Cherokee_Sup} \p{Block=Cherokee_Supplement} (80) \p{Block: Cherokee_Supplement} (Short: \p{Blk=CherokeeSup}) (80: U+AB70..ABBF) \p{Block: Chess_Symbols} (112: U+1FA00..1FA6F) \p{Block: Chorasmian} (NOT \p{Chorasmian} NOR \p{Is_Chorasmian}) (48: U+10FB0..10FDF) \p{Block: CJK} \p{Block=CJK_Unified_Ideographs} (20_992) \p{Block: CJK_Compat} \p{Block=CJK_Compatibility} (256) \p{Block: CJK_Compat_Forms} \p{Block=CJK_Compatibility_Forms} (32) \p{Block: CJK_Compat_Ideographs} \p{Block= CJK_Compatibility_Ideographs} (512) \p{Block: CJK_Compat_Ideographs_Sup} \p{Block= CJK_Compatibility_Ideographs_Supplement} (544) \p{Block: CJK_Compatibility} (Short: \p{Blk=CJKCompat}) (256: U+3300..33FF) \p{Block: CJK_Compatibility_Forms} (Short: \p{Blk=CJKCompatForms}) (32: U+FE30..FE4F) \p{Block: CJK_Compatibility_Ideographs} (Short: \p{Blk= CJKCompatIdeographs}) (512: U+F900..FAFF) \p{Block: CJK_Compatibility_Ideographs_Supplement} (Short: \p{Blk= CJKCompatIdeographsSup}) (544: U+2F800..2FA1F) \p{Block: CJK_Ext_A} \p{Block= CJK_Unified_Ideographs_Extension_A} (6592) \p{Block: CJK_Ext_B} \p{Block= CJK_Unified_Ideographs_Extension_B} (42_720) \p{Block: CJK_Ext_C} \p{Block= CJK_Unified_Ideographs_Extension_C} (4160) \p{Block: CJK_Ext_D} \p{Block= CJK_Unified_Ideographs_Extension_D} (224) \p{Block: CJK_Ext_E} \p{Block= CJK_Unified_Ideographs_Extension_E} (5776) \p{Block: CJK_Ext_F} \p{Block= CJK_Unified_Ideographs_Extension_F} (7488) \p{Block: CJK_Ext_G} \p{Block= 
CJK_Unified_Ideographs_Extension_G} (4944) \p{Block: CJK_Radicals_Sup} \p{Block=CJK_Radicals_Supplement} (128) \p{Block: CJK_Radicals_Supplement} (Short: \p{Blk=CJKRadicalsSup}) (128: U+2E80..2EFF) \p{Block: CJK_Strokes} (48: U+31C0..31EF) \p{Block: CJK_Symbols} \p{Block=CJK_Symbols_And_Punctuation} (64) \p{Block: CJK_Symbols_And_Punctuation} (Short: \p{Blk=CJKSymbols}) (64: U+3000..303F) \p{Block: CJK_Unified_Ideographs} (Short: \p{Blk=CJK}) (20_992: U+4E00..9FFF) \p{Block: CJK_Unified_Ideographs_Extension_A} (Short: \p{Blk= CJKExtA}) (6592: U+3400..4DBF) \p{Block: CJK_Unified_Ideographs_Extension_B} (Short: \p{Blk= CJKExtB}) (42_720: U+20000..2A6DF) \p{Block: CJK_Unified_Ideographs_Extension_C} (Short: \p{Blk= CJKExtC}) (4160: U+2A700..2B73F) \p{Block: CJK_Unified_Ideographs_Extension_D} (Short: \p{Blk= CJKExtD}) (224: U+2B740..2B81F) \p{Block: CJK_Unified_Ideographs_Extension_E} (Short: \p{Blk= CJKExtE}) (5776: U+2B820..2CEAF) \p{Block: CJK_Unified_Ideographs_Extension_F} (Short: \p{Blk= CJKExtF}) (7488: U+2CEB0..2EBEF) \p{Block: CJK_Unified_Ideographs_Extension_G} (Short: \p{Blk= CJKExtG}) (4944: U+30000..3134F) \p{Block: Combining_Diacritical_Marks} (Short: \p{Blk= Diacriticals}) (112: U+0300..036F) \p{Block: Combining_Diacritical_Marks_Extended} (Short: \p{Blk= DiacriticalsExt}) (80: U+1AB0..1AFF) \p{Block: Combining_Diacritical_Marks_For_Symbols} (Short: \p{Blk= DiacriticalsForSymbols}) (48: U+20D0..20FF) \p{Block: Combining_Diacritical_Marks_Supplement} (Short: \p{Blk= DiacriticalsSup}) (64: U+1DC0..1DFF) \p{Block: Combining_Half_Marks} (Short: \p{Blk=HalfMarks}) (16: U+FE20..FE2F) \p{Block: Combining_Marks_For_Symbols} \p{Block= Combining_Diacritical_Marks_For_Symbols} (48) \p{Block: Common_Indic_Number_Forms} (Short: \p{Blk= IndicNumberForms}) (16: U+A830..A83F) \p{Block: Compat_Jamo} \p{Block=Hangul_Compatibility_Jamo} (96) \p{Block: Control_Pictures} (64: U+2400..243F) \p{Block: Coptic} (NOT \p{Coptic} NOR \p{Is_Coptic}) (128: U+2C80..2CFF) \p{Block: 
Coptic_Epact_Numbers} (32: U+102E0..102FF) \p{Block: Counting_Rod} \p{Block=Counting_Rod_Numerals} (32) \p{Block: Counting_Rod_Numerals} (Short: \p{Blk=CountingRod}) (32: U+1D360..1D37F) \p{Block: Cuneiform} (NOT \p{Cuneiform} NOR \p{Is_Cuneiform}) (1024: U+12000..123FF) \p{Block: Cuneiform_Numbers} \p{Block= Cuneiform_Numbers_And_Punctuation} (128) \p{Block: Cuneiform_Numbers_And_Punctuation} (Short: \p{Blk= CuneiformNumbers}) (128: U+12400..1247F) \p{Block: Currency_Symbols} (48: U+20A0..20CF) \p{Block: Cypriot_Syllabary} (64: U+10800..1083F) \p{Block: Cyrillic} (NOT \p{Cyrillic} NOR \p{Is_Cyrillic}) (256: U+0400..04FF) \p{Block: Cyrillic_Ext_A} \p{Block=Cyrillic_Extended_A} (32) \p{Block: Cyrillic_Ext_B} \p{Block=Cyrillic_Extended_B} (96) \p{Block: Cyrillic_Ext_C} \p{Block=Cyrillic_Extended_C} (16) \p{Block: Cyrillic_Extended_A} (Short: \p{Blk=CyrillicExtA}) (32: U+2DE0..2DFF) \p{Block: Cyrillic_Extended_B} (Short: \p{Blk=CyrillicExtB}) (96: U+A640..A69F) \p{Block: Cyrillic_Extended_C} (Short: \p{Blk=CyrillicExtC}) (16: U+1C80..1C8F) \p{Block: Cyrillic_Sup} \p{Block=Cyrillic_Supplement} (48) \p{Block: Cyrillic_Supplement} (Short: \p{Blk=CyrillicSup}) (48: U+0500..052F) \p{Block: Cyrillic_Supplementary} \p{Block=Cyrillic_Supplement} (48) \p{Block: Deseret} (80: U+10400..1044F) \p{Block: Devanagari} (NOT \p{Devanagari} NOR \p{Is_Devanagari}) (128: U+0900..097F) \p{Block: Devanagari_Ext} \p{Block=Devanagari_Extended} (32) \p{Block: Devanagari_Extended} (Short: \p{Blk=DevanagariExt}) (32: U+A8E0..A8FF) \p{Block: Diacriticals} \p{Block=Combining_Diacritical_Marks} (112) \p{Block: Diacriticals_Ext} \p{Block= Combining_Diacritical_Marks_Extended} (80) \p{Block: Diacriticals_For_Symbols} \p{Block= Combining_Diacritical_Marks_For_Symbols} (48) \p{Block: Diacriticals_Sup} \p{Block= Combining_Diacritical_Marks_Supplement} (64) \p{Block: Dingbats} (192: U+2700..27BF) \p{Block: Dives_Akuru} (NOT \p{Dives_Akuru} NOR \p{Is_Dives_Akuru}) (96: U+11900..1195F) \p{Block: Dogra} 
(NOT \p{Dogra} NOR \p{Is_Dogra}) (80: U+11800..1184F) \p{Block: Domino} \p{Block=Domino_Tiles} (112) \p{Block: Domino_Tiles} (Short: \p{Blk=Domino}) (112: U+1F030..1F09F) \p{Block: Duployan} (NOT \p{Duployan} NOR \p{Is_Duployan}) (160: U+1BC00..1BC9F) \p{Block: Early_Dynastic_Cuneiform} (208: U+12480..1254F) \p{Block: Egyptian_Hieroglyph_Format_Controls} (16: U+13430..1343F) \p{Block: Egyptian_Hieroglyphs} (NOT \p{Egyptian_Hieroglyphs} NOR \p{Is_Egyptian_Hieroglyphs}) (1072: U+13000..1342F) \p{Block: Elbasan} (NOT \p{Elbasan} NOR \p{Is_Elbasan}) (48: U+10500..1052F) \p{Block: Elymaic} (NOT \p{Elymaic} NOR \p{Is_Elymaic}) (32: U+10FE0..10FFF) \p{Block: Emoticons} (80: U+1F600..1F64F) \p{Block: Enclosed_Alphanum} \p{Block=Enclosed_Alphanumerics} (160) \p{Block: Enclosed_Alphanum_Sup} \p{Block= Enclosed_Alphanumeric_Supplement} (256) \p{Block: Enclosed_Alphanumeric_Supplement} (Short: \p{Blk= EnclosedAlphanumSup}) (256: U+1F100..1F1FF) \p{Block: Enclosed_Alphanumerics} (Short: \p{Blk= EnclosedAlphanum}) (160: U+2460..24FF) \p{Block: Enclosed_CJK} \p{Block=Enclosed_CJK_Letters_And_Months} (256) \p{Block: Enclosed_CJK_Letters_And_Months} (Short: \p{Blk= EnclosedCJK}) (256: U+3200..32FF) \p{Block: Enclosed_Ideographic_Sup} \p{Block= Enclosed_Ideographic_Supplement} (256) \p{Block: Enclosed_Ideographic_Supplement} (Short: \p{Blk= EnclosedIdeographicSup}) (256: U+1F200..1F2FF) \p{Block: Ethiopic} (NOT \p{Ethiopic} NOR \p{Is_Ethiopic}) (384: U+1200..137F) \p{Block: Ethiopic_Ext} \p{Block=Ethiopic_Extended} (96) \p{Block: Ethiopic_Ext_A} \p{Block=Ethiopic_Extended_A} (48) \p{Block: Ethiopic_Extended} (Short: \p{Blk=EthiopicExt}) (96: U+2D80..2DDF) \p{Block: Ethiopic_Extended_A} (Short: \p{Blk=EthiopicExtA}) (48: U+AB00..AB2F) \p{Block: Ethiopic_Sup} \p{Block=Ethiopic_Supplement} (32) \p{Block: Ethiopic_Supplement} (Short: \p{Blk=EthiopicSup}) (32: U+1380..139F) \p{Block: General_Punctuation} (Short: \p{Blk=Punctuation}; NOT \p{Punct} NOR \p{Is_Punctuation}) (112: 
U+2000..206F) \p{Block: Geometric_Shapes} (96: U+25A0..25FF) \p{Block: Geometric_Shapes_Ext} \p{Block= Geometric_Shapes_Extended} (128) \p{Block: Geometric_Shapes_Extended} (Short: \p{Blk= GeometricShapesExt}) (128: U+1F780..1F7FF) \p{Block: Georgian} (NOT \p{Georgian} NOR \p{Is_Georgian}) (96: U+10A0..10FF) \p{Block: Georgian_Ext} \p{Block=Georgian_Extended} (48) \p{Block: Georgian_Extended} (Short: \p{Blk=GeorgianExt}) (48: U+1C90..1CBF) \p{Block: Georgian_Sup} \p{Block=Georgian_Supplement} (48) \p{Block: Georgian_Supplement} (Short: \p{Blk=GeorgianSup}) (48: U+2D00..2D2F) \p{Block: Glagolitic} (NOT \p{Glagolitic} NOR \p{Is_Glagolitic}) (96: U+2C00..2C5F) \p{Block: Glagolitic_Sup} \p{Block=Glagolitic_Supplement} (48) \p{Block: Glagolitic_Supplement} (Short: \p{Blk=GlagoliticSup}) (48: U+1E000..1E02F) \p{Block: Gothic} (NOT \p{Gothic} NOR \p{Is_Gothic}) (32: U+10330..1034F) \p{Block: Grantha} (NOT \p{Grantha} NOR \p{Is_Grantha}) (128: U+11300..1137F) \p{Block: Greek} \p{Block=Greek_And_Coptic} (NOT \p{Greek} NOR \p{Is_Greek}) (144) \p{Block: Greek_And_Coptic} (Short: \p{Blk=Greek}; NOT \p{Greek} NOR \p{Is_Greek}) (144: U+0370..03FF) \p{Block: Greek_Ext} \p{Block=Greek_Extended} (256) \p{Block: Greek_Extended} (Short: \p{Blk=GreekExt}) (256: U+1F00..1FFF) \p{Block: Gujarati} (NOT \p{Gujarati} NOR \p{Is_Gujarati}) (128: U+0A80..0AFF) \p{Block: Gunjala_Gondi} (NOT \p{Gunjala_Gondi} NOR \p{Is_Gunjala_Gondi}) (80: U+11D60..11DAF) \p{Block: Gurmukhi} (NOT \p{Gurmukhi} NOR \p{Is_Gurmukhi}) (128: U+0A00..0A7F) \p{Block: Half_And_Full_Forms} \p{Block= Halfwidth_And_Fullwidth_Forms} (240) \p{Block: Half_Marks} \p{Block=Combining_Half_Marks} (16) \p{Block: Halfwidth_And_Fullwidth_Forms} (Short: \p{Blk= HalfAndFullForms}) (240: U+FF00..FFEF) \p{Block: Hangul} \p{Block=Hangul_Syllables} (NOT \p{Hangul} NOR \p{Is_Hangul}) (11_184) \p{Block: Hangul_Compatibility_Jamo} (Short: \p{Blk=CompatJamo}) (96: U+3130..318F) \p{Block: Hangul_Jamo} (Short: \p{Blk=Jamo}) (256: U+1100..11FF) 
\p{Block: Hangul_Jamo_Extended_A} (Short: \p{Blk=JamoExtA}) (32: U+A960..A97F) \p{Block: Hangul_Jamo_Extended_B} (Short: \p{Blk=JamoExtB}) (80: U+D7B0..D7FF) \p{Block: Hangul_Syllables} (Short: \p{Blk=Hangul}; NOT \p{Hangul} NOR \p{Is_Hangul}) (11_184: U+AC00..D7AF) \p{Block: Hanifi_Rohingya} (NOT \p{Hanifi_Rohingya} NOR \p{Is_Hanifi_Rohingya}) (64: U+10D00..10D3F) \p{Block: Hanunoo} (NOT \p{Hanunoo} NOR \p{Is_Hanunoo}) (32: U+1720..173F) \p{Block: Hatran} (NOT \p{Hatran} NOR \p{Is_Hatran}) (32: U+108E0..108FF) \p{Block: Hebrew} (NOT \p{Hebrew} NOR \p{Is_Hebrew}) (112: U+0590..05FF) \p{Block: High_Private_Use_Surrogates} (Short: \p{Blk= HighPUSurrogates}) (128: U+DB80..DBFF) \p{Block: High_PU_Surrogates} \p{Block= High_Private_Use_Surrogates} (128) \p{Block: High_Surrogates} (896: U+D800..DB7F) \p{Block: Hiragana} (NOT \p{Hiragana} NOR \p{Is_Hiragana}) (96: U+3040..309F) \p{Block: IDC} \p{Block= Ideographic_Description_Characters} (NOT \p{ID_Continue} NOR \p{Is_IDC}) (16) \p{Block: Ideographic_Description_Characters} (Short: \p{Blk=IDC}; NOT \p{ID_Continue} NOR \p{Is_IDC}) (16: U+2FF0..2FFF) \p{Block: Ideographic_Symbols} \p{Block= Ideographic_Symbols_And_Punctuation} (32) \p{Block: Ideographic_Symbols_And_Punctuation} (Short: \p{Blk= IdeographicSymbols}) (32: U+16FE0..16FFF) \p{Block: Imperial_Aramaic} (NOT \p{Imperial_Aramaic} NOR \p{Is_Imperial_Aramaic}) (32: U+10840..1085F) \p{Block: Indic_Number_Forms} \p{Block=Common_Indic_Number_Forms} (16) \p{Block: Indic_Siyaq_Numbers} (80: U+1EC70..1ECBF) \p{Block: Inscriptional_Pahlavi} (NOT \p{Inscriptional_Pahlavi} NOR \p{Is_Inscriptional_Pahlavi}) (32: U+10B60..10B7F) \p{Block: Inscriptional_Parthian} (NOT \p{Inscriptional_Parthian} NOR \p{Is_Inscriptional_Parthian}) (32: U+10B40..10B5F) \p{Block: IPA_Ext} \p{Block=IPA_Extensions} (96) \p{Block: IPA_Extensions} (Short: \p{Blk=IPAExt}) (96: U+0250..02AF) \p{Block: Jamo} \p{Block=Hangul_Jamo} (256) \p{Block: Jamo_Ext_A} \p{Block=Hangul_Jamo_Extended_A} (32) \p{Block: 
Jamo_Ext_B} \p{Block=Hangul_Jamo_Extended_B} (80) \p{Block: Javanese} (NOT \p{Javanese} NOR \p{Is_Javanese}) (96: U+A980..A9DF) \p{Block: Kaithi} (NOT \p{Kaithi} NOR \p{Is_Kaithi}) (80: U+11080..110CF) \p{Block: Kana_Ext_A} \p{Block=Kana_Extended_A} (48) \p{Block: Kana_Extended_A} (Short: \p{Blk=KanaExtA}) (48: U+1B100..1B12F) \p{Block: Kana_Sup} \p{Block=Kana_Supplement} (256) \p{Block: Kana_Supplement} (Short: \p{Blk=KanaSup}) (256: U+1B000..1B0FF) \p{Block: Kanbun} (16: U+3190..319F) \p{Block: Kangxi} \p{Block=Kangxi_Radicals} (224) \p{Block: Kangxi_Radicals} (Short: \p{Blk=Kangxi}) (224: U+2F00..2FDF) \p{Block: Kannada} (NOT \p{Kannada} NOR \p{Is_Kannada}) (128: U+0C80..0CFF) \p{Block: Katakana} (NOT \p{Katakana} NOR \p{Is_Katakana}) (96: U+30A0..30FF) \p{Block: Katakana_Ext} \p{Block=Katakana_Phonetic_Extensions} (16) \p{Block: Katakana_Phonetic_Extensions} (Short: \p{Blk= KatakanaExt}) (16: U+31F0..31FF) \p{Block: Kayah_Li} (48: U+A900..A92F) \p{Block: Kharoshthi} (NOT \p{Kharoshthi} NOR \p{Is_Kharoshthi}) (96: U+10A00..10A5F) \p{Block: Khitan_Small_Script} (NOT \p{Khitan_Small_Script} NOR \p{Is_Khitan_Small_Script}) (512: U+18B00..18CFF) \p{Block: Khmer} (NOT \p{Khmer} NOR \p{Is_Khmer}) (128: U+1780..17FF) \p{Block: Khmer_Symbols} (32: U+19E0..19FF) \p{Block: Khojki} (NOT \p{Khojki} NOR \p{Is_Khojki}) (80: U+11200..1124F) \p{Block: Khudawadi} (NOT \p{Khudawadi} NOR \p{Is_Khudawadi}) (80: U+112B0..112FF) \p{Block: Lao} (NOT \p{Lao} NOR \p{Is_Lao}) (128: U+0E80..0EFF) \p{Block: Latin_1} \p{Block=Latin_1_Supplement} (128) \p{Block: Latin_1_Sup} \p{Block=Latin_1_Supplement} (128) \p{Block: Latin_1_Supplement} (Short: \p{Blk=Latin1}) (128: [\x80- \xff]) \p{Block: Latin_Ext_A} \p{Block=Latin_Extended_A} (128) \p{Block: Latin_Ext_Additional} \p{Block= Latin_Extended_Additional} (256) \p{Block: Latin_Ext_B} \p{Block=Latin_Extended_B} (208) \p{Block: Latin_Ext_C} \p{Block=Latin_Extended_C} (32) \p{Block: Latin_Ext_D} \p{Block=Latin_Extended_D} (224) \p{Block: 
Latin_Ext_E} \p{Block=Latin_Extended_E} (64) \p{Block: Latin_Extended_A} (Short: \p{Blk=LatinExtA}) (128: U+0100..017F) \p{Block: Latin_Extended_Additional} (Short: \p{Blk= LatinExtAdditional}) (256: U+1E00..1EFF) \p{Block: Latin_Extended_B} (Short: \p{Blk=LatinExtB}) (208: U+0180..024F) \p{Block: Latin_Extended_C} (Short: \p{Blk=LatinExtC}) (32: U+2C60..2C7F) \p{Block: Latin_Extended_D} (Short: \p{Blk=LatinExtD}) (224: U+A720..A7FF) \p{Block: Latin_Extended_E} (Short: \p{Blk=LatinExtE}) (64: U+AB30..AB6F) \p{Block: Lepcha} (NOT \p{Lepcha} NOR \p{Is_Lepcha}) (80: U+1C00..1C4F) \p{Block: Letterlike_Symbols} (80: U+2100..214F) \p{Block: Limbu} (NOT \p{Limbu} NOR \p{Is_Limbu}) (80: U+1900..194F) \p{Block: Linear_A} (NOT \p{Linear_A} NOR \p{Is_Linear_A}) (384: U+10600..1077F) \p{Block: Linear_B_Ideograms} (128: U+10080..100FF) \p{Block: Linear_B_Syllabary} (128: U+10000..1007F) \p{Block: Lisu} (NOT \p{Lisu} NOR \p{Is_Lisu}) (48: U+A4D0..A4FF) \p{Block: Lisu_Sup} \p{Block=Lisu_Supplement} (16) \p{Block: Lisu_Supplement} (Short: \p{Blk=LisuSup}) (16: U+11FB0..11FBF) \p{Block: Low_Surrogates} (1024: U+DC00..DFFF) \p{Block: Lycian} (NOT \p{Lycian} NOR \p{Is_Lycian}) (32: U+10280..1029F) \p{Block: Lydian} (NOT \p{Lydian} NOR \p{Is_Lydian}) (32: U+10920..1093F) \p{Block: Mahajani} (NOT \p{Mahajani} NOR \p{Is_Mahajani}) (48: U+11150..1117F) \p{Block: Mahjong} \p{Block=Mahjong_Tiles} (48) \p{Block: Mahjong_Tiles} (Short: \p{Blk=Mahjong}) (48: U+1F000..1F02F) \p{Block: Makasar} (NOT \p{Makasar} NOR \p{Is_Makasar}) (32: U+11EE0..11EFF) \p{Block: Malayalam} (NOT \p{Malayalam} NOR \p{Is_Malayalam}) (128: U+0D00..0D7F) \p{Block: Mandaic} (NOT \p{Mandaic} NOR \p{Is_Mandaic}) (32: U+0840..085F) \p{Block: Manichaean} (NOT \p{Manichaean} NOR \p{Is_Manichaean}) (64: U+10AC0..10AFF) \p{Block: Marchen} (NOT \p{Marchen} NOR \p{Is_Marchen}) (80: U+11C70..11CBF) \p{Block: Masaram_Gondi} (NOT \p{Masaram_Gondi} NOR \p{Is_Masaram_Gondi}) (96: U+11D00..11D5F) \p{Block: Math_Alphanum} \p{Block= 
Mathematical_Alphanumeric_Symbols} (1024) \p{Block: Math_Operators} \p{Block=Mathematical_Operators} (256) \p{Block: Mathematical_Alphanumeric_Symbols} (Short: \p{Blk= MathAlphanum}) (1024: U+1D400..1D7FF) \p{Block: Mathematical_Operators} (Short: \p{Blk=MathOperators}) (256: U+2200..22FF) \p{Block: Mayan_Numerals} (32: U+1D2E0..1D2FF) \p{Block: Medefaidrin} (NOT \p{Medefaidrin} NOR \p{Is_Medefaidrin}) (96: U+16E40..16E9F) \p{Block: Meetei_Mayek} (NOT \p{Meetei_Mayek} NOR \p{Is_Meetei_Mayek}) (64: U+ABC0..ABFF) \p{Block: Meetei_Mayek_Ext} \p{Block=Meetei_Mayek_Extensions} (32) \p{Block: Meetei_Mayek_Extensions} (Short: \p{Blk=MeeteiMayekExt}) (32: U+AAE0..AAFF) \p{Block: Mende_Kikakui} (NOT \p{Mende_Kikakui} NOR \p{Is_Mende_Kikakui}) (224: U+1E800..1E8DF) \p{Block: Meroitic_Cursive} (NOT \p{Meroitic_Cursive} NOR \p{Is_Meroitic_Cursive}) (96: U+109A0..109FF) \p{Block: Meroitic_Hieroglyphs} (32: U+10980..1099F) \p{Block: Miao} (NOT \p{Miao} NOR \p{Is_Miao}) (160: U+16F00..16F9F) \p{Block: Misc_Arrows} \p{Block=Miscellaneous_Symbols_And_Arrows} (256) \p{Block: Misc_Math_Symbols_A} \p{Block= Miscellaneous_Mathematical_Symbols_A} (48) \p{Block: Misc_Math_Symbols_B} \p{Block= Miscellaneous_Mathematical_Symbols_B} (128) \p{Block: Misc_Pictographs} \p{Block= Miscellaneous_Symbols_And_Pictographs} (768) \p{Block: Misc_Symbols} \p{Block=Miscellaneous_Symbols} (256) \p{Block: Misc_Technical} \p{Block=Miscellaneous_Technical} (256) \p{Block: Miscellaneous_Mathematical_Symbols_A} (Short: \p{Blk= MiscMathSymbolsA}) (48: U+27C0..27EF) \p{Block: Miscellaneous_Mathematical_Symbols_B} (Short: \p{Blk= MiscMathSymbolsB}) (128: U+2980..29FF) \p{Block: Miscellaneous_Symbols} (Short: \p{Blk=MiscSymbols}) (256: U+2600..26FF) \p{Block: Miscellaneous_Symbols_And_Arrows} (Short: \p{Blk= MiscArrows}) (256: U+2B00..2BFF) \p{Block: Miscellaneous_Symbols_And_Pictographs} (Short: \p{Blk= MiscPictographs}) (768: U+1F300..1F5FF) \p{Block: Miscellaneous_Technical} (Short: \p{Blk=MiscTechnical}) 
(256: U+2300..23FF) \p{Block: Modi} (NOT \p{Modi} NOR \p{Is_Modi}) (96: U+11600..1165F) \p{Block: Modifier_Letters} \p{Block=Spacing_Modifier_Letters} (80) \p{Block: Modifier_Tone_Letters} (32: U+A700..A71F) \p{Block: Mongolian} (NOT \p{Mongolian} NOR \p{Is_Mongolian}) (176: U+1800..18AF) \p{Block: Mongolian_Sup} \p{Block=Mongolian_Supplement} (32) \p{Block: Mongolian_Supplement} (Short: \p{Blk=MongolianSup}) (32: U+11660..1167F) \p{Block: Mro} (NOT \p{Mro} NOR \p{Is_Mro}) (48: U+16A40..16A6F) \p{Block: Multani} (NOT \p{Multani} NOR \p{Is_Multani}) (48: U+11280..112AF) \p{Block: Music} \p{Block=Musical_Symbols} (256) \p{Block: Musical_Symbols} (Short: \p{Blk=Music}) (256: U+1D100..1D1FF) \p{Block: Myanmar} (NOT \p{Myanmar} NOR \p{Is_Myanmar}) (160: U+1000..109F) \p{Block: Myanmar_Ext_A} \p{Block=Myanmar_Extended_A} (32) \p{Block: Myanmar_Ext_B} \p{Block=Myanmar_Extended_B} (32) \p{Block: Myanmar_Extended_A} (Short: \p{Blk=MyanmarExtA}) (32: U+AA60..AA7F) \p{Block: Myanmar_Extended_B} (Short: \p{Blk=MyanmarExtB}) (32: U+A9E0..A9FF) \p{Block: Nabataean} (NOT \p{Nabataean} NOR \p{Is_Nabataean}) (48: U+10880..108AF) \p{Block: Nandinagari} (NOT \p{Nandinagari} NOR \p{Is_Nandinagari}) (96: U+119A0..119FF) \p{Block: NB} \p{Block=No_Block} (826_640 plus all above-Unicode code points) \p{Block: New_Tai_Lue} (NOT \p{New_Tai_Lue} NOR \p{Is_New_Tai_Lue}) (96: U+1980..19DF) \p{Block: Newa} (NOT \p{Newa} NOR \p{Is_Newa}) (128: U+11400..1147F) \p{Block: NKo} (NOT \p{Nko} NOR \p{Is_NKo}) (64: U+07C0..07FF) \p{Block: No_Block} (Short: \p{Blk=NB}) (826_640 plus all above-Unicode code points: U+0870..089F, U+2FE0..2FEF, U+10200..1027F, U+103E0..103FF, U+10570..105FF, U+10780..107FF ...) 
\p{Block: Number_Forms} (64: U+2150..218F) \p{Block: Nushu} (NOT \p{Nushu} NOR \p{Is_Nushu}) (400: U+1B170..1B2FF) \p{Block: Nyiakeng_Puachue_Hmong} (NOT \p{Nyiakeng_Puachue_Hmong} NOR \p{Is_Nyiakeng_Puachue_Hmong}) (80: U+1E100..1E14F) \p{Block: OCR} \p{Block=Optical_Character_Recognition} (32) \p{Block: Ogham} (NOT \p{Ogham} NOR \p{Is_Ogham}) (32: U+1680..169F) \p{Block: Ol_Chiki} (48: U+1C50..1C7F) \p{Block: Old_Hungarian} (NOT \p{Old_Hungarian} NOR \p{Is_Old_Hungarian}) (128: U+10C80..10CFF) \p{Block: Old_Italic} (NOT \p{Old_Italic} NOR \p{Is_Old_Italic}) (48: U+10300..1032F) \p{Block: Old_North_Arabian} (32: U+10A80..10A9F) \p{Block: Old_Permic} (NOT \p{Old_Permic} NOR \p{Is_Old_Permic}) (48: U+10350..1037F) \p{Block: Old_Persian} (NOT \p{Old_Persian} NOR \p{Is_Old_Persian}) (64: U+103A0..103DF) \p{Block: Old_Sogdian} (NOT \p{Old_Sogdian} NOR \p{Is_Old_Sogdian}) (48: U+10F00..10F2F) \p{Block: Old_South_Arabian} (32: U+10A60..10A7F) \p{Block: Old_Turkic} (NOT \p{Old_Turkic} NOR \p{Is_Old_Turkic}) (80: U+10C00..10C4F) \p{Block: Optical_Character_Recognition} (Short: \p{Blk=OCR}) (32: U+2440..245F) \p{Block: Oriya} (NOT \p{Oriya} NOR \p{Is_Oriya}) (128: U+0B00..0B7F) \p{Block: Ornamental_Dingbats} (48: U+1F650..1F67F) \p{Block: Osage} (NOT \p{Osage} NOR \p{Is_Osage}) (80: U+104B0..104FF) \p{Block: Osmanya} (NOT \p{Osmanya} NOR \p{Is_Osmanya}) (48: U+10480..104AF) \p{Block: Ottoman_Siyaq_Numbers} (80: U+1ED00..1ED4F) \p{Block: Pahawh_Hmong} (NOT \p{Pahawh_Hmong} NOR \p{Is_Pahawh_Hmong}) (144: U+16B00..16B8F) \p{Block: Palmyrene} (32: U+10860..1087F) \p{Block: Pau_Cin_Hau} (NOT \p{Pau_Cin_Hau} NOR \p{Is_Pau_Cin_Hau}) (64: U+11AC0..11AFF) \p{Block: Phags_Pa} (NOT \p{Phags_Pa} NOR \p{Is_Phags_Pa}) (64: U+A840..A87F) \p{Block: Phaistos} \p{Block=Phaistos_Disc} (48) \p{Block: Phaistos_Disc} (Short: \p{Blk=Phaistos}) (48: U+101D0..101FF) \p{Block: Phoenician} (NOT \p{Phoenician} NOR \p{Is_Phoenician}) (32: U+10900..1091F) \p{Block: Phonetic_Ext} 
\p{Block=Phonetic_Extensions} (128) \p{Block: Phonetic_Ext_Sup} \p{Block= Phonetic_Extensions_Supplement} (64) \p{Block: Phonetic_Extensions} (Short: \p{Blk=PhoneticExt}) (128: U+1D00..1D7F) \p{Block: Phonetic_Extensions_Supplement} (Short: \p{Blk= PhoneticExtSup}) (64: U+1D80..1DBF) \p{Block: Playing_Cards} (96: U+1F0A0..1F0FF) \p{Block: Private_Use} \p{Block=Private_Use_Area} (NOT \p{Private_Use} NOR \p{Is_Private_Use}) (6400) \p{Block: Private_Use_Area} (Short: \p{Blk=PUA}; NOT \p{Private_Use} NOR \p{Is_Private_Use}) (6400: U+E000..F8FF) \p{Block: Psalter_Pahlavi} (NOT \p{Psalter_Pahlavi} NOR \p{Is_Psalter_Pahlavi}) (48: U+10B80..10BAF) \p{Block: PUA} \p{Block=Private_Use_Area} (NOT \p{Private_Use} NOR \p{Is_Private_Use}) (6400) \p{Block: Punctuation} \p{Block=General_Punctuation} (NOT \p{Punct} NOR \p{Is_Punctuation}) (112) \p{Block: Rejang} (NOT \p{Rejang} NOR \p{Is_Rejang}) (48: U+A930..A95F) \p{Block: Rumi} \p{Block=Rumi_Numeral_Symbols} (32) \p{Block: Rumi_Numeral_Symbols} (Short: \p{Blk=Rumi}) (32: U+10E60..10E7F) \p{Block: Runic} (NOT \p{Runic} NOR \p{Is_Runic}) (96: U+16A0..16FF) \p{Block: Samaritan} (NOT \p{Samaritan} NOR \p{Is_Samaritan}) (64: U+0800..083F) \p{Block: Saurashtra} (NOT \p{Saurashtra} NOR \p{Is_Saurashtra}) (96: U+A880..A8DF) \p{Block: Sharada} (NOT \p{Sharada} NOR \p{Is_Sharada}) (96: U+11180..111DF) \p{Block: Shavian} (48: U+10450..1047F) \p{Block: Shorthand_Format_Controls} (16: U+1BCA0..1BCAF) \p{Block: Siddham} (NOT \p{Siddham} NOR \p{Is_Siddham}) (128: U+11580..115FF) \p{Block: Sinhala} (NOT \p{Sinhala} NOR \p{Is_Sinhala}) (128: U+0D80..0DFF) \p{Block: Sinhala_Archaic_Numbers} (32: U+111E0..111FF) \p{Block: Small_Form_Variants} (Short: \p{Blk=SmallForms}) (32: U+FE50..FE6F) \p{Block: Small_Forms} \p{Block=Small_Form_Variants} (32) \p{Block: Small_Kana_Ext} \p{Block=Small_Kana_Extension} (64) \p{Block: Small_Kana_Extension} (Short: \p{Blk=SmallKanaExt}) (64: U+1B130..1B16F) \p{Block: Sogdian} (NOT \p{Sogdian} NOR \p{Is_Sogdian}) (64: 
U+10F30..10F6F) \p{Block: Sora_Sompeng} (NOT \p{Sora_Sompeng} NOR \p{Is_Sora_Sompeng}) (48: U+110D0..110FF) \p{Block: Soyombo} (NOT \p{Soyombo} NOR \p{Is_Soyombo}) (96: U+11A50..11AAF) \p{Block: Spacing_Modifier_Letters} (Short: \p{Blk= ModifierLetters}) (80: U+02B0..02FF) \p{Block: Specials} (16: U+FFF0..FFFF) \p{Block: Sundanese} (NOT \p{Sundanese} NOR \p{Is_Sundanese}) (64: U+1B80..1BBF) \p{Block: Sundanese_Sup} \p{Block=Sundanese_Supplement} (16) \p{Block: Sundanese_Supplement} (Short: \p{Blk=SundaneseSup}) (16: U+1CC0..1CCF) \p{Block: Sup_Arrows_A} \p{Block=Supplemental_Arrows_A} (16) \p{Block: Sup_Arrows_B} \p{Block=Supplemental_Arrows_B} (128) \p{Block: Sup_Arrows_C} \p{Block=Supplemental_Arrows_C} (256) \p{Block: Sup_Math_Operators} \p{Block= Supplemental_Mathematical_Operators} (256) \p{Block: Sup_PUA_A} \p{Block=Supplementary_Private_Use_Area_A} (65_536) \p{Block: Sup_PUA_B} \p{Block=Supplementary_Private_Use_Area_B} (65_536) \p{Block: Sup_Punctuation} \p{Block=Supplemental_Punctuation} (128) \p{Block: Sup_Symbols_And_Pictographs} \p{Block= Supplemental_Symbols_And_Pictographs} (256) \p{Block: Super_And_Sub} \p{Block=Superscripts_And_Subscripts} (48) \p{Block: Superscripts_And_Subscripts} (Short: \p{Blk= SuperAndSub}) (48: U+2070..209F) \p{Block: Supplemental_Arrows_A} (Short: \p{Blk=SupArrowsA}) (16: U+27F0..27FF) \p{Block: Supplemental_Arrows_B} (Short: \p{Blk=SupArrowsB}) (128: U+2900..297F) \p{Block: Supplemental_Arrows_C} (Short: \p{Blk=SupArrowsC}) (256: U+1F800..1F8FF) \p{Block: Supplemental_Mathematical_Operators} (Short: \p{Blk= SupMathOperators}) (256: U+2A00..2AFF) \p{Block: Supplemental_Punctuation} (Short: \p{Blk= SupPunctuation}) (128: U+2E00..2E7F) \p{Block: Supplemental_Symbols_And_Pictographs} (Short: \p{Blk= SupSymbolsAndPictographs}) (256: U+1F900..1F9FF) \p{Block: Supplementary_Private_Use_Area_A} (Short: \p{Blk= SupPUAA}) (65_536: U+F0000..FFFFF) \p{Block: Supplementary_Private_Use_Area_B} (Short: \p{Blk= SupPUAB}) (65_536: 
U+100000..10FFFF) \p{Block: Sutton_SignWriting} (688: U+1D800..1DAAF) \p{Block: Syloti_Nagri} (NOT \p{Syloti_Nagri} NOR \p{Is_Syloti_Nagri}) (48: U+A800..A82F) \p{Block: Symbols_And_Pictographs_Ext_A} \p{Block= Symbols_And_Pictographs_Extended_A} (144) \p{Block: Symbols_And_Pictographs_Extended_A} (Short: \p{Blk= SymbolsAndPictographsExtA}) (144: U+1FA70..1FAFF) \p{Block: Symbols_For_Legacy_Computing} (256: U+1FB00..1FBFF) \p{Block: Syriac} (NOT \p{Syriac} NOR \p{Is_Syriac}) (80: U+0700..074F) \p{Block: Syriac_Sup} \p{Block=Syriac_Supplement} (16) \p{Block: Syriac_Supplement} (Short: \p{Blk=SyriacSup}) (16: U+0860..086F) \p{Block: Tagalog} (NOT \p{Tagalog} NOR \p{Is_Tagalog}) (32: U+1700..171F) \p{Block: Tagbanwa} (NOT \p{Tagbanwa} NOR \p{Is_Tagbanwa}) (32: U+1760..177F) \p{Block: Tags} (128: U+E0000..E007F) \p{Block: Tai_Le} (NOT \p{Tai_Le} NOR \p{Is_Tai_Le}) (48: U+1950..197F) \p{Block: Tai_Tham} (NOT \p{Tai_Tham} NOR \p{Is_Tai_Tham}) (144: U+1A20..1AAF) \p{Block: Tai_Viet} (NOT \p{Tai_Viet} NOR \p{Is_Tai_Viet}) (96: U+AA80..AADF) \p{Block: Tai_Xuan_Jing} \p{Block=Tai_Xuan_Jing_Symbols} (96) \p{Block: Tai_Xuan_Jing_Symbols} (Short: \p{Blk=TaiXuanJing}) (96: U+1D300..1D35F) \p{Block: Takri} (NOT \p{Takri} NOR \p{Is_Takri}) (80: U+11680..116CF) \p{Block: Tamil} (NOT \p{Tamil} NOR \p{Is_Tamil}) (128: U+0B80..0BFF) \p{Block: Tamil_Sup} \p{Block=Tamil_Supplement} (64) \p{Block: Tamil_Supplement} (Short: \p{Blk=TamilSup}) (64: U+11FC0..11FFF) \p{Block: Tangut} (NOT \p{Tangut} NOR \p{Is_Tangut}) (6144: U+17000..187FF) \p{Block: Tangut_Components} (768: U+18800..18AFF) \p{Block: Tangut_Sup} \p{Block=Tangut_Supplement} (144) \p{Block: Tangut_Supplement} (Short: \p{Blk=TangutSup}) (144: U+18D00..18D8F) \p{Block: Telugu} (NOT \p{Telugu} NOR \p{Is_Telugu}) (128: U+0C00..0C7F) \p{Block: Thaana} (NOT \p{Thaana} NOR \p{Is_Thaana}) (64: U+0780..07BF) \p{Block: Thai} (NOT \p{Thai} NOR \p{Is_Thai}) (128: U+0E00..0E7F) \p{Block: Tibetan} (NOT \p{Tibetan} NOR \p{Is_Tibetan}) (256: 
U+0F00..0FFF) \p{Block: Tifinagh} (NOT \p{Tifinagh} NOR \p{Is_Tifinagh}) (80: U+2D30..2D7F) \p{Block: Tirhuta} (NOT \p{Tirhuta} NOR \p{Is_Tirhuta}) (96: U+11480..114DF) \p{Block: Transport_And_Map} \p{Block=Transport_And_Map_Symbols} (128) \p{Block: Transport_And_Map_Symbols} (Short: \p{Blk= TransportAndMap}) (128: U+1F680..1F6FF) \p{Block: UCAS} \p{Block= Unified_Canadian_Aboriginal_Syllabics} (640) \p{Block: UCAS_Ext} \p{Block= Unified_Canadian_Aboriginal_Syllabics_- Extended} (80) \p{Block: Ugaritic} (NOT \p{Ugaritic} NOR \p{Is_Ugaritic}) (32: U+10380..1039F) \p{Block: Unified_Canadian_Aboriginal_Syllabics} (Short: \p{Blk= UCAS}) (640: U+1400..167F) \p{Block: Unified_Canadian_Aboriginal_Syllabics_Extended} (Short: \p{Blk=UCASExt}) (80: U+18B0..18FF) \p{Block: Vai} (NOT \p{Vai} NOR \p{Is_Vai}) (320: U+A500..A63F) \p{Block: Variation_Selectors} (Short: \p{Blk=VS}; NOT \p{Variation_Selector} NOR \p{Is_VS}) (16: U+FE00..FE0F) \p{Block: Variation_Selectors_Supplement} (Short: \p{Blk=VSSup}) (240: U+E0100..E01EF) \p{Block: Vedic_Ext} \p{Block=Vedic_Extensions} (48) \p{Block: Vedic_Extensions} (Short: \p{Blk=VedicExt}) (48: U+1CD0..1CFF) \p{Block: Vertical_Forms} (16: U+FE10..FE1F) \p{Block: VS} \p{Block=Variation_Selectors} (NOT \p{Variation_Selector} NOR \p{Is_VS}) (16) \p{Block: VS_Sup} \p{Block=Variation_Selectors_Supplement} (240) \p{Block: Wancho} (NOT \p{Wancho} NOR \p{Is_Wancho}) (64: U+1E2C0..1E2FF) \p{Block: Warang_Citi} (NOT \p{Warang_Citi} NOR \p{Is_Warang_Citi}) (96: U+118A0..118FF) \p{Block: Yezidi} (NOT \p{Yezidi} NOR \p{Is_Yezidi}) (64: U+10E80..10EBF) \p{Block: Yi_Radicals} (64: U+A490..A4CF) \p{Block: Yi_Syllables} (1168: U+A000..A48F) \p{Block: Yijing} \p{Block=Yijing_Hexagram_Symbols} (64) \p{Block: Yijing_Hexagram_Symbols} (Short: \p{Blk=Yijing}) (64: U+4DC0..4DFF) \p{Block: Zanabazar_Square} (NOT \p{Zanabazar_Square} NOR \p{Is_Zanabazar_Square}) (80: U+11A00..11A4F) X \p{Block_Elements} \p{Block=Block_Elements} (32) \p{Bopo} \p{Bopomofo} (= 
\p{Script_Extensions= Bopomofo}) (NOT \p{Block=Bopomofo}) (117) \p{Bopomofo} \p{Script_Extensions=Bopomofo} (Short: \p{Bopo}; NOT \p{Block=Bopomofo}) (117) X \p{Bopomofo_Ext} \p{Bopomofo_Extended} (= \p{Block= Bopomofo_Extended}) (32) X \p{Bopomofo_Extended} \p{Block=Bopomofo_Extended} (Short: \p{InBopomofoExt}) (32) X \p{Box_Drawing} \p{Block=Box_Drawing} (128) \p{Bpt: *} \p{Bidi_Paired_Bracket_Type: *} \p{Brah} \p{Brahmi} (= \p{Script_Extensions= Brahmi}) (NOT \p{Block=Brahmi}) (109) \p{Brahmi} \p{Script_Extensions=Brahmi} (Short: \p{Brah}; NOT \p{Block=Brahmi}) (109) \p{Brai} \p{Braille} (= \p{Script_Extensions= Braille}) (256) \p{Braille} \p{Script_Extensions=Braille} (Short: \p{Brai}) (256) X \p{Braille_Patterns} \p{Block=Braille_Patterns} (Short: \p{InBraille}) (256) \p{Bugi} \p{Buginese} (= \p{Script_Extensions= Buginese}) (NOT \p{Block=Buginese}) (31) \p{Buginese} \p{Script_Extensions=Buginese} (Short: \p{Bugi}; NOT \p{Block=Buginese}) (31) \p{Buhd} \p{Buhid} (= \p{Script_Extensions=Buhid}) (NOT \p{Block=Buhid}) (22) \p{Buhid} \p{Script_Extensions=Buhid} (Short: \p{Buhd}; NOT \p{Block=Buhid}) (22) X \p{Byzantine_Music} \p{Byzantine_Musical_Symbols} (= \p{Block= Byzantine_Musical_Symbols}) (256) X \p{Byzantine_Musical_Symbols} \p{Block=Byzantine_Musical_Symbols} (Short: \p{InByzantineMusic}) (256) \p{C} \pC \p{Other} (= \p{General_Category=Other}) (970_414 plus all above-Unicode code points) \p{Cakm} \p{Chakma} (= \p{Script_Extensions= Chakma}) (NOT \p{Block=Chakma}) (91) \p{Canadian_Aboriginal} \p{Script_Extensions=Canadian_Aboriginal} (Short: \p{Cans}) (710) X \p{Canadian_Syllabics} \p{Unified_Canadian_Aboriginal_Syllabics} (= \p{Block= Unified_Canadian_Aboriginal_Syllabics}) (640) T \p{Canonical_Combining_Class: 0} \p{Canonical_Combining_Class= Not_Reordered} (1_113_240 plus all above-Unicode code points) T \p{Canonical_Combining_Class: 1} \p{Canonical_Combining_Class= Overlay} (32) T \p{Canonical_Combining_Class: 6} \p{Canonical_Combining_Class= 
Han_Reading} (2) T \p{Canonical_Combining_Class: 7} \p{Canonical_Combining_Class= Nukta} (26) T \p{Canonical_Combining_Class: 8} \p{Canonical_Combining_Class= Kana_Voicing} (2) T \p{Canonical_Combining_Class: 9} \p{Canonical_Combining_Class= Virama} (61) T \p{Canonical_Combining_Class: 10} \p{Canonical_Combining_Class= CCC10} (1) \p{Canonical_Combining_Class: CCC10} (Short: \p{Ccc=CCC10}) (1: U+05B0) T \p{Canonical_Combining_Class: 11} \p{Canonical_Combining_Class= CCC11} (1) \p{Canonical_Combining_Class: CCC11} (Short: \p{Ccc=CCC11}) (1: U+05B1) T \p{Canonical_Combining_Class: 12} \p{Canonical_Combining_Class= CCC12} (1) \p{Canonical_Combining_Class: CCC12} (Short: \p{Ccc=CCC12}) (1: U+05B2) T \p{Canonical_Combining_Class: 13} \p{Canonical_Combining_Class= CCC13} (1) \p{Canonical_Combining_Class: CCC13} (Short: \p{Ccc=CCC13}) (1: U+05B3) T \p{Canonical_Combining_Class: 14} \p{Canonical_Combining_Class= CCC14} (1) \p{Canonical_Combining_Class: CCC14} (Short: \p{Ccc=CCC14}) (1: U+05B4) T \p{Canonical_Combining_Class: 15} \p{Canonical_Combining_Class= CCC15} (1) \p{Canonical_Combining_Class: CCC15} (Short: \p{Ccc=CCC15}) (1: U+05B5) T \p{Canonical_Combining_Class: 16} \p{Canonical_Combining_Class= CCC16} (1) \p{Canonical_Combining_Class: CCC16} (Short: \p{Ccc=CCC16}) (1: U+05B6) T \p{Canonical_Combining_Class: 17} \p{Canonical_Combining_Class= CCC17} (1) \p{Canonical_Combining_Class: CCC17} (Short: \p{Ccc=CCC17}) (1: U+05B7) T \p{Canonical_Combining_Class: 18} \p{Canonical_Combining_Class= CCC18} (2) \p{Canonical_Combining_Class: CCC18} (Short: \p{Ccc=CCC18}) (2: U+05B8, U+05C7) T \p{Canonical_Combining_Class: 19} \p{Canonical_Combining_Class= CCC19} (2) \p{Canonical_Combining_Class: CCC19} (Short: \p{Ccc=CCC19}) (2: U+05B9..05BA) T \p{Canonical_Combining_Class: 20} \p{Canonical_Combining_Class= CCC20} (1) \p{Canonical_Combining_Class: CCC20} (Short: \p{Ccc=CCC20}) (1: U+05BB) T \p{Canonical_Combining_Class: 21} \p{Canonical_Combining_Class= CCC21} (1) 
\p{Canonical_Combining_Class: CCC21} (Short: \p{Ccc=CCC21}) (1: U+05BC) T \p{Canonical_Combining_Class: 22} \p{Canonical_Combining_Class= CCC22} (1) \p{Canonical_Combining_Class: CCC22} (Short: \p{Ccc=CCC22}) (1: U+05BD) T \p{Canonical_Combining_Class: 23} \p{Canonical_Combining_Class= CCC23} (1) \p{Canonical_Combining_Class: CCC23} (Short: \p{Ccc=CCC23}) (1: U+05BF) T \p{Canonical_Combining_Class: 24} \p{Canonical_Combining_Class= CCC24} (1) \p{Canonical_Combining_Class: CCC24} (Short: \p{Ccc=CCC24}) (1: U+05C1) T \p{Canonical_Combining_Class: 25} \p{Canonical_Combining_Class= CCC25} (1) \p{Canonical_Combining_Class: CCC25} (Short: \p{Ccc=CCC25}) (1: U+05C2) T \p{Canonical_Combining_Class: 26} \p{Canonical_Combining_Class= CCC26} (1) \p{Canonical_Combining_Class: CCC26} (Short: \p{Ccc=CCC26}) (1: U+FB1E) T \p{Canonical_Combining_Class: 27} \p{Canonical_Combining_Class= CCC27} (2) \p{Canonical_Combining_Class: CCC27} (Short: \p{Ccc=CCC27}) (2: U+064B, U+08F0) T \p{Canonical_Combining_Class: 28} \p{Canonical_Combining_Class= CCC28} (2) \p{Canonical_Combining_Class: CCC28} (Short: \p{Ccc=CCC28}) (2: U+064C, U+08F1) T \p{Canonical_Combining_Class: 29} \p{Canonical_Combining_Class= CCC29} (2) \p{Canonical_Combining_Class: CCC29} (Short: \p{Ccc=CCC29}) (2: U+064D, U+08F2) T \p{Canonical_Combining_Class: 30} \p{Canonical_Combining_Class= CCC30} (2) \p{Canonical_Combining_Class: CCC30} (Short: \p{Ccc=CCC30}) (2: U+0618, U+064E) T \p{Canonical_Combining_Class: 31} \p{Canonical_Combining_Class= CCC31} (2) \p{Canonical_Combining_Class: CCC31} (Short: \p{Ccc=CCC31}) (2: U+0619, U+064F) T \p{Canonical_Combining_Class: 32} \p{Canonical_Combining_Class= CCC32} (2) \p{Canonical_Combining_Class: CCC32} (Short: \p{Ccc=CCC32}) (2: U+061A, U+0650) T \p{Canonical_Combining_Class: 33} \p{Canonical_Combining_Class= CCC33} (1) \p{Canonical_Combining_Class: CCC33} (Short: \p{Ccc=CCC33}) (1: U+0651) T \p{Canonical_Combining_Class: 34} \p{Canonical_Combining_Class= CCC34} (1) 
\p{Canonical_Combining_Class: CCC34} (Short: \p{Ccc=CCC34}) (1: U+0652) T \p{Canonical_Combining_Class: 35} \p{Canonical_Combining_Class= CCC35} (1) \p{Canonical_Combining_Class: CCC35} (Short: \p{Ccc=CCC35}) (1: U+0670) T \p{Canonical_Combining_Class: 36} \p{Canonical_Combining_Class= CCC36} (1) \p{Canonical_Combining_Class: CCC36} (Short: \p{Ccc=CCC36}) (1: U+0711) T \p{Canonical_Combining_Class: 84} \p{Canonical_Combining_Class= CCC84} (1) \p{Canonical_Combining_Class: CCC84} (Short: \p{Ccc=CCC84}) (1: U+0C55) T \p{Canonical_Combining_Class: 91} \p{Canonical_Combining_Class= CCC91} (1) \p{Canonical_Combining_Class: CCC91} (Short: \p{Ccc=CCC91}) (1: U+0C56) T \p{Canonical_Combining_Class: 103} \p{Canonical_Combining_Class= CCC103} (2) \p{Canonical_Combining_Class: CCC103} (Short: \p{Ccc=CCC103}) (2: U+0E38..0E39) T \p{Canonical_Combining_Class: 107} \p{Canonical_Combining_Class= CCC107} (4) \p{Canonical_Combining_Class: CCC107} (Short: \p{Ccc=CCC107}) (4: U+0E48..0E4B) T \p{Canonical_Combining_Class: 118} \p{Canonical_Combining_Class= CCC118} (2) \p{Canonical_Combining_Class: CCC118} (Short: \p{Ccc=CCC118}) (2: U+0EB8..0EB9) T \p{Canonical_Combining_Class: 122} \p{Canonical_Combining_Class= CCC122} (4) \p{Canonical_Combining_Class: CCC122} (Short: \p{Ccc=CCC122}) (4: U+0EC8..0ECB) T \p{Canonical_Combining_Class: 129} \p{Canonical_Combining_Class= CCC129} (1) \p{Canonical_Combining_Class: CCC129} (Short: \p{Ccc=CCC129}) (1: U+0F71) T \p{Canonical_Combining_Class: 130} \p{Canonical_Combining_Class= CCC130} (6) \p{Canonical_Combining_Class: CCC130} (Short: \p{Ccc=CCC130}) (6: U+0F72, U+0F7A..0F7D, U+0F80) T \p{Canonical_Combining_Class: 132} \p{Canonical_Combining_Class= CCC132} (1) \p{Canonical_Combining_Class: CCC132} (Short: \p{Ccc=CCC132}) (1: U+0F74) T \p{Canonical_Combining_Class: 133} \p{Canonical_Combining_Class= CCC133} (0) \p{Canonical_Combining_Class: CCC133} (Short: \p{Ccc=CCC133}) (0) T \p{Canonical_Combining_Class: 200} \p{Canonical_Combining_Class= 
Attached_Below_Left} (0) T \p{Canonical_Combining_Class: 202} \p{Canonical_Combining_Class= Attached_Below} (5) T \p{Canonical_Combining_Class: 214} \p{Canonical_Combining_Class= Attached_Above} (1) T \p{Canonical_Combining_Class: 216} \p{Canonical_Combining_Class= Attached_Above_Right} (9) T \p{Canonical_Combining_Class: 218} \p{Canonical_Combining_Class= Below_Left} (1) T \p{Canonical_Combining_Class: 220} \p{Canonical_Combining_Class= Below} (165) T \p{Canonical_Combining_Class: 222} \p{Canonical_Combining_Class= Below_Right} (4) T \p{Canonical_Combining_Class: 224} \p{Canonical_Combining_Class= Left} (2) T \p{Canonical_Combining_Class: 226} \p{Canonical_Combining_Class= Right} (1) T \p{Canonical_Combining_Class: 228} \p{Canonical_Combining_Class= Above_Left} (5) T \p{Canonical_Combining_Class: 230} \p{Canonical_Combining_Class= Above} (484) T \p{Canonical_Combining_Class: 232} \p{Canonical_Combining_Class= Above_Right} (5) T \p{Canonical_Combining_Class: 233} \p{Canonical_Combining_Class= Double_Below} (4) T \p{Canonical_Combining_Class: 234} \p{Canonical_Combining_Class= Double_Above} (5) T \p{Canonical_Combining_Class: 240} \p{Canonical_Combining_Class= Iota_Subscript} (1) \p{Canonical_Combining_Class: A} \p{Canonical_Combining_Class= Above} (484) \p{Canonical_Combining_Class: Above} (Short: \p{Ccc=A}) (484: U+0300..0314, U+033D..0344, U+0346, U+034A..034C, U+0350..0352, U+0357 ...) 
\p{Canonical_Combining_Class: Above_Left} (Short: \p{Ccc=AL}) (5: U+05AE, U+18A9, U+1DF7..1DF8, U+302B) \p{Canonical_Combining_Class: Above_Right} (Short: \p{Ccc=AR}) (5: U+0315, U+031A, U+0358, U+1DF6, U+302C) \p{Canonical_Combining_Class: AL} \p{Canonical_Combining_Class= Above_Left} (5) \p{Canonical_Combining_Class: AR} \p{Canonical_Combining_Class= Above_Right} (5) \p{Canonical_Combining_Class: ATA} \p{Canonical_Combining_Class= Attached_Above} (1) \p{Canonical_Combining_Class: ATAR} \p{Canonical_Combining_Class= Attached_Above_Right} (9) \p{Canonical_Combining_Class: ATB} \p{Canonical_Combining_Class= Attached_Below} (5) \p{Canonical_Combining_Class: ATBL} \p{Canonical_Combining_Class= Attached_Below_Left} (0) \p{Canonical_Combining_Class: Attached_Above} (Short: \p{Ccc=ATA}) (1: U+1DCE) \p{Canonical_Combining_Class: Attached_Above_Right} (Short: \p{Ccc=ATAR}) (9: U+031B, U+0F39, U+1D165..1D166, U+1D16E..1D172) \p{Canonical_Combining_Class: Attached_Below} (Short: \p{Ccc=ATB}) (5: U+0321..0322, U+0327..0328, U+1DD0) \p{Canonical_Combining_Class: Attached_Below_Left} (Short: \p{Ccc= ATBL}) (0) \p{Canonical_Combining_Class: B} \p{Canonical_Combining_Class= Below} (165) \p{Canonical_Combining_Class: Below} (Short: \p{Ccc=B}) (165: U+0316..0319, U+031C..0320, U+0323..0326, U+0329..0333, U+0339..033C, U+0347..0349 ...) 
\p{Canonical_Combining_Class: Below_Left} (Short: \p{Ccc=BL}) (1: U+302A) \p{Canonical_Combining_Class: Below_Right} (Short: \p{Ccc=BR}) (4: U+059A, U+05AD, U+1939, U+302D) \p{Canonical_Combining_Class: BL} \p{Canonical_Combining_Class= Below_Left} (1) \p{Canonical_Combining_Class: BR} \p{Canonical_Combining_Class= Below_Right} (4) \p{Canonical_Combining_Class: DA} \p{Canonical_Combining_Class= Double_Above} (5) \p{Canonical_Combining_Class: DB} \p{Canonical_Combining_Class= Double_Below} (4) \p{Canonical_Combining_Class: Double_Above} (Short: \p{Ccc=DA}) (5: U+035D..035E, U+0360..0361, U+1DCD) \p{Canonical_Combining_Class: Double_Below} (Short: \p{Ccc=DB}) (4: U+035C, U+035F, U+0362, U+1DFC) \p{Canonical_Combining_Class: Han_Reading} (Short: \p{Ccc=HANR}) (2: U+16FF0..16FF1) \p{Canonical_Combining_Class: HANR} \p{Canonical_Combining_Class= Han_Reading} (2) \p{Canonical_Combining_Class: Iota_Subscript} (Short: \p{Ccc=IS}) (1: U+0345) \p{Canonical_Combining_Class: IS} \p{Canonical_Combining_Class= Iota_Subscript} (1) \p{Canonical_Combining_Class: Kana_Voicing} (Short: \p{Ccc=KV}) (2: U+3099..309A) \p{Canonical_Combining_Class: KV} \p{Canonical_Combining_Class= Kana_Voicing} (2) \p{Canonical_Combining_Class: L} \p{Canonical_Combining_Class= Left} (2) \p{Canonical_Combining_Class: Left} (Short: \p{Ccc=L}) (2: U+302E..302F) \p{Canonical_Combining_Class: NK} \p{Canonical_Combining_Class= Nukta} (26) \p{Canonical_Combining_Class: Not_Reordered} (Short: \p{Ccc=NR}) (1_113_240 plus all above-Unicode code points: U+0000..02FF, U+034F, U+0370..0482, U+0488..0590, U+05BE, U+05C0 ...) \p{Canonical_Combining_Class: NR} \p{Canonical_Combining_Class= Not_Reordered} (1_113_240 plus all above-Unicode code points) \p{Canonical_Combining_Class: Nukta} (Short: \p{Ccc=NK}) (26: U+093C, U+09BC, U+0A3C, U+0ABC, U+0B3C, U+0CBC ...) 
\p{Canonical_Combining_Class: OV} \p{Canonical_Combining_Class= Overlay} (32) \p{Canonical_Combining_Class: Overlay} (Short: \p{Ccc=OV}) (32: U+0334..0338, U+1CD4, U+1CE2..1CE8, U+20D2..20D3, U+20D8..20DA, U+20E5..20E6 ...) \p{Canonical_Combining_Class: R} \p{Canonical_Combining_Class= Right} (1) \p{Canonical_Combining_Class: Right} (Short: \p{Ccc=R}) (1: U+1D16D) \p{Canonical_Combining_Class: Virama} (Short: \p{Ccc=VR}) (61: U+094D, U+09CD, U+0A4D, U+0ACD, U+0B4D, U+0BCD ...) \p{Canonical_Combining_Class: VR} \p{Canonical_Combining_Class= Virama} (61) \p{Cans} \p{Canadian_Aboriginal} (= \p{Script_Extensions= Canadian_Aboriginal}) (710) \p{Cari} \p{Carian} (= \p{Script_Extensions= Carian}) (NOT \p{Block=Carian}) (49) \p{Carian} \p{Script_Extensions=Carian} (Short: \p{Cari}; NOT \p{Block=Carian}) (49) \p{Case_Ignorable} \p{Case_Ignorable=Y} (Short: \p{CI}) (2413) \p{Case_Ignorable: N*} (Short: \p{CI=N}, \P{CI}) (1_111_699 plus all above-Unicode code points: [\x00-\x20!\"#\$\%&\(\)*+,\-\/0-9;<=>?\@A-Z\[\\\]_a-z\{\|\}~\x7f-\xa7\xa9-\xac\xae\xb0-\xb3\xb5-\xb6\xb9-\xff], U+0100..02AF, U+0370..0373, U+0376..0379, U+037B..0383, U+0386 ...) \p{Case_Ignorable: Y*} (Short: \p{CI=Y}, \p{CI}) (2413: [\'.:\^`\xa8\xad\xaf\xb4\xb7-\xb8], U+02B0..036F, U+0374..0375, U+037A, U+0384..0385, U+0387 ...) \p{Cased} \p{Cased=Y} (4286) \p{Cased: N*} (Single: \P{Cased}) (1_109_826 plus all above-Unicode code points: [\x00-\x20!\"#\$\%&\'\(\)*+,\-.\/0-9:;<=>?\@\[\\\]\^_`\{\|\}~\x7f-\xa9\xab-\xb4\xb6-\xb9\xbb-\xbf\xd7\xf7], U+01BB, U+01C0..01C3, U+0294, U+02B9..02BF, U+02C2..02DF ...) \p{Cased: Y*} (Single: \p{Cased}) (4286: [A-Za-z\xaa\xb5\xba\xc0-\xd6\xd8-\xf6\xf8-\xff], U+0100..01BA, U+01BC..01BF, U+01C4..0293, U+0295..02B8, U+02C0..02C1 ...)
\p{Cased_Letter} \p{General_Category=Cased_Letter} (Short: \p{LC}) (3977) \p{Category: *} \p{General_Category: *} \p{Caucasian_Albanian} \p{Script_Extensions=Caucasian_Albanian} (Short: \p{Aghb}; NOT \p{Block= Caucasian_Albanian}) (53) \p{Cc} \p{XPosixCntrl} (= \p{General_Category= Control}) (65) \p{Ccc: *} \p{Canonical_Combining_Class: *} \p{CE} \p{Composition_Exclusion} (= \p{Composition_Exclusion=Y}) (81) \p{CE: *} \p{Composition_Exclusion: *} \p{Cf} \p{Format} (= \p{General_Category=Format}) (161) \p{Chakma} \p{Script_Extensions=Chakma} (Short: \p{Cakm}; NOT \p{Block=Chakma}) (91) \p{Cham} \p{Script_Extensions=Cham} (NOT \p{Block= Cham}) (83) \p{Changes_When_Casefolded} \p{Changes_When_Casefolded=Y} (Short: \p{CWCF}) (1466) \p{Changes_When_Casefolded: N*} (Short: \p{CWCF=N}, \P{CWCF}) (1_112_646 plus all above-Unicode code points: [\x00-\x20!\"#\$\%&\'\(\)*+,\-.\/0-9:;<=>?\@\[\\\]\^_`a-z\{\|\}~\x7f-\xb4\xb6-\xbf\xd7\xe0-\xff], U+0101, U+0103, U+0105, U+0107, U+0109 ...) \p{Changes_When_Casefolded: Y*} (Short: \p{CWCF=Y}, \p{CWCF}) (1466: [A-Z\xb5\xc0-\xd6\xd8-\xdf], U+0100, U+0102, U+0104, U+0106, U+0108 ...) \p{Changes_When_Casemapped} \p{Changes_When_Casemapped=Y} (Short: \p{CWCM}) (2847) \p{Changes_When_Casemapped: N*} (Short: \p{CWCM=N}, \P{CWCM}) (1_111_265 plus all above-Unicode code points: [\x00-\x20!\"#\$\%&\'\(\)*+,\-.\/0-9:;<=>?\@\[\\\]\^_`\{\|\}~\x7f-\xb4\xb6-\xbf\xd7\xf7], U+0138, U+018D, U+019B, U+01AA..01AB, U+01BA..01BB ...) \p{Changes_When_Casemapped: Y*} (Short: \p{CWCM=Y}, \p{CWCM}) (2847: [A-Za-z\xb5\xc0-\xd6\xd8-\xf6\xf8-\xff], U+0100..0137, U+0139..018C, U+018E..019A, U+019C..01A9, U+01AC..01B9 ...) \p{Changes_When_Lowercased} \p{Changes_When_Lowercased=Y} (Short: \p{CWL}) (1393) \p{Changes_When_Lowercased: N*} (Short: \p{CWL=N}, \P{CWL}) (1_112_719 plus all above-Unicode code points: [\x00-\x20!\"#\$\%&\'\(\)*+,\-.\/0-9:;<=>?\@\[\\\]\^_`a-z\{\|\}~\x7f-\xbf\xd7\xdf-\xff], U+0101, U+0103, U+0105, U+0107, U+0109 ...)
\p{Changes_When_Lowercased: Y*} (Short: \p{CWL=Y}, \p{CWL}) (1393: [A-Z\xc0-\xd6\xd8-\xde], U+0100, U+0102, U+0104, U+0106, U+0108 ...) \p{Changes_When_NFKC_Casefolded} \p{Changes_When_NFKC_Casefolded= Y} (Short: \p{CWKCF}) (10_329) \p{Changes_When_NFKC_Casefolded: N*} (Short: \p{CWKCF=N}, \P{CWKCF}) (1_103_783 plus all above-Unicode code points: [\x00-\x20!\"#\$\%&\'\(\)*+,\-.\/0-9:;<=>?\@\[\\\]\^_`a-z\{\|\}~\x7f-\x9f\xa1-\xa7\xa9\xab-\xac\xae\xb0-\xb1\xb6-\xb7\xbb\xbf\xd7\xe0-\xff], U+0101, U+0103, U+0105, U+0107, U+0109 ...) \p{Changes_When_NFKC_Casefolded: Y*} (Short: \p{CWKCF=Y}, \p{CWKCF}) (10_329: [A-Z\xa0\xa8\xaa\xad\xaf\xb2-\xb5\xb8-\xba\xbc-\xbe\xc0-\xd6\xd8-\xdf], U+0100, U+0102, U+0104, U+0106, U+0108 ...) \p{Changes_When_Titlecased} \p{Changes_When_Titlecased=Y} (Short: \p{CWT}) (1412) \p{Changes_When_Titlecased: N*} (Short: \p{CWT=N}, \P{CWT}) (1_112_700 plus all above-Unicode code points: [\x00-\x20!\"#\$\%&\'\(\)*+,\-.\/0-9:;<=>?\@A-Z\[\\\]\^_`\{\|\}~\x7f-\xb4\xb6-\xde\xf7], U+0100, U+0102, U+0104, U+0106, U+0108 ...) \p{Changes_When_Titlecased: Y*} (Short: \p{CWT=Y}, \p{CWT}) (1412: [a-z\xb5\xdf-\xf6\xf8-\xff], U+0101, U+0103, U+0105, U+0107, U+0109 ...) \p{Changes_When_Uppercased} \p{Changes_When_Uppercased=Y} (Short: \p{CWU}) (1485) \p{Changes_When_Uppercased: N*} (Short: \p{CWU=N}, \P{CWU}) (1_112_627 plus all above-Unicode code points: [\x00-\x20!\"#\$\%&\'\(\)*+,\-.\/0-9:;<=>?\@A-Z\[\\\]\^_`\{\|\}~\x7f-\xb4\xb6-\xde\xf7], U+0100, U+0102, U+0104, U+0106, U+0108 ...) \p{Changes_When_Uppercased: Y*} (Short: \p{CWU=Y}, \p{CWU}) (1485: [a-z\xb5\xdf-\xf6\xf8-\xff], U+0101, U+0103, U+0105, U+0107, U+0109 ...)
\p{Cher} \p{Cherokee} (= \p{Script_Extensions= Cherokee}) (NOT \p{Block=Cherokee}) (172) \p{Cherokee} \p{Script_Extensions=Cherokee} (Short: \p{Cher}; NOT \p{Block=Cherokee}) (172) X \p{Cherokee_Sup} \p{Cherokee_Supplement} (= \p{Block= Cherokee_Supplement}) (80) X \p{Cherokee_Supplement} \p{Block=Cherokee_Supplement} (Short: \p{InCherokeeSup}) (80) X \p{Chess_Symbols} \p{Block=Chess_Symbols} (112) \p{Chorasmian} \p{Script_Extensions=Chorasmian} (Short: \p{Chrs}; NOT \p{Block=Chorasmian}) (28) \p{Chrs} \p{Chorasmian} (= \p{Script_Extensions= Chorasmian}) (NOT \p{Block=Chorasmian}) (28) \p{CI} \p{Case_Ignorable} (= \p{Case_Ignorable= Y}) (2413) \p{CI: *} \p{Case_Ignorable: *} X \p{CJK} \p{CJK_Unified_Ideographs} (= \p{Block= CJK_Unified_Ideographs}) (20_992) X \p{CJK_Compat} \p{CJK_Compatibility} (= \p{Block= CJK_Compatibility}) (256) X \p{CJK_Compat_Forms} \p{CJK_Compatibility_Forms} (= \p{Block= CJK_Compatibility_Forms}) (32) X \p{CJK_Compat_Ideographs} \p{CJK_Compatibility_Ideographs} (= \p{Block=CJK_Compatibility_Ideographs}) (512) X \p{CJK_Compat_Ideographs_Sup} \p{CJK_Compatibility_Ideographs_- Supplement} (= \p{Block= CJK_Compatibility_Ideographs_- Supplement}) (544) X \p{CJK_Compatibility} \p{Block=CJK_Compatibility} (Short: \p{InCJKCompat}) (256) X \p{CJK_Compatibility_Forms} \p{Block=CJK_Compatibility_Forms} (Short: \p{InCJKCompatForms}) (32) X \p{CJK_Compatibility_Ideographs} \p{Block= CJK_Compatibility_Ideographs} (Short: \p{InCJKCompatIdeographs}) (512) X \p{CJK_Compatibility_Ideographs_Supplement} \p{Block= CJK_Compatibility_Ideographs_Supplement} (Short: \p{InCJKCompatIdeographsSup}) (544) X \p{CJK_Ext_A} \p{CJK_Unified_Ideographs_Extension_A} (= \p{Block= CJK_Unified_Ideographs_Extension_A}) (6592) X \p{CJK_Ext_B} \p{CJK_Unified_Ideographs_Extension_B} (= \p{Block= CJK_Unified_Ideographs_Extension_B}) (42_720) X \p{CJK_Ext_C} \p{CJK_Unified_Ideographs_Extension_C} (= \p{Block= CJK_Unified_Ideographs_Extension_C}) (4160) X \p{CJK_Ext_D} 
\p{CJK_Unified_Ideographs_Extension_D} (= \p{Block= CJK_Unified_Ideographs_Extension_D}) (224) X \p{CJK_Ext_E} \p{CJK_Unified_Ideographs_Extension_E} (= \p{Block= CJK_Unified_Ideographs_Extension_E}) (5776) X \p{CJK_Ext_F} \p{CJK_Unified_Ideographs_Extension_F} (= \p{Block= CJK_Unified_Ideographs_Extension_F}) (7488) X \p{CJK_Ext_G} \p{CJK_Unified_Ideographs_Extension_G} (= \p{Block= CJK_Unified_Ideographs_Extension_G}) (4944) X \p{CJK_Radicals_Sup} \p{CJK_Radicals_Supplement} (= \p{Block= CJK_Radicals_Supplement}) (128) X \p{CJK_Radicals_Supplement} \p{Block=CJK_Radicals_Supplement} (Short: \p{InCJKRadicalsSup}) (128) X \p{CJK_Strokes} \p{Block=CJK_Strokes} (48) X \p{CJK_Symbols} \p{CJK_Symbols_And_Punctuation} (= \p{Block=CJK_Symbols_And_Punctuation}) (64) X \p{CJK_Symbols_And_Punctuation} \p{Block= CJK_Symbols_And_Punctuation} (Short: \p{InCJKSymbols}) (64) X \p{CJK_Unified_Ideographs} \p{Block=CJK_Unified_Ideographs} (Short: \p{InCJK}) (20_992) X \p{CJK_Unified_Ideographs_Extension_A} \p{Block= CJK_Unified_Ideographs_Extension_A} (Short: \p{InCJKExtA}) (6592) X \p{CJK_Unified_Ideographs_Extension_B} \p{Block= CJK_Unified_Ideographs_Extension_B} (Short: \p{InCJKExtB}) (42_720) X \p{CJK_Unified_Ideographs_Extension_C} \p{Block= CJK_Unified_Ideographs_Extension_C} (Short: \p{InCJKExtC}) (4160) X \p{CJK_Unified_Ideographs_Extension_D} \p{Block= CJK_Unified_Ideographs_Extension_D} (Short: \p{InCJKExtD}) (224) X \p{CJK_Unified_Ideographs_Extension_E} \p{Block= CJK_Unified_Ideographs_Extension_E} (Short: \p{InCJKExtE}) (5776) X \p{CJK_Unified_Ideographs_Extension_F} \p{Block= CJK_Unified_Ideographs_Extension_F} (Short: \p{InCJKExtF}) (7488) X \p{CJK_Unified_Ideographs_Extension_G} \p{Block= CJK_Unified_Ideographs_Extension_G} (Short: \p{InCJKExtG}) (4944) \p{Close_Punctuation} \p{General_Category=Close_Punctuation} (Short: \p{Pe}) (73) \p{Cn} \p{Unassigned} (= \p{General_Category= Unassigned}) (830_672 plus all above-Unicode code points) \p{Cntrl} \p{XPosixCntrl} (=
\p{General_Category= Control}) (65) \p{Co} \p{Private_Use} (= \p{General_Category= Private_Use}) (NOT \p{Private_Use_Area}) (137_468) X \p{Combining_Diacritical_Marks} \p{Block= Combining_Diacritical_Marks} (Short: \p{InDiacriticals}) (112) X \p{Combining_Diacritical_Marks_Extended} \p{Block= Combining_Diacritical_Marks_Extended} (Short: \p{InDiacriticalsExt}) (80) X \p{Combining_Diacritical_Marks_For_Symbols} \p{Block= Combining_Diacritical_Marks_For_Symbols} (Short: \p{InDiacriticalsForSymbols}) (48) X \p{Combining_Diacritical_Marks_Supplement} \p{Block= Combining_Diacritical_Marks_Supplement} (Short: \p{InDiacriticalsSup}) (64) X \p{Combining_Half_Marks} \p{Block=Combining_Half_Marks} (Short: \p{InHalfMarks}) (16) \p{Combining_Mark} \p{Mark} (= \p{General_Category=Mark}) (2295) X \p{Combining_Marks_For_Symbols} \p{Combining_Diacritical_Marks_For_- Symbols} (= \p{Block= Combining_Diacritical_Marks_For_- Symbols}) (48) \p{Common} \p{Script_Extensions=Common} (Short: \p{Zyyy}) (7661) X \p{Common_Indic_Number_Forms} \p{Block=Common_Indic_Number_Forms} (Short: \p{InIndicNumberForms}) (16) \p{Comp_Ex} \p{Full_Composition_Exclusion} (= \p{Full_Composition_Exclusion=Y}) (1120) \p{Comp_Ex: *} \p{Full_Composition_Exclusion: *} X \p{Compat_Jamo} \p{Hangul_Compatibility_Jamo} (= \p{Block= Hangul_Compatibility_Jamo}) (96) \p{Composition_Exclusion} \p{Composition_Exclusion=Y} (Short: \p{CE}) (81) \p{Composition_Exclusion: N*} (Short: \p{CE=N}, \P{CE}) (1_114_031 plus all above-Unicode code points: U+0000..0957, U+0960..09DB, U+09DE, U+09E0..0A32, U+0A34..0A35, U+0A37..0A58 ...) \p{Composition_Exclusion: Y*} (Short: \p{CE=Y}, \p{CE}) (81: U+0958..095F, U+09DC..09DD, U+09DF, U+0A33, U+0A36, U+0A59..0A5B ...) 
  \p{Connector_Punctuation}  \p{General_Category=Connector_Punctuation} (Short: \p{Pc}) (10)
  \p{Control}  \p{XPosixCntrl} (= \p{General_Category=Control}) (65)
X \p{Control_Pictures}  \p{Block=Control_Pictures} (64)
  \p{Copt}  \p{Coptic} (= \p{Script_Extensions=Coptic}) (NOT \p{Block=Coptic}) (165)
  \p{Coptic}  \p{Script_Extensions=Coptic} (Short: \p{Copt}; NOT \p{Block=Coptic}) (165)
X \p{Coptic_Epact_Numbers}  \p{Block=Coptic_Epact_Numbers} (32)
X \p{Counting_Rod}  \p{Counting_Rod_Numerals} (= \p{Block=Counting_Rod_Numerals}) (32)
X \p{Counting_Rod_Numerals}  \p{Block=Counting_Rod_Numerals} (Short: \p{InCountingRod}) (32)
  \p{Cprt}  \p{Cypriot} (= \p{Script_Extensions=Cypriot}) (112)
  \p{Cs}  \p{Surrogate} (= \p{General_Category=Surrogate}) (2048)
  \p{Cuneiform}  \p{Script_Extensions=Cuneiform} (Short: \p{Xsux}; NOT \p{Block=Cuneiform}) (1234)
X \p{Cuneiform_Numbers}  \p{Cuneiform_Numbers_And_Punctuation} (= \p{Block=Cuneiform_Numbers_And_Punctuation}) (128)
X \p{Cuneiform_Numbers_And_Punctuation}  \p{Block=Cuneiform_Numbers_And_Punctuation} (Short: \p{InCuneiformNumbers}) (128)
  \p{Currency_Symbol}  \p{General_Category=Currency_Symbol} (Short: \p{Sc}) (62)
X \p{Currency_Symbols}  \p{Block=Currency_Symbols} (48)
  \p{CWCF}  \p{Changes_When_Casefolded} (= \p{Changes_When_Casefolded=Y}) (1466)
  \p{CWCF: *}  \p{Changes_When_Casefolded: *}
  \p{CWCM}  \p{Changes_When_Casemapped} (= \p{Changes_When_Casemapped=Y}) (2847)
  \p{CWCM: *}  \p{Changes_When_Casemapped: *}
  \p{CWKCF}  \p{Changes_When_NFKC_Casefolded} (= \p{Changes_When_NFKC_Casefolded=Y}) (10_329)
  \p{CWKCF: *}  \p{Changes_When_NFKC_Casefolded: *}
  \p{CWL}  \p{Changes_When_Lowercased} (= \p{Changes_When_Lowercased=Y}) (1393)
  \p{CWL: *}  \p{Changes_When_Lowercased: *}
  \p{CWT}  \p{Changes_When_Titlecased} (= \p{Changes_When_Titlecased=Y}) (1412)
  \p{CWT: *}  \p{Changes_When_Titlecased: *}
  \p{CWU}  \p{Changes_When_Uppercased} (= \p{Changes_When_Uppercased=Y}) (1485)
  \p{CWU: *}  \p{Changes_When_Uppercased: *}
  \p{Cypriot}  \p{Script_Extensions=Cypriot} (Short: \p{Cprt}) (112)
X \p{Cypriot_Syllabary}  \p{Block=Cypriot_Syllabary} (64)
  \p{Cyrillic}  \p{Script_Extensions=Cyrillic} (Short: \p{Cyrl}; NOT \p{Block=Cyrillic}) (447)
X \p{Cyrillic_Ext_A}  \p{Cyrillic_Extended_A} (= \p{Block=Cyrillic_Extended_A}) (32)
X \p{Cyrillic_Ext_B}  \p{Cyrillic_Extended_B} (= \p{Block=Cyrillic_Extended_B}) (96)
X \p{Cyrillic_Ext_C}  \p{Cyrillic_Extended_C} (= \p{Block=Cyrillic_Extended_C}) (16)
X \p{Cyrillic_Extended_A}  \p{Block=Cyrillic_Extended_A} (Short: \p{InCyrillicExtA}) (32)
X \p{Cyrillic_Extended_B}  \p{Block=Cyrillic_Extended_B} (Short: \p{InCyrillicExtB}) (96)
X \p{Cyrillic_Extended_C}  \p{Block=Cyrillic_Extended_C} (Short: \p{InCyrillicExtC}) (16)
X \p{Cyrillic_Sup}  \p{Cyrillic_Supplement} (= \p{Block=Cyrillic_Supplement}) (48)
X \p{Cyrillic_Supplement}  \p{Block=Cyrillic_Supplement} (Short: \p{InCyrillicSup}) (48)
X \p{Cyrillic_Supplementary}  \p{Cyrillic_Supplement} (= \p{Block=Cyrillic_Supplement}) (48)
  \p{Cyrl}  \p{Cyrillic} (= \p{Script_Extensions=Cyrillic}) (NOT \p{Block=Cyrillic}) (447)
  \p{Dash}  \p{Dash=Y} (29)
  \p{Dash: N*}  (Single: \P{Dash}) (1_114_083 plus all above-Unicode code points: [\x00-\x20!\"#\$\%&\'\(\)*+,.\/0-9:;<=>?\@A-Z\[\\\]\^_`a-z\{\|\}~\x7f-\xff], U+0100..0589, U+058B..05BD, U+05BF..13FF, U+1401..1805, U+1807..200F ...)
  \p{Dash: Y*}  (Single: \p{Dash}) (29: [\-], U+058A, U+05BE, U+1400, U+1806, U+2010..2015 ...)
  \p{Dash_Punctuation}  \p{General_Category=Dash_Punctuation} (Short: \p{Pd}) (25)
  \p{Decimal_Number}  \p{XPosixDigit} (= \p{General_Category=Decimal_Number}) (650)
  \p{Decomposition_Type: Can}  \p{Decomposition_Type=Canonical} (13_233)
  \p{Decomposition_Type: Canonical}  (Short: \p{Dt=Can}) (13_233: [\xc0-\xc5\xc7-\xcf\xd1-\xd6\xd9-\xdd\xe0-\xe5\xe7-\xef\xf1-\xf6\xf9-\xfd\xff], U+0100..010F, U+0112..0125, U+0128..0130, U+0134..0137, U+0139..013E ...)
  \p{Decomposition_Type: Circle}  (Short: \p{Dt=Enc}) (240: U+2460..2473, U+24B6..24EA, U+3244..3247, U+3251..327E, U+3280..32BF, U+32D0..32FE ...)
  \p{Decomposition_Type: Com}  \p{Decomposition_Type=Compat} (720)
  \p{Decomposition_Type: Compat}  (Short: \p{Dt=Com}) (720: [\xa8\xaf\xb4-\xb5\xb8], U+0132..0133, U+013F..0140, U+0149, U+017F, U+01C4..01CC ...)
  \p{Decomposition_Type: Enc}  \p{Decomposition_Type=Circle} (240)
  \p{Decomposition_Type: Fin}  \p{Decomposition_Type=Final} (240)
  \p{Decomposition_Type: Final}  (Short: \p{Dt=Fin}) (240: U+FB51, U+FB53, U+FB57, U+FB5B, U+FB5F, U+FB63 ...)
  \p{Decomposition_Type: Font}  (Short: \p{Dt=Font}) (1194: U+2102, U+210A..2113, U+2115, U+2119..211D, U+2124, U+2128 ...)
  \p{Decomposition_Type: Fra}  \p{Decomposition_Type=Fraction} (20)
  \p{Decomposition_Type: Fraction}  (Short: \p{Dt=Fra}) (20: [\xbc-\xbe], U+2150..215F, U+2189)
  \p{Decomposition_Type: Init}  \p{Decomposition_Type=Initial} (171)
  \p{Decomposition_Type: Initial}  (Short: \p{Dt=Init}) (171: U+FB54, U+FB58, U+FB5C, U+FB60, U+FB64, U+FB68 ...)
  \p{Decomposition_Type: Iso}  \p{Decomposition_Type=Isolated} (238)
  \p{Decomposition_Type: Isolated}  (Short: \p{Dt=Iso}) (238: U+FB50, U+FB52, U+FB56, U+FB5A, U+FB5E, U+FB62 ...)
  \p{Decomposition_Type: Med}  \p{Decomposition_Type=Medial} (82)
  \p{Decomposition_Type: Medial}  (Short: \p{Dt=Med}) (82: U+FB55, U+FB59, U+FB5D, U+FB61, U+FB65, U+FB69 ...)
  \p{Decomposition_Type: Nar}  \p{Decomposition_Type=Narrow} (122)
  \p{Decomposition_Type: Narrow}  (Short: \p{Dt=Nar}) (122: U+FF61..FFBE, U+FFC2..FFC7, U+FFCA..FFCF, U+FFD2..FFD7, U+FFDA..FFDC, U+FFE8..FFEE)
  \p{Decomposition_Type: Nb}  \p{Decomposition_Type=Nobreak} (5)
  \p{Decomposition_Type: Nobreak}  (Short: \p{Dt=Nb}) (5: [\xa0], U+0F0C, U+2007, U+2011, U+202F)
  \p{Decomposition_Type: Non_Canon}  \p{Decomposition_Type=Non_Canonical} (Perl extension) (3675)
  \p{Decomposition_Type: Non_Canonical}  Union of all non-canonical decompositions (Short: \p{Dt=NonCanon}) (Perl extension) (3675: [\xa0\xa8\xaa\xaf\xb2-\xb5\xb8-\xba\xbc-\xbe], U+0132..0133, U+013F..0140, U+0149, U+017F, U+01C4..01CC ...)
  \p{Decomposition_Type: None}  (Short: \p{Dt=None}) (1_097_204 plus all above-Unicode code points: [\x00-\x9f\xa1-\xa7\xa9\xab-\xae\xb0-\xb1\xb6-\xb7\xbb\xbf\xc6\xd0\xd7-\xd8\xde-\xdf\xe6\xf0\xf7-\xf8\xfe], U+0110..0111, U+0126..0127, U+0131, U+0138, U+0141..0142 ...)
  \p{Decomposition_Type: Small}  (Short: \p{Dt=Sml}) (26: U+FE50..FE52, U+FE54..FE66, U+FE68..FE6B)
  \p{Decomposition_Type: Sml}  \p{Decomposition_Type=Small} (26)
  \p{Decomposition_Type: Sqr}  \p{Decomposition_Type=Square} (286)
  \p{Decomposition_Type: Square}  (Short: \p{Dt=Sqr}) (286: U+3250, U+32CC..32CF, U+32FF..3357, U+3371..33DF, U+33FF, U+1F130..1F14F ...)
  \p{Decomposition_Type: Sub}  (Short: \p{Dt=Sub}) (38: U+1D62..1D6A, U+2080..208E, U+2090..209C, U+2C7C)
  \p{Decomposition_Type: Sup}  \p{Decomposition_Type=Super} (154)
  \p{Decomposition_Type: Super}  (Short: \p{Dt=Sup}) (154: [\xaa\xb2-\xb3\xb9-\xba], U+02B0..02B8, U+02E0..02E4, U+10FC, U+1D2C..1D2E, U+1D30..1D3A ...)
  \p{Decomposition_Type: Vert}  \p{Decomposition_Type=Vertical} (35)
  \p{Decomposition_Type: Vertical}  (Short: \p{Dt=Vert}) (35: U+309F, U+30FF, U+FE10..FE19, U+FE30..FE44, U+FE47..FE48)
  \p{Decomposition_Type: Wide}  (Short: \p{Dt=Wide}) (104: U+3000, U+FF01..FF60, U+FFE0..FFE6)
  \p{Default_Ignorable_Code_Point}  \p{Default_Ignorable_Code_Point=Y} (Short: \p{DI}) (4173)
  \p{Default_Ignorable_Code_Point: N*}  (Short: \p{DI=N}, \P{DI}) (1_109_939 plus all above-Unicode code points: [\x00-\xac\xae-\xff], U+0100..034E, U+0350..061B, U+061D..115E, U+1161..17B3, U+17B6..180A ...)
  \p{Default_Ignorable_Code_Point: Y*}  (Short: \p{DI=Y}, \p{DI}) (4173: [\xad], U+034F, U+061C, U+115F..1160, U+17B4..17B5, U+180B..180E ...)
  \p{Dep}  \p{Deprecated} (= \p{Deprecated=Y}) (15)
  \p{Dep: *}  \p{Deprecated: *}
  \p{Deprecated}  \p{Deprecated=Y} (Short: \p{Dep}) (15)
  \p{Deprecated: N*}  (Short: \p{Dep=N}, \P{Dep}) (1_114_097 plus all above-Unicode code points: U+0000..0148, U+014A..0672, U+0674..0F76, U+0F78, U+0F7A..17A2, U+17A5..2069 ...)
  \p{Deprecated: Y*}  (Short: \p{Dep=Y}, \p{Dep}) (15: U+0149, U+0673, U+0F77, U+0F79, U+17A3..17A4, U+206A..206F ...)
  \p{Deseret}  \p{Script_Extensions=Deseret} (Short: \p{Dsrt}) (80)
  \p{Deva}  \p{Devanagari} (= \p{Script_Extensions=Devanagari}) (NOT \p{Block=Devanagari}) (210)
  \p{Devanagari}  \p{Script_Extensions=Devanagari} (Short: \p{Deva}; NOT \p{Block=Devanagari}) (210)
X \p{Devanagari_Ext}  \p{Devanagari_Extended} (= \p{Block=Devanagari_Extended}) (32)
X \p{Devanagari_Extended}  \p{Block=Devanagari_Extended} (Short: \p{InDevanagariExt}) (32)
  \p{DI}  \p{Default_Ignorable_Code_Point} (= \p{Default_Ignorable_Code_Point=Y}) (4173)
  \p{DI: *}  \p{Default_Ignorable_Code_Point: *}
  \p{Dia}  \p{Diacritic} (= \p{Diacritic=Y}) (882)
  \p{Dia: *}  \p{Diacritic: *}
  \p{Diacritic}  \p{Diacritic=Y} (Short: \p{Dia}) (882)
  \p{Diacritic: N*}  (Short: \p{Dia=N}, \P{Dia}) (1_113_230 plus all above-Unicode code points: [\x00-\x20!\"#\$\%&\'\(\)*+,\-.\/0-9:;<=>?\@A-Z\[\\\]_a-z\{\|\}~\x7f-\xa7\xa9-\xae\xb0-\xb3\xb5-\xb6\xb9-\xff], U+0100..02AF, U+034F, U+0358..035C, U+0363..0373, U+0376..0379 ...)
  \p{Diacritic: Y*}  (Short: \p{Dia=Y}, \p{Dia}) (882: [\^`\xa8\xaf\xb4\xb7-\xb8], U+02B0..034E, U+0350..0357, U+035D..0362, U+0374..0375, U+037A ...)
X \p{Diacriticals}  \p{Combining_Diacritical_Marks} (= \p{Block=Combining_Diacritical_Marks}) (112)
X \p{Diacriticals_Ext}  \p{Combining_Diacritical_Marks_Extended} (= \p{Block=Combining_Diacritical_Marks_Extended}) (80)
X \p{Diacriticals_For_Symbols}  \p{Combining_Diacritical_Marks_For_Symbols} (= \p{Block=Combining_Diacritical_Marks_For_Symbols}) (48)
X \p{Diacriticals_Sup}  \p{Combining_Diacritical_Marks_Supplement} (= \p{Block=Combining_Diacritical_Marks_Supplement}) (64)
  \p{Diak}  \p{Dives_Akuru} (= \p{Script_Extensions=Dives_Akuru}) (NOT \p{Block=Dives_Akuru}) (72)
  \p{Digit}  \p{XPosixDigit} (= \p{General_Category=Decimal_Number}) (650)
X \p{Dingbats}  \p{Block=Dingbats} (192)
  \p{Dives_Akuru}  \p{Script_Extensions=Dives_Akuru} (Short: \p{Diak}; NOT \p{Block=Dives_Akuru}) (72)
  \p{Dogr}  \p{Dogra} (= \p{Script_Extensions=Dogra}) (NOT \p{Block=Dogra}) (82)
  \p{Dogra}  \p{Script_Extensions=Dogra} (Short: \p{Dogr}; NOT \p{Block=Dogra}) (82)
X \p{Domino}  \p{Domino_Tiles} (= \p{Block=Domino_Tiles}) (112)
X \p{Domino_Tiles}  \p{Block=Domino_Tiles} (Short: \p{InDomino}) (112)
  \p{Dsrt}  \p{Deseret} (= \p{Script_Extensions=Deseret}) (80)
  \p{Dt: *}  \p{Decomposition_Type: *}
  \p{Dupl}  \p{Duployan} (= \p{Script_Extensions=Duployan}) (NOT \p{Block=Duployan}) (147)
  \p{Duployan}  \p{Script_Extensions=Duployan} (Short: \p{Dupl}; NOT \p{Block=Duployan}) (147)
  \p{Ea: *}  \p{East_Asian_Width: *}
X \p{Early_Dynastic_Cuneiform}  \p{Block=Early_Dynastic_Cuneiform} (208)
  \p{East_Asian_Width: A}  \p{East_Asian_Width=Ambiguous} (138_739)
  \p{East_Asian_Width: Ambiguous}  (Short: \p{Ea=A}) (138_739: [\xa1\xa4\xa7-\xa8\xaa\xad-\xae\xb0-\xb4\xb6-\xba\xbc-\xbf\xc6\xd0\xd7-\xd8\xde-\xe1\xe6\xe8-\xea\xec-\xed\xf0\xf2-\xf3\xf7-\xfa\xfc\xfe], U+0101, U+0111, U+0113, U+011B, U+0126..0127 ...)
  \p{East_Asian_Width: F}  \p{East_Asian_Width=Fullwidth} (104)
  \p{East_Asian_Width: Fullwidth}  (Short: \p{Ea=F}) (104: U+3000, U+FF01..FF60, U+FFE0..FFE6)
  \p{East_Asian_Width: H}  \p{East_Asian_Width=Halfwidth} (123)
  \p{East_Asian_Width: Halfwidth}  (Short: \p{Ea=H}) (123: U+20A9, U+FF61..FFBE, U+FFC2..FFC7, U+FFCA..FFCF, U+FFD2..FFD7, U+FFDA..FFDC ...)
  \p{East_Asian_Width: N}  \p{East_Asian_Width=Neutral} (792_699 plus all above-Unicode code points)
  \p{East_Asian_Width: Na}  \p{East_Asian_Width=Narrow} (111)
  \p{East_Asian_Width: Narrow}  (Short: \p{Ea=Na}) (111: [\x20-\x7e\xa2-\xa3\xa5-\xa6\xac\xaf], U+27E6..27ED, U+2985..2986)
  \p{East_Asian_Width: Neutral}  (Short: \p{Ea=N}) (792_699 plus all above-Unicode code points: [\x00-\x1f\x7f-\xa0\xa9\xab\xb5\xbb\xc0-\xc5\xc7-\xcf\xd1-\xd6\xd9-\xdd\xe2-\xe5\xe7\xeb\xee-\xef\xf1\xf4-\xf6\xfb\xfd\xff], U+00FF..0100, U+0102..0110, U+0112, U+0114..011A, U+011C..0125 ...)
  \p{East_Asian_Width: W}  \p{East_Asian_Width=Wide} (182_336)
  \p{East_Asian_Width: Wide}  (Short: \p{Ea=W}) (182_336: U+1100..115F, U+231A..231B, U+2329..232A, U+23E9..23EC, U+23F0, U+23F3 ...)
  \p{EBase}  \p{Emoji_Modifier_Base} (= \p{Emoji_Modifier_Base=Y}) (122)
  \p{EBase: *}  \p{Emoji_Modifier_Base: *}
  \p{EComp}  \p{Emoji_Component} (= \p{Emoji_Component=Y}) (146)
  \p{EComp: *}  \p{Emoji_Component: *}
  \p{Egyp}  \p{Egyptian_Hieroglyphs} (= \p{Script_Extensions=Egyptian_Hieroglyphs}) (NOT \p{Block=Egyptian_Hieroglyphs}) (1080)
X \p{Egyptian_Hieroglyph_Format_Controls}  \p{Block=Egyptian_Hieroglyph_Format_Controls} (16)
  \p{Egyptian_Hieroglyphs}  \p{Script_Extensions=Egyptian_Hieroglyphs} (Short: \p{Egyp}; NOT \p{Block=Egyptian_Hieroglyphs}) (1080)
  \p{Elba}  \p{Elbasan} (= \p{Script_Extensions=Elbasan}) (NOT \p{Block=Elbasan}) (40)
  \p{Elbasan}  \p{Script_Extensions=Elbasan} (Short: \p{Elba}; NOT \p{Block=Elbasan}) (40)
  \p{Elym}  \p{Elymaic} (= \p{Script_Extensions=Elymaic}) (NOT \p{Block=Elymaic}) (23)
  \p{Elymaic}  \p{Script_Extensions=Elymaic} (Short: \p{Elym}; NOT \p{Block=Elymaic}) (23)
  \p{EMod}  \p{Emoji_Modifier} (= \p{Emoji_Modifier=Y}) (5)
  \p{EMod: *}  \p{Emoji_Modifier: *}
  \p{Emoji}  \p{Emoji=Y} (1367)
  \p{Emoji: N*}  (Single: \P{Emoji}) (1_112_745 plus all above-Unicode code points: [\x00-\x20!\"\$\%&\'\(\)+,\-.\/:;<=>?\@A-Z\[\\\]\^_`a-z\{\|\}~\x7f-\xa8\xaa-\xad\xaf-\xff], U+0100..203B, U+203D..2048, U+204A..2121, U+2123..2138, U+213A..2193 ...)
  \p{Emoji: Y*}  (Single: \p{Emoji}) (1367: [#*0-9\xa9\xae], U+203C, U+2049, U+2122, U+2139, U+2194..2199 ...)
  \p{Emoji_Component}  \p{Emoji_Component=Y} (Short: \p{EComp}) (146)
  \p{Emoji_Component: N*}  (Short: \p{EComp=N}, \P{EComp}) (1_113_966 plus all above-Unicode code points: [\x00-\x20!\"\$\%&\'\(\)+,\-.\/:;<=>?\@A-Z\[\\\]\^_`a-z\{\|\}~\x7f-\xff], U+0100..200C, U+200E..20E2, U+20E4..FE0E, U+FE10..1F1E5, U+1F200..1F3FA ...)
  \p{Emoji_Component: Y*}  (Short: \p{EComp=Y}, \p{EComp}) (146: [#*0-9], U+200D, U+20E3, U+FE0F, U+1F1E6..1F1FF, U+1F3FB..1F3FF ...)
  \p{Emoji_Modifier}  \p{Emoji_Modifier=Y} (Short: \p{EMod}) (5)
  \p{Emoji_Modifier: N*}  (Short: \p{EMod=N}, \P{EMod}) (1_114_107 plus all above-Unicode code points: U+0000..1F3FA, U+1F400..infinity)
  \p{Emoji_Modifier: Y*}  (Short: \p{EMod=Y}, \p{EMod}) (5: U+1F3FB..1F3FF)
  \p{Emoji_Modifier_Base}  \p{Emoji_Modifier_Base=Y} (Short: \p{EBase}) (122)
  \p{Emoji_Modifier_Base: N*}  (Short: \p{EBase=N}, \P{EBase}) (1_113_990 plus all above-Unicode code points: U+0000..261C, U+261E..26F8, U+26FA..2709, U+270E..1F384, U+1F386..1F3C1, U+1F3C5..1F3C6 ...)
  \p{Emoji_Modifier_Base: Y*}  (Short: \p{EBase=Y}, \p{EBase}) (122: U+261D, U+26F9, U+270A..270D, U+1F385, U+1F3C2..1F3C4, U+1F3C7 ...)
  \p{Emoji_Presentation}  \p{Emoji_Presentation=Y} (Short: \p{EPres}) (1148)
  \p{Emoji_Presentation: N*}  (Short: \p{EPres=N}, \P{EPres}) (1_112_964 plus all above-Unicode code points: U+0000..2319, U+231C..23E8, U+23ED..23EF, U+23F1..23F2, U+23F4..25FC, U+25FF..2613 ...)
  \p{Emoji_Presentation: Y*}  (Short: \p{EPres=Y}, \p{EPres}) (1148: U+231A..231B, U+23E9..23EC, U+23F0, U+23F3, U+25FD..25FE, U+2614..2615 ...)
X \p{Emoticons}  \p{Block=Emoticons} (80)
X \p{Enclosed_Alphanum}  \p{Enclosed_Alphanumerics} (= \p{Block=Enclosed_Alphanumerics}) (160)
X \p{Enclosed_Alphanum_Sup}  \p{Enclosed_Alphanumeric_Supplement} (= \p{Block=Enclosed_Alphanumeric_Supplement}) (256)
X \p{Enclosed_Alphanumeric_Supplement}  \p{Block=Enclosed_Alphanumeric_Supplement} (Short: \p{InEnclosedAlphanumSup}) (256)
X \p{Enclosed_Alphanumerics}  \p{Block=Enclosed_Alphanumerics} (Short: \p{InEnclosedAlphanum}) (160)
X \p{Enclosed_CJK}  \p{Enclosed_CJK_Letters_And_Months} (= \p{Block=Enclosed_CJK_Letters_And_Months}) (256)
X \p{Enclosed_CJK_Letters_And_Months}  \p{Block=Enclosed_CJK_Letters_And_Months} (Short: \p{InEnclosedCJK}) (256)
X \p{Enclosed_Ideographic_Sup}  \p{Enclosed_Ideographic_Supplement} (= \p{Block=Enclosed_Ideographic_Supplement}) (256)
X \p{Enclosed_Ideographic_Supplement}  \p{Block=Enclosed_Ideographic_Supplement} (Short: \p{InEnclosedIdeographicSup}) (256)
  \p{Enclosing_Mark}  \p{General_Category=Enclosing_Mark} (Short: \p{Me}) (13)
  \p{EPres}  \p{Emoji_Presentation} (= \p{Emoji_Presentation=Y}) (1148)
  \p{EPres: *}  \p{Emoji_Presentation: *}
  \p{Ethi}  \p{Ethiopic} (= \p{Script_Extensions=Ethiopic}) (NOT \p{Block=Ethiopic}) (495)
  \p{Ethiopic}  \p{Script_Extensions=Ethiopic} (Short: \p{Ethi}; NOT \p{Block=Ethiopic}) (495)
X \p{Ethiopic_Ext}  \p{Ethiopic_Extended} (= \p{Block=Ethiopic_Extended}) (96)
X \p{Ethiopic_Ext_A}  \p{Ethiopic_Extended_A} (= \p{Block=Ethiopic_Extended_A}) (48)
X \p{Ethiopic_Extended}  \p{Block=Ethiopic_Extended} (Short: \p{InEthiopicExt}) (96)
X \p{Ethiopic_Extended_A}  \p{Block=Ethiopic_Extended_A} (Short: \p{InEthiopicExtA}) (48)
X \p{Ethiopic_Sup}  \p{Ethiopic_Supplement} (= \p{Block=Ethiopic_Supplement}) (32)
X \p{Ethiopic_Supplement}  \p{Block=Ethiopic_Supplement} (Short: \p{InEthiopicSup}) (32)
  \p{Ext}  \p{Extender} (= \p{Extender=Y}) (48)
  \p{Ext: *}  \p{Extender: *}
  \p{Extended_Pictographic}  \p{Extended_Pictographic=Y} (Short: \p{ExtPict}) (3537)
  \p{Extended_Pictographic: N*}  (Short: \p{ExtPict=N}, \P{ExtPict}) (1_110_575 plus all above-Unicode code points: [\x00-\xa8\xaa-\xad\xaf-\xff], U+0100..203B, U+203D..2048, U+204A..2121, U+2123..2138, U+213A..2193 ...)
  \p{Extended_Pictographic: Y*}  (Short: \p{ExtPict=Y}, \p{ExtPict}) (3537: [\xa9\xae], U+203C, U+2049, U+2122, U+2139, U+2194..2199 ...)
  \p{Extender}  \p{Extender=Y} (Short: \p{Ext}) (48)
  \p{Extender: N*}  (Short: \p{Ext=N}, \P{Ext}) (1_114_064 plus all above-Unicode code points: [\x00-\xb6\xb8-\xff], U+0100..02CF, U+02D2..063F, U+0641..07F9, U+07FB..0B54, U+0B56..0E45 ...)
  \p{Extender: Y*}  (Short: \p{Ext=Y}, \p{Ext}) (48: [\xb7], U+02D0..02D1, U+0640, U+07FA, U+0B55, U+0E46 ...)
  \p{ExtPict}  \p{Extended_Pictographic} (= \p{Extended_Pictographic=Y}) (3537)
  \p{ExtPict: *}  \p{Extended_Pictographic: *}
  \p{Final_Punctuation}  \p{General_Category=Final_Punctuation} (Short: \p{Pf}) (10)
  \p{Format}  \p{General_Category=Format} (Short: \p{Cf}) (161)
  \p{Full_Composition_Exclusion}  \p{Full_Composition_Exclusion=Y} (Short: \p{CompEx}) (1120)
  \p{Full_Composition_Exclusion: N*}  (Short: \p{CompEx=N}, \P{CompEx}) (1_112_992 plus all above-Unicode code points: U+0000..033F, U+0342, U+0345..0373, U+0375..037D, U+037F..0386, U+0388..0957 ...)
  \p{Full_Composition_Exclusion: Y*}  (Short: \p{CompEx=Y}, \p{CompEx}) (1120: U+0340..0341, U+0343..0344, U+0374, U+037E, U+0387, U+0958..095F ...)
  \p{Gc: *}  \p{General_Category: *}
  \p{GCB: *}  \p{Grapheme_Cluster_Break: *}
  \p{General_Category: C}  \p{General_Category=Other} (970_414 plus all above-Unicode code points)
  \p{General_Category: Cased_Letter}  [\p{Ll}\p{Lu}\p{Lt}] (Short: \p{Gc=LC}, \p{LC}) (3977: [A-Za-z\xb5\xc0-\xd6\xd8-\xf6\xf8-\xff], U+0100..01BA, U+01BC..01BF, U+01C4..0293, U+0295..02AF, U+0370..0373 ...)
  \p{General_Category: Cc}  \p{General_Category=Control} (65)
  \p{General_Category: Cf}  \p{General_Category=Format} (161)
  \p{General_Category: Close_Punctuation}  (Short: \p{Gc=Pe}, \p{Pe}) (73: [\)\]\}], U+0F3B, U+0F3D, U+169C, U+2046, U+207E ...)
  \p{General_Category: Cn}  \p{General_Category=Unassigned} (830_672 plus all above-Unicode code points)
  \p{General_Category: Cntrl}  \p{General_Category=Control} (65)
  \p{General_Category: Co}  \p{General_Category=Private_Use} (137_468)
  \p{General_Category: Combining_Mark}  \p{General_Category=Mark} (2295)
  \p{General_Category: Connector_Punctuation}  (Short: \p{Gc=Pc}, \p{Pc}) (10: [_], U+203F..2040, U+2054, U+FE33..FE34, U+FE4D..FE4F, U+FF3F)
  \p{General_Category: Control}  (Short: \p{Gc=Cc}, \p{Cc}) (65: [\x00-\x1f\x7f-\x9f])
  \p{General_Category: Cs}  \p{General_Category=Surrogate} (2048)
  \p{General_Category: Currency_Symbol}  (Short: \p{Gc=Sc}, \p{Sc}) (62: [\$\xa2-\xa5], U+058F, U+060B, U+07FE..07FF, U+09F2..09F3, U+09FB ...)
  \p{General_Category: Dash_Punctuation}  (Short: \p{Gc=Pd}, \p{Pd}) (25: [\-], U+058A, U+05BE, U+1400, U+1806, U+2010..2015 ...)
  \p{General_Category: Decimal_Number}  (Short: \p{Gc=Nd}, \p{Nd}) (650: [0-9], U+0660..0669, U+06F0..06F9, U+07C0..07C9, U+0966..096F, U+09E6..09EF ...)
  \p{General_Category: Digit}  \p{General_Category=Decimal_Number} (650)
  \p{General_Category: Enclosing_Mark}  (Short: \p{Gc=Me}, \p{Me}) (13: U+0488..0489, U+1ABE, U+20DD..20E0, U+20E2..20E4, U+A670..A672)
  \p{General_Category: Final_Punctuation}  (Short: \p{Gc=Pf}, \p{Pf}) (10: [\xbb], U+2019, U+201D, U+203A, U+2E03, U+2E05 ...)
  \p{General_Category: Format}  (Short: \p{Gc=Cf}, \p{Cf}) (161: [\xad], U+0600..0605, U+061C, U+06DD, U+070F, U+08E2 ...)
  \p{General_Category: Initial_Punctuation}  (Short: \p{Gc=Pi}, \p{Pi}) (12: [\xab], U+2018, U+201B..201C, U+201F, U+2039, U+2E02 ...)
  \p{General_Category: L}  \p{General_Category=Letter} (131_241)
X \p{General_Category: L&}  \p{General_Category=Cased_Letter} (3977)
X \p{General_Category: L_}  \p{General_Category=Cased_Letter} Note the trailing '_' matters in spite of loose matching rules. (3977)
  \p{General_Category: LC}  \p{General_Category=Cased_Letter} (3977)
  \p{General_Category: Letter}  (Short: \p{Gc=L}, \p{L}) (131_241: [A-Za-z\xaa\xb5\xba\xc0-\xd6\xd8-\xf6\xf8-\xff], U+0100..02C1, U+02C6..02D1, U+02E0..02E4, U+02EC, U+02EE ...)
  \p{General_Category: Letter_Number}  (Short: \p{Gc=Nl}, \p{Nl}) (236: U+16EE..16F0, U+2160..2182, U+2185..2188, U+3007, U+3021..3029, U+3038..303A ...)
  \p{General_Category: Line_Separator}  (Short: \p{Gc=Zl}, \p{Zl}) (1: U+2028)
  \p{General_Category: Ll}  \p{General_Category=Lowercase_Letter} (/i= General_Category=Cased_Letter) (2155)
  \p{General_Category: Lm}  \p{General_Category=Modifier_Letter} (260)
  \p{General_Category: Lo}  \p{General_Category=Other_Letter} (127_004)
  \p{General_Category: Lowercase_Letter}  (Short: \p{Gc=Ll}, \p{Ll}; /i= General_Category=Cased_Letter) (2155: [a-z\xb5\xdf-\xf6\xf8-\xff], U+0101, U+0103, U+0105, U+0107, U+0109 ...)
  \p{General_Category: Lt}  \p{General_Category=Titlecase_Letter} (/i= General_Category=Cased_Letter) (31)
  \p{General_Category: Lu}  \p{General_Category=Uppercase_Letter} (/i= General_Category=Cased_Letter) (1791)
  \p{General_Category: M}  \p{General_Category=Mark} (2295)
  \p{General_Category: Mark}  (Short: \p{Gc=M}, \p{M}) (2295: U+0300..036F, U+0483..0489, U+0591..05BD, U+05BF, U+05C1..05C2, U+05C4..05C5 ...)
  \p{General_Category: Math_Symbol}  (Short: \p{Gc=Sm}, \p{Sm}) (948: [+<=>\|~\xac\xb1\xd7\xf7], U+03F6, U+0606..0608, U+2044, U+2052, U+207A..207C ...)
  \p{General_Category: Mc}  \p{General_Category=Spacing_Mark} (443)
  \p{General_Category: Me}  \p{General_Category=Enclosing_Mark} (13)
  \p{General_Category: Mn}  \p{General_Category=Nonspacing_Mark} (1839)
  \p{General_Category: Modifier_Letter}  (Short: \p{Gc=Lm}, \p{Lm}) (260: U+02B0..02C1, U+02C6..02D1, U+02E0..02E4, U+02EC, U+02EE, U+0374 ...)
  \p{General_Category: Modifier_Symbol}  (Short: \p{Gc=Sk}, \p{Sk}) (123: [\^`\xa8\xaf\xb4\xb8], U+02C2..02C5, U+02D2..02DF, U+02E5..02EB, U+02ED, U+02EF..02FF ...)
  \p{General_Category: N}  \p{General_Category=Number} (1781)
  \p{General_Category: Nd}  \p{General_Category=Decimal_Number} (650)
  \p{General_Category: Nl}  \p{General_Category=Letter_Number} (236)
  \p{General_Category: No}  \p{General_Category=Other_Number} (895)
  \p{General_Category: Nonspacing_Mark}  (Short: \p{Gc=Mn}, \p{Mn}) (1839: U+0300..036F, U+0483..0487, U+0591..05BD, U+05BF, U+05C1..05C2, U+05C4..05C5 ...)
  \p{General_Category: Number}  (Short: \p{Gc=N}, \p{N}) (1781: [0-9\xb2-\xb3\xb9\xbc-\xbe], U+0660..0669, U+06F0..06F9, U+07C0..07C9, U+0966..096F, U+09E6..09EF ...)
  \p{General_Category: Open_Punctuation}  (Short: \p{Gc=Ps}, \p{Ps}) (75: [\(\[\{], U+0F3A, U+0F3C, U+169B, U+201A, U+201E ...)
  \p{General_Category: Other}  (Short: \p{Gc=C}, \p{C}) (970_414 plus all above-Unicode code points: [\x00-\x1f\x7f-\x9f\xad], U+0378..0379, U+0380..0383, U+038B, U+038D, U+03A2 ...)
  \p{General_Category: Other_Letter}  (Short: \p{Gc=Lo}, \p{Lo}) (127_004: [\xaa\xba], U+01BB, U+01C0..01C3, U+0294, U+05D0..05EA, U+05EF..05F2 ...)
  \p{General_Category: Other_Number}  (Short: \p{Gc=No}, \p{No}) (895: [\xb2-\xb3\xb9\xbc-\xbe], U+09F4..09F9, U+0B72..0B77, U+0BF0..0BF2, U+0C78..0C7E, U+0D58..0D5E ...)
  \p{General_Category: Other_Punctuation}  (Short: \p{Gc=Po}, \p{Po}) (593: [!\"#\%&\'*,.\/:;?\@\\\xa1\xa7\xb6-\xb7\xbf], U+037E, U+0387, U+055A..055F, U+0589, U+05C0 ...)
  \p{General_Category: Other_Symbol}  (Short: \p{Gc=So}, \p{So}) (6431: [\xa6\xa9\xae\xb0], U+0482, U+058D..058E, U+060E..060F, U+06DE, U+06E9 ...)
  \p{General_Category: P}  \p{General_Category=Punctuation} (798)
  \p{General_Category: Paragraph_Separator}  (Short: \p{Gc=Zp}, \p{Zp}) (1: U+2029)
  \p{General_Category: Pc}  \p{General_Category=Connector_Punctuation} (10)
  \p{General_Category: Pd}  \p{General_Category=Dash_Punctuation} (25)
  \p{General_Category: Pe}  \p{General_Category=Close_Punctuation} (73)
  \p{General_Category: Pf}  \p{General_Category=Final_Punctuation} (10)
  \p{General_Category: Pi}  \p{General_Category=Initial_Punctuation} (12)
  \p{General_Category: Po}  \p{General_Category=Other_Punctuation} (593)
  \p{General_Category: Private_Use}  (Short: \p{Gc=Co}, \p{Co}) (137_468: U+E000..F8FF, U+F0000..FFFFD, U+100000..10FFFD)
  \p{General_Category: Ps}  \p{General_Category=Open_Punctuation} (75)
  \p{General_Category: Punct}  \p{General_Category=Punctuation} (798)
  \p{General_Category: Punctuation}  (Short: \p{Gc=P}, \p{P}) (798: [!\"#\%&\'\(\)*,\-.\/:;?\@\[\\\]_\{\}\xa1\xa7\xab\xb6-\xb7\xbb\xbf], U+037E, U+0387, U+055A..055F, U+0589..058A, U+05BE ...)
  \p{General_Category: S}  \p{General_Category=Symbol} (7564)
  \p{General_Category: Sc}  \p{General_Category=Currency_Symbol} (62)
  \p{General_Category: Separator}  (Short: \p{Gc=Z}, \p{Z}) (19: [\x20\xa0], U+1680, U+2000..200A, U+2028..2029, U+202F, U+205F ...)
  \p{General_Category: Sk}  \p{General_Category=Modifier_Symbol} (123)
  \p{General_Category: Sm}  \p{General_Category=Math_Symbol} (948)
  \p{General_Category: So}  \p{General_Category=Other_Symbol} (6431)
  \p{General_Category: Space_Separator}  (Short: \p{Gc=Zs}, \p{Zs}) (17: [\x20\xa0], U+1680, U+2000..200A, U+202F, U+205F, U+3000)
  \p{General_Category: Spacing_Mark}  (Short: \p{Gc=Mc}, \p{Mc}) (443: U+0903, U+093B, U+093E..0940, U+0949..094C, U+094E..094F, U+0982..0983 ...)
  \p{General_Category: Surrogate}  (Short: \p{Gc=Cs}, \p{Cs}) (2048: U+D800..DFFF)
  \p{General_Category: Symbol}  (Short: \p{Gc=S}, \p{S}) (7564: [\$+<=>\^`\|~\xa2-\xa6\xa8-\xa9\xac\xae-\xb1\xb4\xb8\xd7\xf7], U+02C2..02C5, U+02D2..02DF, U+02E5..02EB, U+02ED, U+02EF..02FF ...)
  \p{General_Category: Titlecase_Letter}  (Short: \p{Gc=Lt}, \p{Lt}; /i= General_Category=Cased_Letter) (31: U+01C5, U+01C8, U+01CB, U+01F2, U+1F88..1F8F, U+1F98..1F9F ...)
  \p{General_Category: Unassigned}  (Short: \p{Gc=Cn}, \p{Cn}) (830_672 plus all above-Unicode code points: U+0378..0379, U+0380..0383, U+038B, U+038D, U+03A2, U+0530 ...)
  \p{General_Category: Uppercase_Letter}  (Short: \p{Gc=Lu}, \p{Lu}; /i= General_Category=Cased_Letter) (1791: [A-Z\xc0-\xd6\xd8-\xde], U+0100, U+0102, U+0104, U+0106, U+0108 ...)
  \p{General_Category: Z}  \p{General_Category=Separator} (19)
  \p{General_Category: Zl}  \p{General_Category=Line_Separator} (1)
  \p{General_Category: Zp}  \p{General_Category=Paragraph_Separator} (1)
  \p{General_Category: Zs}  \p{General_Category=Space_Separator} (17)
X \p{General_Punctuation}  \p{Block=General_Punctuation} (Short: \p{InPunctuation}) (112)
X \p{Geometric_Shapes}  \p{Block=Geometric_Shapes} (96)
X \p{Geometric_Shapes_Ext}  \p{Geometric_Shapes_Extended} (= \p{Block=Geometric_Shapes_Extended}) (128)
X \p{Geometric_Shapes_Extended}  \p{Block=Geometric_Shapes_Extended} (Short: \p{InGeometricShapesExt}) (128)
  \p{Geor}  \p{Georgian} (= \p{Script_Extensions=Georgian}) (NOT \p{Block=Georgian}) (174)
  \p{Georgian}  \p{Script_Extensions=Georgian} (Short: \p{Geor}; NOT \p{Block=Georgian}) (174)
X \p{Georgian_Ext}  \p{Georgian_Extended} (= \p{Block=Georgian_Extended}) (48)
X \p{Georgian_Extended}  \p{Block=Georgian_Extended} (Short: \p{InGeorgianExt}) (48)
X \p{Georgian_Sup}  \p{Georgian_Supplement} (= \p{Block=Georgian_Supplement}) (48)
X \p{Georgian_Supplement}  \p{Block=Georgian_Supplement} (Short: \p{InGeorgianSup}) (48)
  \p{Glag}  \p{Glagolitic} (= \p{Script_Extensions=Glagolitic}) (NOT \p{Block=Glagolitic}) (136)
  \p{Glagolitic}  \p{Script_Extensions=Glagolitic} (Short: \p{Glag}; NOT \p{Block=Glagolitic}) (136)
X \p{Glagolitic_Sup}  \p{Glagolitic_Supplement} (= \p{Block=Glagolitic_Supplement}) (48)
X \p{Glagolitic_Supplement}  \p{Block=Glagolitic_Supplement} (Short: \p{InGlagoliticSup}) (48)
  \p{Gong}  \p{Gunjala_Gondi} (= \p{Script_Extensions=Gunjala_Gondi}) (NOT \p{Block=Gunjala_Gondi}) (65)
  \p{Gonm}  \p{Masaram_Gondi} (= \p{Script_Extensions=Masaram_Gondi}) (NOT \p{Block=Masaram_Gondi}) (77)
  \p{Goth}  \p{Gothic} (= \p{Script_Extensions=Gothic}) (NOT \p{Block=Gothic}) (27)
  \p{Gothic}  \p{Script_Extensions=Gothic} (Short: \p{Goth}; NOT \p{Block=Gothic}) (27)
  \p{Gr_Base}  \p{Grapheme_Base} (= \p{Grapheme_Base=Y}) (141_814)
  \p{Gr_Base: *}  \p{Grapheme_Base: *}
  \p{Gr_Ext}  \p{Grapheme_Extend} (= \p{Grapheme_Extend=Y}) (1979)
  \p{Gr_Ext: *}  \p{Grapheme_Extend: *}
  \p{Gran}  \p{Grantha} (= \p{Script_Extensions=Grantha}) (NOT \p{Block=Grantha}) (116)
  \p{Grantha}  \p{Script_Extensions=Grantha} (Short: \p{Gran}; NOT \p{Block=Grantha}) (116)
  \p{Graph}  \p{XPosixGraph} (281_308)
  \p{Grapheme_Base}  \p{Grapheme_Base=Y} (Short: \p{GrBase}) (141_814)
  \p{Grapheme_Base: N*}  (Short: \p{GrBase=N}, \P{GrBase}) (972_298 plus all above-Unicode code points: [\x00-\x1f\x7f-\x9f\xad], U+0300..036F, U+0378..0379, U+0380..0383, U+038B, U+038D ...)
  \p{Grapheme_Base: Y*}  (Short: \p{GrBase=Y}, \p{GrBase}) (141_814: [\x20-\x7e\xa0-\xac\xae-\xff], U+0100..02FF, U+0370..0377, U+037A..037F, U+0384..038A, U+038C ...)
  \p{Grapheme_Cluster_Break: CN}  \p{Grapheme_Cluster_Break=Control} (3886)
  \p{Grapheme_Cluster_Break: Control}  (Short: \p{GCB=CN}) (3886: [^\n\r\x20-\x7e\xa0-\xac\xae-\xff], U+061C, U+180E, U+200B, U+200E..200F, U+2028..202E ...)
  \p{Grapheme_Cluster_Break: CR}  (Short: \p{GCB=CR}) (1: [\r])
  \p{Grapheme_Cluster_Break: E_Base}  (Short: \p{GCB=EB}) (0)
  \p{Grapheme_Cluster_Break: E_Base_GAZ}  (Short: \p{GCB=EBG}) (0)
  \p{Grapheme_Cluster_Break: E_Modifier}  (Short: \p{GCB=EM}) (0)
  \p{Grapheme_Cluster_Break: EB}  \p{Grapheme_Cluster_Break=E_Base} (0)
  \p{Grapheme_Cluster_Break: EBG}  \p{Grapheme_Cluster_Break=E_Base_GAZ} (0)
  \p{Grapheme_Cluster_Break: EM}  \p{Grapheme_Cluster_Break=E_Modifier} (0)
  \p{Grapheme_Cluster_Break: EX}  \p{Grapheme_Cluster_Break=Extend} (1984)
  \p{Grapheme_Cluster_Break: Extend}  (Short: \p{GCB=EX}) (1984: U+0300..036F, U+0483..0489, U+0591..05BD, U+05BF, U+05C1..05C2, U+05C4..05C5 ...)
  \p{Grapheme_Cluster_Break: GAZ}  \p{Grapheme_Cluster_Break=Glue_After_Zwj} (0)
  \p{Grapheme_Cluster_Break: Glue_After_Zwj}  (Short: \p{GCB=GAZ}) (0)
  \p{Grapheme_Cluster_Break: L}  (Short: \p{GCB=L}) (125: U+1100..115F, U+A960..A97C)
  \p{Grapheme_Cluster_Break: LF}  (Short: \p{GCB=LF}) (1: [\n])
  \p{Grapheme_Cluster_Break: LV}  (Short: \p{GCB=LV}) (399: U+AC00, U+AC1C, U+AC38, U+AC54, U+AC70, U+AC8C ...)
  \p{Grapheme_Cluster_Break: LVT}  (Short: \p{GCB=LVT}) (10_773: U+AC01..AC1B, U+AC1D..AC37, U+AC39..AC53, U+AC55..AC6F, U+AC71..AC8B, U+AC8D..ACA7 ...)
  \p{Grapheme_Cluster_Break: Other}  (Short: \p{GCB=XX}) (1_096_272 plus all above-Unicode code points: [\x20-\x7e\xa0-\xac\xae-\xff], U+0100..02FF, U+0370..0482, U+048A..0590, U+05BE, U+05C0 ...)
  \p{Grapheme_Cluster_Break: PP}  \p{Grapheme_Cluster_Break=Prepend} (24)
  \p{Grapheme_Cluster_Break: Prepend}  (Short: \p{GCB=PP}) (24: U+0600..0605, U+06DD, U+070F, U+08E2, U+0D4E, U+110BD ...)
\p{Grapheme_Cluster_Break: Regional_Indicator}  (Short: \p{GCB=RI}) (26: U+1F1E6..1F1FF)
\p{Grapheme_Cluster_Break: RI}  \p{Grapheme_Cluster_Break=Regional_Indicator} (26)
\p{Grapheme_Cluster_Break: SM}  \p{Grapheme_Cluster_Break=SpacingMark} (388)
\p{Grapheme_Cluster_Break: SpacingMark}  (Short: \p{GCB=SM}) (388: U+0903, U+093B, U+093E..0940, U+0949..094C, U+094E..094F, U+0982..0983 ...)
\p{Grapheme_Cluster_Break: T}  (Short: \p{GCB=T}) (137: U+11A8..11FF, U+D7CB..D7FB)
\p{Grapheme_Cluster_Break: V}  (Short: \p{GCB=V}) (95: U+1160..11A7, U+D7B0..D7C6)
\p{Grapheme_Cluster_Break: XX}  \p{Grapheme_Cluster_Break=Other} (1_096_272 plus all above-Unicode code points)
\p{Grapheme_Cluster_Break: ZWJ}  (Short: \p{GCB=ZWJ}) (1: U+200D)
\p{Grapheme_Extend}  \p{Grapheme_Extend=Y} (Short: \p{GrExt}) (1979)
\p{Grapheme_Extend: N*}  (Short: \p{GrExt=N}, \P{GrExt}) (1_112_133 plus all above-Unicode code points: U+0000..02FF, U+0370..0482, U+048A..0590, U+05BE, U+05C0, U+05C3 ...)
\p{Grapheme_Extend: Y*}  (Short: \p{GrExt=Y}, \p{GrExt}) (1979: U+0300..036F, U+0483..0489, U+0591..05BD, U+05BF, U+05C1..05C2, U+05C4..05C5 ...)
\p{Greek} \p{Script_Extensions=Greek} (Short: \p{Grek}; NOT \p{Greek_And_Coptic}) (522) X \p{Greek_And_Coptic} \p{Block=Greek_And_Coptic} (Short: \p{InGreek}) (144) X \p{Greek_Ext} \p{Greek_Extended} (= \p{Block= Greek_Extended}) (256) X \p{Greek_Extended} \p{Block=Greek_Extended} (Short: \p{InGreekExt}) (256) \p{Grek} \p{Greek} (= \p{Script_Extensions=Greek}) (NOT \p{Greek_And_Coptic}) (522) \p{Gujarati} \p{Script_Extensions=Gujarati} (Short: \p{Gujr}; NOT \p{Block=Gujarati}) (105) \p{Gujr} \p{Gujarati} (= \p{Script_Extensions= Gujarati}) (NOT \p{Block=Gujarati}) (105) \p{Gunjala_Gondi} \p{Script_Extensions=Gunjala_Gondi} (Short: \p{Gong}; NOT \p{Block= Gunjala_Gondi}) (65) \p{Gurmukhi} \p{Script_Extensions=Gurmukhi} (Short: \p{Guru}; NOT \p{Block=Gurmukhi}) (94) \p{Guru} \p{Gurmukhi} (= \p{Script_Extensions= Gurmukhi}) (NOT \p{Block=Gurmukhi}) (94) X \p{Half_And_Full_Forms} \p{Halfwidth_And_Fullwidth_Forms} (= \p{Block=Halfwidth_And_Fullwidth_Forms}) (240) X \p{Half_Marks} \p{Combining_Half_Marks} (= \p{Block= Combining_Half_Marks}) (16) X \p{Halfwidth_And_Fullwidth_Forms} \p{Block= Halfwidth_And_Fullwidth_Forms} (Short: \p{InHalfAndFullForms}) (240) \p{Han} \p{Script_Extensions=Han} (94_492) \p{Hang} \p{Hangul} (= \p{Script_Extensions= Hangul}) (NOT \p{Hangul_Syllables}) (11_775) \p{Hangul} \p{Script_Extensions=Hangul} (Short: \p{Hang}; NOT \p{Hangul_Syllables}) (11_775) X \p{Hangul_Compatibility_Jamo} \p{Block=Hangul_Compatibility_Jamo} (Short: \p{InCompatJamo}) (96) X \p{Hangul_Jamo} \p{Block=Hangul_Jamo} (Short: \p{InJamo}) (256) X \p{Hangul_Jamo_Extended_A} \p{Block=Hangul_Jamo_Extended_A} (Short: \p{InJamoExtA}) (32) X \p{Hangul_Jamo_Extended_B} \p{Block=Hangul_Jamo_Extended_B} (Short: \p{InJamoExtB}) (80) \p{Hangul_Syllable_Type: L} \p{Hangul_Syllable_Type=Leading_Jamo} (125) \p{Hangul_Syllable_Type: Leading_Jamo} (Short: \p{Hst=L}) (125: U+1100..115F, U+A960..A97C) \p{Hangul_Syllable_Type: LV} \p{Hangul_Syllable_Type=LV_Syllable} (399) 
\p{Hangul_Syllable_Type: LV_Syllable} (Short: \p{Hst=LV}) (399: U+AC00, U+AC1C, U+AC38, U+AC54, U+AC70, U+AC8C ...) \p{Hangul_Syllable_Type: LVT} \p{Hangul_Syllable_Type= LVT_Syllable} (10_773) \p{Hangul_Syllable_Type: LVT_Syllable} (Short: \p{Hst=LVT}) (10_773: U+AC01..AC1B, U+AC1D..AC37, U+AC39..AC53, U+AC55..AC6F, U+AC71..AC8B, U+AC8D..ACA7 ...) \p{Hangul_Syllable_Type: NA} \p{Hangul_Syllable_Type= Not_Applicable} (1_102_583 plus all above-Unicode code points) \p{Hangul_Syllable_Type: Not_Applicable} (Short: \p{Hst=NA}) (1_102_583 plus all above-Unicode code points: U+0000..10FF, U+1200..A95F, U+A97D..ABFF, U+D7A4..D7AF, U+D7C7..D7CA, U+D7FC..infinity) \p{Hangul_Syllable_Type: T} \p{Hangul_Syllable_Type=Trailing_Jamo} (137) \p{Hangul_Syllable_Type: Trailing_Jamo} (Short: \p{Hst=T}) (137: U+11A8..11FF, U+D7CB..D7FB) \p{Hangul_Syllable_Type: V} \p{Hangul_Syllable_Type=Vowel_Jamo} (95) \p{Hangul_Syllable_Type: Vowel_Jamo} (Short: \p{Hst=V}) (95: U+1160..11A7, U+D7B0..D7C6) X \p{Hangul_Syllables} \p{Block=Hangul_Syllables} (Short: \p{InHangul}) (11_184) \p{Hani} \p{Han} (= \p{Script_Extensions=Han}) (94_492) \p{Hanifi_Rohingya} \p{Script_Extensions=Hanifi_Rohingya} (Short: \p{Rohg}; NOT \p{Block= Hanifi_Rohingya}) (55) \p{Hano} \p{Hanunoo} (= \p{Script_Extensions= Hanunoo}) (NOT \p{Block=Hanunoo}) (23) \p{Hanunoo} \p{Script_Extensions=Hanunoo} (Short: \p{Hano}; NOT \p{Block=Hanunoo}) (23) \p{Hatr} \p{Hatran} (= \p{Script_Extensions= Hatran}) (NOT \p{Block=Hatran}) (26) \p{Hatran} \p{Script_Extensions=Hatran} (Short: \p{Hatr}; NOT \p{Block=Hatran}) (26) \p{Hebr} \p{Hebrew} (= \p{Script_Extensions= Hebrew}) (NOT \p{Block=Hebrew}) (134) \p{Hebrew} \p{Script_Extensions=Hebrew} (Short: \p{Hebr}; NOT \p{Block=Hebrew}) (134) \p{Hex} \p{XPosixXDigit} (= \p{Hex_Digit=Y}) (44) \p{Hex: *} \p{Hex_Digit: *} \p{Hex_Digit} \p{XPosixXDigit} (= \p{Hex_Digit=Y}) (44) \p{Hex_Digit: N*} (Short: \p{Hex=N}, \P{Hex}) (1_114_068 plus all above-Unicode code points: 
[\x00-\x20!\"#\$\%&\'\(\)*+,\-.\/:;<=>? \@G-Z\[\\\]\^_`g-z\{\|\}~\x7f-\xff], U+0100..FF0F, U+FF1A..FF20, U+FF27..FF40, U+FF47..infinity) \p{Hex_Digit: Y*} (Short: \p{Hex=Y}, \p{Hex}) (44: [0-9A-Fa- f], U+FF10..FF19, U+FF21..FF26, U+FF41..FF46) X \p{High_Private_Use_Surrogates} \p{Block= High_Private_Use_Surrogates} (Short: \p{InHighPUSurrogates}) (128) X \p{High_PU_Surrogates} \p{High_Private_Use_Surrogates} (= \p{Block=High_Private_Use_Surrogates}) (128) X \p{High_Surrogates} \p{Block=High_Surrogates} (896) \p{Hira} \p{Hiragana} (= \p{Script_Extensions= Hiragana}) (NOT \p{Block=Hiragana}) (431) \p{Hiragana} \p{Script_Extensions=Hiragana} (Short: \p{Hira}; NOT \p{Block=Hiragana}) (431) \p{Hluw} \p{Anatolian_Hieroglyphs} (= \p{Script_Extensions= Anatolian_Hieroglyphs}) (NOT \p{Block= Anatolian_Hieroglyphs}) (583) \p{Hmng} \p{Pahawh_Hmong} (= \p{Script_Extensions= Pahawh_Hmong}) (NOT \p{Block= Pahawh_Hmong}) (127) \p{Hmnp} \p{Nyiakeng_Puachue_Hmong} (= \p{Script_Extensions= Nyiakeng_Puachue_Hmong}) (NOT \p{Block= Nyiakeng_Puachue_Hmong}) (71) \p{HorizSpace} \p{XPosixBlank} (18) \p{Hst: *} \p{Hangul_Syllable_Type: *} \p{Hung} \p{Old_Hungarian} (= \p{Script_Extensions= Old_Hungarian}) (NOT \p{Block= Old_Hungarian}) (108) D \p{Hyphen} \p{Hyphen=Y} (11) D \p{Hyphen: N*} Supplanted by Line_Break property values; see www.unicode.org/reports/tr14 (Single: \P{Hyphen}) (1_114_101 plus all above-Unicode code points: [\x00-\x20! \"#\$\%&\'\(\)*+,.\/0-9:;<=>?\@A-Z \[\\\]\^_`a-z\{\|\}~\x7f-\xac\xae-\xff], U+0100..0589, U+058B..1805, U+1807..200F, U+2012..2E16, U+2E18..30FA ...) D \p{Hyphen: Y*} Supplanted by Line_Break property values; see www.unicode.org/reports/tr14 (Single: \p{Hyphen}) (11: [\-\xad], U+058A, U+1806, U+2010..2011, U+2E17, U+30FB ...) 
\p{ID_Continue} \p{ID_Continue=Y} (Short: \p{IDC}; NOT \p{Ideographic_Description_Characters}) (134_434) \p{ID_Continue: N*} (Short: \p{IDC=N}, \P{IDC}) (979_678 plus all above-Unicode code points: [\x00- \x20!\"#\$\%&\'\(\)*+,\-.\/:;<=>?\@ \[\\\]\^`\{\|\}~\x7f-\xa9\xab-\xb4\xb6 \xb8-\xb9\xbb-\xbf\xd7\xf7], U+02C2..02C5, U+02D2..02DF, U+02E5..02EB, U+02ED, U+02EF..02FF ...) \p{ID_Continue: Y*} (Short: \p{IDC=Y}, \p{IDC}) (134_434: [0-9A-Z_a-z\xaa\xb5\xb7\xba\xc0-\xd6 \xd8-\xf6\xf8-\xff], U+0100..02C1, U+02C6..02D1, U+02E0..02E4, U+02EC, U+02EE ...) \p{ID_Start} \p{ID_Start=Y} (Short: \p{IDS}) (131_482) \p{ID_Start: N*} (Short: \p{IDS=N}, \P{IDS}) (982_630 plus all above-Unicode code points: [\x00- \x20!\"#\$\%&\'\(\)*+,\-.\/0-9:;<=>?\@ \[\\\]\^_`\{\|\}~\x7f-\xa9\xab-\xb4\xb6- \xb9\xbb-\xbf\xd7\xf7], U+02C2..02C5, U+02D2..02DF, U+02E5..02EB, U+02ED, U+02EF..036F ...) \p{ID_Start: Y*} (Short: \p{IDS=Y}, \p{IDS}) (131_482: [A- Za-z\xaa\xb5\xba\xc0-\xd6\xd8-\xf6\xf8- \xff], U+0100..02C1, U+02C6..02D1, U+02E0..02E4, U+02EC, U+02EE ...) \p{IDC} \p{ID_Continue} (= \p{ID_Continue=Y}) (NOT \p{Ideographic_Description_Characters}) (134_434) \p{IDC: *} \p{ID_Continue: *} \p{Identifier_Status: Allowed} (107_835: [\'\-.0-9:A-Z_a-z\xb7 \xc0-\xd6\xd8-\xf6\xf8-\xff], U+0100..0131, U+0134..013E, U+0141..0148, U+014A..017E, U+018F ...) \p{Identifier_Status: Restricted} (1_006_277 plus all above- Unicode code points: [\x00-\x20!\"#\$ \%&\(\)*+,\/;<=>?\@\[\\\]\^`\{\|\}~\x7f- \xb6\xb8-\xbf\xd7\xf7], U+0132..0133, U+013F..0140, U+0149, U+017F..018E, U+0190..019F ...) \p{Identifier_Type: Default_Ignorable} (395: [\xad], U+034F, U+061C, U+115F..1160, U+17B4..17B5, U+180B..180E ...) \p{Identifier_Type: Deprecated} (15: U+0149, U+0673, U+0F77, U+0F79, U+17A3..17A4, U+206A..206F ...) \p{Identifier_Type: Exclusion} (16_745: U+03E2..03EF, U+0800..082D, U+0830..083E, U+1680..169C, U+16A0..16EA, U+16EE..16F8 ...) 
\p{Identifier_Type: Inclusion} (19: [\'\-.:\xb7], U+0375, U+058A, U+05F3..05F4, U+06FD..06FE, U+0F0B ...) \p{Identifier_Type: Limited_Use} (5248: U+0700..070D, U+070F..074A, U+074D..074F, U+07C0..07FA, U+07FD..07FF, U+0840..085B ...) \p{Identifier_Type: Not_Character} (970_247 plus all above-Unicode code points: [^\t\n\cK\f\r\x20-\x7e\x85 \xa0-\xff], U+0378..0379, U+0380..0383, U+038B, U+038D, U+03A2 ...) \p{Identifier_Type: Not_NFKC} (4800: [\xa0\xa8\xaa\xaf\xb2-\xb5 \xb8-\xba\xbc-\xbe], U+0132..0133, U+013F..0140, U+017F, U+01C4..01CC, U+01F1..01F3 ...) \p{Identifier_Type: Not_XID} (7998: [\t\n\cK\f\r\x20!\"#\$\%& \(\)*+,\/;<=>?\@\[\\\]\^`\{\|\}~\x85 \xa1-\xa7\xa9\xab-\xac\xae\xb0-\xb1\xb6 \xbb\xbf\xd7\xf7], U+02C2..02C5, U+02D2..02D7, U+02DE..02DF, U+02E5..02EB, U+02ED ...) \p{Identifier_Type: Obsolete} (1611: U+018D, U+01AA..01AB, U+01B9..01BB, U+01BE..01BF, U+01F6..01F7, U+021C..021D ...) \p{Identifier_Type: Recommended} (107_816: [0-9A-Z_a-z\xc0-\xd6 \xd8-\xf6\xf8-\xff], U+0100..0131, U+0134..013E, U+0141..0148, U+014A..017E, U+018F ...) \p{Identifier_Type: Technical} (1463: U+0180, U+018D, U+01AA..01AB, U+01BA..01BB, U+01BE, U+01C0..01C3 ...) \p{Identifier_Type: Uncommon_Use} (348: U+0181..018C, U+018E, U+0190..019F, U+01A2..01A9, U+01AC..01AE, U+01B1..01B8 ...) \p{Ideo} \p{Ideographic} (= \p{Ideographic=Y}) (101_652) \p{Ideo: *} \p{Ideographic: *} \p{Ideographic} \p{Ideographic=Y} (Short: \p{Ideo}) (101_652) \p{Ideographic: N*} (Short: \p{Ideo=N}, \P{Ideo}) (1_012_460 plus all above-Unicode code points: U+0000..3005, U+3008..3020, U+302A..3037, U+303B..33FF, U+4DC0..4DFF, U+9FFD..F8FF ...) \p{Ideographic: Y*} (Short: \p{Ideo=Y}, \p{Ideo}) (101_652: U+3006..3007, U+3021..3029, U+3038..303A, U+3400..4DBF, U+4E00..9FFC, U+F900..FA6D ...) 
X \p{Ideographic_Description_Characters} \p{Block= Ideographic_Description_Characters} (Short: \p{InIDC}) (16) X \p{Ideographic_Symbols} \p{Ideographic_Symbols_And_Punctuation} (= \p{Block= Ideographic_Symbols_And_Punctuation}) (32) X \p{Ideographic_Symbols_And_Punctuation} \p{Block= Ideographic_Symbols_And_Punctuation} (Short: \p{InIdeographicSymbols}) (32) \p{IDS} \p{ID_Start} (= \p{ID_Start=Y}) (131_482) \p{IDS: *} \p{ID_Start: *} \p{IDS_Binary_Operator} \p{IDS_Binary_Operator=Y} (Short: \p{IDSB}) (10) \p{IDS_Binary_Operator: N*} (Short: \p{IDSB=N}, \P{IDSB}) (1_114_102 plus all above-Unicode code points: U+0000..2FEF, U+2FF2..2FF3, U+2FFC..infinity) \p{IDS_Binary_Operator: Y*} (Short: \p{IDSB=Y}, \p{IDSB}) (10: U+2FF0..2FF1, U+2FF4..2FFB) \p{IDS_Trinary_Operator} \p{IDS_Trinary_Operator=Y} (Short: \p{IDST}) (2) \p{IDS_Trinary_Operator: N*} (Short: \p{IDST=N}, \P{IDST}) (1_114_110 plus all above-Unicode code points: U+0000..2FF1, U+2FF4..infinity) \p{IDS_Trinary_Operator: Y*} (Short: \p{IDST=Y}, \p{IDST}) (2: U+2FF2..2FF3) \p{IDSB} \p{IDS_Binary_Operator} (= \p{IDS_Binary_Operator=Y}) (10) \p{IDSB: *} \p{IDS_Binary_Operator: *} \p{IDST} \p{IDS_Trinary_Operator} (= \p{IDS_Trinary_Operator=Y}) (2) \p{IDST: *} \p{IDS_Trinary_Operator: *} \p{Imperial_Aramaic} \p{Script_Extensions=Imperial_Aramaic} (Short: \p{Armi}; NOT \p{Block= Imperial_Aramaic}) (31) \p{In: *} \p{Present_In: *} (Perl extension) X \p{In_*} \p{Block: *} X \p{Indic_Number_Forms} \p{Common_Indic_Number_Forms} (= \p{Block= Common_Indic_Number_Forms}) (16) \p{Indic_Positional_Category: Bottom} (Short: \p{InPC=Bottom}) (351: U+093C, U+0941..0944, U+094D, U+0952, U+0956..0957, U+0962..0963 ...) 
\p{Indic_Positional_Category: Bottom_And_Left}  (Short: \p{InPC=BottomAndLeft}) (1: U+A9BF)
\p{Indic_Positional_Category: Bottom_And_Right}  (Short: \p{InPC=BottomAndRight}) (4: U+1B3B, U+A9BE, U+A9C0, U+11942)
\p{Indic_Positional_Category: Left}  (Short: \p{InPC=Left}) (64: U+093F, U+094E, U+09BF, U+09C7..09C8, U+0A3F, U+0ABF ...)
\p{Indic_Positional_Category: Left_And_Right}  (Short: \p{InPC=LeftAndRight}) (22: U+09CB..09CC, U+0B4B, U+0BCA..0BCC, U+0D4A..0D4C, U+0DDC, U+0DDE ...)
\p{Indic_Positional_Category: NA}  (Short: \p{InPC=NA}) (1_112_902 plus all above-Unicode code points: U+0000..08FF, U+0904..0939, U+093D, U+0950, U+0958..0961, U+0964..0980 ...)
\p{Indic_Positional_Category: Overstruck}  (Short: \p{InPC=Overstruck}) (10: U+1CD4, U+1CE2..1CE8, U+10A01, U+10A06)
\p{Indic_Positional_Category: Right}  (Short: \p{InPC=Right}) (288: U+0903, U+093B, U+093E, U+0940, U+0949..094C, U+094F ...)
\p{Indic_Positional_Category: Top}  (Short: \p{InPC=Top}) (415: U+0900..0902, U+093A, U+0945..0948, U+0951, U+0953..0955, U+0981 ...)
\p{Indic_Positional_Category: Top_And_Bottom}  (Short: \p{InPC=TopAndBottom}) (10: U+0C48, U+0F73, U+0F76..0F79, U+0F81, U+1B3C, U+1112E..1112F)
\p{Indic_Positional_Category: Top_And_Bottom_And_Left}  (Short: \p{InPC=TopAndBottomAndLeft}) (2: U+103C, U+1171E)
\p{Indic_Positional_Category: Top_And_Bottom_And_Right}  (Short: \p{InPC=TopAndBottomAndRight}) (1: U+1B3D)
\p{Indic_Positional_Category: Top_And_Left}  (Short: \p{InPC=TopAndLeft}) (6: U+0B48, U+0DDA, U+17BE, U+1C29, U+114BB, U+115B9)
\p{Indic_Positional_Category: Top_And_Left_And_Right}  (Short: \p{InPC=TopAndLeftAndRight}) (4: U+0B4C, U+0DDD, U+17BF, U+115BB)
\p{Indic_Positional_Category: Top_And_Right}  (Short: \p{InPC=TopAndRight}) (13: U+0AC9, U+0B57, U+0CC0, U+0CC7..0CC8, U+0CCA..0CCB, U+1925..1926 ...)
\p{Indic_Positional_Category: Visual_Order_Left}  (Short: \p{InPC=VisualOrderLeft}) (19: U+0E40..0E44, U+0EC0..0EC4, U+19B5..19B7, U+19BA, U+AAB5..AAB6, U+AAB9 ...)
X \p{Indic_Siyaq_Numbers}  \p{Block=Indic_Siyaq_Numbers} (80)
\p{Indic_Syllabic_Category: Avagraha}  (Short: \p{InSC=Avagraha}) (17: U+093D, U+09BD, U+0ABD, U+0B3D, U+0C3D, U+0CBD ...)
\p{Indic_Syllabic_Category: Bindu}  (Short: \p{InSC=Bindu}) (91: U+0900..0902, U+0981..0982, U+09FC, U+0A01..0A02, U+0A70, U+0A81..0A82 ...)
\p{Indic_Syllabic_Category: Brahmi_Joining_Number}  (Short: \p{InSC=BrahmiJoiningNumber}) (20: U+11052..11065)
\p{Indic_Syllabic_Category: Cantillation_Mark}  (Short: \p{InSC=CantillationMark}) (59: U+0951..0952, U+0A51, U+0AFA..0AFC, U+1CD0..1CD2, U+1CD4..1CE1, U+1CF4 ...)
\p{Indic_Syllabic_Category: Consonant}  (Short: \p{InSC=Consonant}) (2195: U+0915..0939, U+0958..095F, U+0978..097F, U+0995..09A8, U+09AA..09B0, U+09B2 ...)
\p{Indic_Syllabic_Category: Consonant_Dead}  (Short: \p{InSC=ConsonantDead}) (12: U+09CE, U+0D54..0D56, U+0D7A..0D7F, U+1CF2..1CF3)
\p{Indic_Syllabic_Category: Consonant_Final}  (Short: \p{InSC=ConsonantFinal}) (67: U+1930..1931, U+1933..1939, U+19C1..19C7, U+1A58..1A59, U+1BBE..1BBF, U+1BF0..1BF1 ...)
\p{Indic_Syllabic_Category: Consonant_Head_Letter}  (Short: \p{InSC=ConsonantHeadLetter}) (5: U+0F88..0F8C)
\p{Indic_Syllabic_Category: Consonant_Initial_Postfixed}  (Short: \p{InSC=ConsonantInitialPostfixed}) (1: U+1A5A)
\p{Indic_Syllabic_Category: Consonant_Killer}  (Short: \p{InSC=ConsonantKiller}) (2: U+0E4C, U+17CD)
\p{Indic_Syllabic_Category: Consonant_Medial}  (Short: \p{InSC=ConsonantMedial}) (31: U+0A75, U+0EBC..0EBD, U+103B..103E, U+105E..1060, U+1082, U+1A55..1A56 ...)
\p{Indic_Syllabic_Category: Consonant_Placeholder}  (Short: \p{InSC=ConsonantPlaceholder}) (22: [\-\xa0\xd7], U+0980, U+0A72..0A73, U+104B, U+104E, U+1900 ...)
\p{Indic_Syllabic_Category: Consonant_Preceding_Repha} (Short: \p{InSC=ConsonantPrecedingRepha}) (3: U+0D4E, U+11941, U+11D46) \p{Indic_Syllabic_Category: Consonant_Prefixed} (Short: \p{InSC= ConsonantPrefixed}) (10: U+111C2..111C3, U+1193F, U+11A3A, U+11A84..11A89) \p{Indic_Syllabic_Category: Consonant_Subjoined} (Short: \p{InSC= ConsonantSubjoined}) (94: U+0F8D..0F97, U+0F99..0FBC, U+1929..192B, U+1A57, U+1A5B..1A5E, U+1BA1..1BA3 ...) \p{Indic_Syllabic_Category: Consonant_Succeeding_Repha} (Short: \p{InSC=ConsonantSucceedingRepha}) (4: U+17CC, U+1B03, U+1B81, U+A982) \p{Indic_Syllabic_Category: Consonant_With_Stacker} (Short: \p{InSC=ConsonantWithStacker}) (8: U+0CF1..0CF2, U+1CF5..1CF6, U+11003..11004, U+11460..11461) \p{Indic_Syllabic_Category: Gemination_Mark} (Short: \p{InSC= GeminationMark}) (3: U+0A71, U+11237, U+11A98) \p{Indic_Syllabic_Category: Invisible_Stacker} (Short: \p{InSC= InvisibleStacker}) (12: U+1039, U+17D2, U+1A60, U+1BAB, U+AAF6, U+10A3F ...) \p{Indic_Syllabic_Category: Joiner} (Short: \p{InSC=Joiner}) (1: U+200D) \p{Indic_Syllabic_Category: Modifying_Letter} (Short: \p{InSC= ModifyingLetter}) (1: U+0B83) \p{Indic_Syllabic_Category: Non_Joiner} (Short: \p{InSC= NonJoiner}) (1: U+200C) \p{Indic_Syllabic_Category: Nukta} (Short: \p{InSC=Nukta}) (31: U+093C, U+09BC, U+0A3C, U+0ABC, U+0AFD..0AFF, U+0B3C ...) \p{Indic_Syllabic_Category: Number} (Short: \p{InSC=Number}) (491: [0-9], U+0966..096F, U+09E6..09EF, U+0A66..0A6F, U+0AE6..0AEF, U+0B66..0B6F ...) \p{Indic_Syllabic_Category: Number_Joiner} (Short: \p{InSC= NumberJoiner}) (1: U+1107F) \p{Indic_Syllabic_Category: Other} (Short: \p{InSC=Other}) (1_109_572 plus all above-Unicode code points: [\x00-\x20!\"#\$\%&\'\(\)*+,. \/:;<=>?\@A-Z\[\\\]\^_`a-z\{\|\}~\x7f- \x9f\xa1-\xb1\xb4-\xd6\xd8-\xff], U+0100..08FF, U+0950, U+0953..0954, U+0964..0965, U+0970..0971 ...) \p{Indic_Syllabic_Category: Pure_Killer} (Short: \p{InSC= PureKiller}) (23: U+0D3B..0D3C, U+0E3A, U+0E4E, U+0EBA, U+0F84, U+103A ...) 
\p{Indic_Syllabic_Category: Register_Shifter} (Short: \p{InSC= RegisterShifter}) (2: U+17C9..17CA) \p{Indic_Syllabic_Category: Syllable_Modifier} (Short: \p{InSC= SyllableModifier}) (25: [\xb2-\xb3], U+09FE, U+0F35, U+0F37, U+0FC6, U+17CB ...) \p{Indic_Syllabic_Category: Tone_Letter} (Short: \p{InSC= ToneLetter}) (7: U+1970..1974, U+AAC0, U+AAC2) \p{Indic_Syllabic_Category: Tone_Mark} (Short: \p{InSC=ToneMark}) (42: U+0E48..0E4B, U+0EC8..0ECB, U+1037, U+1063..1064, U+1069..106D, U+1087..108D ...) \p{Indic_Syllabic_Category: Virama} (Short: \p{InSC=Virama}) (27: U+094D, U+09CD, U+0A4D, U+0ACD, U+0B4D, U+0BCD ...) \p{Indic_Syllabic_Category: Visarga} (Short: \p{InSC=Visarga}) (35: U+0903, U+0983, U+0A03, U+0A83, U+0B03, U+0C03 ...) \p{Indic_Syllabic_Category: Vowel} (Short: \p{InSC=Vowel}) (30: U+1963..196D, U+A85E..A861, U+A866, U+A922..A92A, U+11150..11154) \p{Indic_Syllabic_Category: Vowel_Dependent} (Short: \p{InSC= VowelDependent}) (683: U+093A..093B, U+093E..094C, U+094E..094F, U+0955..0957, U+0962..0963, U+09BE..09C4 ...) \p{Indic_Syllabic_Category: Vowel_Independent} (Short: \p{InSC= VowelIndependent}) (484: U+0904..0914, U+0960..0961, U+0972..0977, U+0985..098C, U+098F..0990, U+0993..0994 ...) \p{Inherited} \p{Script_Extensions=Inherited} (Short: \p{Zinh}) (503) \p{Initial_Punctuation} \p{General_Category=Initial_Punctuation} (Short: \p{Pi}) (12) \p{InPC: *} \p{Indic_Positional_Category: *} \p{InSC: *} \p{Indic_Syllabic_Category: *} \p{Inscriptional_Pahlavi} \p{Script_Extensions= Inscriptional_Pahlavi} (Short: \p{Phli}; NOT \p{Block=Inscriptional_Pahlavi}) (27) \p{Inscriptional_Parthian} \p{Script_Extensions= Inscriptional_Parthian} (Short: \p{Prti}; NOT \p{Block= Inscriptional_Parthian}) (30) X \p{IPA_Ext} \p{IPA_Extensions} (= \p{Block= IPA_Extensions}) (96) X \p{IPA_Extensions} \p{Block=IPA_Extensions} (Short: \p{InIPAExt}) (96) \p{Is_*} \p{*} (Any exceptions are individually noted beginning with the word NOT.) 
If an entry has flag(s) at its beginning, like "D", the "Is_" form has the same flag(s) \p{Ital} \p{Old_Italic} (= \p{Script_Extensions= Old_Italic}) (NOT \p{Block=Old_Italic}) (39) X \p{Jamo} \p{Hangul_Jamo} (= \p{Block=Hangul_Jamo}) (256) X \p{Jamo_Ext_A} \p{Hangul_Jamo_Extended_A} (= \p{Block= Hangul_Jamo_Extended_A}) (32) X \p{Jamo_Ext_B} \p{Hangul_Jamo_Extended_B} (= \p{Block= Hangul_Jamo_Extended_B}) (80) \p{Java} \p{Javanese} (= \p{Script_Extensions= Javanese}) (NOT \p{Block=Javanese}) (91) \p{Javanese} \p{Script_Extensions=Javanese} (Short: \p{Java}; NOT \p{Block=Javanese}) (91) \p{Jg: *} \p{Joining_Group: *} \p{Join_C} \p{Join_Control} (= \p{Join_Control=Y}) (2) \p{Join_C: *} \p{Join_Control: *} \p{Join_Control} \p{Join_Control=Y} (Short: \p{JoinC}) (2) \p{Join_Control: N*} (Short: \p{JoinC=N}, \P{JoinC}) (1_114_110 plus all above-Unicode code points: U+0000..200B, U+200E..infinity) \p{Join_Control: Y*} (Short: \p{JoinC=Y}, \p{JoinC}) (2: U+200C..200D) \p{Joining_Group: African_Feh} (Short: \p{Jg=AfricanFeh}) (1: U+08BB) \p{Joining_Group: African_Noon} (Short: \p{Jg=AfricanNoon}) (1: U+08BD) \p{Joining_Group: African_Qaf} (Short: \p{Jg=AfricanQaf}) (2: U+08BC, U+08C4) \p{Joining_Group: Ain} (Short: \p{Jg=Ain}) (9: U+0639..063A, U+06A0, U+06FC, U+075D..075F, U+08B3, U+08C3) \p{Joining_Group: Alaph} (Short: \p{Jg=Alaph}) (1: U+0710) \p{Joining_Group: Alef} (Short: \p{Jg=Alef}) (10: U+0622..0623, U+0625, U+0627, U+0671..0673, U+0675, U+0773..0774) \p{Joining_Group: Beh} (Short: \p{Jg=Beh}) (27: U+0628, U+062A..062B, U+066E, U+0679..0680, U+0750..0756, U+08A0..08A1 ...) 
\p{Joining_Group: Beth}  (Short: \p{Jg=Beth}) (2: U+0712, U+072D)
\p{Joining_Group: Burushaski_Yeh_Barree}  (Short: \p{Jg=BurushaskiYehBarree}) (2: U+077A..077B)
\p{Joining_Group: Dal}  (Short: \p{Jg=Dal}) (15: U+062F..0630, U+0688..0690, U+06EE, U+0759..075A, U+08AE)
\p{Joining_Group: Dalath_Rish}  (Short: \p{Jg=DalathRish}) (4: U+0715..0716, U+072A, U+072F)
\p{Joining_Group: E}  (Short: \p{Jg=E}) (1: U+0725)
\p{Joining_Group: Farsi_Yeh}  (Short: \p{Jg=FarsiYeh}) (7: U+063D..063F, U+06CC, U+06CE, U+0775..0776)
\p{Joining_Group: Fe}  (Short: \p{Jg=Fe}) (1: U+074F)
\p{Joining_Group: Feh}  (Short: \p{Jg=Feh}) (10: U+0641, U+06A1..06A6, U+0760..0761, U+08A4)
\p{Joining_Group: Final_Semkath}  (Short: \p{Jg=FinalSemkath}) (1: U+0724)
\p{Joining_Group: Gaf}  (Short: \p{Jg=Gaf}) (15: U+063B..063C, U+06A9, U+06AB, U+06AF..06B4, U+0762..0764, U+08B0 ...)
\p{Joining_Group: Gamal}  (Short: \p{Jg=Gamal}) (3: U+0713..0714, U+072E)
\p{Joining_Group: Hah}  (Short: \p{Jg=Hah}) (21: U+062C..062E, U+0681..0687, U+06BF, U+0757..0758, U+076E..076F, U+0772 ...)
\p{Joining_Group: Hamza_On_Heh_Goal} (Short: \p{Jg= HamzaOnHehGoal}) (1: U+06C3) \p{Joining_Group: Hanifi_Rohingya_Kinna_Ya} (Short: \p{Jg= HanifiRohingyaKinnaYa}) (4: U+10D19, U+10D1E, U+10D20, U+10D23) \p{Joining_Group: Hanifi_Rohingya_Pa} (Short: \p{Jg= HanifiRohingyaPa}) (3: U+10D02, U+10D09, U+10D1C) \p{Joining_Group: He} (Short: \p{Jg=He}) (1: U+0717) \p{Joining_Group: Heh} (Short: \p{Jg=Heh}) (1: U+0647) \p{Joining_Group: Heh_Goal} (Short: \p{Jg=HehGoal}) (2: U+06C1..06C2) \p{Joining_Group: Heth} (Short: \p{Jg=Heth}) (1: U+071A) \p{Joining_Group: Kaf} (Short: \p{Jg=Kaf}) (6: U+0643, U+06AC..06AE, U+077F, U+08B4) \p{Joining_Group: Kaph} (Short: \p{Jg=Kaph}) (1: U+071F) \p{Joining_Group: Khaph} (Short: \p{Jg=Khaph}) (1: U+074E) \p{Joining_Group: Knotted_Heh} (Short: \p{Jg=KnottedHeh}) (2: U+06BE, U+06FF) \p{Joining_Group: Lam} (Short: \p{Jg=Lam}) (8: U+0644, U+06B5..06B8, U+076A, U+08A6, U+08C7) \p{Joining_Group: Lamadh} (Short: \p{Jg=Lamadh}) (1: U+0720) \p{Joining_Group: Malayalam_Bha} (Short: \p{Jg=MalayalamBha}) (1: U+0866) \p{Joining_Group: Malayalam_Ja} (Short: \p{Jg=MalayalamJa}) (1: U+0861) \p{Joining_Group: Malayalam_Lla} (Short: \p{Jg=MalayalamLla}) (1: U+0868) \p{Joining_Group: Malayalam_Llla} (Short: \p{Jg=MalayalamLlla}) (1: U+0869) \p{Joining_Group: Malayalam_Nga} (Short: \p{Jg=MalayalamNga}) (1: U+0860) \p{Joining_Group: Malayalam_Nna} (Short: \p{Jg=MalayalamNna}) (1: U+0864) \p{Joining_Group: Malayalam_Nnna} (Short: \p{Jg=MalayalamNnna}) (1: U+0865) \p{Joining_Group: Malayalam_Nya} (Short: \p{Jg=MalayalamNya}) (1: U+0862) \p{Joining_Group: Malayalam_Ra} (Short: \p{Jg=MalayalamRa}) (1: U+0867) \p{Joining_Group: Malayalam_Ssa} (Short: \p{Jg=MalayalamSsa}) (1: U+086A) \p{Joining_Group: Malayalam_Tta} (Short: \p{Jg=MalayalamTta}) (1: U+0863) \p{Joining_Group: Manichaean_Aleph} (Short: \p{Jg= ManichaeanAleph}) (1: U+10AC0) \p{Joining_Group: Manichaean_Ayin} (Short: \p{Jg=ManichaeanAyin}) (2: U+10AD9..10ADA) \p{Joining_Group: Manichaean_Beth} (Short: 
\p{Jg=ManichaeanBeth}) (2: U+10AC1..10AC2) \p{Joining_Group: Manichaean_Daleth} (Short: \p{Jg= ManichaeanDaleth}) (1: U+10AC5) \p{Joining_Group: Manichaean_Dhamedh} (Short: \p{Jg= ManichaeanDhamedh}) (1: U+10AD4) \p{Joining_Group: Manichaean_Five} (Short: \p{Jg=ManichaeanFive}) (1: U+10AEC) \p{Joining_Group: Manichaean_Gimel} (Short: \p{Jg= ManichaeanGimel}) (2: U+10AC3..10AC4) \p{Joining_Group: Manichaean_Heth} (Short: \p{Jg=ManichaeanHeth}) (1: U+10ACD) \p{Joining_Group: Manichaean_Hundred} (Short: \p{Jg= ManichaeanHundred}) (1: U+10AEF) \p{Joining_Group: Manichaean_Kaph} (Short: \p{Jg=ManichaeanKaph}) (3: U+10AD0..10AD2) \p{Joining_Group: Manichaean_Lamedh} (Short: \p{Jg= ManichaeanLamedh}) (1: U+10AD3) \p{Joining_Group: Manichaean_Mem} (Short: \p{Jg=ManichaeanMem}) (1: U+10AD6) \p{Joining_Group: Manichaean_Nun} (Short: \p{Jg=ManichaeanNun}) (1: U+10AD7) \p{Joining_Group: Manichaean_One} (Short: \p{Jg=ManichaeanOne}) (1: U+10AEB) \p{Joining_Group: Manichaean_Pe} (Short: \p{Jg=ManichaeanPe}) (2: U+10ADB..10ADC) \p{Joining_Group: Manichaean_Qoph} (Short: \p{Jg=ManichaeanQoph}) (3: U+10ADE..10AE0) \p{Joining_Group: Manichaean_Resh} (Short: \p{Jg=ManichaeanResh}) (1: U+10AE1) \p{Joining_Group: Manichaean_Sadhe} (Short: \p{Jg= ManichaeanSadhe}) (1: U+10ADD) \p{Joining_Group: Manichaean_Samekh} (Short: \p{Jg= ManichaeanSamekh}) (1: U+10AD8) \p{Joining_Group: Manichaean_Taw} (Short: \p{Jg=ManichaeanTaw}) (1: U+10AE4) \p{Joining_Group: Manichaean_Ten} (Short: \p{Jg=ManichaeanTen}) (1: U+10AED) \p{Joining_Group: Manichaean_Teth} (Short: \p{Jg=ManichaeanTeth}) (1: U+10ACE) \p{Joining_Group: Manichaean_Thamedh} (Short: \p{Jg= ManichaeanThamedh}) (1: U+10AD5) \p{Joining_Group: Manichaean_Twenty} (Short: \p{Jg= ManichaeanTwenty}) (1: U+10AEE) \p{Joining_Group: Manichaean_Waw} (Short: \p{Jg=ManichaeanWaw}) (1: U+10AC7) \p{Joining_Group: Manichaean_Yodh} (Short: \p{Jg=ManichaeanYodh}) (1: U+10ACF) \p{Joining_Group: Manichaean_Zayin} (Short: \p{Jg= ManichaeanZayin}) (2: 
U+10AC9..10ACA) \p{Joining_Group: Meem} (Short: \p{Jg=Meem}) (4: U+0645, U+0765..0766, U+08A7) \p{Joining_Group: Mim} (Short: \p{Jg=Mim}) (1: U+0721) \p{Joining_Group: No_Joining_Group} (Short: \p{Jg=NoJoiningGroup}) (1_113_790 plus all above-Unicode code points: U+0000..061F, U+0621, U+0640, U+064B..066D, U+0670, U+0674 ...) \p{Joining_Group: Noon} (Short: \p{Jg=Noon}) (8: U+0646, U+06B9..06BC, U+0767..0769) \p{Joining_Group: Nun} (Short: \p{Jg=Nun}) (1: U+0722) \p{Joining_Group: Nya} (Short: \p{Jg=Nya}) (1: U+06BD) \p{Joining_Group: Pe} (Short: \p{Jg=Pe}) (1: U+0726) \p{Joining_Group: Qaf} (Short: \p{Jg=Qaf}) (5: U+0642, U+066F, U+06A7..06A8, U+08A5) \p{Joining_Group: Qaph} (Short: \p{Jg=Qaph}) (1: U+0729) \p{Joining_Group: Reh} (Short: \p{Jg=Reh}) (19: U+0631..0632, U+0691..0699, U+06EF, U+075B, U+076B..076C, U+0771 ...) \p{Joining_Group: Reversed_Pe} (Short: \p{Jg=ReversedPe}) (1: U+0727) \p{Joining_Group: Rohingya_Yeh} (Short: \p{Jg=RohingyaYeh}) (1: U+08AC) \p{Joining_Group: Sad} (Short: \p{Jg=Sad}) (6: U+0635..0636, U+069D..069E, U+06FB, U+08AF) \p{Joining_Group: Sadhe} (Short: \p{Jg=Sadhe}) (1: U+0728) \p{Joining_Group: Seen} (Short: \p{Jg=Seen}) (11: U+0633..0634, U+069A..069C, U+06FA, U+075C, U+076D, U+0770 ...) 
\p{Joining_Group: Semkath}  (Short: \p{Jg=Semkath}) (1: U+0723)
\p{Joining_Group: Shin}  (Short: \p{Jg=Shin}) (1: U+072B)
\p{Joining_Group: Straight_Waw}  (Short: \p{Jg=StraightWaw}) (1: U+08B1)
\p{Joining_Group: Swash_Kaf}  (Short: \p{Jg=SwashKaf}) (1: U+06AA)
\p{Joining_Group: Syriac_Waw}  (Short: \p{Jg=SyriacWaw}) (1: U+0718)
\p{Joining_Group: Tah}  (Short: \p{Jg=Tah}) (4: U+0637..0638, U+069F, U+08A3)
\p{Joining_Group: Taw}  (Short: \p{Jg=Taw}) (1: U+072C)
\p{Joining_Group: Teh_Marbuta}  (Short: \p{Jg=TehMarbuta}) (3: U+0629, U+06C0, U+06D5)
\p{Joining_Group: Teh_Marbuta_Goal}  \p{Joining_Group=Hamza_On_Heh_Goal} (1)
\p{Joining_Group: Teth}  (Short: \p{Jg=Teth}) (2: U+071B..071C)
\p{Joining_Group: Waw}  (Short: \p{Jg=Waw}) (16: U+0624, U+0648, U+0676..0677, U+06C4..06CB, U+06CF, U+0778..0779 ...)
\p{Joining_Group: Yeh}  (Short: \p{Jg=Yeh}) (11: U+0620, U+0626, U+0649..064A, U+0678, U+06D0..06D1, U+0777 ...)
\p{Joining_Group: Yeh_Barree}  (Short: \p{Jg=YehBarree}) (2: U+06D2..06D3)
\p{Joining_Group: Yeh_With_Tail}  (Short: \p{Jg=YehWithTail}) (1: U+06CD)
\p{Joining_Group: Yudh}  (Short: \p{Jg=Yudh}) (1: U+071D)
\p{Joining_Group: Yudh_He}  (Short: \p{Jg=YudhHe}) (1: U+071E)
\p{Joining_Group: Zain}  (Short: \p{Jg=Zain}) (1: U+0719)
\p{Joining_Group: Zhain}  (Short: \p{Jg=Zhain}) (1: U+074D)
\p{Joining_Type: C}  \p{Joining_Type=Join_Causing} (4)
\p{Joining_Type: D}  \p{Joining_Type=Dual_Joining} (586)
\p{Joining_Type: Dual_Joining}  (Short: \p{Jt=D}) (586: U+0620, U+0626, U+0628, U+062A..062E, U+0633..063F, U+0641..0647 ...)
\p{Joining_Type: Join_Causing}  (Short: \p{Jt=C}) (4: U+0640, U+07FA, U+180A, U+200D)
\p{Joining_Type: L}  \p{Joining_Type=Left_Joining} (5)
\p{Joining_Type: Left_Joining}  (Short: \p{Jt=L}) (5: U+A872, U+10ACD, U+10AD7, U+10D00, U+10FCB)
\p{Joining_Type: Non_Joining}  (Short: \p{Jt=U}) (1_111_390 plus all above-Unicode code points: [\x00-\xac\xae-\xff], U+0100..02FF, U+0370..0482, U+048A..0590, U+05BE, U+05C0 ...)
\p{Joining_Type: R} \p{Joining_Type=Right_Joining} (130) \p{Joining_Type: Right_Joining} (Short: \p{Jt=R}) (130: U+0622..0625, U+0627, U+0629, U+062F..0632, U+0648, U+0671..0673 ...) \p{Joining_Type: T} \p{Joining_Type=Transparent} (1997) \p{Joining_Type: Transparent} (Short: \p{Jt=T}) (1997: [\xad], U+0300..036F, U+0483..0489, U+0591..05BD, U+05BF, U+05C1..05C2 ...) \p{Joining_Type: U} \p{Joining_Type=Non_Joining} (1_111_390 plus all above-Unicode code points) \p{Jt: *} \p{Joining_Type: *} \p{Kaithi} \p{Script_Extensions=Kaithi} (Short: \p{Kthi}; NOT \p{Block=Kaithi}) (87) \p{Kali} \p{Kayah_Li} (= \p{Script_Extensions= Kayah_Li}) (48) \p{Kana} \p{Katakana} (= \p{Script_Extensions= Katakana}) (NOT \p{Block=Katakana}) (356) X \p{Kana_Ext_A} \p{Kana_Extended_A} (= \p{Block= Kana_Extended_A}) (48) X \p{Kana_Extended_A} \p{Block=Kana_Extended_A} (Short: \p{InKanaExtA}) (48) X \p{Kana_Sup} \p{Kana_Supplement} (= \p{Block= Kana_Supplement}) (256) X \p{Kana_Supplement} \p{Block=Kana_Supplement} (Short: \p{InKanaSup}) (256) X \p{Kanbun} \p{Block=Kanbun} (16) X \p{Kangxi} \p{Kangxi_Radicals} (= \p{Block= Kangxi_Radicals}) (224) X \p{Kangxi_Radicals} \p{Block=Kangxi_Radicals} (Short: \p{InKangxi}) (224) \p{Kannada} \p{Script_Extensions=Kannada} (Short: \p{Knda}; NOT \p{Block=Kannada}) (104) \p{Katakana} \p{Script_Extensions=Katakana} (Short: \p{Kana}; NOT \p{Block=Katakana}) (356) X \p{Katakana_Ext} \p{Katakana_Phonetic_Extensions} (= \p{Block=Katakana_Phonetic_Extensions}) (16) X \p{Katakana_Phonetic_Extensions} \p{Block= Katakana_Phonetic_Extensions} (Short: \p{InKatakanaExt}) (16) \p{Kayah_Li} \p{Script_Extensions=Kayah_Li} (Short: \p{Kali}) (48) \p{Khar} \p{Kharoshthi} (= \p{Script_Extensions= Kharoshthi}) (NOT \p{Block=Kharoshthi}) (68) \p{Kharoshthi} \p{Script_Extensions=Kharoshthi} (Short: \p{Khar}; NOT \p{Block=Kharoshthi}) (68) \p{Khitan_Small_Script} \p{Script_Extensions=Khitan_Small_Script} (Short: \p{Kits}; NOT \p{Block= Khitan_Small_Script}) (471) \p{Khmer} 
\p{Script_Extensions=Khmer} (Short: \p{Khmr}; NOT \p{Block=Khmer}) (146) X \p{Khmer_Symbols} \p{Block=Khmer_Symbols} (32) \p{Khmr} \p{Khmer} (= \p{Script_Extensions=Khmer}) (NOT \p{Block=Khmer}) (146) \p{Khoj} \p{Khojki} (= \p{Script_Extensions= Khojki}) (NOT \p{Block=Khojki}) (82) \p{Khojki} \p{Script_Extensions=Khojki} (Short: \p{Khoj}; NOT \p{Block=Khojki}) (82) \p{Khudawadi} \p{Script_Extensions=Khudawadi} (Short: \p{Sind}; NOT \p{Block=Khudawadi}) (81) \p{Kits} \p{Khitan_Small_Script} (= \p{Script_Extensions= Khitan_Small_Script}) (NOT \p{Block= Khitan_Small_Script}) (471) \p{Knda} \p{Kannada} (= \p{Script_Extensions= Kannada}) (NOT \p{Block=Kannada}) (104) \p{Kthi} \p{Kaithi} (= \p{Script_Extensions= Kaithi}) (NOT \p{Block=Kaithi}) (87) \p{L} \pL \p{Letter} (= \p{General_Category=Letter}) (131_241) X \p{L&} \p{Cased_Letter} (= \p{General_Category= Cased_Letter}) (3977) X \p{L_} \p{Cased_Letter} (= \p{General_Category= Cased_Letter}) Note the trailing '_' matters in spite of loose matching rules. 
(3977) \p{Lana} \p{Tai_Tham} (= \p{Script_Extensions= Tai_Tham}) (NOT \p{Block=Tai_Tham}) (127) \p{Lao} \p{Script_Extensions=Lao} (NOT \p{Block= Lao}) (82) \p{Laoo} \p{Lao} (= \p{Script_Extensions=Lao}) (NOT \p{Block=Lao}) (82) \p{Latin} \p{Script_Extensions=Latin} (Short: \p{Latn}) (1403) X \p{Latin_1} \p{Latin_1_Supplement} (= \p{Block= Latin_1_Supplement}) (128) X \p{Latin_1_Sup} \p{Latin_1_Supplement} (= \p{Block= Latin_1_Supplement}) (128) X \p{Latin_1_Supplement} \p{Block=Latin_1_Supplement} (Short: \p{InLatin1}) (128) X \p{Latin_Ext_A} \p{Latin_Extended_A} (= \p{Block= Latin_Extended_A}) (128) X \p{Latin_Ext_Additional} \p{Latin_Extended_Additional} (= \p{Block=Latin_Extended_Additional}) (256) X \p{Latin_Ext_B} \p{Latin_Extended_B} (= \p{Block= Latin_Extended_B}) (208) X \p{Latin_Ext_C} \p{Latin_Extended_C} (= \p{Block= Latin_Extended_C}) (32) X \p{Latin_Ext_D} \p{Latin_Extended_D} (= \p{Block= Latin_Extended_D}) (224) X \p{Latin_Ext_E} \p{Latin_Extended_E} (= \p{Block= Latin_Extended_E}) (64) X \p{Latin_Extended_A} \p{Block=Latin_Extended_A} (Short: \p{InLatinExtA}) (128) X \p{Latin_Extended_Additional} \p{Block=Latin_Extended_Additional} (Short: \p{InLatinExtAdditional}) (256) X \p{Latin_Extended_B} \p{Block=Latin_Extended_B} (Short: \p{InLatinExtB}) (208) X \p{Latin_Extended_C} \p{Block=Latin_Extended_C} (Short: \p{InLatinExtC}) (32) X \p{Latin_Extended_D} \p{Block=Latin_Extended_D} (Short: \p{InLatinExtD}) (224) X \p{Latin_Extended_E} \p{Block=Latin_Extended_E} (Short: \p{InLatinExtE}) (64) \p{Latn} \p{Latin} (= \p{Script_Extensions=Latin}) (1403) \p{Lb: *} \p{Line_Break: *} \p{LC} \p{Cased_Letter} (= \p{General_Category= Cased_Letter}) (3977) \p{Lepc} \p{Lepcha} (= \p{Script_Extensions= Lepcha}) (NOT \p{Block=Lepcha}) (74) \p{Lepcha} \p{Script_Extensions=Lepcha} (Short: \p{Lepc}; NOT \p{Block=Lepcha}) (74) \p{Letter} \p{General_Category=Letter} (Short: \p{L}) (131_241) \p{Letter_Number} \p{General_Category=Letter_Number} (Short: \p{Nl}) (236) X 
\p{Letterlike_Symbols} \p{Block=Letterlike_Symbols} (80) \p{Limb} \p{Limbu} (= \p{Script_Extensions=Limbu}) (NOT \p{Block=Limbu}) (69) \p{Limbu} \p{Script_Extensions=Limbu} (Short: \p{Limb}; NOT \p{Block=Limbu}) (69) \p{Lina} \p{Linear_A} (= \p{Script_Extensions= Linear_A}) (NOT \p{Block=Linear_A}) (386) \p{Linb} \p{Linear_B} (= \p{Script_Extensions= Linear_B}) (268) \p{Line_Break: AI} \p{Line_Break=Ambiguous} (707) \p{Line_Break: AL} \p{Line_Break=Alphabetic} (21_400) \p{Line_Break: Alphabetic} (Short: \p{Lb=AL}) (21_400: [#&*<=>\@A- Z\^_`a-z~\xa6\xa9\xac\xae-\xaf\xb5\xc0- \xd6\xd8-\xf6\xf8-\xff], U+0100..02C6, U+02CE..02CF, U+02D1..02D7, U+02DC, U+02DE ...) \p{Line_Break: Ambiguous} (Short: \p{Lb=AI}) (707: [\xa7-\xa8\xaa \xb2-\xb3\xb6-\xba\xbc-\xbe\xd7\xf7], U+02C7, U+02C9..02CB, U+02CD, U+02D0, U+02D8..02DB ...) \p{Line_Break: B2} \p{Line_Break=Break_Both} (3) \p{Line_Break: BA} \p{Line_Break=Break_After} (244) \p{Line_Break: BB} \p{Line_Break=Break_Before} (45) \p{Line_Break: BK} \p{Line_Break=Mandatory_Break} (4) \p{Line_Break: Break_After} (Short: \p{Lb=BA}) (244: [\t\|\xad], U+058A, U+05BE, U+0964..0965, U+0E5A..0E5B, U+0F0B ...) \p{Line_Break: Break_Before} (Short: \p{Lb=BB}) (45: [\xb4], U+02C8, U+02CC, U+02DF, U+0C77, U+0C84 ...) \p{Line_Break: Break_Both} (Short: \p{Lb=B2}) (3: U+2014, U+2E3A..2E3B) \p{Line_Break: Break_Symbols} (Short: \p{Lb=SY}) (1: [\/]) \p{Line_Break: Carriage_Return} (Short: \p{Lb=CR}) (1: [\r]) \p{Line_Break: CB} \p{Line_Break=Contingent_Break} (1) \p{Line_Break: CJ} \p{Line_Break= Conditional_Japanese_Starter} (58) \p{Line_Break: CL} \p{Line_Break=Close_Punctuation} (91) \p{Line_Break: Close_Parenthesis} (Short: \p{Lb=CP}) (2: [\)\]]) \p{Line_Break: Close_Punctuation} (Short: \p{Lb=CL}) (91: [\}], U+0F3B, U+0F3D, U+169C, U+2046, U+207E ...) 
  \p{Line_Break: CM} \p{Line_Break=Combining_Mark} (2286)
  \p{Line_Break: Combining_Mark} (Short: \p{Lb=CM}) (2286: [^\t\n \cK\f\r\x20-\x7e\x85\xa0-\xff], U+0300..034E, U+0350..035B, U+0363..036F, U+0483..0489, U+0591..05BD ...)
  \p{Line_Break: Complex_Context} (Short: \p{Lb=SA}) (750: U+0E01..0E3A, U+0E40..0E4E, U+0E81..0E82, U+0E84, U+0E86..0E8A, U+0E8C..0EA3 ...)
  \p{Line_Break: Conditional_Japanese_Starter} (Short: \p{Lb=CJ}) (58: U+3041, U+3043, U+3045, U+3047, U+3049, U+3063 ...)
  \p{Line_Break: Contingent_Break} (Short: \p{Lb=CB}) (1: U+FFFC)
  \p{Line_Break: CP} \p{Line_Break=Close_Parenthesis} (2)
  \p{Line_Break: CR} \p{Line_Break=Carriage_Return} (1)
  \p{Line_Break: E_Base} (Short: \p{Lb=EB}) (122: U+261D, U+26F9, U+270A..270D, U+1F385, U+1F3C2..1F3C4, U+1F3C7 ...)
  \p{Line_Break: E_Modifier} (Short: \p{Lb=EM}) (5: U+1F3FB..1F3FF)
  \p{Line_Break: EB} \p{Line_Break=E_Base} (122)
  \p{Line_Break: EM} \p{Line_Break=E_Modifier} (5)
  \p{Line_Break: EX} \p{Line_Break=Exclamation} (37)
  \p{Line_Break: Exclamation} (Short: \p{Lb=EX}) (37: [!?], U+05C6, U+061B, U+061E..061F, U+06D4, U+07F9 ...)
  \p{Line_Break: GL} \p{Line_Break=Glue} (26)
  \p{Line_Break: Glue} (Short: \p{Lb=GL}) (26: [\xa0], U+034F, U+035C..0362, U+0F08, U+0F0C, U+0F12 ...)
  \p{Line_Break: H2} (Short: \p{Lb=H2}) (399: U+AC00, U+AC1C, U+AC38, U+AC54, U+AC70, U+AC8C ...)
  \p{Line_Break: H3} (Short: \p{Lb=H3}) (10_773: U+AC01..AC1B, U+AC1D..AC37, U+AC39..AC53, U+AC55..AC6F, U+AC71..AC8B, U+AC8D..ACA7 ...)
  \p{Line_Break: Hebrew_Letter} (Short: \p{Lb=HL}) (75: U+05D0..05EA, U+05EF..05F2, U+FB1D, U+FB1F..FB28, U+FB2A..FB36, U+FB38..FB3C ...)
  \p{Line_Break: HL} \p{Line_Break=Hebrew_Letter} (75)
  \p{Line_Break: HY} \p{Line_Break=Hyphen} (1)
  \p{Line_Break: Hyphen} (Short: \p{Lb=HY}) (1: [\-])
  \p{Line_Break: ID} \p{Line_Break=Ideographic} (172_462)
  \p{Line_Break: Ideographic} (Short: \p{Lb=ID}) (172_462: U+231A..231B, U+23F0..23F3, U+2600..2603, U+2614..2615, U+2618, U+261A..261C ...)
  \p{Line_Break: IN} \p{Line_Break=Inseparable} (6)
  \p{Line_Break: Infix_Numeric} (Short: \p{Lb=IS}) (13: [,.:;], U+037E, U+0589, U+060C..060D, U+07F8, U+2044 ...)
  \p{Line_Break: Inseparable} (Short: \p{Lb=IN}) (6: U+2024..2026, U+22EF, U+FE19, U+10AF6)
  \p{Line_Break: Inseperable} \p{Line_Break=Inseparable} (6)
  \p{Line_Break: IS} \p{Line_Break=Infix_Numeric} (13)
  \p{Line_Break: JL} (Short: \p{Lb=JL}) (125: U+1100..115F, U+A960..A97C)
  \p{Line_Break: JT} (Short: \p{Lb=JT}) (137: U+11A8..11FF, U+D7CB..D7FB)
  \p{Line_Break: JV} (Short: \p{Lb=JV}) (95: U+1160..11A7, U+D7B0..D7C6)
  \p{Line_Break: LF} \p{Line_Break=Line_Feed} (1)
  \p{Line_Break: Line_Feed} (Short: \p{Lb=LF}) (1: [\n])
  \p{Line_Break: Mandatory_Break} (Short: \p{Lb=BK}) (4: [\cK\f], U+2028..2029)
  \p{Line_Break: Next_Line} (Short: \p{Lb=NL}) (1: [\x85])
  \p{Line_Break: NL} \p{Line_Break=Next_Line} (1)
  \p{Line_Break: Nonstarter} (Short: \p{Lb=NS}) (33: U+17D6, U+203C..203D, U+2047..2049, U+3005, U+301C, U+303B..303C ...)
  \p{Line_Break: NS} \p{Line_Break=Nonstarter} (33)
  \p{Line_Break: NU} \p{Line_Break=Numeric} (642)
  \p{Line_Break: Numeric} (Short: \p{Lb=NU}) (642: [0-9], U+0660..0669, U+066B..066C, U+06F0..06F9, U+07C0..07C9, U+0966..096F ...)
  \p{Line_Break: OP} \p{Line_Break=Open_Punctuation} (88)
  \p{Line_Break: Open_Punctuation} (Short: \p{Lb=OP}) (88: [\(\[\{ \xa1\xbf], U+0F3A, U+0F3C, U+169B, U+201A, U+201E ...)
  \p{Line_Break: PO} \p{Line_Break=Postfix_Numeric} (36)
  \p{Line_Break: Postfix_Numeric} (Short: \p{Lb=PO}) (36: [\%\xa2 \xb0], U+0609..060B, U+066A, U+09F2..09F3, U+09F9, U+0D79 ...)
  \p{Line_Break: PR} \p{Line_Break=Prefix_Numeric} (68)
  \p{Line_Break: Prefix_Numeric} (Short: \p{Lb=PR}) (68: [\$+\\\xa3- \xa5\xb1], U+058F, U+07FE..07FF, U+09FB, U+0AF1, U+0BF9 ...)
  \p{Line_Break: QU} \p{Line_Break=Quotation} (39)
  \p{Line_Break: Quotation} (Short: \p{Lb=QU}) (39: [\"\'\xab\xbb], U+2018..2019, U+201B..201D, U+201F, U+2039..203A, U+275B..2760 ...)
\p{Line_Break: Regional_Indicator} (Short: \p{Lb=RI}) (26: U+1F1E6..1F1FF) \p{Line_Break: RI} \p{Line_Break=Regional_Indicator} (26) \p{Line_Break: SA} \p{Line_Break=Complex_Context} (750) D \p{Line_Break: SG} \p{Line_Break=Surrogate} (2048) \p{Line_Break: SP} \p{Line_Break=Space} (1) \p{Line_Break: Space} (Short: \p{Lb=SP}) (1: [\x20]) D \p{Line_Break: Surrogate} Surrogates should never appear in well- formed text, and therefore shouldn't be the basis for line breaking (Short: \p{Lb=SG}) (2048: U+D800..DFFF) \p{Line_Break: SY} \p{Line_Break=Break_Symbols} (1) \p{Line_Break: Unknown} (Short: \p{Lb=XX}) (901_256 plus all above-Unicode code points: U+0378..0379, U+0380..0383, U+038B, U+038D, U+03A2, U+0530 ...) \p{Line_Break: WJ} \p{Line_Break=Word_Joiner} (2) \p{Line_Break: Word_Joiner} (Short: \p{Lb=WJ}) (2: U+2060, U+FEFF) \p{Line_Break: XX} \p{Line_Break=Unknown} (901_256 plus all above-Unicode code points) \p{Line_Break: ZW} \p{Line_Break=ZWSpace} (1) \p{Line_Break: ZWJ} (Short: \p{Lb=ZWJ}) (1: U+200D) \p{Line_Break: ZWSpace} (Short: \p{Lb=ZW}) (1: U+200B) \p{Line_Separator} \p{General_Category=Line_Separator} (Short: \p{Zl}) (1) \p{Linear_A} \p{Script_Extensions=Linear_A} (Short: \p{Lina}; NOT \p{Block=Linear_A}) (386) \p{Linear_B} \p{Script_Extensions=Linear_B} (Short: \p{Linb}) (268) X \p{Linear_B_Ideograms} \p{Block=Linear_B_Ideograms} (128) X \p{Linear_B_Syllabary} \p{Block=Linear_B_Syllabary} (128) \p{Lisu} \p{Script_Extensions=Lisu} (NOT \p{Block= Lisu}) (49) X \p{Lisu_Sup} \p{Lisu_Supplement} (= \p{Block= Lisu_Supplement}) (16) X \p{Lisu_Supplement} \p{Block=Lisu_Supplement} (Short: \p{InLisuSup}) (16) \p{Ll} \p{Lowercase_Letter} (= \p{General_Category=Lowercase_Letter}) (/i= General_Category=Cased_Letter) (2155) \p{Lm} \p{Modifier_Letter} (= \p{General_Category=Modifier_Letter}) (260) \p{Lo} \p{Other_Letter} (= \p{General_Category= Other_Letter}) (127_004) \p{LOE} \p{Logical_Order_Exception} (= \p{Logical_Order_Exception=Y}) (19) \p{LOE: *} 
\p{Logical_Order_Exception: *} \p{Logical_Order_Exception} \p{Logical_Order_Exception=Y} (Short: \p{LOE}) (19) \p{Logical_Order_Exception: N*} (Short: \p{LOE=N}, \P{LOE}) (1_114_093 plus all above-Unicode code points: U+0000..0E3F, U+0E45..0EBF, U+0EC5..19B4, U+19B8..19B9, U+19BB..AAB4, U+AAB7..AAB8 ...) \p{Logical_Order_Exception: Y*} (Short: \p{LOE=Y}, \p{LOE}) (19: U+0E40..0E44, U+0EC0..0EC4, U+19B5..19B7, U+19BA, U+AAB5..AAB6, U+AAB9 ...) X \p{Low_Surrogates} \p{Block=Low_Surrogates} (1024) \p{Lower} \p{XPosixLower} (= \p{Lowercase=Y}) (/i= Cased=Yes) (2344) \p{Lower: *} \p{Lowercase: *} \p{Lowercase} \p{XPosixLower} (= \p{Lowercase=Y}) (/i= Cased=Yes) (2344) \p{Lowercase: N*} (Short: \p{Lower=N}, \P{Lower}; /i= Cased= No) (1_111_768 plus all above-Unicode code points: [\x00-\x20!\"#\$\%&\' \(\)*+,\-.\/0-9:;<=>?\@A-Z\[\\\]\^_`\{ \|\}~\x7f-\xa9\xab-\xb4\xb6-\xb9\xbb- \xde\xf7], U+0100, U+0102, U+0104, U+0106, U+0108 ...) \p{Lowercase: Y*} (Short: \p{Lower=Y}, \p{Lower}; /i= Cased= Yes) (2344: [a-z\xaa\xb5\xba\xdf-\xf6 \xf8-\xff], U+0101, U+0103, U+0105, U+0107, U+0109 ...) 
\p{Lowercase_Letter} \p{General_Category=Lowercase_Letter} (Short: \p{Ll}; /i= General_Category= Cased_Letter) (2155) \p{Lt} \p{Titlecase_Letter} (= \p{General_Category=Titlecase_Letter}) (/i= General_Category=Cased_Letter) (31) \p{Lu} \p{Uppercase_Letter} (= \p{General_Category=Uppercase_Letter}) (/i= General_Category=Cased_Letter) (1791) \p{Lyci} \p{Lycian} (= \p{Script_Extensions= Lycian}) (NOT \p{Block=Lycian}) (29) \p{Lycian} \p{Script_Extensions=Lycian} (Short: \p{Lyci}; NOT \p{Block=Lycian}) (29) \p{Lydi} \p{Lydian} (= \p{Script_Extensions= Lydian}) (NOT \p{Block=Lydian}) (27) \p{Lydian} \p{Script_Extensions=Lydian} (Short: \p{Lydi}; NOT \p{Block=Lydian}) (27) \p{M} \pM \p{Mark} (= \p{General_Category=Mark}) (2295) \p{Mahajani} \p{Script_Extensions=Mahajani} (Short: \p{Mahj}; NOT \p{Block=Mahajani}) (61) \p{Mahj} \p{Mahajani} (= \p{Script_Extensions= Mahajani}) (NOT \p{Block=Mahajani}) (61) X \p{Mahjong} \p{Mahjong_Tiles} (= \p{Block= Mahjong_Tiles}) (48) X \p{Mahjong_Tiles} \p{Block=Mahjong_Tiles} (Short: \p{InMahjong}) (48) \p{Maka} \p{Makasar} (= \p{Script_Extensions= Makasar}) (NOT \p{Block=Makasar}) (25) \p{Makasar} \p{Script_Extensions=Makasar} (Short: \p{Maka}; NOT \p{Block=Makasar}) (25) \p{Malayalam} \p{Script_Extensions=Malayalam} (Short: \p{Mlym}; NOT \p{Block=Malayalam}) (126) \p{Mand} \p{Mandaic} (= \p{Script_Extensions= Mandaic}) (NOT \p{Block=Mandaic}) (30) \p{Mandaic} \p{Script_Extensions=Mandaic} (Short: \p{Mand}; NOT \p{Block=Mandaic}) (30) \p{Mani} \p{Manichaean} (= \p{Script_Extensions= Manichaean}) (NOT \p{Block=Manichaean}) (52) \p{Manichaean} \p{Script_Extensions=Manichaean} (Short: \p{Mani}; NOT \p{Block=Manichaean}) (52) \p{Marc} \p{Marchen} (= \p{Script_Extensions= Marchen}) (NOT \p{Block=Marchen}) (68) \p{Marchen} \p{Script_Extensions=Marchen} (Short: \p{Marc}; NOT \p{Block=Marchen}) (68) \p{Mark} \p{General_Category=Mark} (Short: \p{M}) (2295) \p{Masaram_Gondi} \p{Script_Extensions=Masaram_Gondi} (Short: \p{Gonm}; NOT \p{Block= 
Masaram_Gondi}) (77) \p{Math} \p{Math=Y} (2310) \p{Math: N*} (Single: \P{Math}) (1_111_802 plus all above-Unicode code points: [\x00-\x20! \"#\$\%&\'\(\)*,\-.\/0-9:;?\@A-Z \[\\\]_`a-z\{\}\x7f-\xab\xad-\xb0\xb2- \xd6\xd8-\xf6\xf8-\xff], U+0100..03CF, U+03D3..03D4, U+03D6..03EF, U+03F2..03F3, U+03F7..0605 ...) \p{Math: Y*} (Single: \p{Math}) (2310: [+<=>\^\|~\xac \xb1\xd7\xf7], U+03D0..03D2, U+03D5, U+03F0..03F1, U+03F4..03F6, U+0606..0608 ...) X \p{Math_Alphanum} \p{Mathematical_Alphanumeric_Symbols} (= \p{Block= Mathematical_Alphanumeric_Symbols}) (1024) X \p{Math_Operators} \p{Mathematical_Operators} (= \p{Block= Mathematical_Operators}) (256) \p{Math_Symbol} \p{General_Category=Math_Symbol} (Short: \p{Sm}) (948) X \p{Mathematical_Alphanumeric_Symbols} \p{Block= Mathematical_Alphanumeric_Symbols} (Short: \p{InMathAlphanum}) (1024) X \p{Mathematical_Operators} \p{Block=Mathematical_Operators} (Short: \p{InMathOperators}) (256) X \p{Mayan_Numerals} \p{Block=Mayan_Numerals} (32) \p{Mc} \p{Spacing_Mark} (= \p{General_Category= Spacing_Mark}) (443) \p{Me} \p{Enclosing_Mark} (= \p{General_Category= Enclosing_Mark}) (13) \p{Medefaidrin} \p{Script_Extensions=Medefaidrin} (Short: \p{Medf}; NOT \p{Block=Medefaidrin}) (91) \p{Medf} \p{Medefaidrin} (= \p{Script_Extensions= Medefaidrin}) (NOT \p{Block= Medefaidrin}) (91) \p{Meetei_Mayek} \p{Script_Extensions=Meetei_Mayek} (Short: \p{Mtei}; NOT \p{Block=Meetei_Mayek}) (79) X \p{Meetei_Mayek_Ext} \p{Meetei_Mayek_Extensions} (= \p{Block= Meetei_Mayek_Extensions}) (32) X \p{Meetei_Mayek_Extensions} \p{Block=Meetei_Mayek_Extensions} (Short: \p{InMeeteiMayekExt}) (32) \p{Mend} \p{Mende_Kikakui} (= \p{Script_Extensions= Mende_Kikakui}) (NOT \p{Block= Mende_Kikakui}) (213) \p{Mende_Kikakui} \p{Script_Extensions=Mende_Kikakui} (Short: \p{Mend}; NOT \p{Block= Mende_Kikakui}) (213) \p{Merc} \p{Meroitic_Cursive} (= \p{Script_Extensions=Meroitic_Cursive}) (NOT \p{Block=Meroitic_Cursive}) (90) \p{Mero} \p{Meroitic_Hieroglyphs} (= 
\p{Script_Extensions= Meroitic_Hieroglyphs}) (32) \p{Meroitic_Cursive} \p{Script_Extensions=Meroitic_Cursive} (Short: \p{Merc}; NOT \p{Block= Meroitic_Cursive}) (90) \p{Meroitic_Hieroglyphs} \p{Script_Extensions= Meroitic_Hieroglyphs} (Short: \p{Mero}) (32) \p{Miao} \p{Script_Extensions=Miao} (NOT \p{Block= Miao}) (149) X \p{Misc_Arrows} \p{Miscellaneous_Symbols_And_Arrows} (= \p{Block= Miscellaneous_Symbols_And_Arrows}) (256) X \p{Misc_Math_Symbols_A} \p{Miscellaneous_Mathematical_Symbols_A} (= \p{Block= Miscellaneous_Mathematical_Symbols_A}) (48) X \p{Misc_Math_Symbols_B} \p{Miscellaneous_Mathematical_Symbols_B} (= \p{Block= Miscellaneous_Mathematical_Symbols_B}) (128) X \p{Misc_Pictographs} \p{Miscellaneous_Symbols_And_Pictographs} (= \p{Block= Miscellaneous_Symbols_And_Pictographs}) (768) X \p{Misc_Symbols} \p{Miscellaneous_Symbols} (= \p{Block= Miscellaneous_Symbols}) (256) X \p{Misc_Technical} \p{Miscellaneous_Technical} (= \p{Block= Miscellaneous_Technical}) (256) X \p{Miscellaneous_Mathematical_Symbols_A} \p{Block= Miscellaneous_Mathematical_Symbols_A} (Short: \p{InMiscMathSymbolsA}) (48) X \p{Miscellaneous_Mathematical_Symbols_B} \p{Block= Miscellaneous_Mathematical_Symbols_B} (Short: \p{InMiscMathSymbolsB}) (128) X \p{Miscellaneous_Symbols} \p{Block=Miscellaneous_Symbols} (Short: \p{InMiscSymbols}) (256) X \p{Miscellaneous_Symbols_And_Arrows} \p{Block= Miscellaneous_Symbols_And_Arrows} (Short: \p{InMiscArrows}) (256) X \p{Miscellaneous_Symbols_And_Pictographs} \p{Block= Miscellaneous_Symbols_And_Pictographs} (Short: \p{InMiscPictographs}) (768) X \p{Miscellaneous_Technical} \p{Block=Miscellaneous_Technical} (Short: \p{InMiscTechnical}) (256) \p{Mlym} \p{Malayalam} (= \p{Script_Extensions= Malayalam}) (NOT \p{Block=Malayalam}) (126) \p{Mn} \p{Nonspacing_Mark} (= \p{General_Category=Nonspacing_Mark}) (1839) \p{Modi} \p{Script_Extensions=Modi} (NOT \p{Block= Modi}) (89) \p{Modifier_Letter} \p{General_Category=Modifier_Letter} (Short: \p{Lm}) (260) X 
\p{Modifier_Letters} \p{Spacing_Modifier_Letters} (= \p{Block= Spacing_Modifier_Letters}) (80) \p{Modifier_Symbol} \p{General_Category=Modifier_Symbol} (Short: \p{Sk}) (123) X \p{Modifier_Tone_Letters} \p{Block=Modifier_Tone_Letters} (32) \p{Mong} \p{Mongolian} (= \p{Script_Extensions= Mongolian}) (NOT \p{Block=Mongolian}) (171) \p{Mongolian} \p{Script_Extensions=Mongolian} (Short: \p{Mong}; NOT \p{Block=Mongolian}) (171) X \p{Mongolian_Sup} \p{Mongolian_Supplement} (= \p{Block= Mongolian_Supplement}) (32) X \p{Mongolian_Supplement} \p{Block=Mongolian_Supplement} (Short: \p{InMongolianSup}) (32) \p{Mro} \p{Script_Extensions=Mro} (NOT \p{Block= Mro}) (43) \p{Mroo} \p{Mro} (= \p{Script_Extensions=Mro}) (NOT \p{Block=Mro}) (43) \p{Mtei} \p{Meetei_Mayek} (= \p{Script_Extensions= Meetei_Mayek}) (NOT \p{Block= Meetei_Mayek}) (79) \p{Mult} \p{Multani} (= \p{Script_Extensions= Multani}) (NOT \p{Block=Multani}) (48) \p{Multani} \p{Script_Extensions=Multani} (Short: \p{Mult}; NOT \p{Block=Multani}) (48) X \p{Music} \p{Musical_Symbols} (= \p{Block= Musical_Symbols}) (256) X \p{Musical_Symbols} \p{Block=Musical_Symbols} (Short: \p{InMusic}) (256) \p{Myanmar} \p{Script_Extensions=Myanmar} (Short: \p{Mymr}; NOT \p{Block=Myanmar}) (224) X \p{Myanmar_Ext_A} \p{Myanmar_Extended_A} (= \p{Block= Myanmar_Extended_A}) (32) X \p{Myanmar_Ext_B} \p{Myanmar_Extended_B} (= \p{Block= Myanmar_Extended_B}) (32) X \p{Myanmar_Extended_A} \p{Block=Myanmar_Extended_A} (Short: \p{InMyanmarExtA}) (32) X \p{Myanmar_Extended_B} \p{Block=Myanmar_Extended_B} (Short: \p{InMyanmarExtB}) (32) \p{Mymr} \p{Myanmar} (= \p{Script_Extensions= Myanmar}) (NOT \p{Block=Myanmar}) (224) \p{N} \pN \p{Number} (= \p{General_Category=Number}) (1781) \p{Na=*} \p{Name=*} \p{Nabataean} \p{Script_Extensions=Nabataean} (Short: \p{Nbat}; NOT \p{Block=Nabataean}) (40) \p{Name=*} Combination of Name and Name_Alias properties; has special loose matching rules, for which see Unicode UAX #44 \p{Nand} \p{Nandinagari} (= 
\p{Script_Extensions= Nandinagari}) (NOT \p{Block= Nandinagari}) (86) \p{Nandinagari} \p{Script_Extensions=Nandinagari} (Short: \p{Nand}; NOT \p{Block=Nandinagari}) (86) \p{Narb} \p{Old_North_Arabian} (= \p{Script_Extensions=Old_North_Arabian}) (32) X \p{NB} \p{No_Block} (= \p{Block=No_Block}) (826_640 plus all above-Unicode code points) \p{Nbat} \p{Nabataean} (= \p{Script_Extensions= Nabataean}) (NOT \p{Block=Nabataean}) (40) \p{NChar} \p{Noncharacter_Code_Point} (= \p{Noncharacter_Code_Point=Y}) (66) \p{NChar: *} \p{Noncharacter_Code_Point: *} \p{Nd} \p{XPosixDigit} (= \p{General_Category= Decimal_Number}) (650) \p{New_Tai_Lue} \p{Script_Extensions=New_Tai_Lue} (Short: \p{Talu}; NOT \p{Block=New_Tai_Lue}) (83) \p{Newa} \p{Script_Extensions=Newa} (NOT \p{Block= Newa}) (97) \p{NFC_QC: *} \p{NFC_Quick_Check: *} \p{NFC_Quick_Check: M} \p{NFC_Quick_Check=Maybe} (111) \p{NFC_Quick_Check: Maybe} (Short: \p{NFCQC=M}) (111: U+0300..0304, U+0306..030C, U+030F, U+0311, U+0313..0314, U+031B ...) \p{NFC_Quick_Check: N} \p{NFC_Quick_Check=No} (NOT \P{NFC_Quick_Check} NOR \P{NFC_QC}) (1120) \p{NFC_Quick_Check: No} (Short: \p{NFCQC=N}; NOT \P{NFC_Quick_Check} NOR \P{NFC_QC}) (1120: U+0340..0341, U+0343..0344, U+0374, U+037E, U+0387, U+0958..095F ...) \p{NFC_Quick_Check: Y} \p{NFC_Quick_Check=Yes} (NOT \p{NFC_Quick_Check} NOR \p{NFC_QC}) (1_112_881 plus all above-Unicode code points) \p{NFC_Quick_Check: Yes} (Short: \p{NFCQC=Y}; NOT \p{NFC_Quick_Check} NOR \p{NFC_QC}) (1_112_881 plus all above-Unicode code points: U+0000..02FF, U+0305, U+030D..030E, U+0310, U+0312, U+0315..031A ...) \p{NFD_QC: *} \p{NFD_Quick_Check: *} \p{NFD_Quick_Check: N} \p{NFD_Quick_Check=No} (NOT \P{NFD_Quick_Check} NOR \P{NFD_QC}) (13_233) \p{NFD_Quick_Check: No} (Short: \p{NFDQC=N}; NOT \P{NFD_Quick_Check} NOR \P{NFD_QC}) (13_233: [\xc0-\xc5\xc7-\xcf\xd1-\xd6 \xd9-\xdd\xe0-\xe5\xe7-\xef\xf1-\xf6 \xf9-\xfd\xff], U+0100..010F, U+0112..0125, U+0128..0130, U+0134..0137, U+0139..013E ...) 
\p{NFD_Quick_Check: Y} \p{NFD_Quick_Check=Yes} (NOT \p{NFD_Quick_Check} NOR \p{NFD_QC}) (1_100_879 plus all above-Unicode code points) \p{NFD_Quick_Check: Yes} (Short: \p{NFDQC=Y}; NOT \p{NFD_Quick_Check} NOR \p{NFD_QC}) (1_100_879 plus all above-Unicode code points: [\x00-\xbf\xc6\xd0\xd7-\xd8\xde- \xdf\xe6\xf0\xf7-\xf8\xfe], U+0110..0111, U+0126..0127, U+0131..0133, U+0138, U+013F..0142 ...) \p{NFKC_QC: *} \p{NFKC_Quick_Check: *} \p{NFKC_Quick_Check: M} \p{NFKC_Quick_Check=Maybe} (111) \p{NFKC_Quick_Check: Maybe} (Short: \p{NFKCQC=M}) (111: U+0300..0304, U+0306..030C, U+030F, U+0311, U+0313..0314, U+031B ...) \p{NFKC_Quick_Check: N} \p{NFKC_Quick_Check=No} (NOT \P{NFKC_Quick_Check} NOR \P{NFKC_QC}) (4807) \p{NFKC_Quick_Check: No} (Short: \p{NFKCQC=N}; NOT \P{NFKC_Quick_Check} NOR \P{NFKC_QC}) (4807: [\xa0\xa8\xaa\xaf\xb2-\xb5\xb8- \xba\xbc-\xbe], U+0132..0133, U+013F..0140, U+0149, U+017F, U+01C4..01CC ...) \p{NFKC_Quick_Check: Y} \p{NFKC_Quick_Check=Yes} (NOT \p{NFKC_Quick_Check} NOR \p{NFKC_QC}) (1_109_194 plus all above-Unicode code points) \p{NFKC_Quick_Check: Yes} (Short: \p{NFKCQC=Y}; NOT \p{NFKC_Quick_Check} NOR \p{NFKC_QC}) (1_109_194 plus all above-Unicode code points: [\x00-\x9f\xa1-\xa7\xa9\xab- \xae\xb0-\xb1\xb6-\xb7\xbb\xbf-\xff], U+0100..0131, U+0134..013E, U+0141..0148, U+014A..017E, U+0180..01C3 ...) \p{NFKD_QC: *} \p{NFKD_Quick_Check: *} \p{NFKD_Quick_Check: N} \p{NFKD_Quick_Check=No} (NOT \P{NFKD_Quick_Check} NOR \P{NFKD_QC}) (16_908) \p{NFKD_Quick_Check: No} (Short: \p{NFKDQC=N}; NOT \P{NFKD_Quick_Check} NOR \P{NFKD_QC}) (16_908: [\xa0\xa8\xaa\xaf\xb2-\xb5\xb8- \xba\xbc-\xbe\xc0-\xc5\xc7-\xcf\xd1- \xd6\xd9-\xdd\xe0-\xe5\xe7-\xef\xf1- \xf6\xf9-\xfd\xff], U+0100..010F, U+0112..0125, U+0128..0130, U+0132..0137, U+0139..0140 ...) 
\p{NFKD_Quick_Check: Y} \p{NFKD_Quick_Check=Yes} (NOT \p{NFKD_Quick_Check} NOR \p{NFKD_QC}) (1_097_204 plus all above-Unicode code points) \p{NFKD_Quick_Check: Yes} (Short: \p{NFKDQC=Y}; NOT \p{NFKD_Quick_Check} NOR \p{NFKD_QC}) (1_097_204 plus all above-Unicode code points: [\x00-\x9f\xa1-\xa7\xa9\xab- \xae\xb0-\xb1\xb6-\xb7\xbb\xbf\xc6\xd0 \xd7-\xd8\xde-\xdf\xe6\xf0\xf7-\xf8 \xfe], U+0110..0111, U+0126..0127, U+0131, U+0138, U+0141..0142 ...) \p{Nko} \p{Script_Extensions=Nko} (NOT \p{Block= NKo}) (62) \p{Nkoo} \p{Nko} (= \p{Script_Extensions=Nko}) (NOT \p{Block=NKo}) (62) \p{Nl} \p{Letter_Number} (= \p{General_Category= Letter_Number}) (236) \p{No} \p{Other_Number} (= \p{General_Category= Other_Number}) (895) X \p{No_Block} \p{Block=No_Block} (Short: \p{InNB}) (826_640 plus all above-Unicode code points) \p{Noncharacter_Code_Point} \p{Noncharacter_Code_Point=Y} (Short: \p{NChar}) (66) \p{Noncharacter_Code_Point: N*} (Short: \p{NChar=N}, \P{NChar}) (1_114_046 plus all above-Unicode code points: U+0000..FDCF, U+FDF0..FFFD, U+10000..1FFFD, U+20000..2FFFD, U+30000..3FFFD, U+40000..4FFFD ...) \p{Noncharacter_Code_Point: Y*} (Short: \p{NChar=Y}, \p{NChar}) (66: U+FDD0..FDEF, U+FFFE..FFFF, U+1FFFE..1FFFF, U+2FFFE..2FFFF, U+3FFFE..3FFFF, U+4FFFE..4FFFF ...) \p{Nonspacing_Mark} \p{General_Category=Nonspacing_Mark} (Short: \p{Mn}) (1839) \p{Nshu} \p{Nushu} (= \p{Script_Extensions=Nushu}) (NOT \p{Block=Nushu}) (397) \p{Nt: *} \p{Numeric_Type: *} \p{Number} \p{General_Category=Number} (Short: \p{N}) (1781) X \p{Number_Forms} \p{Block=Number_Forms} (64) \p{Numeric_Type: De} \p{Numeric_Type=Decimal} (650) \p{Numeric_Type: Decimal} (Short: \p{Nt=De}) (650: [0-9], U+0660..0669, U+06F0..06F9, U+07C0..07C9, U+0966..096F, U+09E6..09EF ...) \p{Numeric_Type: Di} \p{Numeric_Type=Digit} (128) \p{Numeric_Type: Digit} (Short: \p{Nt=Di}) (128: [\xb2-\xb3\xb9], U+1369..1371, U+19DA, U+2070, U+2074..2079, U+2080..2089 ...) 
  \p{Numeric_Type: None} (Short: \p{Nt=None}) (1_112_250 plus all above-Unicode code points: [\x00-\x20! \"#\$\%&\'\(\)*+,\-.\/:;<=>?\@A-Z\[\\\] \^_`a-z\{\|\}~\x7f-\xb1\xb4-\xb8\xba- \xbb\xbf-\xff], U+0100..065F, U+066A..06EF, U+06FA..07BF, U+07CA..0965, U+0970..09E5 ...)
  \p{Numeric_Type: Nu} \p{Numeric_Type=Numeric} (1084)
  \p{Numeric_Type: Numeric} (Short: \p{Nt=Nu}) (1084: [\xbc-\xbe], U+09F4..09F9, U+0B72..0B77, U+0BF0..0BF2, U+0C78..0C7E, U+0D58..0D5E ...)
T \p{Numeric_Value: -1/2} (Short: \p{Nv=-1/2}) (1: U+0F33)
T \p{Numeric_Value: 0} (Short: \p{Nv=0}) (83: [0], U+0660, U+06F0, U+07C0, U+0966, U+09E6 ...)
T \p{Numeric_Value: 1/320} (Short: \p{Nv=1/320}) (2: U+11FC0, U+11FD4)
T \p{Numeric_Value: 1/160} (Short: \p{Nv=1/160}) (2: U+0D58, U+11FC1)
T \p{Numeric_Value: 1/80} (Short: \p{Nv=1/80}) (1: U+11FC2)
T \p{Numeric_Value: 1/64} (Short: \p{Nv=1/64}) (1: U+11FC3)
T \p{Numeric_Value: 1/40} (Short: \p{Nv=1/40}) (2: U+0D59, U+11FC4)
T \p{Numeric_Value: 1/32} (Short: \p{Nv=1/32}) (1: U+11FC5)
T \p{Numeric_Value: 3/80} (Short: \p{Nv=3/80}) (2: U+0D5A, U+11FC6)
T \p{Numeric_Value: 3/64} (Short: \p{Nv=3/64}) (1: U+11FC7)
T \p{Numeric_Value: 1/20} (Short: \p{Nv=1/20}) (2: U+0D5B, U+11FC8)
T \p{Numeric_Value: 1/16} (Short: \p{Nv=1/16}) (6: U+09F4, U+0B75, U+0D76, U+A833, U+11FC9..11FCA)
T \p{Numeric_Value: 1/12} (Short: \p{Nv=1/12}) (1: U+109F6)
T \p{Numeric_Value: 1/10} (Short: \p{Nv=1/10}) (3: U+0D5C, U+2152, U+11FCB)
T \p{Numeric_Value: 1/9} (Short: \p{Nv=1/9}) (1: U+2151)
T \p{Numeric_Value: 1/8} (Short: \p{Nv=1/8}) (7: U+09F5, U+0B76, U+0D77, U+215B, U+A834, U+11FCC ...)
T \p{Numeric_Value: 1/7} (Short: \p{Nv=1/7}) (1: U+2150)
T \p{Numeric_Value: 3/20} (Short: \p{Nv=3/20}) (2: U+0D5D, U+11FCD)
T \p{Numeric_Value: 1/6} (Short: \p{Nv=1/6}) (4: U+2159, U+109F7, U+12461, U+1ED3D)
T \p{Numeric_Value: 3/16} (Short: \p{Nv=3/16}) (5: U+09F6, U+0B77, U+0D78, U+A835, U+11FCE)
T \p{Numeric_Value: 1/5} (Short: \p{Nv=1/5}) (3: U+0D5E, U+2155, U+11FCF)
T \p{Numeric_Value: 1/4} (Short: \p{Nv=1/4}) (14: [\xbc], U+09F7, U+0B72, U+0D73, U+A830, U+10140 ...)
T \p{Numeric_Value: 1/3} (Short: \p{Nv=1/3}) (6: U+2153, U+109F9, U+10E7D, U+1245A, U+1245D, U+12465)
T \p{Numeric_Value: 3/8} (Short: \p{Nv=3/8}) (1: U+215C)
T \p{Numeric_Value: 2/5} (Short: \p{Nv=2/5}) (1: U+2156)
T \p{Numeric_Value: 5/12} (Short: \p{Nv=5/12}) (1: U+109FA)
T \p{Numeric_Value: 1/2} (Short: \p{Nv=1/2}) (19: [\xbd], U+0B73, U+0D74, U+0F2A, U+2CFD, U+A831 ...)
T \p{Numeric_Value: 7/12} (Short: \p{Nv=7/12}) (1: U+109FC)
T \p{Numeric_Value: 3/5} (Short: \p{Nv=3/5}) (1: U+2157)
T \p{Numeric_Value: 5/8} (Short: \p{Nv=5/8}) (1: U+215D)
T \p{Numeric_Value: 2/3} (Short: \p{Nv=2/3}) (7: U+2154, U+10177, U+109FD, U+10E7E, U+1245B, U+1245E ...)
T \p{Numeric_Value: 3/4} (Short: \p{Nv=3/4}) (9: [\xbe], U+09F8, U+0B74, U+0D75, U+A832, U+10178 ...)
T \p{Numeric_Value: 4/5} (Short: \p{Nv=4/5}) (1: U+2158)
T \p{Numeric_Value: 5/6} (Short: \p{Nv=5/6}) (3: U+215A, U+109FF, U+1245C)
T \p{Numeric_Value: 7/8} (Short: \p{Nv=7/8}) (1: U+215E)
T \p{Numeric_Value: 11/12} (Short: \p{Nv=11/12}) (1: U+109BC)
T \p{Numeric_Value: 1} (Short: \p{Nv=1}) (140: [1\xb9], U+0661, U+06F1, U+07C1, U+0967, U+09E7 ...)
T \p{Numeric_Value: 3/2} (Short: \p{Nv=3/2}) (1: U+0F2B)
T \p{Numeric_Value: 2} (Short: \p{Nv=2}) (139: [2\xb2], U+0662, U+06F2, U+07C2, U+0968, U+09E8 ...)
T \p{Numeric_Value: 5/2} (Short: \p{Nv=5/2}) (1: U+0F2C)
T \p{Numeric_Value: 3} (Short: \p{Nv=3}) (140: [3\xb3], U+0663, U+06F3, U+07C3, U+0969, U+09E9 ...)
T \p{Numeric_Value: 7/2} (Short: \p{Nv=7/2}) (1: U+0F2D)
T \p{Numeric_Value: 4} (Short: \p{Nv=4}) (131: [4], U+0664, U+06F4, U+07C4, U+096A, U+09EA ...)
T \p{Numeric_Value: 9/2} (Short: \p{Nv=9/2}) (1: U+0F2E)
T \p{Numeric_Value: 5} (Short: \p{Nv=5}) (129: [5], U+0665, U+06F5, U+07C5, U+096B, U+09EB ...)
T \p{Numeric_Value: 11/2} (Short: \p{Nv=11/2}) (1: U+0F2F)
T \p{Numeric_Value: 6} (Short: \p{Nv=6}) (113: [6], U+0666, U+06F6, U+07C6, U+096C, U+09EC ...)
T \p{Numeric_Value: 13/2} (Short: \p{Nv=13/2}) (1: U+0F30)
T \p{Numeric_Value: 7} (Short: \p{Nv=7}) (112: [7], U+0667, U+06F7, U+07C7, U+096D, U+09ED ...)
T \p{Numeric_Value: 15/2} (Short: \p{Nv=15/2}) (1: U+0F31)
T \p{Numeric_Value: 8} (Short: \p{Nv=8}) (108: [8], U+0668, U+06F8, U+07C8, U+096E, U+09EE ...)
T \p{Numeric_Value: 17/2} (Short: \p{Nv=17/2}) (1: U+0F32)
T \p{Numeric_Value: 9} (Short: \p{Nv=9}) (112: [9], U+0669, U+06F9, U+07C9, U+096F, U+09EF ...)
T \p{Numeric_Value: 10} (Short: \p{Nv=10}) (62: U+0BF0, U+0D70, U+1372, U+2169, U+2179, U+2469 ...)
T \p{Numeric_Value: 11} (Short: \p{Nv=11}) (8: U+216A, U+217A, U+246A, U+247E, U+2492, U+24EB ...)
T \p{Numeric_Value: 12} (Short: \p{Nv=12}) (8: U+216B, U+217B, U+246B, U+247F, U+2493, U+24EC ...)
T \p{Numeric_Value: 13} (Short: \p{Nv=13}) (6: U+246C, U+2480, U+2494, U+24ED, U+16E8D, U+1D2ED)
T \p{Numeric_Value: 14} (Short: \p{Nv=14}) (6: U+246D, U+2481, U+2495, U+24EE, U+16E8E, U+1D2EE)
T \p{Numeric_Value: 15} (Short: \p{Nv=15}) (6: U+246E, U+2482, U+2496, U+24EF, U+16E8F, U+1D2EF)
T \p{Numeric_Value: 16} (Short: \p{Nv=16}) (7: U+09F9, U+246F, U+2483, U+2497, U+24F0, U+16E90 ...)
T \p{Numeric_Value: 17} (Short: \p{Nv=17}) (7: U+16EE, U+2470, U+2484, U+2498, U+24F1, U+16E91 ...)
T \p{Numeric_Value: 18} (Short: \p{Nv=18}) (7: U+16EF, U+2471, U+2485, U+2499, U+24F2, U+16E92 ...)
T \p{Numeric_Value: 19} (Short: \p{Nv=19}) (7: U+16F0, U+2472, U+2486, U+249A, U+24F3, U+16E93 ...)
T \p{Numeric_Value: 20} (Short: \p{Nv=20}) (36: U+1373, U+2473, U+2487, U+249B, U+24F4, U+3039 ...)
T \p{Numeric_Value: 21} (Short: \p{Nv=21}) (1: U+3251)
T \p{Numeric_Value: 22} (Short: \p{Nv=22}) (1: U+3252)
T \p{Numeric_Value: 23} (Short: \p{Nv=23}) (1: U+3253)
T \p{Numeric_Value: 24} (Short: \p{Nv=24}) (1: U+3254)
T \p{Numeric_Value: 25} (Short: \p{Nv=25}) (1: U+3255)
T \p{Numeric_Value: 26} (Short: \p{Nv=26}) (1: U+3256)
T \p{Numeric_Value: 27} (Short: \p{Nv=27}) (1: U+3257)
T \p{Numeric_Value: 28} (Short: \p{Nv=28}) (1: U+3258)
T \p{Numeric_Value: 29} (Short: \p{Nv=29}) (1: U+3259)
T \p{Numeric_Value: 30} (Short: \p{Nv=30}) (19: U+1374, U+303A, U+324A, U+325A, U+5345, U+10112 ...)
T \p{Numeric_Value: 31} (Short: \p{Nv=31}) (1: U+325B)
T \p{Numeric_Value: 32} (Short: \p{Nv=32}) (1: U+325C)
T \p{Numeric_Value: 33} (Short: \p{Nv=33}) (1: U+325D)
T \p{Numeric_Value: 34} (Short: \p{Nv=34}) (1: U+325E)
T \p{Numeric_Value: 35} (Short: \p{Nv=35}) (1: U+325F)
T \p{Numeric_Value: 36} (Short: \p{Nv=36}) (1: U+32B1)
T \p{Numeric_Value: 37} (Short: \p{Nv=37}) (1: U+32B2)
T \p{Numeric_Value: 38} (Short: \p{Nv=38}) (1: U+32B3)
T \p{Numeric_Value: 39} (Short: \p{Nv=39}) (1: U+32B4)
T \p{Numeric_Value: 40} (Short: \p{Nv=40}) (18: U+1375, U+324B, U+32B5, U+534C, U+10113, U+102ED ...)
T \p{Numeric_Value: 41} (Short: \p{Nv=41}) (1: U+32B6)
T \p{Numeric_Value: 42} (Short: \p{Nv=42}) (1: U+32B7)
T \p{Numeric_Value: 43} (Short: \p{Nv=43}) (1: U+32B8)
T \p{Numeric_Value: 44} (Short: \p{Nv=44}) (1: U+32B9)
T \p{Numeric_Value: 45} (Short: \p{Nv=45}) (1: U+32BA)
T \p{Numeric_Value: 46} (Short: \p{Nv=46}) (1: U+32BB)
T \p{Numeric_Value: 47} (Short: \p{Nv=47}) (1: U+32BC)
T \p{Numeric_Value: 48} (Short: \p{Nv=48}) (1: U+32BD)
T \p{Numeric_Value: 49} (Short: \p{Nv=49}) (1: U+32BE)
T \p{Numeric_Value: 50} (Short: \p{Nv=50}) (29: U+1376, U+216C, U+217C, U+2186, U+324C, U+32BF ...)
T \p{Numeric_Value: 60} (Short: \p{Nv=60}) (13: U+1377, U+324D, U+10115, U+102EF, U+109CE, U+10E6E ...)
T \p{Numeric_Value: 70} (Short: \p{Nv=70}) (13: U+1378, U+324E, U+10116, U+102F0, U+109CF, U+10E6F ...)
T \p{Numeric_Value: 80} (Short: \p{Nv=80}) (12: U+1379, U+324F, U+10117, U+102F1, U+10E70, U+11062 ...)
T \p{Numeric_Value: 90} (Short: \p{Nv=90}) (12: U+137A, U+10118, U+102F2, U+10341, U+10E71, U+11063 ...)
T \p{Numeric_Value: 100} (Short: \p{Nv=100}) (35: U+0BF1, U+0D71, U+137B, U+216D, U+217D, U+4F70 ...)
T \p{Numeric_Value: 200} (Short: \p{Nv=200}) (6: U+1011A, U+102F4, U+109D3, U+10E73, U+1EC84, U+1ED14)
T \p{Numeric_Value: 300} (Short: \p{Nv=300}) (7: U+1011B, U+1016B, U+102F5, U+109D4, U+10E74, U+1EC85 ...)
T \p{Numeric_Value: 400} (Short: \p{Nv=400}) (7: U+1011C, U+102F6, U+109D5, U+10E75, U+1EC86, U+1ED16 ...)
T \p{Numeric_Value: 500} (Short: \p{Nv=500}) (16: U+216E, U+217E, U+1011D, U+10145, U+1014C, U+10153 ...)
T \p{Numeric_Value: 600} (Short: \p{Nv=600}) (7: U+1011E, U+102F8, U+109D7, U+10E77, U+1EC88, U+1ED18 ...)
T \p{Numeric_Value: 700} (Short: \p{Nv=700}) (6: U+1011F, U+102F9, U+109D8, U+10E78, U+1EC89, U+1ED19)
T \p{Numeric_Value: 800} (Short: \p{Nv=800}) (6: U+10120, U+102FA, U+109D9, U+10E79, U+1EC8A, U+1ED1A)
T \p{Numeric_Value: 900} (Short: \p{Nv=900}) (7: U+10121, U+102FB, U+1034A, U+109DA, U+10E7A, U+1EC8B ...)
T \p{Numeric_Value: 1000} (Short: \p{Nv=1000}) (22: U+0BF2, U+0D72, U+216F, U+217F..2180, U+4EDF, U+5343 ...)
T \p{Numeric_Value: 2000} (Short: \p{Nv=2000}) (5: U+10123, U+109DC, U+1EC8D, U+1ED1D, U+1ED3A)
T \p{Numeric_Value: 3000} (Short: \p{Nv=3000}) (4: U+10124, U+109DD, U+1EC8E, U+1ED1E)
T \p{Numeric_Value: 4000} (Short: \p{Nv=4000}) (4: U+10125, U+109DE, U+1EC8F, U+1ED1F)
T \p{Numeric_Value: 5000} (Short: \p{Nv=5000}) (8: U+2181, U+10126, U+10146, U+1014E, U+10172, U+109DF ...)
T \p{Numeric_Value: 6000} (Short: \p{Nv=6000}) (4: U+10127, U+109E0, U+1EC91, U+1ED21)
T \p{Numeric_Value: 7000} (Short: \p{Nv=7000}) (4: U+10128, U+109E1, U+1EC92, U+1ED22)
T \p{Numeric_Value: 8000} (Short: \p{Nv=8000}) (4: U+10129, U+109E2, U+1EC93, U+1ED23)
T \p{Numeric_Value: 9000} (Short: \p{Nv=9000}) (4: U+1012A, U+109E3, U+1EC94, U+1ED24)
T \p{Numeric_Value: 10000} (= 1.0e+04) (Short: \p{Nv=10000}) (13: U+137C, U+2182, U+4E07, U+842C, U+1012B, U+10155 ...)
T \p{Numeric_Value: 20000} (= 2.0e+04) (Short: \p{Nv=20000}) (4: U+1012C, U+109E5, U+1EC96, U+1ED26)
T \p{Numeric_Value: 30000} (= 3.0e+04) (Short: \p{Nv=30000}) (4: U+1012D, U+109E6, U+1EC97, U+1ED27)
T \p{Numeric_Value: 40000} (= 4.0e+04) (Short: \p{Nv=40000}) (4: U+1012E, U+109E7, U+1EC98, U+1ED28)
T \p{Numeric_Value: 50000} (= 5.0e+04) (Short: \p{Nv=50000}) (7: U+2187, U+1012F, U+10147, U+10156, U+109E8, U+1EC99 ...)
T \p{Numeric_Value: 60000} (= 6.0e+04) (Short: \p{Nv=60000}) (4: U+10130, U+109E9, U+1EC9A, U+1ED2A)
T \p{Numeric_Value: 70000} (= 7.0e+04) (Short: \p{Nv=70000}) (4: U+10131, U+109EA, U+1EC9B, U+1ED2B)
T \p{Numeric_Value: 80000} (= 8.0e+04) (Short: \p{Nv=80000}) (4: U+10132, U+109EB, U+1EC9C, U+1ED2C)
T \p{Numeric_Value: 90000} (= 9.0e+04) (Short: \p{Nv=90000}) (4: U+10133, U+109EC, U+1EC9D, U+1ED2D)
T \p{Numeric_Value: 100000} (= 1.0e+05) (Short: \p{Nv=100000}) (5: U+2188, U+109ED, U+1EC9E, U+1ECA0, U+1ECB4)
T \p{Numeric_Value: 200000} (= 2.0e+05) (Short: \p{Nv=200000}) (2: U+109EE, U+1EC9F)
T \p{Numeric_Value: 216000} (= 2.2e+05) (Short: \p{Nv=216000}) (1: U+12432)
T \p{Numeric_Value: 300000} (= 3.0e+05) (Short: \p{Nv=300000}) (1: U+109EF)
T \p{Numeric_Value: 400000} (= 4.0e+05) (Short: \p{Nv=400000}) (1: U+109F0)
T \p{Numeric_Value: 432000} (= 4.3e+05) (Short: \p{Nv=432000}) (1: U+12433)
T \p{Numeric_Value: 500000} (= 5.0e+05) (Short: \p{Nv=500000}) (1: U+109F1)
T \p{Numeric_Value: 600000} (= 6.0e+05) (Short: \p{Nv=600000}) (1: U+109F2)
T \p{Numeric_Value: 700000} (= 7.0e+05) (Short: \p{Nv=700000}) (1: U+109F3)
T \p{Numeric_Value: 800000} (= 8.0e+05) (Short: \p{Nv=800000}) (1: U+109F4)
T \p{Numeric_Value: 900000} (= 9.0e+05) (Short: \p{Nv=900000}) (1: U+109F5)
T \p{Numeric_Value: 1000000} (= 1.0e+06) (Short: \p{Nv=1000000}) (1: U+16B5E)
T \p{Numeric_Value: 10000000} (= 1.0e+07) (Short: \p{Nv=10000000}) (1: U+1ECA1)
T \p{Numeric_Value: 20000000} (= 2.0e+07) (Short: \p{Nv=20000000}) (1: U+1ECA2)
T \p{Numeric_Value: 100000000} (= 1.0e+08) (Short: \p{Nv=100000000}) (3: U+4EBF, U+5104, U+16B5F)
T \p{Numeric_Value: 10000000000} (= 1.0e+10) (Short: \p{Nv=10000000000}) (1: U+16B60)
T \p{Numeric_Value: 1000000000000} (= 1.0e+12) (Short: \p{Nv=1000000000000}) (2: U+5146, U+16B61)
\p{Numeric_Value: NaN} (Short: \p{Nv=NaN}) (1_112_250 plus all above-Unicode code points: [\x00-\x20!\"#\$\%&\'\(\)*+,\-.\/:;<=>?\@A-Z\[\\\]\^_`a-z\{\|\}~\x7f-\xb1\xb4-\xb8\xba-\xbb\xbf-\xff], U+0100..065F, U+066A..06EF, U+06FA..07BF, U+07CA..0965, U+0970..09E5 ...)
\p{Nushu} \p{Script_Extensions=Nushu} (Short: \p{Nshu}; NOT \p{Block=Nushu}) (397)
\p{Nv: *} \p{Numeric_Value: *}
\p{Nyiakeng_Puachue_Hmong} \p{Script_Extensions=Nyiakeng_Puachue_Hmong} (Short: \p{Hmnp}; NOT \p{Block=Nyiakeng_Puachue_Hmong}) (71)
X \p{OCR} \p{Optical_Character_Recognition} (= \p{Block=Optical_Character_Recognition}) (32)
\p{Ogam} \p{Ogham} (= \p{Script_Extensions=Ogham}) (NOT \p{Block=Ogham}) (29)
\p{Ogham} \p{Script_Extensions=Ogham} (Short: \p{Ogam}; NOT \p{Block=Ogham}) (29)
\p{Ol_Chiki} \p{Script_Extensions=Ol_Chiki} (Short: \p{Olck}) (48)
\p{Olck} \p{Ol_Chiki} (= \p{Script_Extensions=Ol_Chiki}) (48)
\p{Old_Hungarian} \p{Script_Extensions=Old_Hungarian} (Short: \p{Hung}; NOT \p{Block=Old_Hungarian}) (108)
\p{Old_Italic} \p{Script_Extensions=Old_Italic} (Short: \p{Ital}; NOT \p{Block=Old_Italic}) (39)
\p{Old_North_Arabian} \p{Script_Extensions=Old_North_Arabian} (Short: \p{Narb}) (32)
\p{Old_Permic} \p{Script_Extensions=Old_Permic} (Short: \p{Perm}; NOT \p{Block=Old_Permic}) (44)
\p{Old_Persian} \p{Script_Extensions=Old_Persian} (Short: \p{Xpeo}; NOT \p{Block=Old_Persian}) (50)
\p{Old_Sogdian} \p{Script_Extensions=Old_Sogdian} (Short: \p{Sogo}; NOT \p{Block=Old_Sogdian}) (40)
\p{Old_South_Arabian} \p{Script_Extensions=Old_South_Arabian} (Short: \p{Sarb}) (32)
\p{Old_Turkic} \p{Script_Extensions=Old_Turkic} (Short: \p{Orkh}; NOT \p{Block=Old_Turkic}) (73)
\p{Open_Punctuation} \p{General_Category=Open_Punctuation} (Short: \p{Ps}) (75)
X \p{Optical_Character_Recognition} \p{Block=Optical_Character_Recognition} (Short: \p{InOCR}) (32)
\p{Oriya} \p{Script_Extensions=Oriya} (Short: \p{Orya}; NOT \p{Block=Oriya}) (97)
\p{Orkh} \p{Old_Turkic} (= \p{Script_Extensions=Old_Turkic}) (NOT \p{Block=Old_Turkic}) (73)
X \p{Ornamental_Dingbats} \p{Block=Ornamental_Dingbats} (48)
\p{Orya} \p{Oriya} (= \p{Script_Extensions=Oriya}) (NOT \p{Block=Oriya}) (97)
\p{Osage} \p{Script_Extensions=Osage} (Short: \p{Osge}; NOT \p{Block=Osage}) (72)
\p{Osge} \p{Osage} (= \p{Script_Extensions=Osage}) (NOT \p{Block=Osage}) (72)
\p{Osma} \p{Osmanya} (= \p{Script_Extensions=Osmanya}) (NOT \p{Block=Osmanya}) (40)
\p{Osmanya} \p{Script_Extensions=Osmanya} (Short: \p{Osma}; NOT \p{Block=Osmanya}) (40)
\p{Other} \p{General_Category=Other} (Short: \p{C}) (970_414 plus all above-Unicode code points)
\p{Other_Letter} \p{General_Category=Other_Letter} (Short: \p{Lo}) (127_004)
\p{Other_Number} \p{General_Category=Other_Number} (Short: \p{No}) (895)
\p{Other_Punctuation} \p{General_Category=Other_Punctuation} (Short: \p{Po}) (593)
\p{Other_Symbol} \p{General_Category=Other_Symbol} (Short: \p{So}) (6431)
X \p{Ottoman_Siyaq_Numbers} \p{Block=Ottoman_Siyaq_Numbers} (80)
\p{P} \pP \p{Punct} (= \p{General_Category=Punctuation}) (NOT \p{General_Punctuation}) (798)
\p{Pahawh_Hmong} \p{Script_Extensions=Pahawh_Hmong} (Short: \p{Hmng}; NOT \p{Block=Pahawh_Hmong}) (127)
\p{Palm} \p{Palmyrene} (= \p{Script_Extensions=Palmyrene}) (32)
\p{Palmyrene} \p{Script_Extensions=Palmyrene} (Short: \p{Palm}) (32)
\p{Paragraph_Separator} \p{General_Category=Paragraph_Separator} (Short: \p{Zp}) (1)
\p{Pat_Syn} \p{Pattern_Syntax} (= \p{Pattern_Syntax=Y}) (2760)
\p{Pat_Syn: *} \p{Pattern_Syntax: *}
\p{Pat_WS} \p{Pattern_White_Space} (= \p{Pattern_White_Space=Y}) (11)
\p{Pat_WS: *} \p{Pattern_White_Space: *}
\p{Pattern_Syntax} \p{Pattern_Syntax=Y} (Short: \p{PatSyn}) (2760)
\p{Pattern_Syntax: N*} (Short: \p{PatSyn=N}, \P{PatSyn}) (1_111_352 plus all above-Unicode code points: [\x00-\x200-9A-Z_a-z\x7f-\xa0\xa8\xaa\xad\xaf\xb2-\xb5\xb7-\xba\xbc-\xbe\xc0-\xd6\xd8-\xf6\xf8-\xff], U+0100..200F, U+2028..202F, U+203F..2040, U+2054, U+205F..218F ...)
\p{Pattern_Syntax: Y*} (Short: \p{PatSyn=Y}, \p{PatSyn}) (2760: [!\"#\$\%&\'\(\)*+,\-.\/:;<=>?\@\[\\\]\^`\{\|\}~\xa1-\xa7\xa9\xab-\xac\xae\xb0-\xb1\xb6\xbb\xbf\xd7\xf7], U+2010..2027, U+2030..203E, U+2041..2053, U+2055..205E, U+2190..245F ...)
\p{Pattern_White_Space} \p{Pattern_White_Space=Y} (Short: \p{PatWS}) (11)
\p{Pattern_White_Space: N*} (Short: \p{PatWS=N}, \P{PatWS}) (1_114_101 plus all above-Unicode code points: [^\t\n\cK\f\r\x20\x85], U+0100..200D, U+2010..2027, U+202A..infinity)
\p{Pattern_White_Space: Y*} (Short: \p{PatWS=Y}, \p{PatWS}) (11: [\t\n\cK\f\r\x20\x85], U+200E..200F, U+2028..2029)
\p{Pau_Cin_Hau} \p{Script_Extensions=Pau_Cin_Hau} (Short: \p{Pauc}; NOT \p{Block=Pau_Cin_Hau}) (57)
\p{Pauc} \p{Pau_Cin_Hau} (= \p{Script_Extensions=Pau_Cin_Hau}) (NOT \p{Block=Pau_Cin_Hau}) (57)
\p{Pc} \p{Connector_Punctuation} (= \p{General_Category=Connector_Punctuation}) (10)
\p{PCM} \p{Prepended_Concatenation_Mark} (= \p{Prepended_Concatenation_Mark=Y}) (11)
\p{PCM: *} \p{Prepended_Concatenation_Mark: *}
\p{Pd} \p{Dash_Punctuation} (= \p{General_Category=Dash_Punctuation}) (25)
\p{Pe} \p{Close_Punctuation} (= \p{General_Category=Close_Punctuation}) (73)
\p{PerlSpace} \p{PosixSpace} (6)
\p{PerlWord} \p{PosixWord} (63)
\p{Perm} \p{Old_Permic} (= \p{Script_Extensions=Old_Permic}) (NOT \p{Block=Old_Permic}) (44)
\p{Pf} \p{Final_Punctuation} (= \p{General_Category=Final_Punctuation}) (10)
\p{Phag} \p{Phags_Pa} (= \p{Script_Extensions=Phags_Pa}) (NOT \p{Block=Phags_Pa}) (59)
\p{Phags_Pa} \p{Script_Extensions=Phags_Pa} (Short: \p{Phag}; NOT \p{Block=Phags_Pa}) (59)
X \p{Phaistos} \p{Phaistos_Disc} (= \p{Block=Phaistos_Disc}) (48)
X \p{Phaistos_Disc} \p{Block=Phaistos_Disc} (Short: \p{InPhaistos}) (48)
\p{Phli} \p{Inscriptional_Pahlavi} (= \p{Script_Extensions=Inscriptional_Pahlavi}) (NOT \p{Block=Inscriptional_Pahlavi}) (27)
\p{Phlp} \p{Psalter_Pahlavi} (= \p{Script_Extensions=Psalter_Pahlavi}) (NOT \p{Block=Psalter_Pahlavi}) (30)
\p{Phnx} \p{Phoenician} (= \p{Script_Extensions=Phoenician}) (NOT \p{Block=Phoenician}) (29)
\p{Phoenician} \p{Script_Extensions=Phoenician} (Short: \p{Phnx}; NOT \p{Block=Phoenician}) (29)
X \p{Phonetic_Ext} \p{Phonetic_Extensions} (= \p{Block=Phonetic_Extensions}) (128)
X \p{Phonetic_Ext_Sup} \p{Phonetic_Extensions_Supplement} (= \p{Block=Phonetic_Extensions_Supplement}) (64)
X \p{Phonetic_Extensions} \p{Block=Phonetic_Extensions} (Short: \p{InPhoneticExt}) (128)
X \p{Phonetic_Extensions_Supplement} \p{Block=Phonetic_Extensions_Supplement} (Short: \p{InPhoneticExtSup}) (64)
\p{Pi} \p{Initial_Punctuation} (= \p{General_Category=Initial_Punctuation}) (12)
X \p{Playing_Cards} \p{Block=Playing_Cards} (96)
\p{Plrd} \p{Miao} (= \p{Script_Extensions=Miao}) (NOT \p{Block=Miao}) (149)
\p{Po} \p{Other_Punctuation} (= \p{General_Category=Other_Punctuation}) (593)
\p{PosixAlnum} (62: [0-9A-Za-z])
\p{PosixAlpha} (52: [A-Za-z])
\p{PosixBlank} (2: [\t\x20])
\p{PosixCntrl} ASCII control characters (33: ACK, BEL, BS, CAN, CR, DC1, DC2, DC3, DC4, DEL, DLE, ENQ, EOM, EOT, ESC, ETB, ETX, FF, FS, GS, HT, LF, NAK, NUL, RS, SI, SO, SOH, STX, SUB, SYN, US, VT)
\p{PosixDigit} (10: [0-9])
\p{PosixGraph} (94: [!\"#\$\%&\'\(\)*+,\-.\/0-9:;<=>?\@A-Z\[\\\]\^_`a-z\{\|\}~])
\p{PosixLower} (/i= PosixAlpha) (26: [a-z])
\p{PosixPrint} (95: [\x20-\x7e])
\p{PosixPunct} (32: [!\"#\$\%&\'\(\)*+,\-.\/:;<=>?\@\[\\\]\^_`\{\|\}~])
\p{PosixSpace} (Short: \p{PerlSpace}) (6: [\t\n\cK\f\r\x20])
\p{PosixUpper} (/i= PosixAlpha) (26: [A-Z])
\p{PosixWord} \w, restricted to ASCII (Short: \p{PerlWord}) (63: [0-9A-Z_a-z])
\p{PosixXDigit} \p{ASCII_Hex_Digit=Y} (Short: \p{AHex}) (22)
\p{Prepended_Concatenation_Mark} \p{Prepended_Concatenation_Mark=Y} (Short: \p{PCM}) (11)
\p{Prepended_Concatenation_Mark: N*} (Short: \p{PCM=N}, \P{PCM}) (1_114_101 plus all above-Unicode code points: U+0000..05FF, U+0606..06DC, U+06DE..070E, U+0710..08E1, U+08E3..110BC, U+110BE..110CC ...)
\p{Prepended_Concatenation_Mark: Y*} (Short: \p{PCM=Y}, \p{PCM}) (11: U+0600..0605, U+06DD, U+070F, U+08E2, U+110BD, U+110CD)
T \p{Present_In: 1.1} \p{Age=V1_1} (Short: \p{In=1.1}) (Perl extension) (33_979)
T \p{Present_In: 2.0} Code point's usage introduced in version 2.0 or earlier (Short: \p{In=2.0}) (Perl extension) (178_500: U+0000..01F5, U+01FA..0217, U+0250..02A8, U+02B0..02DE, U+02E0..02E9, U+0300..0345 ...)
\p{Present_In: V2_0} \p{Present_In=2.0} (Perl extension) (178_500)
T \p{Present_In: 2.1} Code point's usage introduced in version 2.1 or earlier (Short: \p{In=2.1}) (Perl extension) (178_502: U+0000..01F5, U+01FA..0217, U+0250..02A8, U+02B0..02DE, U+02E0..02E9, U+0300..0345 ...)
\p{Present_In: V2_1} \p{Present_In=2.1} (Perl extension) (178_502)
T \p{Present_In: 3.0} Code point's usage introduced in version 3.0 or earlier (Short: \p{In=3.0}) (Perl extension) (188_809: U+0000..021F, U+0222..0233, U+0250..02AD, U+02B0..02EE, U+0300..034E, U+0360..0362 ...)
\p{Present_In: V3_0} \p{Present_In=3.0} (Perl extension) (188_809)
T \p{Present_In: 3.1} Code point's usage introduced in version 3.1 or earlier (Short: \p{In=3.1}) (Perl extension) (233_787: U+0000..021F, U+0222..0233, U+0250..02AD, U+02B0..02EE, U+0300..034E, U+0360..0362 ...)
\p{Present_In: V3_1} \p{Present_In=3.1} (Perl extension) (233_787)
T \p{Present_In: 3.2} Code point's usage introduced in version 3.2 or earlier (Short: \p{In=3.2}) (Perl extension) (234_803: U+0000..0220, U+0222..0233, U+0250..02AD, U+02B0..02EE, U+0300..034F, U+0360..036F ...)
\p{Present_In: V3_2} \p{Present_In=3.2} (Perl extension) (234_803)
T \p{Present_In: 4.0} Code point's usage introduced in version 4.0 or earlier (Short: \p{In=4.0}) (Perl extension) (236_029: U+0000..0236, U+0250..0357, U+035D..036F, U+0374..0375, U+037A, U+037E ...)
\p{Present_In: V4_0} \p{Present_In=4.0} (Perl extension) (236_029)
T \p{Present_In: 4.1} Code point's usage introduced in version 4.1 or earlier (Short: \p{In=4.1}) (Perl extension) (237_302: U+0000..0241, U+0250..036F, U+0374..0375, U+037A, U+037E, U+0384..038A ...)
\p{Present_In: V4_1} \p{Present_In=4.1} (Perl extension) (237_302)
T \p{Present_In: 5.0} Code point's usage introduced in version 5.0 or earlier (Short: \p{In=5.0}) (Perl extension) (238_671: U+0000..036F, U+0374..0375, U+037A..037E, U+0384..038A, U+038C, U+038E..03A1 ...)
\p{Present_In: V5_0} \p{Present_In=5.0} (Perl extension) (238_671)
T \p{Present_In: 5.1} Code point's usage introduced in version 5.1 or earlier (Short: \p{In=5.1}) (Perl extension) (240_295: U+0000..0377, U+037A..037E, U+0384..038A, U+038C, U+038E..03A1, U+03A3..0523 ...)
\p{Present_In: V5_1} \p{Present_In=5.1} (Perl extension) (240_295)
T \p{Present_In: 5.2} Code point's usage introduced in version 5.2 or earlier (Short: \p{In=5.2}) (Perl extension) (246_943: U+0000..0377, U+037A..037E, U+0384..038A, U+038C, U+038E..03A1, U+03A3..0525 ...)
\p{Present_In: V5_2} \p{Present_In=5.2} (Perl extension) (246_943)
T \p{Present_In: 6.0} Code point's usage introduced in version 6.0 or earlier (Short: \p{In=6.0}) (Perl extension) (249_031: U+0000..0377, U+037A..037E, U+0384..038A, U+038C, U+038E..03A1, U+03A3..0527 ...)
\p{Present_In: V6_0} \p{Present_In=6.0} (Perl extension) (249_031)
T \p{Present_In: 6.1} Code point's usage introduced in version 6.1 or earlier (Short: \p{In=6.1}) (Perl extension) (249_763: U+0000..0377, U+037A..037E, U+0384..038A, U+038C, U+038E..03A1, U+03A3..0527 ...)
\p{Present_In: V6_1} \p{Present_In=6.1} (Perl extension) (249_763)
T \p{Present_In: 6.2} Code point's usage introduced in version 6.2 or earlier (Short: \p{In=6.2}) (Perl extension) (249_764: U+0000..0377, U+037A..037E, U+0384..038A, U+038C, U+038E..03A1, U+03A3..0527 ...)
\p{Present_In: V6_2} \p{Present_In=6.2} (Perl extension) (249_764)
T \p{Present_In: 6.3} Code point's usage introduced in version 6.3 or earlier (Short: \p{In=6.3}) (Perl extension) (249_769: U+0000..0377, U+037A..037E, U+0384..038A, U+038C, U+038E..03A1, U+03A3..0527 ...)
\p{Present_In: V6_3} \p{Present_In=6.3} (Perl extension) (249_769)
T \p{Present_In: 7.0} Code point's usage introduced in version 7.0 or earlier (Short: \p{In=7.0}) (Perl extension) (252_603: U+0000..0377, U+037A..037F, U+0384..038A, U+038C, U+038E..03A1, U+03A3..052F ...)
\p{Present_In: V7_0} \p{Present_In=7.0} (Perl extension) (252_603)
T \p{Present_In: 8.0} Code point's usage introduced in version 8.0 or earlier (Short: \p{In=8.0}) (Perl extension) (260_319: U+0000..0377, U+037A..037F, U+0384..038A, U+038C, U+038E..03A1, U+03A3..052F ...)
\p{Present_In: V8_0} \p{Present_In=8.0} (Perl extension) (260_319)
T \p{Present_In: 9.0} Code point's usage introduced in version 9.0 or earlier (Short: \p{In=9.0}) (Perl extension) (267_819: U+0000..0377, U+037A..037F, U+0384..038A, U+038C, U+038E..03A1, U+03A3..052F ...)
\p{Present_In: V9_0} \p{Present_In=9.0} (Perl extension) (267_819)
T \p{Present_In: 10.0} Code point's usage introduced in version 10.0 or earlier (Short: \p{In=10.0}) (Perl extension) (276_337: U+0000..0377, U+037A..037F, U+0384..038A, U+038C, U+038E..03A1, U+03A3..052F ...)
\p{Present_In: V10_0} \p{Present_In=10.0} (Perl extension) (276_337)
T \p{Present_In: 11.0} Code point's usage introduced in version 11.0 or earlier (Short: \p{In=11.0}) (Perl extension) (277_021: U+0000..0377, U+037A..037F, U+0384..038A, U+038C, U+038E..03A1, U+03A3..052F ...)
\p{Present_In: V11_0} \p{Present_In=11.0} (Perl extension) (277_021)
T \p{Present_In: 12.0} Code point's usage introduced in version 12.0 or earlier (Short: \p{In=12.0}) (Perl extension) (277_575: U+0000..0377, U+037A..037F, U+0384..038A, U+038C, U+038E..03A1, U+03A3..052F ...)
\p{Present_In: V12_0} \p{Present_In=12.0} (Perl extension) (277_575)
T \p{Present_In: 12.1} Code point's usage introduced in version 12.1 or earlier (Short: \p{In=12.1}) (Perl extension) (277_576: U+0000..0377, U+037A..037F, U+0384..038A, U+038C, U+038E..03A1, U+03A3..052F ...)
\p{Present_In: V12_1} \p{Present_In=12.1} (Perl extension) (277_576)
T \p{Present_In: 13.0} Code point's usage introduced in version 13.0 or earlier (Short: \p{In=13.0}) (Perl extension) (283_506: U+0000..0377, U+037A..037F, U+0384..038A, U+038C, U+038E..03A1, U+03A3..052F ...)
\p{Present_In: V13_0} \p{Present_In=13.0} (Perl extension) (283_506)
\p{Present_In: Unassigned} \p{Age=Unassigned} (Short: \p{In=Unassigned}) (Perl extension) (830_606 plus all above-Unicode code points)
\p{Print} \p{XPosixPrint} (281_325)
\p{Private_Use} \p{General_Category=Private_Use} (Short: \p{Co}; NOT \p{Private_Use_Area}) (137_468)
X \p{Private_Use_Area} \p{Block=Private_Use_Area} (Short: \p{InPUA}) (6400)
\p{Prti} \p{Inscriptional_Parthian} (= \p{Script_Extensions=Inscriptional_Parthian}) (NOT \p{Block=Inscriptional_Parthian}) (30)
\p{Ps} \p{Open_Punctuation} (= \p{General_Category=Open_Punctuation}) (75)
\p{Psalter_Pahlavi} \p{Script_Extensions=Psalter_Pahlavi} (Short: \p{Phlp}; NOT \p{Block=Psalter_Pahlavi}) (30)
X \p{PUA} \p{Private_Use_Area} (= \p{Block=Private_Use_Area}) (6400)
\p{Punct} \p{General_Category=Punctuation} (Short: \p{P}; NOT \p{General_Punctuation}) (798)
\p{Punctuation} \p{Punct} (= \p{General_Category=Punctuation}) (NOT \p{General_Punctuation}) (798)
\p{Qaac} \p{Coptic} (= \p{Script_Extensions=Coptic}) (NOT \p{Block=Coptic}) (165)
\p{Qaai} \p{Inherited} (= \p{Script_Extensions=Inherited}) (503)
\p{QMark} \p{Quotation_Mark} (= \p{Quotation_Mark=Y}) (30)
\p{QMark: *} \p{Quotation_Mark: *}
\p{Quotation_Mark} \p{Quotation_Mark=Y} (Short: \p{QMark}) (30)
\p{Quotation_Mark: N*} (Short: \p{QMark=N}, \P{QMark}) (1_114_082 plus all above-Unicode code points: [\x00-\x20!#\$\%&\(\)*+,\-.\/0-9:;<=>?\@A-Z\[\\\]\^_`a-z\{\|\}~\x7f-\xaa\xac-\xba\xbc-\xff], U+0100..2017, U+2020..2038, U+203B..2E41, U+2E43..300B, U+3010..301C ...)
\p{Quotation_Mark: Y*} (Short: \p{QMark=Y}, \p{QMark}) (30: [\"\'\xab\xbb], U+2018..201F, U+2039..203A, U+2E42, U+300C..300F, U+301D..301F ...)
\p{Radical} \p{Radical=Y} (329)
\p{Radical: N*} (Single: \P{Radical}) (1_113_783 plus all above-Unicode code points: U+0000..2E7F, U+2E9A, U+2EF4..2EFF, U+2FD6..infinity)
\p{Radical: Y*} (Single: \p{Radical}) (329: U+2E80..2E99, U+2E9B..2EF3, U+2F00..2FD5)
\p{Regional_Indicator} \p{Regional_Indicator=Y} (Short: \p{RI}) (26)
\p{Regional_Indicator: N*} (Short: \p{RI=N}, \P{RI}) (1_114_086 plus all above-Unicode code points: U+0000..1F1E5, U+1F200..infinity)
\p{Regional_Indicator: Y*} (Short: \p{RI=Y}, \p{RI}) (26: U+1F1E6..1F1FF)
\p{Rejang} \p{Script_Extensions=Rejang} (Short: \p{Rjng}; NOT \p{Block=Rejang}) (37)
\p{RI} \p{Regional_Indicator} (= \p{Regional_Indicator=Y}) (26)
\p{RI: *} \p{Regional_Indicator: *}
\p{Rjng} \p{Rejang} (= \p{Script_Extensions=Rejang}) (NOT \p{Block=Rejang}) (37)
\p{Rohg} \p{Hanifi_Rohingya} (= \p{Script_Extensions=Hanifi_Rohingya}) (NOT \p{Block=Hanifi_Rohingya}) (55)
X \p{Rumi} \p{Rumi_Numeral_Symbols} (= \p{Block=Rumi_Numeral_Symbols}) (32)
X \p{Rumi_Numeral_Symbols} \p{Block=Rumi_Numeral_Symbols} (Short: \p{InRumi}) (32)
\p{Runic} \p{Script_Extensions=Runic} (Short: \p{Runr}; NOT \p{Block=Runic}) (86)
\p{Runr} \p{Runic} (= \p{Script_Extensions=Runic}) (NOT \p{Block=Runic}) (86)
\p{S} \pS \p{Symbol} (= \p{General_Category=Symbol}) (7564)
\p{Samaritan} \p{Script_Extensions=Samaritan} (Short: \p{Samr}; NOT \p{Block=Samaritan}) (61)
\p{Samr} \p{Samaritan} (= \p{Script_Extensions=Samaritan}) (NOT \p{Block=Samaritan}) (61)
\p{Sarb} \p{Old_South_Arabian} (= \p{Script_Extensions=Old_South_Arabian}) (32)
\p{Saur} \p{Saurashtra} (= \p{Script_Extensions=Saurashtra}) (NOT \p{Block=Saurashtra}) (82)
\p{Saurashtra} \p{Script_Extensions=Saurashtra} (Short: \p{Saur}; NOT \p{Block=Saurashtra}) (82)
\p{SB: *} \p{Sentence_Break: *}
\p{Sc} \p{Currency_Symbol} (= \p{General_Category=Currency_Symbol}) (62)
\p{Sc: *} \p{Script: *}
\p{Script: Adlam} (Short: \p{Sc=Adlm}) (88: U+1E900..1E94B, U+1E950..1E959, U+1E95E..1E95F)
\p{Script: Adlm} \p{Script=Adlam} (88)
\p{Script: Aghb} \p{Script=Caucasian_Albanian} (= \p{Script_Extensions=Caucasian_Albanian}) (53)
\p{Script: Ahom} \p{Script_Extensions=Ahom} (Short: \p{Sc=Ahom}, \p{Ahom}) (58)
\p{Script: Anatolian_Hieroglyphs} \p{Script_Extensions=Anatolian_Hieroglyphs} (Short: \p{Sc=Hluw}, \p{Hluw}) (583)
\p{Script: Arab} \p{Script=Arabic} (1291)
\p{Script: Arabic} (Short: \p{Sc=Arab}) (1291: U+0600..0604, U+0606..060B, U+060D..061A, U+061C, U+061E, U+0620..063F ...)
\p{Script: Armenian} \p{Script_Extensions=Armenian} (Short: \p{Sc=Armn}, \p{Armn}) (96)
\p{Script: Armi} \p{Script=Imperial_Aramaic} (= \p{Script_Extensions=Imperial_Aramaic}) (31)
\p{Script: Armn} \p{Script=Armenian} (= \p{Script_Extensions=Armenian}) (96)
\p{Script: Avestan} \p{Script_Extensions=Avestan} (Short: \p{Sc=Avst}, \p{Avst}) (61)
\p{Script: Avst} \p{Script=Avestan} (= \p{Script_Extensions=Avestan}) (61)
\p{Script: Bali} \p{Script=Balinese} (= \p{Script_Extensions=Balinese}) (121)
\p{Script: Balinese} \p{Script_Extensions=Balinese} (Short: \p{Sc=Bali}, \p{Bali}) (121)
\p{Script: Bamu} \p{Script=Bamum} (= \p{Script_Extensions=Bamum}) (657)
\p{Script: Bamum} \p{Script_Extensions=Bamum} (Short: \p{Sc=Bamu}, \p{Bamu}) (657)
\p{Script: Bass} \p{Script=Bassa_Vah} (= \p{Script_Extensions=Bassa_Vah}) (36)
\p{Script: Bassa_Vah} \p{Script_Extensions=Bassa_Vah} (Short: \p{Sc=Bass}, \p{Bass}) (36)
\p{Script: Batak} \p{Script_Extensions=Batak} (Short: \p{Sc=Batk}, \p{Batk}) (56)
\p{Script: Batk} \p{Script=Batak} (= \p{Script_Extensions=Batak}) (56)
\p{Script: Beng} \p{Script=Bengali} (96)
\p{Script: Bengali} (Short: \p{Sc=Beng}) (96: U+0980..0983, U+0985..098C, U+098F..0990, U+0993..09A8, U+09AA..09B0, U+09B2 ...)
\p{Script: Bhaiksuki} \p{Script_Extensions=Bhaiksuki} (Short: \p{Sc=Bhks}, \p{Bhks}) (97)
\p{Script: Bhks} \p{Script=Bhaiksuki} (= \p{Script_Extensions=Bhaiksuki}) (97)
\p{Script: Bopo} \p{Script=Bopomofo} (77)
\p{Script: Bopomofo} (Short: \p{Sc=Bopo}) (77: U+02EA..02EB, U+3105..312F, U+31A0..31BF)
\p{Script: Brah} \p{Script=Brahmi} (= \p{Script_Extensions=Brahmi}) (109)
\p{Script: Brahmi} \p{Script_Extensions=Brahmi} (Short: \p{Sc=Brah}, \p{Brah}) (109)
\p{Script: Brai} \p{Script=Braille} (= \p{Script_Extensions=Braille}) (256)
\p{Script: Braille} \p{Script_Extensions=Braille} (Short: \p{Sc=Brai}, \p{Brai}) (256)
\p{Script: Bugi} \p{Script=Buginese} (30)
\p{Script: Buginese} (Short: \p{Sc=Bugi}) (30: U+1A00..1A1B, U+1A1E..1A1F)
\p{Script: Buhd} \p{Script=Buhid} (20)
\p{Script: Buhid} (Short: \p{Sc=Buhd}) (20: U+1740..1753)
\p{Script: Cakm} \p{Script=Chakma} (71)
\p{Script: Canadian_Aboriginal} \p{Script_Extensions=Canadian_Aboriginal} (Short: \p{Sc=Cans}, \p{Cans}) (710)
\p{Script: Cans} \p{Script=Canadian_Aboriginal} (= \p{Script_Extensions=Canadian_Aboriginal}) (710)
\p{Script: Cari} \p{Script=Carian} (= \p{Script_Extensions=Carian}) (49)
\p{Script: Carian} \p{Script_Extensions=Carian} (Short: \p{Sc=Cari}, \p{Cari}) (49)
\p{Script: Caucasian_Albanian} \p{Script_Extensions=Caucasian_Albanian} (Short: \p{Sc=Aghb}, \p{Aghb}) (53)
\p{Script: Chakma} (Short: \p{Sc=Cakm}) (71: U+11100..11134, U+11136..11147)
\p{Script: Cham} \p{Script_Extensions=Cham} (Short: \p{Sc=Cham}, \p{Cham}) (83)
\p{Script: Cher} \p{Script=Cherokee} (= \p{Script_Extensions=Cherokee}) (172)
\p{Script: Cherokee} \p{Script_Extensions=Cherokee} (Short: \p{Sc=Cher}, \p{Cher}) (172)
\p{Script: Chorasmian} \p{Script_Extensions=Chorasmian} (Short: \p{Sc=Chrs}, \p{Chrs}) (28)
\p{Script: Chrs} \p{Script=Chorasmian} (= \p{Script_Extensions=Chorasmian}) (28)
\p{Script: Common} (Short: \p{Sc=Zyyy}) (8087: [\x00-\x20!\"#\$\%&\'\(\)*+,\-.\/0-9:;<=>?\@\[\\\]\^_`\{\|\}~\x7f-\xa9\xab-\xb9\xbb-\xbf\xd7\xf7], U+02B9..02DF, U+02E5..02E9, U+02EC..02FF, U+0374, U+037E ...)
\p{Script: Copt} \p{Script=Coptic} (137)
\p{Script: Coptic} (Short: \p{Sc=Copt}) (137: U+03E2..03EF, U+2C80..2CF3, U+2CF9..2CFF)
\p{Script: Cprt} \p{Script=Cypriot} (55)
\p{Script: Cuneiform} \p{Script_Extensions=Cuneiform} (Short: \p{Sc=Xsux}, \p{Xsux}) (1234)
\p{Script: Cypriot} (Short: \p{Sc=Cprt}) (55: U+10800..10805, U+10808, U+1080A..10835, U+10837..10838, U+1083C, U+1083F)
\p{Script: Cyrillic} (Short: \p{Sc=Cyrl}) (443: U+0400..0484, U+0487..052F, U+1C80..1C88, U+1D2B, U+1D78, U+2DE0..2DFF ...)
\p{Script: Cyrl} \p{Script=Cyrillic} (443)
\p{Script: Deseret} \p{Script_Extensions=Deseret} (Short: \p{Sc=Dsrt}, \p{Dsrt}) (80)
\p{Script: Deva} \p{Script=Devanagari} (154)
\p{Script: Devanagari} (Short: \p{Sc=Deva}) (154: U+0900..0950, U+0955..0963, U+0966..097F, U+A8E0..A8FF)
\p{Script: Diak} \p{Script=Dives_Akuru} (= \p{Script_Extensions=Dives_Akuru}) (72)
\p{Script: Dives_Akuru} \p{Script_Extensions=Dives_Akuru} (Short: \p{Sc=Diak}, \p{Diak}) (72)
\p{Script: Dogr} \p{Script=Dogra} (60)
\p{Script: Dogra} (Short: \p{Sc=Dogr}) (60: U+11800..1183B)
\p{Script: Dsrt} \p{Script=Deseret} (= \p{Script_Extensions=Deseret}) (80)
\p{Script: Dupl} \p{Script=Duployan} (143)
\p{Script: Duployan} (Short: \p{Sc=Dupl}) (143: U+1BC00..1BC6A, U+1BC70..1BC7C, U+1BC80..1BC88, U+1BC90..1BC99, U+1BC9C..1BC9F)
\p{Script: Egyp} \p{Script=Egyptian_Hieroglyphs} (= \p{Script_Extensions=Egyptian_Hieroglyphs}) (1080)
\p{Script: Egyptian_Hieroglyphs} \p{Script_Extensions=Egyptian_Hieroglyphs} (Short: \p{Sc=Egyp}, \p{Egyp}) (1080)
\p{Script: Elba} \p{Script=Elbasan} (= \p{Script_Extensions=Elbasan}) (40)
\p{Script: Elbasan} \p{Script_Extensions=Elbasan} (Short: \p{Sc=Elba}, \p{Elba}) (40)
\p{Script: Elym} \p{Script=Elymaic} (= \p{Script_Extensions=Elymaic}) (23)
\p{Script: Elymaic} \p{Script_Extensions=Elymaic} (Short: \p{Sc=Elym}, \p{Elym}) (23)
\p{Script: Ethi} \p{Script=Ethiopic} (= \p{Script_Extensions=Ethiopic}) (495)
\p{Script: Ethiopic} \p{Script_Extensions=Ethiopic} (Short: \p{Sc=Ethi}, \p{Ethi}) (495)
\p{Script: Geor} \p{Script=Georgian} (173)
\p{Script: Georgian} (Short: \p{Sc=Geor}) (173: U+10A0..10C5, U+10C7, U+10CD, U+10D0..10FA, U+10FC..10FF, U+1C90..1CBA ...)
\p{Script: Glag} \p{Script=Glagolitic} (132)
\p{Script: Glagolitic} (Short: \p{Sc=Glag}) (132: U+2C00..2C2E, U+2C30..2C5E, U+1E000..1E006, U+1E008..1E018, U+1E01B..1E021, U+1E023..1E024 ...)
\p{Script: Gong} \p{Script=Gunjala_Gondi} (63)
\p{Script: Gonm} \p{Script=Masaram_Gondi} (75)
\p{Script: Goth} \p{Script=Gothic} (= \p{Script_Extensions=Gothic}) (27)
\p{Script: Gothic} \p{Script_Extensions=Gothic} (Short: \p{Sc=Goth}, \p{Goth}) (27)
\p{Script: Gran} \p{Script=Grantha} (85)
\p{Script: Grantha} (Short: \p{Sc=Gran}) (85: U+11300..11303, U+11305..1130C, U+1130F..11310, U+11313..11328, U+1132A..11330, U+11332..11333 ...)
\p{Script: Greek} (Short: \p{Sc=Grek}) (518: U+0370..0373, U+0375..0377, U+037A..037D, U+037F, U+0384, U+0386 ...)
\p{Script: Grek} \p{Script=Greek} (518)
\p{Script: Gujarati} (Short: \p{Sc=Gujr}) (91: U+0A81..0A83, U+0A85..0A8D, U+0A8F..0A91, U+0A93..0AA8, U+0AAA..0AB0, U+0AB2..0AB3 ...)
\p{Script: Gujr} \p{Script=Gujarati} (91)
\p{Script: Gunjala_Gondi} (Short: \p{Sc=Gong}) (63: U+11D60..11D65, U+11D67..11D68, U+11D6A..11D8E, U+11D90..11D91, U+11D93..11D98, U+11DA0..11DA9)
\p{Script: Gurmukhi} (Short: \p{Sc=Guru}) (80: U+0A01..0A03, U+0A05..0A0A, U+0A0F..0A10, U+0A13..0A28, U+0A2A..0A30, U+0A32..0A33 ...)
\p{Script: Guru} \p{Script=Gurmukhi} (80)
\p{Script: Han} (Short: \p{Sc=Han}) (94_204: U+2E80..2E99, U+2E9B..2EF3, U+2F00..2FD5, U+3005, U+3007, U+3021..3029 ...)
\p{Script: Hang} \p{Script=Hangul} (11_739)
\p{Script: Hangul} (Short: \p{Sc=Hang}) (11_739: U+1100..11FF, U+302E..302F, U+3131..318E, U+3200..321E, U+3260..327E, U+A960..A97C ...)
\p{Script: Hani} \p{Script=Han} (94_204)
\p{Script: Hanifi_Rohingya} (Short: \p{Sc=Rohg}) (50: U+10D00..10D27, U+10D30..10D39)
\p{Script: Hano} \p{Script=Hanunoo} (21)
\p{Script: Hanunoo} (Short: \p{Sc=Hano}) (21: U+1720..1734)
\p{Script: Hatr} \p{Script=Hatran} (= \p{Script_Extensions=Hatran}) (26)
\p{Script: Hatran} \p{Script_Extensions=Hatran} (Short: \p{Sc=Hatr}, \p{Hatr}) (26)
\p{Script: Hebr} \p{Script=Hebrew} (= \p{Script_Extensions=Hebrew}) (134)
\p{Script: Hebrew} \p{Script_Extensions=Hebrew} (Short: \p{Sc=Hebr}, \p{Hebr}) (134)
\p{Script: Hira} \p{Script=Hiragana} (379)
\p{Script: Hiragana} (Short: \p{Sc=Hira}) (379: U+3041..3096, U+309D..309F, U+1B001..1B11E, U+1B150..1B152, U+1F200)
\p{Script: Hluw} \p{Script=Anatolian_Hieroglyphs} (= \p{Script_Extensions=Anatolian_Hieroglyphs}) (583)
\p{Script: Hmng} \p{Script=Pahawh_Hmong} (= \p{Script_Extensions=Pahawh_Hmong}) (127)
\p{Script: Hmnp} \p{Script=Nyiakeng_Puachue_Hmong} (= \p{Script_Extensions=Nyiakeng_Puachue_Hmong}) (71)
\p{Script: Hung} \p{Script=Old_Hungarian} (= \p{Script_Extensions=Old_Hungarian}) (108)
\p{Script: Imperial_Aramaic} \p{Script_Extensions=Imperial_Aramaic} (Short: \p{Sc=Armi}, \p{Armi}) (31)
\p{Script: Inherited} (Short: \p{Sc=Zinh}) (573: U+0300..036F, U+0485..0486, U+064B..0655, U+0670, U+0951..0954, U+1AB0..1AC0 ...)
\p{Script: Inscriptional_Pahlavi} \p{Script_Extensions=Inscriptional_Pahlavi} (Short: \p{Sc=Phli}, \p{Phli}) (27)
\p{Script: Inscriptional_Parthian} \p{Script_Extensions=Inscriptional_Parthian} (Short: \p{Sc=Prti}, \p{Prti}) (30)
\p{Script: Ital} \p{Script=Old_Italic} (= \p{Script_Extensions=Old_Italic}) (39)
\p{Script: Java} \p{Script=Javanese} (90)
\p{Script: Javanese} (Short: \p{Sc=Java}) (90: U+A980..A9CD, U+A9D0..A9D9, U+A9DE..A9DF)
\p{Script: Kaithi} (Short: \p{Sc=Kthi}) (67: U+11080..110C1, U+110CD)
\p{Script: Kali} \p{Script=Kayah_Li} (47)
\p{Script: Kana} \p{Script=Katakana} (304)
\p{Script: Kannada} (Short: \p{Sc=Knda}) (89: U+0C80..0C8C, U+0C8E..0C90, U+0C92..0CA8, U+0CAA..0CB3, U+0CB5..0CB9, U+0CBC..0CC4 ...)
\p{Script: Katakana} (Short: \p{Sc=Kana}) (304: U+30A1..30FA, U+30FD..30FF, U+31F0..31FF, U+32D0..32FE, U+3300..3357, U+FF66..FF6F ...)
\p{Script: Kayah_Li} (Short: \p{Sc=Kali}) (47: U+A900..A92D, U+A92F)
\p{Script: Khar} \p{Script=Kharoshthi} (= \p{Script_Extensions=Kharoshthi}) (68)
\p{Script: Kharoshthi} \p{Script_Extensions=Kharoshthi} (Short: \p{Sc=Khar}, \p{Khar}) (68)
\p{Script: Khitan_Small_Script} \p{Script_Extensions=Khitan_Small_Script} (Short: \p{Sc=Kits}, \p{Kits}) (471)
\p{Script: Khmer} \p{Script_Extensions=Khmer} (Short: \p{Sc=Khmr}, \p{Khmr}) (146)
\p{Script: Khmr} \p{Script=Khmer} (= \p{Script_Extensions=Khmer}) (146)
\p{Script: Khoj} \p{Script=Khojki} (62)
\p{Script: Khojki} (Short: \p{Sc=Khoj}) (62: U+11200..11211, U+11213..1123E)
\p{Script: Khudawadi} (Short: \p{Sc=Sind}) (69: U+112B0..112EA, U+112F0..112F9)
\p{Script: Kits} \p{Script=Khitan_Small_Script} (= \p{Script_Extensions=Khitan_Small_Script}) (471)
\p{Script: Knda} \p{Script=Kannada} (89)
\p{Script: Kthi} \p{Script=Kaithi} (67)
\p{Script: Lana} \p{Script=Tai_Tham} (= \p{Script_Extensions=Tai_Tham}) (127)
\p{Script: Lao} \p{Script_Extensions=Lao} (Short: \p{Sc=Lao}, \p{Lao}) (82)
\p{Script: Laoo} \p{Script=Lao} (= \p{Script_Extensions=Lao}) (82)
\p{Script: Latin} (Short: \p{Sc=Latn}) (1374: [A-Za-z\xaa\xba\xc0-\xd6\xd8-\xf6\xf8-\xff], U+0100..02B8, U+02E0..02E4, U+1D00..1D25, U+1D2C..1D5C, U+1D62..1D65 ...)
\p{Script: Latn} \p{Script=Latin} (1374)
\p{Script: Lepc} \p{Script=Lepcha} (= \p{Script_Extensions=Lepcha}) (74)
\p{Script: Lepcha} \p{Script_Extensions=Lepcha} (Short: \p{Sc=Lepc}, \p{Lepc}) (74)
\p{Script: Limb} \p{Script=Limbu} (68)
\p{Script: Limbu} (Short: \p{Sc=Limb}) (68: U+1900..191E, U+1920..192B, U+1930..193B, U+1940, U+1944..194F)
\p{Script: Lina} \p{Script=Linear_A} (341)
\p{Script: Linb} \p{Script=Linear_B} (211)
\p{Script: Linear_A} (Short: \p{Sc=Lina}) (341: U+10600..10736, U+10740..10755, U+10760..10767)
\p{Script: Linear_B} (Short: \p{Sc=Linb}) (211: U+10000..1000B, U+1000D..10026, U+10028..1003A, U+1003C..1003D, U+1003F..1004D, U+10050..1005D ...)
\p{Script: Lisu} \p{Script_Extensions=Lisu} (Short: \p{Sc=Lisu}, \p{Lisu}) (49)
\p{Script: Lyci} \p{Script=Lycian} (= \p{Script_Extensions=Lycian}) (29)
\p{Script: Lycian} \p{Script_Extensions=Lycian} (Short: \p{Sc=Lyci}, \p{Lyci}) (29)
\p{Script: Lydi} \p{Script=Lydian} (= \p{Script_Extensions=Lydian}) (27)
\p{Script: Lydian} \p{Script_Extensions=Lydian} (Short: \p{Sc=Lydi}, \p{Lydi}) (27)
\p{Script: Mahajani} (Short: \p{Sc=Mahj}) (39: U+11150..11176)
\p{Script: Mahj} \p{Script=Mahajani} (39)
\p{Script: Maka} \p{Script=Makasar} (= \p{Script_Extensions=Makasar}) (25)
\p{Script: Makasar} \p{Script_Extensions=Makasar} (Short: \p{Sc=Maka}, \p{Maka}) (25)
\p{Script: Malayalam} (Short: \p{Sc=Mlym}) (118: U+0D00..0D0C, U+0D0E..0D10, U+0D12..0D44, U+0D46..0D48, U+0D4A..0D4F, U+0D54..0D63 ...)
\p{Script: Mand} \p{Script=Mandaic} (29) \p{Script: Mandaic} (Short: \p{Sc=Mand}) (29: U+0840..085B, U+085E) \p{Script: Mani} \p{Script=Manichaean} (51) \p{Script: Manichaean} (Short: \p{Sc=Mani}) (51: U+10AC0..10AE6, U+10AEB..10AF6) \p{Script: Marc} \p{Script=Marchen} (= \p{Script_Extensions=Marchen}) (68) \p{Script: Marchen} \p{Script_Extensions=Marchen} (Short: \p{Sc=Marc}, \p{Marc}) (68) \p{Script: Masaram_Gondi} (Short: \p{Sc=Gonm}) (75: U+11D00..11D06, U+11D08..11D09, U+11D0B..11D36, U+11D3A, U+11D3C..11D3D, U+11D3F..11D47 ...) \p{Script: Medefaidrin} \p{Script_Extensions=Medefaidrin} (Short: \p{Sc=Medf}, \p{Medf}) (91) \p{Script: Medf} \p{Script=Medefaidrin} (= \p{Script_Extensions=Medefaidrin}) (91) \p{Script: Meetei_Mayek} \p{Script_Extensions=Meetei_Mayek} (Short: \p{Sc=Mtei}, \p{Mtei}) (79) \p{Script: Mend} \p{Script=Mende_Kikakui} (= \p{Script_Extensions=Mende_Kikakui}) (213) \p{Script: Mende_Kikakui} \p{Script_Extensions=Mende_Kikakui} (Short: \p{Sc=Mend}, \p{Mend}) (213) \p{Script: Merc} \p{Script=Meroitic_Cursive} (= \p{Script_Extensions=Meroitic_Cursive}) (90) \p{Script: Mero} \p{Script=Meroitic_Hieroglyphs} (= \p{Script_Extensions= Meroitic_Hieroglyphs}) (32) \p{Script: Meroitic_Cursive} \p{Script_Extensions= Meroitic_Cursive} (Short: \p{Sc=Merc}, \p{Merc}) (90) \p{Script: Meroitic_Hieroglyphs} \p{Script_Extensions= Meroitic_Hieroglyphs} (Short: \p{Sc= Mero}, \p{Mero}) (32) \p{Script: Miao} \p{Script_Extensions=Miao} (Short: \p{Sc= Miao}, \p{Miao}) (149) \p{Script: Mlym} \p{Script=Malayalam} (118) \p{Script: Modi} (Short: \p{Sc=Modi}) (79: U+11600..11644, U+11650..11659) \p{Script: Mong} \p{Script=Mongolian} (167) \p{Script: Mongolian} (Short: \p{Sc=Mong}) (167: U+1800..1801, U+1804, U+1806..180E, U+1810..1819, U+1820..1878, U+1880..18AA ...) 
\p{Script: Mro} \p{Script_Extensions=Mro} (Short: \p{Sc= Mro}, \p{Mro}) (43) \p{Script: Mroo} \p{Script=Mro} (= \p{Script_Extensions= Mro}) (43) \p{Script: Mtei} \p{Script=Meetei_Mayek} (= \p{Script_Extensions=Meetei_Mayek}) (79) \p{Script: Mult} \p{Script=Multani} (38) \p{Script: Multani} (Short: \p{Sc=Mult}) (38: U+11280..11286, U+11288, U+1128A..1128D, U+1128F..1129D, U+1129F..112A9) \p{Script: Myanmar} (Short: \p{Sc=Mymr}) (223: U+1000..109F, U+A9E0..A9FE, U+AA60..AA7F) \p{Script: Mymr} \p{Script=Myanmar} (223) \p{Script: Nabataean} \p{Script_Extensions=Nabataean} (Short: \p{Sc=Nbat}, \p{Nbat}) (40) \p{Script: Nand} \p{Script=Nandinagari} (65) \p{Script: Nandinagari} (Short: \p{Sc=Nand}) (65: U+119A0..119A7, U+119AA..119D7, U+119DA..119E4) \p{Script: Narb} \p{Script=Old_North_Arabian} (= \p{Script_Extensions=Old_North_Arabian}) (32) \p{Script: Nbat} \p{Script=Nabataean} (= \p{Script_Extensions=Nabataean}) (40) \p{Script: New_Tai_Lue} \p{Script_Extensions=New_Tai_Lue} (Short: \p{Sc=Talu}, \p{Talu}) (83) \p{Script: Newa} \p{Script_Extensions=Newa} (Short: \p{Sc= Newa}, \p{Newa}) (97) \p{Script: Nko} \p{Script_Extensions=Nko} (Short: \p{Sc= Nko}, \p{Nko}) (62) \p{Script: Nkoo} \p{Script=Nko} (= \p{Script_Extensions= Nko}) (62) \p{Script: Nshu} \p{Script=Nushu} (= \p{Script_Extensions= Nushu}) (397) \p{Script: Nushu} \p{Script_Extensions=Nushu} (Short: \p{Sc= Nshu}, \p{Nshu}) (397) \p{Script: Nyiakeng_Puachue_Hmong} \p{Script_Extensions= Nyiakeng_Puachue_Hmong} (Short: \p{Sc= Hmnp}, \p{Hmnp}) (71) \p{Script: Ogam} \p{Script=Ogham} (= \p{Script_Extensions= Ogham}) (29) \p{Script: Ogham} \p{Script_Extensions=Ogham} (Short: \p{Sc= Ogam}, \p{Ogam}) (29) \p{Script: Ol_Chiki} \p{Script_Extensions=Ol_Chiki} (Short: \p{Sc=Olck}, \p{Olck}) (48) \p{Script: Olck} \p{Script=Ol_Chiki} (= \p{Script_Extensions=Ol_Chiki}) (48) \p{Script: Old_Hungarian} \p{Script_Extensions=Old_Hungarian} (Short: \p{Sc=Hung}, \p{Hung}) (108) \p{Script: Old_Italic} \p{Script_Extensions=Old_Italic} 
(Short: \p{Sc=Ital}, \p{Ital}) (39) \p{Script: Old_North_Arabian} \p{Script_Extensions= Old_North_Arabian} (Short: \p{Sc=Narb}, \p{Narb}) (32) \p{Script: Old_Permic} (Short: \p{Sc=Perm}) (43: U+10350..1037A) \p{Script: Old_Persian} \p{Script_Extensions=Old_Persian} (Short: \p{Sc=Xpeo}, \p{Xpeo}) (50) \p{Script: Old_Sogdian} \p{Script_Extensions=Old_Sogdian} (Short: \p{Sc=Sogo}, \p{Sogo}) (40) \p{Script: Old_South_Arabian} \p{Script_Extensions= Old_South_Arabian} (Short: \p{Sc=Sarb}, \p{Sarb}) (32) \p{Script: Old_Turkic} \p{Script_Extensions=Old_Turkic} (Short: \p{Sc=Orkh}, \p{Orkh}) (73) \p{Script: Oriya} (Short: \p{Sc=Orya}) (91: U+0B01..0B03, U+0B05..0B0C, U+0B0F..0B10, U+0B13..0B28, U+0B2A..0B30, U+0B32..0B33 ...) \p{Script: Orkh} \p{Script=Old_Turkic} (= \p{Script_Extensions=Old_Turkic}) (73) \p{Script: Orya} \p{Script=Oriya} (91) \p{Script: Osage} \p{Script_Extensions=Osage} (Short: \p{Sc= Osge}, \p{Osge}) (72) \p{Script: Osge} \p{Script=Osage} (= \p{Script_Extensions= Osage}) (72) \p{Script: Osma} \p{Script=Osmanya} (= \p{Script_Extensions=Osmanya}) (40) \p{Script: Osmanya} \p{Script_Extensions=Osmanya} (Short: \p{Sc=Osma}, \p{Osma}) (40) \p{Script: Pahawh_Hmong} \p{Script_Extensions=Pahawh_Hmong} (Short: \p{Sc=Hmng}, \p{Hmng}) (127) \p{Script: Palm} \p{Script=Palmyrene} (= \p{Script_Extensions=Palmyrene}) (32) \p{Script: Palmyrene} \p{Script_Extensions=Palmyrene} (Short: \p{Sc=Palm}, \p{Palm}) (32) \p{Script: Pau_Cin_Hau} \p{Script_Extensions=Pau_Cin_Hau} (Short: \p{Sc=Pauc}, \p{Pauc}) (57) \p{Script: Pauc} \p{Script=Pau_Cin_Hau} (= \p{Script_Extensions=Pau_Cin_Hau}) (57) \p{Script: Perm} \p{Script=Old_Permic} (43) \p{Script: Phag} \p{Script=Phags_Pa} (56) \p{Script: Phags_Pa} (Short: \p{Sc=Phag}) (56: U+A840..A877) \p{Script: Phli} \p{Script=Inscriptional_Pahlavi} (= \p{Script_Extensions= Inscriptional_Pahlavi}) (27) \p{Script: Phlp} \p{Script=Psalter_Pahlavi} (29) \p{Script: Phnx} \p{Script=Phoenician} (= \p{Script_Extensions=Phoenician}) (29) \p{Script: 
Phoenician} \p{Script_Extensions=Phoenician} (Short: \p{Sc=Phnx}, \p{Phnx}) (29) \p{Script: Plrd} \p{Script=Miao} (= \p{Script_Extensions= Miao}) (149) \p{Script: Prti} \p{Script=Inscriptional_Parthian} (= \p{Script_Extensions= Inscriptional_Parthian}) (30) \p{Script: Psalter_Pahlavi} (Short: \p{Sc=Phlp}) (29: U+10B80..10B91, U+10B99..10B9C, U+10BA9..10BAF) \p{Script: Qaac} \p{Script=Coptic} (137) \p{Script: Qaai} \p{Script=Inherited} (573) \p{Script: Rejang} \p{Script_Extensions=Rejang} (Short: \p{Sc=Rjng}, \p{Rjng}) (37) \p{Script: Rjng} \p{Script=Rejang} (= \p{Script_Extensions= Rejang}) (37) \p{Script: Rohg} \p{Script=Hanifi_Rohingya} (50) \p{Script: Runic} \p{Script_Extensions=Runic} (Short: \p{Sc= Runr}, \p{Runr}) (86) \p{Script: Runr} \p{Script=Runic} (= \p{Script_Extensions= Runic}) (86) \p{Script: Samaritan} \p{Script_Extensions=Samaritan} (Short: \p{Sc=Samr}, \p{Samr}) (61) \p{Script: Samr} \p{Script=Samaritan} (= \p{Script_Extensions=Samaritan}) (61) \p{Script: Sarb} \p{Script=Old_South_Arabian} (= \p{Script_Extensions=Old_South_Arabian}) (32) \p{Script: Saur} \p{Script=Saurashtra} (= \p{Script_Extensions=Saurashtra}) (82) \p{Script: Saurashtra} \p{Script_Extensions=Saurashtra} (Short: \p{Sc=Saur}, \p{Saur}) (82) \p{Script: Sgnw} \p{Script=SignWriting} (= \p{Script_Extensions=SignWriting}) (672) \p{Script: Sharada} (Short: \p{Sc=Shrd}) (96: U+11180..111DF) \p{Script: Shavian} \p{Script_Extensions=Shavian} (Short: \p{Sc=Shaw}, \p{Shaw}) (48) \p{Script: Shaw} \p{Script=Shavian} (= \p{Script_Extensions=Shavian}) (48) \p{Script: Shrd} \p{Script=Sharada} (96) \p{Script: Sidd} \p{Script=Siddham} (= \p{Script_Extensions=Siddham}) (92) \p{Script: Siddham} \p{Script_Extensions=Siddham} (Short: \p{Sc=Sidd}, \p{Sidd}) (92) \p{Script: SignWriting} \p{Script_Extensions=SignWriting} (Short: \p{Sc=Sgnw}, \p{Sgnw}) (672) \p{Script: Sind} \p{Script=Khudawadi} (69) \p{Script: Sinh} \p{Script=Sinhala} (111) \p{Script: Sinhala} (Short: \p{Sc=Sinh}) (111: U+0D81..0D83, 
U+0D85..0D96, U+0D9A..0DB1, U+0DB3..0DBB, U+0DBD, U+0DC0..0DC6 ...) \p{Script: Sogd} \p{Script=Sogdian} (42) \p{Script: Sogdian} (Short: \p{Sc=Sogd}) (42: U+10F30..10F59) \p{Script: Sogo} \p{Script=Old_Sogdian} (= \p{Script_Extensions=Old_Sogdian}) (40) \p{Script: Sora} \p{Script=Sora_Sompeng} (= \p{Script_Extensions=Sora_Sompeng}) (35) \p{Script: Sora_Sompeng} \p{Script_Extensions=Sora_Sompeng} (Short: \p{Sc=Sora}, \p{Sora}) (35) \p{Script: Soyo} \p{Script=Soyombo} (= \p{Script_Extensions=Soyombo}) (83) \p{Script: Soyombo} \p{Script_Extensions=Soyombo} (Short: \p{Sc=Soyo}, \p{Soyo}) (83) \p{Script: Sund} \p{Script=Sundanese} (= \p{Script_Extensions=Sundanese}) (72) \p{Script: Sundanese} \p{Script_Extensions=Sundanese} (Short: \p{Sc=Sund}, \p{Sund}) (72) \p{Script: Sylo} \p{Script=Syloti_Nagri} (45) \p{Script: Syloti_Nagri} (Short: \p{Sc=Sylo}) (45: U+A800..A82C) \p{Script: Syrc} \p{Script=Syriac} (88) \p{Script: Syriac} (Short: \p{Sc=Syrc}) (88: U+0700..070D, U+070F..074A, U+074D..074F, U+0860..086A) \p{Script: Tagalog} (Short: \p{Sc=Tglg}) (20: U+1700..170C, U+170E..1714) \p{Script: Tagb} \p{Script=Tagbanwa} (18) \p{Script: Tagbanwa} (Short: \p{Sc=Tagb}) (18: U+1760..176C, U+176E..1770, U+1772..1773) \p{Script: Tai_Le} (Short: \p{Sc=Tale}) (35: U+1950..196D, U+1970..1974) \p{Script: Tai_Tham} \p{Script_Extensions=Tai_Tham} (Short: \p{Sc=Lana}, \p{Lana}) (127) \p{Script: Tai_Viet} \p{Script_Extensions=Tai_Viet} (Short: \p{Sc=Tavt}, \p{Tavt}) (72) \p{Script: Takr} \p{Script=Takri} (67) \p{Script: Takri} (Short: \p{Sc=Takr}) (67: U+11680..116B8, U+116C0..116C9) \p{Script: Tale} \p{Script=Tai_Le} (35) \p{Script: Talu} \p{Script=New_Tai_Lue} (= \p{Script_Extensions=New_Tai_Lue}) (83) \p{Script: Tamil} (Short: \p{Sc=Taml}) (123: U+0B82..0B83, U+0B85..0B8A, U+0B8E..0B90, U+0B92..0B95, U+0B99..0B9A, U+0B9C ...) 
\p{Script: Taml} \p{Script=Tamil} (123) \p{Script: Tang} \p{Script=Tangut} (= \p{Script_Extensions= Tangut}) (6914) \p{Script: Tangut} \p{Script_Extensions=Tangut} (Short: \p{Sc=Tang}, \p{Tang}) (6914) \p{Script: Tavt} \p{Script=Tai_Viet} (= \p{Script_Extensions=Tai_Viet}) (72) \p{Script: Telu} \p{Script=Telugu} (98) \p{Script: Telugu} (Short: \p{Sc=Telu}) (98: U+0C00..0C0C, U+0C0E..0C10, U+0C12..0C28, U+0C2A..0C39, U+0C3D..0C44, U+0C46..0C48 ...) \p{Script: Tfng} \p{Script=Tifinagh} (= \p{Script_Extensions=Tifinagh}) (59) \p{Script: Tglg} \p{Script=Tagalog} (20) \p{Script: Thaa} \p{Script=Thaana} (50) \p{Script: Thaana} (Short: \p{Sc=Thaa}) (50: U+0780..07B1) \p{Script: Thai} \p{Script_Extensions=Thai} (Short: \p{Sc= Thai}, \p{Thai}) (86) \p{Script: Tibetan} \p{Script_Extensions=Tibetan} (Short: \p{Sc=Tibt}, \p{Tibt}) (207) \p{Script: Tibt} \p{Script=Tibetan} (= \p{Script_Extensions=Tibetan}) (207) \p{Script: Tifinagh} \p{Script_Extensions=Tifinagh} (Short: \p{Sc=Tfng}, \p{Tfng}) (59) \p{Script: Tirh} \p{Script=Tirhuta} (82) \p{Script: Tirhuta} (Short: \p{Sc=Tirh}) (82: U+11480..114C7, U+114D0..114D9) \p{Script: Ugar} \p{Script=Ugaritic} (= \p{Script_Extensions=Ugaritic}) (31) \p{Script: Ugaritic} \p{Script_Extensions=Ugaritic} (Short: \p{Sc=Ugar}, \p{Ugar}) (31) \p{Script: Unknown} \p{Script_Extensions=Unknown} (Short: \p{Sc=Zzzz}, \p{Zzzz}) (970_188 plus all above-Unicode code points) \p{Script: Vai} \p{Script_Extensions=Vai} (Short: \p{Sc= Vai}, \p{Vai}) (300) \p{Script: Vaii} \p{Script=Vai} (= \p{Script_Extensions= Vai}) (300) \p{Script: Wancho} \p{Script_Extensions=Wancho} (Short: \p{Sc=Wcho}, \p{Wcho}) (59) \p{Script: Wara} \p{Script=Warang_Citi} (= \p{Script_Extensions=Warang_Citi}) (84) \p{Script: Warang_Citi} \p{Script_Extensions=Warang_Citi} (Short: \p{Sc=Wara}, \p{Wara}) (84) \p{Script: Wcho} \p{Script=Wancho} (= \p{Script_Extensions= Wancho}) (59) \p{Script: Xpeo} \p{Script=Old_Persian} (= \p{Script_Extensions=Old_Persian}) (50) \p{Script: Xsux} 
\p{Script=Cuneiform} (= \p{Script_Extensions=Cuneiform}) (1234) \p{Script: Yezi} \p{Script=Yezidi} (47) \p{Script: Yezidi} (Short: \p{Sc=Yezi}) (47: U+10E80..10EA9, U+10EAB..10EAD, U+10EB0..10EB1) \p{Script: Yi} (Short: \p{Sc=Yi}) (1220: U+A000..A48C, U+A490..A4C6) \p{Script: Yiii} \p{Script=Yi} (1220) \p{Script: Zanabazar_Square} \p{Script_Extensions= Zanabazar_Square} (Short: \p{Sc=Zanb}, \p{Zanb}) (72) \p{Script: Zanb} \p{Script=Zanabazar_Square} (= \p{Script_Extensions=Zanabazar_Square}) (72) \p{Script: Zinh} \p{Script=Inherited} (573) \p{Script: Zyyy} \p{Script=Common} (8087) \p{Script: Zzzz} \p{Script=Unknown} (= \p{Script_Extensions=Unknown}) (970_188 plus all above-Unicode code points) \p{Script_Extensions: Adlam} (Short: \p{Scx=Adlm}, \p{Adlm}) (89: U+0640, U+1E900..1E94B, U+1E950..1E959, U+1E95E..1E95F) \p{Script_Extensions: Adlm} \p{Script_Extensions=Adlam} (89) \p{Script_Extensions: Aghb} \p{Script_Extensions= Caucasian_Albanian} (53) \p{Script_Extensions: Ahom} (Short: \p{Scx=Ahom}, \p{Ahom}) (58: U+11700..1171A, U+1171D..1172B, U+11730..1173F) \p{Script_Extensions: Anatolian_Hieroglyphs} (Short: \p{Scx=Hluw}, \p{Hluw}) (583: U+14400..14646) \p{Script_Extensions: Arab} \p{Script_Extensions=Arabic} (1335) \p{Script_Extensions: Arabic} (Short: \p{Scx=Arab}, \p{Arab}) (1335: U+0600..0604, U+0606..061C, U+061E..06DC, U+06DE..06FF, U+0750..077F, U+08A0..08B4 ...) 
\p{Script_Extensions: Armenian} (Short: \p{Scx=Armn}, \p{Armn}) (96: U+0531..0556, U+0559..058A, U+058D..058F, U+FB13..FB17) \p{Script_Extensions: Armi} \p{Script_Extensions=Imperial_Aramaic} (31) \p{Script_Extensions: Armn} \p{Script_Extensions=Armenian} (96) \p{Script_Extensions: Avestan} (Short: \p{Scx=Avst}, \p{Avst}) (61: U+10B00..10B35, U+10B39..10B3F) \p{Script_Extensions: Avst} \p{Script_Extensions=Avestan} (61) \p{Script_Extensions: Bali} \p{Script_Extensions=Balinese} (121) \p{Script_Extensions: Balinese} (Short: \p{Scx=Bali}, \p{Bali}) (121: U+1B00..1B4B, U+1B50..1B7C) \p{Script_Extensions: Bamu} \p{Script_Extensions=Bamum} (657) \p{Script_Extensions: Bamum} (Short: \p{Scx=Bamu}, \p{Bamu}) (657: U+A6A0..A6F7, U+16800..16A38) \p{Script_Extensions: Bass} \p{Script_Extensions=Bassa_Vah} (36) \p{Script_Extensions: Bassa_Vah} (Short: \p{Scx=Bass}, \p{Bass}) (36: U+16AD0..16AED, U+16AF0..16AF5) \p{Script_Extensions: Batak} (Short: \p{Scx=Batk}, \p{Batk}) (56: U+1BC0..1BF3, U+1BFC..1BFF) \p{Script_Extensions: Batk} \p{Script_Extensions=Batak} (56) \p{Script_Extensions: Beng} \p{Script_Extensions=Bengali} (113) \p{Script_Extensions: Bengali} (Short: \p{Scx=Beng}, \p{Beng}) (113: U+0951..0952, U+0964..0965, U+0980..0983, U+0985..098C, U+098F..0990, U+0993..09A8 ...) \p{Script_Extensions: Bhaiksuki} (Short: \p{Scx=Bhks}, \p{Bhks}) (97: U+11C00..11C08, U+11C0A..11C36, U+11C38..11C45, U+11C50..11C6C) \p{Script_Extensions: Bhks} \p{Script_Extensions=Bhaiksuki} (97) \p{Script_Extensions: Bopo} \p{Script_Extensions=Bopomofo} (117) \p{Script_Extensions: Bopomofo} (Short: \p{Scx=Bopo}, \p{Bopo}) (117: U+02EA..02EB, U+3001..3003, U+3008..3011, U+3013..301F, U+302A..302D, U+3030 ...) 
\p{Script_Extensions: Brah} \p{Script_Extensions=Brahmi} (109)
\p{Script_Extensions: Brahmi} (Short: \p{Scx=Brah}, \p{Brah}) (109: U+11000..1104D, U+11052..1106F, U+1107F)
\p{Script_Extensions: Brai} \p{Script_Extensions=Braille} (256)
\p{Script_Extensions: Braille} (Short: \p{Scx=Brai}, \p{Brai}) (256: U+2800..28FF)
\p{Script_Extensions: Bugi} \p{Script_Extensions=Buginese} (31)
\p{Script_Extensions: Buginese} (Short: \p{Scx=Bugi}, \p{Bugi}) (31: U+1A00..1A1B, U+1A1E..1A1F, U+A9CF)
\p{Script_Extensions: Buhd} \p{Script_Extensions=Buhid} (22)
\p{Script_Extensions: Buhid} (Short: \p{Scx=Buhd}, \p{Buhd}) (22: U+1735..1736, U+1740..1753)
\p{Script_Extensions: Cakm} \p{Script_Extensions=Chakma} (91)
\p{Script_Extensions: Canadian_Aboriginal} (Short: \p{Scx=Cans}, \p{Cans}) (710: U+1400..167F, U+18B0..18F5)
\p{Script_Extensions: Cans} \p{Script_Extensions=Canadian_Aboriginal} (710)
\p{Script_Extensions: Cari} \p{Script_Extensions=Carian} (49)
\p{Script_Extensions: Carian} (Short: \p{Scx=Cari}, \p{Cari}) (49: U+102A0..102D0)
\p{Script_Extensions: Caucasian_Albanian} (Short: \p{Scx=Aghb}, \p{Aghb}) (53: U+10530..10563, U+1056F)
\p{Script_Extensions: Chakma} (Short: \p{Scx=Cakm}, \p{Cakm}) (91: U+09E6..09EF, U+1040..1049, U+11100..11134, U+11136..11147)
\p{Script_Extensions: Cham} (Short: \p{Scx=Cham}, \p{Cham}) (83: U+AA00..AA36, U+AA40..AA4D, U+AA50..AA59, U+AA5C..AA5F)
\p{Script_Extensions: Cher} \p{Script_Extensions=Cherokee} (172)
\p{Script_Extensions: Cherokee} (Short: \p{Scx=Cher}, \p{Cher}) (172: U+13A0..13F5, U+13F8..13FD, U+AB70..ABBF)
\p{Script_Extensions: Chorasmian} (Short: \p{Scx=Chrs}, \p{Chrs}) (28: U+10FB0..10FCB)
\p{Script_Extensions: Chrs} \p{Script_Extensions=Chorasmian} (28)
\p{Script_Extensions: Common} (Short: \p{Scx=Zyyy}, \p{Zyyy}) (7661: [\x00-\x20!\"#\$\%&\'\(\)*+,\-.\/0-9:;<=>?\@\[\\\]\^_`\{\|\}~\x7f-\xa9\xab-\xb9\xbb-\xbf\xd7\xf7], U+02B9..02DF, U+02E5..02E9, U+02EC..02FF, U+0374, U+037E ...)
\p{Script_Extensions: Copt} \p{Script_Extensions=Coptic} (165) \p{Script_Extensions: Coptic} (Short: \p{Scx=Copt}, \p{Copt}) (165: U+03E2..03EF, U+2C80..2CF3, U+2CF9..2CFF, U+102E0..102FB) \p{Script_Extensions: Cprt} \p{Script_Extensions=Cypriot} (112) \p{Script_Extensions: Cuneiform} (Short: \p{Scx=Xsux}, \p{Xsux}) (1234: U+12000..12399, U+12400..1246E, U+12470..12474, U+12480..12543) \p{Script_Extensions: Cypriot} (Short: \p{Scx=Cprt}, \p{Cprt}) (112: U+10100..10102, U+10107..10133, U+10137..1013F, U+10800..10805, U+10808, U+1080A..10835 ...) \p{Script_Extensions: Cyrillic} (Short: \p{Scx=Cyrl}, \p{Cyrl}) (447: U+0400..052F, U+1C80..1C88, U+1D2B, U+1D78, U+1DF8, U+2DE0..2DFF ...) \p{Script_Extensions: Cyrl} \p{Script_Extensions=Cyrillic} (447) \p{Script_Extensions: Deseret} (Short: \p{Scx=Dsrt}, \p{Dsrt}) (80: U+10400..1044F) \p{Script_Extensions: Deva} \p{Script_Extensions=Devanagari} (210) \p{Script_Extensions: Devanagari} (Short: \p{Scx=Deva}, \p{Deva}) (210: U+0900..0952, U+0955..097F, U+1CD0..1CF6, U+1CF8..1CF9, U+20F0, U+A830..A839 ...) \p{Script_Extensions: Diak} \p{Script_Extensions=Dives_Akuru} (72) \p{Script_Extensions: Dives_Akuru} (Short: \p{Scx=Diak}, \p{Diak}) (72: U+11900..11906, U+11909, U+1190C..11913, U+11915..11916, U+11918..11935, U+11937..11938 ...) 
\p{Script_Extensions: Dogr} \p{Script_Extensions=Dogra} (82) \p{Script_Extensions: Dogra} (Short: \p{Scx=Dogr}, \p{Dogr}) (82: U+0964..096F, U+A830..A839, U+11800..1183B) \p{Script_Extensions: Dsrt} \p{Script_Extensions=Deseret} (80) \p{Script_Extensions: Dupl} \p{Script_Extensions=Duployan} (147) \p{Script_Extensions: Duployan} (Short: \p{Scx=Dupl}, \p{Dupl}) (147: U+1BC00..1BC6A, U+1BC70..1BC7C, U+1BC80..1BC88, U+1BC90..1BC99, U+1BC9C..1BCA3) \p{Script_Extensions: Egyp} \p{Script_Extensions= Egyptian_Hieroglyphs} (1080) \p{Script_Extensions: Egyptian_Hieroglyphs} (Short: \p{Scx=Egyp}, \p{Egyp}) (1080: U+13000..1342E, U+13430..13438) \p{Script_Extensions: Elba} \p{Script_Extensions=Elbasan} (40) \p{Script_Extensions: Elbasan} (Short: \p{Scx=Elba}, \p{Elba}) (40: U+10500..10527) \p{Script_Extensions: Elym} \p{Script_Extensions=Elymaic} (23) \p{Script_Extensions: Elymaic} (Short: \p{Scx=Elym}, \p{Elym}) (23: U+10FE0..10FF6) \p{Script_Extensions: Ethi} \p{Script_Extensions=Ethiopic} (495) \p{Script_Extensions: Ethiopic} (Short: \p{Scx=Ethi}, \p{Ethi}) (495: U+1200..1248, U+124A..124D, U+1250..1256, U+1258, U+125A..125D, U+1260..1288 ...) \p{Script_Extensions: Geor} \p{Script_Extensions=Georgian} (174) \p{Script_Extensions: Georgian} (Short: \p{Scx=Geor}, \p{Geor}) (174: U+10A0..10C5, U+10C7, U+10CD, U+10D0..10FF, U+1C90..1CBA, U+1CBD..1CBF ...) \p{Script_Extensions: Glag} \p{Script_Extensions=Glagolitic} (136) \p{Script_Extensions: Glagolitic} (Short: \p{Scx=Glag}, \p{Glag}) (136: U+0484, U+0487, U+2C00..2C2E, U+2C30..2C5E, U+2E43, U+A66F ...) 
\p{Script_Extensions: Gong} \p{Script_Extensions=Gunjala_Gondi} (65) \p{Script_Extensions: Gonm} \p{Script_Extensions=Masaram_Gondi} (77) \p{Script_Extensions: Goth} \p{Script_Extensions=Gothic} (27) \p{Script_Extensions: Gothic} (Short: \p{Scx=Goth}, \p{Goth}) (27: U+10330..1034A) \p{Script_Extensions: Gran} \p{Script_Extensions=Grantha} (116) \p{Script_Extensions: Grantha} (Short: \p{Scx=Gran}, \p{Gran}) (116: U+0951..0952, U+0964..0965, U+0BE6..0BF3, U+1CD0, U+1CD2..1CD3, U+1CF2..1CF4 ...) \p{Script_Extensions: Greek} (Short: \p{Scx=Grek}, \p{Grek}) (522: U+0342, U+0345, U+0370..0373, U+0375..0377, U+037A..037D, U+037F ...) \p{Script_Extensions: Grek} \p{Script_Extensions=Greek} (522) \p{Script_Extensions: Gujarati} (Short: \p{Scx=Gujr}, \p{Gujr}) (105: U+0951..0952, U+0964..0965, U+0A81..0A83, U+0A85..0A8D, U+0A8F..0A91, U+0A93..0AA8 ...) \p{Script_Extensions: Gujr} \p{Script_Extensions=Gujarati} (105) \p{Script_Extensions: Gunjala_Gondi} (Short: \p{Scx=Gong}, \p{Gong}) (65: U+0964..0965, U+11D60..11D65, U+11D67..11D68, U+11D6A..11D8E, U+11D90..11D91, U+11D93..11D98 ...) \p{Script_Extensions: Gurmukhi} (Short: \p{Scx=Guru}, \p{Guru}) (94: U+0951..0952, U+0964..0965, U+0A01..0A03, U+0A05..0A0A, U+0A0F..0A10, U+0A13..0A28 ...) \p{Script_Extensions: Guru} \p{Script_Extensions=Gurmukhi} (94) \p{Script_Extensions: Han} (Short: \p{Scx=Han}, \p{Han}) (94_492: U+2E80..2E99, U+2E9B..2EF3, U+2F00..2FD5, U+3001..3003, U+3005..3011, U+3013..301F ...) \p{Script_Extensions: Hang} \p{Script_Extensions=Hangul} (11_775) \p{Script_Extensions: Hangul} (Short: \p{Scx=Hang}, \p{Hang}) (11_775: U+1100..11FF, U+3001..3003, U+3008..3011, U+3013..301F, U+302E..3030, U+3037 ...) \p{Script_Extensions: Hani} \p{Script_Extensions=Han} (94_492) \p{Script_Extensions: Hanifi_Rohingya} (Short: \p{Scx=Rohg}, \p{Rohg}) (55: U+060C, U+061B, U+061F, U+0640, U+06D4, U+10D00..10D27 ...) 
\p{Script_Extensions: Hano} \p{Script_Extensions=Hanunoo} (23) \p{Script_Extensions: Hanunoo} (Short: \p{Scx=Hano}, \p{Hano}) (23: U+1720..1736) \p{Script_Extensions: Hatr} \p{Script_Extensions=Hatran} (26) \p{Script_Extensions: Hatran} (Short: \p{Scx=Hatr}, \p{Hatr}) (26: U+108E0..108F2, U+108F4..108F5, U+108FB..108FF) \p{Script_Extensions: Hebr} \p{Script_Extensions=Hebrew} (134) \p{Script_Extensions: Hebrew} (Short: \p{Scx=Hebr}, \p{Hebr}) (134: U+0591..05C7, U+05D0..05EA, U+05EF..05F4, U+FB1D..FB36, U+FB38..FB3C, U+FB3E ...) \p{Script_Extensions: Hira} \p{Script_Extensions=Hiragana} (431) \p{Script_Extensions: Hiragana} (Short: \p{Scx=Hira}, \p{Hira}) (431: U+3001..3003, U+3008..3011, U+3013..301F, U+3030..3035, U+3037, U+303C..303D ...) \p{Script_Extensions: Hluw} \p{Script_Extensions= Anatolian_Hieroglyphs} (583) \p{Script_Extensions: Hmng} \p{Script_Extensions=Pahawh_Hmong} (127) \p{Script_Extensions: Hmnp} \p{Script_Extensions= Nyiakeng_Puachue_Hmong} (71) \p{Script_Extensions: Hung} \p{Script_Extensions=Old_Hungarian} (108) \p{Script_Extensions: Imperial_Aramaic} (Short: \p{Scx=Armi}, \p{Armi}) (31: U+10840..10855, U+10857..1085F) \p{Script_Extensions: Inherited} (Short: \p{Scx=Zinh}, \p{Zinh}) (503: U+0300..0341, U+0343..0344, U+0346..0362, U+0953..0954, U+1AB0..1AC0, U+1DC2..1DF7 ...) 
\p{Script_Extensions: Inscriptional_Pahlavi} (Short: \p{Scx=Phli}, \p{Phli}) (27: U+10B60..10B72, U+10B78..10B7F) \p{Script_Extensions: Inscriptional_Parthian} (Short: \p{Scx= Prti}, \p{Prti}) (30: U+10B40..10B55, U+10B58..10B5F) \p{Script_Extensions: Ital} \p{Script_Extensions=Old_Italic} (39) \p{Script_Extensions: Java} \p{Script_Extensions=Javanese} (91) \p{Script_Extensions: Javanese} (Short: \p{Scx=Java}, \p{Java}) (91: U+A980..A9CD, U+A9CF..A9D9, U+A9DE..A9DF) \p{Script_Extensions: Kaithi} (Short: \p{Scx=Kthi}, \p{Kthi}) (87: U+0966..096F, U+A830..A839, U+11080..110C1, U+110CD) \p{Script_Extensions: Kali} \p{Script_Extensions=Kayah_Li} (48) \p{Script_Extensions: Kana} \p{Script_Extensions=Katakana} (356) \p{Script_Extensions: Kannada} (Short: \p{Scx=Knda}, \p{Knda}) (104: U+0951..0952, U+0964..0965, U+0C80..0C8C, U+0C8E..0C90, U+0C92..0CA8, U+0CAA..0CB3 ...) \p{Script_Extensions: Katakana} (Short: \p{Scx=Kana}, \p{Kana}) (356: U+3001..3003, U+3008..3011, U+3013..301F, U+3030..3035, U+3037, U+303C..303D ...) \p{Script_Extensions: Kayah_Li} (Short: \p{Scx=Kali}, \p{Kali}) (48: U+A900..A92F) \p{Script_Extensions: Khar} \p{Script_Extensions=Kharoshthi} (68) \p{Script_Extensions: Kharoshthi} (Short: \p{Scx=Khar}, \p{Khar}) (68: U+10A00..10A03, U+10A05..10A06, U+10A0C..10A13, U+10A15..10A17, U+10A19..10A35, U+10A38..10A3A ...) 
\p{Script_Extensions: Khitan_Small_Script} (Short: \p{Scx=Kits}, \p{Kits}) (471: U+16FE4, U+18B00..18CD5)
\p{Script_Extensions: Khmer} (Short: \p{Scx=Khmr}, \p{Khmr}) (146: U+1780..17DD, U+17E0..17E9, U+17F0..17F9, U+19E0..19FF)
\p{Script_Extensions: Khmr} \p{Script_Extensions=Khmer} (146)
\p{Script_Extensions: Khoj} \p{Script_Extensions=Khojki} (82)
\p{Script_Extensions: Khojki} (Short: \p{Scx=Khoj}, \p{Khoj}) (82: U+0AE6..0AEF, U+A830..A839, U+11200..11211, U+11213..1123E)
\p{Script_Extensions: Khudawadi} (Short: \p{Scx=Sind}, \p{Sind}) (81: U+0964..0965, U+A830..A839, U+112B0..112EA, U+112F0..112F9)
\p{Script_Extensions: Kits} \p{Script_Extensions=Khitan_Small_Script} (471)
\p{Script_Extensions: Knda} \p{Script_Extensions=Kannada} (104)
\p{Script_Extensions: Kthi} \p{Script_Extensions=Kaithi} (87)
\p{Script_Extensions: Lana} \p{Script_Extensions=Tai_Tham} (127)
\p{Script_Extensions: Lao} (Short: \p{Scx=Lao}, \p{Lao}) (82: U+0E81..0E82, U+0E84, U+0E86..0E8A, U+0E8C..0EA3, U+0EA5, U+0EA7..0EBD ...)
\p{Script_Extensions: Laoo} \p{Script_Extensions=Lao} (82)
\p{Script_Extensions: Latin} (Short: \p{Scx=Latn}, \p{Latn}) (1403: [A-Za-z\xaa\xba\xc0-\xd6\xd8-\xf6\xf8-\xff], U+0100..02B8, U+02E0..02E4, U+0363..036F, U+0485..0486, U+0951..0952 ...)
\p{Script_Extensions: Latn} \p{Script_Extensions=Latin} (1403) \p{Script_Extensions: Lepc} \p{Script_Extensions=Lepcha} (74) \p{Script_Extensions: Lepcha} (Short: \p{Scx=Lepc}, \p{Lepc}) (74: U+1C00..1C37, U+1C3B..1C49, U+1C4D..1C4F) \p{Script_Extensions: Limb} \p{Script_Extensions=Limbu} (69) \p{Script_Extensions: Limbu} (Short: \p{Scx=Limb}, \p{Limb}) (69: U+0965, U+1900..191E, U+1920..192B, U+1930..193B, U+1940, U+1944..194F) \p{Script_Extensions: Lina} \p{Script_Extensions=Linear_A} (386) \p{Script_Extensions: Linb} \p{Script_Extensions=Linear_B} (268) \p{Script_Extensions: Linear_A} (Short: \p{Scx=Lina}, \p{Lina}) (386: U+10107..10133, U+10600..10736, U+10740..10755, U+10760..10767) \p{Script_Extensions: Linear_B} (Short: \p{Scx=Linb}, \p{Linb}) (268: U+10000..1000B, U+1000D..10026, U+10028..1003A, U+1003C..1003D, U+1003F..1004D, U+10050..1005D ...) \p{Script_Extensions: Lisu} (Short: \p{Scx=Lisu}, \p{Lisu}) (49: U+A4D0..A4FF, U+11FB0) \p{Script_Extensions: Lyci} \p{Script_Extensions=Lycian} (29) \p{Script_Extensions: Lycian} (Short: \p{Scx=Lyci}, \p{Lyci}) (29: U+10280..1029C) \p{Script_Extensions: Lydi} \p{Script_Extensions=Lydian} (27) \p{Script_Extensions: Lydian} (Short: \p{Scx=Lydi}, \p{Lydi}) (27: U+10920..10939, U+1093F) \p{Script_Extensions: Mahajani} (Short: \p{Scx=Mahj}, \p{Mahj}) (61: U+0964..096F, U+A830..A839, U+11150..11176) \p{Script_Extensions: Mahj} \p{Script_Extensions=Mahajani} (61) \p{Script_Extensions: Maka} \p{Script_Extensions=Makasar} (25) \p{Script_Extensions: Makasar} (Short: \p{Scx=Maka}, \p{Maka}) (25: U+11EE0..11EF8) \p{Script_Extensions: Malayalam} (Short: \p{Scx=Mlym}, \p{Mlym}) (126: U+0951..0952, U+0964..0965, U+0D00..0D0C, U+0D0E..0D10, U+0D12..0D44, U+0D46..0D48 ...) 
\p{Script_Extensions: Mand} \p{Script_Extensions=Mandaic} (30) \p{Script_Extensions: Mandaic} (Short: \p{Scx=Mand}, \p{Mand}) (30: U+0640, U+0840..085B, U+085E) \p{Script_Extensions: Mani} \p{Script_Extensions=Manichaean} (52) \p{Script_Extensions: Manichaean} (Short: \p{Scx=Mani}, \p{Mani}) (52: U+0640, U+10AC0..10AE6, U+10AEB..10AF6) \p{Script_Extensions: Marc} \p{Script_Extensions=Marchen} (68) \p{Script_Extensions: Marchen} (Short: \p{Scx=Marc}, \p{Marc}) (68: U+11C70..11C8F, U+11C92..11CA7, U+11CA9..11CB6) \p{Script_Extensions: Masaram_Gondi} (Short: \p{Scx=Gonm}, \p{Gonm}) (77: U+0964..0965, U+11D00..11D06, U+11D08..11D09, U+11D0B..11D36, U+11D3A, U+11D3C..11D3D ...) \p{Script_Extensions: Medefaidrin} (Short: \p{Scx=Medf}, \p{Medf}) (91: U+16E40..16E9A) \p{Script_Extensions: Medf} \p{Script_Extensions=Medefaidrin} (91) \p{Script_Extensions: Meetei_Mayek} (Short: \p{Scx=Mtei}, \p{Mtei}) (79: U+AAE0..AAF6, U+ABC0..ABED, U+ABF0..ABF9) \p{Script_Extensions: Mend} \p{Script_Extensions=Mende_Kikakui} (213) \p{Script_Extensions: Mende_Kikakui} (Short: \p{Scx=Mend}, \p{Mend}) (213: U+1E800..1E8C4, U+1E8C7..1E8D6) \p{Script_Extensions: Merc} \p{Script_Extensions=Meroitic_Cursive} (90) \p{Script_Extensions: Mero} \p{Script_Extensions= Meroitic_Hieroglyphs} (32) \p{Script_Extensions: Meroitic_Cursive} (Short: \p{Scx=Merc}, \p{Merc}) (90: U+109A0..109B7, U+109BC..109CF, U+109D2..109FF) \p{Script_Extensions: Meroitic_Hieroglyphs} (Short: \p{Scx=Mero}, \p{Mero}) (32: U+10980..1099F) \p{Script_Extensions: Miao} (Short: \p{Scx=Miao}, \p{Miao}) (149: U+16F00..16F4A, U+16F4F..16F87, U+16F8F..16F9F) \p{Script_Extensions: Mlym} \p{Script_Extensions=Malayalam} (126) \p{Script_Extensions: Modi} (Short: \p{Scx=Modi}, \p{Modi}) (89: U+A830..A839, U+11600..11644, U+11650..11659) \p{Script_Extensions: Mong} \p{Script_Extensions=Mongolian} (171) \p{Script_Extensions: Mongolian} (Short: \p{Scx=Mong}, \p{Mong}) (171: U+1800..180E, U+1810..1819, U+1820..1878, U+1880..18AA, U+202F, 
U+11660..1166C) \p{Script_Extensions: Mro} (Short: \p{Scx=Mro}, \p{Mro}) (43: U+16A40..16A5E, U+16A60..16A69, U+16A6E..16A6F) \p{Script_Extensions: Mroo} \p{Script_Extensions=Mro} (43) \p{Script_Extensions: Mtei} \p{Script_Extensions=Meetei_Mayek} (79) \p{Script_Extensions: Mult} \p{Script_Extensions=Multani} (48) \p{Script_Extensions: Multani} (Short: \p{Scx=Mult}, \p{Mult}) (48: U+0A66..0A6F, U+11280..11286, U+11288, U+1128A..1128D, U+1128F..1129D, U+1129F..112A9) \p{Script_Extensions: Myanmar} (Short: \p{Scx=Mymr}, \p{Mymr}) (224: U+1000..109F, U+A92E, U+A9E0..A9FE, U+AA60..AA7F) \p{Script_Extensions: Mymr} \p{Script_Extensions=Myanmar} (224) \p{Script_Extensions: Nabataean} (Short: \p{Scx=Nbat}, \p{Nbat}) (40: U+10880..1089E, U+108A7..108AF) \p{Script_Extensions: Nand} \p{Script_Extensions=Nandinagari} (86) \p{Script_Extensions: Nandinagari} (Short: \p{Scx=Nand}, \p{Nand}) (86: U+0964..0965, U+0CE6..0CEF, U+1CE9, U+1CF2, U+1CFA, U+A830..A835 ...) \p{Script_Extensions: Narb} \p{Script_Extensions= Old_North_Arabian} (32) \p{Script_Extensions: Nbat} \p{Script_Extensions=Nabataean} (40) \p{Script_Extensions: New_Tai_Lue} (Short: \p{Scx=Talu}, \p{Talu}) (83: U+1980..19AB, U+19B0..19C9, U+19D0..19DA, U+19DE..19DF) \p{Script_Extensions: Newa} (Short: \p{Scx=Newa}, \p{Newa}) (97: U+11400..1145B, U+1145D..11461) \p{Script_Extensions: Nko} (Short: \p{Scx=Nko}, \p{Nko}) (62: U+07C0..07FA, U+07FD..07FF) \p{Script_Extensions: Nkoo} \p{Script_Extensions=Nko} (62) \p{Script_Extensions: Nshu} \p{Script_Extensions=Nushu} (397) \p{Script_Extensions: Nushu} (Short: \p{Scx=Nshu}, \p{Nshu}) (397: U+16FE1, U+1B170..1B2FB) \p{Script_Extensions: Nyiakeng_Puachue_Hmong} (Short: \p{Scx= Hmnp}, \p{Hmnp}) (71: U+1E100..1E12C, U+1E130..1E13D, U+1E140..1E149, U+1E14E..1E14F) \p{Script_Extensions: Ogam} \p{Script_Extensions=Ogham} (29) \p{Script_Extensions: Ogham} (Short: \p{Scx=Ogam}, \p{Ogam}) (29: U+1680..169C) \p{Script_Extensions: Ol_Chiki} (Short: \p{Scx=Olck}, \p{Olck}) (48: 
U+1C50..1C7F) \p{Script_Extensions: Olck} \p{Script_Extensions=Ol_Chiki} (48) \p{Script_Extensions: Old_Hungarian} (Short: \p{Scx=Hung}, \p{Hung}) (108: U+10C80..10CB2, U+10CC0..10CF2, U+10CFA..10CFF) \p{Script_Extensions: Old_Italic} (Short: \p{Scx=Ital}, \p{Ital}) (39: U+10300..10323, U+1032D..1032F) \p{Script_Extensions: Old_North_Arabian} (Short: \p{Scx=Narb}, \p{Narb}) (32: U+10A80..10A9F) \p{Script_Extensions: Old_Permic} (Short: \p{Scx=Perm}, \p{Perm}) (44: U+0483, U+10350..1037A) \p{Script_Extensions: Old_Persian} (Short: \p{Scx=Xpeo}, \p{Xpeo}) (50: U+103A0..103C3, U+103C8..103D5) \p{Script_Extensions: Old_Sogdian} (Short: \p{Scx=Sogo}, \p{Sogo}) (40: U+10F00..10F27) \p{Script_Extensions: Old_South_Arabian} (Short: \p{Scx=Sarb}, \p{Sarb}) (32: U+10A60..10A7F) \p{Script_Extensions: Old_Turkic} (Short: \p{Scx=Orkh}, \p{Orkh}) (73: U+10C00..10C48) \p{Script_Extensions: Oriya} (Short: \p{Scx=Orya}, \p{Orya}) (97: U+0951..0952, U+0964..0965, U+0B01..0B03, U+0B05..0B0C, U+0B0F..0B10, U+0B13..0B28 ...) 
\p{Script_Extensions: Orkh} \p{Script_Extensions=Old_Turkic} (73) \p{Script_Extensions: Orya} \p{Script_Extensions=Oriya} (97) \p{Script_Extensions: Osage} (Short: \p{Scx=Osge}, \p{Osge}) (72: U+104B0..104D3, U+104D8..104FB) \p{Script_Extensions: Osge} \p{Script_Extensions=Osage} (72) \p{Script_Extensions: Osma} \p{Script_Extensions=Osmanya} (40) \p{Script_Extensions: Osmanya} (Short: \p{Scx=Osma}, \p{Osma}) (40: U+10480..1049D, U+104A0..104A9) \p{Script_Extensions: Pahawh_Hmong} (Short: \p{Scx=Hmng}, \p{Hmng}) (127: U+16B00..16B45, U+16B50..16B59, U+16B5B..16B61, U+16B63..16B77, U+16B7D..16B8F) \p{Script_Extensions: Palm} \p{Script_Extensions=Palmyrene} (32) \p{Script_Extensions: Palmyrene} (Short: \p{Scx=Palm}, \p{Palm}) (32: U+10860..1087F) \p{Script_Extensions: Pau_Cin_Hau} (Short: \p{Scx=Pauc}, \p{Pauc}) (57: U+11AC0..11AF8) \p{Script_Extensions: Pauc} \p{Script_Extensions=Pau_Cin_Hau} (57) \p{Script_Extensions: Perm} \p{Script_Extensions=Old_Permic} (44) \p{Script_Extensions: Phag} \p{Script_Extensions=Phags_Pa} (59) \p{Script_Extensions: Phags_Pa} (Short: \p{Scx=Phag}, \p{Phag}) (59: U+1802..1803, U+1805, U+A840..A877) \p{Script_Extensions: Phli} \p{Script_Extensions= Inscriptional_Pahlavi} (27) \p{Script_Extensions: Phlp} \p{Script_Extensions=Psalter_Pahlavi} (30) \p{Script_Extensions: Phnx} \p{Script_Extensions=Phoenician} (29) \p{Script_Extensions: Phoenician} (Short: \p{Scx=Phnx}, \p{Phnx}) (29: U+10900..1091B, U+1091F) \p{Script_Extensions: Plrd} \p{Script_Extensions=Miao} (149) \p{Script_Extensions: Prti} \p{Script_Extensions= Inscriptional_Parthian} (30) \p{Script_Extensions: Psalter_Pahlavi} (Short: \p{Scx=Phlp}, \p{Phlp}) (30: U+0640, U+10B80..10B91, U+10B99..10B9C, U+10BA9..10BAF) \p{Script_Extensions: Qaac} \p{Script_Extensions=Coptic} (165) \p{Script_Extensions: Qaai} \p{Script_Extensions=Inherited} (503) \p{Script_Extensions: Rejang} (Short: \p{Scx=Rjng}, \p{Rjng}) (37: U+A930..A953, U+A95F) \p{Script_Extensions: Rjng} 
\p{Script_Extensions=Rejang} (37) \p{Script_Extensions: Rohg} \p{Script_Extensions=Hanifi_Rohingya} (55) \p{Script_Extensions: Runic} (Short: \p{Scx=Runr}, \p{Runr}) (86: U+16A0..16EA, U+16EE..16F8) \p{Script_Extensions: Runr} \p{Script_Extensions=Runic} (86) \p{Script_Extensions: Samaritan} (Short: \p{Scx=Samr}, \p{Samr}) (61: U+0800..082D, U+0830..083E) \p{Script_Extensions: Samr} \p{Script_Extensions=Samaritan} (61) \p{Script_Extensions: Sarb} \p{Script_Extensions= Old_South_Arabian} (32) \p{Script_Extensions: Saur} \p{Script_Extensions=Saurashtra} (82) \p{Script_Extensions: Saurashtra} (Short: \p{Scx=Saur}, \p{Saur}) (82: U+A880..A8C5, U+A8CE..A8D9) \p{Script_Extensions: Sgnw} \p{Script_Extensions=SignWriting} (672) \p{Script_Extensions: Sharada} (Short: \p{Scx=Shrd}, \p{Shrd}) (102: U+0951, U+1CD7, U+1CD9, U+1CDC..1CDD, U+1CE0, U+11180..111DF) \p{Script_Extensions: Shavian} (Short: \p{Scx=Shaw}, \p{Shaw}) (48: U+10450..1047F) \p{Script_Extensions: Shaw} \p{Script_Extensions=Shavian} (48) \p{Script_Extensions: Shrd} \p{Script_Extensions=Sharada} (102) \p{Script_Extensions: Sidd} \p{Script_Extensions=Siddham} (92) \p{Script_Extensions: Siddham} (Short: \p{Scx=Sidd}, \p{Sidd}) (92: U+11580..115B5, U+115B8..115DD) \p{Script_Extensions: SignWriting} (Short: \p{Scx=Sgnw}, \p{Sgnw}) (672: U+1D800..1DA8B, U+1DA9B..1DA9F, U+1DAA1..1DAAF) \p{Script_Extensions: Sind} \p{Script_Extensions=Khudawadi} (81) \p{Script_Extensions: Sinh} \p{Script_Extensions=Sinhala} (113) \p{Script_Extensions: Sinhala} (Short: \p{Scx=Sinh}, \p{Sinh}) (113: U+0964..0965, U+0D81..0D83, U+0D85..0D96, U+0D9A..0DB1, U+0DB3..0DBB, U+0DBD ...) 
  \p{Script_Extensions: Sogd} \p{Script_Extensions=Sogdian} (43)
  \p{Script_Extensions: Sogdian} (Short: \p{Scx=Sogd}, \p{Sogd}) (43:
                            U+0640, U+10F30..10F59)
  \p{Script_Extensions: Sogo} \p{Script_Extensions=Old_Sogdian} (40)
  \p{Script_Extensions: Sora} \p{Script_Extensions=Sora_Sompeng} (35)
  \p{Script_Extensions: Sora_Sompeng} (Short: \p{Scx=Sora}, \p{Sora})
                            (35: U+110D0..110E8, U+110F0..110F9)
  \p{Script_Extensions: Soyo} \p{Script_Extensions=Soyombo} (83)
  \p{Script_Extensions: Soyombo} (Short: \p{Scx=Soyo}, \p{Soyo}) (83:
                            U+11A50..11AA2)
  \p{Script_Extensions: Sund} \p{Script_Extensions=Sundanese} (72)
  \p{Script_Extensions: Sundanese} (Short: \p{Scx=Sund}, \p{Sund})
                            (72: U+1B80..1BBF, U+1CC0..1CC7)
  \p{Script_Extensions: Sylo} \p{Script_Extensions=Syloti_Nagri} (57)
  \p{Script_Extensions: Syloti_Nagri} (Short: \p{Scx=Sylo}, \p{Sylo})
                            (57: U+0964..0965, U+09E6..09EF,
                            U+A800..A82C)
  \p{Script_Extensions: Syrc} \p{Script_Extensions=Syriac} (106)
  \p{Script_Extensions: Syriac} (Short: \p{Scx=Syrc}, \p{Syrc}) (106:
                            U+060C, U+061B..061C, U+061F, U+0640,
                            U+064B..0655, U+0670 ...)
\p{Script_Extensions: Tagalog} (Short: \p{Scx=Tglg}, \p{Tglg}) (22: U+1700..170C, U+170E..1714, U+1735..1736) \p{Script_Extensions: Tagb} \p{Script_Extensions=Tagbanwa} (20) \p{Script_Extensions: Tagbanwa} (Short: \p{Scx=Tagb}, \p{Tagb}) (20: U+1735..1736, U+1760..176C, U+176E..1770, U+1772..1773) \p{Script_Extensions: Tai_Le} (Short: \p{Scx=Tale}, \p{Tale}) (45: U+1040..1049, U+1950..196D, U+1970..1974) \p{Script_Extensions: Tai_Tham} (Short: \p{Scx=Lana}, \p{Lana}) (127: U+1A20..1A5E, U+1A60..1A7C, U+1A7F..1A89, U+1A90..1A99, U+1AA0..1AAD) \p{Script_Extensions: Tai_Viet} (Short: \p{Scx=Tavt}, \p{Tavt}) (72: U+AA80..AAC2, U+AADB..AADF) \p{Script_Extensions: Takr} \p{Script_Extensions=Takri} (79) \p{Script_Extensions: Takri} (Short: \p{Scx=Takr}, \p{Takr}) (79: U+0964..0965, U+A830..A839, U+11680..116B8, U+116C0..116C9) \p{Script_Extensions: Tale} \p{Script_Extensions=Tai_Le} (45) \p{Script_Extensions: Talu} \p{Script_Extensions=New_Tai_Lue} (83) \p{Script_Extensions: Tamil} (Short: \p{Scx=Taml}, \p{Taml}) (133: U+0951..0952, U+0964..0965, U+0B82..0B83, U+0B85..0B8A, U+0B8E..0B90, U+0B92..0B95 ...) \p{Script_Extensions: Taml} \p{Script_Extensions=Tamil} (133) \p{Script_Extensions: Tang} \p{Script_Extensions=Tangut} (6914) \p{Script_Extensions: Tangut} (Short: \p{Scx=Tang}, \p{Tang}) (6914: U+16FE0, U+17000..187F7, U+18800..18AFF, U+18D00..18D08) \p{Script_Extensions: Tavt} \p{Script_Extensions=Tai_Viet} (72) \p{Script_Extensions: Telu} \p{Script_Extensions=Telugu} (104) \p{Script_Extensions: Telugu} (Short: \p{Scx=Telu}, \p{Telu}) (104: U+0951..0952, U+0964..0965, U+0C00..0C0C, U+0C0E..0C10, U+0C12..0C28, U+0C2A..0C39 ...) \p{Script_Extensions: Tfng} \p{Script_Extensions=Tifinagh} (59) \p{Script_Extensions: Tglg} \p{Script_Extensions=Tagalog} (22) \p{Script_Extensions: Thaa} \p{Script_Extensions=Thaana} (66) \p{Script_Extensions: Thaana} (Short: \p{Scx=Thaa}, \p{Thaa}) (66: U+060C, U+061B..061C, U+061F, U+0660..0669, U+0780..07B1, U+FDF2 ...) 
\p{Script_Extensions: Thai} (Short: \p{Scx=Thai}, \p{Thai}) (86: U+0E01..0E3A, U+0E40..0E5B) \p{Script_Extensions: Tibetan} (Short: \p{Scx=Tibt}, \p{Tibt}) (207: U+0F00..0F47, U+0F49..0F6C, U+0F71..0F97, U+0F99..0FBC, U+0FBE..0FCC, U+0FCE..0FD4 ...) \p{Script_Extensions: Tibt} \p{Script_Extensions=Tibetan} (207) \p{Script_Extensions: Tifinagh} (Short: \p{Scx=Tfng}, \p{Tfng}) (59: U+2D30..2D67, U+2D6F..2D70, U+2D7F) \p{Script_Extensions: Tirh} \p{Script_Extensions=Tirhuta} (97) \p{Script_Extensions: Tirhuta} (Short: \p{Scx=Tirh}, \p{Tirh}) (97: U+0951..0952, U+0964..0965, U+1CF2, U+A830..A839, U+11480..114C7, U+114D0..114D9) \p{Script_Extensions: Ugar} \p{Script_Extensions=Ugaritic} (31) \p{Script_Extensions: Ugaritic} (Short: \p{Scx=Ugar}, \p{Ugar}) (31: U+10380..1039D, U+1039F) \p{Script_Extensions: Unknown} (Short: \p{Scx=Zzzz}, \p{Zzzz}) (970_188 plus all above-Unicode code points: U+0378..0379, U+0380..0383, U+038B, U+038D, U+03A2, U+0530 ...) \p{Script_Extensions: Vai} (Short: \p{Scx=Vai}, \p{Vai}) (300: U+A500..A62B) \p{Script_Extensions: Vaii} \p{Script_Extensions=Vai} (300) \p{Script_Extensions: Wancho} (Short: \p{Scx=Wcho}, \p{Wcho}) (59: U+1E2C0..1E2F9, U+1E2FF) \p{Script_Extensions: Wara} \p{Script_Extensions=Warang_Citi} (84) \p{Script_Extensions: Warang_Citi} (Short: \p{Scx=Wara}, \p{Wara}) (84: U+118A0..118F2, U+118FF) \p{Script_Extensions: Wcho} \p{Script_Extensions=Wancho} (59) \p{Script_Extensions: Xpeo} \p{Script_Extensions=Old_Persian} (50) \p{Script_Extensions: Xsux} \p{Script_Extensions=Cuneiform} (1234) \p{Script_Extensions: Yezi} \p{Script_Extensions=Yezidi} (60) \p{Script_Extensions: Yezidi} (Short: \p{Scx=Yezi}, \p{Yezi}) (60: U+060C, U+061B, U+061F, U+0660..0669, U+10E80..10EA9, U+10EAB..10EAD ...) \p{Script_Extensions: Yi} (Short: \p{Scx=Yi}, \p{Yi}) (1246: U+3001..3002, U+3008..3011, U+3014..301B, U+30FB, U+A000..A48C, U+A490..A4C6 ...) 
\p{Script_Extensions: Yiii} \p{Script_Extensions=Yi} (1246) \p{Script_Extensions: Zanabazar_Square} (Short: \p{Scx=Zanb}, \p{Zanb}) (72: U+11A00..11A47) \p{Script_Extensions: Zanb} \p{Script_Extensions=Zanabazar_Square} (72) \p{Script_Extensions: Zinh} \p{Script_Extensions=Inherited} (503) \p{Script_Extensions: Zyyy} \p{Script_Extensions=Common} (7661) \p{Script_Extensions: Zzzz} \p{Script_Extensions=Unknown} (970_188 plus all above-Unicode code points) \p{Scx: *} \p{Script_Extensions: *} \p{SD} \p{Soft_Dotted} (= \p{Soft_Dotted=Y}) (46) \p{SD: *} \p{Soft_Dotted: *} \p{Sentence_Break: AT} \p{Sentence_Break=ATerm} (4) \p{Sentence_Break: ATerm} (Short: \p{SB=AT}) (4: [.], U+2024, U+FE52, U+FF0E) \p{Sentence_Break: CL} \p{Sentence_Break=Close} (187) \p{Sentence_Break: Close} (Short: \p{SB=CL}) (187: [\"\'\(\)\[\] \{\}\xab\xbb], U+0F3A..0F3D, U+169B..169C, U+2018..201F, U+2039..203A, U+2045..2046 ...) \p{Sentence_Break: CR} (Short: \p{SB=CR}) (1: [\r]) \p{Sentence_Break: EX} \p{Sentence_Break=Extend} (2395) \p{Sentence_Break: Extend} (Short: \p{SB=EX}) (2395: U+0300..036F, U+0483..0489, U+0591..05BD, U+05BF, U+05C1..05C2, U+05C4..05C5 ...) \p{Sentence_Break: FO} \p{Sentence_Break=Format} (63) \p{Sentence_Break: Format} (Short: \p{SB=FO}) (63: [\xad], U+0600..0605, U+061C, U+06DD, U+070F, U+08E2 ...) \p{Sentence_Break: LE} \p{Sentence_Break=OLetter} (127_413) \p{Sentence_Break: LF} (Short: \p{SB=LF}) (1: [\n]) \p{Sentence_Break: LO} \p{Sentence_Break=Lower} (2297) \p{Sentence_Break: Lower} (Short: \p{SB=LO}) (2297: [a-z\xaa\xb5 \xba\xdf-\xf6\xf8-\xff], U+0101, U+0103, U+0105, U+0107, U+0109 ...) \p{Sentence_Break: NU} \p{Sentence_Break=Numeric} (652) \p{Sentence_Break: Numeric} (Short: \p{SB=NU}) (652: [0-9], U+0660..0669, U+066B..066C, U+06F0..06F9, U+07C0..07C9, U+0966..096F ...) \p{Sentence_Break: OLetter} (Short: \p{SB=LE}) (127_413: U+01BB, U+01C0..01C3, U+0294, U+02B9..02BF, U+02C6..02D1, U+02EC ...) 
\p{Sentence_Break: Other} (Short: \p{SB=XX}) (979_014 plus all above-Unicode code points: [^\t\n\cK\f \r\x20!\"\'\(\),\-.0-9:?A-Z\[\]a-z\{\} \x85\xa0\xaa-\xab\xad\xb5\xba-\xbb\xc0- \xd6\xd8-\xf6\xf8-\xff], U+02C2..02C5, U+02D2..02DF, U+02E5..02EB, U+02ED, U+02EF..02FF ...) \p{Sentence_Break: SC} \p{Sentence_Break=SContinue} (26) \p{Sentence_Break: SContinue} (Short: \p{SB=SC}) (26: [,\-:], U+055D, U+060C..060D, U+07F8, U+1802, U+1808 ...) \p{Sentence_Break: SE} \p{Sentence_Break=Sep} (3) \p{Sentence_Break: Sep} (Short: \p{SB=SE}) (3: [\x85], U+2028..2029) \p{Sentence_Break: Sp} (Short: \p{SB=Sp}) (20: [\t\cK\f\x20\xa0], U+1680, U+2000..200A, U+202F, U+205F, U+3000) \p{Sentence_Break: ST} \p{Sentence_Break=STerm} (140) \p{Sentence_Break: STerm} (Short: \p{SB=ST}) (140: [!?], U+0589, U+061E..061F, U+06D4, U+0700..0702, U+07F9 ...) \p{Sentence_Break: UP} \p{Sentence_Break=Upper} (1896) \p{Sentence_Break: Upper} (Short: \p{SB=UP}) (1896: [A-Z\xc0-\xd6 \xd8-\xde], U+0100, U+0102, U+0104, U+0106, U+0108 ...) \p{Sentence_Break: XX} \p{Sentence_Break=Other} (979_014 plus all above-Unicode code points) \p{Sentence_Terminal} \p{Sentence_Terminal=Y} (Short: \p{STerm}) (143) \p{Sentence_Terminal: N*} (Short: \p{STerm=N}, \P{STerm}) (1_113_969 plus all above-Unicode code points: [\x00-\x20\"#\$\%&\'\(\)*+,\- \/0-9:;<=>\@A-Z\[\\\]\^_`a-z\{\|\}~\x7f- \xff], U+0100..0588, U+058A..061D, U+0620..06D3, U+06D5..06FF, U+0703..07F8 ...) \p{Sentence_Terminal: Y*} (Short: \p{STerm=Y}, \p{STerm}) (143: [!.?], U+0589, U+061E..061F, U+06D4, U+0700..0702, U+07F9 ...) 
\p{Separator} \p{General_Category=Separator} (Short: \p{Z}) (19) \p{Sgnw} \p{SignWriting} (= \p{Script_Extensions= SignWriting}) (672) \p{Sharada} \p{Script_Extensions=Sharada} (Short: \p{Shrd}; NOT \p{Block=Sharada}) (102) \p{Shavian} \p{Script_Extensions=Shavian} (Short: \p{Shaw}) (48) \p{Shaw} \p{Shavian} (= \p{Script_Extensions= Shavian}) (48) X \p{Shorthand_Format_Controls} \p{Block=Shorthand_Format_Controls} (16) \p{Shrd} \p{Sharada} (= \p{Script_Extensions= Sharada}) (NOT \p{Block=Sharada}) (102) \p{Sidd} \p{Siddham} (= \p{Script_Extensions= Siddham}) (NOT \p{Block=Siddham}) (92) \p{Siddham} \p{Script_Extensions=Siddham} (Short: \p{Sidd}; NOT \p{Block=Siddham}) (92) \p{SignWriting} \p{Script_Extensions=SignWriting} (Short: \p{Sgnw}) (672) \p{Sind} \p{Khudawadi} (= \p{Script_Extensions= Khudawadi}) (NOT \p{Block=Khudawadi}) (81) \p{Sinh} \p{Sinhala} (= \p{Script_Extensions= Sinhala}) (NOT \p{Block=Sinhala}) (113) \p{Sinhala} \p{Script_Extensions=Sinhala} (Short: \p{Sinh}; NOT \p{Block=Sinhala}) (113) X \p{Sinhala_Archaic_Numbers} \p{Block=Sinhala_Archaic_Numbers} (32) \p{Sk} \p{Modifier_Symbol} (= \p{General_Category=Modifier_Symbol}) (123) \p{Sm} \p{Math_Symbol} (= \p{General_Category= Math_Symbol}) (948) X \p{Small_Form_Variants} \p{Block=Small_Form_Variants} (Short: \p{InSmallForms}) (32) X \p{Small_Forms} \p{Small_Form_Variants} (= \p{Block= Small_Form_Variants}) (32) X \p{Small_Kana_Ext} \p{Small_Kana_Extension} (= \p{Block= Small_Kana_Extension}) (64) X \p{Small_Kana_Extension} \p{Block=Small_Kana_Extension} (Short: \p{InSmallKanaExt}) (64) \p{So} \p{Other_Symbol} (= \p{General_Category= Other_Symbol}) (6431) \p{Soft_Dotted} \p{Soft_Dotted=Y} (Short: \p{SD}) (46) \p{Soft_Dotted: N*} (Short: \p{SD=N}, \P{SD}) (1_114_066 plus all above-Unicode code points: [\x00- \x20!\"#\$\%&\'\(\)*+,\-.\/0-9:;<=>?\@A- Z\[\\\]\^_`a-hk-z\{\|\}~\x7f-\xff], U+0100..012E, U+0130..0248, U+024A..0267, U+0269..029C, U+029E..02B1 ...) 
\p{Soft_Dotted: Y*} (Short: \p{SD=Y}, \p{SD}) (46: [i-j], U+012F, U+0249, U+0268, U+029D, U+02B2 ...) \p{Sogd} \p{Sogdian} (= \p{Script_Extensions= Sogdian}) (NOT \p{Block=Sogdian}) (43) \p{Sogdian} \p{Script_Extensions=Sogdian} (Short: \p{Sogd}; NOT \p{Block=Sogdian}) (43) \p{Sogo} \p{Old_Sogdian} (= \p{Script_Extensions= Old_Sogdian}) (NOT \p{Block= Old_Sogdian}) (40) \p{Sora} \p{Sora_Sompeng} (= \p{Script_Extensions= Sora_Sompeng}) (NOT \p{Block= Sora_Sompeng}) (35) \p{Sora_Sompeng} \p{Script_Extensions=Sora_Sompeng} (Short: \p{Sora}; NOT \p{Block=Sora_Sompeng}) (35) \p{Soyo} \p{Soyombo} (= \p{Script_Extensions= Soyombo}) (NOT \p{Block=Soyombo}) (83) \p{Soyombo} \p{Script_Extensions=Soyombo} (Short: \p{Soyo}; NOT \p{Block=Soyombo}) (83) \p{Space} \p{White_Space} (= \p{White_Space=Y}) (25) \p{Space: *} \p{White_Space: *} \p{Space_Separator} \p{General_Category=Space_Separator} (Short: \p{Zs}) (17) \p{SpacePerl} \p{XPosixSpace} (25) \p{Spacing_Mark} \p{General_Category=Spacing_Mark} (Short: \p{Mc}) (443) X \p{Spacing_Modifier_Letters} \p{Block=Spacing_Modifier_Letters} (Short: \p{InModifierLetters}) (80) X \p{Specials} \p{Block=Specials} (16) \p{STerm} \p{Sentence_Terminal} (= \p{Sentence_Terminal=Y}) (143) \p{STerm: *} \p{Sentence_Terminal: *} \p{Sund} \p{Sundanese} (= \p{Script_Extensions= Sundanese}) (NOT \p{Block=Sundanese}) (72) \p{Sundanese} \p{Script_Extensions=Sundanese} (Short: \p{Sund}; NOT \p{Block=Sundanese}) (72) X \p{Sundanese_Sup} \p{Sundanese_Supplement} (= \p{Block= Sundanese_Supplement}) (16) X \p{Sundanese_Supplement} \p{Block=Sundanese_Supplement} (Short: \p{InSundaneseSup}) (16) X \p{Sup_Arrows_A} \p{Supplemental_Arrows_A} (= \p{Block= Supplemental_Arrows_A}) (16) X \p{Sup_Arrows_B} \p{Supplemental_Arrows_B} (= \p{Block= Supplemental_Arrows_B}) (128) X \p{Sup_Arrows_C} \p{Supplemental_Arrows_C} (= \p{Block= Supplemental_Arrows_C}) (256) X \p{Sup_Math_Operators} \p{Supplemental_Mathematical_Operators} (= \p{Block= 
Supplemental_Mathematical_Operators}) (256) X \p{Sup_PUA_A} \p{Supplementary_Private_Use_Area_A} (= \p{Block= Supplementary_Private_Use_Area_A}) (65_536) X \p{Sup_PUA_B} \p{Supplementary_Private_Use_Area_B} (= \p{Block= Supplementary_Private_Use_Area_B}) (65_536) X \p{Sup_Punctuation} \p{Supplemental_Punctuation} (= \p{Block= Supplemental_Punctuation}) (128) X \p{Sup_Symbols_And_Pictographs} \p{Supplemental_Symbols_And_Pictographs} (= \p{Block= Supplemental_Symbols_And_Pictographs}) (256) X \p{Super_And_Sub} \p{Superscripts_And_Subscripts} (= \p{Block=Superscripts_And_Subscripts}) (48) X \p{Superscripts_And_Subscripts} \p{Block= Superscripts_And_Subscripts} (Short: \p{InSuperAndSub}) (48) X \p{Supplemental_Arrows_A} \p{Block=Supplemental_Arrows_A} (Short: \p{InSupArrowsA}) (16) X \p{Supplemental_Arrows_B} \p{Block=Supplemental_Arrows_B} (Short: \p{InSupArrowsB}) (128) X \p{Supplemental_Arrows_C} \p{Block=Supplemental_Arrows_C} (Short: \p{InSupArrowsC}) (256) X \p{Supplemental_Mathematical_Operators} \p{Block= Supplemental_Mathematical_Operators} (Short: \p{InSupMathOperators}) (256) X \p{Supplemental_Punctuation} \p{Block=Supplemental_Punctuation} (Short: \p{InSupPunctuation}) (128) X \p{Supplemental_Symbols_And_Pictographs} \p{Block= Supplemental_Symbols_And_Pictographs} (Short: \p{InSupSymbolsAndPictographs}) (256) X \p{Supplementary_Private_Use_Area_A} \p{Block= Supplementary_Private_Use_Area_A} (Short: \p{InSupPUAA}) (65_536) X \p{Supplementary_Private_Use_Area_B} \p{Block= Supplementary_Private_Use_Area_B} (Short: \p{InSupPUAB}) (65_536) \p{Surrogate} \p{General_Category=Surrogate} (Short: \p{Cs}) (2048) X \p{Sutton_SignWriting} \p{Block=Sutton_SignWriting} (688) \p{Sylo} \p{Syloti_Nagri} (= \p{Script_Extensions= Syloti_Nagri}) (NOT \p{Block= Syloti_Nagri}) (57) \p{Syloti_Nagri} \p{Script_Extensions=Syloti_Nagri} (Short: \p{Sylo}; NOT \p{Block=Syloti_Nagri}) (57) \p{Symbol} \p{General_Category=Symbol} (Short: \p{S}) (7564) X \p{Symbols_And_Pictographs_Ext_A} 
\p{Symbols_And_Pictographs_Extended_A} (= \p{Block= Symbols_And_Pictographs_Extended_A}) (144) X \p{Symbols_And_Pictographs_Extended_A} \p{Block= Symbols_And_Pictographs_Extended_A} (144) X \p{Symbols_For_Legacy_Computing} \p{Block= Symbols_For_Legacy_Computing} (256) \p{Syrc} \p{Syriac} (= \p{Script_Extensions= Syriac}) (NOT \p{Block=Syriac}) (106) \p{Syriac} \p{Script_Extensions=Syriac} (Short: \p{Syrc}; NOT \p{Block=Syriac}) (106) X \p{Syriac_Sup} \p{Syriac_Supplement} (= \p{Block= Syriac_Supplement}) (16) X \p{Syriac_Supplement} \p{Block=Syriac_Supplement} (Short: \p{InSyriacSup}) (16) \p{Tagalog} \p{Script_Extensions=Tagalog} (Short: \p{Tglg}; NOT \p{Block=Tagalog}) (22) \p{Tagb} \p{Tagbanwa} (= \p{Script_Extensions= Tagbanwa}) (NOT \p{Block=Tagbanwa}) (20) \p{Tagbanwa} \p{Script_Extensions=Tagbanwa} (Short: \p{Tagb}; NOT \p{Block=Tagbanwa}) (20) X \p{Tags} \p{Block=Tags} (128) \p{Tai_Le} \p{Script_Extensions=Tai_Le} (Short: \p{Tale}; NOT \p{Block=Tai_Le}) (45) \p{Tai_Tham} \p{Script_Extensions=Tai_Tham} (Short: \p{Lana}; NOT \p{Block=Tai_Tham}) (127) \p{Tai_Viet} \p{Script_Extensions=Tai_Viet} (Short: \p{Tavt}; NOT \p{Block=Tai_Viet}) (72) X \p{Tai_Xuan_Jing} \p{Tai_Xuan_Jing_Symbols} (= \p{Block= Tai_Xuan_Jing_Symbols}) (96) X \p{Tai_Xuan_Jing_Symbols} \p{Block=Tai_Xuan_Jing_Symbols} (Short: \p{InTaiXuanJing}) (96) \p{Takr} \p{Takri} (= \p{Script_Extensions=Takri}) (NOT \p{Block=Takri}) (79) \p{Takri} \p{Script_Extensions=Takri} (Short: \p{Takr}; NOT \p{Block=Takri}) (79) \p{Tale} \p{Tai_Le} (= \p{Script_Extensions= Tai_Le}) (NOT \p{Block=Tai_Le}) (45) \p{Talu} \p{New_Tai_Lue} (= \p{Script_Extensions= New_Tai_Lue}) (NOT \p{Block= New_Tai_Lue}) (83) \p{Tamil} \p{Script_Extensions=Tamil} (Short: \p{Taml}; NOT \p{Block=Tamil}) (133) X \p{Tamil_Sup} \p{Tamil_Supplement} (= \p{Block= Tamil_Supplement}) (64) X \p{Tamil_Supplement} \p{Block=Tamil_Supplement} (Short: \p{InTamilSup}) (64) \p{Taml} \p{Tamil} (= \p{Script_Extensions=Tamil}) (NOT \p{Block=Tamil}) (133) 
\p{Tang} \p{Tangut} (= \p{Script_Extensions= Tangut}) (NOT \p{Block=Tangut}) (6914) \p{Tangut} \p{Script_Extensions=Tangut} (Short: \p{Tang}; NOT \p{Block=Tangut}) (6914) X \p{Tangut_Components} \p{Block=Tangut_Components} (768) X \p{Tangut_Sup} \p{Tangut_Supplement} (= \p{Block= Tangut_Supplement}) (144) X \p{Tangut_Supplement} \p{Block=Tangut_Supplement} (Short: \p{InTangutSup}) (144) \p{Tavt} \p{Tai_Viet} (= \p{Script_Extensions= Tai_Viet}) (NOT \p{Block=Tai_Viet}) (72) \p{Telu} \p{Telugu} (= \p{Script_Extensions= Telugu}) (NOT \p{Block=Telugu}) (104) \p{Telugu} \p{Script_Extensions=Telugu} (Short: \p{Telu}; NOT \p{Block=Telugu}) (104) \p{Term} \p{Terminal_Punctuation} (= \p{Terminal_Punctuation=Y}) (267) \p{Term: *} \p{Terminal_Punctuation: *} \p{Terminal_Punctuation} \p{Terminal_Punctuation=Y} (Short: \p{Term}) (267) \p{Terminal_Punctuation: N*} (Short: \p{Term=N}, \P{Term}) (1_113_845 plus all above-Unicode code points: [\x00-\x20\"#\$\%&\'\(\)*+\-\/0- 9<=>\@A-Z\[\\\]\^_`a-z\{\|\}~\x7f-\xff], U+0100..037D, U+037F..0386, U+0388..0588, U+058A..05C2, U+05C4..060B ...) \p{Terminal_Punctuation: Y*} (Short: \p{Term=Y}, \p{Term}) (267: [!,.:;?], U+037E, U+0387, U+0589, U+05C3, U+060C ...) 
\p{Tfng} \p{Tifinagh} (= \p{Script_Extensions= Tifinagh}) (NOT \p{Block=Tifinagh}) (59) \p{Tglg} \p{Tagalog} (= \p{Script_Extensions= Tagalog}) (NOT \p{Block=Tagalog}) (22) \p{Thaa} \p{Thaana} (= \p{Script_Extensions= Thaana}) (NOT \p{Block=Thaana}) (66) \p{Thaana} \p{Script_Extensions=Thaana} (Short: \p{Thaa}; NOT \p{Block=Thaana}) (66) \p{Thai} \p{Script_Extensions=Thai} (NOT \p{Block= Thai}) (86) \p{Tibetan} \p{Script_Extensions=Tibetan} (Short: \p{Tibt}; NOT \p{Block=Tibetan}) (207) \p{Tibt} \p{Tibetan} (= \p{Script_Extensions= Tibetan}) (NOT \p{Block=Tibetan}) (207) \p{Tifinagh} \p{Script_Extensions=Tifinagh} (Short: \p{Tfng}; NOT \p{Block=Tifinagh}) (59) \p{Tirh} \p{Tirhuta} (= \p{Script_Extensions= Tirhuta}) (NOT \p{Block=Tirhuta}) (97) \p{Tirhuta} \p{Script_Extensions=Tirhuta} (Short: \p{Tirh}; NOT \p{Block=Tirhuta}) (97) \p{Title} \p{Titlecase} (/i= Cased=Yes) (31) \p{Titlecase} (= \p{Gc=Lt}) (Short: \p{Title}; /i= Cased=Yes) (31: U+01C5, U+01C8, U+01CB, U+01F2, U+1F88..1F8F, U+1F98..1F9F ...) 
\p{Titlecase_Letter} \p{General_Category=Titlecase_Letter} (Short: \p{Lt}; /i= General_Category= Cased_Letter) (31) X \p{Transport_And_Map} \p{Transport_And_Map_Symbols} (= \p{Block= Transport_And_Map_Symbols}) (128) X \p{Transport_And_Map_Symbols} \p{Block=Transport_And_Map_Symbols} (Short: \p{InTransportAndMap}) (128) X \p{UCAS} \p{Unified_Canadian_Aboriginal_Syllabics} (= \p{Block= Unified_Canadian_Aboriginal_Syllabics}) (640) X \p{UCAS_Ext} \p{Unified_Canadian_Aboriginal_Syllabics_- Extended} (= \p{Block= Unified_Canadian_Aboriginal_Syllabics_- Extended}) (80) \p{Ugar} \p{Ugaritic} (= \p{Script_Extensions= Ugaritic}) (NOT \p{Block=Ugaritic}) (31) \p{Ugaritic} \p{Script_Extensions=Ugaritic} (Short: \p{Ugar}; NOT \p{Block=Ugaritic}) (31) \p{UIdeo} \p{Unified_Ideograph} (= \p{Unified_Ideograph=Y}) (92_856) \p{UIdeo: *} \p{Unified_Ideograph: *} \p{Unassigned} \p{General_Category=Unassigned} (Short: \p{Cn}) (830_672 plus all above-Unicode code points) \p{Unicode} \p{Any} (1_114_112) X \p{Unified_Canadian_Aboriginal_Syllabics} \p{Block= Unified_Canadian_Aboriginal_Syllabics} (Short: \p{InUCAS}) (640) X \p{Unified_Canadian_Aboriginal_Syllabics_Extended} \p{Block= Unified_Canadian_Aboriginal_Syllabics_- Extended} (Short: \p{InUCASExt}) (80) \p{Unified_Ideograph} \p{Unified_Ideograph=Y} (Short: \p{UIdeo}) (92_856) \p{Unified_Ideograph: N*} (Short: \p{UIdeo=N}, \P{UIdeo}) (1_021_256 plus all above-Unicode code points: U+0000..33FF, U+4DC0..4DFF, U+9FFD..FA0D, U+FA10, U+FA12, U+FA15..FA1E ...) \p{Unified_Ideograph: Y*} (Short: \p{UIdeo=Y}, \p{UIdeo}) (92_856: U+3400..4DBF, U+4E00..9FFC, U+FA0E..FA0F, U+FA11, U+FA13..FA14, U+FA1F ...) 
\p{Unknown} \p{Script_Extensions=Unknown} (Short: \p{Zzzz}) (970_188 plus all above- Unicode code points) \p{Upper} \p{XPosixUpper} (= \p{Uppercase=Y}) (/i= Cased=Yes) (1911) \p{Upper: *} \p{Uppercase: *} \p{Uppercase} \p{XPosixUpper} (= \p{Uppercase=Y}) (/i= Cased=Yes) (1911) \p{Uppercase: N*} (Short: \p{Upper=N}, \P{Upper}; /i= Cased= No) (1_112_201 plus all above-Unicode code points: [\x00-\x20!\"#\$\%&\' \(\)*+,\-.\/0-9:;<=>?\@\[\\\]\^_`a-z\{ \|\}~\x7f-\xbf\xd7\xdf-\xff], U+0101, U+0103, U+0105, U+0107, U+0109 ...) \p{Uppercase: Y*} (Short: \p{Upper=Y}, \p{Upper}; /i= Cased= Yes) (1911: [A-Z\xc0-\xd6\xd8-\xde], U+0100, U+0102, U+0104, U+0106, U+0108 ...) \p{Uppercase_Letter} \p{General_Category=Uppercase_Letter} (Short: \p{Lu}; /i= General_Category= Cased_Letter) (1791) \p{Vai} \p{Script_Extensions=Vai} (NOT \p{Block= Vai}) (300) \p{Vaii} \p{Vai} (= \p{Script_Extensions=Vai}) (NOT \p{Block=Vai}) (300) \p{Variation_Selector} \p{Variation_Selector=Y} (Short: \p{VS}; NOT \p{Variation_Selectors}) (259) \p{Variation_Selector: N*} (Short: \p{VS=N}, \P{VS}) (1_113_853 plus all above-Unicode code points: U+0000..180A, U+180E..FDFF, U+FE10..E00FF, U+E01F0..infinity) \p{Variation_Selector: Y*} (Short: \p{VS=Y}, \p{VS}) (259: U+180B..180D, U+FE00..FE0F, U+E0100..E01EF) X \p{Variation_Selectors} \p{Block=Variation_Selectors} (Short: \p{InVS}) (16) X \p{Variation_Selectors_Supplement} \p{Block= Variation_Selectors_Supplement} (Short: \p{InVSSup}) (240) X \p{Vedic_Ext} \p{Vedic_Extensions} (= \p{Block= Vedic_Extensions}) (48) X \p{Vedic_Extensions} \p{Block=Vedic_Extensions} (Short: \p{InVedicExt}) (48) X \p{Vertical_Forms} \p{Block=Vertical_Forms} (16) \p{Vertical_Orientation: R} \p{Vertical_Orientation=Rotated} (786_865 plus all above-Unicode code points) \p{Vertical_Orientation: Rotated} (Short: \p{Vo=R}) (786_865 plus all above-Unicode code points: [\x00- \xa6\xa8\xaa-\xad\xaf-\xb0\xb2-\xbb\xbf- \xd6\xd8-\xf6\xf8-\xff], U+0100..02E9, U+02EC..10FF, U+1200..1400, 
U+1680..18AF, U+1900..2015 ...) \p{Vertical_Orientation: Tr} \p{Vertical_Orientation= Transformed_Rotated} (47) \p{Vertical_Orientation: Transformed_Rotated} (Short: \p{Vo=Tr}) (47: U+2329..232A, U+3008..3011, U+3014..301F, U+3030, U+30A0, U+30FC ...) \p{Vertical_Orientation: Transformed_Upright} (Short: \p{Vo=Tu}) (148: U+3001..3002, U+3041, U+3043, U+3045, U+3047, U+3049 ...) \p{Vertical_Orientation: Tu} \p{Vertical_Orientation= Transformed_Upright} (148) \p{Vertical_Orientation: U} \p{Vertical_Orientation=Upright} (327_052) \p{Vertical_Orientation: Upright} (Short: \p{Vo=U}) (327_052: [\xa7\xa9\xae\xb1\xbc-\xbe\xd7\xf7], U+02EA..02EB, U+1100..11FF, U+1401..167F, U+18B0..18FF, U+2016 ...) \p{VertSpace} \v (7: [\n\cK\f\r\x85], U+2028..2029) \p{Vo: *} \p{Vertical_Orientation: *} \p{VS} \p{Variation_Selector} (= \p{Variation_Selector=Y}) (NOT \p{Variation_Selectors}) (259) \p{VS: *} \p{Variation_Selector: *} X \p{VS_Sup} \p{Variation_Selectors_Supplement} (= \p{Block= Variation_Selectors_Supplement}) (240) \p{Wancho} \p{Script_Extensions=Wancho} (Short: \p{Wcho}; NOT \p{Block=Wancho}) (59) \p{Wara} \p{Warang_Citi} (= \p{Script_Extensions= Warang_Citi}) (NOT \p{Block= Warang_Citi}) (84) \p{Warang_Citi} \p{Script_Extensions=Warang_Citi} (Short: \p{Wara}; NOT \p{Block=Warang_Citi}) (84) \p{WB: *} \p{Word_Break: *} \p{Wcho} \p{Wancho} (= \p{Script_Extensions= Wancho}) (NOT \p{Block=Wancho}) (59) \p{White_Space} \p{White_Space=Y} (Short: \p{Space}) (25) \p{White_Space: N*} (Short: \p{Space=N}, \P{Space}) (1_114_087 plus all above-Unicode code points: [^ \t\n\cK\f\r\x20\x85\xa0], U+0100..167F, U+1681..1FFF, U+200B..2027, U+202A..202E, U+2030..205E ...) \p{White_Space: Y*} (Short: \p{Space=Y}, \p{Space}) (25: [\t \n\cK\f\r\x20\x85\xa0], U+1680, U+2000..200A, U+2028..2029, U+202F, U+205F ...) 
\p{Word} \p{XPosixWord} (134_564) \p{Word_Break: ALetter} (Short: \p{WB=LE}) (28_854: [A-Za-z\xaa \xb5\xba\xc0-\xd6\xd8-\xf6\xf8-\xff], U+0100..02D7, U+02DE..02FF, U+0370..0374, U+0376..0377, U+037A..037D ...) \p{Word_Break: CR} (Short: \p{WB=CR}) (1: [\r]) \p{Word_Break: Double_Quote} (Short: \p{WB=DQ}) (1: [\"]) \p{Word_Break: DQ} \p{Word_Break=Double_Quote} (1) \p{Word_Break: E_Base} (Short: \p{WB=EB}) (0) \p{Word_Break: E_Base_GAZ} (Short: \p{WB=EBG}) (0) \p{Word_Break: E_Modifier} (Short: \p{WB=EM}) (0) \p{Word_Break: EB} \p{Word_Break=E_Base} (0) \p{Word_Break: EBG} \p{Word_Break=E_Base_GAZ} (0) \p{Word_Break: EM} \p{Word_Break=E_Modifier} (0) \p{Word_Break: EX} \p{Word_Break=ExtendNumLet} (11) \p{Word_Break: Extend} (Short: \p{WB=Extend}) (2399: U+0300..036F, U+0483..0489, U+0591..05BD, U+05BF, U+05C1..05C2, U+05C4..05C5 ...) \p{Word_Break: ExtendNumLet} (Short: \p{WB=EX}) (11: [_], U+202F, U+203F..2040, U+2054, U+FE33..FE34, U+FE4D..FE4F ...) \p{Word_Break: FO} \p{Word_Break=Format} (62) \p{Word_Break: Format} (Short: \p{WB=FO}) (62: [\xad], U+0600..0605, U+061C, U+06DD, U+070F, U+08E2 ...) \p{Word_Break: GAZ} \p{Word_Break=Glue_After_Zwj} (0) \p{Word_Break: Glue_After_Zwj} (Short: \p{WB=GAZ}) (0) \p{Word_Break: Hebrew_Letter} (Short: \p{WB=HL}) (75: U+05D0..05EA, U+05EF..05F2, U+FB1D, U+FB1F..FB28, U+FB2A..FB36, U+FB38..FB3C ...) \p{Word_Break: HL} \p{Word_Break=Hebrew_Letter} (75) \p{Word_Break: KA} \p{Word_Break=Katakana} (314) \p{Word_Break: Katakana} (Short: \p{WB=KA}) (314: U+3031..3035, U+309B..309C, U+30A0..30FA, U+30FC..30FF, U+31F0..31FF, U+32D0..32FE ...) \p{Word_Break: LE} \p{Word_Break=ALetter} (28_854) \p{Word_Break: LF} (Short: \p{WB=LF}) (1: [\n]) \p{Word_Break: MB} \p{Word_Break=MidNumLet} (7) \p{Word_Break: MidLetter} (Short: \p{WB=ML}) (9: [:\xb7], U+0387, U+055F, U+05F4, U+2027, U+FE13 ...) \p{Word_Break: MidNum} (Short: \p{WB=MN}) (15: [,;], U+037E, U+0589, U+060C..060D, U+066C, U+07F8 ...) 
\p{Word_Break: MidNumLet} (Short: \p{WB=MB}) (7: [.], U+2018..2019, U+2024, U+FE52, U+FF07, U+FF0E) \p{Word_Break: ML} \p{Word_Break=MidLetter} (9) \p{Word_Break: MN} \p{Word_Break=MidNum} (15) \p{Word_Break: Newline} (Short: \p{WB=NL}) (5: [\cK\f\x85], U+2028..2029) \p{Word_Break: NL} \p{Word_Break=Newline} (5) \p{Word_Break: NU} \p{Word_Break=Numeric} (651) \p{Word_Break: Numeric} (Short: \p{WB=NU}) (651: [0-9], U+0660..0669, U+066B, U+06F0..06F9, U+07C0..07C9, U+0966..096F ...) \p{Word_Break: Other} (Short: \p{WB=XX}) (1_081_665 plus all above-Unicode code points: [^\n\cK\f\r \x20\"\',.0-9:;A-Z_a-z\x85\xaa\xad\xb5 \xb7\xba\xc0-\xd6\xd8-\xf6\xf8-\xff], U+02D8..02DD, U+0375, U+0378..0379, U+0380..0385, U+038B ...) \p{Word_Break: Regional_Indicator} (Short: \p{WB=RI}) (26: U+1F1E6..1F1FF) \p{Word_Break: RI} \p{Word_Break=Regional_Indicator} (26) \p{Word_Break: Single_Quote} (Short: \p{WB=SQ}) (1: [\']) \p{Word_Break: SQ} \p{Word_Break=Single_Quote} (1) \p{Word_Break: WSegSpace} (Short: \p{WB=WSegSpace}) (14: [\x20], U+1680, U+2000..2006, U+2008..200A, U+205F, U+3000) \p{Word_Break: XX} \p{Word_Break=Other} (1_081_665 plus all above-Unicode code points) \p{Word_Break: ZWJ} (Short: \p{WB=ZWJ}) (1: U+200D) \p{WSpace} \p{White_Space} (= \p{White_Space=Y}) (25) \p{WSpace: *} \p{White_Space: *} \p{XDigit} \p{XPosixXDigit} (= \p{Hex_Digit=Y}) (44) \p{XID_Continue} \p{XID_Continue=Y} (Short: \p{XIDC}) (134_415) \p{XID_Continue: N*} (Short: \p{XIDC=N}, \P{XIDC}) (979_697 plus all above-Unicode code points: [\x00-\x20!\"#\$\%&\'\(\)*+,\-.\/:;<=>? \@\[\\\]\^`\{\|\}~\x7f-\xa9\xab-\xb4 \xb6\xb8-\xb9\xbb-\xbf\xd7\xf7], U+02C2..02C5, U+02D2..02DF, U+02E5..02EB, U+02ED, U+02EF..02FF ...) \p{XID_Continue: Y*} (Short: \p{XIDC=Y}, \p{XIDC}) (134_415: [0-9A-Z_a-z\xaa\xb5\xb7\xba\xc0-\xd6 \xd8-\xf6\xf8-\xff], U+0100..02C1, U+02C6..02D1, U+02E0..02E4, U+02EC, U+02EE ...) 
\p{XID_Start} \p{XID_Start=Y} (Short: \p{XIDS}) (131_459) \p{XID_Start: N*} (Short: \p{XIDS=N}, \P{XIDS}) (982_653 plus all above-Unicode code points: [\x00-\x20!\"#\$\%&\'\(\)*+,\-.\/0-9:;<= >?\@\[\\\]\^_`\{\|\}~\x7f-\xa9\xab-\xb4 \xb6-\xb9\xbb-\xbf\xd7\xf7], U+02C2..02C5, U+02D2..02DF, U+02E5..02EB, U+02ED, U+02EF..036F ...) \p{XID_Start: Y*} (Short: \p{XIDS=Y}, \p{XIDS}) (131_459: [A-Za-z\xaa\xb5\xba\xc0-\xd6\xd8-\xf6 \xf8-\xff], U+0100..02C1, U+02C6..02D1, U+02E0..02E4, U+02EC, U+02EE ...) \p{XIDC} \p{XID_Continue} (= \p{XID_Continue=Y}) (134_415) \p{XIDC: *} \p{XID_Continue: *} \p{XIDS} \p{XID_Start} (= \p{XID_Start=Y}) (131_459) \p{XIDS: *} \p{XID_Start: *} \p{Xpeo} \p{Old_Persian} (= \p{Script_Extensions= Old_Persian}) (NOT \p{Block= Old_Persian}) (50) \p{XPerlSpace} \p{XPosixSpace} (25) \p{XPosixAlnum} Alphabetic and (decimal) Numeric (Short: \p{Alnum}) (133_525: [0-9A-Za-z\xaa\xb5 \xba\xc0-\xd6\xd8-\xf6\xf8-\xff], U+0100..02C1, U+02C6..02D1, U+02E0..02E4, U+02EC, U+02EE ...) \p{XPosixAlpha} \p{Alphabetic=Y} (Short: \p{Alpha}) (132_875) \p{XPosixBlank} \h, Horizontal white space (Short: \p{Blank}) (18: [\t\x20\xa0], U+1680, U+2000..200A, U+202F, U+205F, U+3000) \p{XPosixCntrl} \p{General_Category=Control} Control characters (Short: \p{Cc}) (65) \p{XPosixDigit} \p{General_Category=Decimal_Number} [0-9] + all other decimal digits (Short: \p{Nd}) (650) \p{XPosixGraph} Characters that are graphical (Short: \p{Graph}) (281_308: [!\"#\$\%&\' \(\)*+,\-.\/0-9:;<=>?\@A-Z\[\\\]\^_`a-z \{\|\}~\xa1-\xff], U+0100..0377, U+037A..037F, U+0384..038A, U+038C, U+038E..03A1 ...) \p{XPosixLower} \p{Lowercase=Y} (Short: \p{Lower}; /i= Cased=Yes) (2344) \p{XPosixPrint} Characters that are graphical plus space characters (but no controls) (Short: \p{Print}) (281_325: [\x20-\x7e\xa0- \xff], U+0100..0377, U+037A..037F, U+0384..038A, U+038C, U+038E..03A1 ...) 
\p{XPosixPunct} \p{Punct} + ASCII-range \p{Symbol} (807: [!\"#\$\%&\'\(\)*+,\-.\/:;<=>?\@\[\\\] \^_`\{\|\}~\xa1\xa7\xab\xb6-\xb7\xbb \xbf], U+037E, U+0387, U+055A..055F, U+0589..058A, U+05BE ...) \p{XPosixSpace} \s including beyond ASCII and vertical tab (Short: \p{SpacePerl}) (25: [\t\n\cK\f \r\x20\x85\xa0], U+1680, U+2000..200A, U+2028..2029, U+202F, U+205F ...) \p{XPosixUpper} \p{Uppercase=Y} (Short: \p{Upper}; /i= Cased=Yes) (1911) \p{XPosixWord} \w, including beyond ASCII; = \p{Alnum} + \pM + \p{Pc} + \p{Join_Control} (Short: \p{Word}) (134_564: [0-9A-Z_a-z\xaa\xb5 \xba\xc0-\xd6\xd8-\xf6\xf8-\xff], U+0100..02C1, U+02C6..02D1, U+02E0..02E4, U+02EC, U+02EE ...) \p{XPosixXDigit} \p{Hex_Digit=Y} (Short: \p{Hex}) (44) \p{Xsux} \p{Cuneiform} (= \p{Script_Extensions= Cuneiform}) (NOT \p{Block=Cuneiform}) (1234) \p{Yezi} \p{Yezidi} (= \p{Script_Extensions= Yezidi}) (NOT \p{Block=Yezidi}) (60) \p{Yezidi} \p{Script_Extensions=Yezidi} (Short: \p{Yezi}; NOT \p{Block=Yezidi}) (60) \p{Yi} \p{Script_Extensions=Yi} (1246) X \p{Yi_Radicals} \p{Block=Yi_Radicals} (64) X \p{Yi_Syllables} \p{Block=Yi_Syllables} (1168) \p{Yiii} \p{Yi} (= \p{Script_Extensions=Yi}) (1246) X \p{Yijing} \p{Yijing_Hexagram_Symbols} (= \p{Block= Yijing_Hexagram_Symbols}) (64) X \p{Yijing_Hexagram_Symbols} \p{Block=Yijing_Hexagram_Symbols} (Short: \p{InYijing}) (64) \p{Z} \pZ \p{Separator} (= \p{General_Category= Separator}) (19) \p{Zanabazar_Square} \p{Script_Extensions=Zanabazar_Square} (Short: \p{Zanb}; NOT \p{Block= Zanabazar_Square}) (72) \p{Zanb} \p{Zanabazar_Square} (= \p{Script_Extensions=Zanabazar_Square}) (NOT \p{Block=Zanabazar_Square}) (72) \p{Zinh} \p{Inherited} (= \p{Script_Extensions= Inherited}) (503) \p{Zl} \p{Line_Separator} (= \p{General_Category= Line_Separator}) (1) \p{Zp} \p{Paragraph_Separator} (= \p{General_Category= Paragraph_Separator}) (1) \p{Zs} \p{Space_Separator} (= \p{General_Category=Space_Separator}) (17) \p{Zyyy} \p{Common} (= \p{Script_Extensions= Common}) (7661) 
\p{Zzzz} \p{Unknown} (= \p{Script_Extensions= Unknown}) (970_188 plus all above- Unicode code points) =head2 Legal C<\p{}> and C<\P{}> constructs that match no characters Unicode has some property-value pairs that currently don't match anything. This happens generally either because they are obsolete, or they exist for symmetry with other forms, but no language has yet been encoded that uses them. In this version of Unicode, the following match zero code points: =over 4 =item \p{Canonical_Combining_Class=Attached_Below_Left} =item \p{Canonical_Combining_Class=CCC133} =item \p{Grapheme_Cluster_Break=E_Base} =item \p{Grapheme_Cluster_Break=E_Base_GAZ} =item \p{Grapheme_Cluster_Break=E_Modifier} =item \p{Grapheme_Cluster_Break=Glue_After_Zwj} =item \p{Word_Break=E_Base} =item \p{Word_Break=E_Base_GAZ} =item \p{Word_Break=E_Modifier} =item \p{Word_Break=Glue_After_Zwj} =back =head1 Properties accessible through Unicode::UCD The value of any Unicode (not including Perl extensions) character property mentioned above for any single code point is available through L<Unicode::UCD/charprop()>. L<Unicode::UCD/charprops_all()> returns the values of all the Unicode properties for a given code point. Besides these, all the Unicode character properties mentioned above (except for those marked as for internal use by Perl) are also accessible by L<Unicode::UCD/prop_invlist()>. Due to their nature, not all Unicode character properties are suitable for regular expression matches, nor C<prop_invlist()>. The remaining non-provisional, non-internal ones are accessible via L<Unicode::UCD/prop_invmap()> (except for those that this Perl installation hasn't included; see L<below for which those are|/Unicode character properties that are NOT accepted by Perl>). For compatibility with other parts of Perl, all the single forms given in the table in the L<section above|/Properties accessible through \p{} and \P{}> are recognized. 
BUT, there are some ambiguities between some Perl extensions and the Unicode properties, all of which are silently resolved in favor of the official Unicode property. To avoid surprises, you should only use C<prop_invmap()> for forms listed in the table below, which omits the non-recommended ones. The affected forms are the Perl single form equivalents of Unicode properties, such as C<\p{sc}> being a single-form equivalent of C<\p{gc=sc}>, which is treated by C<prop_invmap()> as the C<Script> property, whose short name is C<sc>. The table indicates the current ambiguities in the INFO column, beginning with the word C<"NOT">. The standard Unicode properties listed below are documented in L<http://www.unicode.org/reports/tr44/>; Perl_Decimal_Digit is documented in L<Unicode::UCD/prop_invmap()>. The other Perl extensions are in L<perlunicode/Other Properties>; The first column in the table is a name for the property; the second column is an alternative name, if any, plus possibly some annotations. The alternative name is the property's full name, unless that would simply repeat the first column, in which case the second column indicates the property's short name (if different). The annotations are given only in the entry for the full name. The annotations for binary properties include a list of the first few ranges that the property matches. To avoid any ambiguity, the SPACE character is represented as C<\x20>. If a property is obsolete, etc, the entry will be flagged with the same characters used in the table in the L<section above|/Properties accessible through \p{} and \P{}>, like B<D> or B<S>. NAME INFO Age AHex ASCII_Hex_Digit All (Perl extension). All code points, including those above Unicode. Same as qr/./s. U+0000..infinity Alnum XPosixAlnum. (Perl extension) Alpha Alphabetic Alphabetic (Short: Alpha). [A-Za-z\xaa\xb5\xba\xc0- \xd6\xd8-\xf6\xf8-\xff], U+0100..02C1, U+02C6..02D1, U+02E0..02E4, U+02EC, U+02EE ... Any (Perl extension). All Unicode code points. 
U+0000..10FFFF ASCII Block=Basic_Latin. (Perl extension). [\x00-\x7f] ASCII_Hex_Digit (Short: AHex). [0-9A-Fa-f] Assigned (Perl extension). All assigned code points. U+0000..0377, U+037A..037F, U+0384..038A, U+038C, U+038E..03A1, U+03A3..052F ... Bc Bidi_Class Bidi_C Bidi_Control Bidi_Class (Short: bc) Bidi_Control (Short: Bidi_C). U+061C, U+200E..200F, U+202A..202E, U+2066..2069 Bidi_M Bidi_Mirrored Bidi_Mirrored (Short: Bidi_M). [\(\)<>\[\]\{\}\xab \xbb], U+0F3A..0F3D, U+169B..169C, U+2039..203A, U+2045..2046, U+207D..207E ... Bidi_Mirroring_Glyph (Short: bmg) Bidi_Paired_Bracket (Short: bpb) Bidi_Paired_Bracket_Type (Short: bpt) Blank XPosixBlank. (Perl extension) Blk Block Block (Short: blk) Bmg Bidi_Mirroring_Glyph Bpb Bidi_Paired_Bracket Bpt Bidi_Paired_Bracket_Type Canonical_Combining_Class (Short: ccc) Case_Folding (Short: cf) Case_Ignorable (Short: CI). [\'.:\^`\xa8\xad\xaf\xb4 \xb7-\xb8], U+02B0..036F, U+0374..0375, U+037A, U+0384..0385, U+0387 ... Cased [A-Za-z\xaa\xb5\xba\xc0-\xd6\xd8-\xf6\xf8- \xff], U+0100..01BA, U+01BC..01BF, U+01C4..0293, U+0295..02B8, U+02C0..02C1 ... Category General_Category Ccc Canonical_Combining_Class CE Composition_Exclusion Cf Case_Folding; NOT 'cf' meaning 'General_Category=Format' Changes_When_Casefolded (Short: CWCF). [A-Z\xb5\xc0-\xd6\xd8- \xdf], U+0100, U+0102, U+0104, U+0106, U+0108 ... Changes_When_Casemapped (Short: CWCM). [A-Za-z\xb5\xc0-\xd6\xd8- \xf6\xf8-\xff], U+0100..0137, U+0139..018C, U+018E..019A, U+019C..01A9, U+01AC..01B9 ... Changes_When_Lowercased (Short: CWL). [A-Z\xc0-\xd6\xd8-\xde], U+0100, U+0102, U+0104, U+0106, U+0108 ... Changes_When_NFKC_Casefolded (Short: CWKCF). [A-Z\xa0\xa8\xaa \xad\xaf\xb2-\xb5\xb8-\xba\xbc-\xbe\xc0- \xd6\xd8-\xdf], U+0100, U+0102, U+0104, U+0106, U+0108 ... Changes_When_Titlecased (Short: CWT). [a-z\xb5\xdf-\xf6\xf8- \xff], U+0101, U+0103, U+0105, U+0107, U+0109 ... Changes_When_Uppercased (Short: CWU). [a-z\xb5\xdf-\xf6\xf8- \xff], U+0101, U+0103, U+0105, U+0107, U+0109 ... 
CI Case_Ignorable Cntrl XPosixCntrl (=General_Category=Control). (Perl extension) Comp_Ex Full_Composition_Exclusion Composition_Exclusion (Short: CE). U+0958..095F, U+09DC..09DD, U+09DF, U+0A33, U+0A36, U+0A59..0A5B ... CWCF Changes_When_Casefolded CWCM Changes_When_Casemapped CWKCF Changes_When_NFKC_Casefolded CWL Changes_When_Lowercased CWT Changes_When_Titlecased CWU Changes_When_Uppercased Dash [\-], U+058A, U+05BE, U+1400, U+1806, U+2010..2015 ... Decomposition_Mapping (Short: dm) Decomposition_Type (Short: dt) Default_Ignorable_Code_Point (Short: DI). [\xad], U+034F, U+061C, U+115F..1160, U+17B4..17B5, U+180B..180E ... Dep Deprecated Deprecated (Short: Dep). U+0149, U+0673, U+0F77, U+0F79, U+17A3..17A4, U+206A..206F ... DI Default_Ignorable_Code_Point Dia Diacritic Diacritic (Short: Dia). [\^`\xa8\xaf\xb4\xb7-\xb8], U+02B0..034E, U+0350..0357, U+035D..0362, U+0374..0375, U+037A ... Digit XPosixDigit (=General_Category= Decimal_Number). (Perl extension) Dm Decomposition_Mapping Dt Decomposition_Type Ea East_Asian_Width East_Asian_Width (Short: ea) EBase Emoji_Modifier_Base EComp Emoji_Component EMod Emoji_Modifier Emoji [#*0-9\xa9\xae], U+203C, U+2049, U+2122, U+2139, U+2194..2199 ... Emoji_Component (Short: EComp). [#*0-9], U+200D, U+20E3, U+FE0F, U+1F1E6..1F1FF, U+1F3FB..1F3FF ... Emoji_Modifier (Short: EMod). U+1F3FB..1F3FF Emoji_Modifier_Base (Short: EBase). U+261D, U+26F9, U+270A..270D, U+1F385, U+1F3C2..1F3C4, U+1F3C7 ... Emoji_Presentation (Short: EPres). U+231A..231B, U+23E9..23EC, U+23F0, U+23F3, U+25FD..25FE, U+2614..2615 ... EPres Emoji_Presentation EqUIdeo Equivalent_Unified_Ideograph Equivalent_Unified_Ideograph (Short: EqUIdeo) Ext Extender Extended_Pictographic (Short: ExtPict). [\xa9\xae], U+203C, U+2049, U+2122, U+2139, U+2194..2199 ... Extender (Short: Ext). [\xb7], U+02D0..02D1, U+0640, U+07FA, U+0B55, U+0E46 ... ExtPict Extended_Pictographic Full_Composition_Exclusion (Short: Comp_Ex). 
U+0340..0341, U+0343..0344, U+0374, U+037E, U+0387, U+0958..095F ... Gc General_Category GCB Grapheme_Cluster_Break General_Category (Short: gc) Gr_Base Grapheme_Base Gr_Ext Grapheme_Extend Graph XPosixGraph. (Perl extension) Grapheme_Base (Short: Gr_Base). [\x20-\x7e\xa0-\xac \xae-\xff], U+0100..02FF, U+0370..0377, U+037A..037F, U+0384..038A, U+038C ... Grapheme_Cluster_Break (Short: GCB) Grapheme_Extend (Short: Gr_Ext). U+0300..036F, U+0483..0489, U+0591..05BD, U+05BF, U+05C1..05C2, U+05C4..05C5 ... Hangul_Syllable_Type (Short: hst) Hex Hex_Digit Hex_Digit (Short: Hex). [0-9A-Fa-f], U+FF10..FF19, U+FF21..FF26, U+FF41..FF46 HorizSpace XPosixBlank. (Perl extension) Hst Hangul_Syllable_Type D Hyphen [\-\xad], U+058A, U+1806, U+2010..2011, U+2E17, U+30FB ... Supplanted by Line_Break property values; see www.unicode.org/reports/tr14 ID_Continue (Short: IDC). [0-9A-Z_a-z\xaa\xb5\xb7 \xba\xc0-\xd6\xd8-\xf6\xf8-\xff], U+0100..02C1, U+02C6..02D1, U+02E0..02E4, U+02EC, U+02EE ... ID_Start (Short: IDS). [A-Za-z\xaa\xb5\xba\xc0- \xd6\xd8-\xf6\xf8-\xff], U+0100..02C1, U+02C6..02D1, U+02E0..02E4, U+02EC, U+02EE ... IDC ID_Continue Identifier_Status Identifier_Type Ideo Ideographic Ideographic (Short: Ideo). U+3006..3007, U+3021..3029, U+3038..303A, U+3400..4DBF, U+4E00..9FFC, U+F900..FA6D ... IDS ID_Start IDS_Binary_Operator (Short: IDSB). U+2FF0..2FF1, U+2FF4..2FFB IDS_Trinary_Operator (Short: IDST). U+2FF2..2FF3 IDSB IDS_Binary_Operator IDST IDS_Trinary_Operator In Present_In. (Perl extension) Indic_Positional_Category (Short: InPC) Indic_Syllabic_Category (Short: InSC) InPC Indic_Positional_Category InSC Indic_Syllabic_Category Isc ISO_Comment; NOT 'isc' meaning 'General_Category=Other' ISO_Comment (Short: isc) Jg Joining_Group Join_C Join_Control Join_Control (Short: Join_C). 
U+200C..200D Joining_Group (Short: jg) Joining_Type (Short: jt) Jt Joining_Type Lb Line_Break Lc Lowercase_Mapping; NOT 'lc' meaning 'General_Category=Cased_Letter' Line_Break (Short: lb) LOE Logical_Order_Exception Logical_Order_Exception (Short: LOE). U+0E40..0E44, U+0EC0..0EC4, U+19B5..19B7, U+19BA, U+AAB5..AAB6, U+AAB9 ... Lower Lowercase Lowercase (Short: Lower). [a-z\xaa\xb5\xba\xdf- \xf6\xf8-\xff], U+0101, U+0103, U+0105, U+0107, U+0109 ... Lowercase_Mapping (Short: lc) Math [+<=>\^\|~\xac\xb1\xd7\xf7], U+03D0..03D2, U+03D5, U+03F0..03F1, U+03F4..03F6, U+0606..0608 ... Na Name Na1 Unicode_1_Name Name (Short: na) Name_Alias NChar Noncharacter_Code_Point NFC_QC NFC_Quick_Check NFC_Quick_Check (Short: NFC_QC) NFD_QC NFD_Quick_Check NFD_Quick_Check (Short: NFD_QC) NFKC_Casefold (Short: NFKC_CF) NFKC_CF NFKC_Casefold NFKC_QC NFKC_Quick_Check NFKC_Quick_Check (Short: NFKC_QC) NFKD_QC NFKD_Quick_Check NFKD_Quick_Check (Short: NFKD_QC) Noncharacter_Code_Point (Short: NChar). U+FDD0..FDEF, U+FFFE..FFFF, U+1FFFE..1FFFF, U+2FFFE..2FFFF, U+3FFFE..3FFFF, U+4FFFE..4FFFF ... Nt Numeric_Type Numeric_Type (Short: nt) Numeric_Value (Short: nv) Nv Numeric_Value Pat_Syn Pattern_Syntax Pat_WS Pattern_White_Space Pattern_Syntax (Short: Pat_Syn). [!\"#\$\%&\'\(\)*+,\-. \/:;<=>?\@\[\\\]\^`\{\|\}~\xa1-\xa7\xa9 \xab-\xac\xae\xb0-\xb1\xb6\xbb\xbf\xd7 \xf7], U+2010..2027, U+2030..203E, U+2041..2053, U+2055..205E, U+2190..245F ... Pattern_White_Space (Short: Pat_WS). [\t\n\cK\f\r\x20\x85], U+200E..200F, U+2028..2029 PCM Prepended_Concatenation_Mark Perl_Decimal_Digit (Perl extension) PerlSpace PosixSpace. (Perl extension) PerlWord PosixWord. (Perl extension) PosixAlnum (Perl extension). [0-9A-Za-z] PosixAlpha (Perl extension). [A-Za-z] PosixBlank (Perl extension). [\t\x20] PosixCntrl (Perl extension). ASCII control characters. 
ACK, BEL, BS, CAN, CR, DC1, DC2, DC3, DC4, DEL, DLE, ENQ, EOM, EOT, ESC, ETB, ETX, FF, FS, GS, HT, LF, NAK, NUL, RS, SI, SO, SOH, STX, SUB, SYN, US, VT PosixDigit (Perl extension). [0-9] PosixGraph (Perl extension). [!\"#\$\%&\'\(\)*+,\-. \/0-9:;<=>?\@A-Z\[\\\]\^_`a-z\{\|\}~] PosixLower (Perl extension). [a-z] PosixPrint (Perl extension). [\x20-\x7e] PosixPunct (Perl extension). [!\"#\$\%&\'\(\)*+,\-. \/:;<=>?\@\[\\\]\^_`\{\|\}~] PosixSpace (Perl extension). [\t\n\cK\f\r\x20] PosixUpper (Perl extension). [A-Z] PosixWord (Perl extension). \w, restricted to ASCII. [0-9A-Z_a-z] PosixXDigit ASCII_Hex_Digit. (Perl extension). [0-9A-Fa-f] Prepended_Concatenation_Mark (Short: PCM). U+0600..0605, U+06DD, U+070F, U+08E2, U+110BD, U+110CD Present_In (Short: In). (Perl extension) Print XPosixPrint. (Perl extension) Punct General_Category=Punctuation. (Perl extension). [!\"#\%&\'\(\)*,\-.\/:;?\@ \[\\\]_\{\}\xa1\xa7\xab\xb6-\xb7\xbb\xbf], U+037E, U+0387, U+055A..055F, U+0589..058A, U+05BE ... QMark Quotation_Mark Quotation_Mark (Short: QMark). [\"\'\xab\xbb], U+2018..201F, U+2039..203A, U+2E42, U+300C..300F, U+301D..301F ... Radical U+2E80..2E99, U+2E9B..2EF3, U+2F00..2FD5 Regional_Indicator (Short: RI). U+1F1E6..1F1FF RI Regional_Indicator SB Sentence_Break Sc Script; NOT 'sc' meaning 'General_Category=Currency_Symbol' Scf Simple_Case_Folding Script (Short: sc) Script_Extensions (Short: scx) Scx Script_Extensions SD Soft_Dotted Sentence_Break (Short: SB) Sentence_Terminal (Short: STerm). [!.?], U+0589, U+061E..061F, U+06D4, U+0700..0702, U+07F9 ... Sfc Simple_Case_Folding Simple_Case_Folding (Short: scf) Simple_Lowercase_Mapping (Short: slc) Simple_Titlecase_Mapping (Short: stc) Simple_Uppercase_Mapping (Short: suc) Slc Simple_Lowercase_Mapping Soft_Dotted (Short: SD). [i-j], U+012F, U+0249, U+0268, U+029D, U+02B2 ... Space White_Space SpacePerl XPosixSpace. 
(Perl extension) Stc Simple_Titlecase_Mapping STerm Sentence_Terminal Suc Simple_Uppercase_Mapping Tc Titlecase_Mapping Term Terminal_Punctuation Terminal_Punctuation (Short: Term). [!,.:;?], U+037E, U+0387, U+0589, U+05C3, U+060C ... Title Titlecase. (Perl extension) Titlecase (Short: Title). (Perl extension). (= \p{Gc=Lt}). U+01C5, U+01C8, U+01CB, U+01F2, U+1F88..1F8F, U+1F98..1F9F ... Titlecase_Mapping (Short: tc) Uc Uppercase_Mapping UIdeo Unified_Ideograph Unicode Any. (Perl extension) Unicode_1_Name (Short: na1) Unified_Ideograph (Short: UIdeo). U+3400..4DBF, U+4E00..9FFC, U+FA0E..FA0F, U+FA11, U+FA13..FA14, U+FA1F ... Upper Uppercase Uppercase (Short: Upper). [A-Z\xc0-\xd6\xd8-\xde], U+0100, U+0102, U+0104, U+0106, U+0108 ... Uppercase_Mapping (Short: uc) Variation_Selector (Short: VS). U+180B..180D, U+FE00..FE0F, U+E0100..E01EF Vertical_Orientation (Short: vo) VertSpace (Perl extension). \v. [\n\cK\f\r\x85], U+2028..2029 Vo Vertical_Orientation VS Variation_Selector WB Word_Break White_Space (Short: WSpace). [\t\n\cK\f\r\x20\x85 \xa0], U+1680, U+2000..200A, U+2028..2029, U+202F, U+205F ... Word XPosixWord. (Perl extension) Word_Break (Short: WB) WSpace White_Space XDigit XPosixXDigit (=Hex_Digit). (Perl extension) XID_Continue (Short: XIDC). [0-9A-Z_a-z\xaa\xb5\xb7 \xba\xc0-\xd6\xd8-\xf6\xf8-\xff], U+0100..02C1, U+02C6..02D1, U+02E0..02E4, U+02EC, U+02EE ... XID_Start (Short: XIDS). [A-Za-z\xaa\xb5\xba\xc0- \xd6\xd8-\xf6\xf8-\xff], U+0100..02C1, U+02C6..02D1, U+02E0..02E4, U+02EC, U+02EE ... XIDC XID_Continue XIDS XID_Start XPerlSpace XPosixSpace. (Perl extension) XPosixAlnum (Short: Alnum). (Perl extension). Alphabetic and (decimal) Numeric. [0-9A- Za-z\xaa\xb5\xba\xc0-\xd6\xd8-\xf6\xf8- \xff], U+0100..02C1, U+02C6..02D1, U+02E0..02E4, U+02EC, U+02EE ... XPosixAlpha Alphabetic. (Perl extension). [A-Za-z \xaa\xb5\xba\xc0-\xd6\xd8-\xf6\xf8-\xff], U+0100..02C1, U+02C6..02D1, U+02E0..02E4, U+02EC, U+02EE ... XPosixBlank (Short: Blank). (Perl extension). 
\h, Horizontal white space. [\t\x20\xa0], U+1680, U+2000..200A, U+202F, U+205F, U+3000 XPosixCntrl General_Category=Control (Short: Cntrl). (Perl extension). Control characters. [\x00-\x1f\x7f-\x9f] XPosixDigit General_Category=Decimal_Number (Short: Digit). (Perl extension). [0-9] + all other decimal digits. [0-9], U+0660..0669, U+06F0..06F9, U+07C0..07C9, U+0966..096F, U+09E6..09EF ... XPosixGraph (Short: Graph). (Perl extension). Characters that are graphical. [!\"#\$ \%&\'\(\)*+,\-.\/0-9:;<=>?\@A-Z\[\\\] \^_`a-z\{\|\}~\xa1-\xff], U+0100..0377, U+037A..037F, U+0384..038A, U+038C, U+038E..03A1 ... XPosixLower Lowercase. (Perl extension). [a-z\xaa \xb5\xba\xdf-\xf6\xf8-\xff], U+0101, U+0103, U+0105, U+0107, U+0109 ... XPosixPrint (Short: Print). (Perl extension). Characters that are graphical plus space characters (but no controls). [\x20-\x7e \xa0-\xff], U+0100..0377, U+037A..037F, U+0384..038A, U+038C, U+038E..03A1 ... XPosixPunct (Perl extension). \p{Punct} + ASCII-range \p{Symbol}. [!\"#\$\%&\'\(\)*+,\-.\/:;<= >?\@\[\\\]\^_`\{\|\}~\xa1\xa7\xab\xb6- \xb7\xbb\xbf], U+037E, U+0387, U+055A..055F, U+0589..058A, U+05BE ... XPosixSpace (Perl extension). \s including beyond ASCII and vertical tab. [\t\n\cK\f\r\x20 \x85\xa0], U+1680, U+2000..200A, U+2028..2029, U+202F, U+205F ... XPosixUpper Uppercase. (Perl extension). [A-Z\xc0- \xd6\xd8-\xde], U+0100, U+0102, U+0104, U+0106, U+0108 ... XPosixWord (Short: Word). (Perl extension). \w, including beyond ASCII; = \p{Alnum} + \pM + \p{Pc} + \p{Join_Control}. [0-9A-Z_a-z \xaa\xb5\xba\xc0-\xd6\xd8-\xf6\xf8-\xff], U+0100..02C1, U+02C6..02D1, U+02E0..02E4, U+02EC, U+02EE ... XPosixXDigit Hex_Digit (Short: XDigit). (Perl extension). [0-9A-Fa-f], U+FF10..FF19, U+FF21..FF26, U+FF41..FF46 =head1 Properties accessible through other means Certain properties are accessible also via core function calls. 
These are: Lowercase_Mapping lc() and lcfirst() Titlecase_Mapping ucfirst() Uppercase_Mapping uc() Also, Case_Folding is accessible through the C</i> modifier in regular expressions, the C<\F> transliteration escape, and the C<L<fc|perlfunc/fc>> operator. Besides being able to say C<\p{Name=...}>, the Name and Name_Alias properties are accessible through the C<\N{}> interpolation in double-quoted strings and regular expressions; and functions C<charnames::viacode()>, C<charnames::vianame()>, and C<charnames::string_vianame()> (which require a C<use charnames ();> to be specified). Finally, most properties related to decomposition are accessible via L<Unicode::Normalize>. =head1 Unicode character properties that are NOT accepted by Perl Perl will generate an error for a few character properties in Unicode when used in a regular expression. The non-Unihan ones are listed below, with the reasons they are not accepted, perhaps with work-arounds. The short names for the properties are listed enclosed in (parentheses). As described after the list, an installation can change the defaults and choose to accept any of these. The list is machine generated based on the choices made for the installation that generated this document. =over 4 =item I<Expands_On_NFC> (XO_NFC) =item I<Expands_On_NFD> (XO_NFD) =item I<Expands_On_NFKC> (XO_NFKC) =item I<Expands_On_NFKD> (XO_NFKD) Deprecated by Unicode. These are characters that expand to more than one character in the specified normalization form, but whether they actually take up more bytes or not depends on the encoding being used. For example, a UTF-8 encoded character may expand to a different number of bytes than a UTF-32 encoded character. 
=item I<Grapheme_Link> (Gr_Link) Duplicates ccc=vr (Canonical_Combining_Class=Virama) =item I<Jamo_Short_Name> (JSN) =item I<Other_Alphabetic> (OAlpha) =item I<Other_Default_Ignorable_Code_Point> (ODI) =item I<Other_Grapheme_Extend> (OGr_Ext) =item I<Other_ID_Continue> (OIDC) =item I<Other_ID_Start> (OIDS) =item I<Other_Lowercase> (OLower) =item I<Other_Math> (OMath) =item I<Other_Uppercase> (OUpper) Used by Unicode internally for generating other properties and not intended to be used stand-alone =item I<Script=Katakana_Or_Hiragana> (sc=Hrkt) Obsolete. All code points previously matched by this have been moved to "Script=Common". Consider instead using "Script_Extensions=Katakana" or "Script_Extensions=Hiragana" (or both) =item I<Script_Extensions=Katakana_Or_Hiragana> (scx=Hrkt) All code points that would be matched by this are matched by either "Script_Extensions=Katakana" or "Script_Extensions=Hiragana" =back An installation can choose to allow any of these to be matched by downloading the Unicode database from L<http://www.unicode.org/Public/> to C<$Config{privlib}>/F<unicore/> in the Perl source tree, changing the controlling lists contained in the program C<$Config{privlib}>/F<unicore/mktables> and then re-compiling and installing. (C<%Config> is available from the Config module). Also, perl can be recompiled to operate on an earlier version of the Unicode standard. Further information is at C<$Config{privlib}>/F<unicore/README.perl>. =head1 Other information in the Unicode data base The Unicode data base is delivered in two different formats. The XML version is valid for more modern Unicode releases. The other version is a collection of files. The two are intended to give equivalent information. Perl uses the older form; this allows you to recompile Perl to use early Unicode releases. The only non-character property that Perl currently supports is Named Sequences, in which a sequence of code points is given a name and generally treated as a single entity. 
(Perl supports these via the C<\N{...}> double-quotish construct, L<charnames/charnames::string_vianame(name)>, and L<Unicode::UCD/namedseq()>.) Below is a list of the files in the Unicode data base that Perl doesn't currently use, along with very brief descriptions of their purposes. Some of the names of the files have been shortened from those that Unicode uses, in order to allow them to be distinguishable from similarly named files on file systems for which only the first 8 characters of a name are significant. =over 4 =item F<auxiliary/GraphemeBreakTest.html> =item F<auxiliary/LineBreakTest.html> =item F<auxiliary/SentenceBreakTest.html> =item F<auxiliary/WordBreakTest.html> Documentation of validation Tests =item F<BidiCharacterTest.txt> =item F<BidiTest.txt> =item F<NormTest.txt> Validation Tests =item F<CJKRadicals.txt> Maps the kRSUnicode property values to corresponding code points =item F<emoji/ReadMe.txt> =item F<ReadMe.txt> Documentation =item F<EmojiSources.txt> Maps certain Unicode code points to their legacy Japanese cell-phone values =item F<extracted/DName.txt> This file adds no new information not already present in other files =item F<Index.txt> Alphabetical index of Unicode characters =item F<NamedSqProv.txt> Named sequences proposed for inclusion in a later version of the Unicode Standard; if you need them now, you can append this file to F<NamedSequences.txt> and recompile perl =item F<NamesList.html> Describes the format and contents of F<NamesList.txt> =item F<NamesList.txt> Annotated list of characters =item F<NormalizationCorrections.txt> Documentation of corrections already incorporated into the Unicode data base =item F<NushuSources.txt> Specifies source material for Nushu characters =item F<StandardizedVariants.html> Obsoleted as of Unicode 9.0, but previously provided a visual display of the standard variant sequences derived from F<StandardizedVariants.txt>. 
=item F<StandardizedVariants.txt> Certain glyph variations for character display are standardized. This lists the non-Unihan ones; the Unihan ones are also not used by Perl, and are in a separate Unicode data base L<http://www.unicode.org/ivd> =item F<TangutSources.txt> Specifies source mappings for Tangut ideographs and components. This data file also includes informative radical-stroke values that are used internally by Unicode =item F<USourceData.txt> Documentation of status and cross reference of proposals for encoding by Unicode of Unihan characters =item F<USourceGlyphs.pdf> Pictures of the characters in F<USourceData.txt> =back =head1 SEE ALSO L<http://www.unicode.org/reports/tr44/> L<perlrecharclass> L<perlunicode> -*- buffer-read-only: t -*- !!!!!!! DO NOT EDIT THIS FILE !!!!!!! This file is built by autodoc.pl extracting documentation from the C source files. Any changes made here will be lost! =head1 NAME perlintern - autogenerated documentation of purely B<internal> Perl functions =head1 DESCRIPTION X<internal Perl functions> X<interpreter functions> This file is the autogenerated documentation of functions in the Perl interpreter that are documented using Perl's internal documentation format but are not marked as part of the Perl API. In other words, B<they are not for use in extensions>! =head1 Array Manipulation Functions =over 8 =item AvFILLp X<AvFILLp> int AvFILLp(AV* av) =for hackers Found in file av.h =back =head1 Compile-time scope hooks =over 8 =item BhkENTRY X<BhkENTRY> NOTE: this function is experimental and may change or be removed without notice. Return an entry from the BHK structure. C<which> is a preprocessor token indicating which entry to return. If the appropriate flag is not set this will return C<NULL>. The type of the return value depends on which entry you ask for. 
void * BhkENTRY(BHK *hk, which) =for hackers Found in file op.h =item BhkFLAGS X<BhkFLAGS> NOTE: this function is experimental and may change or be removed without notice. Return the BHK's flags. U32 BhkFLAGS(BHK *hk) =for hackers Found in file op.h =item CALL_BLOCK_HOOKS X<CALL_BLOCK_HOOKS> NOTE: this function is experimental and may change or be removed without notice. Call all the registered block hooks for type C<which>. C<which> is a preprocessing token; the type of C<arg> depends on C<which>. void CALL_BLOCK_HOOKS(which, arg) =for hackers Found in file op.h =back =head1 Custom Operators =over 8 =item core_prototype X<core_prototype> This function assigns the prototype of the named core function to C<sv>, or to a new mortal SV if C<sv> is C<NULL>. It returns the modified C<sv>, or C<NULL> if the core function has no prototype. C<code> is a code as returned by C<keyword()>. It must not be equal to 0. SV * core_prototype(SV *sv, const char *name, const int code, int * const opnum) =for hackers Found in file op.c =back =head1 CV Manipulation Functions =over 8 =item docatch X<docatch> Check for the cases 0 or 3 of cur_env.je_ret, only used inside an eval context. 0 is used as continue inside eval, 3 is used for a die caught by an inner eval - continue inner loop. See F<cop.h>: je_mustcatch, when set at any runlevel to TRUE, means eval ops must establish a local jmpenv to handle exception traps. OP* docatch(Perl_ppaddr_t firstpp) =for hackers Found in file pp_ctl.c =back =head1 CV reference counts and CvOUTSIDE =over 8 =item CvWEAKOUTSIDE X<CvWEAKOUTSIDE> Each CV has a pointer, C<CvOUTSIDE()>, to its lexically enclosing CV (if any). Because pointers to anonymous sub prototypes are stored in C<&> pad slots, it is possible to get a circular reference, with the parent pointing to the child and vice-versa. 
To avoid the ensuing memory leak, we do not increment the reference count of the CV pointed to by C<CvOUTSIDE> in the I<one specific instance> that the parent has a C<&> pad slot pointing back to us. In this case, we set the C<CvWEAKOUTSIDE> flag in the child. This allows us to determine under what circumstances we should decrement the refcount of the parent when freeing the child. There is a further complication with non-closure anonymous subs (i.e. those that do not refer to any lexicals outside that sub). In this case, the anonymous prototype is shared rather than being cloned. This has the consequence that the parent may be freed while there are still active children, I<e.g.>, BEGIN { $a = sub { eval '$x' } } In this case, the BEGIN is freed immediately after execution since there are no active references to it: the anon sub prototype has C<CvWEAKOUTSIDE> set since it's not a closure, and $a points to the same CV, so it doesn't contribute to BEGIN's refcount either. When $a is executed, the C<eval '$x'> causes the chain of C<CvOUTSIDE>s to be followed, and the freed BEGIN is accessed. To avoid this, whenever a CV and its associated pad is freed, any C<&> entries in the pad are explicitly removed from the pad, and if the refcount of the pointed-to anon sub is still positive, then that child's C<CvOUTSIDE> is set to point to its grandparent. This will only occur in the single specific case of a non-closure anon prototype having one or more active references (such as C<$a> above). One other thing to consider is that a CV may be merely undefined rather than freed, e.g., C<undef &foo>. In this case, its refcount may not have reached zero, but we still delete its pad and its C<CvROOT> etc. Since various children may still have their C<CvOUTSIDE> pointing at this undefined CV, we keep its own C<CvOUTSIDE> for the time being, so that the chain of lexical scopes is unbroken. 
For example, the following should print 123: my $x = 123; sub tmp { sub { eval '$x' } } my $a = tmp(); undef &tmp; print $a->(); bool CvWEAKOUTSIDE(CV *cv) =for hackers Found in file cv.h =back =head1 Embedding Functions =over 8 =item cv_dump X<cv_dump> dump the contents of a CV void cv_dump(const CV *cv, const char *title) =for hackers Found in file pad.c =item cv_forget_slab X<cv_forget_slab> When a CV has a reference count on its slab (C<CvSLABBED>), it is responsible for making sure it is freed. (Hence, no two CVs should ever have a reference count on the same slab.) The CV only needs to reference the slab during compilation. Once it is compiled and C<CvROOT> attached, it has finished its job, so it can forget the slab. void cv_forget_slab(CV *cv) =for hackers Found in file pad.c =item do_dump_pad X<do_dump_pad> Dump the contents of a padlist void do_dump_pad(I32 level, PerlIO *file, PADLIST *padlist, int full) =for hackers Found in file pad.c =item pad_alloc_name X<pad_alloc_name> Allocates a place in the currently-compiling pad (via L<perlapi/pad_alloc>) and then stores a name for that entry. C<name> is adopted and becomes the name entry; it must already contain the name string. C<typestash> and C<ourstash> and the C<padadd_STATE> flag get added to C<name>. None of the other processing of L<perlapi/pad_add_name_pvn> is done. Returns the offset of the allocated pad slot. PADOFFSET pad_alloc_name(PADNAME *name, U32 flags, HV *typestash, HV *ourstash) =for hackers Found in file pad.c =item pad_block_start X<pad_block_start> Update the pad compilation state variables on entry to a new block. void pad_block_start(int full) =for hackers Found in file pad.c =item pad_check_dup X<pad_check_dup> Check for duplicate declarations: report any of: * a 'my' in the current scope with the same name; * an 'our' (anywhere in the pad) with the same name and the same stash as 'ourstash' C<is_our> indicates that the name to check is an C<"our"> declaration. 
void pad_check_dup(PADNAME *name, U32 flags, const HV *ourstash) =for hackers Found in file pad.c =item pad_findlex X<pad_findlex> Find a named lexical anywhere in a chain of nested pads. Add fake entries in the inner pads if it's found in an outer one. Returns the offset in the bottom pad of the lex or the fake lex. C<cv> is the CV in which to start the search, and C<seq> is the current C<cop_seq> to match against. If C<warn> is true, print appropriate warnings. The C<out_>* vars return values, and so are pointers to where the returned values should be stored. C<out_capture>, if non-null, requests that the innermost instance of the lexical is captured; C<out_name> is set to the innermost matched pad name or fake pad name; C<out_flags> returns the flags normally associated with the C<PARENT_FAKELEX_FLAGS> field of a fake pad name. Note that C<pad_findlex()> is recursive; it recurses up the chain of CVs, then comes back down, adding fake entries as it goes. It has to be this way because fake names in anon prototypes have to store in C<xpadn_low> the index into the parent pad. PADOFFSET pad_findlex(const char *namepv, STRLEN namelen, U32 flags, const CV* cv, U32 seq, int warn, SV** out_capture, PADNAME** out_name, int *out_flags) =for hackers Found in file pad.c =item pad_fixup_inner_anons X<pad_fixup_inner_anons> For any anon CVs in the pad, change C<CvOUTSIDE> of that CV from C<old_cv> to C<new_cv> if necessary. Needed when a newly-compiled CV has to be moved to a pre-existing CV struct. void pad_fixup_inner_anons(PADLIST *padlist, CV *old_cv, CV *new_cv) =for hackers Found in file pad.c =item pad_free X<pad_free> Free the SV at offset C<po> in the current pad. void pad_free(PADOFFSET po) =for hackers Found in file pad.c =item pad_leavemy X<pad_leavemy> Cleanup at end of scope during compilation: set the max seq number for lexicals in this scope and warn of any lexicals that never got introduced. 
OP * pad_leavemy() =for hackers Found in file pad.c =item padlist_dup X<padlist_dup> Duplicates a pad. PADLIST * padlist_dup(PADLIST *srcpad, CLONE_PARAMS *param) =for hackers Found in file pad.c =item padname_dup X<padname_dup> Duplicates a pad name. PADNAME * padname_dup(PADNAME *src, CLONE_PARAMS *param) =for hackers Found in file pad.c =item padnamelist_dup X<padnamelist_dup> Duplicates a pad name list. PADNAMELIST * padnamelist_dup(PADNAMELIST *srcpad, CLONE_PARAMS *param) =for hackers Found in file pad.c =item pad_push X<pad_push> Push a new pad frame onto the padlist, unless there's already a pad at this depth, in which case don't bother creating a new one. Then give the new pad an C<@_> in slot zero. void pad_push(PADLIST *padlist, int depth) =for hackers Found in file pad.c =item pad_reset X<pad_reset> Mark all the current temporaries for reuse. void pad_reset() =for hackers Found in file pad.c =item pad_swipe X<pad_swipe> Abandon the tmp in the current pad at offset C<po> and replace with a new one. void pad_swipe(PADOFFSET po, bool refadjust) =for hackers Found in file pad.c =back =head1 Errno =over 8 =item dSAVEDERRNO X<dSAVEDERRNO> Declare variables needed to save C<errno> and any operating system specific error number. void dSAVEDERRNO =for hackers Found in file perl.h =item dSAVE_ERRNO X<dSAVE_ERRNO> Declare variables needed to save C<errno> and any operating system specific error number, and save them for optional later restoration by C<RESTORE_ERRNO>. void dSAVE_ERRNO =for hackers Found in file perl.h =item RESTORE_ERRNO X<RESTORE_ERRNO> Restore C<errno> and any operating system specific error number that was saved by C<dSAVE_ERRNO> or C<SAVE_ERRNO>. void RESTORE_ERRNO =for hackers Found in file perl.h =item SAVE_ERRNO X<SAVE_ERRNO> Save C<errno> and any operating system specific error number for optional later restoration by C<RESTORE_ERRNO>. Requires C<dSAVEDERRNO> or C<dSAVE_ERRNO> in scope. 
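The declare/save/restore pattern described by these four macros can be sketched in plain C. The C<SKETCH_*> macros and both function names below are hypothetical stand-ins invented for illustration; they are not the definitions in F<perl.h>, which also cover the operating system specific error number.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>

/* Hypothetical stand-ins for the pattern described above. These are NOT
 * the perl.h definitions, which also save OS-specific error numbers. */
#define SKETCH_dSAVEDERRNO   int sketch_saved_errno
#define SKETCH_dSAVE_ERRNO   int sketch_saved_errno = errno
#define SKETCH_SAVE_ERRNO    (sketch_saved_errno = errno)
#define SKETCH_RESTORE_ERRNO (errno = sketch_saved_errno)

/* Declare-and-save in one step, run code that may clobber errno,
 * then restore the value the caller had on entry. */
static int
sketch_probe_file(const char *path)
{
    SKETCH_dSAVE_ERRNO;
    FILE *fp = fopen(path, "r");      /* may set errno on failure */
    int found = (fp != NULL);

    if (fp)
        fclose(fp);
    SKETCH_RESTORE_ERRNO;             /* caller sees its original errno */
    return found;
}

/* Declare first, save later: the two-step form. */
static void
sketch_clear_then_restore(void)
{
    SKETCH_dSAVEDERRNO;               /* declaration only, nothing saved yet */

    SKETCH_SAVE_ERRNO;                /* save at the point of interest */
    errno = 0;                        /* stands in for code that changes errno */
    SKETCH_RESTORE_ERRNO;
}
```

In both functions the restore is optional in the sense used above: a caller that wants to report the new error simply omits it.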
void SAVE_ERRNO =for hackers Found in file perl.h =item SETERRNO X<SETERRNO> Set C<errno>, and on VMS set C<vaxc$errno>. void SETERRNO(int errcode, int vmserrcode) =for hackers Found in file perl.h =back =head1 GV Functions =over 8 =item gv_try_downgrade X<gv_try_downgrade> NOTE: this function is experimental and may change or be removed without notice. If the typeglob C<gv> can be expressed more succinctly, by having something other than a real GV in its place in the stash, replace it with the optimised form. Basic requirements for this are that C<gv> is a real typeglob, is sufficiently ordinary, and is only referenced from its package. This function is meant to be used when a GV has been looked up in part to see what was there, causing upgrading, but based on what was found it turns out that the real GV isn't required after all. If C<gv> is a completely empty typeglob, it is deleted from the stash. If C<gv> is a typeglob containing only a sufficiently-ordinary constant sub, the typeglob is replaced with a scalar-reference placeholder that more compactly represents the same thing. void gv_try_downgrade(GV* gv) =for hackers Found in file gv.c =back =head1 Hash Manipulation Functions =over 8 =item hv_ename_add X<hv_ename_add> Adds a name to a stash's internal list of effective names. See C<L</hv_ename_delete>>. This is called when a stash is assigned to a new location in the symbol table. void hv_ename_add(HV *hv, const char *name, U32 len, U32 flags) =for hackers Found in file hv.c =item hv_ename_delete X<hv_ename_delete> Removes a name from a stash's internal list of effective names. If this is the name returned by C<HvENAME>, then another name in the list will take its place (C<HvENAME> will use it). This is called when a stash is deleted from the symbol table. 
void hv_ename_delete(HV *hv, const char *name, U32 len, U32 flags) =for hackers Found in file hv.c =item refcounted_he_chain_2hv X<refcounted_he_chain_2hv> Generates and returns a C<HV *> representing the content of a C<refcounted_he> chain. C<flags> is currently unused and must be zero. HV * refcounted_he_chain_2hv( const struct refcounted_he *c, U32 flags ) =for hackers Found in file hv.c =item refcounted_he_fetch_pv X<refcounted_he_fetch_pv> Like L</refcounted_he_fetch_pvn>, but takes a nul-terminated string instead of a string/length pair. SV * refcounted_he_fetch_pv( const struct refcounted_he *chain, const char *key, U32 hash, U32 flags ) =for hackers Found in file hv.c =item refcounted_he_fetch_pvn X<refcounted_he_fetch_pvn> Search along a C<refcounted_he> chain for an entry with the key specified by C<keypv> and C<keylen>. If C<flags> has the C<REFCOUNTED_HE_KEY_UTF8> bit set, the key octets are interpreted as UTF-8, otherwise they are interpreted as Latin-1. C<hash> is a precomputed hash of the key string, or zero if it has not been precomputed. Returns a mortal scalar representing the value associated with the key, or C<&PL_sv_placeholder> if there is no value associated with the key. SV * refcounted_he_fetch_pvn( const struct refcounted_he *chain, const char *keypv, STRLEN keylen, U32 hash, U32 flags ) =for hackers Found in file hv.c =item refcounted_he_fetch_pvs X<refcounted_he_fetch_pvs> Like L</refcounted_he_fetch_pvn>, but takes a literal string instead of a string/length pair, and no precomputed hash. SV * refcounted_he_fetch_pvs( const struct refcounted_he *chain, "key", U32 flags ) =for hackers Found in file hv.h =item refcounted_he_fetch_sv X<refcounted_he_fetch_sv> Like L</refcounted_he_fetch_pvn>, but takes a Perl scalar instead of a string/length pair. 
SV * refcounted_he_fetch_sv( const struct refcounted_he *chain, SV *key, U32 hash, U32 flags ) =for hackers Found in file hv.c =item refcounted_he_free X<refcounted_he_free> Decrements the reference count of a C<refcounted_he> by one. If the reference count reaches zero the structure's memory is freed, which (recursively) causes a reduction of its parent C<refcounted_he>'s reference count. It is safe to pass a null pointer to this function: no action occurs in this case. void refcounted_he_free(struct refcounted_he *he) =for hackers Found in file hv.c =item refcounted_he_inc X<refcounted_he_inc> Increment the reference count of a C<refcounted_he>. The pointer to the C<refcounted_he> is also returned. It is safe to pass a null pointer to this function: no action occurs and a null pointer is returned. struct refcounted_he * refcounted_he_inc( struct refcounted_he *he ) =for hackers Found in file hv.c =item refcounted_he_new_pv X<refcounted_he_new_pv> Like L</refcounted_he_new_pvn>, but takes a nul-terminated string instead of a string/length pair. struct refcounted_he * refcounted_he_new_pv( struct refcounted_he *parent, const char *key, U32 hash, SV *value, U32 flags ) =for hackers Found in file hv.c =item refcounted_he_new_pvn X<refcounted_he_new_pvn> Creates a new C<refcounted_he>. This consists of a single key/value pair and a reference to an existing C<refcounted_he> chain (which may be empty), and thus forms a longer chain. When using the longer chain, the new key/value pair takes precedence over any entry for the same key further along the chain. The new key is specified by C<keypv> and C<keylen>. If C<flags> has the C<REFCOUNTED_HE_KEY_UTF8> bit set, the key octets are interpreted as UTF-8, otherwise they are interpreted as Latin-1. C<hash> is a precomputed hash of the key string, or zero if it has not been precomputed. C<value> is the scalar value to store for this key. 
C<value> is copied by this function, which thus does not take ownership of any reference to it, and later changes to the scalar will not be reflected in the value visible in the C<refcounted_he>. Complex types of scalar will not be stored with referential integrity, but will be coerced to strings. C<value> may be either null or C<&PL_sv_placeholder> to indicate that no value is to be associated with the key; this, as with any non-null value, takes precedence over the existence of a value for the key further along the chain. C<parent> points to the rest of the C<refcounted_he> chain to be attached to the new C<refcounted_he>. This function takes ownership of one reference to C<parent>, and returns one reference to the new C<refcounted_he>. struct refcounted_he * refcounted_he_new_pvn( struct refcounted_he *parent, const char *keypv, STRLEN keylen, U32 hash, SV *value, U32 flags ) =for hackers Found in file hv.c =item refcounted_he_new_pvs X<refcounted_he_new_pvs> Like L</refcounted_he_new_pvn>, but takes a literal string instead of a string/length pair, and no precomputed hash. struct refcounted_he * refcounted_he_new_pvs( struct refcounted_he *parent, "key", SV *value, U32 flags ) =for hackers Found in file hv.h =item refcounted_he_new_sv X<refcounted_he_new_sv> Like L</refcounted_he_new_pvn>, but takes a Perl scalar instead of a string/length pair. struct refcounted_he * refcounted_he_new_sv( struct refcounted_he *parent, SV *key, U32 hash, SV *value, U32 flags ) =for hackers Found in file hv.c =back =head1 IO Functions =over 8 =item start_glob X<start_glob> NOTE: this function is experimental and may change or be removed without notice. Function called by C<do_readline> to spawn a glob (or do the glob inside perl on VMS). This code used to be inline, but now that perl uses C<File::Glob>, this glob starter is only used by miniperl during the build process, or when PERL_EXTERNAL_GLOB is defined. 
Moving it away shrinks F<pp_hot.c>; shrinking F<pp_hot.c> helps speed perl up. NOTE: this function must be explicitly called as Perl_start_glob with an aTHX_ parameter. PerlIO* Perl_start_glob(pTHX_ SV *tmpglob, IO *io) =for hackers Found in file doio.c =back =head1 Lexer interface =over 8 =item validate_proto X<validate_proto> NOTE: this function is experimental and may change or be removed without notice. This function performs syntax checking on a prototype, C<proto>. If C<warn> is true, any illegal characters or mismatched brackets will trigger illegalproto warnings, declaring that they were detected in the prototype for C<name>. The return value is C<true> if this is a valid prototype, and C<false> if it is not, regardless of whether C<warn> was C<true> or C<false>. Note that C<NULL> is a valid C<proto> and will always return C<true>. bool validate_proto(SV *name, SV *proto, bool warn, bool curstash) =for hackers Found in file toke.c =back =head1 Magical Functions =over 8 =item magic_clearhint X<magic_clearhint> Triggered by a delete from C<%^H>, records the key to C<PL_compiling.cop_hints_hash>. int magic_clearhint(SV* sv, MAGIC* mg) =for hackers Found in file mg.c =item magic_clearhints X<magic_clearhints> Triggered by clearing C<%^H>, resets C<PL_compiling.cop_hints_hash>. int magic_clearhints(SV* sv, MAGIC* mg) =for hackers Found in file mg.c =item magic_methcall X<magic_methcall> Invoke a magic method (like FETCH). C<sv> and C<mg> are the tied thingy and the tie magic. C<meth> is the name of the method to call. C<argc> is the number of args (in addition to $self) to pass to the method. The C<flags> can be: G_DISCARD invoke method with G_DISCARD flag and don't return a value G_UNDEF_FILL fill the stack with argc pointers to PL_sv_undef The arguments themselves are any values following the C<flags> argument. Returns the SV (if any) returned by the method, or C<NULL> on failure. 
NOTE: this function must be explicitly called as Perl_magic_methcall with an aTHX_ parameter. SV* Perl_magic_methcall(pTHX_ SV *sv, const MAGIC *mg, SV *meth, U32 flags, U32 argc, ...) =for hackers Found in file mg.c =item magic_sethint X<magic_sethint> Triggered by a store to C<%^H>, records the key/value pair to C<PL_compiling.cop_hints_hash>. It is assumed that hints aren't storing anything that would need a deep copy. Maybe we should warn if we find a reference. int magic_sethint(SV* sv, MAGIC* mg) =for hackers Found in file mg.c =item mg_localize X<mg_localize> Copy some of the magic from an existing SV to a new localized version of that SV. Container magic (I<e.g.>, C<%ENV>, C<$1>, C<tie>) gets copied, value magic doesn't (I<e.g.>, C<taint>, C<pos>). If C<setmagic> is false then no set magic will be called on the new (empty) SV. This typically means that assignment will soon follow (e.g. S<C<'local $x = $y'>>), and that will handle the magic. void mg_localize(SV* sv, SV* nsv, bool setmagic) =for hackers Found in file mg.c =back =head1 Miscellaneous Functions =over 8 =item free_c_backtrace X<free_c_backtrace> Deallocates a backtrace received from get_c_backtrace. void free_c_backtrace(Perl_c_backtrace* bt) =for hackers Found in file util.c =item get_c_backtrace X<get_c_backtrace> Collects the backtrace (aka "stacktrace") into a single linear malloced buffer, which the caller B<must> free with C<Perl_free_c_backtrace()>. Scans the frames back by S<C<depth + skip>>, then drops the C<skip> innermost, returning at most C<depth> frames. Perl_c_backtrace* get_c_backtrace(int max_depth, int skip) =for hackers Found in file util.c =item quadmath_format_needed X<quadmath_format_needed> C<quadmath_format_needed()> returns true if the C<format> string seems to contain at least one non-Q-prefixed C<%[efgaEFGA]> format specifier, or returns false otherwise. The format specifier detection is not complete printf-syntax detection, but it should catch most common cases. 
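As a rough illustration of the kind of heuristic described above, the following standalone sketch scans a format string for a C<%[efgaEFGA]> conversion that lacks a C<Q> prefix. The function name C<sketch_format_needs_q> and the exact set of skipped flag, width and precision characters are assumptions made for this example; it is not the code in F<util.c> and, like the real check, it is not a complete printf parser.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical helper, not the util.c implementation: reports whether
 * `format` seems to contain a %[efgaEFGA] conversion without a Q prefix. */
static bool
sketch_format_needs_q(const char *format)
{
    const char *p = format;

    while ((p = strchr(p, '%')) != NULL) {
        p++;                              /* step past the '%' */
        if (*p == '%') {                  /* "%%" is a literal percent sign */
            p++;
            continue;
        }
        /* Skip flags, width and precision (assumed character set). */
        while (*p != '\0' && strchr("#0- +.123456789*", *p) != NULL)
            p++;
        if (*p == 'Q')                    /* already Q-prefixed: not a hit */
            continue;
        if (*p != '\0' && strchr("efgaEFGA", *p) != NULL)
            return true;                  /* floating conversion without Q */
    }
    return false;
}
```

Under these assumptions, C<"%.15g"> is reported as needing the prefix while C<"%.15Qe"> and C<"%d"> are not. Other length modifiers (such as C<L>) are deliberately ignored by this sketch.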
If true is returned, those arguments B<should> in theory be processed with C<quadmath_snprintf()>, but in case there is more than one such format specifier (see L</quadmath_format_valid>), and if there is anything else beyond that one (even just a single byte), they B<cannot> be processed because C<quadmath_snprintf()> is very strict, accepting only one format spec, and nothing else. In this case, the code should probably fail. bool quadmath_format_needed(const char* format) =for hackers Found in file util.c =item quadmath_format_valid X<quadmath_format_valid> C<quadmath_snprintf()> is very strict about its C<format> string and will fail, returning -1, if the format is invalid. It accepts exactly one format spec. C<quadmath_format_valid()> checks that the intended single spec looks sane: begins with C<%>, has only one C<%>, ends with C<[efgaEFGA]>, and has C<Q> before it. This is not a full "printf syntax check", just the basics. Returns true if it is valid, false if not. See also L</quadmath_format_needed>. bool quadmath_format_valid(const char* format) =for hackers Found in file util.c =back =head1 MRO Functions =over 8 =item mro_get_linear_isa_dfs X<mro_get_linear_isa_dfs> Returns the Depth-First Search linearization of C<@ISA> in the given stash. The return value is a read-only AV*. C<level> should be 0 (it is used internally in this function's recursion). You are responsible for C<SvREFCNT_inc()> on the return value if you plan to store it anywhere semi-permanently (otherwise it might be deleted out from under you the next time the cache is invalidated). AV* mro_get_linear_isa_dfs(HV* stash, U32 level) =for hackers Found in file mro_core.c =item mro_isa_changed_in X<mro_isa_changed_in> Takes the necessary steps (cache invalidations, mostly) when the C<@ISA> of the given package has changed. It is invoked by the C<setisa> magic, so you should not need to invoke it directly. 
void mro_isa_changed_in(HV* stash) =for hackers Found in file mro_core.c =item mro_package_moved X<mro_package_moved> Call this function to signal to a stash that it has been assigned to another spot in the stash hierarchy. C<stash> is the stash that has been assigned. C<oldstash> is the stash it replaces, if any. C<gv> is the glob that is actually being assigned to. This can also be called with a null first argument to indicate that C<oldstash> has been deleted. This function invalidates isa caches on the old stash, on all subpackages nested inside it, and on the subclasses of all those, including non-existent packages that have corresponding entries in C<stash>. It also sets the effective names (C<HvENAME>) on all the stashes as appropriate. If the C<gv> is present and is not in the symbol table, then this function simply returns. This check will be skipped if C<flags & 1>. void mro_package_moved(HV * const stash, HV * const oldstash, const GV * const gv, U32 flags) =for hackers Found in file mro_core.c =back =head1 Numeric functions =over 8 =item grok_atoUV X<grok_atoUV> Parse a string, looking for a decimal unsigned integer. On entry, C<pv> points to the beginning of the string; C<valptr> points to a UV that will receive the converted value, if found; C<endptr> is either NULL or points to a variable that points to one byte beyond the point in C<pv> that this routine should examine. If C<endptr> is NULL, C<pv> is assumed to be NUL-terminated. Returns FALSE if C<pv> doesn't represent a valid unsigned integer value (with no leading zeros). Otherwise it returns TRUE, and sets C<*valptr> to that value. If you constrain the portion of C<pv> that is looked at by this function (by passing a non-NULL C<endptr>), and if the initial bytes of that portion form a valid value, it will return TRUE, setting C<*endptr> to the byte following the final digit of the value. 
But if there is no constraint on what's looked at, all of C<pv> must be valid in order for TRUE to be returned. C<*endptr> is unchanged from its value on input if FALSE is returned. The only characters this accepts are the decimal digits '0'..'9'. As opposed to L<atoi(3)> or L<strtol(3)>, C<grok_atoUV> does NOT allow optional leading whitespace, nor negative inputs. If such features are required, the calling code needs to explicitly implement those. Note that this function returns FALSE for inputs that would overflow a UV, or have leading zeros. Thus a single C<0> is accepted, but not C<00> nor C<01>, C<002>, I<etc>. Background: C<atoi> has severe problems with illegal inputs; it cannot be used for incremental parsing, and therefore should be avoided. C<atoi> and C<strtol> are also affected by locale settings, which can also be seen as a bug (global state controlled by user environment). bool grok_atoUV(const char* pv, UV* valptr, const char** endptr) =for hackers Found in file numeric.c =item isinfnansv X<isinfnansv> Checks whether the argument would be either an infinity or C<NaN> when used as a number, but is careful not to trigger non-numeric or uninitialized warnings. It assumes the caller has done C<SvGETMAGIC(sv)> already. bool isinfnansv(SV *sv) =for hackers Found in file numeric.c =back =head1 Obsolete backwards compatibility functions =over 8 =item utf8n_to_uvuni X<utf8n_to_uvuni> DEPRECATED! It is planned to remove this function from a future release of Perl. Do not use it for new code; remove it from existing code. Instead use L<perlapi/utf8_to_uvchr_buf>, or rarely, L<perlapi/utf8n_to_uvchr>. This function was useful for code that wanted to handle both EBCDIC and ASCII platforms with Unicode properties, but starting in Perl v5.20, the distinctions between the platforms have mostly been made invisible to most code, so this function is quite unlikely to be what you want. 
If you do need this precise functionality, use instead C<L<NATIVE_TO_UNI(utf8_to_uvchr_buf(...))|perlapi/utf8_to_uvchr_buf>> or C<L<NATIVE_TO_UNI(utf8n_to_uvchr(...))|perlapi/utf8n_to_uvchr>>. UV utf8n_to_uvuni(const U8 *s, STRLEN curlen, STRLEN *retlen, U32 flags) =for hackers Found in file mathoms.c =item utf8_to_uvuni X<utf8_to_uvuni> DEPRECATED! It is planned to remove this function from a future release of Perl. Do not use it for new code; remove it from existing code. Returns the Unicode code point of the first character in the string C<s> which is assumed to be in UTF-8 encoding; C<retlen> will be set to the length, in bytes, of that character. Some, but not all, UTF-8 malformations are detected, and in fact, some malformed input could cause reading beyond the end of the input buffer, which is one reason why this function is deprecated. The other is that only in extremely limited circumstances should the Unicode versus native code point be of any interest to you. See L</utf8_to_uvuni_buf> for alternatives. If C<s> points to one of the detected malformations, and UTF8 warnings are enabled, zero is returned and C<*retlen> is set (if C<retlen> doesn't point to NULL) to -1. If those warnings are off, the computed value if well-defined (or the Unicode REPLACEMENT CHARACTER, if not) is silently returned, and C<*retlen> is set (if C<retlen> isn't NULL) so that (S<C<s> + C<*retlen>>) is the next possible position in C<s> that could begin a non-malformed character. See L<perlapi/utf8n_to_uvchr> for details on when the REPLACEMENT CHARACTER is returned. UV utf8_to_uvuni(const U8 *s, STRLEN *retlen) =for hackers Found in file mathoms.c =item uvuni_to_utf8_flags X<uvuni_to_utf8_flags> DEPRECATED! It is planned to remove this function from a future release of Perl. Do not use it for new code; remove it from existing code. Instead you almost certainly want to use L<perlapi/uvchr_to_utf8> or L<perlapi/uvchr_to_utf8_flags>. 
This function is a deprecated synonym for L</uvoffuni_to_utf8_flags>, which itself, while not deprecated, should be used only in isolated circumstances. These functions were useful for code that wanted to handle both EBCDIC and ASCII platforms with Unicode properties, but starting in Perl v5.20, the distinctions between the platforms have mostly been made invisible to most code, so this function is quite unlikely to be what you want. U8* uvuni_to_utf8_flags(U8 *d, UV uv, UV flags) =for hackers Found in file mathoms.c =back =head1 Optree Manipulation Functions =over 8 =item finalize_optree X<finalize_optree> This function finalizes the optree. Should be called directly after the complete optree is built. It does some additional checking which can't be done in the normal C<ck_>xxx functions and makes the tree thread-safe. void finalize_optree(OP* o) =for hackers Found in file op.c =item newATTRSUB_x X<newATTRSUB_x> Construct a Perl subroutine, also performing some surrounding jobs. This function is expected to be called in a Perl compilation context, and some aspects of the subroutine are taken from global variables associated with compilation. In particular, C<PL_compcv> represents the subroutine that is currently being compiled. It must be non-null when this function is called, and some aspects of the subroutine being constructed are taken from it. The constructed subroutine may actually be a reuse of the C<PL_compcv> object, but will not necessarily be so. If C<block> is null then the subroutine will have no body, and for the time being it will be an error to call it. This represents a forward subroutine declaration such as S<C<sub foo ($$);>>. If C<block> is non-null then it provides the Perl code of the subroutine body, which will be executed when the subroutine is called. This body includes any argument unwrapping code resulting from a subroutine signature or similar. The pad use of the code must correspond to the pad attached to C<PL_compcv>. 
The code is not expected to include a C<leavesub> or C<leavesublv> op; this function will add such an op. C<block> is consumed by this function and will become part of the constructed subroutine. C<proto> specifies the subroutine's prototype, unless one is supplied as an attribute (see below). If C<proto> is null, then the subroutine will not have a prototype. If C<proto> is non-null, it must point to a C<const> op whose value is a string, and the subroutine will have that string as its prototype. If a prototype is supplied as an attribute, the attribute takes precedence over C<proto>, but in that case C<proto> should preferably be null. In any case, C<proto> is consumed by this function. C<attrs> supplies attributes to be applied to the subroutine. A handful of attributes take effect by built-in means, being applied to C<PL_compcv> immediately when seen. Other attributes are collected up and attached to the subroutine by this route. C<attrs> may be null to supply no attributes, or point to a C<const> op for a single attribute, or point to a C<list> op whose children apart from the C<pushmark> are C<const> ops for one or more attributes. Each C<const> op must be a string, giving the attribute name optionally followed by parenthesised arguments, in the manner in which attributes appear in Perl source. The attributes will be applied to the sub by this function. C<attrs> is consumed by this function. If C<o_is_gv> is false and C<o> is null, then the subroutine will be anonymous. If C<o_is_gv> is false and C<o> is non-null, then C<o> must point to a C<const> op, which will be consumed by this function, and its string value supplies a name for the subroutine. The name may be qualified or unqualified, and if it is unqualified then a default stash will be selected in some manner. If C<o_is_gv> is true, then C<o> doesn't point to an C<OP> at all, but is instead a cast pointer to a C<GV> by which the subroutine will be named. 
If there is already a subroutine of the specified name, then the new sub will either replace the existing one in the glob or be merged with the existing one. A warning may be generated about redefinition. If the subroutine has one of a few special names, such as C<BEGIN> or C<END>, then it will be claimed by the appropriate queue for automatic running of phase-related subroutines. In this case the relevant glob will be left not containing any subroutine, even if it did contain one before. In the case of C<BEGIN>, the subroutine will be executed and the reference to it disposed of before this function returns. The function returns a pointer to the constructed subroutine. If the sub is anonymous then ownership of one counted reference to the subroutine is transferred to the caller. If the sub is named then the caller does not get ownership of a reference. In most such cases, where the sub has a non-phase name, the sub will be alive at the point it is returned by virtue of being contained in the glob that names it. A phase-named subroutine will usually be alive by virtue of the reference owned by the phase's automatic run queue. But a C<BEGIN> subroutine, having already been executed, will quite likely have been destroyed already by the time this function returns, making it erroneous for the caller to make any use of the returned pointer. It is the caller's responsibility to ensure that it knows which of these situations applies. CV* newATTRSUB_x(I32 floor, OP *o, OP *proto, OP *attrs, OP *block, bool o_is_gv) =for hackers Found in file op.c =item newXS_len_flags X<newXS_len_flags> Construct an XS subroutine, also performing some surrounding jobs. The subroutine will have the entry point C<subaddr>. It will have the prototype specified by the nul-terminated string C<proto>, or no prototype if C<proto> is null. The prototype string is copied; the caller can mutate the supplied string afterwards. 
If C<filename> is non-null, it must be a nul-terminated filename, and the subroutine will have its C<CvFILE> set accordingly. By default C<CvFILE> is set to point directly to the supplied string, which must be static. If C<flags> has the C<XS_DYNAMIC_FILENAME> bit set, then a copy of the string will be taken instead. Other aspects of the subroutine will be left in their default state. If anything else needs to be done to the subroutine for it to function correctly, it is the caller's responsibility to do that after this function has constructed it. However, beware of the subroutine potentially being destroyed before this function returns, as described below. If C<name> is null then the subroutine will be anonymous, with its C<CvGV> referring to an C<__ANON__> glob. If C<name> is non-null then the subroutine will be named accordingly, referenced by the appropriate glob. C<name> is a string of length C<len> bytes giving a sigilless symbol name, in UTF-8 if C<flags> has the C<SVf_UTF8> bit set and in Latin-1 otherwise. The name may be either qualified or unqualified, with the stash defaulting in the same manner as for C<gv_fetchpvn_flags>. C<flags> may contain flag bits understood by C<gv_fetchpvn_flags> with the same meaning as they have there, such as C<GV_ADDWARN>. The symbol is always added to the stash if necessary, with C<GV_ADDMULTI> semantics. If there is already a subroutine of the specified name, then the new sub will replace the existing one in the glob. A warning may be generated about the redefinition. If the old subroutine was C<CvCONST> then the decision about whether to warn is influenced by an expectation about whether the new subroutine will become a constant of similar value. That expectation is determined by C<const_svp>. (Note that the call to this function doesn't make the new subroutine C<CvCONST> in any case; that is left to the caller.) If C<const_svp> is null then it indicates that the new subroutine will not become a constant. 
If C<const_svp> is non-null then it indicates that the new subroutine will become a constant, and it points to an C<SV*> that provides the constant value that the subroutine will have. If the subroutine has one of a few special names, such as C<BEGIN> or C<END>, then it will be claimed by the appropriate queue for automatic running of phase-related subroutines. In this case the relevant glob will be left not containing any subroutine, even if it did contain one before. In the case of C<BEGIN>, the subroutine will be executed and the reference to it disposed of before this function returns, and also before its prototype is set. If a C<BEGIN> subroutine would not be sufficiently constructed by this function to be ready for execution then the caller must prevent this happening by giving the subroutine a different name. The function returns a pointer to the constructed subroutine. If the sub is anonymous then ownership of one counted reference to the subroutine is transferred to the caller. If the sub is named then the caller does not get ownership of a reference. In most such cases, where the sub has a non-phase name, the sub will be alive at the point it is returned by virtue of being contained in the glob that names it. A phase-named subroutine will usually be alive by virtue of the reference owned by the phase's automatic run queue. But a C<BEGIN> subroutine, having already been executed, will quite likely have been destroyed already by the time this function returns, making it erroneous for the caller to make any use of the returned pointer. It is the caller's responsibility to ensure that it knows which of these situations applies. CV * newXS_len_flags(const char *name, STRLEN len, XSUBADDR_t subaddr, const char *const filename, const char *const proto, SV **const_svp, U32 flags) =for hackers Found in file op.c =item optimize_optree X<optimize_optree> This function applies some optimisations to the optree in top-down order. 
It is called before the peephole optimizer, which processes ops in execution order. Note that finalize_optree() also does a top-down scan, but is called *after* the peephole optimizer. void optimize_optree(OP* o) =for hackers Found in file op.c =item traverse_op_tree X<traverse_op_tree> Return the next op in a depth-first traversal of the op tree, returning NULL when the traversal is complete. The initial call must supply the root of the tree as both top and o. For now it's static, but it may be exposed to the API in the future. OP* traverse_op_tree(OP* top, OP* o) =for hackers Found in file op.c =back =head1 Pad Data Structures =over 8 =item CX_CURPAD_SAVE X<CX_CURPAD_SAVE> Save the current pad in the given context block structure. void CX_CURPAD_SAVE(struct context) =for hackers Found in file pad.h =item CX_CURPAD_SV X<CX_CURPAD_SV> Access the SV at offset C<po> in the saved current pad in the given context block structure (can be used as an lvalue). SV * CX_CURPAD_SV(struct context, PADOFFSET po) =for hackers Found in file pad.h =item PAD_BASE_SV X<PAD_BASE_SV> Get the value from slot C<po> in the base (DEPTH=1) pad of a padlist SV * PAD_BASE_SV(PADLIST padlist, PADOFFSET po) =for hackers Found in file pad.h =item PAD_CLONE_VARS X<PAD_CLONE_VARS> Clone the state variables associated with running and compiling pads. void PAD_CLONE_VARS(PerlInterpreter *proto_perl, CLONE_PARAMS* param) =for hackers Found in file pad.h =item PAD_COMPNAME_FLAGS X<PAD_COMPNAME_FLAGS> Return the flags for the current compiling pad name at offset C<po>. Assumes a valid slot entry. U32 PAD_COMPNAME_FLAGS(PADOFFSET po) =for hackers Found in file pad.h =item PAD_COMPNAME_GEN X<PAD_COMPNAME_GEN> The generation number of the name at offset C<po> in the current compiling pad (lvalue). 
STRLEN PAD_COMPNAME_GEN(PADOFFSET po) =for hackers Found in file pad.h =item PAD_COMPNAME_GEN_set X<PAD_COMPNAME_GEN_set> Sets the generation number of the name at offset C<po> in the current compiling pad (lvalue) to C<gen>. STRLEN PAD_COMPNAME_GEN_set(PADOFFSET po, int gen) =for hackers Found in file pad.h =item PAD_COMPNAME_OURSTASH X<PAD_COMPNAME_OURSTASH> Return the stash associated with an C<our> variable. Assumes the slot entry is a valid C<our> lexical. HV * PAD_COMPNAME_OURSTASH(PADOFFSET po) =for hackers Found in file pad.h =item PAD_COMPNAME_PV X<PAD_COMPNAME_PV> Return the name of the current compiling pad name at offset C<po>. Assumes a valid slot entry. char * PAD_COMPNAME_PV(PADOFFSET po) =for hackers Found in file pad.h =item PAD_COMPNAME_TYPE X<PAD_COMPNAME_TYPE> Return the type (stash) of the current compiling pad name at offset C<po>. Must be a valid name. Returns null if not typed. HV * PAD_COMPNAME_TYPE(PADOFFSET po) =for hackers Found in file pad.h =item PadnameIsOUR X<PadnameIsOUR> Whether this is an "our" variable. bool PadnameIsOUR(PADNAME * pn) =for hackers Found in file pad.h =item PadnameIsSTATE X<PadnameIsSTATE> Whether this is a "state" variable. bool PadnameIsSTATE(PADNAME * pn) =for hackers Found in file pad.h =item PadnameOURSTASH X<PadnameOURSTASH> The stash in which this "our" variable was declared. HV * PadnameOURSTASH() =for hackers Found in file pad.h =item PadnameOUTER X<PadnameOUTER> Whether this entry belongs to an outer pad. Entries for which this is true are often referred to as 'fake'. bool PadnameOUTER(PADNAME * pn) =for hackers Found in file pad.h =item PadnameTYPE X<PadnameTYPE> The stash associated with a typed lexical. This returns the C<%Foo::> hash for C<my Foo $bar>. 
HV * PadnameTYPE(PADNAME * pn) =for hackers Found in file pad.h =item PAD_RESTORE_LOCAL X<PAD_RESTORE_LOCAL> Restore the old pad saved into the local variable C<opad> by C<PAD_SAVE_LOCAL()> void PAD_RESTORE_LOCAL(PAD *opad) =for hackers Found in file pad.h =item PAD_SAVE_LOCAL X<PAD_SAVE_LOCAL> Save the current pad to the local variable C<opad>, then make the current pad equal to C<npad> void PAD_SAVE_LOCAL(PAD *opad, PAD *npad) =for hackers Found in file pad.h =item PAD_SAVE_SETNULLPAD X<PAD_SAVE_SETNULLPAD> Save the current pad then set it to null. void PAD_SAVE_SETNULLPAD() =for hackers Found in file pad.h =item PAD_SETSV X<PAD_SETSV> Set the slot at offset C<po> in the current pad to C<sv> SV * PAD_SETSV(PADOFFSET po, SV* sv) =for hackers Found in file pad.h =item PAD_SET_CUR X<PAD_SET_CUR> Set the current pad to be pad C<n> in the padlist, saving the previous current pad. NB currently this macro expands to a string too long for some compilers, so it's best to replace it with SAVECOMPPAD(); PAD_SET_CUR_NOSAVE(padlist,n); void PAD_SET_CUR(PADLIST padlist, I32 n) =for hackers Found in file pad.h =item PAD_SET_CUR_NOSAVE X<PAD_SET_CUR_NOSAVE> like PAD_SET_CUR, but without the save void PAD_SET_CUR_NOSAVE(PADLIST padlist, I32 n) =for hackers Found in file pad.h =item PAD_SV X<PAD_SV> Get the value at offset C<po> in the current pad SV * PAD_SV(PADOFFSET po) =for hackers Found in file pad.h =item PAD_SVl X<PAD_SVl> Lightweight and lvalue version of C<PAD_SV>. Get or set the value at offset C<po> in the current pad. Unlike C<PAD_SV>, does not print diagnostics with -DX. For internal use only. SV * PAD_SVl(PADOFFSET po) =for hackers Found in file pad.h =item SAVECLEARSV X<SAVECLEARSV> Clear the pointed to pad value on scope exit. (i.e. 
the runtime action of C<my>) void SAVECLEARSV(SV **svp) =for hackers Found in file pad.h =item SAVECOMPPAD X<SAVECOMPPAD> save C<PL_comppad> and C<PL_curpad> void SAVECOMPPAD() =for hackers Found in file pad.h =item SAVEPADSV X<SAVEPADSV> Save a pad slot (used to restore after an iteration) XXX DAPM it would make more sense to make the arg a PADOFFSET void SAVEPADSV(PADOFFSET po) =for hackers Found in file pad.h =back =head1 Per-Interpreter Variables =over 8 =item PL_DBsingle X<PL_DBsingle> When Perl is run in debugging mode, with the B<-d> switch, this SV is a boolean which indicates whether subs are being single-stepped. Single-stepping is automatically turned on after every step. This is the C variable which corresponds to Perl's $DB::single variable. See C<L</PL_DBsub>>. SV * PL_DBsingle =for hackers Found in file intrpvar.h =item PL_DBsub X<PL_DBsub> When Perl is run in debugging mode, with the B<-d> switch, this GV contains the SV which holds the name of the sub being debugged. This is the C variable which corresponds to Perl's $DB::sub variable. See C<L</PL_DBsingle>>. GV * PL_DBsub =for hackers Found in file intrpvar.h =item PL_DBtrace X<PL_DBtrace> Trace variable used when Perl is run in debugging mode, with the B<-d> switch. This is the C variable which corresponds to Perl's $DB::trace variable. See C<L</PL_DBsingle>>. SV * PL_DBtrace =for hackers Found in file intrpvar.h =item PL_dowarn X<PL_dowarn> The C variable that roughly corresponds to Perl's C<$^W> warning variable. However, C<$^W> is treated as a boolean, whereas C<PL_dowarn> is a collection of flag bits. U8 PL_dowarn =for hackers Found in file intrpvar.h =item PL_last_in_gv X<PL_last_in_gv> The GV which was last used for a filehandle input operation. (C<< <FH> >>) GV* PL_last_in_gv =for hackers Found in file intrpvar.h =item PL_ofsgv X<PL_ofsgv> The glob containing the output field separator - C<*,> in Perl space. 
GV* PL_ofsgv =for hackers Found in file intrpvar.h =item PL_rs X<PL_rs> The input record separator - C<$/> in Perl space. SV* PL_rs =for hackers Found in file intrpvar.h =back =head1 Stack Manipulation Macros =over 8 =item djSP X<djSP> Declare Just C<SP>. This is actually identical to C<dSP>, and declares a local copy of perl's stack pointer, available via the C<SP> macro. See C<L<perlapi/SP>>. (Available for backward source code compatibility with the old (Perl 5.005) thread model.) djSP(); =for hackers Found in file pp.h =item LVRET X<LVRET> True if this op will be the return value of an lvalue subroutine =for hackers Found in file pp.h =back =head1 SV Flags =over 8 =item SVt_INVLIST X<SVt_INVLIST> Type flag for scalars. See L<perlapi/svtype>. =for hackers Found in file sv.h =back =head1 SV Manipulation Functions An SV (or AV, HV, etc.) is allocated in two parts: the head (struct sv, av, hv...) contains type and reference count information, and for many types, a pointer to the body (struct xrv, xpv, xpviv...), which contains fields specific to each type. Some types store all they need in the head, so don't have a body. In all but the most memory-paranoid configurations (ex: PURIFY), heads and bodies are allocated out of arenas, which by default are approximately 4K chunks of memory parcelled up into N heads or bodies. Sv-bodies are allocated by their sv-type, guaranteeing size consistency needed to allocate safely from arrays. For SV-heads, the first slot in each arena is reserved, and holds a link to the next arena, some flags, and a note of the number of slots. Snaked through each arena chain is a linked list of free items; when this becomes empty, an extra arena is allocated and divided up into N items which are threaded into the free list. SV-bodies are similar, but they use arena-sets by default, which separate the link and info from the arena itself, and reclaim the 1st slot in the arena. SV-bodies are further described later. 
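The arena and free-list scheme described above can be sketched in standalone C. This is a simplified illustration under stated assumptions, not the code in sv.c: the names C<slot>, C<pool>, C<pool_add_arena>, C<pool_new>, C<pool_del>, C<pool_arena_count> and C<pool_free_arenas> are hypothetical, the arena size is a fixed slot count rather than roughly 4K of memory, and the reserved first slot holds only the link to the next arena (the flags and slot count kept by the real implementation are omitted).

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define SLOTS_PER_ARENA 8  /* hypothetical; the real N depends on the arena size */

/* One fixed-size item. While free, `next` threads it onto the free list.
 * In slot 0 of an arena, `next` links to the previously allocated arena. */
typedef struct slot {
    struct slot *next;
    int payload;
} slot;

typedef struct {
    slot *arena_root;  /* most recently added arena */
    slot *free_root;   /* head of the list of free slots */
} pool;

/* Allocate one arena, reserve slot 0 for the arena chain, and thread
 * the remaining slots onto the free list. Returns 0 on success. */
static int pool_add_arena(pool *p)
{
    slot *arena = calloc(SLOTS_PER_ARENA, sizeof *arena);
    if (arena == NULL)
        return -1;
    arena[0].next = p->arena_root;
    p->arena_root = arena;
    for (size_t i = SLOTS_PER_ARENA - 1; i >= 1; i--) {
        arena[i].next = p->free_root;
        p->free_root = &arena[i];
    }
    return 0;
}

/* Grab a slot, adding an extra arena only when the free list is empty. */
static slot *pool_new(pool *p)
{
    if (p->free_root == NULL && pool_add_arena(p) != 0)
        return NULL;
    slot *s = p->free_root;
    p->free_root = s->next;
    s->next = NULL;
    return s;
}

/* Return a slot to the head of the free list. */
static void pool_del(pool *p, slot *s)
{
    assert(s != NULL);
    s->next = p->free_root;
    p->free_root = s;
}

/* Walk the arena chain through the reserved first slots. */
static size_t pool_arena_count(const pool *p)
{
    size_t n = 0;
    for (const slot *a = p->arena_root; a != NULL; a = a[0].next)
        n++;
    return n;
}

/* Release every arena; all slots become invalid. */
static void pool_free_arenas(pool *p)
{
    slot *a = p->arena_root;
    while (a != NULL) {
        slot *next = a[0].next;
        free(a);
        a = next;
    }
    p->arena_root = NULL;
    p->free_root = NULL;
}
```

In this sketch each arena yields C<SLOTS_PER_ARENA - 1> usable slots because the first one is reserved, which mirrors the SV-head layout described above. A freed slot goes to the head of the free list, so it is the next one handed out.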
The following global variables are associated with arenas:

 PL_sv_arenaroot     pointer to list of SV arenas
 PL_sv_root          pointer to list of free SV structures

 PL_body_arenas      head of linked-list of body arenas
 PL_body_roots[]     array of pointers to list of free bodies of svtype
                     arrays are indexed by the svtype needed

A few special SV heads are not allocated from an arena, but are instead directly created in the interpreter structure, e.g. PL_sv_undef. The size of arenas can be changed from the default by setting PERL_ARENA_SIZE appropriately at compile time.

The SV arena serves the secondary purpose of allowing still-live SVs to be located and destroyed during final cleanup.

At the lowest level, the macros new_SV() and del_SV() grab and free an SV head. (If debugging with -DD, del_SV() calls the function S_del_sv() to return the SV to the free list with error checking.) new_SV() calls more_sv() / sv_add_arena() to add an extra arena if the free list is empty. SVs in the free list have their SvTYPE field set to all ones.

At the time of very final cleanup, sv_free_arenas() is called from perl_destruct() to physically free all the arenas allocated since the start of the interpreter.

The function visit() scans the SV arenas list, and calls a specified function for each SV it finds which is still live, i.e. which has an SvTYPE other than all 1's, and a non-zero SvREFCNT. visit() is used by the following functions (specified as [function that calls visit()] / [function called by visit() for each SV]):

    sv_report_used() / do_report_used()
                        dump all remaining SVs (debugging aid)

    sv_clean_objs() / do_clean_objs(),do_clean_named_objs(),
                      do_clean_named_io_objs(),do_curse()
                        Attempt to free all objects pointed to by RVs,
                        try to do the same for all objects indirectly
                        referenced by typeglobs too, and then do a
                        final sweep, cursing any objects that remain.
                        Called once from perl_destruct(), prior to
                        calling sv_clean_all() below.

    sv_clean_all() / do_clean_all()
                        SvREFCNT_dec(sv) each remaining SV, possibly
                        triggering an sv_free(). It also sets the
                        SVf_BREAK flag on the SV to indicate that the
                        refcnt has been artificially lowered, thus
                        stopping sv_free() from giving spurious
                        warnings about SVs which unexpectedly have a
                        refcnt of zero. Called repeatedly from
                        perl_destruct() until there are no SVs left.

=over 8

=item sv_2num
X<sv_2num>

NOTE: this function is experimental and may change or be removed without notice.

Return an SV with the numeric value of the source SV, doing any necessary reference or overload conversion. The caller is expected to have handled get-magic already.

	SV*	sv_2num(SV *const sv)

=for hackers
Found in file sv.c

=item sv_add_arena
X<sv_add_arena>

Given a chunk of memory, link it to the head of the list of arenas, and split it into a list of free SVs.

	void	sv_add_arena(char *const ptr, const U32 size, const U32 flags)

=for hackers
Found in file sv.c

=item sv_clean_all
X<sv_clean_all>

Decrement the refcnt of each remaining SV, possibly triggering a cleanup. This function may have to be called multiple times to free SVs which are in complex self-referential hierarchies.

	I32	sv_clean_all()

=for hackers
Found in file sv.c

=item sv_clean_objs
X<sv_clean_objs>

Attempt to destroy all objects not yet freed.

	void	sv_clean_objs()

=for hackers
Found in file sv.c

=item sv_free_arenas
X<sv_free_arenas>

Deallocate the memory used by all arenas. Note that all the individual SV heads and bodies within the arenas must already have been freed.

	void	sv_free_arenas()

=for hackers
Found in file sv.c

=item SvTHINKFIRST
X<SvTHINKFIRST>

A quick flag check to see whether an C<sv> should be passed to C<sv_force_normal> to be "downgraded" before C<SvIVX> or C<SvPVX> can be modified directly.

For example, if your scalar is a reference and you want to modify the C<SvIVX> slot, you can't just do C<SvROK_off>, as that will leak the referent.
This is used internally by various sv-modifying functions, such as C<sv_setsv>, C<sv_setiv> and C<sv_pvn_force>. One case that this does not handle is a gv without SvFAKE set. After if (SvTHINKFIRST(gv)) sv_force_normal(gv); it will still be a gv. C<SvTHINKFIRST> sometimes produces false positives. In those cases C<sv_force_normal> does nothing. U32 SvTHINKFIRST(SV *sv) =for hackers Found in file sv.h =back =head1 Unicode Support These are various utility functions for manipulating UTF8-encoded strings. For the uninitiated, this is a method of representing arbitrary Unicode characters as a variable number of bytes, in such a way that characters in the ASCII range are unmodified, and a zero byte never appears within non-zero characters. =over 8 =item find_uninit_var X<find_uninit_var> NOTE: this function is experimental and may change or be removed without notice. Find the name of the undefined variable (if any) that caused the operator to issue a "Use of uninitialized value" warning. If match is true, only return a name if its value matches C<uninit_sv>. So roughly speaking, if a unary operator (such as C<OP_COS>) generates a warning, then following the direct child of the op may yield an C<OP_PADSV> or C<OP_GV> that gives the name of the undefined variable. On the other hand, with C<OP_ADD> there are two branches to follow, so we only print the variable name if we get an exact match. C<desc_p> points to a string pointer holding the description of the op. This may be updated if needed. The name is returned as a mortal SV. Assumes that C<PL_op> is the OP that originally triggered the error, and that C<PL_comppad>/C<PL_curpad> points to the currently executing pad. SV* find_uninit_var(const OP *const obase, const SV *const uninit_sv, bool match, const char **desc_p) =for hackers Found in file sv.c =item isSCRIPT_RUN X<isSCRIPT_RUN> Returns a bool as to whether or not the sequence of bytes from C<s> up to but not including C<send> form a "script run". 
C<utf8_target> is TRUE iff the sequence starting at C<s> is to be treated as UTF-8. To be precise, except for two degenerate cases given below, this function returns TRUE iff all code points in it come from any combination of three "scripts" given by the Unicode "Script Extensions" property: Common, Inherited, and possibly one other. Additionally all decimal digits must come from the same consecutive sequence of 10. For example, if all the characters in the sequence are Greek, or Common, or Inherited, this function will return TRUE, provided any decimal digits in it are from the same block of digits in Common. (These are the ASCII digits "0".."9" and additionally a block for full width forms of these, and several others used in mathematical notation.) For scripts (unlike Greek) that have their own digits defined this will accept either digits from that set or from one of the Common digit sets, but not a combination of the two. Some scripts, such as Arabic, have more than one set of digits. All digits must come from the same set for this function to return TRUE. C<*ret_script>, if C<ret_script> is not NULL, will on return of TRUE contain the script found, using the C<SCX_enum> typedef. Its value will be C<SCX_INVALID> if the function returns FALSE. If the sequence is empty, TRUE is returned, but C<*ret_script> (if asked for) will be C<SCX_INVALID>. If the sequence contains a single code point which is unassigned to a character in the version of Unicode being used, the function will return TRUE, and the script will be C<SCX_Unknown>. Any other combination of unassigned code points in the input sequence will result in the function treating the input as not being a script run. The returned script will be C<SCX_Inherited> iff all the code points in it are from the Inherited script. Otherwise, the returned script will be C<SCX_Common> iff all the code points in it are from the Inherited or Common scripts. 
bool isSCRIPT_RUN(const U8 *s, const U8 *send, const bool utf8_target) =for hackers Found in file regexec.c =item is_utf8_non_invariant_string X<is_utf8_non_invariant_string> Returns TRUE if L<perlapi/is_utf8_invariant_string> returns FALSE for the first C<len> bytes of the string C<s>, but they are, nonetheless, legal Perl-extended UTF-8; otherwise returns FALSE. A TRUE return means that at least one code point represented by the sequence either is a wide character not representable as a single byte, or the representation differs depending on whether the sequence is encoded in UTF-8 or not. See also C<L<perlapi/is_utf8_invariant_string>>, C<L<perlapi/is_utf8_string>> bool is_utf8_non_invariant_string(const U8* const s, STRLEN len) =for hackers Found in file inline.h =item report_uninit X<report_uninit> Print appropriate "Use of uninitialized variable" warning. void report_uninit(const SV *uninit_sv) =for hackers Found in file sv.c =item utf8_to_uvuni_buf X<utf8_to_uvuni_buf> DEPRECATED! It is planned to remove this function from a future release of Perl. Do not use it for new code; remove it from existing code. Only in very rare circumstances should code need to be dealing in Unicode (as opposed to native) code points. In those few cases, use C<L<NATIVE_TO_UNI(utf8_to_uvchr_buf(...))|perlapi/utf8_to_uvchr_buf>> instead. If you are not absolutely sure this is one of those cases, then assume it isn't and use plain C<utf8_to_uvchr_buf> instead. Returns the Unicode (not-native) code point of the first character in the string C<s> which is assumed to be in UTF-8 encoding; C<send> points to 1 beyond the end of C<s>. C<retlen> will be set to the length, in bytes, of that character. If C<s> does not point to a well-formed UTF-8 character and UTF8 warnings are enabled, zero is returned and C<*retlen> is set (if C<retlen> isn't NULL) to -1. 
If those warnings are off, the computed value, if well-defined (or the Unicode REPLACEMENT CHARACTER, if not), is silently returned, and C<*retlen> is set (if C<retlen> isn't NULL) so that (S<C<s> + C<*retlen>>) is the next possible position in C<s> that could begin a non-malformed character. See L<perlapi/utf8n_to_uvchr> for details on when the REPLACEMENT CHARACTER is returned. UV utf8_to_uvuni_buf(const U8 *s, const U8 *send, STRLEN *retlen) =for hackers Found in file utf8.c =item uvoffuni_to_utf8_flags X<uvoffuni_to_utf8_flags> THIS FUNCTION SHOULD BE USED IN ONLY VERY SPECIALIZED CIRCUMSTANCES. Instead, B<almost all code should use L<perlapi/uvchr_to_utf8> or L<perlapi/uvchr_to_utf8_flags>>. This function is like them, but the input is a strict Unicode (as opposed to native) code point. Only in very rare circumstances should code not be using the native code point. For details, see the description for L<perlapi/uvchr_to_utf8_flags>. U8* uvoffuni_to_utf8_flags(U8 *d, UV uv, const UV flags) =for hackers Found in file utf8.c =item valid_utf8_to_uvchr X<valid_utf8_to_uvchr> Like C<L<perlapi/utf8_to_uvchr_buf>>, but should only be called when it is known that the next character in the input UTF-8 string C<s> is well-formed (I<e.g.>, it passes C<L<perlapi/isUTF8_CHAR>>). Surrogates, non-character code points, and non-Unicode code points are allowed. UV valid_utf8_to_uvchr(const U8 *s, STRLEN *retlen) =for hackers Found in file inline.h =item variant_under_utf8_count X<variant_under_utf8_count> This function looks at the sequence of bytes between C<s> and C<e>, which are assumed to be encoded in ASCII/Latin1, and returns how many of them would change should the string be translated into UTF-8. Due to the nature of UTF-8, each of these would occupy two bytes instead of the single one in the input string. Thus, this function returns the precise number of bytes the string would expand by when translated to UTF-8. 
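The counting rule just described can be illustrated with a short standalone C sketch. It is not Perl's implementation: C<count_variants> is a hypothetical name, and the sketch assumes an ASCII (non-EBCDIC) platform, where the bytes whose representation changes under UTF-8 are exactly those with the high bit set (0x80 through 0xFF).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the behaviour described above; assumes an
 * ASCII platform. Counts the bytes in [s, e) that would need two bytes
 * once the Latin-1 string is translated to UTF-8. */
static size_t count_variants(const unsigned char *s, const unsigned char *e)
{
    size_t count = 0;

    assert(s <= e);
    while (s < e) {
        if (*s & 0x80)   /* high bit set: variant under UTF-8 */
            count++;
        s++;
    }
    return count;
}
```

For the four Latin-1 bytes C<'c'>, C<'a'>, C<'f'>, C<0xE9>, the sketch counts one variant byte, so the UTF-8 form would need 4 + 1 = 5 bytes. A caller could use such a count to size the destination buffer before converting.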
Unlike most of the other functions that have C<utf8> in their name, the input to this function is NOT a UTF-8-encoded string. The function name is slightly I<odd> to emphasize this. This function is internal to Perl because khw thinks that any XS code that would want this is probably operating too close to the internals. Presenting a valid use case could change that. See also C<L<perlapi/is_utf8_invariant_string>> and C<L<perlapi/is_utf8_invariant_string_loc>>. Size_t variant_under_utf8_count(const U8* const s, const U8* const e) =for hackers Found in file inline.h =back =head1 Undocumented functions The following functions are currently undocumented. If you use one of them, you may wish to consider creating and submitting documentation for it. =over =item ASCII_TO_NEED X<ASCII_TO_NEED> =item NATIVE_TO_NEED X<NATIVE_TO_NEED> =item POPMARK X<POPMARK> =item PadnameIN_SCOPE X<PadnameIN_SCOPE> =item PerlIO_restore_errno X<PerlIO_restore_errno> =item PerlIO_save_errno X<PerlIO_save_errno> =item PerlLIO_dup2_cloexec X<PerlLIO_dup2_cloexec> =item PerlLIO_dup_cloexec X<PerlLIO_dup_cloexec> =item PerlLIO_open3_cloexec X<PerlLIO_open3_cloexec> =item PerlLIO_open_cloexec X<PerlLIO_open_cloexec> =item PerlProc_pipe_cloexec X<PerlProc_pipe_cloexec> =item PerlSock_accept_cloexec X<PerlSock_accept_cloexec> =item PerlSock_socket_cloexec X<PerlSock_socket_cloexec> =item PerlSock_socketpair_cloexec X<PerlSock_socketpair_cloexec> =item ReANY X<ReANY> =item Slab_Alloc X<Slab_Alloc> =item Slab_Free X<Slab_Free> =item Slab_to_ro X<Slab_to_ro> =item Slab_to_rw X<Slab_to_rw> =item TOPMARK X<TOPMARK> =item _add_range_to_invlist X<_add_range_to_invlist> =item _byte_dump_string X<_byte_dump_string> =item _force_out_malformed_utf8_message X<_force_out_malformed_utf8_message> =item _inverse_folds X<_inverse_folds> =item _invlistEQ X<_invlistEQ> =item _invlist_array_init X<_invlist_array_init> =item _invlist_contains_cp X<_invlist_contains_cp> =item _invlist_dump X<_invlist_dump> =item 
_invlist_intersection X<_invlist_intersection> =item _invlist_intersection_maybe_complement_2nd X<_invlist_intersection_maybe_complement_2nd> =item _invlist_invert X<_invlist_invert> =item _invlist_len X<_invlist_len> =item _invlist_search X<_invlist_search> =item _invlist_subtract X<_invlist_subtract> =item _invlist_union X<_invlist_union> =item _invlist_union_maybe_complement_2nd X<_invlist_union_maybe_complement_2nd> =item _is_cur_LC_category_utf8 X<_is_cur_LC_category_utf8> =item _is_in_locale_category X<_is_in_locale_category> =item _is_uni_FOO X<_is_uni_FOO> =item _is_uni_perl_idcont X<_is_uni_perl_idcont> =item _is_uni_perl_idstart X<_is_uni_perl_idstart> =item _is_utf8_FOO X<_is_utf8_FOO> =item _is_utf8_perl_idcont X<_is_utf8_perl_idcont> =item _is_utf8_perl_idstart X<_is_utf8_perl_idstart> =item _mem_collxfrm X<_mem_collxfrm> =item _new_invlist X<_new_invlist> =item _new_invlist_C_array X<_new_invlist_C_array> =item _setup_canned_invlist X<_setup_canned_invlist> =item _to_fold_latin1 X<_to_fold_latin1> =item _to_uni_fold_flags X<_to_uni_fold_flags> =item _to_upper_title_latin1 X<_to_upper_title_latin1> =item _to_utf8_fold_flags X<_to_utf8_fold_flags> =item _to_utf8_lower_flags X<_to_utf8_lower_flags> =item _to_utf8_title_flags X<_to_utf8_title_flags> =item _to_utf8_upper_flags X<_to_utf8_upper_flags> =item _utf8n_to_uvchr_msgs_helper X<_utf8n_to_uvchr_msgs_helper> =item _warn_problematic_locale X<_warn_problematic_locale> =item abort_execution X<abort_execution> =item add_cp_to_invlist X<add_cp_to_invlist> =item alloc_LOGOP X<alloc_LOGOP> =item allocmy X<allocmy> =item amagic_cmp X<amagic_cmp> =item amagic_cmp_desc X<amagic_cmp_desc> =item amagic_cmp_locale X<amagic_cmp_locale> =item amagic_cmp_locale_desc X<amagic_cmp_locale_desc> =item amagic_i_ncmp X<amagic_i_ncmp> =item amagic_i_ncmp_desc X<amagic_i_ncmp_desc> =item amagic_is_enabled X<amagic_is_enabled> =item amagic_ncmp X<amagic_ncmp> =item amagic_ncmp_desc X<amagic_ncmp_desc> =item 
append_utf8_from_native_byte X<append_utf8_from_native_byte> =item apply X<apply> =item av_extend_guts X<av_extend_guts> =item av_nonelem X<av_nonelem> =item av_reify X<av_reify> =item bind_match X<bind_match> =item boot_core_PerlIO X<boot_core_PerlIO> =item boot_core_UNIVERSAL X<boot_core_UNIVERSAL> =item boot_core_mro X<boot_core_mro> =item cando X<cando> =item check_utf8_print X<check_utf8_print> =item ck_anoncode X<ck_anoncode> =item ck_backtick X<ck_backtick> =item ck_bitop X<ck_bitop> =item ck_cmp X<ck_cmp> =item ck_concat X<ck_concat> =item ck_defined X<ck_defined> =item ck_delete X<ck_delete> =item ck_each X<ck_each> =item ck_entersub_args_core X<ck_entersub_args_core> =item ck_eof X<ck_eof> =item ck_eval X<ck_eval> =item ck_exec X<ck_exec> =item ck_exists X<ck_exists> =item ck_ftst X<ck_ftst> =item ck_fun X<ck_fun> =item ck_glob X<ck_glob> =item ck_grep X<ck_grep> =item ck_index X<ck_index> =item ck_isa X<ck_isa> =item ck_join X<ck_join> =item ck_length X<ck_length> =item ck_lfun X<ck_lfun> =item ck_listiob X<ck_listiob> =item ck_match X<ck_match> =item ck_method X<ck_method> =item ck_null X<ck_null> =item ck_open X<ck_open> =item ck_prototype X<ck_prototype> =item ck_readline X<ck_readline> =item ck_refassign X<ck_refassign> =item ck_repeat X<ck_repeat> =item ck_require X<ck_require> =item ck_return X<ck_return> =item ck_rfun X<ck_rfun> =item ck_rvconst X<ck_rvconst> =item ck_sassign X<ck_sassign> =item ck_select X<ck_select> =item ck_shift X<ck_shift> =item ck_smartmatch X<ck_smartmatch> =item ck_sort X<ck_sort> =item ck_spair X<ck_spair> =item ck_split X<ck_split> =item ck_stringify X<ck_stringify> =item ck_subr X<ck_subr> =item ck_substr X<ck_substr> =item ck_svconst X<ck_svconst> =item ck_tell X<ck_tell> =item ck_trunc X<ck_trunc> =item closest_cop X<closest_cop> =item cmp_desc X<cmp_desc> =item cmp_locale_desc X<cmp_locale_desc> =item cmpchain_extend X<cmpchain_extend> =item cmpchain_finish X<cmpchain_finish> =item cmpchain_start X<cmpchain_start> 
=item cntrl_to_mnemonic X<cntrl_to_mnemonic> =item coresub_op X<coresub_op> =item create_eval_scope X<create_eval_scope> =item croak_caller X<croak_caller> =item croak_memory_wrap X<croak_memory_wrap> =item croak_no_mem X<croak_no_mem> =item croak_popstack X<croak_popstack> =item current_re_engine X<current_re_engine> =item custom_op_get_field X<custom_op_get_field> =item cv_ckproto_len_flags X<cv_ckproto_len_flags> =item cv_clone_into X<cv_clone_into> =item cv_const_sv_or_av X<cv_const_sv_or_av> =item cv_undef_flags X<cv_undef_flags> =item cvgv_from_hek X<cvgv_from_hek> =item cvgv_set X<cvgv_set> =item cvstash_set X<cvstash_set> =item deb_stack_all X<deb_stack_all> =item defelem_target X<defelem_target> =item delete_eval_scope X<delete_eval_scope> =item delimcpy_no_escape X<delimcpy_no_escape> =item die_unwind X<die_unwind> =item do_aexec X<do_aexec> =item do_aexec5 X<do_aexec5> =item do_eof X<do_eof> =item do_exec X<do_exec> =item do_exec3 X<do_exec3> =item do_ipcctl X<do_ipcctl> =item do_ipcget X<do_ipcget> =item do_msgrcv X<do_msgrcv> =item do_msgsnd X<do_msgsnd> =item do_ncmp X<do_ncmp> =item do_open6 X<do_open6> =item do_open_raw X<do_open_raw> =item do_print X<do_print> =item do_readline X<do_readline> =item do_seek X<do_seek> =item do_semop X<do_semop> =item do_shmio X<do_shmio> =item do_sysseek X<do_sysseek> =item do_tell X<do_tell> =item do_trans X<do_trans> =item do_uniprop_match X<do_uniprop_match> =item do_vecget X<do_vecget> =item do_vecset X<do_vecset> =item do_vop X<do_vop> =item does_utf8_overflow X<does_utf8_overflow> =item dofile X<dofile> =item drand48_init_r X<drand48_init_r> =item drand48_r X<drand48_r> =item dtrace_probe_call X<dtrace_probe_call> =item dtrace_probe_load X<dtrace_probe_load> =item dtrace_probe_op X<dtrace_probe_op> =item dtrace_probe_phase X<dtrace_probe_phase> =item dump_all_perl X<dump_all_perl> =item dump_packsubs_perl X<dump_packsubs_perl> =item dump_sub_perl X<dump_sub_perl> =item dump_sv_child X<dump_sv_child> =item 
dup_warnings X<dup_warnings> =item emulate_cop_io X<emulate_cop_io> =item find_first_differing_byte_pos X<find_first_differing_byte_pos> =item find_lexical_cv X<find_lexical_cv> =item find_runcv_where X<find_runcv_where> =item find_script X<find_script> =item foldEQ_latin1_s2_folded X<foldEQ_latin1_s2_folded> =item foldEQ_utf8_flags X<foldEQ_utf8_flags> =item form_alien_digit_msg X<form_alien_digit_msg> =item form_cp_too_large_msg X<form_cp_too_large_msg> =item free_tied_hv_pool X<free_tied_hv_pool> =item get_and_check_backslash_N_name X<get_and_check_backslash_N_name> =item get_db_sub X<get_db_sub> =item get_debug_opts X<get_debug_opts> =item get_deprecated_property_msg X<get_deprecated_property_msg> =item get_hash_seed X<get_hash_seed> =item get_invlist_iter_addr X<get_invlist_iter_addr> =item get_invlist_offset_addr X<get_invlist_offset_addr> =item get_invlist_previous_index_addr X<get_invlist_previous_index_addr> =item get_no_modify X<get_no_modify> =item get_opargs X<get_opargs> =item get_prop_definition X<get_prop_definition> =item get_prop_values X<get_prop_values> =item get_re_arg X<get_re_arg> =item get_re_gclass_nonbitmap_data X<get_re_gclass_nonbitmap_data> =item get_regclass_nonbitmap_data X<get_regclass_nonbitmap_data> =item get_regex_charset_name X<get_regex_charset_name> =item getenv_len X<getenv_len> =item grok_bin_oct_hex X<grok_bin_oct_hex> =item grok_bslash_c X<grok_bslash_c> =item grok_bslash_o X<grok_bslash_o> =item grok_bslash_x X<grok_bslash_x> =item gv_fetchmeth_internal X<gv_fetchmeth_internal> =item gv_override X<gv_override> =item gv_setref X<gv_setref> =item gv_stashpvn_internal X<gv_stashpvn_internal> =item gv_stashsvpvn_cached X<gv_stashsvpvn_cached> =item hfree_next_entry X<hfree_next_entry> =item hv_backreferences_p X<hv_backreferences_p> =item hv_kill_backrefs X<hv_kill_backrefs> =item hv_placeholders_p X<hv_placeholders_p> =item hv_pushkv X<hv_pushkv> =item hv_undef_flags X<hv_undef_flags> =item init_argv_symbols 
X<init_argv_symbols> =item init_constants X<init_constants> =item init_dbargs X<init_dbargs> =item init_debugger X<init_debugger> =item init_i18nl10n X<init_i18nl10n> =item init_i18nl14n X<init_i18nl14n> =item init_named_cv X<init_named_cv> =item init_uniprops X<init_uniprops> =item invert X<invert> =item invlist_array X<invlist_array> =item invlist_clear X<invlist_clear> =item invlist_clone X<invlist_clone> =item invlist_contents X<invlist_contents> =item invlist_extend X<invlist_extend> =item invlist_highest X<invlist_highest> =item invlist_is_iterating X<invlist_is_iterating> =item invlist_iterfinish X<invlist_iterfinish> =item invlist_iterinit X<invlist_iterinit> =item invlist_iternext X<invlist_iternext> =item invlist_lowest X<invlist_lowest> =item invlist_max X<invlist_max> =item invlist_previous_index X<invlist_previous_index> =item invlist_set_len X<invlist_set_len> =item invlist_set_previous_index X<invlist_set_previous_index> =item invlist_trim X<invlist_trim> =item invmap_dump X<invmap_dump> =item io_close X<io_close> =item isFF_OVERLONG X<isFF_OVERLONG> =item isFOO_lc X<isFOO_lc> =item is_grapheme X<is_grapheme> =item is_invlist X<is_invlist> =item is_utf8_char_helper X<is_utf8_char_helper> =item is_utf8_common X<is_utf8_common> =item is_utf8_overlong_given_start_byte_ok X<is_utf8_overlong_given_start_byte_ok> =item jmaybe X<jmaybe> =item keyword X<keyword> =item keyword_plugin_standard X<keyword_plugin_standard> =item list X<list> =item load_charnames X<load_charnames> =item localize X<localize> =item lossless_NV_to_IV X<lossless_NV_to_IV> =item magic_clear_all_env X<magic_clear_all_env> =item magic_cleararylen_p X<magic_cleararylen_p> =item magic_clearenv X<magic_clearenv> =item magic_clearisa X<magic_clearisa> =item magic_clearpack X<magic_clearpack> =item magic_clearsig X<magic_clearsig> =item magic_copycallchecker X<magic_copycallchecker> =item magic_existspack X<magic_existspack> =item magic_freearylen_p X<magic_freearylen_p> =item magic_freeovrld 
X<magic_freeovrld> =item magic_get X<magic_get> =item magic_getarylen X<magic_getarylen> =item magic_getdebugvar X<magic_getdebugvar> =item magic_getdefelem X<magic_getdefelem> =item magic_getnkeys X<magic_getnkeys> =item magic_getpack X<magic_getpack> =item magic_getpos X<magic_getpos> =item magic_getsig X<magic_getsig> =item magic_getsubstr X<magic_getsubstr> =item magic_gettaint X<magic_gettaint> =item magic_getuvar X<magic_getuvar> =item magic_getvec X<magic_getvec> =item magic_killbackrefs X<magic_killbackrefs> =item magic_nextpack X<magic_nextpack> =item magic_regdata_cnt X<magic_regdata_cnt> =item magic_regdatum_get X<magic_regdatum_get> =item magic_regdatum_set X<magic_regdatum_set> =item magic_scalarpack X<magic_scalarpack> =item magic_set X<magic_set> =item magic_set_all_env X<magic_set_all_env> =item magic_setarylen X<magic_setarylen> =item magic_setcollxfrm X<magic_setcollxfrm> =item magic_setdbline X<magic_setdbline> =item magic_setdebugvar X<magic_setdebugvar> =item magic_setdefelem X<magic_setdefelem> =item magic_setenv X<magic_setenv> =item magic_setisa X<magic_setisa> =item magic_setlvref X<magic_setlvref> =item magic_setmglob X<magic_setmglob> =item magic_setnkeys X<magic_setnkeys> =item magic_setnonelem X<magic_setnonelem> =item magic_setpack X<magic_setpack> =item magic_setpos X<magic_setpos> =item magic_setregexp X<magic_setregexp> =item magic_setsig X<magic_setsig> =item magic_setsubstr X<magic_setsubstr> =item magic_settaint X<magic_settaint> =item magic_setutf8 X<magic_setutf8> =item magic_setuvar X<magic_setuvar> =item magic_setvec X<magic_setvec> =item magic_sizepack X<magic_sizepack> =item magic_wipepack X<magic_wipepack> =item malloc_good_size X<malloc_good_size> =item malloced_size X<malloced_size> =item mem_collxfrm X<mem_collxfrm> =item mem_log_alloc X<mem_log_alloc> =item mem_log_free X<mem_log_free> =item mem_log_realloc X<mem_log_realloc> =item mg_find_mglob X<mg_find_mglob> =item mode_from_discipline X<mode_from_discipline> =item 
more_bodies X<more_bodies> =item mortal_getenv X<mortal_getenv> =item mro_meta_dup X<mro_meta_dup> =item mro_meta_init X<mro_meta_init> =item multiconcat_stringify X<multiconcat_stringify> =item multideref_stringify X<multideref_stringify> =item my_atof2 X<my_atof2> =item my_atof3 X<my_atof3> =item my_attrs X<my_attrs> =item my_clearenv X<my_clearenv> =item my_lstat_flags X<my_lstat_flags> =item my_memrchr X<my_memrchr> =item my_mkostemp X<my_mkostemp> =item my_mkostemp_cloexec X<my_mkostemp_cloexec> =item my_mkstemp X<my_mkstemp> =item my_mkstemp_cloexec X<my_mkstemp_cloexec> =item my_stat_flags X<my_stat_flags> =item my_strerror X<my_strerror> =item my_unexec X<my_unexec> =item newGP X<newGP> =item newMETHOP_internal X<newMETHOP_internal> =item newSTUB X<newSTUB> =item newSVavdefelem X<newSVavdefelem> =item newXS_deffile X<newXS_deffile> =item new_warnings_bitfield X<new_warnings_bitfield> =item nextargv X<nextargv> =item noperl_die X<noperl_die> =item notify_parser_that_changed_to_utf8 X<notify_parser_that_changed_to_utf8> =item oopsAV X<oopsAV> =item oopsHV X<oopsHV> =item op_clear X<op_clear> =item op_integerize X<op_integerize> =item op_lvalue_flags X<op_lvalue_flags> =item op_refcnt_dec X<op_refcnt_dec> =item op_refcnt_inc X<op_refcnt_inc> =item op_relocate_sv X<op_relocate_sv> =item op_std_init X<op_std_init> =item op_unscope X<op_unscope> =item opmethod_stash X<opmethod_stash> =item opslab_force_free X<opslab_force_free> =item opslab_free X<opslab_free> =item opslab_free_nopad X<opslab_free_nopad> =item package X<package> =item package_version X<package_version> =item pad_add_weakref X<pad_add_weakref> =item padlist_store X<padlist_store> =item padname_free X<padname_free> =item padnamelist_free X<padnamelist_free> =item parse_unicode_opts X<parse_unicode_opts> =item parser_free X<parser_free> =item parser_free_nexttoke_ops X<parser_free_nexttoke_ops> =item path_is_searchable X<path_is_searchable> =item peep X<peep> =item pmruntime X<pmruntime> =item 
populate_isa X<populate_isa> =item ptr_hash X<ptr_hash> =item qerror X<qerror> =item re_exec_indentf X<re_exec_indentf> =item re_indentf X<re_indentf> =item re_intuit_start X<re_intuit_start> =item re_intuit_string X<re_intuit_string> =item re_op_compile X<re_op_compile> =item re_printf X<re_printf> =item reg_named_buff X<reg_named_buff> =item reg_named_buff_iter X<reg_named_buff_iter> =item reg_numbered_buff_fetch X<reg_numbered_buff_fetch> =item reg_numbered_buff_length X<reg_numbered_buff_length> =item reg_numbered_buff_store X<reg_numbered_buff_store> =item reg_qr_package X<reg_qr_package> =item reg_skipcomment X<reg_skipcomment> =item reg_temp_copy X<reg_temp_copy> =item regcurly X<regcurly> =item regprop X<regprop> =item report_evil_fh X<report_evil_fh> =item report_redefined_cv X<report_redefined_cv> =item report_wrongway_fh X<report_wrongway_fh> =item rpeep X<rpeep> =item rsignal_restore X<rsignal_restore> =item rsignal_save X<rsignal_save> =item rxres_save X<rxres_save> =item same_dirent X<same_dirent> =item save_strlen X<save_strlen> =item save_to_buffer X<save_to_buffer> =item sawparens X<sawparens> =item scalar X<scalar> =item scalarvoid X<scalarvoid> =item scan_str X<scan_str> =item scan_word X<scan_word> =item set_caret_X X<set_caret_X> =item set_numeric_standard X<set_numeric_standard> =item set_numeric_underlying X<set_numeric_underlying> =item set_padlist X<set_padlist> =item setfd_cloexec X<setfd_cloexec> =item setfd_cloexec_for_nonsysfd X<setfd_cloexec_for_nonsysfd> =item setfd_cloexec_or_inhexec_by_sysfdness X<setfd_cloexec_or_inhexec_by_sysfdness> =item setfd_inhexec X<setfd_inhexec> =item setfd_inhexec_for_sysfd X<setfd_inhexec_for_sysfd> =item should_warn_nl X<should_warn_nl> =item should_we_output_Debug_r X<should_we_output_Debug_r> =item sighandler X<sighandler> =item sighandler1 X<sighandler1> =item sighandler3 X<sighandler3> =item skipspace_flags X<skipspace_flags> =item softref2xv X<softref2xv> =item sortsv_flags_impl 
X<sortsv_flags_impl> =item sub_crush_depth X<sub_crush_depth> =item sv_add_backref X<sv_add_backref> =item sv_buf_to_ro X<sv_buf_to_ro> =item sv_del_backref X<sv_del_backref> =item sv_free2 X<sv_free2> =item sv_i_ncmp X<sv_i_ncmp> =item sv_i_ncmp_desc X<sv_i_ncmp_desc> =item sv_kill_backrefs X<sv_kill_backrefs> =item sv_len_utf8_nomg X<sv_len_utf8_nomg> =item sv_magicext_mglob X<sv_magicext_mglob> =item sv_ncmp X<sv_ncmp> =item sv_ncmp_desc X<sv_ncmp_desc> =item sv_only_taint_gmagic X<sv_only_taint_gmagic> =item sv_or_pv_pos_u2b X<sv_or_pv_pos_u2b> =item sv_resetpvn X<sv_resetpvn> =item sv_sethek X<sv_sethek> =item sv_setsv_cow X<sv_setsv_cow> =item sv_unglob X<sv_unglob> =item tied_method X<tied_method> =item tmps_grow_p X<tmps_grow_p> =item to_uni_fold X<to_uni_fold> =item to_uni_lower X<to_uni_lower> =item to_uni_title X<to_uni_title> =item to_uni_upper X<to_uni_upper> =item translate_substr_offsets X<translate_substr_offsets> =item try_amagic_bin X<try_amagic_bin> =item try_amagic_un X<try_amagic_un> =item uiv_2buf X<uiv_2buf> =item unshare_hek X<unshare_hek> =item utf16_to_utf8 X<utf16_to_utf8> =item utf16_to_utf8_reversed X<utf16_to_utf8_reversed> =item utf8_to_uvchr_buf_helper X<utf8_to_uvchr_buf_helper> =item utilize X<utilize> =item uvoffuni_to_utf8_flags_msgs X<uvoffuni_to_utf8_flags_msgs> =item uvuni_to_utf8 X<uvuni_to_utf8> =item valid_utf8_to_uvuni X<valid_utf8_to_uvuni> =item variant_byte_number X<variant_byte_number> =item varname X<varname> =item vivify_defelem X<vivify_defelem> =item vivify_ref X<vivify_ref> =item wait4pid X<wait4pid> =item was_lvalue_sub X<was_lvalue_sub> =item watch X<watch> =item win32_croak_not_implemented X<win32_croak_not_implemented> =item write_to_stderr X<write_to_stderr> =item xs_boot_epilog X<xs_boot_epilog> =item xs_handshake X<xs_handshake> =item yyerror X<yyerror> =item yyerror_pv X<yyerror_pv> =item yyerror_pvn X<yyerror_pvn> =item yylex X<yylex> =item yyparse X<yyparse> =item yyquit X<yyquit> =item yyunlex 
X<yyunlex>

=back

=head1 AUTHORS

The autodocumentation system was originally added to the Perl core by Benjamin Stuhl. Documentation is by whoever was kind enough to document their functions.

=head1 SEE ALSO

F<config.h> L<perlapi> L<perlapio> L<perlcall> L<perlclib> L<perlfilter> L<perlguts> L<perlmroapi> L<perlxs> L<perlxstut> L<warnings>

=cut

ex: set ro:

=encoding utf8

=head1 NAME

perl5263delta - what is new for perl v5.26.3

=head1 DESCRIPTION

This document describes differences between the 5.26.2 release and the 5.26.3 release.

If you are upgrading from an earlier release such as 5.26.1, first read L<perl5262delta>, which describes differences between 5.26.1 and 5.26.2.

=head1 Security

=head2 [CVE-2018-12015] Directory traversal in module Archive::Tar

By default, L<Archive::Tar> doesn't allow extracting files outside the current working directory. However, this secure extraction mode could be bypassed by putting a symlink and a regular file with the same name into the tar file.

L<[perl #133250]|https://rt.perl.org/Ticket/Display.html?id=133250>
L<[cpan #125523]|https://rt.cpan.org/Ticket/Display.html?id=125523>

=head2 [CVE-2018-18311] Integer overflow leading to buffer overflow and segmentation fault

Integer arithmetic in C<Perl_my_setenv()> could wrap when the combined length of the environment variable name and value exceeded around 0x7fffffff. This could lead to writing beyond the end of an allocated buffer with attacker supplied data.

L<[perl #133204]|https://rt.perl.org/Ticket/Display.html?id=133204>

=head2 [CVE-2018-18312] Heap-buffer-overflow write in S_regatom (regcomp.c)

A crafted regular expression could cause heap-buffer-overflow write during compilation, potentially allowing arbitrary code execution.
L<[perl #133423]|https://rt.perl.org/Ticket/Display.html?id=133423> =head2 [CVE-2018-18313] Heap-buffer-overflow read in S_grok_bslash_N (regcomp.c) A crafted regular expression could cause heap-buffer-overflow read during compilation, potentially leading to sensitive information being leaked. L<[perl #133192]|https://rt.perl.org/Ticket/Display.html?id=133192> =head2 [CVE-2018-18314] Heap-buffer-overflow write in S_regatom (regcomp.c) A crafted regular expression could cause heap-buffer-overflow write during compilation, potentially allowing arbitrary code execution. L<[perl #131649]|https://rt.perl.org/Ticket/Display.html?id=131649> =head1 Incompatible Changes There are no changes intentionally incompatible with 5.26.2. If any exist, they are bugs, and we request that you submit a report. See L</Reporting Bugs> below. =head1 Modules and Pragmata =head2 Updated Modules and Pragmata =over 4 =item * L<Archive::Tar> has been upgraded from version 2.24 to 2.24_01. =item * L<Module::CoreList> has been upgraded from version 5.20180414_26 to 5.20181129_26. =back =head1 Diagnostics The following additions or changes have been made to diagnostic output, including warnings and fatal error messages. For the complete list of diagnostic messages, see L<perldiag>. =head2 New Diagnostics =head3 New Errors =over 4 =item * L<Unexpected ']' with no following ')' in (?[... in regex; marked by E<lt>-- HERE in mE<sol>%sE<sol>|perldiag/"Unexpected ']' with no following ')' in (?[... in regex; marked by E<lt>-- HERE in mE<sol>%sE<sol>"> (F) While parsing an extended character class a ']' character was encountered at a point in the definition where the only legal use of ']' is to close the character class definition as part of a '])', you may have forgotten the close paren, or otherwise confused the parser. 
=item *

L<Expecting close paren for nested extended charclass in regex; marked by E<lt>-- HERE in mE<sol>%sE<sol>|perldiag/"Expecting close paren for nested extended charclass in regex; marked by E<lt>-- HERE in mE<sol>%sE<sol>">

(F) While parsing a nested extended character class like:

    (?[ ... (?flags:(?[ ... ])) ... ])
                             ^

we expected to see a close paren ')' (marked by ^) but did not.

=item *

L<Expecting close paren for wrapper for nested extended charclass in regex; marked by E<lt>-- HERE in mE<sol>%sE<sol>|perldiag/"Expecting close paren for wrapper for nested extended charclass in regex; marked by E<lt>-- HERE in mE<sol>%sE<sol>">

(F) While parsing a nested extended character class like:

    (?[ ... (?flags:(?[ ... ])) ... ])
                              ^

we expected to see a close paren ')' (marked by ^) but did not.

=back

=head2 Changes to Existing Diagnostics

=over 4

=item *

L<Syntax error in (?[...]) in regex; marked by E<lt>-- HERE in mE<sol>%sE<sol>|perldiag/"Syntax error in (?[...]) in regex; marked by E<lt>-- HERE in mE<sol>%sE<sol>">

This fatal error message has been slightly expanded (from "Syntax error in (?[...]) in regex mE<sol>%sE<sol>") for greater clarity.

=back

=head1 Acknowledgements

Perl 5.26.3 represents approximately 8 months of development since Perl 5.26.2 and contains approximately 4,500 lines of changes across 51 files from 15 authors.

Excluding auto-generated files, documentation and release tools, there were approximately 770 lines of changes to 10 .pm, .t, .c and .h files.

Perl continues to flourish into its third decade thanks to a vibrant community of users and developers. The following people are known to have contributed the improvements that became Perl 5.26.3:

Aaron Crane, Abigail, Chris 'BinGOs' Williams, Dagfinn Ilmari Mannsåker, David Mitchell, H.Merijn Brand, James E Keenan, John SJ Anderson, Karen Etheridge, Karl Williamson, Sawyer X, Steve Hay, Todd Rinaldo, Tony Cook, Yves Orton.

The list above is almost certainly incomplete as it is automatically generated from version control history. In particular, it does not include the names of the (very much appreciated) contributors who reported issues to the Perl bug tracker.

Many of the changes included in this version originated in the CPAN modules included in Perl's core. We're grateful to the entire CPAN community for helping Perl to flourish.

For a more complete list of all of Perl's historical contributors, please see the F<AUTHORS> file in the Perl source distribution.

=head1 Reporting Bugs

If you find what you think is a bug, you might check the perl bug database at L<https://rt.perl.org/> . There may also be information at L<http://www.perl.org/> , the Perl Home Page.

If you believe you have an unreported bug, please run the L<perlbug> program included with your release. Be sure to trim your bug down to a tiny but sufficient test case. Your bug report, along with the output of C<perl -V>, will be sent off to perlbug@perl.org to be analysed by the Perl porting team.

If the bug you are reporting has security implications which make it inappropriate to send to a publicly archived mailing list, then see L<perlsec/SECURITY VULNERABILITY CONTACT INFORMATION> for details of how to report the issue.

=head1 Give Thanks

If you wish to thank the Perl 5 Porters for the work we have done in Perl 5, you can do so by running the C<perlthanks> program:

    perlthanks

This will send an email to the Perl 5 Porters list with your show of thanks.

=head1 SEE ALSO

The F<Changes> file for an explanation of how to view exhaustive details on what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=cut

This document is written in pod format hence there are punctuation characters in odd places. Do not worry, you have apparently got the ASCII->EBCDIC translation worked out correctly.
You can read more about pod in pod/perlpod.pod or the short summary in the INSTALL file. =head1 NAME perlos390 - building and installing Perl for OS/390 and z/OS =head1 SYNOPSIS This document will help you Configure, build, test and install Perl on OS/390 (aka z/OS) Unix System Services. B<This document needs to be updated, but we don't know what it should say. Please submit comments to L<https://github.com/Perl/perl5/issues>.> =head1 DESCRIPTION This is a fully ported Perl for OS/390 Version 2 Release 3, 5, 6, 7, 8, and 9. It may work on other versions or releases, but those are the ones we have tested it on. You may need to carry out some system configuration tasks before running the Configure script for Perl. =head2 Tools The z/OS Unix Tools and Toys list may prove helpful and contains links to ports of much of the software helpful for building Perl. L<http://www.ibm.com/servers/eserver/zseries/zos/unix/bpxa1toy.html> =head2 Unpacking Perl distribution on OS/390 If using ftp remember to transfer the distribution in binary format. Gunzip/gzip for OS/390 is discussed at: http://www.ibm.com/servers/eserver/zseries/zos/unix/bpxa1ty1.html to extract an ASCII tar archive on OS/390, try this: pax -o to=IBM-1047,from=ISO8859-1 -r < latest.tar or zcat latest.tar.Z | pax -o to=IBM-1047,from=ISO8859-1 -r If you get lots of errors of the form tar: FSUM7171 ...: cannot set uid/gid: EDC5139I Operation not permitted you did not read the above and tried to use tar instead of pax, you'll first have to remove the (now corrupt) perl directory rm -rf perl-... and then use pax. =head2 Setup and utilities for Perl on OS/390 Be sure that your yacc installation is in place including any necessary parser template files. If you have not already done so then be sure to: cp /samples/yyparse.c /etc This may also be a good time to ensure that your /etc/protocol file and either your /etc/resolv.conf or /etc/hosts files are in place. 
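
The file checks described above can be scripted before running Configure. The following is only a minimal sketch and is not part of the Perl distribution: the function name C<check_uss_setup> and its optional root-prefix argument are hypothetical, introduced here for illustration. It reports whether /etc/yyparse.c and /etc/protocol exist, and whether at least one of /etc/resolv.conf or /etc/hosts is present.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Perl): verify that the files this
# section mentions are in place. An optional first argument is prepended to
# every path, so the function can be tried against a scratch directory tree.
check_uss_setup() {
    root=${1:-}
    rc=0

    # Each of these files must be present.
    for f in /etc/yyparse.c /etc/protocol; do
        if [ -f "$root$f" ]; then
            echo "ok:      $f"
        else
            echo "missing: $f"
            rc=1
        fi
    done

    # Either one of these files is sufficient.
    if [ -f "$root/etc/resolv.conf" ] || [ -f "$root/etc/hosts" ]; then
        echo "ok:      /etc/resolv.conf or /etc/hosts"
    else
        echo "missing: /etc/resolv.conf and /etc/hosts"
        rc=1
    fi

    return $rc
}
```

Calling C<check_uss_setup> with no argument inspects the real /etc; a non-zero return status means at least one item was reported as missing and should be fixed before Configure is run.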
The IBM document that described such USS system setup issues was SC28-1890-07 "OS/390 UNIX System Services Planning", in particular Chapter 6 on customizing the OE shell. GNU make for OS/390, which is recommended for the build of perl (as well as building CPAN modules and extensions), is available from the L</Tools>. Some people have reported encountering "Out of memory!" errors while trying to build Perl using GNU make binaries. If you encounter such trouble then try to download the source code kit and build GNU make from source to eliminate any such trouble. You might also find GNU make (as well as Perl and Apache) in the red-piece/book "Open Source Software for OS/390 UNIX", SG24-5944-00 from IBM. If instead of the recommended GNU make you would like to use the system supplied make program then be sure to install the default rules file properly via the shell command: cp /samples/startup.mk /etc and be sure to also set the environment variable _C89_CCMODE=1 (exporting _C89_CCMODE=1 is also a good idea for users of GNU make). You might also want to have GNU groff for OS/390 installed before running the "make install" step for Perl. There is a syntax error in the /usr/include/sys/socket.h header file that IBM supplies with USS V2R7, V2R8, and possibly V2R9. The problem with the header file is that near the definition of the SO_REUSEPORT constant there is a spurious extra '/' character outside of a comment like so: #define SO_REUSEPORT 0x0200 /* allow local address & port reuse */ / You could edit that header yourself to remove that last '/', or you might note that Language Environment (LE) APAR PQ39997 describes the problem and PTF's UQ46272 and UQ46271 are the (R8 at least) fixes and apply them. If left unattended that syntax error will turn up as an inability for Perl to build its "Socket" extension. For successful testing you may need to turn on the sticky bit for your world readable /tmp directory if you have not already done so (see man chmod). 
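
Two of the setup steps in this section, the sticky bit and the _C89_CCMODE variable, can be combined in a small shell fragment. This is a sketch under stated assumptions rather than a tested OS/390 recipe: the function name C<ensure_sticky> is hypothetical, and it assumes that C<ls -ld> prints the usual ten-character mode string, in which the last character is "t" or "T" when the sticky bit is set.

```shell
#!/bin/sh
# Hypothetical helper: set the sticky bit on a directory only when it is not
# already set. Assumes the tenth character of the `ls -ld` mode string is
# "t" or "T" when the sticky bit is on.
ensure_sticky() {
    dir=$1
    mode_char=$(ls -ld "$dir" | cut -c10)
    case $mode_char in
        t|T) echo "$dir: sticky bit already set" ;;
        *)   chmod a+t "$dir" && echo "$dir: sticky bit set" ;;
    esac
}

# Suggested in this section for users of the system make and of GNU make.
_C89_CCMODE=1
export _C89_CCMODE
```

Running C<ensure_sticky /tmp> from an account with write access to the directory entry for /tmp applies the change only when it is needed.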
=head2 Configure Perl on OS/390 Once you have unpacked the distribution, run "sh Configure" (see INSTALL for a full discussion of the Configure options). There is a "hints" file for os390 that specifies the correct values for most things. Some things to watch out for include: =head3 Shell A message of the form: (I see you are using the Korn shell. Some ksh's blow up on Configure, mainly on older exotic systems. If yours does, try the Bourne shell instead.) is nothing to worry about at all. =head3 Samples Some of the parser default template files in /samples are needed in /etc. In particular be sure that you at least copy /samples/yyparse.c to /etc before running Perl's Configure. This step ensures successful extraction of EBCDIC versions of parser files such as perly.c and perly.h. This has to be done before running Configure the first time. If you failed to do so then the easiest way to re-Configure Perl is to delete your misconfigured build root and re-extract the source from the tar ball. Then you must ensure that /etc/yyparse.c is properly in place before attempting to re-run Configure. =head3 Dynamic loading Dynamic loading is required if you want to use XS modules from CPAN (like DBI (and DBD's), JSON::XS, and Text::CSV_XS) or update CORE modules from CPAN with newer versions (like Encode) without rebuilding all of the perl binary. This port will support dynamic loading, but it is not selected by default. If you would like to experiment with dynamic loading then be sure to specify -Dusedl in the arguments to the Configure script. See the comments in hints/os390.sh for more information on dynamic loading. If you build with dynamic loading then you will need to add the $archlibexp/CORE directory to your LIBPATH environment variable in order for perl to work. See the config.sh file for the value of $archlibexp. If in trying to use Perl you see an error message similar to: CEE3501S The module libperl.dll was not found. 
            From entry point __dllstaticinit at compile unit offset +00000194 at

then your LIBPATH does not have the location of libperl.x and either libperl.dll or libperl.so in it. Add that directory to your LIBPATH and proceed.

In hints/os390.sh, selecting -Dusedl will default to *also* select -Duseshrplib. Having a shared libperl not only requires LIBPATH to be set to the correct location of libperl.so but also makes it close to impossible to run more than one different perl that was built this way at the same time.

All objects that are involved in -Dusedl builds should be compiled for this, probably by adding to all ccflags

    -qexportall -qxplink -qdll -Wc,XPLINK,dll,EXPORTALL -Wl,XPLINK,dll

=head3 Optimizing

Do not turn on the compiler optimization flag "-O". There is a bug in either the optimizer or perl that causes perl to not work correctly when the optimizer is on.

=head3 Config files

Some of the configuration files in /etc used by the networking APIs are either missing or have the wrong names. In particular, make sure that there's either an /etc/resolv.conf or an /etc/hosts, so that gethostbyname() works, and make sure that the file /etc/proto has been renamed to /etc/protocol (NOT /etc/protocols, as used by other Unix systems). You may have to look for things like HOSTNAME and DOMAINORIGIN in the "//'SYS1.TCPPARMS(TCPDATA)'" PDS member in order to properly set up your /etc networking files.

=head2 Build, Test, Install Perl on OS/390

Simply put:

    sh Configure
    make
    make test

If everything looks OK (see the next section for test/IVP diagnosis) then:

    make install

This last step may or may not require UID=0 privileges depending on how you answered the questions that Configure asked and whether or not you have write access to the directories you specified.

=head2 Build Anomalies with Perl on OS/390

"Out of memory!" messages during the build of Perl are most often fixed by rebuilding the GNU make utility for OS/390 from a source code kit.

Building debugging-enabled binaries (with -g or -g3) will increase the chance of getting these errors. Avoid -g if possible.

Another memory limiting item to check is your MAXASSIZE parameter in your 'SYS1.PARMLIB(BPXPRMxx)' data set (note too that as of V2R8 address space limits can be set on a per user ID basis in the USS segment of a RACF profile). People have reported successful builds of Perl with MAXASSIZE parameters as small as 503316480 (and it may be possible to build Perl with a MAXASSIZE smaller than that).

Within USS your /etc/profile or $HOME/.profile may limit your ulimit settings. Check that the following command returns reasonable values:

    ulimit -a

To conserve memory you should have your compiler modules loaded into the Link Pack Area (LPA/ELPA) rather than in a link list or step lib.

If the c89 compiler complains of syntax errors during the build of the Socket extension then be sure to fix the syntax error in the system header /usr/include/sys/socket.h.

=head2 Testing Anomalies with Perl on OS/390

The "make test" step runs a Perl Verification Procedure, usually before installation. You might encounter STDERR messages even during a successful run of "make test". Here is a guide to some of the more commonly seen anomalies:

=head3 Signals

A message of the form:

    io/openpid...........CEE5210S The signal SIGHUP was received.
    CEE5210S The signal SIGHUP was received.
    CEE5210S The signal SIGHUP was received.
    ok

indicates that the t/io/openpid.t test of Perl has passed but done so with extraneous messages on stderr from CEE.

=head3 File::Temp

A message of the form:

    lib/ftmp-security....File::Temp::_gettemp: Parent directory (/tmp/) is not safe
    (sticky bit not set when world writable?) at lib/ftmp-security.t line 100
    File::Temp::_gettemp: Parent directory (/tmp/) is not safe (sticky bit not
    set when world writable?) at lib/ftmp-security.t line 100
    ok

indicates a problem with the permissions on your /tmp directory within the HFS.
To correct that problem, issue the command:

    chmod a+t /tmp

from an account with write access to the directory entry for /tmp.

=head3 Out of Memory!

Recent versions of the perl test suite are quite memory hungry. In addition to the comments above on memory limitations it is also worth checking for _CEE_RUNOPTS in your environment. Perl now has (in miniperlmain.c) a C #pragma to set CEE run options, but the environment variable wins.

The C code asks for:

    #pragma runopts(HEAP(2M,500K,ANYWHERE,KEEP,8K,4K) STACK(,,ANY,) ALL31(ON))

The important parts of that are the second argument (the increment) to HEAP, and allowing the stack to be "Above the (16M) line". If the heap increment is too small then when perl (for example loading unicode/Name.pl) tries to create a "big" (400K+) string it cannot fit in a single segment and you get "Out of Memory!", even if there is still plenty of memory available.

A related issue is use with perl's malloc. Perl's malloc uses C<sbrk()> to get memory, and C<sbrk()> is limited to the first allocation so in this case something like:

    HEAP(8M,500K,ANYWHERE,KEEP,8K,4K)

is needed to get through the test suite.

=head2 Installation Anomalies with Perl on OS/390

The installman script will try to run on OS/390. There will be fewer errors if you have a roff utility installed. You can obtain GNU groff from the Redbook SG24-5944-00 ftp site.

=head2 Usage Hints for Perl on OS/390

When using perl on OS/390 please keep in mind that the EBCDIC and ASCII character sets are different. See perlebcdic.pod for more on such character set issues. Perl builtin functions that may behave differently under EBCDIC are also mentioned in the perlport.pod document.

Open Edition (UNIX System Services) from V2R8 onward does support #!/path/to/perl script invocation. There is a PTF available from IBM for V2R7 that will allow shell/kernel support for #!. USS releases prior to V2R7 did not support the #! means of script invocation.
If you are running V2R6 or earlier then see:

    head `whence perldoc`

for an example of how to use the "eval exec" trick to ask the shell to have Perl run your scripts on those older releases of Unix System Services.

If you are having trouble with square brackets then consider switching your rlogin or telnet client. Try to avoid older 3270 emulators and ISHELL for working with Perl on USS.

=head2 Floating Point Anomalies with Perl on OS/390

There appears to be a bug in the floating point implementation on S/390 systems such that calling int() on the product of a number and a small magnitude number is not the same as calling int() on the quotient of that number and a large magnitude number. For example, in the following Perl code:

    my $x = 100000.0;
    my $y = int($x * 1e-5) * 1e5; # '0'
    my $z = int($x / 1e+5) * 1e5; # '100000'
    print "\$y is $y and \$z is $z\n"; # $y is 0 and $z is 100000

Although one would expect the quantities $y and $z to be the same and equal to 100000 they will differ and instead will be 0 and 100000 respectively.

The problem can be further examined in a roughly equivalent C program:

    #include <stdio.h>
    #include <math.h>
    int main(void)
    {
        double r1, r2;   /* fractional parts returned by modf(), unused */
        double x = 100000.0;
        double y = 0.0;
        double z = 0.0;
        x = 100000.0 * 1e-5;
        r1 = modf(x, &y);
        x = 100000.0 / 1e+5;
        r2 = modf(x, &z);
        printf("y is %e and z is %e\n", y*1e5, z*1e5);
        /* y is 0.000000e+00 and z is 1.000000e+05 (with c89) */
        return 0;
    }

=head2 Modules and Extensions for Perl on OS/390

Pure Perl (that is non XS) modules may be installed via the usual:

    perl Makefile.PL
    make
    make test
    make install

If you built perl with dynamic loading capability then that would also be the way to build XS based extensions. However, if you built perl with the default static linking you can still build XS based extensions for OS/390 but you will need to follow the instructions in ExtUtils::MakeMaker for building statically linked perl binaries.
In the simplest configurations building a static perl + XS extension boils down to: perl Makefile.PL make make perl make test make install make -f Makefile.aperl inst_perl MAP_TARGET=perl In most cases people have reported better results with GNU make rather than the system's /bin/make program, whether for plain modules or for XS based extensions. If the make process encounters trouble with either compilation or linking then try setting the _C89_CCMODE to 1. Assuming sh is your login shell then run: export _C89_CCMODE=1 If tcsh is your login shell then use the setenv command. =head1 AUTHORS David Fiander and Peter Prymmer with thanks to Dennis Longnecker and William Raffloer for valuable reports, LPAR and PTF feedback. Thanks to Mike MacIsaac and Egon Terwedow for SG24-5944-00. Thanks to Ignasi Roca for pointing out the floating point problems. Thanks to John Goodyear for dynamic loading help. =head1 SEE ALSO L<INSTALL>, L<perlport>, L<perlebcdic>, L<ExtUtils::MakeMaker>. http://www.ibm.com/servers/eserver/zseries/zos/unix/bpxa1toy.html http://www.redbooks.ibm.com/redbooks/SG245944.html http://www.ibm.com/servers/eserver/zseries/zos/unix/bpxa1ty1.html#opensrc http://www.xray.mpe.mpg.de/mailing-lists/perl-mvs/ http://publibz.boulder.ibm.com:80/cgi-bin/bookmgr_OS390/BOOKS/ceea3030/ http://publibz.boulder.ibm.com:80/cgi-bin/bookmgr_OS390/BOOKS/CBCUG030/ =head2 Mailing list for Perl on OS/390 If you are interested in the z/OS (formerly known as OS/390) and POSIX-BC (BS2000) ports of Perl then see the perl-mvs mailing list. To subscribe, send an empty message to perl-mvs-subscribe@perl.org. See also: https://lists.perl.org/list/perl-mvs.html There are web archives of the mailing list at: https://www.nntp.perl.org/group/perl.mvs/ =head1 HISTORY This document was originally written by David Fiander for the 5.005 release of Perl. This document was podified for the 5.005_03 release of Perl 11 March 1999. Updated 12 November 2000 for the 5.7.1 release of Perl. 

Updated 15 January 2001 for the 5.7.1 release of Perl.

Updated 24 January 2001 to mention dynamic loading.

Updated 12 March 2001 to mention //'SYS1.TCPPARMS(TCPDATA)'.

Updated 28 November 2001 for broken URLs.

Updated 03 October 2019 for perl-5.32.0+

=cut

=head1 NAME

perlunicode - Unicode support in Perl

=head1 DESCRIPTION

If you haven't already, before reading this document, you should become familiar with both L<perlunitut> and L<perluniintro>.

Unicode aims to B<UNI>-fy the en-B<CODE>-ings of all the world's character sets into a single Standard. For quite a few of the various coding standards that existed when Unicode was first created, converting from each to Unicode essentially meant adding a constant to each code point in the original standard, and converting back meant just subtracting that same constant. For ASCII and ISO-8859-1, the constant is 0. For ISO-8859-5 (Cyrillic), the constant is 864; for Hebrew (ISO-8859-8), it's 1264; Thai (ISO-8859-11), 3424; and so forth. This made it easy to do the conversions, and facilitated the adoption of Unicode.

And it worked; nowadays, those legacy standards are rarely used. Most everyone uses Unicode.

Unicode is a comprehensive standard. It specifies many things outside the scope of Perl, such as how to display sequences of characters. For a full discussion of all aspects of Unicode, see L<https://www.unicode.org>.

=head2 Important Caveats

Even though some of this section may not be understandable to you on first reading, we think it's important enough to highlight some of the gotchas before delving further, so here goes:

Unicode support is an extensive requirement. While Perl does not implement the Unicode standard or the accompanying technical reports from cover to cover, Perl does support many Unicode features.

Also, the use of Unicode may present security issues that aren't obvious, see L</Security Implications of Unicode> below.
=over 4 =item Safest if you C<use feature 'unicode_strings'> In order to preserve backward compatibility, Perl does not turn on full internal Unicode support unless the pragma L<S<C<use feature 'unicode_strings'>>|feature/The 'unicode_strings' feature> is specified. (This is automatically selected if you S<C<use 5.012>> or higher.) Failure to do this can trigger unexpected surprises. See L</The "Unicode Bug"> below. This pragma doesn't affect I/O. Nor does it change the internal representation of strings, only their interpretation. There are still several places where Unicode isn't fully supported, such as in filenames. =item Input and Output Layers Use the C<:encoding(...)> layer to read from and write to filehandles using the specified encoding. (See L<open>.) =item You must convert your non-ASCII, non-UTF-8 Perl scripts to be UTF-8. The L<encoding> module has been deprecated since perl 5.18 and the perl internals it requires have been removed with perl 5.26. =item C<use utf8> still needed to enable L<UTF-8|/Unicode Encodings> in scripts If your Perl script is itself encoded in L<UTF-8|/Unicode Encodings>, the S<C<use utf8>> pragma must be explicitly included to enable recognition of that (in string or regular expression literals, or in identifier names). B<This is the only time when an explicit S<C<use utf8>> is needed.> (See L<utf8>). If a Perl script begins with the bytes that form the UTF-8 encoding of the Unicode BYTE ORDER MARK (C<BOM>, see L</Unicode Encodings>), those bytes are completely ignored. =item L<UTF-16|/Unicode Encodings> scripts autodetected If a Perl script begins with the Unicode C<BOM> (UTF-16LE, UTF16-BE), or if the script looks like non-C<BOM>-marked UTF-16 of either endianness, Perl will correctly read in the script as the appropriate Unicode encoding. =back =head2 Byte and Character Semantics Before Unicode, most encodings used 8 bits (a single byte) to encode each character. 
Thus a character was a byte, and a byte was a character, and there could be only 256 or fewer possible characters. "Byte Semantics" in the title of this section refers to this behavior. There was no need to distinguish between "Byte" and "Character". Then along comes Unicode which has room for over a million characters (and Perl allows for even more). This means that a character may require more than a single byte to represent it, and so the two terms are no longer equivalent. What matter are the characters as whole entities, and not usually the bytes that comprise them. That's what the term "Character Semantics" in the title of this section refers to. Perl had to change internally to decouple "bytes" from "characters". It is important that you too change your ideas, if you haven't already, so that "byte" and "character" no longer mean the same thing in your mind. The basic building block of Perl strings has always been a "character". The changes basically come down to that the implementation no longer thinks that a character is always just a single byte. There are various things to note: =over 4 =item * String handling functions, for the most part, continue to operate in terms of characters. C<length()>, for example, returns the number of characters in a string, just as before. But that number no longer is necessarily the same as the number of bytes in the string (there may be more bytes than characters). The other such functions include C<chop()>, C<chomp()>, C<substr()>, C<pos()>, C<index()>, C<rindex()>, C<sort()>, C<sprintf()>, and C<write()>. The exceptions are: =over 4 =item * the bit-oriented C<vec> E<nbsp> =item * the byte-oriented C<pack>/C<unpack> C<"C"> format However, the C<W> specifier does operate on whole characters, as does the C<U> specifier. =item * some operators that interact with the platform's operating system Operators dealing with filenames are examples. 
=item *

when the functions are called from within the scope of the S<C<L<use bytes|bytes>>> pragma

Likely, you should use this only for debugging anyway.

=back

=item *

Strings--including hash keys--and regular expression patterns may contain characters that have ordinal values larger than 255.

If you use a Unicode editor to edit your program, Unicode characters may occur directly within the literal strings in UTF-8 encoding, or UTF-16. (The former requires a C<use utf8>, the latter may require a C<BOM>.)

L<perluniintro/Creating Unicode> gives other ways to place non-ASCII characters in your strings.

=item *

The C<chr()> and C<ord()> functions work on whole characters.

=item *

Regular expressions match whole characters. For example, C<"."> matches a whole character instead of only a single byte.

=item *

The C<tr///> operator translates whole characters. (Note that the C<tr///CU> functionality has been removed. For similar functionality to that, see C<pack('U0', ...)> and C<pack('C0', ...)>).

=item *

C<scalar reverse()> reverses by character rather than by byte.

=item *

The bit string operators, C<& | ^ ~> and (starting in v5.22) C<&. |. ^. ~.> can operate on bit strings encoded in UTF-8, but this can give unexpected results if any of the strings contain code points above 0xFF. Starting in v5.28, it is a fatal error to have such an operand. Otherwise, the operation is performed on a non-UTF-8 copy of the operand. If you're not sure about the encoding of a string, downgrade it before using any of these operators; you can use L<C<utf8::downgrade()>|utf8/Utility functions>.

=back

The bottom line is that Perl has always practiced "Character Semantics", but with the advent of Unicode, that is now different than "Byte Semantics".

=head2 ASCII Rules versus Unicode Rules

Before Unicode, when a character was a byte was a character, Perl knew only about the 128 characters defined by ASCII, code points 0 through 127 (except for under L<S<C<use locale>>|perllocale>).
That left the code points 128 to 255 as unassigned, and available for whatever use a program might want. The only semantics they have is their ordinal numbers, and that they are members of none of the non-negative character classes. None are considered to match C<\w> for example, but all match C<\W>. Unicode, of course, assigns each of those code points a particular meaning (along with ones above 255). To preserve backward compatibility, Perl only uses the Unicode meanings when there is some indication that Unicode is what is intended; otherwise the non-ASCII code points remain treated as if they are unassigned. Here are the ways that Perl knows that a string should be treated as Unicode: =over =item * Within the scope of S<C<use utf8>> If the whole program is Unicode (signified by using 8-bit B<U>nicode B<T>ransformation B<F>ormat), then all literal strings within it must be Unicode. =item * Within the scope of L<S<C<use feature 'unicode_strings'>>|feature/The 'unicode_strings' feature> This pragma was created so you can explicitly tell Perl that operations executed within its scope are to use Unicode rules. More operations are affected with newer perls. See L</The "Unicode Bug">. =item * Within the scope of S<C<use 5.012>> or higher This implicitly turns on S<C<use feature 'unicode_strings'>>. =item * Within the scope of L<S<C<use locale 'not_characters'>>|perllocale/Unicode and UTF-8>, or L<S<C<use locale>>|perllocale> and the current locale is a UTF-8 locale. The former is defined to imply Unicode handling; and the latter indicates a Unicode locale, hence a Unicode interpretation of all strings within it. =item * When the string contains a Unicode-only code point Perl has never accepted code points above 255 without them being Unicode, so their use implies Unicode for the whole string. 
=item *

When the string contains a Unicode named code point C<\N{...}>

The C<\N{...}> construct explicitly refers to a Unicode code point, even if it is one that is also in ASCII. Therefore the string containing it must be Unicode.

=item *

When the string has come from an external source marked as Unicode

The L<C<-C>|perlrun/-C [numberE<sol>list]> command line option can specify that certain inputs to the program are Unicode, and the values of this can be read by your Perl code; see L<perlvar/"${^UNICODE}">.

=item *

When the string has been upgraded to UTF-8

The function L<C<utf8::upgrade()>|utf8/Utility functions> can be explicitly used to permanently (unless a subsequent C<utf8::downgrade()> is called) cause a string to be treated as Unicode.

=item *

There are additional methods for regular expression patterns

A pattern that is compiled with the C<< /u >> or C<< /a >> modifiers is treated as Unicode (though there are some restrictions with C<< /a >>). Under the C<< /d >> and C<< /l >> modifiers, there are several other indications for Unicode; see L<perlre/Character set modifiers>.

=back

Note that all of the above are overridden within the scope of C<L<use bytes|bytes>>; but you should be using this pragma only for debugging.

Note also that some interactions with the platform's operating system never use Unicode rules.

When Unicode rules are in effect:

=over 4

=item *

Case translation operators use the Unicode case translation tables.

Note that C<uc()>, or C<\U> in interpolated strings, translates to uppercase, while C<ucfirst>, or C<\u> in interpolated strings, translates to titlecase in languages that make the distinction (which is equivalent to uppercase in languages without the distinction).

There is a CPAN module, C<L<Unicode::Casing>>, which allows you to define your own mappings to be used in C<lc()>, C<lcfirst()>, C<uc()>, C<ucfirst()>, and C<fc> (or their double-quoted string inlined versions such as C<\U>).
(Prior to Perl 5.16, this functionality was partially provided in the Perl core, but suffered from a number of insurmountable drawbacks, so the CPAN module was written instead.) =item * Character classes in regular expressions match based on the character properties specified in the Unicode properties database. C<\w> can be used to match a Japanese ideograph, for instance; and C<[[:digit:]]> a Bengali number. =item * Named Unicode properties, scripts, and block ranges may be used (like bracketed character classes) by using the C<\p{}> "matches property" construct and the C<\P{}> negation, "doesn't match property". See L</"Unicode Character Properties"> for more details. You can define your own character properties and use them in the regular expression with the C<\p{}> or C<\P{}> construct. See L</"User-Defined Character Properties"> for more details. =back =head2 Extended Grapheme Clusters (Logical characters) Consider a character, say C<H>. It could appear with various marks around it, such as an acute accent, or a circumflex, or various hooks, circles, arrows, I<etc.>, above, below, to one side or the other, I<etc>. There are many possibilities among the world's languages. The number of combinations is astronomical, and if there were a character for each combination, it would soon exhaust Unicode's more than a million possible characters. So Unicode took a different approach: there is a character for the base C<H>, and a character for each of the possible marks, and these can be variously combined to get a final logical character. So a logical character--what appears to be a single character--can be a sequence of more than one individual characters. The Unicode standard calls these "extended grapheme clusters" (which is an improved version of the no-longer much used "grapheme cluster"); Perl furnishes the C<\X> regular expression construct to match such sequences in their entirety. 
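
As a minimal sketch of the difference between individual code points and logical characters, the following counts both for a decomposed string. The variable names are illustrative only.

```perl
use strict;
use warnings;

# "E" followed by U+0301 COMBINING ACUTE ACCENT:
# two code points that display as one logical character.
my $decomposed = "E\x{301}";

# length() counts code points, not logical characters.
my $code_points = length $decomposed;

# Each \X match consumes one whole extended grapheme cluster.
my @clusters      = $decomposed =~ /\X/g;
my $cluster_count = scalar @clusters;

print "code points: $code_points\n";
print "clusters:    $cluster_count\n";
```

Here C<length> reports 2 while the C<\X> count is 1, which is why code that needs to step through what a reader sees as characters would typically iterate with C<\X> rather than by index.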
But Unicode's intent is to unify the existing character set standards and practices, and several pre-existing standards have single characters that mean the same thing as some of these combinations, like ISO-8859-1, which has quite a few of them. For example, C<"LATIN CAPITAL LETTER E WITH ACUTE"> was already in this standard when Unicode came along. Unicode therefore added it to its repertoire as that single character. But this character is considered by Unicode to be equivalent to the sequence consisting of the character C<"LATIN CAPITAL LETTER E"> followed by the character C<"COMBINING ACUTE ACCENT">. C<"LATIN CAPITAL LETTER E WITH ACUTE"> is called a "pre-composed" character, and its equivalence with the "E" and the "COMBINING ACCENT" sequence is called canonical equivalence. All pre-composed characters are said to have a decomposition (into the equivalent sequence), and the decomposition type is also called canonical. A string may be comprised as much as possible of precomposed characters, or it may be comprised of entirely decomposed characters. Unicode calls these respectively, "Normalization Form Composed" (NFC) and "Normalization Form Decomposed". The C<L<Unicode::Normalize>> module contains functions that convert between the two. A string may also have both composed characters and decomposed characters; this module can be used to make it all one or the other. You may be presented with strings in any of these equivalent forms. There is currently nothing in Perl 5 that ignores the differences. So you'll have to specially handle it. The usual advice is to convert your inputs to C<NFD> before processing further. For more detailed information, see L<http://unicode.org/reports/tr15/>. =head2 Unicode Character Properties (The only time that Perl considers a sequence of individual code points as a single logical character is in the C<\X> construct, already mentioned above. Therefore "character" in this discussion means a single Unicode code point.) 
Very nearly all Unicode character properties are accessible through regular expressions by using the C<\p{}> "matches property" construct and the C<\P{}> "doesn't match property" for its negation. For instance, C<\p{Uppercase}> matches any single character with the Unicode C<"Uppercase"> property, while C<\p{L}> matches any character with a C<General_Category> of C<"L"> (letter) property (see L</General_Category> below). Brackets are not required for single letter property names, so C<\p{L}> is equivalent to C<\pL>. More formally, C<\p{Uppercase}> matches any single character whose Unicode C<Uppercase> property value is C<True>, and C<\P{Uppercase}> matches any character whose C<Uppercase> property value is C<False>, and they could have been written as C<\p{Uppercase=True}> and C<\p{Uppercase=False}>, respectively. This formality is needed when properties are not binary; that is, if they can take on more values than just C<True> and C<False>. For example, the C<Bidi_Class> property (see L</"Bidirectional Character Types"> below), can take on several different values, such as C<Left>, C<Right>, C<Whitespace>, and others. To match these, one needs to specify both the property name (C<Bidi_Class>), AND the value being matched against (C<Left>, C<Right>, I<etc.>). This is done, as in the examples above, by having the two components separated by an equal sign (or interchangeably, a colon), like C<\p{Bidi_Class: Left}>. All Unicode-defined character properties may be written in these compound forms of C<\p{I<property>=I<value>}> or C<\p{I<property>:I<value>}>, but Perl provides some additional properties that are written only in the single form, as well as single-form short-cuts for all binary properties and certain others described below, in which you may omit the property name and the equals or colon separator. 
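
The single and compound forms described above can be compared side by side. This is only a sketch; the variable names are made up for illustration, and the C<Bidi_Class> checks use the short values C<L> and C<R>.

```perl
use strict;
use warnings;

my $char = "A";

# Binary property: the single form and both compound spellings.
my $single  = $char =~ /\p{Uppercase}/       ? 1 : 0;
my $equals  = $char =~ /\p{Uppercase=True}/  ? 1 : 0;
my $colon   = $char =~ /\p{Uppercase: True}/ ? 1 : 0;
my $negated = $char =~ /\P{Uppercase}/       ? 1 : 0;

# Single-letter General_Category values may drop the braces.
my $braced   = $char =~ /\p{L}/ ? 1 : 0;
my $unbraced = $char =~ /\pL/   ? 1 : 0;

# Non-binary property: both the name and the value are given.
my $bidi_l = $char =~ /\p{Bidi_Class=L}/ ? 1 : 0;   # left-to-right
my $bidi_r = $char =~ /\p{Bidi_Class:R}/ ? 1 : 0;   # right-to-left

print "single=$single equals=$equals colon=$colon negated=$negated\n";
print "braced=$braced unbraced=$unbraced\n";
print "bidi_l=$bidi_l bidi_r=$bidi_r\n";
```

For C<"A">, every positive uppercase and letter test succeeds, C<\P{Uppercase}> fails, and only the left-to-right C<Bidi_Class> test matches.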
Most Unicode character properties have at least two synonyms (or aliases if you prefer): a short one that is easier to type and a longer one that is more descriptive and hence easier to understand. Thus the C<"L"> and C<"Letter"> properties above are equivalent and can be used interchangeably. Likewise, C<"Upper"> is a synonym for C<"Uppercase">, and we could have written C<\p{Uppercase}> equivalently as C<\p{Upper}>. Also, there are typically various synonyms for the values the property can be. For binary properties, C<"True"> has 3 synonyms: C<"T">, C<"Yes">, and C<"Y">; and C<"False"> has correspondingly C<"F">, C<"No">, and C<"N">. But be careful. A short form of a value for one property may not mean the same thing as the short form spelled the same for another. Thus, for the C<L</General_Category>> property, C<"L"> means C<"Letter">, but for the L<C<Bidi_Class>|/Bidirectional Character Types> property, C<"L"> means C<"Left">. A complete list of properties and synonyms is in L<perluniprops>. Upper/lower case differences in property names and values are irrelevant; thus C<\p{Upper}> means the same thing as C<\p{upper}> or even C<\p{UpPeR}>. Similarly, you can add or subtract underscores anywhere in the middle of a word, so that these are also equivalent to C<\p{U_p_p_e_r}>. And white space is generally irrelevant adjacent to non-word characters, such as the braces and the equals or colon separators, so C<\p{ Upper }> and C<\p{ Upper_case : Y }> are equivalent to these as well. In fact, white space and even hyphens can usually be added or deleted anywhere. So even C<\p{ Up-per case = Yes}> is equivalent. All this is called "loose-matching" by Unicode. The "name" property has some restrictions on this due to a few outlier names. Full details are given in L<https://www.unicode.org/reports/tr44/tr44-24.html#UAX44-LM2>. 
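
A short sketch of loose matching, and of the warning about short value names, might look like the following. The spellings are taken from the text above; the counters and the choice of ARABIC LETTER ALEF as a sample character are illustrative assumptions.

```perl
use strict;
use warnings;

# Spellings the text above describes as equivalent under loose matching.
my @spellings = (
    qr/\p{Upper}/,
    qr/\p{upper}/,
    qr/\p{UpPeR}/,
    qr/\p{U_p_p_e_r}/,
    qr/\p{ Upper }/,
    qr/\p{ Upper_case : Y }/,
);

# Count how many spellings match an uppercase and a lowercase letter.
my ($match_upper, $match_lower) = (0, 0);
for my $re (@spellings) {
    $match_upper++ if "A" =~ $re;
    $match_lower++ if "a" =~ $re;
}
print "spellings matching 'A': $match_upper\n";
print "spellings matching 'a': $match_lower\n";

# The same short value can mean different things for different properties.
my $alef = "\N{U+0627}";    # ARABIC LETTER ALEF
my $gc_l = $alef =~ /\p{General_Category=L}/ ? 1 : 0;   # "L" is Letter
my $bc_l = $alef =~ /\p{Bidi_Class=L}/       ? 1 : 0;   # "L" is left-to-right
print "gc=L: $gc_l, bc=L: $bc_l\n";
```

All six spellings should behave identically, while the Arabic letter is a letter (C<gc=L>) but is not left-to-right (C<bc=L>).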
The few places where stricter matching is used is in the middle of numbers, the "name" property, and in the Perl extension properties that begin or end with an underscore. Stricter matching cares about white space (except adjacent to non-word characters), hyphens, and non-interior underscores. You can also use negation in both C<\p{}> and C<\P{}> by introducing a caret (C<^>) between the first brace and the property name: C<\p{^Tamil}> is equal to C<\P{Tamil}>. Almost all properties are immune to case-insensitive matching. That is, adding a C</i> regular expression modifier does not change what they match. There are two sets that are affected. The first set is C<Uppercase_Letter>, C<Lowercase_Letter>, and C<Titlecase_Letter>, all of which match C<Cased_Letter> under C</i> matching. And the second set is C<Uppercase>, C<Lowercase>, and C<Titlecase>, all of which match C<Cased> under C</i> matching. This set also includes its subsets C<PosixUpper> and C<PosixLower> both of which under C</i> match C<PosixAlpha>. (The difference between these sets is that some things, such as Roman numerals, come in both upper and lower case so they are C<Cased>, but aren't considered letters, so they aren't C<Cased_Letter>'s.) See L</Beyond Unicode code points> for special considerations when matching Unicode properties against non-Unicode code points. =head3 B<General_Category> Every Unicode character is assigned a general category, which is the "most usual categorization of a character" (from L<https://www.unicode.org/reports/tr44>). The compound way of writing these is like C<\p{General_Category=Number}> (short: C<\p{gc:n}>). But Perl furnishes shortcuts in which everything up through the equal or colon separator is omitted. So you can instead just write C<\pN>. 
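
The caret negation and the effect of C</i> on the two affected sets can be sketched as follows. The sample characters (TAMIL LETTER A and SMALL ROMAN NUMERAL ONE) and the variable names are chosen here for illustration, and the C</i> results assume a reasonably recent perl.

```perl
use strict;
use warnings;

my $tamil_a = "\N{U+0B85}";    # TAMIL LETTER A

# \p{^Tamil} and \P{Tamil} give the same answer for any character.
my $caret_latin = "A"      =~ /\p{^Tamil}/ ? 1 : 0;
my $bigp_latin  = "A"      =~ /\P{Tamil}/  ? 1 : 0;
my $caret_tamil = $tamil_a =~ /\p{^Tamil}/ ? 1 : 0;
my $bigp_tamil  = $tamil_a =~ /\P{Tamil}/  ? 1 : 0;

# Under /i, Uppercase_Letter matches as Cased_Letter.
my $lu_plain = "a" =~ /\p{Uppercase_Letter}/  ? 1 : 0;
my $lu_fold  = "a" =~ /\p{Uppercase_Letter}/i ? 1 : 0;

# A lowercase Roman numeral is Cased but is not a Cased_Letter.
my $roman      = "\N{U+2170}";    # SMALL ROMAN NUMERAL ONE
my $upper_fold = $roman =~ /\p{Uppercase}/i        ? 1 : 0;
my $lu_roman   = $roman =~ /\p{Uppercase_Letter}/i ? 1 : 0;

print "caret/bigP on 'A':   $caret_latin/$bigp_latin\n";
print "caret/bigP on Tamil: $caret_tamil/$bigp_tamil\n";
print "Lu plain=$lu_plain fold=$lu_fold\n";
print "Roman: Uppercase/i=$upper_fold Lu/i=$lu_roman\n";
```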
Here are the short and long forms of the values the C<General_Category> property can have:

    Short       Long

    L           Letter
    LC, L&      Cased_Letter (that is: [\p{Ll}\p{Lu}\p{Lt}])
    Lu          Uppercase_Letter
    Ll          Lowercase_Letter
    Lt          Titlecase_Letter
    Lm          Modifier_Letter
    Lo          Other_Letter

    M           Mark
    Mn          Nonspacing_Mark
    Mc          Spacing_Mark
    Me          Enclosing_Mark

    N           Number
    Nd          Decimal_Number (also Digit)
    Nl          Letter_Number
    No          Other_Number

    P           Punctuation (also Punct)
    Pc          Connector_Punctuation
    Pd          Dash_Punctuation
    Ps          Open_Punctuation
    Pe          Close_Punctuation
    Pi          Initial_Punctuation
                (may behave like Ps or Pe depending on usage)
    Pf          Final_Punctuation
                (may behave like Ps or Pe depending on usage)
    Po          Other_Punctuation

    S           Symbol
    Sm          Math_Symbol
    Sc          Currency_Symbol
    Sk          Modifier_Symbol
    So          Other_Symbol

    Z           Separator
    Zs          Space_Separator
    Zl          Line_Separator
    Zp          Paragraph_Separator

    C           Other
    Cc          Control (also Cntrl)
    Cf          Format
    Cs          Surrogate
    Co          Private_Use
    Cn          Unassigned

Single-letter properties match all characters in any of the two-letter sub-properties starting with the same letter. C<LC> and C<L&> are special: both are aliases for the set consisting of everything matched by C<Ll>, C<Lu>, and C<Lt>.

=head3 B<Bidirectional Character Types>

Because scripts differ in their directionality (Hebrew and Arabic are written right to left, for example) Unicode supplies a C<Bidi_Class> property. Some of the values this property can have are:

    Value       Meaning

    L           Left-to-Right
    LRE         Left-to-Right Embedding
    LRO         Left-to-Right Override
    R           Right-to-Left
    AL          Arabic Letter
    RLE         Right-to-Left Embedding
    RLO         Right-to-Left Override
    PDF         Pop Directional Format
    EN          European Number
    ES          European Separator
    ET          European Terminator
    AN          Arabic Number
    CS          Common Separator
    NSM         Non-Spacing Mark
    BN          Boundary Neutral
    B           Paragraph Separator
    S           Segment Separator
    WS          Whitespace
    ON          Other Neutrals

This property is always written in the compound form. For example, C<\p{Bidi_Class:R}> matches characters that are normally written right to left.
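
To make the two tables above concrete, here is a small sketch that tests a few sample letters. The choice of samples and the variable names are illustrative assumptions.

```perl
use strict;
use warnings;

my $latin_a     = "A";
my $hebrew_alef = "\N{U+05D0}";    # HEBREW LETTER ALEF
my $arabic_alef = "\N{U+0627}";    # ARABIC LETTER ALEF

# A single-letter General_Category value covers its two-letter values.
my $heb_is_l  = $hebrew_alef =~ /\p{L}/  ? 1 : 0;   # Letter
my $heb_is_lo = $hebrew_alef =~ /\p{Lo}/ ? 1 : 0;   # Other_Letter
my $heb_is_lc = $hebrew_alef =~ /\p{LC}/ ? 1 : 0;   # Cased_Letter
my $lat_is_lc = $latin_a     =~ /\p{LC}/ ? 1 : 0;

# Bidi_Class is always written in the compound form.
my $heb_is_r  = $hebrew_alef =~ /\p{Bidi_Class:R}/  ? 1 : 0;
my $lat_is_r  = $latin_a     =~ /\p{Bidi_Class:R}/  ? 1 : 0;
my $ara_is_r  = $arabic_alef =~ /\p{Bidi_Class:R}/  ? 1 : 0;
my $ara_is_al = $arabic_alef =~ /\p{Bidi_Class:AL}/ ? 1 : 0;

print "Hebrew alef: L=$heb_is_l Lo=$heb_is_lo LC=$heb_is_lc R=$heb_is_r\n";
print "Latin A:     LC=$lat_is_lc R=$lat_is_r\n";
print "Arabic alef: R=$ara_is_r AL=$ara_is_al\n";
```

Note that the Arabic letter is written right to left but has the separate C<AL> value, so C<\p{Bidi_Class:R}> alone does not match it.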
Unlike the C<L</General_Category>> property, this property can have more values added in a future Unicode release. Those listed above comprised the complete set for many Unicode releases, but others were added in Unicode 6.3; you can always find what the current ones are in L<perluniprops>. And L<https://www.unicode.org/reports/tr9/> describes how to use them. =head3 B<Scripts> The world's languages are written in many different scripts. This sentence (unless you're reading it in translation) is written in Latin, while Russian is written in Cyrillic, and Greek is written in, well, Greek; Japanese mainly in Hiragana or Katakana. There are many more. The Unicode C<Script> and C<Script_Extensions> properties give what script a given character is in. The C<Script_Extensions> property is an improved version of C<Script>, as demonstrated below. Either property can be specified with the compound form like C<\p{Script=Hebrew}> (short: C<\p{sc=hebr}>), or C<\p{Script_Extensions=Javanese}> (short: C<\p{scx=java}>). In addition, Perl furnishes shortcuts for all C<Script_Extensions> property names. You can omit everything up through the equals (or colon), and simply write C<\p{Latin}> or C<\P{Cyrillic}>. (This is not true for C<Script>, which is required to be written in the compound form. Prior to Perl v5.26, the single form returned the plain old C<Script> version, but was changed because C<Script_Extensions> gives better results.) The difference between these two properties involves characters that are used in multiple scripts. For example the digits '0' through '9' are used in many parts of the world. These are placed in a script named C<Common>. Other characters are used in just a few scripts. For example, the C<"KATAKANA-HIRAGANA DOUBLE HYPHEN"> is used in both Japanese scripts, Katakana and Hiragana, but nowhere else. 
The C<Script> property places all characters that are used in multiple scripts in the C<Common> script, while the C<Script_Extensions> property places those that are used in only a few scripts into each of those scripts; while still using C<Common> for those used in many scripts. Thus both these match:

 "0" =~ /\p{sc=Common}/     # Matches
 "0" =~ /\p{scx=Common}/    # Matches

and only the first of these matches:

 "\N{KATAKANA-HIRAGANA DOUBLE HYPHEN}" =~ /\p{sc=Common}/  # Matches
 "\N{KATAKANA-HIRAGANA DOUBLE HYPHEN}" =~ /\p{scx=Common}/ # No match

And only the last two of these match:

 "\N{KATAKANA-HIRAGANA DOUBLE HYPHEN}" =~ /\p{sc=Hiragana}/  # No match
 "\N{KATAKANA-HIRAGANA DOUBLE HYPHEN}" =~ /\p{sc=Katakana}/  # No match
 "\N{KATAKANA-HIRAGANA DOUBLE HYPHEN}" =~ /\p{scx=Hiragana}/ # Matches
 "\N{KATAKANA-HIRAGANA DOUBLE HYPHEN}" =~ /\p{scx=Katakana}/ # Matches

C<Script_Extensions> is thus an improved C<Script>, in which there are fewer characters in the C<Common> script, and correspondingly more in other scripts. It is new in Unicode version 6.0, and its data are likely to change significantly in later releases, as things get sorted out. New code should probably be using C<Script_Extensions> and not plain C<Script>. If you compile perl with a Unicode release that doesn't have C<Script_Extensions>, the single form Perl extensions will instead refer to the plain C<Script> property. If you compile with a version of Unicode that doesn't have the C<Script> property, these extensions will not be defined at all.

(Actually, besides C<Common>, the C<Inherited> script contains characters that are used in multiple scripts. These are modifier characters which inherit the script value of the controlling character. Some of these are used in many scripts, and so go into C<Inherited> in both C<Script> and C<Script_Extensions>. Others are used in just a few scripts, so are in C<Inherited> in C<Script>, but not in C<Script_Extensions>.)
It is worth stressing that there are several different sets of digits in Unicode that are equivalent to 0-9 and are matchable by C<\d> in a regular expression. If they are used in a single language only, they are in that language's C<Script> and C<Script_Extensions>. If they are used in more than one script, they will be in C<sc=Common>, but only if they are used in many scripts should they be in C<scx=Common>. The explanation above has omitted some detail; refer to UAX#24 "Unicode Script Property": L<https://www.unicode.org/reports/tr24>. A complete list of scripts and their shortcuts is in L<perluniprops>. =head3 B<Use of the C<"Is"> Prefix> For backward compatibility (with ancient Perl 5.6), all properties writable without using the compound form mentioned so far may have C<Is> or C<Is_> prepended to their name, so C<\P{Is_Lu}>, for example, is equal to C<\P{Lu}>, and C<\p{IsScript:Arabic}> is equal to C<\p{Arabic}>. =head3 B<Blocks> In addition to B<scripts>, Unicode also defines B<blocks> of characters. The difference between scripts and blocks is that the concept of scripts is closer to natural languages, while the concept of blocks is more of an artificial grouping based on groups of Unicode characters with consecutive ordinal values. For example, the C<"Basic Latin"> block is all the characters whose ordinals are between 0 and 127, inclusive; in other words, the ASCII characters. The C<"Latin"> script contains some letters from this as well as several other blocks, like C<"Latin-1 Supplement">, C<"Latin Extended-A">, I<etc.>, but it does not contain all the characters from those blocks. It does not, for example, contain the digits 0-9, because those digits are shared across many scripts, and hence are in the C<Common> script. 
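
The difference between a block and a script can be checked directly. The following is a sketch using three sample characters chosen for illustration; the hash layout is likewise just one way to organize the results.

```perl
use strict;
use warnings;

my %sample = (
    letter_A => "A",         # U+0041, in the Basic Latin block
    digit_5  => "5",         # U+0035, in the Basic Latin block
    e_acute  => "\x{E9}",    # U+00E9, in the Latin-1 Supplement block
);

# Record block and script membership for each sample character.
my %result;
for my $name (sort keys %sample) {
    my $char = $sample{$name};
    $result{$name} = {
        basic_latin_block => ($char =~ /\p{Blk=Basic_Latin}/ ? 1 : 0),
        latin_script      => ($char =~ /\p{Script=Latin}/    ? 1 : 0),
        common_script     => ($char =~ /\p{Script=Common}/   ? 1 : 0),
    };
    printf "%-9s block=%d latin=%d common=%d\n", $name,
        @{ $result{$name} }{qw(basic_latin_block latin_script common_script)};
}
```

The digit is in the C<Basic Latin> block but in the C<Common> script, while the accented letter is in the C<Latin> script despite lying outside that block.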
For more about scripts versus blocks, see UAX#24 "Unicode Script Property": L<https://www.unicode.org/reports/tr24> The C<Script_Extensions> or C<Script> properties are likely to be the ones you want to use when processing natural language; the C<Block> property may occasionally be useful in working with the nuts and bolts of Unicode. Block names are matched in the compound form, like C<\p{Block: Arrows}> or C<\p{Blk=Hebrew}>. Unlike most other properties, only a few block names have a Unicode-defined short name. Perl also defines single form synonyms for the block property in cases where these do not conflict with something else. But don't use any of these, because they are unstable. Since these are Perl extensions, they are subordinate to official Unicode property names; Unicode doesn't know nor care about Perl's extensions. It may happen that a name that currently means the Perl extension will later be changed without warning to mean a different Unicode property in a future version of the perl interpreter that uses a later Unicode release, and your code would no longer work. The extensions are mentioned here for completeness: Take the block name and prefix it with one of: C<In> (for example C<\p{Blk=Arrows}> can currently be written as C<\p{In_Arrows}>); or sometimes C<Is> (like C<\p{Is_Arrows}>); or sometimes no prefix at all (C<\p{Arrows}>). As of this writing (Unicode 9.0) there are no conflicts with using the C<In_> prefix, but there are plenty with the other two forms. For example, C<\p{Is_Hebrew}> and C<\p{Hebrew}> mean C<\p{Script_Extensions=Hebrew}> which is NOT the same thing as C<\p{Blk=Hebrew}>. Our advice used to be to use the C<In_> prefix as a single form way of specifying a block. But Unicode 8.0 added properties whose names begin with C<In>, and it's now clear that it's only luck that's so far prevented a conflict. 
Using C<In> is only marginally less typing than C<Blk:>, and the latter's meaning is clearer anyway, and guaranteed to never conflict. So don't take chances. Use C<\p{Blk=foo}> for new code. And be sure that block is what you really really want to do. In most cases scripts are what you want instead.

A complete list of blocks is in L<perluniprops>.

=head3 B<Other Properties>

There are many more properties than the very basic ones described here. A complete list is in L<perluniprops>.

Unicode defines all its properties in the compound form, so all single-form properties are Perl extensions. Most of these are just synonyms for the Unicode ones, but some are genuine extensions, including several that are in the compound form. And quite a few of these are actually recommended by Unicode (in L<https://www.unicode.org/reports/tr18>).

This section gives some details on all extensions that aren't just synonyms for compound-form Unicode properties (for those properties, you'll have to refer to the L<Unicode Standard|https://www.unicode.org/reports/tr44>).

=over

=item B<C<\p{All}>>

This matches every possible code point. It is equivalent to C<qr/./s>. Unlike all the other non-user-defined C<\p{}> property matches, no warning is ever generated if this property is matched against a non-Unicode code point (see L</Beyond Unicode code points> below).

=item B<C<\p{Alnum}>>

This matches any C<\p{Alphabetic}> or C<\p{Decimal_Number}> character.

=item B<C<\p{Any}>>

This matches any of the 1_114_112 Unicode code points. It is a synonym for C<\p{Unicode}>.

=item B<C<\p{ASCII}>>

This matches any of the 128 characters in the US-ASCII character set, which is a subset of Unicode.

=item B<C<\p{Assigned}>>

This matches any assigned code point; that is, any code point whose L<general category|/General_Category> is not C<Unassigned> (or equivalently, not C<Cn>).

=item B<C<\p{Blank}>>

This is the same as C<\h> and C<\p{HorizSpace}>: A character that changes the spacing horizontally.
=item B<C<\p{Decomposition_Type: Non_Canonical}>> (Short: C<\p{Dt=NonCanon}>) Matches a character that has a non-canonical decomposition. The L</Extended Grapheme Clusters (Logical characters)> section above talked about canonical decompositions. However, many more characters have a different type of decomposition, a "compatible" or "non-canonical" decomposition. The sequences that form these decompositions are not considered canonically equivalent to the pre-composed character. An example is the C<"SUPERSCRIPT ONE">. It is somewhat like a regular digit 1, but not exactly; its decomposition into the digit 1 is called a "compatible" decomposition, specifically a "super" decomposition. There are several such compatibility decompositions (see L<https://www.unicode.org/reports/tr44>), including one called "compat", which means some miscellaneous type of decomposition that doesn't fit into the other decomposition categories that Unicode has chosen. Note that most Unicode characters don't have a decomposition, so their decomposition type is C<"None">. For your convenience, Perl has added the C<Non_Canonical> decomposition type to mean any of the several compatibility decompositions. =item B<C<\p{Graph}>> Matches any character that is graphic. Theoretically, this means a character that on a printer would cause ink to be used. =item B<C<\p{HorizSpace}>> This is the same as C<\h> and C<\p{Blank}>: a character that changes the spacing horizontally. =item B<C<\p{In=*}>> This is a synonym for C<\p{Present_In=*}> =item B<C<\p{PerlSpace}>> This is the same as C<\s>, restricted to ASCII, namely C<S<[ \f\n\r\t]>> and starting in Perl v5.18, a vertical tab. Mnemonic: Perl's (original) space =item B<C<\p{PerlWord}>> This is the same as C<\w>, restricted to ASCII, namely C<[A-Za-z0-9_]> Mnemonic: Perl's (original) word. 
=item B<C<\p{Posix...}>>

There are several of these, which are equivalents, using the C<\p{}> notation, for Posix classes and are described in L<perlrecharclass/POSIX Character Classes>.

=item B<C<\p{Present_In: *}>> (Short: C<\p{In=*}>)

This property is used when you need to know in what Unicode version(s) a character is.

The "*" above stands for some Unicode version number, such as C<1.1> or C<12.0>; or the "*" can also be C<Unassigned>. This property will match the code points whose final disposition has been settled as of the Unicode release given by the version number; C<\p{Present_In: Unassigned}> will match those code points whose meaning has yet to be assigned.

For example, C<U+0041> C<"LATIN CAPITAL LETTER A"> was present in the very first Unicode release available, which is C<1.1>, so this property is true for all valid "*" versions. On the other hand, C<U+1EFF> was not assigned until version 5.1 when it became C<"LATIN SMALL LETTER Y WITH LOOP">, so the only "*" that would match it are 5.1, 5.2, and later.

Unicode furnishes the C<Age> property from which this is derived. The problem with Age is that a strict interpretation of it (which Perl takes) has it matching the precise release a code point's meaning is introduced in. Thus C<U+0041> would match only 1.1; and C<U+1EFF> only 5.1. This is not usually what you want.

Some non-Perl implementations of the Age property may change its meaning to be the same as the Perl C<Present_In> property; just be aware of that.

Another confusion with both these properties is that the definition is not that the code point has been I<assigned>, but that the meaning of the code point has been I<determined>. This is because 66 code points will always be unassigned, and so the C<Age> for them is the Unicode version in which the decision to make them so was made.
For example, C<U+FDD0> is to be permanently unassigned to a character, and the decision to do that was made in version 3.1, so C<\p{Age=3.1}> matches this character, as also does C<\p{Present_In: 3.1}> and up.

=item B<C<\p{Print}>>

This matches any character that is graphical or blank, except controls.

=item B<C<\p{SpacePerl}>>

This is the same as C<\s>, including beyond ASCII.

Mnemonic: Space, as modified by Perl. (It doesn't include the vertical tab until v5.18, which both the Posix standard and Unicode consider white space.)

=item B<C<\p{Title}>> and B<C<\p{Titlecase}>>

Under case-sensitive matching, these both match the same code points as C<\p{General Category=Titlecase_Letter}> (C<\p{gc=lt}>). The difference is that under C</i> caseless matching, these match the same as C<\p{Cased}>, whereas C<\p{gc=lt}> matches C<\p{Cased_Letter}>.

=item B<C<\p{Unicode}>>

This matches any of the 1_114_112 Unicode code points. It is a synonym for C<\p{Any}>.

=item B<C<\p{VertSpace}>>

This is the same as C<\v>: A character that changes the spacing vertically.

=item B<C<\p{Word}>>

This is the same as C<\w>, including over 100_000 characters beyond ASCII.

=item B<C<\p{XPosix...}>>

There are several of these, which are the standard Posix classes extended to the full Unicode range. They are described in L<perlrecharclass/POSIX Character Classes>.

=back

=head2 Comparison of C<\N{...}> and C<\p{name=...}>

Starting in Perl 5.32, you can specify a character by its name in regular expression patterns using C<\p{name=...}>. This is in addition to the longstanding method of using C<\N{...}>. The following summarizes the differences between these two:

                        \N{...}        \p{Name=...}
 can interpolate     only with eval        yes            [1]
 custom names             yes              no             [2]
 name aliases             yes              yes            [3]
 named sequences          yes              yes            [4]
 name value parsing      exact         Unicode loose      [5]

=over

=item [1]

The ability to interpolate means you can do something like

    qr/\p{na=latin capital letter $which}/

and specify C<$which> elsewhere.
=item [2]

You can create your own names for characters, and override official ones when using C<\N{...}>. See L<charnames/CUSTOM ALIASES>.

=item [3]

Some characters have multiple names (synonyms).

=item [4]

Some particular sequences of characters are given a single name, in addition to their individual ones.

=item [5]

Exact name value matching means you have to specify case, hyphens, underscores, and spaces precisely in the name you want. Loose matching follows the Unicode rules L<https://www.unicode.org/reports/tr44/tr44-24.html#UAX44-LM2>, where these are mostly irrelevant. Except for a few outlier character names, these are the same rules as are already used for any other C<\p{...}> property.

=back

=head2 Wildcards in Property Values

Starting in Perl 5.30, it is possible to do something like this:

    qr!\p{numeric_value=/\A[0-5]\z/}!

or, by abbreviating and adding C</x>,

    qr! \p{nv= /(?x) \A [0-5] \z / }!

This matches all code points whose numeric value is one of 0, 1, 2, 3, 4, or 5. This particular example could instead have been written as

    qr! \A [ \p{nv=0}\p{nv=1}\p{nv=2}\p{nv=3}\p{nv=4}\p{nv=5} ] \z !xx

in earlier perls, so in this case this feature just makes things easier and shorter to write. If we hadn't included the C<\A> and C<\z>, these would have matched things like C<1E<sol>2> because that contains a 1 (as well as a 2). As written, it matches things like subscripts that have these numeric values. If we only wanted the decimal digits with those numeric values, we could say,

    qr! (?[ \d & \p{nv=/[0-5]/} ]) !x

The C<\d> gets rid of needing to anchor the pattern, since it forces the result to only match C<[0-9]>, and the C<[0-5]> further restricts it.

The text in the above examples enclosed between the C<"E<sol>"> characters can be just about any regular expression. It is independent of the main pattern, so doesn't share any capturing groups, I<etc>.
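The effect of combining a numeric-value test with C<\d> can be tried without the experimental wildcard feature, using only the bracketed-class spelling shown above. The following is a minimal sketch under that assumption; the variable names C<$nv_0_to_5> and C<$digit_0_to_5> and the sample characters are chosen purely for illustration, and a lookahead stands in for the C<(?[ ])> intersection.

```perl
use strict;
use warnings;

# Any code point whose Numeric_Value is 0 through 5 (no wildcard needed).
my $nv_0_to_5 = qr/[\p{nv=0}\p{nv=1}\p{nv=2}\p{nv=3}\p{nv=4}\p{nv=5}]/;

# The same set restricted to decimal digits: the lookahead requires \d too.
my $digit_0_to_5 = qr/(?=\d)$nv_0_to_5/;

# Illustrative samples only; the keys are the official character names.
my %samples = (
    'DIGIT THREE'              => "3",
    'DIGIT SEVEN'              => "7",
    'SUPERSCRIPT TWO'          => "\x{B2}",
    'ARABIC-INDIC DIGIT FOUR'  => "\x{664}",
    'VULGAR FRACTION ONE HALF' => "\x{BD}",
);

for my $name (sort keys %samples) {
    my $char = $samples{$name};
    printf "%-26s nv 0-5: %-3s decimal digit 0-5: %s\n",
        $name,
        ($char =~ /\A$nv_0_to_5\z/    ? 'yes' : 'no'),
        ($char =~ /\A$digit_0_to_5\z/ ? 'yes' : 'no');
}
```

SUPERSCRIPT TWO has a numeric value of 2 but is not a decimal digit, so it is accepted by the first pattern only, which is the situation the C<\d> restriction is meant to handle.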
The delimiters for it must be ASCII punctuation, but it may NOT be delimited by C<"{">, nor C<"}"> nor contain a literal C<"}">, as that delimits the end of the enclosing C<\p{}>. Like any pattern, certain other delimiters are terminated by their mirror images. These are C<"(">, C<"[">, and C<"E<lt>">. If the delimiter is any of C<"-">, C<"_">, C<"+">, or C<"\">, or is the same delimiter as is used for the enclosing pattern, it must be preceded by a backslash escape, both fore and aft.

Beware of using C<"$"> to indicate to match the end of the string. It can too easily be interpreted as being a punctuation variable, like C<$/>.

No modifiers may follow the final delimiter. Instead, use L<perlre/(?adlupimnsx-imnsx)> and/or L<perlre/(?adluimnsx-imnsx:pattern)> to specify modifiers. However, certain modifiers are illegal in your wildcard subpattern. The only character set modifier specifiable is C</aa>; any other character set, and C<-m>, and C<p>, and C<s> are all illegal. Specifying modifiers like C<qr/.../gc> that aren't legal in the C<(?...)> notation normally raises a warning, but with wildcard subpatterns, their use is an error. The C<m> modifier is ineffective; everything that matches will be a single line.

By default, your pattern is matched case-insensitively, as if C</i> had been specified. You can change this by saying C<(?-i)> in your pattern.

There are also certain operations that are illegal. You can't nest C<\p{...}> and C<\P{...}> calls within a wildcard subpattern, and C<\G> doesn't make sense, so is also prohibited.

And the C<*> quantifier (or its equivalent C<{0,}>) is illegal.

This feature is not available when the left-hand side is prefixed by C<Is_>, nor for any form that is marked as "Discouraged" in L<perluniprops/Discouraged>.

This experimental feature has been added to begin to implement L<https://www.unicode.org/reports/tr18/#Wildcard_Properties>. Using it will raise a (default-on) warning in the C<experimental::uniprop_wildcards> category.
We reserve the right to change its operation as we gain experience.

Your subpattern can be just about anything, but for it to have some utility, it should match when called with either or both of a) the full name of the property value with underscores (and/or spaces in the Block property) and some things uppercase; or b) the property value in all lowercase with spaces and underscores squeezed out. For example,

    qr!\p{Blk=/Old I.*/}!
    qr!\p{Blk=/oldi.*/}!

would match the same things.

Another example that shows that within C<\p{...}>, C</x> isn't needed to have spaces:

    qr!\p{scx= /Hebrew|Greek/ }!

To be safe, we should have anchored the above example, to prevent matches for something like C<Hebrew_Braille>, but there aren't any script names like that, so far. A warning is issued if none of the legal values for a property are matched by your pattern. It's likely that a future release will raise a warning if your pattern ends up causing every possible code point to match.

Starting in 5.32, the Name, Name Aliases, and Named Sequences properties are allowed to be matched. They are considered to be a single combination property, just as has long been the case for C<\N{}>. Loose matching doesn't work in exactly the same way for these as it does for the values of other properties. The rules are given in L<https://www.unicode.org/reports/tr44/tr44-24.html#UAX44-LM2>. As a result, Perl doesn't try loose matching for you, like it does in other properties. All letters in names are uppercase, but you can add C<(?i)> to your subpattern to ignore case. If you're uncertain where a blank is, you can use C< ?> in your subpattern. No character name contains an underscore, so don't bother trying to match one. The use of hyphens is particularly problematic; refer to the above link. But note that, as of Unicode 13.0, the only script in modern usage which has weirdnesses with these is Tibetan; also the two Korean characters U+116C HANGUL JUNGSEONG OE and U+1180 HANGUL JUNGSEONG O-E.
Unicode makes no promises to not add hyphen-problematic names in the future.

Using wildcards on these is resource intensive, given the hundreds of thousands of legal names that must be checked against.

An example of using Name property wildcards is

    qr!\p{name=/(SMILING|GRINNING) FACE/}!

Another is

    qr/(?[ \p{name=\/CJK\/} - \p{ideographic} ])/

which is the 200-ish (as of Unicode 13.0) CJK characters that aren't ideographs.

There are certain properties that wildcard subpatterns don't currently work with. These are:

    Bidi Mirroring Glyph
    Bidi Paired Bracket
    Case Folding
    Decomposition Mapping
    Equivalent Unified Ideograph
    Lowercase Mapping
    NFKC Case Fold
    Titlecase Mapping
    Uppercase Mapping

Nor is the C<@I<unicode_property>@> form implemented.

Here's a complete example of matching IPv4 internet protocol addresses in any (single) script:

    no warnings 'experimental::regex_sets';
    no warnings 'experimental::uniprop_wildcards';

    # Can match a substring, so this intermediate regex needs to have
    # context or anchoring in its final use.  Using nt=de yields decimal
    # digits.  When specifying a subset of these, we must include \d to
    # prevent things like U+00B2 SUPERSCRIPT TWO from matching
    my $zero_through_255 =
     qr/ \b (*sr:                                  # All from same script
               (?[ \p{nv=0} & \d ])*               # Optional leading zeros
           (                                       # Then one of:
                                     \d{1,2}       #   0 - 99
               | (?[ \p{nv=1} & \d ])  \d{2}       #   100 - 199
               | (?[ \p{nv=2} & \d ])
                  (  (?[ \p{nv=:[0-4]:} & \d ]) \d #   200 - 249
                   | (?[ \p{nv=5}     & \d ])
                     (?[ \p{nv=:[0-5]:} & \d ])    #   250 - 255
                  )
           )
         )
       \b
     /x;

    my $ipv4 = qr/ \A (*sr:         $zero_through_255
                            (?: [.] $zero_through_255 ) {3}
                    )
                    \z
                /x;

=head2 User-Defined Character Properties

You can define your own binary character properties by defining subroutines whose names begin with C<"In"> or C<"Is">. (The experimental feature L<perlre/(?[ ])> provides an alternative which allows more complex definitions.) The subroutines can be defined in any package. They override any Unicode properties expressed as the same names.
The user-defined properties can be used in the regular expression C<\p{}> and C<\P{}> constructs; if you are using a user-defined property from a package other than the one you are in, you must specify its package in the C<\p{}> or C<\P{}> construct.

    # assuming property IsForeign defined in Lang::
    package main;  # property package name required
    if ($txt =~ /\p{Lang::IsForeign}+/) { ... }

    package Lang;  # property package name not required
    if ($txt =~ /\p{IsForeign}+/) { ... }

Note that the effect is compile-time and immutable once defined. However, the subroutines are passed a single parameter, which is 0 if case-sensitive matching is in effect and non-zero if caseless matching is in effect. The subroutine may return different values depending on the value of the flag, and one set of values will immutably be in effect for all case-sensitive matches, and the other set for all case-insensitive matches.

Note that if the regular expression is tainted, then Perl will die rather than calling the subroutine when the name of the subroutine is determined by the tainted data.

The subroutines must return a specially-formatted string, with one or more newline-separated lines. Each line must be one of the following:

=over 4

=item *

A single hexadecimal number denoting a code point to include.

=item *

Two hexadecimal numbers separated by horizontal whitespace (space or tab characters) denoting a range of code points to include. The second number must not be smaller than the first.

=item *

Something to include, prefixed by C<"+">: a built-in character property (prefixed by C<"utf8::">) or a fully qualified (including package name) user-defined character property, to represent all the characters in that property; two hexadecimal code points for a range; or a single hexadecimal code point.
=item *

Something to exclude, prefixed by C<"-">: an existing character property (prefixed by C<"utf8::">) or a fully qualified (including package name) user-defined character property, to represent all the characters in that property; two hexadecimal code points for a range; or a single hexadecimal code point.

=item *

Something to negate, prefixed by C<"!">: an existing character property (prefixed by C<"utf8::">) or a fully qualified (including package name) user-defined character property, to represent all the characters in that property; two hexadecimal code points for a range; or a single hexadecimal code point.

=item *

Something to intersect with, prefixed by C<"&">: an existing character property (prefixed by C<"utf8::">) or a fully qualified (including package name) user-defined character property, to represent all the characters in that property; two hexadecimal code points for a range; or a single hexadecimal code point. Only the characters that are also in this set are kept.

=back

For example, to define a property that covers both the Japanese syllabaries (hiragana and katakana), you can define

    sub InKana {
        return <<END;
    3040\t309F
    30A0\t30FF
    END
    }

Imagine that the here-doc end marker is at the beginning of the line. Now you can use C<\p{InKana}> and C<\P{InKana}>.

You could also have used the existing block property names:

    sub InKana {
        return <<'END';
    +utf8::InHiragana
    +utf8::InKatakana
    END
    }

Suppose you wanted to match only the allocated characters, not the raw block ranges: in other words, you want to remove the unassigned characters:

    sub InKana {
        return <<'END';
    +utf8::InHiragana
    +utf8::InKatakana
    -utf8::IsCn
    END
    }

The negation is useful for defining (surprise!) negated classes.

    sub InNotKana {
        return <<'END';
    !utf8::InHiragana
    -utf8::InKatakana
    +utf8::IsCn
    END
    }

This will match all non-Unicode code points, since every one of them is not in Kana.
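Returning to the earlier C<InKana> definitions, the following minimal sketch puts the raw-range form and the assigned-only form side by side so the difference can be observed. The name C<InAssignedKana> and the helper C<describe> are hypothetical, introduced only for this illustration; the strings are returned directly rather than through here-docs to avoid the end-marker indentation issue mentioned above.

```perl
use strict;
use warnings;

# Raw block ranges: Hiragana (U+3040..U+309F) and Katakana (U+30A0..U+30FF).
sub InKana {
    return "3040\t309F\n30A0\t30FF\n";
}

# The same two blocks with unassigned (Cn) code points removed.
# "InAssignedKana" is a made-up name for this sketch.
sub InAssignedKana {
    return "+utf8::InHiragana\n+utf8::InKatakana\n-utf8::IsCn\n";
}

# Report, for one code point, whether each property matches it.
sub describe {
    my ($cp) = @_;
    my $char = chr $cp;
    return sprintf "U+%04X InKana=%d InAssignedKana=%d",
        $cp,
        ($char =~ /\p{InKana}/         ? 1 : 0),
        ($char =~ /\p{InAssignedKana}/ ? 1 : 0);
}

# U+0041 is Latin, U+3040 is an unassigned slot in the Hiragana block,
# U+3042 and U+30AB are assigned kana.
print describe($_), "\n" for 0x0041, 0x3040, 0x3042, 0x30AB;
```

Neither of these two properties matches anything outside the two blocks, so, unlike the negated C<InNotKana> definition, they never match non-Unicode code points.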
You can use intersection to exclude these non-Unicode code points, if desired, as this modified example shows:

    sub InNotKana {
        return <<'END';
    !utf8::InHiragana
    -utf8::InKatakana
    +utf8::IsCn
    &utf8::Any
    END
    }

C<&utf8::Any> must be the last line in the definition.

Intersection is used generally for getting the common characters matched by two (or more) classes. It's important to remember not to use C<"&"> for the first set; that would be intersecting with nothing, resulting in an empty set. (Similarly, using C<"-"> for the first set does nothing.)

Unlike non-user-defined C<\p{}> property matches, no warning is ever generated if these properties are matched against a non-Unicode code point (see L</Beyond Unicode code points> below).

=head2 User-Defined Case Mappings (for serious hackers only)

B<This feature has been removed as of Perl 5.16.> The CPAN module C<L<Unicode::Casing>> provides better functionality without the drawbacks that this feature had. If you are using a Perl earlier than 5.16, this feature was most fully documented in the 5.14 version of this pod: L<http://perldoc.perl.org/5.14.0/perlunicode.html#User-Defined-Case-Mappings-%28for-serious-hackers-only%29>

=head2 Character Encodings for Input and Output

See L<Encode>.

=head2 Unicode Regular Expression Support Level

The following list of Unicode supported features for regular expressions describes all features currently directly supported by core Perl. The references to "Level I<N>" and the section numbers refer to L<UTS#18 "Unicode Regular Expressions"|https://www.unicode.org/reports/tr18>, version 18, October 2016.
=head3 Level 1 - Basic Unicode Support

 RL1.1   Hex Notation                     - Done          [1]
 RL1.2   Properties                       - Done          [2]
 RL1.2a  Compatibility Properties         - Done          [3]
 RL1.3   Subtraction and Intersection     - Experimental  [4]
 RL1.4   Simple Word Boundaries           - Done          [5]
 RL1.5   Simple Loose Matches             - Done          [6]
 RL1.6   Line Boundaries                  - Partial       [7]
 RL1.7   Supplementary Code Points        - Done          [8]

=over 4

=item [1]

C<\N{U+...}> and C<\x{...}>

=item [2]

C<\p{...}> C<\P{...}>. This requirement is for a minimal list of properties. Perl supports these. See RL2.7 for other properties.

=item [3]

Perl has C<\d> C<\D> C<\s> C<\S> C<\w> C<\W> C<\X> C<[:I<prop>:]> C<[:^I<prop>:]>, plus all the properties specified by L<https://www.unicode.org/reports/tr18/#Compatibility_Properties>. These are described above in L</Other Properties>.

=item [4]

The experimental feature C<"(?[...])"> starting in v5.18 accomplishes this.

See L<perlre/(?[ ])>. If you don't want to use an experimental feature, you can use one of the following:

=over 4

=item *

Regular expression lookahead

You can mimic class subtraction using lookahead. For example, what UTS#18 might write as

    [{Block=Greek}-[{UNASSIGNED}]]

in Perl can be written as:

    (?!\p{Unassigned})\p{Block=Greek}
    (?=\p{Assigned})\p{Block=Greek}

But in this particular example, you probably really want

    \p{Greek}

which will match assigned characters known to be part of the Greek script.

=item *

CPAN module C<L<Unicode::Regex::Set>>

It does implement the full UTS#18 grouping, intersection, union, and removal (subtraction) syntax.

=item *

L</"User-Defined Character Properties">

C<"+"> for union, C<"-"> for removal (set-difference), C<"&"> for intersection

=back

=item [5]

C<\b> C<\B> meet most, but not all, the details of this requirement, but C<\b{wb}> and C<\B{wb}> do, as well as the stricter RL2.3.

=item [6]

Note that Perl does Full case-folding in matching, not Simple: For example C<U+1F88> is equivalent to C<U+1F00 U+03B9>, instead of just C<U+1F80>.
This difference matters mainly for certain Greek capital letters with certain modifiers: the Full case-folding decomposes the letter, while the Simple case-folding would map it to a single character.

=item [7]

The reason this is considered to be only partially implemented is that Perl has L<C<qrE<sol>\b{lb}E<sol>>|perlrebackslash/\b{lb}> and C<L<Unicode::LineBreak>> that are conformant with L<UAX#14 "Unicode Line Breaking Algorithm"|https://www.unicode.org/reports/tr14>. The regular expression construct provides default behavior, while the heavier-weight module provides customizable line breaking.

But Perl treats C<\n> as the start- and end-line delimiter, whereas Unicode specifies more characters that should be so-interpreted.

These are:

 VT   U+000B  (\v in C)
 FF   U+000C  (\f)
 CR   U+000D  (\r)
 NEL  U+0085
 LS   U+2028
 PS   U+2029

C<^> and C<$> in regular expression patterns are supposed to match all these, but don't. These characters also don't, but should, affect C<< <> >>, C<$.>, and script line numbers.

Also, lines should not be split within C<CRLF> (i.e. there is no empty line between C<\r> and C<\n>). For C<CRLF>, try the C<:crlf> layer (see L<PerlIO>).

=item [8]

UTF-8/UTF-EBCDIC used in Perl allows not only C<U+10000> to C<U+10FFFF> but also beyond C<U+10FFFF>

=back

=head3 Level 2 - Extended Unicode Support

 RL2.1   Canonical Equivalents           - Retracted     [9]
                                           by Unicode
 RL2.2   Extended Grapheme Clusters and  - Partial       [10]
         Character Classes with Strings
 RL2.3   Default Word Boundaries         - Done          [11]
 RL2.4   Default Case Conversion         - Done
 RL2.5   Name Properties                 - Done
 RL2.6   Wildcards in Property Values    - Partial       [12]
 RL2.7   Full Properties                 - Partial       [13]
 RL2.8   Optional Properties             - Partial       [14]

=over 4

=item [9]

Unicode has rewritten this portion of UTS#18 to say that getting canonical equivalence (see UAX#15 L<"Unicode Normalization Forms"|https://www.unicode.org/reports/tr15>) is basically to be done at the programmer level.
Use NFD to write both your regular expressions and text to match them against (you can use L<Unicode::Normalize>).

=item [10]

Perl has C<\X> and C<\b{gcb}>. Unicode has retracted their "Grapheme Cluster Mode", and recently added string properties, which Perl does not yet support.

=item [11]

See L<UAX#29 "Unicode Text Segmentation"|https://www.unicode.org/reports/tr29>.

=item [12]

See L</Wildcards in Property Values> above.

=item [13]

Perl supports all the properties in the Unicode Character Database (UCD). It does not yet support the listed properties that come from other Unicode sources.

=item [14]

The only optional property that Perl supports is Named Sequence. None of these properties are in the UCD.

=back

=head3 Level 3 - Tailored Support

This has been retracted by Unicode.

=head2 Unicode Encodings

Unicode characters are assigned to I<code points>, which are abstract numbers. To use these numbers, various encodings are needed.

=over 4

=item *

UTF-8

UTF-8 is a variable-length (1 to 4 bytes), byte-order independent encoding. In most of Perl's documentation, including elsewhere in this document, the term "UTF-8" means also "UTF-EBCDIC". But in this section, "UTF-8" refers only to the encoding used on ASCII platforms. It is a superset of 7-bit US-ASCII, so anything encoded in ASCII has the identical representation when encoded in UTF-8.

The following table is from Unicode 3.2.

 Code Points            1st Byte  2nd Byte  3rd Byte  4th Byte

   U+0000..U+007F       00..7F
   U+0080..U+07FF     * C2..DF    80..BF
   U+0800..U+0FFF       E0      * A0..BF    80..BF
   U+1000..U+CFFF       E1..EC    80..BF    80..BF
   U+D000..U+D7FF       ED        80..9F    80..BF
   U+D800..U+DFFF       +++++ utf16 surrogates, not legal utf8 +++++
   U+E000..U+FFFF       EE..EF    80..BF    80..BF
  U+10000..U+3FFFF      F0      * 90..BF    80..BF    80..BF
  U+40000..U+FFFFF      F1..F3    80..BF    80..BF    80..BF
 U+100000..U+10FFFF     F4        80..8F    80..BF    80..BF

Note the gaps marked by "*" before several of the byte entries above.
These are caused by legal UTF-8 avoiding non-shortest encodings: it is technically possible to UTF-8-encode a single code point in different ways, but that is explicitly forbidden, and the shortest possible encoding should always be used (and that is what Perl does).

Another way to look at it is via bits:

                Code Points  1st Byte  2nd Byte  3rd Byte  4th Byte

                   0aaaaaaa  0aaaaaaa
           00000bbbbbaaaaaa  110bbbbb  10aaaaaa
           ccccbbbbbbaaaaaa  1110cccc  10bbbbbb  10aaaaaa
 00000dddccccccbbbbbbaaaaaa  11110ddd  10cccccc  10bbbbbb  10aaaaaa

As you can see, the continuation bytes all begin with C<"10">, and the leading bits of the start byte tell how many bytes there are in the encoded character.

The original UTF-8 specification allowed up to 6 bytes, to allow encoding of numbers up to C<0x7FFF_FFFF>. Perl continues to allow those, and has extended that up to 13 bytes to encode code points up to what can fit in a 64-bit word. However, Perl will warn if you output any of these as being non-portable; and under strict UTF-8 input protocols, they are forbidden. In addition, it is now illegal to use a code point larger than what a signed integer variable on your system can hold. On 32-bit ASCII systems, this means C<0x7FFF_FFFF> is the legal maximum (much higher on 64-bit systems).

=item *

UTF-EBCDIC

Like UTF-8, but EBCDIC-safe, in the way that UTF-8 is ASCII-safe. This means that all the basic characters (which include all those that have ASCII equivalents, like C<"A">, C<"0">, C<"%">, I<etc.>) are the same in both EBCDIC and UTF-EBCDIC.

UTF-EBCDIC is used on EBCDIC platforms. It generally requires more bytes to represent a given code point than UTF-8 does; the largest Unicode code points take 5 bytes to represent (instead of 4 in UTF-8), and, extended for 64-bit words, it uses 14 bytes instead of 13 bytes in UTF-8.
=item *

UTF-16, UTF-16BE, UTF-16LE, Surrogates, and C<BOM>'s (Byte Order Marks)

The following items are mostly for reference and general Unicode knowledge; Perl doesn't use these constructs internally.

Like UTF-8, UTF-16 is a variable-width encoding, but where UTF-8 uses 8-bit code units, UTF-16 uses 16-bit code units. All code points occupy either 2 or 4 bytes in UTF-16: code points C<U+0000..U+FFFF> are stored in a single 16-bit unit, and code points C<U+10000..U+10FFFF> in two 16-bit units. The latter case is using I<surrogates>, the first 16-bit unit being the I<high surrogate>, and the second being the I<low surrogate>.

Surrogates are code points set aside to encode the C<U+10000..U+10FFFF> range of Unicode code points in pairs of 16-bit units. The I<high surrogates> are the range C<U+D800..U+DBFF> and the I<low surrogates> are the range C<U+DC00..U+DFFF>. The surrogate encoding is

    # int() is needed because "/" is floating-point division in Perl
    $hi = int(($uni - 0x10000) / 0x400) + 0xD800;
    $lo = ($uni - 0x10000) % 0x400 + 0xDC00;

and the decoding is

    $uni = 0x10000 + ($hi - 0xD800) * 0x400 + ($lo - 0xDC00);

Because of the 16-bitness, UTF-16 is byte-order dependent. UTF-16 itself can be used for in-memory computations, but if storage or transfer is required either UTF-16BE (big-endian) or UTF-16LE (little-endian) encodings must be chosen.

This introduces another problem: what if you just know that your data is UTF-16, but you don't know which endianness? Byte Order Marks, or C<BOM>'s, are a solution to this. A special character has been reserved in Unicode to function as a byte order marker: the character with the code point C<U+FEFF> is the C<BOM>.

The trick is that if you read a C<BOM>, you will know the byte order, since if it was written on a big-endian platform, you will read the bytes C<0xFE 0xFF>, but if it was written on a little-endian platform, you will read the bytes C<0xFF 0xFE>. (And if the originating platform was writing in ASCII platform UTF-8, you will read the bytes C<0xEF 0xBB 0xBF>.)
The way this trick works is that the character with the code point C<U+FFFE> is not supposed to be in input streams, so the sequence of bytes C<0xFF 0xFE> is unambiguously "C<BOM>, represented in little-endian format" and cannot be "C<U+FFFE>, represented in big-endian format".

Surrogates have no meaning in Unicode outside their use in pairs to represent other code points. However, Perl allows them to be represented individually internally, for example by saying C<chr(0xD801)>, so that all code points, not just those valid for open interchange, are representable. Unicode does define semantics for them, such as their C<L</General_Category>> is C<"Cs">. But because their use is somewhat dangerous, Perl will warn (using the warning category C<"surrogate">, which is a sub-category of C<"utf8">) if an attempt is made to do things like take the lower case of one, or match case-insensitively, or to output them. (But don't try this on Perls before 5.14.)

=item *

UTF-32, UTF-32BE, UTF-32LE

The UTF-32 family is pretty much like the UTF-16 family, except that the units are 32-bit, and therefore the surrogate scheme is not needed. UTF-32 is a fixed-width encoding. The C<BOM> signatures are C<0x00 0x00 0xFE 0xFF> for BE and C<0xFF 0xFE 0x00 0x00> for LE.

=item *

UCS-2, UCS-4

Legacy, fixed-width encodings defined by the ISO 10646 standard. UCS-2 is a 16-bit encoding. Unlike UTF-16, UCS-2 is not extensible beyond C<U+FFFF>, because it does not use surrogates. UCS-4 is a 32-bit encoding, functionally identical to UTF-32 (the difference being that UCS-4 forbids neither surrogates nor code points larger than C<0x10_FFFF>).

=item *

UTF-7

A seven-bit safe (non-eight-bit) encoding, which is useful if the transport or storage is not eight-bit safe. Defined by RFC 2152.

=back

=head2 Noncharacter code points

66 code points are set aside in Unicode as "noncharacter code points".
These all have the C<Unassigned> (C<Cn>) C<L</General_Category>>, and no character will ever be assigned to any of them. They are the 32 code points between C<U+FDD0> and C<U+FDEF> inclusive, and the 34 code points:

 U+FFFE   U+FFFF
 U+1FFFE  U+1FFFF
 U+2FFFE  U+2FFFF
 ...
 U+EFFFE  U+EFFFF
 U+FFFFE  U+FFFFF
 U+10FFFE U+10FFFF

Until Unicode 7.0, the noncharacters were "B<forbidden> for use in open interchange of Unicode text data", so that code that processed those streams could use these code points as sentinels that could be mixed in with character data, and would always be distinguishable from that data. (Emphasis above and in the next paragraph is added in this document.)

Unicode 7.0 changed the wording so that they are "B<not recommended> for use in open interchange of Unicode text data". The 7.0 Standard goes on to say:

=over 4

"If a noncharacter is received in open interchange, an application is not required to interpret it in any way. It is good practice, however, to recognize it as a noncharacter and to take appropriate action, such as replacing it with C<U+FFFD> replacement character, to indicate the problem in the text. It is not recommended to simply delete noncharacter code points from such text, because of the potential security issues caused by deleting uninterpreted characters. (See conformance clause C7 in Section 3.2, Conformance Requirements, and L<Unicode Technical Report #36, "Unicode Security Considerations"|https://www.unicode.org/reports/tr36/#Substituting_for_Ill_Formed_Subsequences>)."

=back

This change was made because it was found that various commercial tools like editors, or for things like source code control, had been written so that they would not handle program files that used these code points, effectively precluding their use almost entirely! And that was never the intent. They've always been meant to be usable within an application, or cooperating set of applications, at will.
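The enumeration of noncharacters given at the start of this section (the 32 code points in C<U+FDD0..U+FDEF>, plus the last two code points of each plane) can be checked arithmetically. The following is a minimal sketch; the helper name C<is_noncharacter> is made up for this illustration and is not a Perl built-in.

```perl
use strict;
use warnings;

# True if $cp is one of the noncharacter code points: U+FDD0..U+FDEF,
# or a code point whose low 16 bits are FFFE or FFFF in any plane.
sub is_noncharacter {
    my ($cp) = @_;
    return 0 if $cp < 0 || $cp > 0x10FFFF;    # outside the Unicode range
    return 1 if $cp >= 0xFDD0 && $cp <= 0xFDEF;
    return ($cp & 0xFFFE) == 0xFFFE ? 1 : 0;
}

# Count them across the whole Unicode range.
my $count = 0;
for my $cp (0 .. 0x10FFFF) {
    $count++ if is_noncharacter($cp);
}
print "noncharacters found: $count\n";
```

The count comes to 32 + 2 * 17 = 66, matching the figure stated above. In patterns, the Unicode property C<\p{Noncharacter_Code_Point}> expresses the same set.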
If you're writing code, such as an editor, that is supposed to be able to handle any Unicode text data, then you shouldn't be using these code points yourself, and should instead allow them in the input. If you need sentinels, they should instead be something that isn't legal Unicode. For UTF-8 data, you can use the bytes 0xC0 and 0xC1 as sentinels, as they never appear in well-formed UTF-8. (There are equivalents for UTF-EBCDIC.) You can also store your Unicode code points in integer variables and use negative values as sentinels.

If you're not writing such a tool, then whether you accept noncharacters as input is up to you (though the Standard recommends that you not). If you do strict input stream checking with Perl, these code points continue to be forbidden. This is to maintain backward compatibility (otherwise potential security holes could open up, as an unsuspecting application that was written assuming the noncharacters would be filtered out before getting to it, could now, without warning, start getting them). To do strict checking, you can use the layer C<:encoding('UTF-8')>.

Perl continues to warn (using the warning category C<"nonchar">, which is a sub-category of C<"utf8">) if an attempt is made to output noncharacters.

=head2 Beyond Unicode code points

The maximum Unicode code point is C<U+10FFFF>, and Unicode only defines operations on code points up through that. But Perl works on code points up to the maximum permissible signed number available on the platform. However, Perl will not accept these from input streams unless lax rules are being used, and will warn (using the warning category C<"non_unicode">, which is a sub-category of C<"utf8">) if any are output.

Since Unicode rules are not defined on these code points, if a Unicode-defined operation is done on them, Perl uses what we believe are sensible rules, while generally warning, using the C<"non_unicode"> category.
For example, C<uc("\x{11_0000}")> will generate such a warning, returning the input parameter as its result, since Perl defines the uppercase of every non-Unicode code point to be the code point itself. (All the case changing operations, not just uppercasing, work this way.) The situation with matching Unicode properties in regular expressions, the C<\p{}> and C<\P{}> constructs, against these code points is not as clear cut, and how these are handled has changed as we've gained experience. One possibility is to treat any match against these code points as undefined. But since Perl doesn't have the concept of a match being undefined, it converts this to failing or C<FALSE>. This is almost, but not quite, what Perl did from v5.14 (when use of these code points became generally reliable) through v5.18. The difference is that Perl treated all C<\p{}> matches as failing, but all C<\P{}> matches as succeeding. One problem with this is that it leads to unexpected, and confusing results in some cases: chr(0x110000) =~ \p{ASCII_Hex_Digit=True} # Failed on <= v5.18 chr(0x110000) =~ \p{ASCII_Hex_Digit=False} # Failed! on <= v5.18 That is, it treated both matches as undefined, and converted that to false (raising a warning on each). The first case is the expected result, but the second is likely counterintuitive: "How could both be false when they are complements?" Another problem was that the implementation optimized many Unicode property matches down to already existing simpler, faster operations, which don't raise the warning. We chose to not forgo those optimizations, which help the vast majority of matches, just to generate a warning for the unlikely event that an above-Unicode code point is being matched against. As a result of these problems, starting in v5.20, what Perl does is to treat non-Unicode code points as just typical unassigned Unicode characters, and matches accordingly. (Note: Unicode has atypical unassigned code points. 
For example, it has noncharacter code points, and ones that, when they do get assigned, are destined to be written Right-to-left, as Arabic and Hebrew are. Perl assumes that no non-Unicode code point has any atypical properties.)

Perl, in most cases, will raise a warning when matching an above-Unicode code point against a Unicode property when the result is C<TRUE> for C<\p{}>, and C<FALSE> for C<\P{}>. For example:

    chr(0x110000) =~ \p{ASCII_Hex_Digit=True}      # Fails, no warning
    chr(0x110000) =~ \p{ASCII_Hex_Digit=False}     # Succeeds, with warning

In both these examples, the character being matched is non-Unicode, so Unicode doesn't define how it should match. It clearly isn't an ASCII hex digit, so the first example clearly should fail, and so it does, with no warning. But it is arguable that the second example should have an undefined, hence C<FALSE>, result. So a warning is raised for it.

Thus the warning is raised for many fewer cases than in earlier Perls, and only when the result is arguable. It turns out that none of the optimizations made by Perl (or ever likely to be made) cause the warning to be skipped, so it solves both problems of Perl's earlier approach. The most commonly used property that is affected by this change is C<\p{Unassigned}> which is a short form for C<\p{General_Category=Unassigned}>. Starting in v5.20, all non-Unicode code points are considered C<Unassigned>. In earlier releases the matches failed because the result was considered undefined.

The only place where the warning is not raised when it arguably should have been is if optimizations cause the whole pattern match to not even be attempted. For example, Perl may figure out that for a string to match a certain regular expression pattern, the string has to contain the substring C<"foobar">.
Before attempting the match, Perl may look for that substring, and if not found, immediately fail the match without actually trying it; so no warning gets generated even if the string contains an above-Unicode code point. This behavior is more "Do what I mean" than in earlier Perls for most applications. But it catches fewer issues for code that needs to be strictly Unicode compliant. Therefore there is an additional mode of operation available to accommodate such code. This mode is enabled if a regular expression pattern is compiled within the lexical scope where the C<"non_unicode"> warning class has been made fatal, say by: use warnings FATAL => "non_unicode" (see L<warnings>). In this mode of operation, Perl will raise the warning for all matches against a non-Unicode code point (not just the arguable ones), and it skips the optimizations that might cause the warning to not be output. (It currently still won't warn if the match isn't even attempted, like in the C<"foobar"> example above.) In summary, Perl now normally treats non-Unicode code points as typical Unicode unassigned code points for regular expression matches, raising a warning only when it is arguable what the result should be. However, if this warning has been made fatal, it isn't skipped. There is one exception to all this. C<\p{All}> looks like a Unicode property, but it is a Perl extension that is defined to be true for all possible code points, Unicode or not, so no warning is ever generated when matching this against a non-Unicode code point. (Prior to v5.20, it was an exact synonym for C<\p{Any}>, matching code points C<0> through C<0x10FFFF>.) =head2 Security Implications of Unicode First, read L<Unicode Security Considerations|https://www.unicode.org/reports/tr36>. Also, note the following: =over 4 =item * Malformed UTF-8 UTF-8 is very structured, so many combinations of bytes are invalid. 
In the past, Perl tried to soldier on and make some sense of invalid combinations, but this can lead to security holes, so now, if the Perl core needs to process an invalid combination, it will either raise a fatal error, or will replace those bytes by the sequence that forms the Unicode REPLACEMENT CHARACTER, for which purpose Unicode created it. Every code point can be represented by more than one possible syntactically valid UTF-8 sequence. Early on, both Unicode and Perl considered any of these to be valid, but now, all sequences longer than the shortest possible one are considered to be malformed. Unicode considers many code points to be illegal, or to be avoided. Perl generally accepts them, once they have passed through any input filters that may try to exclude them. These have been discussed above (see "Surrogates" under UTF-16 in L</Unicode Encodings>, L</Noncharacter code points>, and L</Beyond Unicode code points>). =item * Regular expression pattern matching may surprise you if you're not accustomed to Unicode. Starting in Perl 5.14, several pattern modifiers are available to control this, called the character set modifiers. Details are given in L<perlre/Character set modifiers>. =back As discussed elsewhere, Perl has one foot (two hooves?) planted in each of two worlds: the old world of ASCII and single-byte locales, and the new world of Unicode, upgrading when necessary. If your legacy code does not explicitly use Unicode, no automatic switch-over to Unicode should happen. =head2 Unicode in Perl on EBCDIC Unicode is supported on EBCDIC platforms. See L<perlebcdic>. Unless ASCII vs. EBCDIC issues are specifically being discussed, references to UTF-8 encoding in this document and elsewhere should be read as meaning UTF-EBCDIC on EBCDIC platforms. See L<perlebcdic/Unicode and UTF>. 
Because UTF-EBCDIC is so similar to UTF-8, the differences are mostly hidden from you; S<C<use utf8>> (and NOT something like S<C<use utfebcdic>>) declares the script is in the platform's "native" 8-bit encoding of Unicode. (Similarly for the C<":utf8"> layer.) =head2 Locales See L<perllocale/Unicode and UTF-8> =head2 When Unicode Does Not Happen There are still many places where Unicode (in some encoding or another) could be given as arguments or received as results, or both in Perl, but it is not, in spite of Perl having extensive ways to input and output in Unicode, and a few other "entry points" like the C<@ARGV> array (which can sometimes be interpreted as UTF-8). The following are such interfaces. Also, see L</The "Unicode Bug">. For all of these interfaces Perl currently (as of v5.16.0) simply assumes byte strings both as arguments and results, or UTF-8 strings if the (deprecated) C<encoding> pragma has been used. One reason that Perl does not attempt to resolve the role of Unicode in these situations is that the answers are highly dependent on the operating system and the file system(s). For example, whether filenames can be in Unicode and in exactly what kind of encoding, is not exactly a portable concept. Similarly for C<qx> and C<system>: how well will the "command-line interface" (and which of them?) handle Unicode? =over 4 =item * C<chdir>, C<chmod>, C<chown>, C<chroot>, C<exec>, C<link>, C<lstat>, C<mkdir>, C<rename>, C<rmdir>, C<stat>, C<symlink>, C<truncate>, C<unlink>, C<utime>, C<-X> =item * C<%ENV> =item * C<glob> (aka the C<E<lt>*E<gt>>) =item * C<open>, C<opendir>, C<sysopen> =item * C<qx> (aka the backtick operator), C<system> =item * C<readdir>, C<readlink> =back =head2 The "Unicode Bug" The term, "Unicode bug" has been applied to an inconsistency with the code points in the C<Latin-1 Supplement> block, that is, between 128 and 255. 
Without a locale specified, unlike all other characters or code points, these characters can have very different semantics depending on the rules in effect. (Characters whose code points are above 255 force Unicode rules; whereas the rules for ASCII characters are the same under both ASCII and Unicode rules.)

Under Unicode rules, these upper-Latin1 characters are interpreted as Unicode code points, which means they have the same semantics as Latin-1 (ISO-8859-1) and C1 controls.

As explained in L</ASCII Rules versus Unicode Rules>, under ASCII rules, they are considered to be unassigned characters.

This can lead to unexpected results. For example, a string's semantics can suddenly change if a code point above 255 is appended to it, which changes the rules from ASCII to Unicode. As an example, consider the following program and its output:

    $ perl -le'
        no feature "unicode_strings";
        $s1 = "\xC2";
        $s2 = "\x{2660}";
        for ($s1, $s2, $s1.$s2) {
            print /\w/ || 0;
        }
    '
    0
    0
    1

If there's no C<\w> in C<$s1> or in C<$s2>, why does their concatenation have one?

This anomaly stems from Perl's attempt to not disturb older programs that didn't use Unicode, along with Perl's desire to add Unicode support seamlessly. But the result turned out to not be seamless. (By the way, you can choose to be warned when things like this happen. See C<L<encoding::warnings>>.)

L<S<C<use feature 'unicode_strings'>>|feature/The 'unicode_strings' feature> was added, starting in Perl v5.12, to address this problem. It affects these things:

=over 4

=item *

Changing the case of a scalar, that is, using C<uc()>, C<ucfirst()>, C<lc()>, and C<lcfirst()>, or C<\L>, C<\U>, C<\u> and C<\l> in double-quotish contexts, such as regular expression substitutions.

Under C<unicode_strings> starting in Perl 5.12.0, Unicode rules are generally used. See L<perlfunc/lc> for details on how this works in combination with various other pragmas.

=item *

Using caseless (C</i>) regular expression matching.
Starting in Perl 5.14.0, regular expressions compiled within the scope of C<unicode_strings> use Unicode rules even when executed or compiled into larger regular expressions outside the scope. =item * Matching any of several properties in regular expressions. These properties are C<\b> (without braces), C<\B> (without braces), C<\s>, C<\S>, C<\w>, C<\W>, and all the Posix character classes I<except> C<[[:ascii:]]>. Starting in Perl 5.14.0, regular expressions compiled within the scope of C<unicode_strings> use Unicode rules even when executed or compiled into larger regular expressions outside the scope. =item * In C<quotemeta> or its inline equivalent C<\Q>. Starting in Perl 5.16.0, consistent quoting rules are used within the scope of C<unicode_strings>, as described in L<perlfunc/quotemeta>. Prior to that, or outside its scope, no code points above 127 are quoted in UTF-8 encoded strings, but in byte encoded strings, code points between 128-255 are always quoted. =item * In the C<..> or L<range|perlop/Range Operators> operator. Starting in Perl 5.26.0, the range operator on strings treats their lengths consistently within the scope of C<unicode_strings>. Prior to that, or outside its scope, it could produce strings whose length in characters exceeded that of the right-hand side, where the right-hand side took up more bytes than the correct range endpoint. =item * In L<< C<split>'s special-case whitespace splitting|perlfunc/split >>. Starting in Perl 5.28.0, the C<split> function with a pattern specified as a string containing a single space handles whitespace characters consistently within the scope of C<unicode_strings>. Prior to that, or outside its scope, characters that are whitespace according to Unicode rules but not according to ASCII rules were treated as field contents rather than field separators when they appear in byte-encoded strings. =back You can see from the above that the effect of C<unicode_strings> increased over several Perl releases. 
(And Perl's support for Unicode continues to improve; it's best to use the latest available release in order to get the most complete and accurate results possible.)

Note that C<unicode_strings> is automatically chosen if you S<C<use 5.012>> or higher.

For Perls earlier than those described above, or when a string is passed to a function outside the scope of C<unicode_strings>, see the next section.

=head2 Forcing Unicode in Perl (Or Unforcing Unicode in Perl)

Sometimes (see L</"When Unicode Does Not Happen"> or L</The "Unicode Bug">) there are situations where you simply need to force a byte string into UTF-8, or vice versa. The standard module L<Encode> can be used for this, or the low-level calls L<C<utf8::upgrade($bytestring)>|utf8/Utility functions> and L<C<utf8::downgrade($utf8string[, FAIL_OK])>|utf8/Utility functions>.

Note that C<utf8::downgrade()> can fail if the string contains characters that don't fit into a byte.

Calling either function on a string that already is in the desired state is a no-op.

L</ASCII Rules versus Unicode Rules> gives all the ways that a string is made to use Unicode rules.

=head2 Using Unicode in XS

See L<perlguts/"Unicode Support"> for an introduction to Unicode at the XS level, and L<perlapi/Unicode Support> for the API details.

=head2 Hacking Perl to work on earlier Unicode versions (for very serious hackers only)

Perl by default comes with the latest supported Unicode version built-in, but the goal is to allow you to change to use any earlier one. In Perls v5.20 and v5.22, however, the earliest usable version is Unicode 5.1. Perl v5.18 and v5.24 are able to handle all earlier versions.

Download the files in the desired version of Unicode from the Unicode web site (L<https://www.unicode.org>). These should replace the existing files in F<lib/unicore> in the Perl source tree. Follow the instructions in F<README.perl> in that directory to change some of their names, and then build perl (see L<INSTALL>).
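As a brief illustration of the C<utf8::upgrade()> and C<utf8::downgrade()> calls described under L</Forcing Unicode in Perl (Or Unforcing Unicode in Perl)>, the sketch below shows that upgrading changes only the internal storage of a string, not its characters, and that a downgrade with C<FAIL_OK> set returns false rather than dying. The variable names are illustrative assumptions; C<utf8::is_utf8()> is used here only to peek at the internal flag.

```perl
use strict;
use warnings;

# One character, U+00E9, initially stored internally as a single byte.
my $s = "\xE9";
my $before = utf8::is_utf8($s) ? 1 : 0;    # 0: byte-encoded internally

utf8::upgrade($s);                         # switch internal storage to UTF-8
my $after  = utf8::is_utf8($s) ? 1 : 0;    # 1: UTF-8 encoded internally
my $length = length $s;                    # still exactly 1 character

utf8::downgrade($s);                       # back to byte storage
my $restored = utf8::is_utf8($s) ? 1 : 0;  # 0 again

# U+263A does not fit in a byte. With FAIL_OK true, downgrade returns
# false instead of dying, and leaves the string unchanged.
my $wide = "\x{263A}";
my $ok   = utf8::downgrade($wide, 1) ? 1 : 0;

print "before=$before after=$after length=$length ",
      "restored=$restored downgrade_ok=$ok\n";
```

The string compares equal to C<"\xE9"> at every step; only the rules that later operations apply to it may differ, which is the point of L</The "Unicode Bug">.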
=head2 Porting code from perl-5.6.X

Perls starting in 5.8 have a different Unicode model from 5.6. In 5.6 the programmer was required to use the C<utf8> pragma to declare that a given scope expected to deal with Unicode data and had to make sure that only Unicode data were reaching that scope. If you have code that is working with 5.6, you will need some of the following adjustments to your code. The examples are written such that the code will continue to work under 5.6, so you should be safe to try them out.

=over 3

=item *

A filehandle that should read or write UTF-8

    if ($] > 5.008) {
        binmode $fh, ":encoding(UTF-8)";
    }

=item *

A scalar that is going to be passed to some extension

Be it C<Compress::Zlib>, C<Apache::Request> or any extension that has no mention of Unicode in the manpage, you need to make sure that the UTF8 flag is stripped off. Note that at the time of this writing (January 2012) the mentioned modules are not UTF-8-aware. Please check the documentation to verify if this is still true.

    if ($] > 5.008) {
        require Encode;
        $val = Encode::encode("UTF-8", $val); # make octets
    }

=item *

A scalar we got back from an extension

If you believe the scalar comes back as UTF-8, you will most likely want the UTF8 flag restored:

    if ($] > 5.008) {
        require Encode;
        $val = Encode::decode("UTF-8", $val);
    }

=item *

Same thing, if you are really sure it is UTF-8

    if ($] > 5.008) {
        require Encode;
        Encode::_utf8_on($val);
    }

=item *

A wrapper for L<DBI> C<fetchrow_array> and C<fetchrow_hashref>

When the database contains only UTF-8, a wrapper function or method is a convenient way to replace all your C<fetchrow_array> and C<fetchrow_hashref> calls. A wrapper function will also make it easier to adapt to future enhancements in your database driver. Note that at the time of this writing (January 2012), the DBI has no standardized way to deal with UTF-8 data. Please check the L<DBI documentation|DBI> to verify if that is still true.

    sub fetchrow {
        # $what is one of fetchrow_{array,hashref}
        my($self, $sth, $what) = @_;
        if ($] < 5.008) {
            return $sth->$what;
        } else {
            require Encode;
            if (wantarray) {
                my @arr = $sth->$what;
                for (@arr) {
                    defined && /[^\000-\177]/ && Encode::_utf8_on($_);
                }
                return @arr;
            } else {
                my $ret = $sth->$what;
                if (ref $ret) {
                    for my $k (keys %$ret) {
                        defined
                        && /[^\000-\177]/
                        && Encode::_utf8_on($_) for $ret->{$k};
                    }
                    return $ret;
                } else {
                    defined && /[^\000-\177]/ && Encode::_utf8_on($_) for $ret;
                    return $ret;
                }
            }
        }
    }

=item *

A large scalar that you know can only contain ASCII

Scalars that contain only ASCII and are marked as UTF-8 are sometimes a drag to your program. If you recognize such a situation, just remove the UTF8 flag:

    utf8::downgrade($val) if $] > 5.008;

=back

=head1 BUGS

See also L</The "Unicode Bug"> above.

=head2 Interaction with Extensions

When Perl exchanges data with an extension, the extension should be able to understand the UTF8 flag and act accordingly. If the extension doesn't recognize that flag, it's likely that the extension will return incorrectly-flagged data.

So if you're working with Unicode data, consult the documentation of every module you're using to see whether there are any issues with Unicode data exchange. If the documentation does not talk about Unicode at all, suspect the worst and probably look at the source to learn how the module is implemented. Modules written completely in Perl shouldn't cause problems. Modules that directly or indirectly access code written in other programming languages are at risk.

For affected functions, the simple strategy to avoid data corruption is to always make the encoding of the exchanged data explicit. Choose an encoding that you know the extension can handle. Convert arguments passed to the extensions to that encoding and convert results back from that encoding. Write wrapper functions that do the conversions for you, so you can later change the functions when the extension catches up.

To provide an example, let's say the popular C<Foo::Bar::escape_html> function doesn't deal with Unicode data yet. The wrapper function would convert the argument to raw UTF-8 and convert the result back to Perl's internal representation like so:

    sub my_escape_html ($) {
        my($what) = shift;
        return unless defined $what;
        Encode::decode("UTF-8", Foo::Bar::escape_html(
                                     Encode::encode("UTF-8", $what)));
    }

Sometimes, when the extension does not convert data but just stores and retrieves it, you will be able to use the otherwise dangerous L<C<Encode::_utf8_on()>|Encode/_utf8_on> function. Let's say the popular C<Foo::Bar> extension, written in C, provides a C<param> method that lets you store and retrieve data according to these prototypes:

    $self->param($name, $value);            # set a scalar
    $value = $self->param($name);           # retrieve a scalar

If it does not yet provide support for any encoding, one could write a derived class with such a C<param> method:

    sub param {
        my($self,$name,$value) = @_;
        utf8::upgrade($name);     # make sure it is UTF-8 encoded
        if (defined $value) {
            utf8::upgrade($value);  # make sure it is UTF-8 encoded
            return $self->SUPER::param($name,$value);
        } else {
            my $ret = $self->SUPER::param($name);
            Encode::_utf8_on($ret); # we know, it is UTF-8 encoded
            return $ret;
        }
    }

Some extensions provide filters on data entry/exit points, such as C<DB_File::filter_store_key> and family. Look out for such filters in the documentation of your extensions; they can make the transition to Unicode data much easier.

=head2 Speed

Some functions are slower when working on UTF-8 encoded strings than on byte encoded strings. All functions that need to hop over characters such as C<length()>, C<substr()> or C<index()>, or matching regular expressions can work B<much> faster when the underlying data are byte-encoded.

In Perl 5.8.0 the slowness was often quite spectacular; in Perl 5.8.1 a caching scheme was introduced which improved the situation.
In general, operations with UTF-8 encoded strings are still slower. As an example, the Unicode properties (character classes) like C<\p{Nd}> are known to be quite a bit slower (5-20 times) than their simpler counterparts like C<[0-9]> (then again, there are hundreds of Unicode characters matching C<Nd> compared with the 10 ASCII characters matching C<[0-9]>).

=head1 SEE ALSO

L<perlunitut>, L<perluniintro>, L<perluniprops>, L<Encode>, L<open>, L<utf8>, L<bytes>, L<perlretut>, L<perlvar/"${^UNICODE}">, L<https://www.unicode.org/reports/tr44>.

=cut

=encoding utf8

If you are reading this file in an ordinary text editor, please ignore the odd markup characters. This file is written in POD (Plain Old Documentation), a format designed to be directly readable by people. For more information about this format, see the perlpod documentation.

=head1 NAME

perltw - Traditional Chinese Perl guide

=head1 DESCRIPTION

Welcome to the world of Perl!

Starting with version 5.8.0, Perl has extensive Unicode support, and with it support for many encodings beyond the Latin family; CJK (Chinese, Japanese, Korean) is among them. Unicode is an international standard that aims to cover every character in the world: the Western world, the Eastern world, and everything in between (Greek, Syriac, Arabic, Hebrew, Indic, Native American scripts, and so on). It also accommodates many operating systems and platforms (such as the PC and the Macintosh).

Perl itself operates on Unicode. This means that string data inside Perl can be represented as Unicode, and that Perl's functions and operators (such as regular expression matching) also operate on Unicode. For input and output of data stored in pre-Unicode encodings, Perl provides the Encode module, which lets you read and write legacy-encoded data easily.

The Encode extension supports the following Traditional Chinese encodings ('big5' means 'big5-eten'):

    big5-eten   Big5 encoding (with the ETen extensions)
    big5-hkscs  Big5 + Hong Kong Supplementary Character Set, 2001 edition
    cp950       Code page 950 (Big5 + characters added by Microsoft)

For example, to convert a Big5-encoded file to Unicode, just type the following command:

    perl -MEncode -pe '$_= encode( utf8 => decode( big5 => $_ ) )' \
      < file.big5 > file.utf8

Perl also ships with "piconv", a character conversion utility written entirely in Perl, used as follows:

    piconv -f big5 -t utf8 < file.big5 > file.utf8
    piconv -f utf8 -t big5 < file.utf8 > file.big5

In addition, if the source code itself is saved in UTF-8, using the utf8 module makes the strings in the code, and the operations on them, work in units of characters rather than bytes, as shown below:

    #!/usr/bin/env perl
    use utf8;
    print length("駱駝");              # 2 (not 6)
    print index("諄諄教誨", "教誨");   # 2 (character index 2, counting from 0)

=head2 Additional Chinese encodings

If you need more Chinese encodings, you can download the Encode::HanExtra module from CPAN (L<https://www.cpan.org/>).
It currently provides the following encodings:

    cccii       Chinese Character Code for Information Interchange
                (Council for Cultural Affairs, 1980)
    euc-tw      Extended Unix Character set, including CNS 11643 planes 1-7
    big5plus    Big5+ from the Chinese Foundation for Digitization Technology
    big5ext     Big5e from the Chinese Foundation for Digitization Technology

In addition, the Encode::HanConvert module provides two encodings for converting between Simplified and Traditional Chinese:

    big5-simp   Big5 Traditional Chinese <=> Unicode Simplified Chinese
    gbk-trad    GBK Simplified Chinese <=> Unicode Traditional Chinese

To convert between GBK and Big5, see the b2g.pl and g2b.pl programs bundled with that module, or use the following in your program:

    use Encode::HanConvert;
    $euc_cn = big5_to_gb($big5); # convert from Big5 to GBK
    $big5 = gb_to_big5($euc_cn); # convert from GBK to Big5

=head2 Further information

See the extensive documentation that ships with Perl (unfortunately, all written in English) to learn more about Perl and about how to use Unicode. External resources are also plentiful:

=head2 Web sites with Perl resources

=over 4

=item L<https://www.perl.org/>

The Perl home page

=item L<https://www.perl.com/>

A collection of articles run by the Perl Foundation

=item L<https://www.cpan.org/>

The Comprehensive Perl Archive Network

=item L<https://lists.perl.org/>

A list of Perl mailing lists

=back

=head2 Web sites for learning Perl

=over 4

=item L<http://www.oreilly.com.tw/product_perl.php?id=index_perl>

O'Reilly Perl books in Traditional Chinese

=back

=head2 Perl user groups

=over 4

=item L<https://www.pm.org/groups/taiwan.html>

A list of Taiwan Perl Mongers groups

=item L<irc://irc.freenode.org/#perl.tw>

The Perl.tw online chat room

=back

=head2 Unicode-related web sites

=over 4

=item L<https://www.unicode.org/>

The Unicode Consortium (the maker of the Unicode standard)

=item L<http://www.cl.cam.ac.uk/%7Emgk25/unicode.html>

UTF-8 and Unicode FAQ for Unix/Linux

=back

=head2 Chinese localization information

=over 4

=item 中文化軟體聯盟 (Chinese software localization alliance)

L<http://www.cpatch.org/>

=back

=head1 SEE ALSO

L<Encode>, L<Encode::TW>, L<perluniintro>, L<perlunicode>

=head1 AUTHORS

Jarkko Hietaniemi E<lt>jhi@iki.fiE<gt>

Audrey Tang (唐鳳) E<lt>audreyt@audreyt.orgE<gt>

=cut

=head1 NAME

perlintro -- a brief introduction and overview of Perl

=head1 DESCRIPTION

This document is intended to give you a quick overview of the Perl programming language, along with pointers to further documentation. It is intended as a "bootstrap" guide for those who are new to the language, and provides just enough information for you to be able to read other people's Perl and understand roughly what it's doing, or write your own simple scripts.
This introductory document does not aim to be complete. It does not even aim to be entirely accurate. In some cases perfection has been sacrificed in the goal of getting the general idea across. You are I<strongly> advised to follow this introduction with more information from the full Perl manual, the table of contents to which can be found in L<perltoc>. Throughout this document you'll see references to other parts of the Perl documentation. You can read that documentation using the C<perldoc> command or whatever method you're using to read this document. Throughout Perl's documentation, you'll find numerous examples intended to help explain the discussed features. Please keep in mind that many of them are code fragments rather than complete programs. These examples often reflect the style and preference of the author of that piece of the documentation, and may be briefer than a corresponding line of code in a real program. Except where otherwise noted, you should assume that C<use strict> and C<use warnings> statements appear earlier in the "program", and that any variables used have already been declared, even if those declarations have been omitted to make the example easier to read. Do note that the examples have been written by many different authors over a period of several decades. Styles and techniques will therefore differ, although some effort has been made to not vary styles too widely in the same sections. Do not consider one style to be better than others - "There's More Than One Way To Do It" is one of Perl's mottos. After all, in your journey as a programmer, you are likely to encounter different styles. =head2 What is Perl? Perl is a general-purpose programming language originally developed for text manipulation and now used for a wide range of tasks including system administration, web development, network programming, GUI development, and more. 
The language is intended to be practical (easy to use, efficient, complete) rather than beautiful (tiny, elegant, minimal).

Its major features are that it's easy to use, supports both procedural and object-oriented (OO) programming, has powerful built-in support for text processing, and has one of the world's most impressive collections of third-party modules.

Different definitions of Perl are given in L<perl>, L<perlfaq1> and no doubt other places. From this we can determine that Perl is different things to different people, but that lots of people think it's at least worth writing about.

=head2 Running Perl programs

To run a Perl program from the Unix command line:

    perl progname.pl

Alternatively, put this as the first line of your script:

    #!/usr/bin/env perl

... and run the script as F</path/to/script.pl>. Of course, it'll need to be executable first, so C<chmod 755 script.pl> (under Unix).

(This start line assumes you have the B<env> program. You can also give the path to your perl executable directly, as in C<#!/usr/bin/perl>.)

For more information, including instructions for other platforms such as Windows and Mac OS, read L<perlrun>.

=head2 Safety net

Perl by default is very forgiving. In order to make it more robust it is recommended to start every program with the following lines:

    #!/usr/bin/perl
    use strict;
    use warnings;

The two additional lines ask perl to catch various common problems in your code. They check different things so you need both. A potential problem caught by C<use strict;> will cause your code to stop immediately when it is encountered, while C<use warnings;> will merely give a warning (like the command-line switch B<-w>) and let your code run. To read more about them check their respective manual pages at L<strict> and L<warnings>.

=head2 Basic syntax overview

A Perl script or program consists of one or more statements. These statements are simply written in the script in a straightforward fashion.
There is no need to have a C<main()> function or anything of that kind.

Perl statements end in a semi-colon:

    print "Hello, world";

Comments start with a hash symbol and run to the end of the line:

    # This is a comment

Whitespace is irrelevant:

    print
        "Hello, world"
        ;

... except inside quoted strings:

    # this would print with a linebreak in the middle
    print "Hello
    world";

Double quotes or single quotes may be used around literal strings:

    print "Hello, world";
    print 'Hello, world';

However, only double quotes "interpolate" variables and special characters such as newlines (C<\n>):

    print "Hello, $name\n";     # works fine
    print 'Hello, $name\n';     # prints $name\n literally

Numbers don't need quotes around them:

    print 42;

You can use parentheses for functions' arguments or omit them according to your personal taste. They are only required occasionally to clarify issues of precedence.

    print("Hello, world\n");
    print "Hello, world\n";

More detailed information about Perl syntax can be found in L<perlsyn>.

=head2 Perl variable types

Perl has three main variable types: scalars, arrays, and hashes.

=over 4

=item Scalars

A scalar represents a single value:

    my $animal = "camel";
    my $answer = 42;

Scalar values can be strings, integers or floating point numbers, and Perl will automatically convert between them as required. There is no need to pre-declare your variable types, but you have to declare them using the C<my> keyword the first time you use them. (This is one of the requirements of C<use strict;>.)

Scalar values can be used in various ways:

    print $animal;
    print "The animal is $animal\n";
    print "The square of $answer is ", $answer * $answer, "\n";

There are a number of "magic" scalars with names that look like punctuation or line noise. These special variables are used for all kinds of purposes, and are documented in L<perlvar>. The only one you need to know about for now is C<$_> which is the "default variable".
It's used as the default argument to a number of functions in Perl, and it's set implicitly by certain looping constructs.

    print;          # prints contents of $_ by default

=item Arrays

An array represents a list of values:

    my @animals = ("camel", "llama", "owl");
    my @numbers = (23, 42, 69);
    my @mixed   = ("camel", 42, 1.23);

Arrays are zero-indexed. Here's how you get at elements in an array:

    print $animals[0];              # prints "camel"
    print $animals[1];              # prints "llama"

The special variable C<$#array> tells you the index of the last element of an array:

    print $mixed[$#mixed];       # last element, prints 1.23

You might be tempted to use C<$#array + 1> to tell you how many items there are in an array. Don't bother. As it happens, using C<@array> where Perl expects to find a scalar value ("in scalar context") will give you the number of elements in the array:

    if (@animals < 5) { ... }

The elements we're getting from the array start with a C<$> because we're getting just a single value out of the array; you ask for a scalar, you get a scalar.

To get multiple values from an array:

    @animals[0,1];                 # gives ("camel", "llama");
    @animals[0..2];                # gives ("camel", "llama", "owl");
    @animals[1..$#animals];        # gives all except the first element

This is called an "array slice".

You can do various useful things to lists:

    my @sorted    = sort @animals;
    my @backwards = reverse @numbers;

There are a couple of special arrays too, such as C<@ARGV> (the command line arguments to your script) and C<@_> (the arguments passed to a subroutine). These are documented in L<perlvar>.

=item Hashes

A hash represents a set of key/value pairs:

    my %fruit_color = ("apple", "red", "banana", "yellow");

You can use whitespace and the C<< => >> operator to lay them out more nicely:

    my %fruit_color = (
        apple  => "red",
        banana => "yellow",
    );

To get at hash elements:

    $fruit_color{"apple"};           # gives "red"

You can get at lists of keys and values with C<keys()> and C<values()>.

    my @fruits = keys %fruit_color;
    my @colors = values %fruit_color;

Hashes have no particular internal order, though you can sort the keys and loop through them.

Just like special scalars and arrays, there are also special hashes. The most well known of these is C<%ENV> which contains environment variables. Read all about it (and other special variables) in L<perlvar>.

=back

Scalars, arrays and hashes are documented more fully in L<perldata>.

More complex data types can be constructed using references, which allow you to build lists and hashes within lists and hashes.

A reference is a scalar value and can refer to any other Perl data type. So by storing a reference as the value of an array or hash element, you can easily create lists and hashes within lists and hashes. The following example shows a 2 level hash of hash structure using anonymous hash references.

    my $variables = {
        scalar  =>  {
                     description => "single item",
                     sigil => '$',
                    },
        array   =>  {
                     description => "ordered list of items",
                     sigil => '@',
                    },
        hash    =>  {
                     description => "key/value pairs",
                     sigil => '%',
                    },
    };

    print "Scalars begin with a $variables->{'scalar'}->{'sigil'}\n";

Exhaustive information on the topic of references can be found in L<perlreftut>, L<perllol>, L<perlref> and L<perldsc>.

=head2 Variable scoping

Throughout the previous section all the examples have used the syntax:

    my $var = "value";

The C<my> is actually not required; you could just use:

    $var = "value";

However, the above usage will create global variables throughout your program, which is bad programming practice. C<my> creates lexically scoped variables instead. The variables are scoped to the block (i.e. a bunch of statements surrounded by curly-braces) in which they are defined.

    my $x = "foo";
    my $some_condition = 1;
    if ($some_condition) {
        my $y = "bar";
        print $x;           # prints "foo"
        print $y;           # prints "bar"
    }
    print $x;               # prints "foo"
    print $y;               # prints nothing; $y has fallen out of scope

Using C<my> in combination with a C<use strict;> at the top of your Perl scripts means that the interpreter will pick up certain common programming errors. For instance, in the example above, the final C<print $y> would cause a compile-time error and prevent you from running the program. Using C<strict> is highly recommended.

=head2 Conditional and looping constructs

Perl has most of the usual conditional and looping constructs. As of Perl 5.10, it even has a case/switch statement (spelled C<given>/C<when>). See L<perlsyn/"Switch Statements"> for more details.

The conditions can be any Perl expression. See the list of operators in the next section for information on comparison and boolean logic operators, which are commonly used in conditional statements.

=over 4

=item if

    if ( condition ) {
        ...
    } elsif ( other condition ) {
        ...
    } else {
        ...
    }

There's also a negated version of it:

    unless ( condition ) {
        ...
    }

This is provided as a more readable version of C<if (!I<condition>)>.

Note that the braces are required in Perl, even if you've only got one line in the block. However, there is a clever way of making your one-line conditional blocks more English like:

    # the traditional way
    if ($zippy) {
        print "Yow!";
    }

    # the Perlish post-condition way
    print "Yow!" if $zippy;
    print "We have no bananas" unless $bananas;

=item while

    while ( condition ) {
        ...
    }

There's also a negated version, for the same reason we have C<unless>:

    until ( condition ) {
        ...
    }

You can also use C<while> in a post-condition:

    print "LA LA LA\n" while 1;          # loops forever

=item for

Exactly like C:

    for ($i = 0; $i <= $max; $i++) {
        ...
    }

The C style for loop is rarely needed in Perl since Perl provides the more friendly list scanning C<foreach> loop.
=item foreach foreach (@array) { print "This element is $_\n"; } print $list[$_] foreach 0 .. $max; # you don't have to use the default $_ either... foreach my $key (keys %hash) { print "The value of $key is $hash{$key}\n"; } The C<foreach> keyword is actually a synonym for the C<for> keyword. See C<L<perlsyn/"Foreach Loops">>. =back For more detail on looping constructs (and some that weren't mentioned in this overview) see L<perlsyn>. =head2 Builtin operators and functions Perl comes with a wide selection of builtin functions. Some of the ones we've already seen include C<print>, C<sort> and C<reverse>. A list of them is given at the start of L<perlfunc> and you can easily read about any given function by using C<perldoc -f I<functionname>>. Perl operators are documented in full in L<perlop>, but here are a few of the most common ones: =over 4 =item Arithmetic + addition - subtraction * multiplication / division =item Numeric comparison == equality != inequality < less than > greater than <= less than or equal >= greater than or equal =item String comparison eq equality ne inequality lt less than gt greater than le less than or equal ge greater than or equal (Why do we have separate numeric and string comparisons? Because we don't have special variable types, and Perl needs to know whether to sort numerically (where 99 is less than 100) or alphabetically (where 100 comes before 99).) =item Boolean logic && and || or ! not (C<and>, C<or> and C<not> aren't just in the above table as descriptions of the operators. They're also supported as operators in their own right. They're more readable than the C-style operators, but have different precedence to C<&&> and friends. Check L<perlop> for more detail.) =item Miscellaneous = assignment . string concatenation x string multiplication (repeats strings) ..
range operator (creates a list of numbers or strings) =back Many operators can be combined with a C<=> as follows: $a += 1; # same as $a = $a + 1 $a -= 1; # same as $a = $a - 1 $a .= "\n"; # same as $a = $a . "\n"; =head2 Files and I/O You can open a file for input or output using the C<open()> function. It's documented in extravagant detail in L<perlfunc> and L<perlopentut>, but in short: open(my $in, "<", "input.txt") or die "Can't open input.txt: $!"; open(my $out, ">", "output.txt") or die "Can't open output.txt: $!"; open(my $log, ">>", "my.log") or die "Can't open my.log: $!"; You can read from an open filehandle using the C<< <> >> operator. In scalar context it reads a single line from the filehandle, and in list context it reads the whole file in, assigning each line to an element of the list: my $line = <$in>; my @lines = <$in>; Reading in the whole file at one time is called slurping. It can be useful but it may be a memory hog. Most text file processing can be done a line at a time with Perl's looping constructs. The C<< <> >> operator is most often seen in a C<while> loop: while (<$in>) { # assigns each line in turn to $_ print "Just read in this line: $_"; } We've already seen how to print to standard output using C<print()>. However, C<print()> can also take an optional first argument specifying which filehandle to print to: print STDERR "This is your final warning.\n"; print $out $record; print $log $logmessage; When you're done with your filehandles, you should C<close()> them (though to be honest, Perl will clean up after you if you forget): close $in or die "$in: $!"; =head2 Regular expressions Perl's regular expression support is both broad and deep, and is the subject of lengthy documentation in L<perlrequick>, L<perlretut>, and elsewhere. However, in short: =over 4 =item Simple matching if (/foo/) { ... } # true if $_ contains "foo" if ($a =~ /foo/) { ... } # true if $a contains "foo" The C<//> matching operator is documented in L<perlop>. 
It operates on C<$_> by default, or can be bound to another variable using the C<=~> binding operator (also documented in L<perlop>). =item Simple substitution s/foo/bar/; # replaces foo with bar in $_ $a =~ s/foo/bar/; # replaces foo with bar in $a $a =~ s/foo/bar/g; # replaces ALL INSTANCES of foo with bar # in $a The C<s///> substitution operator is documented in L<perlop>. =item More complex regular expressions You don't just have to match on fixed strings. In fact, you can match on just about anything you could dream of by using more complex regular expressions. These are documented at great length in L<perlre>, but for the meantime, here's a quick cheat sheet: . a single character \s a whitespace character (space, tab, newline, ...) \S non-whitespace character \d a digit (0-9) \D a non-digit \w a word character (a-z, A-Z, 0-9, _) \W a non-word character [aeiou] matches a single character in the given set [^aeiou] matches a single character outside the given set (foo|bar|baz) matches any of the alternatives specified ^ start of string $ end of string Quantifiers can be used to specify how many of the previous thing you want to match on, where "thing" means either a literal character, one of the metacharacters listed above, or a group of characters or metacharacters in parentheses. * zero or more of the previous thing + one or more of the previous thing ? 
zero or one of the previous thing {3} matches exactly 3 of the previous thing {3,6} matches between 3 and 6 of the previous thing {3,} matches 3 or more of the previous thing Some brief examples: /^\d+/ string starts with one or more digits /^$/ nothing in the string (start and end are adjacent) /(\d\s){3}/ three digits, each followed by a whitespace character (eg "3 4 5 ") /(a.)+/ matches a string in which every odd-numbered letter is a (eg "abacadaf") # This loop reads from STDIN, and prints non-blank lines: while (<>) { next if /^$/; print; } =item Parentheses for capturing As well as grouping, parentheses serve a second purpose. They can be used to capture the results of parts of the regexp match for later use. The results end up in C<$1>, C<$2> and so on. # a cheap and nasty way to break an email address up into parts if ($email =~ /([^@]+)@(.+)/) { print "Username is $1\n"; print "Hostname is $2\n"; } =item Other regexp features Perl regexps also support backreferences, lookaheads, and all kinds of other complex details. Read all about them in L<perlrequick>, L<perlretut>, and L<perlre>. =back =head2 Writing subroutines Writing subroutines is easy: sub logger { my $logmessage = shift; open my $logfile, ">>", "my.log" or die "Could not open my.log: $!"; print $logfile $logmessage; } Now we can use the subroutine just as any other built-in function: logger("We have a logger subroutine!"); What's that C<shift>? Well, the arguments to a subroutine are available to us as a special array called C<@_> (see L<perlvar> for more on that). The default argument to the C<shift> function just happens to be C<@_>. So C<my $logmessage = shift;> shifts the first item off the list of arguments and assigns it to C<$logmessage>. 
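To make the behaviour of C<shift> concrete, here is a minimal sketch. The subroutine name C<first_and_rest> is made up for illustration and is not part of the original tutorial; it only shows that C<shift> removes the first argument from C<@_> and leaves the others in place.

```perl
use strict;
use warnings;

# Hypothetical helper for illustration only: reports what shift removed
# from @_ and how many arguments are left behind.
sub first_and_rest {
    my $first     = shift;        # with no argument, shift works on @_
    my $remaining = scalar @_;    # count of arguments still in @_
    return "$first (+$remaining more)";
}

print first_and_rest("alpha", "beta", "gamma"), "\n";   # alpha (+2 more)
```

Calling C<shift> a second time inside the subroutine would return "beta", and so on until C<@_> is empty.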
We can manipulate C<@_> in other ways too: my ($logmessage, $priority) = @_; # common my $logmessage = $_[0]; # uncommon, and ugly Subroutines can also return values: sub square { my $num = shift; my $result = $num * $num; return $result; } Then use it like: $sq = square(8); For more information on writing subroutines, see L<perlsub>. =head2 OO Perl OO Perl is relatively simple and is implemented using references which know what sort of object they are based on Perl's concept of packages. However, OO Perl is largely beyond the scope of this document. Read L<perlootut> and L<perlobj>. As a beginning Perl programmer, your most common use of OO Perl will be in using third-party modules, which are documented below. =head2 Using Perl modules Perl modules provide a range of features to help you avoid reinventing the wheel, and can be downloaded from CPAN ( L<http://www.cpan.org/> ). A number of popular modules are included with the Perl distribution itself. Categories of modules range from text manipulation to network protocols to database integration to graphics. A categorized list of modules is also available from CPAN. To learn how to install modules you download from CPAN, read L<perlmodinstall>. To learn how to use a particular module, use C<perldoc I<Module::Name>>. Typically you will want to C<use I<Module::Name>>, which will then give you access to exported functions or an OO interface to the module. L<perlfaq> contains questions and answers related to many common tasks, and often provides suggestions for good CPAN modules to use. L<perlmod> describes Perl modules in general. L<perlmodlib> lists the modules which came with your Perl installation. If you feel the urge to write Perl modules, L<perlnewmod> will give you good advice. 
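As a small illustration of C<use I<Module::Name>>, the sketch below loads L<List::Util>, which ships with Perl, and imports two of the functions it exports. The choice of module and the sample numbers are ours, picked only because they need nothing from CPAN.

```perl
use strict;
use warnings;

# List::Util is bundled with Perl; the import list names the
# functions we want copied into our own namespace.
use List::Util qw(sum max);

my @numbers = (23, 42, 69);

print "Total:   ", sum(@numbers), "\n";   # Total:   134
print "Largest: ", max(@numbers), "\n";   # Largest: 69
```

Modules that offer an OO interface are loaded the same way, and you then call methods on the class or on objects it returns, as described in that module's own documentation.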
=head1 AUTHOR Kirrily "Skud" Robert <skud@cpan.org> =head1 NAME X<reference> X<pointer> X<data structure> X<structure> X<struct> perlref - Perl references and nested data structures =head1 NOTE This is complete documentation about all aspects of references. For a shorter, tutorial introduction to just the essential features, see L<perlreftut>. =head1 DESCRIPTION Before release 5 of Perl it was difficult to represent complex data structures, because all references had to be symbolic--and even then it was difficult to refer to a variable instead of a symbol table entry. Perl now not only makes it easier to use symbolic references to variables, but also lets you have "hard" references to any piece of data or code. Any scalar may hold a hard reference. Because arrays and hashes contain scalars, you can now easily build arrays of arrays, arrays of hashes, hashes of arrays, arrays of hashes of functions, and so on. Hard references are smart--they keep track of reference counts for you, automatically freeing the thing referred to when its reference count goes to zero. (Reference counts for values in self-referential or cyclic data structures may not go to zero without a little help; see L</"Circular References"> for a detailed explanation.) If that thing happens to be an object, the object is destructed. See L<perlobj> for more about objects. (In a sense, everything in Perl is an object, but we usually reserve the word for references to objects that have been officially "blessed" into a class package.) Symbolic references are names of variables or other objects, just as a symbolic link in a Unix filesystem contains merely the name of a file. The C<*glob> notation is something of a symbolic reference. (Symbolic references are sometimes called "soft references", but please don't call them that; references are confusing enough without useless synonyms.)
X<reference, symbolic> X<reference, soft> X<symbolic reference> X<soft reference> In contrast, hard references are more like hard links in a Unix file system: They are used to access an underlying object without concern for what its (other) name is. When the word "reference" is used without an adjective, as in the following paragraph, it is usually talking about a hard reference. X<reference, hard> X<hard reference> References are easy to use in Perl. There is just one overriding principle: in general, Perl does no implicit referencing or dereferencing. When a scalar is holding a reference, it always behaves as a simple scalar. It doesn't magically start being an array or hash or subroutine; you have to tell it explicitly to do so, by dereferencing it. =head2 Making References X<reference, creation> X<referencing> References can be created in several ways. =head3 Backslash Operator X<\> X<backslash> By using the backslash operator on a variable, subroutine, or value. (This works much like the & (address-of) operator in C.) This typically creates I<another> reference to a variable, because there's already a reference to the variable in the symbol table. But the symbol table reference might go away, and you'll still have the reference that the backslash returned. Here are some examples: $scalarref = \$foo; $arrayref = \@ARGV; $hashref = \%ENV; $coderef = \&handler; $globref = \*foo; It isn't possible to create a true reference to an IO handle (filehandle or dirhandle) using the backslash operator. The most you can get is a reference to a typeglob, which is actually a complete symbol table entry. But see the explanation of the C<*foo{THING}> syntax below. However, you can still use type globs and globrefs as though they were IO handles. 
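The following sketch gathers the backslash forms into one runnable script. The variable names, the C<handler> subroutine, and the use of C<STDOUT> for the glob are placeholders chosen for this example; C<ref()> is used only to show what kind of thing each reference points to.

```perl
use strict;
use warnings;

my $foo   = "hello";
my @list  = (1, 2, 3);
my %table = (key => "value");
sub handler { return "handled" }    # placeholder subroutine

my $scalarref = \$foo;
my $arrayref  = \@list;
my $hashref   = \%table;
my $coderef   = \&handler;
my $globref   = \*STDOUT;

# ref() names the type of the referent for each reference.
my $kinds = join " ", map { ref($_) }
    $scalarref, $arrayref, $hashref, $coderef, $globref;
print "$kinds\n";                   # SCALAR ARRAY HASH CODE GLOB

# The reference and the named variable share the same storage.
$$scalarref = "goodbye";
print "$foo\n";                     # goodbye

# A globref can stand in for a filehandle.
print $globref "printed through a globref\n";
```

Note that changing the value through C<$scalarref> changes C<$foo> itself, because the backslash operator does not copy anything.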
=head3 Square Brackets X<array, anonymous> X<[> X<[]> X<square bracket> X<bracket, square> X<arrayref> X<array reference> X<reference, array> A reference to an anonymous array can be created using square brackets: $arrayref = [1, 2, ['a', 'b', 'c']]; Here we've created a reference to an anonymous array of three elements whose final element is itself a reference to another anonymous array of three elements. (The multidimensional syntax described later can be used to access this. For example, after the above, C<< $arrayref->[2][1] >> would have the value "b".) Taking a reference to an enumerated list is not the same as using square brackets--instead it's the same as creating a list of references! @list = (\$a, \@b, \%c); @list = \($a, @b, %c); # same thing! As a special case, C<\(@foo)> returns a list of references to the contents of C<@foo>, not a reference to C<@foo> itself. Likewise for C<%foo>, except that the key references are to copies (since the keys are just strings rather than full-fledged scalars). =head3 Curly Brackets X<hash, anonymous> X<{> X<{}> X<curly bracket> X<bracket, curly> X<brace> X<hashref> X<hash reference> X<reference, hash> A reference to an anonymous hash can be created using curly brackets: $hashref = { 'Adam' => 'Eve', 'Clyde' => 'Bonnie', }; Anonymous hash and array composers like these can be intermixed freely to produce as complicated a structure as you want. The multidimensional syntax described below works for these too. The values above are literals, but variables and expressions would work just as well, because assignment operators in Perl (even within local() or my()) are executable statements, not compile-time declarations. Because curly brackets (braces) are used for several other things including BLOCKs, you may occasionally have to disambiguate braces at the beginning of a statement by putting a C<+> or a C<return> in front so that Perl realizes the opening brace isn't starting a BLOCK. 
The economy and mnemonic value of using curlies is deemed worth this occasional extra hassle. For example, if you wanted a function to make a new hash and return a reference to it, you have these options: sub hashem { { @_ } } # silently wrong sub hashem { +{ @_ } } # ok sub hashem { return { @_ } } # ok On the other hand, if you want the other meaning, you can do this: sub showem { { @_ } } # ambiguous (currently ok, # but may change) sub showem { {; @_ } } # ok sub showem { { return @_ } } # ok The leading C<+{> and C<{;> always serve to disambiguate the expression to mean either the HASH reference, or the BLOCK. =head3 Anonymous Subroutines X<subroutine, anonymous> X<subroutine, reference> X<reference, subroutine> X<scope, lexical> X<closure> X<lexical> X<lexical scope> A reference to an anonymous subroutine can be created by using C<sub> without a subname: $coderef = sub { print "Boink!\n" }; Note the semicolon. Except for the code inside not being immediately executed, a C<sub {}> is not so much a declaration as it is an operator, like C<do{}> or C<eval{}>. (However, no matter how many times you execute that particular line (unless you're in an C<eval("...")>), $coderef will still have a reference to the I<same> anonymous subroutine.) Anonymous subroutines act as closures with respect to my() variables, that is, variables lexically visible within the current scope. Closure is a notion out of the Lisp world that says if you define an anonymous function in a particular lexical context, it pretends to run in that context even when it's called outside the context. In human terms, it's a funny way of passing arguments to a subroutine when you define it as well as when you call it. It's useful for setting up little bits of code to run later, such as callbacks. You can even do object-oriented stuff with it, though Perl already provides a different mechanism to do that--see L<perlobj>. 
You might also think of closure as a way to write a subroutine template without using eval(). Here's a small example of how closures work: sub newprint { my $x = shift; return sub { my $y = shift; print "$x, $y!\n"; }; } $h = newprint("Howdy"); $g = newprint("Greetings"); # Time passes... &$h("world"); &$g("earthlings"); This prints Howdy, world! Greetings, earthlings! Note particularly that $x continues to refer to the value passed into newprint() I<despite> "my $x" having gone out of scope by the time the anonymous subroutine runs. That's what a closure is all about. This applies only to lexical variables, by the way. Dynamic variables continue to work as they have always worked. Closure is not something that most Perl programmers need trouble themselves about to begin with. =head3 Constructors X<constructor> X<new> References are often returned by special subroutines called constructors. Perl objects are just references to a special type of object that happens to know which package it's associated with. Constructors are just special subroutines that know how to create that association. They do so by starting with an ordinary reference, and it remains an ordinary reference even while it's also being an object. Constructors are often named C<new()>. You I<can> call them indirectly: $objref = new Doggie( Tail => 'short', Ears => 'long' ); But that can produce ambiguous syntax in certain cases, so it's often better to use the direct method invocation approach: $objref = Doggie->new(Tail => 'short', Ears => 'long'); use Term::Cap; $terminal = Term::Cap->Tgetent( { OSPEED => 9600 }); use Tk; $main = MainWindow->new(); $menubar = $main->Frame(-relief => "raised", -borderwidth => 2); =head3 Autovivification X<autovivification> References of the appropriate type can spring into existence if you dereference them in a context that assumes they exist. Because we haven't talked about dereferencing yet, we can't show you any examples yet.
=head3 Typeglob Slots X<*foo{THING}> X<*> A reference can be created by using a special syntax, lovingly known as the *foo{THING} syntax. *foo{THING} returns a reference to the THING slot in *foo (which is the symbol table entry which holds everything known as foo). $scalarref = *foo{SCALAR}; $arrayref = *ARGV{ARRAY}; $hashref = *ENV{HASH}; $coderef = *handler{CODE}; $ioref = *STDIN{IO}; $globref = *foo{GLOB}; $formatref = *foo{FORMAT}; $globname = *foo{NAME}; # "foo" $pkgname = *foo{PACKAGE}; # "main" Most of these are self-explanatory, but C<*foo{IO}> deserves special attention. It returns the IO handle, used for file handles (L<perlfunc/open>), sockets (L<perlfunc/socket> and L<perlfunc/socketpair>), and directory handles (L<perlfunc/opendir>). For compatibility with previous versions of Perl, C<*foo{FILEHANDLE}> is a synonym for C<*foo{IO}>, though it is discouraged, to encourage a consistent use of one name: IO. On perls between v5.8 and v5.22, it will issue a deprecation warning, but this deprecation has since been rescinded. C<*foo{THING}> returns undef if that particular THING hasn't been used yet, except in the case of scalars. C<*foo{SCALAR}> returns a reference to an anonymous scalar if $foo hasn't been used yet. This might change in a future release. C<*foo{NAME}> and C<*foo{PACKAGE}> are the exception, in that they return strings, rather than references. These return the package and name of the typeglob itself, rather than one that has been assigned to it. So, after C<*foo=*Foo::bar>, C<*foo> will become "*Foo::bar" when used as a string, but C<*foo{PACKAGE}> and C<*foo{NAME}> will continue to produce "main" and "foo", respectively. C<*foo{IO}> is an alternative to the C<*HANDLE> mechanism given in L<perldata/"Typeglobs and Filehandles"> for passing filehandles into or out of subroutines, or storing into larger data structures. Its disadvantage is that it won't create a new filehandle for you. 
Its advantage is that you have less risk of clobbering more than you want to with a typeglob assignment. (It still conflates file and directory handles, though.) However, if you assign the incoming value to a scalar instead of a typeglob as we do in the examples below, there's no risk of that happening. splutter(*STDOUT); # pass the whole glob splutter(*STDOUT{IO}); # pass both file and dir handles sub splutter { my $fh = shift; print $fh "her um well a hmmm\n"; } $rec = get_rec(*STDIN); # pass the whole glob $rec = get_rec(*STDIN{IO}); # pass both file and dir handles sub get_rec { my $fh = shift; return scalar <$fh>; } =head2 Using References X<reference, use> X<dereferencing> X<dereference> That's it for creating references. By now you're probably dying to know how to use references to get back to your long-lost data. There are several basic methods. =head3 Simple Scalar Anywhere you'd put an identifier (or chain of identifiers) as part of a variable or subroutine name, you can replace the identifier with a simple scalar variable containing a reference of the correct type: $bar = $$scalarref; push(@$arrayref, $filename); $$arrayref[0] = "January"; $$hashref{"KEY"} = "VALUE"; &$coderef(1,2,3); print $globref "output\n"; It's important to understand that we are specifically I<not> dereferencing C<$arrayref[0]> or C<$hashref{"KEY"}> there. The dereference of the scalar variable happens I<before> it does any key lookups. Anything more complicated than a simple scalar variable must use methods 2 or 3 below. However, a "simple scalar" includes an identifier that itself uses method 1 recursively. Therefore, the following prints "howdy". $refrefref = \\\"howdy"; print $$$$refrefref; =head3 Block Anywhere you'd put an identifier (or chain of identifiers) as part of a variable or subroutine name, you can replace the identifier with a BLOCK returning a reference of the correct type. 
In other words, the previous examples could be written like this: $bar = ${$scalarref}; push(@{$arrayref}, $filename); ${$arrayref}[0] = "January"; ${$hashref}{"KEY"} = "VALUE"; &{$coderef}(1,2,3); $globref->print("output\n"); # iff IO::Handle is loaded Admittedly, it's a little silly to use the curlies in this case, but the BLOCK can contain any arbitrary expression, in particular, subscripted expressions: &{ $dispatch{$index} }(1,2,3); # call correct routine Because of being able to omit the curlies for the simple case of C<$$x>, people often make the mistake of viewing the dereferencing symbols as proper operators, and wonder about their precedence. If they were, though, you could use parentheses instead of braces. That's not the case. Consider the difference below; case 0 is a short-hand version of case 1, I<not> case 2: $$hashref{"KEY"} = "VALUE"; # CASE 0 ${$hashref}{"KEY"} = "VALUE"; # CASE 1 ${$hashref{"KEY"}} = "VALUE"; # CASE 2 ${$hashref->{"KEY"}} = "VALUE"; # CASE 3 Case 2 is also deceptive in that you're accessing a variable called %hashref, not dereferencing through $hashref to the hash it's presumably referencing. That would be case 3. =head3 Arrow Notation Subroutine calls and lookups of individual array elements arise often enough that it gets cumbersome to use method 2. As a form of syntactic sugar, the examples for method 2 may be written: $arrayref->[0] = "January"; # Array element $hashref->{"KEY"} = "VALUE"; # Hash element $coderef->(1,2,3); # Subroutine call The left side of the arrow can be any expression returning a reference, including a previous dereference. Note that C<$array[$x]> is I<not> the same thing as C<< $array->[$x] >> here: $array[$x]->{"foo"}->[0] = "January"; This is one of the cases we mentioned earlier in which references could spring into existence when in an lvalue context. Before this statement, C<$array[$x]> may have been undefined. 
If so, it's automatically defined with a hash reference so that we can look up C<{"foo"}> in it. Likewise C<< $array[$x]->{"foo"} >> will automatically get defined with an array reference so that we can look up C<[0]> in it. This process is called I<autovivification>. One more thing here. The arrow is optional I<between> brackets subscripts, so you can shrink the above down to $array[$x]{"foo"}[0] = "January"; Which, in the degenerate case of using only ordinary arrays, gives you multidimensional arrays just like C's: $score[$x][$y][$z] += 42; Well, okay, not entirely like C's arrays, actually. C doesn't know how to grow its arrays on demand. Perl does. =head3 Objects If a reference happens to be a reference to an object, then there are probably methods to access the things referred to, and you should probably stick to those methods unless you're in the class package that defines the object's methods. In other words, be nice, and don't violate the object's encapsulation without a very good reason. Perl does not enforce encapsulation. We are not totalitarians here. We do expect some basic civility though. =head3 Miscellaneous Usage Using a string or number as a reference produces a symbolic reference, as explained above. Using a reference as a number produces an integer representing its storage location in memory. The only useful thing to be done with this is to compare two references numerically to see whether they refer to the same location. X<reference, numeric context> if ($ref1 == $ref2) { # cheap numeric compare of references print "refs 1 and 2 refer to the same thing\n"; } Using a reference as a string produces both its referent's type, including any package blessing as described in L<perlobj>, as well as the numeric address expressed in hex. The ref() operator returns just the type of thing the reference is pointing to, without the address. See L<perlfunc/ref> for details and examples of its use. 
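Here is a short sketch of the three views of a reference described above: as a number, as a string, and through C<ref()>. The variable names are ours. Because the address differs from run to run, the string form is checked against a pattern instead of being printed as a fixed value.

```perl
use strict;
use warnings;

my @data  = (1, 2, 3);
my $ref1  = \@data;
my $ref2  = \@data;        # second reference to the same array
my $other = [1, 2, 3];     # same contents, but a different array

# Numeric comparison compares storage locations, not contents.
print "ref1 and ref2 point at the same array\n" if $ref1 == $ref2;
print "ref1 and other point at different arrays\n" if $ref1 != $other;

# ref() gives only the type of the referent.
print ref($ref1), "\n";    # ARRAY

# String form is the type plus the address in hex, e.g. ARRAY(0x...).
print "string form looks like TYPE(0xADDR)\n"
    if "$ref1" =~ /^ARRAY\(0x[0-9a-f]+\)$/;
```

All four C<print> statements run, which shows that two references compare equal only when they point at the very same thing.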
X<reference, string context> The bless() operator may be used to associate the object a reference points to with a package functioning as an object class. See L<perlobj>. A typeglob may be dereferenced the same way a reference can, because the dereference syntax always indicates the type of reference desired. So C<${*foo}> and C<${\$foo}> both indicate the same scalar variable. Here's a trick for interpolating a subroutine call into a string: print "My sub returned @{[mysub(1,2,3)]} that time.\n"; The way it works is that when the C<@{...}> is seen in the double-quoted string, it's evaluated as a block. The block creates a reference to an anonymous array containing the results of the call to C<mysub(1,2,3)>. So the whole block returns a reference to an array, which is then dereferenced by C<@{...}> and stuck into the double-quoted string. This chicanery is also useful for arbitrary expressions: print "That yields @{[$n + 5]} widgets\n"; Similarly, an expression that returns a reference to a scalar can be dereferenced via C<${...}>. Thus, the above expression may be written as: print "That yields ${\($n + 5)} widgets\n"; =head2 Circular References X<circular reference> X<reference, circular> It is possible to create a "circular reference" in Perl, which can lead to memory leaks. A circular reference occurs when two references contain a reference to each other, like this: my $foo = {}; my $bar = { foo => $foo }; $foo->{bar} = $bar; You can also create a circular reference with a single variable: my $foo; $foo = \$foo; In this case, the reference count for the variables will never reach 0, and the references will never be garbage-collected. This can lead to memory leaks. Because objects in Perl are implemented as references, it's possible to have circular references with objects as well. Imagine a TreeNode class where each node references its parent and child nodes. Any node with a parent will be part of a circular reference. 
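The TreeNode situation can be sketched as follows. This C<TreeNode> class is a hypothetical, stripped-down illustration and not a real module; its C<DESTROY> method does nothing except count how many nodes have been destroyed, so that we can see the cycle keeping the nodes alive.

```perl
use strict;
use warnings;

package TreeNode;    # hypothetical class, for illustration only

my $destroyed = 0;   # counts DESTROY calls

sub new {
    my ($class, %args) = @_;
    return bless { children => [], %args }, $class;
}

sub add_child {
    my ($self, $child) = @_;
    $child->{parent} = $self;               # child refers to parent
    push @{ $self->{children} }, $child;    # parent refers to child
    return $child;
}

sub DESTROY { $destroyed++ }

package main;

{
    my $root = TreeNode->new(name => "root");
    $root->add_child(TreeNode->new(name => "leaf"));
}   # $root goes out of scope here

# Each node is still referenced by the other, so neither was freed.
print "destroyed so far: $destroyed\n";     # destroyed so far: 0
```

Without the C<parent> link, both nodes would have been destroyed at the closing brace and the counter would read 2.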
You can break circular references by creating a "weak reference". A weak reference does not increment the reference count for a variable, which means that the object can go out of scope and be destroyed. You can weaken a reference with the C<weaken> function exported by the L<Scalar::Util> module. Here's how we can make the first example safer: use Scalar::Util 'weaken'; my $foo = {}; my $bar = { foo => $foo }; $foo->{bar} = $bar; weaken $foo->{bar}; The reference from C<$foo> to C<$bar> has been weakened. When the C<$bar> variable goes out of scope, it will be garbage-collected. The next time you look at the value of the C<< $foo->{bar} >> key, it will be C<undef>. This action at a distance can be confusing, so you should be careful with your use of weaken. You should weaken the reference in the variable that will go out of scope I<first>. That way, the longer-lived variable will contain the expected reference until it goes out of scope. =head2 Symbolic references X<reference, symbolic> X<reference, soft> X<symbolic reference> X<soft reference> We said that references spring into existence as necessary if they are undefined, but we didn't say what happens if a value used as a reference is already defined, but I<isn't> a hard reference. If you use it as a reference, it'll be treated as a symbolic reference. That is, the value of the scalar is taken to be the I<name> of a variable, rather than a direct link to a (possibly) anonymous value. People frequently expect it to work like this. So it does. $name = "foo"; $$name = 1; # Sets $foo ${$name} = 2; # Sets $foo ${$name x 2} = 3; # Sets $foofoo $name->[0] = 4; # Sets $foo[0] @$name = (); # Clears @foo &$name(); # Calls &foo() $pack = "THAT"; ${"${pack}::$name"} = 5; # Sets $THAT::foo without eval This is powerful, and slightly dangerous, in that it's possible to intend (with the utmost sincerity) to use a hard reference, and accidentally use a symbolic reference instead. 
To protect against that, you can say use strict 'refs'; and then only hard references will be allowed for the rest of the enclosing block. An inner block may countermand that with no strict 'refs'; Only package variables (globals, even if localized) are visible to symbolic references. Lexical variables (declared with my()) aren't in a symbol table, and thus are invisible to this mechanism. For example: local $value = 10; $ref = "value"; { my $value = 20; print $$ref; } This will still print 10, not 20. Remember that local() affects package variables, which are all "global" to the package. =head2 Not-so-symbolic references Brackets around a symbolic reference can simply serve to isolate an identifier or variable name from the rest of an expression, just as they always have within a string. For example, $push = "pop on "; print "${push}over"; has always meant to print "pop on over", even though push is a reserved word. This is generalized to work the same without the enclosing double quotes, so that print ${push} . "over"; and even print ${ push } . "over"; will have the same effect. This construct is I<not> considered to be a symbolic reference when you're using strict refs: use strict 'refs'; ${ bareword }; # Okay, means $bareword. ${ "bareword" }; # Error, symbolic reference. Similarly, because of all the subscripting that is done using single words, the same rule applies to any bareword that is used for subscripting a hash. So now, instead of writing $hash{ "aaa" }{ "bbb" }{ "ccc" } you can write just $hash{ aaa }{ bbb }{ ccc } and not worry about whether the subscripts are reserved words. In the rare event that you do wish to do something like $hash{ shift } you can force interpretation as a reserved word by adding anything that makes it more than a bareword: $hash{ shift() } $hash{ +shift } $hash{ shift @_ } The C<use warnings> pragma or the B<-w> switch will warn you if it interprets a reserved word as a string. 
But it will no longer warn you about using lowercase words, because the
string is effectively quoted.

=head2 Pseudo-hashes: Using an array as a hash
X<pseudo-hash> X<pseudo hash> X<pseudohash>

Pseudo-hashes have been removed from Perl.  The 'fields' pragma
remains available.

=head2 Function Templates
X<scope, lexical> X<closure> X<lexical> X<lexical scope>
X<subroutine, nested> X<sub, nested> X<subroutine, local> X<sub, local>

As explained above, an anonymous function with access to the lexical
variables visible when that function was compiled creates a closure.  It
retains access to those variables even though it doesn't get run until
later, such as in a signal handler or a Tk callback.

Using a closure as a function template allows us to generate many functions
that act similarly.  Suppose you wanted functions named after the colors
that generated HTML font changes for the various colors:

    print "Be ", red("careful"), "with that ", green("light");

The red() and green() functions would be similar.  To create these,
we'll assign a closure to a typeglob of the name of the function we're
trying to build.

    @colors = qw(red blue green yellow orange purple violet);
    for my $name (@colors) {
        no strict 'refs';       # allow symbol table manipulation
        *$name = *{uc $name} = sub { "<FONT COLOR='$name'>@_</FONT>" };
    }

Now all those different functions appear to exist independently.  You can
call red(), RED(), blue(), BLUE(), green(), etc.  This technique saves on
both compile time and memory use, and is less error-prone as well, since
syntax checks happen at compile time.  It's critical that any variables in
the anonymous subroutine be lexicals in order to create a proper closure.
That's the reason for the C<my> on the loop iteration variable.

This is one of the few places where giving a prototype to a closure makes
much sense.
If you wanted to impose scalar context on the arguments of these functions (probably not a wise idea for this particular example), you could have written it this way instead: *$name = sub ($) { "<FONT COLOR='$name'>$_[0]</FONT>" }; However, since prototype checking happens at compile time, the assignment above happens too late to be of much use. You could address this by putting the whole loop of assignments within a BEGIN block, forcing it to occur during compilation. Access to lexicals that change over time--like those in the C<for> loop above, basically aliases to elements from the surrounding lexical scopes-- only works with anonymous subs, not with named subroutines. Generally said, named subroutines do not nest properly and should only be declared in the main package scope. This is because named subroutines are created at compile time so their lexical variables get assigned to the parent lexicals from the first execution of the parent block. If a parent scope is entered a second time, its lexicals are created again, while the nested subs still reference the old ones. Anonymous subroutines get to capture each time you execute the C<sub> operator, as they are created on the fly. If you are accustomed to using nested subroutines in other programming languages with their own private variables, you'll have to work at it a bit in Perl. The intuitive coding of this type of thing incurs mysterious warnings about "will not stay shared" due to the reasons explained above. For example, this won't work: sub outer { my $x = $_[0] + 35; sub inner { return $x * 19 } # WRONG return $x + inner(); } A work-around is the following: sub outer { my $x = $_[0] + 35; local *inner = sub { return $x * 19 }; return $x + inner(); } Now inner() can only be called from within outer(), because of the temporary assignments of the anonymous subroutine. But when it does, it has normal access to the lexical variable $x from the scope of outer() at the time outer is invoked. 
This has the interesting effect of creating a function local to another function, something not normally supported in Perl. =head1 WARNING: Don't use references as hash keys X<reference, string context> X<reference, use as hash key> You may not (usefully) use a reference as the key to a hash. It will be converted into a string: $x{ \$a } = $a; If you try to dereference the key, it won't do a hard dereference, and you won't accomplish what you're attempting. You might want to do something more like $r = \@a; $x{ $r } = $r; And then at least you can use the values(), which will be real refs, instead of the keys(), which won't. The standard Tie::RefHash module provides a convenient workaround to this. =head2 Postfix Dereference Syntax Beginning in v5.20.0, a postfix syntax for using references is available. It behaves as described in L</Using References>, but instead of a prefixed sigil, a postfixed sigil-and-star is used. For example: $r = \@a; @b = $r->@*; # equivalent to @$r or @{ $r } $r = [ 1, [ 2, 3 ], 4 ]; $r->[1]->@*; # equivalent to @{ $r->[1] } In Perl 5.20 and 5.22, this syntax must be enabled with C<use feature 'postderef'>. As of Perl 5.24, no feature declarations are required to make it available. Postfix dereference should work in all circumstances where block (circumfix) dereference worked, and should be entirely equivalent. This syntax allows dereferencing to be written and read entirely left-to-right. The following equivalencies are defined: $sref->$*; # same as ${ $sref } $aref->@*; # same as @{ $aref } $aref->$#*; # same as $#{ $aref } $href->%*; # same as %{ $href } $cref->&*; # same as &{ $cref } $gref->**; # same as *{ $gref } Note especially that C<< $cref->&* >> is I<not> equivalent to C<< $cref->() >>, and can serve different purposes. 
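The equivalences above can be tried out directly. The following is a minimal sketch, assuming Perl 5.24 or later so that no feature declaration is needed; the variable names and the C<count_both_ways> helper are made up for illustration. The helper shows one way in which C<< $cref->&* >> differs from C<< $cref->() >>: the former calls the code with the caller's current C<@_>, the latter with an empty argument list.

```perl
use v5.24;      # postfix dereference needs no feature declaration from 5.24 on
use warnings;

my $aref = [ 10, 20, 30 ];
my $href = { b => 2, a => 1 };
my $sref = \"hello";

my @copy       = $aref->@*;              # same as @{ $aref }
my $last_index = $aref->$#*;             # same as $#{ $aref }
my @keys       = sort keys $href->%*;    # same as keys %{ $href }
my $string     = $sref->$*;              # same as ${ $sref }

# $cref->&* behaves like &{ $cref }: the callee sees the caller's @_.
# $cref->() passes an explicit, empty argument list.
sub count_both_ways {
    my $cref   = sub { scalar @_ };
    my $shared = $cref->&*;
    my $fresh  = $cref->();
    return ($shared, $fresh);
}
my ($shared, $fresh) = count_both_ways('x', 'y', 'z');

say "copy: @copy; last index: $last_index; keys: @keys; string: $string";
say "shared: $shared; fresh: $fresh";
```

Here C<$shared> is 3, because the anonymous sub saw the three arguments passed to C<count_both_ways>, while C<$fresh> is 0.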
Glob elements can be extracted through the postfix dereferencing feature: $gref->*{SCALAR}; # same as *{ $gref }{SCALAR} Postfix array and scalar dereferencing I<can> be used in interpolating strings (double quotes or the C<qq> operator), but only if the C<postderef_qq> feature is enabled. =head2 Postfix Reference Slicing Value slices of arrays and hashes may also be taken with postfix dereferencing notation, with the following equivalencies: $aref->@[ ... ]; # same as @$aref[ ... ] $href->@{ ... }; # same as @$href{ ... } Postfix key/value pair slicing, added in 5.20.0 and documented in L<the KeyE<sol>Value Hash Slices section of perldata|perldata/"Key/Value Hash Slices">, also behaves as expected: $aref->%[ ... ]; # same as %$aref[ ... ] $href->%{ ... }; # same as %$href{ ... } As with postfix array, postfix value slice dereferencing I<can> be used in interpolating strings (double quotes or the C<qq> operator), but only if the C<postderef_qq> L<feature> is enabled. =head2 Assigning to References Beginning in v5.22.0, the referencing operator can be assigned to. It performs an aliasing operation, so that the variable name referenced on the left-hand side becomes an alias for the thing referenced on the right-hand side: \$a = \$b; # $a and $b now point to the same scalar \&foo = \&bar; # foo() now means bar() This syntax must be enabled with C<use feature 'refaliasing'>. It is experimental, and will warn by default unless C<no warnings 'experimental::refaliasing'> is in effect. These forms may be assigned to, and cause the right-hand side to be evaluated in scalar context: \$scalar \@array \%hash \&sub \my $scalar \my @array \my %hash \state $scalar # or @array, etc. \our $scalar # etc. \local $scalar # etc. \local our $scalar # etc. \$some_array[$index] \$some_hash{$key} \local $some_array[$index] \local $some_hash{$key} condition ? \$this : \$that[0] # etc. 
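A couple of the scalar-context forms listed above can be sketched as follows. This is only an illustration under the stated assumptions (Perl 5.22 or later, with the experimental C<refaliasing> feature enabled and its warning silenced); C<@data>, C<%config> and the alias names are hypothetical.

```perl
use v5.22;
use warnings;
use feature 'refaliasing';
no warnings 'experimental::refaliasing';

my @data = (1, 2, 3);
\my @alias = \@data;                # @alias is now another name for @data
push @alias, 4;                     # so this also grows @data

my %config = (retries => 1);
\my $retries = \$config{retries};   # alias a single hash element
$retries++;                         # updates $config{retries} in place

say "data: @data";
say "retries: $config{retries}";
```

Because both assignments alias rather than copy, C<@data> ends up with four elements and C<$config{retries}> becomes 2.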
Slicing operations and parentheses cause the right-hand side to be evaluated in list context: \@array[5..7] (\@array[5..7]) \(@array[5..7]) \@hash{'foo','bar'} (\@hash{'foo','bar'}) \(@hash{'foo','bar'}) (\$scalar) \($scalar) \(my $scalar) \my($scalar) (\@array) (\%hash) (\&sub) \(&sub) \($foo, @bar, %baz) (\$foo, \@bar, \%baz) Each element on the right-hand side must be a reference to a datum of the right type. Parentheses immediately surrounding an array (and possibly also C<my>/C<state>/C<our>/C<local>) will make each element of the array an alias to the corresponding scalar referenced on the right-hand side: \(@a) = \(@b); # @a and @b now have the same elements \my(@a) = \(@b); # likewise \(my @a) = \(@b); # likewise push @a, 3; # but now @a has an extra element that @b lacks \(@a) = (\$a, \$b, \$c); # @a now contains $a, $b, and $c Combining that form with C<local> and putting parentheses immediately around a hash are forbidden (because it is not clear what they should do): \local(@array) = foo(); # WRONG \(%hash) = bar(); # WRONG Assignment to references and non-references may be combined in lists and conditional ternary expressions, as long as the values on the right-hand side are the right type for each element on the left, though this may make for obfuscated code: (my $tom, \my $dick, \my @harry) = (\1, \2, [1..3]); # $tom is now \1 # $dick is now 2 (read-only) # @harry is (1,2,3) my $type = ref $thingy; ($type ? $type eq 'ARRAY' ? \@foo : \$bar : $baz) = $thingy; The C<foreach> loop can also take a reference constructor for its loop variable, though the syntax is limited to one of the following, with an optional C<my>, C<state>, or C<our> after the backslash: \$s \@a \%h \&c No parentheses are permitted. 
This feature is particularly useful for arrays-of-arrays, or
arrays-of-hashes:

    foreach \my @a (@array_of_arrays) {
        frobnicate($a[0], $a[-1]);
    }

    foreach \my %h (@array_of_hashes) {
        $h{gelastic}++ if $h{type} eq 'funny';
    }

B<CAVEAT:> Aliasing does not work correctly with closures.  If you try to
alias lexical variables from an inner subroutine or C<eval>, the aliasing
will only be visible within that inner sub, and will not affect the outer
subroutine where the variables are declared.  This bizarre behavior is
subject to change.

=head1 Declaring a Reference to a Variable

Beginning in v5.26.0, the referencing operator can come after C<my>,
C<state>, C<our>, or C<local>.  This syntax must be enabled with C<use
feature 'declared_refs'>.  It is experimental, and will warn by default
unless C<no warnings 'experimental::declared_refs'> is in effect.

This feature makes these:

    my \$x;
    our \$y;

equivalent to:

    \my $x;
    \our $y;

It is intended mainly for use in assignments to references (see
L</Assigning to References>, above).  It also allows the backslash to be
used on just some items in a list of declared variables:

    my ($foo, \@bar, \%baz); # equivalent to:  my $foo, \my(@bar, %baz);

=head1 SEE ALSO

Besides the obvious documents, source code can be instructive.
Some pathological examples of the use of references can be found
in the F<t/op/ref.t> regression test in the Perl source directory.

See also L<perldsc> and L<perllol> for how to use references to create
complex data structures, and L<perlootut> and L<perlobj>
for how to use them to create objects.
=head1 NAME

perlgpl - the GNU General Public License, version 1

=head1 SYNOPSIS

 You can refer to this document in Pod via "L<perlgpl>"
 Or you can see this document by entering "perldoc perlgpl"

=head1 DESCRIPTION

Perl is free software; you can redistribute it and/or modify
it under the terms of either:

    a) the GNU General Public License as published by the Free
       Software Foundation; either version 1, or (at your option)
       any later version, or

    b) the "Artistic License" which comes with this Kit.

This is the B<"GNU General Public License, version 1">.
It's here so that modules, programs, etc., that want to declare
this as their distribution license can link to it.

For the Perl Artistic License, see L<perlartistic>.

=cut

# Because the following document's language disallows "changing"
# it, we haven't gone thru and prettied it up with =item's or
# anything.  It's good enough the way it is.

=head1 GNU GENERAL PUBLIC LICENSE

                GNU GENERAL PUBLIC LICENSE
                 Version 1, February 1989

  Copyright (C) 1989 Free Software Foundation, Inc.
  51 Franklin St, Fifth Floor, Boston, MA  02110-1301  USA

  Everyone is permitted to copy and distribute verbatim copies
  of this license document, but changing it is not allowed.

                        Preamble

The license agreements of most software companies try to keep users
at the mercy of those companies.  By contrast, our General Public
License is intended to guarantee your freedom to share and change free
software--to make sure the software is free for all its users.  The
General Public License applies to the Free Software Foundation's
software and to any other program whose authors commit to using it.
You can use it for your programs, too.

When we speak of free software, we are referring to freedom, not
price.
Specifically, the General Public License is designed to make sure that you have the freedom to give away or sell copies of free software, that you receive source code or can get it if you want it, that you can change the software or use pieces of it in new free programs; and that you know you can do these things. To protect your rights, we need to make restrictions that forbid anyone to deny you these rights or to ask you to surrender the rights. These restrictions translate to certain responsibilities for you if you distribute copies of the software, or if you modify it. For example, if you distribute copies of a such a program, whether gratis or for a fee, you must give the recipients all the rights that you have. You must make sure that they, too, receive or can get the source code. And you must tell them their rights. We protect your rights with two steps: (1) copyright the software, and (2) offer you this license which gives you legal permission to copy, distribute and/or modify the software. Also, for each author's protection and ours, we want to make certain that everyone understands that there is no warranty for this free software. If the software is modified by someone else and passed on, we want its recipients to know that what they have is not the original, so that any problems introduced by others will not reflect on the original authors' reputations. The precise terms and conditions for copying, distribution and modification follow. GNU GENERAL PUBLIC LICENSE TERMS AND CONDITIONS FOR COPYING, DISTRIBUTION AND MODIFICATION 0. This License Agreement applies to any program or other work which contains a notice placed by the copyright holder saying it may be distributed under the terms of this General Public License. The "Program", below, refers to any such program or work, and a "work based on the Program" means either the Program or any work containing the Program or a portion of it, either verbatim or with modifications. 
Each licensee is addressed as "you". 1. You may copy and distribute verbatim copies of the Program's source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice and disclaimer of warranty; keep intact all the notices that refer to this General Public License and to the absence of any warranty; and give any other recipients of the Program a copy of this General Public License along with the Program. You may charge a fee for the physical act of transferring a copy. 2. You may modify your copy or copies of the Program or any portion of it, and copy and distribute such modifications under the terms of Paragraph 1 above, provided that you also do the following: a) cause the modified files to carry prominent notices stating that you changed the files and the date of any change; and b) cause the whole of any work that you distribute or publish, that in whole or in part contains the Program or any part thereof, either with or without modifications, to be licensed at no charge to all third parties under the terms of this General Public License (except that you may choose to grant warranty protection to some or all third parties, at your option). c) If the modified program normally reads commands interactively when run, you must cause it, when started running for such interactive use in the simplest and most usual way, to print or display an announcement including an appropriate copyright notice and a notice that there is no warranty (or else, saying that you provide a warranty) and that users may redistribute the program under these conditions, and telling the user how to view a copy of this General Public License. d) You may charge a fee for the physical act of transferring a copy, and you may at your option offer warranty protection in exchange for a fee. 
Mere aggregation of another independent work with the Program (or its derivative) on a volume of a storage or distribution medium does not bring the other work under the scope of these terms. 3. You may copy and distribute the Program (or a portion or derivative of it, under Paragraph 2) in object code or executable form under the terms of Paragraphs 1 and 2 above provided that you also do one of the following: a) accompany it with the complete corresponding machine-readable source code, which must be distributed under the terms of Paragraphs 1 and 2 above; or, b) accompany it with a written offer, valid for at least three years, to give any third party free (except for a nominal charge for the cost of distribution) a complete machine-readable copy of the corresponding source code, to be distributed under the terms of Paragraphs 1 and 2 above; or, c) accompany it with the information you received as to where the corresponding source code may be obtained. (This alternative is allowed only for noncommercial distribution and only if you received the program in object code or executable form alone.) Source code for a work means the preferred form of the work for making modifications to it. For an executable file, complete source code means all the source code for all modules it contains; but, as a special exception, it need not include source code for modules which are standard libraries that accompany the operating system on which the executable file runs, or for standard header files or definitions files that accompany that operating system. 4. You may not copy, modify, sublicense, distribute or transfer the Program except as expressly provided under this General Public License. Any attempt otherwise to copy, modify, sublicense, distribute or transfer the Program is void, and will automatically terminate your rights to use the Program under this License. 
However, parties who have received copies, or rights to use copies, from you under this General Public License will not have their licenses terminated so long as such parties remain in full compliance. 5. By copying, distributing or modifying the Program (or any work based on the Program) you indicate your acceptance of this license to do so, and all its terms and conditions. 6. Each time you redistribute the Program (or any work based on the Program), the recipient automatically receives a license from the original licensor to copy, distribute or modify the Program subject to these terms and conditions. You may not impose any further restrictions on the recipients' exercise of the rights granted herein. 7. The Free Software Foundation may publish revised and/or new versions of the General Public License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. Each version is given a distinguishing version number. If the Program specifies a version number of the license which applies to it and "any later version", you have the option of following the terms and conditions either of that version or of any later version published by the Free Software Foundation. If the Program does not specify a version number of the license, you may choose any version ever published by the Free Software Foundation. 8. If you wish to incorporate parts of the Program into other free programs whose distribution conditions are different, write to the author to ask for permission. For software which is copyrighted by the Free Software Foundation, write to the Free Software Foundation; we sometimes make exceptions for this. Our decision will be guided by the two goals of preserving the free status of all derivatives of our free software and of promoting the sharing and reuse of software generally. NO WARRANTY 9. 
BECAUSE THE PROGRAM IS LICENSED FREE OF CHARGE, THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION. 10. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MAY MODIFY AND/OR REDISTRIBUTE THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. END OF TERMS AND CONDITIONS Appendix: How to Apply These Terms to Your New Programs If you develop a new program, and you want it to be of the greatest possible use to humanity, the best way to achieve this is to make it free software which everyone can redistribute and change under these terms. To do so, attach the following notices to the program. It is safest to attach them to the start of each source file to most effectively convey the exclusion of warranty; and each file should have at least the "copyright" line and a pointer to where the full notice is found. 
<one line to give the program's name and a brief idea of what it does.> Copyright (C) 19yy <name of author> This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 1, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston MA 02110-1301 USA Also add information on how to contact you by electronic and paper mail. If the program is interactive, make it output a short notice like this when it starts in an interactive mode: Gnomovision version 69, Copyright (C) 19xx name of author Gnomovision comes with ABSOLUTELY NO WARRANTY; for details type 'show w'. This is free software, and you are welcome to redistribute it under certain conditions; type 'show c' for details. The hypothetical commands 'show w' and 'show c' should show the appropriate parts of the General Public License. Of course, the commands you use may be called something other than 'show w' and 'show c'; they could even be mouse-clicks or menu items--whatever suits your program. You should also get your employer (if you work as a programmer) or your school, if any, to sign a "copyright disclaimer" for the program, if necessary. Here a sample; alter the names: Yoyodyne, Inc., hereby disclaims all copyright interest in the program 'Gnomovision' (a program to direct compilers to make passes at assemblers) written by James Hacker. <signature of Ty Coon>, 1 April 1989 Ty Coon, President of Vice That's all there is to it! 
=cut

=head1 NAME
X<subroutine> X<function>

perlsub - Perl subroutines

=head1 SYNOPSIS

To declare subroutines:
X<subroutine, declaration> X<sub>

    sub NAME;                       # A "forward" declaration.
    sub NAME(PROTO);                #  ditto, but with prototypes
    sub NAME : ATTRS;               #  with attributes
    sub NAME(PROTO) : ATTRS;        #  with attributes and prototypes

    sub NAME BLOCK                  # A declaration and a definition.
    sub NAME(PROTO) BLOCK           #  ditto, but with prototypes
    sub NAME : ATTRS BLOCK          #  with attributes
    sub NAME(PROTO) : ATTRS BLOCK   #  with prototypes and attributes

    use feature 'signatures';
    sub NAME(SIG) BLOCK                     # with signature
    sub NAME :ATTRS (SIG) BLOCK             # with signature, attributes
    sub NAME :prototype(PROTO) (SIG) BLOCK  # with signature, prototype

To define an anonymous subroutine at runtime:
X<subroutine, anonymous>

    $subref = sub BLOCK;                    # no proto
    $subref = sub (PROTO) BLOCK;            # with proto
    $subref = sub : ATTRS BLOCK;            # with attributes
    $subref = sub (PROTO) : ATTRS BLOCK;    # with proto and attributes

    use feature 'signatures';
    $subref = sub (SIG) BLOCK;              # with signature
    $subref = sub : ATTRS(SIG) BLOCK;       # with signature, attributes

To import subroutines:
X<import>

    use MODULE qw(NAME1 NAME2 NAME3);

To call subroutines:
X<subroutine, call> X<call>

    NAME(LIST);     # & is optional with parentheses.
    NAME LIST;      # Parentheses optional if predeclared/imported.
    &NAME(LIST);    # Circumvent prototypes.
    &NAME;          # Makes current @_ visible to called subroutine.

=head1 DESCRIPTION

Like many languages, Perl provides for user-defined subroutines.
These may be located anywhere in the main program, loaded in from
other files via the C<do>, C<require>, or C<use> keywords, or
generated on the fly using C<eval> or anonymous subroutines.
You can even call a function indirectly using a variable containing
its name or a CODE reference.
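The two kinds of indirect call mentioned above can be sketched as follows. The C<greet> subroutine and the variable names are hypothetical, chosen only for this illustration. Calling through a name stored in a variable is a symbolic reference, so C<strict 'refs'> has to be relaxed for that one statement.

```perl
use strict;
use warnings;

sub greet { my ($who) = @_; return "hello, $who" }

# Indirect call through a CODE reference.
my $code    = \&greet;
my $via_ref = $code->('world');

# Indirect call through a variable holding the subroutine's name.
# This is a symbolic reference, so strict 'refs' is relaxed locally.
my $name     = 'greet';
my $via_name = do {
    no strict 'refs';
    &$name('world');
};

print "$via_ref\n$via_name\n";
```

Both calls return the same string. The CODE reference form is usually preferable, since it keeps working under C<use strict>.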
The Perl model for function call and return values is simple: all functions are passed as parameters one single flat list of scalars, and all functions likewise return to their caller one single flat list of scalars. Any arrays or hashes in these call and return lists will collapse, losing their identities--but you may always use pass-by-reference instead to avoid this. Both call and return lists may contain as many or as few scalar elements as you'd like. (Often a function without an explicit return statement is called a subroutine, but there's really no difference from Perl's perspective.) X<subroutine, parameter> X<parameter> Any arguments passed in show up in the array C<@_>. (They may also show up in lexical variables introduced by a signature; see L</Signatures> below.) Therefore, if you called a function with two arguments, those would be stored in C<$_[0]> and C<$_[1]>. The array C<@_> is a local array, but its elements are aliases for the actual scalar parameters. In particular, if an element C<$_[0]> is updated, the corresponding argument is updated (or an error occurs if it is not updatable). If an argument is an array or hash element which did not exist when the function was called, that element is created only when (and if) it is modified or a reference to it is taken. (Some earlier versions of Perl created the element whether or not the element was assigned to.) Assigning to the whole array C<@_> removes that aliasing, and does not update any arguments. X<subroutine, argument> X<argument> X<@_> A C<return> statement may be used to exit a subroutine, optionally specifying the returned value, which will be evaluated in the appropriate context (list, scalar, or void) depending on the context of the subroutine call. If you specify no return value, the subroutine returns an empty list in list context, the undefined value in scalar context, or nothing in void context. 
If you return one or more aggregates (arrays and hashes), these will be flattened together into one large indistinguishable list. If no C<return> is found and if the last statement is an expression, its value is returned. If the last statement is a loop control structure like a C<foreach> or a C<while>, the returned value is unspecified. The empty sub returns the empty list. X<subroutine, return value> X<return value> X<return> Aside from an experimental facility (see L</Signatures> below), Perl does not have named formal parameters. In practice all you do is assign to a C<my()> list of these. Variables that aren't declared to be private are global variables. For gory details on creating private variables, see L</"Private Variables via my()"> and L</"Temporary Values via local()">. To create protected environments for a set of functions in a separate package (and probably a separate file), see L<perlmod/"Packages">. X<formal parameter> X<parameter, formal> Example: sub max { my $max = shift(@_); foreach $foo (@_) { $max = $foo if $max < $foo; } return $max; } $bestday = max($mon,$tue,$wed,$thu,$fri); Example: # get a line, combining continuation lines # that start with whitespace sub get_line { $thisline = $lookahead; # global variables! LINE: while (defined($lookahead = <STDIN>)) { if ($lookahead =~ /^[ \t]/) { $thisline .= $lookahead; } else { last LINE; } } return $thisline; } $lookahead = <STDIN>; # get first line while (defined($line = get_line())) { ... } Assigning to a list of private variables to name your arguments: sub maybeset { my($key, $value) = @_; $Foo{$key} = $value unless $Foo{$key}; } Because the assignment copies the values, this also has the effect of turning call-by-reference into call-by-value. Otherwise a function is free to do in-place modifications of C<@_> and change its caller's values. 
X<call-by-reference> X<call-by-value> upcase_in($v1, $v2); # this changes $v1 and $v2 sub upcase_in { for (@_) { tr/a-z/A-Z/ } } You aren't allowed to modify constants in this way, of course. If an argument were actually literal and you tried to change it, you'd take a (presumably fatal) exception. For example, this won't work: X<call-by-reference> X<call-by-value> upcase_in("frederick"); It would be much safer if the C<upcase_in()> function were written to return a copy of its parameters instead of changing them in place: ($v3, $v4) = upcase($v1, $v2); # this doesn't change $v1 and $v2 sub upcase { return unless defined wantarray; # void context, do nothing my @parms = @_; for (@parms) { tr/a-z/A-Z/ } return wantarray ? @parms : $parms[0]; } Notice how this (unprototyped) function doesn't care whether it was passed real scalars or arrays. Perl sees all arguments as one big, long, flat parameter list in C<@_>. This is one area where Perl's simple argument-passing style shines. The C<upcase()> function would work perfectly well without changing the C<upcase()> definition even if we fed it things like this: @newlist = upcase(@list1, @list2); @newlist = upcase( split /:/, $var ); Do not, however, be tempted to do this: (@a, @b) = upcase(@list1, @list2); Like the flattened incoming parameter list, the return list is also flattened on return. So all you have managed to do here is stored everything in C<@a> and made C<@b> empty. See L</Pass by Reference> for alternatives. A subroutine may be called using an explicit C<&> prefix. The C<&> is optional in modern Perl, as are parentheses if the subroutine has been predeclared. The C<&> is I<not> optional when just naming the subroutine, such as when it's used as an argument to defined() or undef(). Nor is it optional when you want to do an indirect subroutine call with a subroutine name or reference using the C<&$subref()> or C<&{$subref}()> constructs, although the C<< $subref->() >> notation solves that problem. 
See L<perlref> for more about all that. X<&> Subroutines may be called recursively. If a subroutine is called using the C<&> form, the argument list is optional, and if omitted, no C<@_> array is set up for the subroutine: the C<@_> array at the time of the call is visible to the subroutine instead. This is an efficiency mechanism that new users may wish to avoid. X<recursion> &foo(1,2,3); # pass three arguments foo(1,2,3); # the same foo(); # pass a null list &foo(); # the same &foo; # foo() gets current args, like foo(@_) !! use strict 'subs'; foo; # like foo() iff sub foo predeclared, else # a compile-time error no strict 'subs'; foo; # like foo() iff sub foo predeclared, else # a literal string "foo" Not only does the C<&> form make the argument list optional, it also disables any prototype checking on arguments you do provide. This is partly for historical reasons, and partly for having a convenient way to cheat if you know what you're doing. See L</Prototypes> below. X<&> Since Perl 5.16.0, the C<__SUB__> token is available under C<use feature 'current_sub'> or C<use 5.16.0>. It will evaluate to a reference to the currently-running sub, which allows for recursive calls without knowing your subroutine's name. use 5.16.0; my $factorial = sub { my ($x) = @_; return 1 if $x == 1; return($x * __SUB__->( $x - 1 ) ); }; The behavior of C<__SUB__> within a regex code block (such as C</(?{...})/>) is subject to change. Subroutines whose names are in all upper case are reserved to the Perl core, as are modules whose names are in all lower case. A subroutine in all capitals is a loosely-held convention meaning it will be called indirectly by the run-time system itself, usually due to a triggered event. Subroutines whose names start with a left parenthesis are also reserved in the same way. The following is a list of some subroutines that currently do special, pre-defined things. 
=over =item documented later in this document C<AUTOLOAD> =item documented in L<perlmod> C<CLONE>, C<CLONE_SKIP> =item documented in L<perlobj> C<DESTROY>, C<DOES> =item documented in L<perltie> C<BINMODE>, C<CLEAR>, C<CLOSE>, C<DELETE>, C<DESTROY>, C<EOF>, C<EXISTS>, C<EXTEND>, C<FETCH>, C<FETCHSIZE>, C<FILENO>, C<FIRSTKEY>, C<GETC>, C<NEXTKEY>, C<OPEN>, C<POP>, C<PRINT>, C<PRINTF>, C<PUSH>, C<READ>, C<READLINE>, C<SCALAR>, C<SEEK>, C<SHIFT>, C<SPLICE>, C<STORE>, C<STORESIZE>, C<TELL>, C<TIEARRAY>, C<TIEHANDLE>, C<TIEHASH>, C<TIESCALAR>, C<UNSHIFT>, C<UNTIE>, C<WRITE> =item documented in L<PerlIO::via> C<BINMODE>, C<CLEARERR>, C<CLOSE>, C<EOF>, C<ERROR>, C<FDOPEN>, C<FILENO>, C<FILL>, C<FLUSH>, C<OPEN>, C<POPPED>, C<PUSHED>, C<READ>, C<SEEK>, C<SETLINEBUF>, C<SYSOPEN>, C<TELL>, C<UNREAD>, C<UTF8>, C<WRITE> =item documented in L<perlfunc> L<< C<import> | perlfunc/use >>, L<< C<unimport> | perlfunc/use >>, L<< C<INC> | perlfunc/require >> =item documented in L<UNIVERSAL> C<VERSION> =item documented in L<perldebguts> C<DB::DB>, C<DB::sub>, C<DB::lsub>, C<DB::goto>, C<DB::postponed> =item undocumented, used internally by the L<overload> feature any starting with C<(> =back The C<BEGIN>, C<UNITCHECK>, C<CHECK>, C<INIT> and C<END> subroutines are not so much subroutines as named special code blocks, of which you can have more than one in a package, and which you can B<not> call explicitly. See L<perlmod/"BEGIN, UNITCHECK, CHECK, INIT and END"> =head2 Signatures B<WARNING>: Subroutine signatures are experimental. The feature may be modified or removed in future versions of Perl. Perl has an experimental facility to allow a subroutine's formal parameters to be introduced by special syntax, separate from the procedural code of the subroutine body. The formal parameter list is known as a I<signature>. 
The facility must be enabled first by a pragmatic declaration, C<use feature 'signatures'>, and it will produce a warning unless the "experimental::signatures" warnings category is disabled. The signature is part of a subroutine's body. Normally the body of a subroutine is simply a braced block of code, but when using a signature, the signature is a parenthesised list that goes immediately before the block, after any name or attributes. For example, sub foo :lvalue ($a, $b = 1, @c) { .... } The signature declares lexical variables that are in scope for the block. When the subroutine is called, the signature takes control first. It populates the signature variables from the list of arguments that were passed. If the argument list doesn't meet the requirements of the signature, then it will throw an exception. When the signature processing is complete, control passes to the block. Positional parameters are handled by simply naming scalar variables in the signature. For example, sub foo ($left, $right) { return $left + $right; } takes two positional parameters, which must be filled at runtime by two arguments. By default the parameters are mandatory, and it is not permitted to pass more arguments than expected. So the above is equivalent to sub foo { die "Too many arguments for subroutine" unless @_ <= 2; die "Too few arguments for subroutine" unless @_ >= 2; my $left = $_[0]; my $right = $_[1]; return $left + $right; } An argument can be ignored by omitting the main part of the name from a parameter declaration, leaving just a bare C<$> sigil. For example, sub foo ($first, $, $third) { return "first=$first, third=$third"; } Although the ignored argument doesn't go into a variable, it is still mandatory for the caller to pass it. A positional parameter is made optional by giving a default value, separated from the parameter name by C<=>: sub foo ($left, $right = 0) { return $left + $right; } The above subroutine may be called with either one or two arguments. 
The default value expression is evaluated when the subroutine is called, so it may provide different default values for different calls. It is only evaluated if the argument was actually omitted from the call. For example, my $auto_id = 0; sub foo ($thing, $id = $auto_id++) { print "$thing has ID $id"; } automatically assigns distinct sequential IDs to things for which no ID was supplied by the caller. A default value expression may also refer to parameters earlier in the signature, making the default for one parameter vary according to the earlier parameters. For example, sub foo ($first_name, $surname, $nickname = $first_name) { print "$first_name $surname is known as \"$nickname\""; } An optional parameter can be nameless just like a mandatory parameter. For example, sub foo ($thing, $ = 1) { print $thing; } The parameter's default value will still be evaluated if the corresponding argument isn't supplied, even though the value won't be stored anywhere. This is in case evaluating it has important side effects. However, it will be evaluated in void context, so if it doesn't have side effects and is not trivial it will generate a warning if the "void" warning category is enabled. If a nameless optional parameter's default value is not important, it may be omitted just as the parameter's name was: sub foo ($thing, $=) { print $thing; } Optional positional parameters must come after all mandatory positional parameters. (If there are no mandatory positional parameters then an optional positional parameter can be the first thing in the signature.) If there are multiple optional positional parameters and not enough arguments are supplied to fill them all, they will be filled from left to right. After positional parameters, additional arguments may be captured in a slurpy parameter. 
The simplest form of this is just an array variable: sub foo ($filter, @inputs) { print $filter->($_) foreach @inputs; } With a slurpy parameter in the signature, there is no upper limit on how many arguments may be passed. A slurpy array parameter may be nameless just like a positional parameter, in which case its only effect is to turn off the argument limit that would otherwise apply: sub foo ($thing, @) { print $thing; } A slurpy parameter may instead be a hash, in which case the arguments available to it are interpreted as alternating keys and values. There must be as many keys as values: if there is an odd argument then an exception will be thrown. Keys will be stringified, and if there are duplicates then the later instance takes precedence over the earlier, as with standard hash construction. sub foo ($filter, %inputs) { print $filter->($_, $inputs{$_}) foreach sort keys %inputs; } A slurpy hash parameter may be nameless just like other kinds of parameter. It still insists that the number of arguments available to it be even, even though they're not being put into a variable. sub foo ($thing, %) { print $thing; } A slurpy parameter, either array or hash, must be the last thing in the signature. It may follow mandatory and optional positional parameters; it may also be the only thing in the signature. Slurpy parameters cannot have default values: if no arguments are supplied for them then you get an empty array or empty hash. A signature may be entirely empty, in which case all it does is check that the caller passed no arguments: sub foo () { return 123; } When using a signature, the arguments are still available in the special array variable C<@_>, in addition to the lexical variables of the signature. There is a difference between the two ways of accessing the arguments: C<@_> I<aliases> the arguments, but the signature variables get I<copies> of the arguments. 
So writing to a signature variable only changes that variable, and has no effect on the caller's variables, but writing to an element of C<@_> modifies whatever the caller used to supply that argument. There is a potential syntactic ambiguity between signatures and prototypes (see L</Prototypes>), because both start with an opening parenthesis and both can appear in some of the same places, such as just after the name in a subroutine declaration. For historical reasons, when signatures are not enabled, any opening parenthesis in such a context will trigger very forgiving prototype parsing. Most signatures will be interpreted as prototypes in those circumstances, but won't be valid prototypes. (A valid prototype cannot contain any alphabetic character.) This will lead to somewhat confusing error messages. To avoid ambiguity, when signatures are enabled the special syntax for prototypes is disabled. There is no attempt to guess whether a parenthesised group was intended to be a prototype or a signature. To give a subroutine a prototype under these circumstances, use a L<prototype attribute|attributes/Built-in Attributes>. For example, sub foo :prototype($) { $_[0] } It is entirely possible for a subroutine to have both a prototype and a signature. They do different jobs: the prototype affects compilation of calls to the subroutine, and the signature puts argument values into lexical variables at runtime. You can therefore write sub foo :prototype($$) ($left, $right) { return $left + $right; } The prototype attribute, and any other attributes, must come before the signature. The signature always immediately precedes the block of the subroutine's body. 
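As a rough illustration of the signature rules described in this section, the sketch below combines a mandatory parameter, an optional parameter with a default, and a slurpy hash, and then shows that a signature variable is a copy of the caller's argument. The subroutine names C<describe> and C<shout> and the C<excited> option are hypothetical, chosen only for this example, and the code assumes a perl recent enough to support C<use feature 'signatures'>.

```perl
use strict;
use warnings;
use feature 'signatures';
no warnings 'experimental::signatures';

# Hypothetical helper: one mandatory parameter, one optional
# parameter with a default, and a slurpy hash of options.
sub describe ($name, $greeting = 'Hello', %opts) {
    my $text = "$greeting, $name";
    $text .= '!' if $opts{excited};
    return $text;
}

# A signature variable holds a copy of the argument, so changing
# it does not affect the caller's variable.
sub shout ($word) {
    $word = uc $word;
    return $word;
}

my $original = 'quiet';
my $loud     = shout($original);

print describe('World'), "\n";
print describe('World', 'Hi', excited => 1), "\n";
print "$original -> $loud\n";
```

Calling C<shout()> with no arguments, or C<describe('World', 'Hi', 'stray')> with an odd number of slurpy arguments, would be expected to throw an exception at run time, as the text above describes.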
=head2 Private Variables via my() X<my> X<variable, lexical> X<lexical> X<lexical variable> X<scope, lexical> X<lexical scope> X<attributes, my> Synopsis: my $foo; # declare $foo lexically local my (@wid, %get); # declare list of variables local my $foo = "flurp"; # declare $foo lexical, and init it my @oof = @bar; # declare @oof lexical, and init it my $x : Foo = $y; # similar, with an attribute applied B<WARNING>: The use of attribute lists on C<my> declarations is still evolving. The current semantics and interface are subject to change. See L<attributes> and L<Attribute::Handlers>. The C<my> operator declares the listed variables to be lexically confined to the enclosing block, conditional (C<if>/C<unless>/C<elsif>/C<else>), loop (C<for>/C<foreach>/C<while>/C<until>/C<continue>), subroutine, C<eval>, or C<do>/C<require>/C<use>'d file. If more than one value is listed, the list must be placed in parentheses. All listed elements must be legal lvalues. Only alphanumeric identifiers may be lexically scoped--magical built-ins like C<$/> must currently be C<local>ized with C<local> instead. Unlike dynamic variables created by the C<local> operator, lexical variables declared with C<my> are totally hidden from the outside world, including any called subroutines. This is true if it's the same subroutine called from itself or elsewhere--every call gets its own copy. X<local> This doesn't mean that a C<my> variable declared in a statically enclosing lexical scope would be invisible. Only dynamic scopes are cut off. For example, the C<bumpx()> function below has access to the lexical $x variable because both the C<my> and the C<sub> occurred at the same scope, presumably file scope. my $x = 10; sub bumpx { $x++ } An C<eval()>, however, can see lexical variables of the scope it is being evaluated in, so long as the names aren't hidden by declarations within the C<eval()> itself. See L<perlref>. 
X<eval, scope of> The parameter list to my() may be assigned to if desired, which allows you to initialize your variables. (If no initializer is given for a particular variable, it is created with the undefined value.) Commonly this is used to name input parameters to a subroutine. Examples: $arg = "fred"; # "global" variable $n = cube_root(27); print "$arg thinks the root is $n\n"; fred thinks the root is 3 sub cube_root { my $arg = shift; # name doesn't matter $arg **= 1/3; return $arg; } The C<my> is simply a modifier on something you might assign to. So when you do assign to variables in its argument list, C<my> doesn't change whether those variables are viewed as a scalar or an array. So my ($foo) = <STDIN>; # WRONG? my @FOO = <STDIN>; both supply a list context to the right-hand side, while my $foo = <STDIN>; supplies a scalar context. But the following declares only one variable: my $foo, $bar = 1; # WRONG That has the same effect as my $foo; $bar = 1; The declared variable is not introduced (is not visible) until after the current statement. Thus, my $x = $x; can be used to initialize a new $x with the value of the old $x, and the expression my $x = 123 and $x == 123 is false unless the old $x happened to have the value C<123>. Lexical scopes of control structures are not bounded precisely by the braces that delimit their controlled blocks; control expressions are part of that scope, too. Thus in the loop while (my $line = <>) { $line = lc $line; } continue { print $line; } the scope of $line extends from its declaration throughout the rest of the loop construct (including the C<continue> clause), but not beyond it. 
Similarly, in the conditional if ((my $answer = <STDIN>) =~ /^yes$/i) { user_agrees(); } elsif ($answer =~ /^no$/i) { user_disagrees(); } else { chomp $answer; die "'$answer' is neither 'yes' nor 'no'"; } the scope of $answer extends from its declaration through the rest of that conditional, including any C<elsif> and C<else> clauses, but not beyond it. See L<perlsyn/"Simple Statements"> for information on the scope of variables in statements with modifiers. The C<foreach> loop defaults to scoping its index variable dynamically in the manner of C<local>. However, if the index variable is prefixed with the keyword C<my>, or if there is already a lexical by that name in scope, then a new lexical is created instead. Thus in the loop X<foreach> X<for> for my $i (1, 2, 3) { some_function(); } the scope of $i extends to the end of the loop, but not beyond it, rendering the value of $i inaccessible within C<some_function()>. X<foreach> X<for> Some users may wish to encourage the use of lexically scoped variables. As an aid to catching implicit uses of package variables, which are always global, if you say use strict 'vars'; then any variable mentioned from there to the end of the enclosing block must either refer to a lexical variable, be predeclared via C<our> or C<use vars>, or else must be fully qualified with the package name. A compilation error results otherwise. An inner block may countermand this with C<no strict 'vars'>. A C<my> has both a compile-time and a run-time effect. At compile time, the compiler takes notice of it. The principal usefulness of this is to quiet C<use strict 'vars'>, but it is also essential for generation of closures as detailed in L<perlref>. Actual initialization is delayed until run time, though, so it gets executed at the appropriate time, such as each time through a loop, for example. Variables declared with C<my> are not part of any package and are therefore never fully qualified with the package name. 
In particular, you're not allowed to try to make a package variable (or other global) lexical: my $pack::var; # ERROR! Illegal syntax In fact, dynamic variables (also known as package or global variables) are still accessible using the fully qualified C<::> notation even while a lexical of the same name is also visible: package main; local $x = 10; my $x = 20; print "$x and $::x\n"; That will print out C<20> and C<10>. You may declare C<my> variables at the outermost scope of a file to hide any such identifiers from the world outside that file. This is similar in spirit to C's static variables when they are used at the file level. To do this with a subroutine requires the use of a closure (an anonymous function that accesses enclosing lexicals). If you want to create a private subroutine that cannot be called from outside that block, you can declare a lexical variable containing an anonymous sub reference: my $secret_version = '1.001-beta'; my $secret_sub = sub { print $secret_version }; &$secret_sub(); As long as the reference is never returned by any function within the module, no outside module can see the subroutine, because its name is not in any package's symbol table. Remember that it's not I<REALLY> called C<$some_pack::secret_version> or anything; it's just $secret_version, unqualified and unqualifiable. This does not work with object methods, however; all object methods have to be in the symbol table of some package to be found. See L<perlref/"Function Templates"> for something of a work-around to this. =head2 Persistent Private Variables X<state> X<state variable> X<static> X<variable, persistent> X<variable, static> X<closure> There are two ways to build persistent private variables in Perl 5.10. First, you can simply use the C<state> feature. Or, you can use closures, if you want to stay compatible with releases older than 5.10. 
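The two approaches just named can be put side by side in a small sketch. The subroutine names C<next_ticket> and C<next_token> are hypothetical; the first keeps its counter in a C<state> variable, the second closes over a C<my> variable declared in an enclosing block. Both counters should survive from one call to the next while staying invisible to other code.

```perl
use strict;
use warnings;
use feature 'state';

# Approach 1: a state variable (requires Perl 5.10 or later).
sub next_ticket {
    state $n = 0;      # the initialization runs only on the first call
    return ++$n;
}

# Approach 2: a closure over a lexical in an enclosing block.
# This also works on perls older than 5.10 (without the
# "use feature" line above).
{
    my $n = 0;
    sub next_token { return ++$n }
}

print join(' ', next_ticket(), next_ticket()), "\n";
print join(' ', next_token(),  next_token()),  "\n";
```

Note that in the closure form the enclosing block has to run before the first call for the C<my> variable to receive its initial value, which it does here because the block appears above the calls.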
=head3 Persistent variables via state() Beginning with Perl 5.10.0, you can declare variables with the C<state> keyword in place of C<my>. For that to work, though, you must have enabled that feature beforehand, either by using the C<feature> pragma, or by using C<-E> on one-liners (see L<feature>). Beginning with Perl 5.16, the C<CORE::state> form does not require the C<feature> pragma. The C<state> keyword creates a lexical variable (following the same scoping rules as C<my>) that persists from one subroutine call to the next. If a state variable resides inside an anonymous subroutine, then each copy of the subroutine has its own copy of the state variable. However, the value of the state variable will still persist between calls to the same copy of the anonymous subroutine. (Don't forget that C<sub { ... }> creates a new subroutine each time it is executed.) For example, the following code maintains a private counter, incremented each time the gimme_another() function is called: use feature 'state'; sub gimme_another { state $x; return ++$x } And this example uses anonymous subroutines to create separate counters: use feature 'state'; sub create_counter { return sub { state $x; return ++$x } } Also, since C<$x> is lexical, it can't be reached or modified by any Perl code outside. When combined with variable declaration, simple assignment to C<state> variables (as in C<state $x = 42>) is executed only the first time. When such statements are evaluated subsequent times, the assignment is ignored. The behavior of assignment to C<state> declarations where the left hand side of the assignment involves any parentheses is currently undefined. =head3 Persistent variables with closures Just because a lexical variable is lexically (also called statically) scoped to its enclosing block, C<eval>, or C<do> FILE, this doesn't mean that within a function it works like a C static. It normally works more like a C auto, but with implicit garbage collection. 
Unlike local variables in C or C++, Perl's lexical variables don't necessarily get recycled just because their scope has exited. If something more permanent is still aware of the lexical, it will stick around. So long as something else references a lexical, that lexical won't be freed--which is as it should be. You wouldn't want memory being freed until you were done using it, or kept around once you were done. Automatic garbage collection takes care of this for you. This means that you can pass back or save away references to lexical variables, whereas to return a pointer to a C auto is a grave error. It also gives us a way to simulate C's function statics. Here's a mechanism for giving a function private variables with both lexical scoping and a static lifetime. If you do want to create something like C's static variables, just enclose the whole function in an extra block, and put the static variable outside the function but in the block. { my $secret_val = 0; sub gimme_another { return ++$secret_val; } } # $secret_val now becomes unreachable by the outside # world, but retains its value between calls to gimme_another If this function is being sourced in from a separate file via C<require> or C<use>, then this is probably just fine. If it's all in the main program, you'll need to arrange for the C<my> to be executed early, either by putting the whole block above your main program, or, more likely, by merely placing a C<BEGIN> code block around it to make sure it gets executed before your program starts to run: BEGIN { my $secret_val = 0; sub gimme_another { return ++$secret_val; } } See L<perlmod/"BEGIN, UNITCHECK, CHECK, INIT and END"> about the special triggered code blocks, C<BEGIN>, C<UNITCHECK>, C<CHECK>, C<INIT> and C<END>. If declared at the outermost scope (the file scope), then lexicals work somewhat like C's file statics. They are available to all functions in that same file declared below them, but are inaccessible from outside that file. 
This strategy is sometimes used in modules to create private variables that the whole module can see. =head2 Temporary Values via local() X<local> X<scope, dynamic> X<dynamic scope> X<variable, local> X<variable, temporary> B<WARNING>: In general, you should be using C<my> instead of C<local>, because it's faster and safer. Exceptions to this include the global punctuation variables, global filehandles and formats, and direct manipulation of the Perl symbol table itself. C<local> is mostly used when the current value of a variable must be visible to called subroutines. Synopsis: # localization of values local $foo; # make $foo dynamically local local (@wid, %get); # make list of variables local local $foo = "flurp"; # make $foo dynamic, and init it local @oof = @bar; # make @oof dynamic, and init it local $hash{key} = "val"; # sets a local value for this hash entry delete local $hash{key}; # delete this entry for the current block local ($cond ? $v1 : $v2); # several types of lvalues support # localization # localization of symbols local *FH; # localize $FH, @FH, %FH, &FH ... local *merlyn = *randal; # now $merlyn is really $randal, plus # @merlyn is really @randal, etc local *merlyn = 'randal'; # SAME THING: promote 'randal' to *randal local *merlyn = \$randal; # just alias $merlyn, not @merlyn etc A C<local> modifies its listed variables to be "local" to the enclosing block, C<eval>, or C<do FILE>--and to I<any subroutine called from within that block>. A C<local> just gives temporary values to global (meaning package) variables. It does I<not> create a local variable. This is known as dynamic scoping. Lexical scoping is done with C<my>, which works more like C's auto declarations. Some types of lvalues can be localized as well: hash and array elements and slices, conditionals (provided that their result is always localizable), and symbolic references. As for simple variables, this creates new, dynamically scoped values. 
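A minimal sketch of the dynamic scoping described above: a C<local> value is seen by any subroutine called from within the block, and the previous value comes back when the block exits. The variable C<$level> and both subroutine names are hypothetical.

```perl
use strict;
use warnings;

our $level = 'outer';    # a package variable (hypothetical name)

# Reads whatever value $level holds at the time of the call.
sub show_level { return $level }

sub with_temporary_level {
    local $level = 'inner';    # temporary value for this dynamic scope
    return show_level();       # the called sub sees 'inner'
}

print with_temporary_level(), "\n";    # prints "inner"
print show_level(), "\n";              # prints "outer": value restored
```

Had C<$level> been declared with C<my> inside C<with_temporary_level>, C<show_level()> would not have seen it, because lexical variables are hidden from called subroutines.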
If more than one variable or expression is given to C<local>, they must be placed in parentheses. This operator works by saving the current values of those variables in its argument list on a hidden stack and restoring them upon exiting the block, subroutine, or eval. This means that called subroutines can also reference the local variable, but not the global one. The argument list may be assigned to if desired, which allows you to initialize your local variables. (If no initializer is given for a particular variable, it is created with an undefined value.) Because C<local> is a run-time operator, it gets executed each time through a loop. Consequently, it's more efficient to localize your variables outside the loop. =head3 Grammatical note on local() X<local, context> A C<local> is simply a modifier on an lvalue expression. When you assign to a C<local>ized variable, the C<local> doesn't change whether its list is viewed as a scalar or an array. So local($foo) = <STDIN>; local @FOO = <STDIN>; both supply a list context to the right-hand side, while local $foo = <STDIN>; supplies a scalar context. =head3 Localization of special variables X<local, special variable> If you localize a special variable, you'll be giving a new value to it, but its magic won't go away. That means that all side-effects related to this magic still work with the localized value. This feature allows code like this to work: # Read the whole contents of FILE in $slurp { local $/ = undef; $slurp = <FILE>; } Note, however, that this restricts localization of some values; for example, the following statement dies, as of perl 5.10.0, with an error I<Modification of a read-only value attempted>, because the $1 variable is magical and read-only: local $1 = 2; One exception is the default scalar variable: starting with perl 5.14 C<local($_)> will always strip all magic from $_, to make it possible to safely reuse $_ in a subroutine. 
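The slurp idiom shown above can be wrapped in a subroutine so that the change to C<$/> is confined to one call. This is only a sketch: the name C<slurp_handle> is hypothetical, and an in-memory filehandle stands in for a real file so that the example is self-contained.

```perl
use strict;
use warnings;

# Hypothetical helper: read everything remaining on a filehandle.
sub slurp_handle {
    my ($fh) = @_;
    local $/;               # undef inside this call: no record separator
    return scalar <$fh>;    # one read returns the whole remaining input
}

my $text = "line one\nline two\n";
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
my $content = slurp_handle($fh);
close $fh;

print length($content), " characters read\n";
print "separator restored\n" if $/ eq "\n";
```

Because the C<local> is undone when C<slurp_handle> returns, later line-by-line reads elsewhere in the program keep their usual behavior.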
B<WARNING>: Localization of tied arrays and hashes does not currently work as described. This will be fixed in a future release of Perl; in the meantime, avoid code that relies on any particular behavior of localising tied arrays or hashes (localising individual elements is still okay). See L<perl58delta/"Localising Tied Arrays and Hashes Is Broken"> for more details. X<local, tie> =head3 Localization of globs X<local, glob> X<glob> The construct local *name; creates a whole new symbol table entry for the glob C<name> in the current package. That means that all variables in its glob slot ($name, @name, %name, &name, and the C<name> filehandle) are dynamically reset. This implies, among other things, that any magic eventually carried by those variables is locally lost. In other words, saying C<local */> will not have any effect on the internal value of the input record separator. =head3 Localization of elements of composite types X<local, composite type element> X<local, array element> X<local, hash element> It's also worth taking a moment to explain what happens when you C<local>ize a member of a composite type (i.e. an array or hash element). In this case, the element is C<local>ized I<by name>. This means that when the scope of the C<local()> ends, the saved value will be restored to the hash element whose key was named in the C<local()>, or the array element whose index was named in the C<local()>. If that element was deleted while the C<local()> was in effect (e.g. by a C<delete()> from a hash or a C<shift()> of an array), it will spring back into existence, possibly extending an array and filling in the skipped elements with C<undef>. For instance, if you say %hash = ( 'This' => 'is', 'a' => 'test' ); @ary = ( 0..5 ); { local($ary[5]) = 6; local($hash{'a'}) = 'drill'; while (my $e = pop(@ary)) { print "$e . . 
.\n"; last unless $e > 3; } if (@ary) { $hash{'only a'} = 'test'; delete $hash{'a'}; } } print join(' ', map { "$_ $hash{$_}" } sort keys %hash),".\n"; print "The array has ",scalar(@ary)," elements: ", join(', ', map { defined $_ ? $_ : 'undef' } @ary),"\n"; Perl will print 6 . . . 4 . . . 3 . . . This is a test only a test. The array has 6 elements: 0, 1, 2, undef, undef, 5 The behavior of local() on non-existent members of composite types is subject to change in the future. The behavior of local() on array elements specified using negative indexes is particularly surprising, and is very likely to change. =head3 Localized deletion of elements of composite types X<delete> X<local, composite type element> X<local, array element> X<local, hash element> You can use the C<delete local $array[$idx]> and C<delete local $hash{key}> constructs to delete a composite type entry for the current block and restore it when it ends. They return the array/hash value before the localization, which means that they are respectively equivalent to do { my $val = $array[$idx]; local $array[$idx]; delete $array[$idx]; $val } and do { my $val = $hash{key}; local $hash{key}; delete $hash{key}; $val } except that for those the C<local> is scoped to the C<do> block. Slices are also accepted. my %hash = ( a => [ 7, 8, 9 ], b => 1, ); { my $a = delete local $hash{a}; # $a is [ 7, 8, 9 ] # %hash is (b => 1) { my @nums = delete local @$a[0, 2]; # @nums is (7, 9) # $a is [ undef, 8 ] $a->[0] = 999; # will be erased when the scope ends } # $a is back to [ 7, 8, 9 ] } # %hash is back to its original state =head2 Lvalue subroutines X<lvalue> X<subroutine, lvalue> It is possible to return a modifiable value from a subroutine. To do this, you have to declare the subroutine to return an lvalue. 
my $val; sub canmod : lvalue { $val; # or: return $val; } sub nomod { $val; } canmod() = 5; # assigns to $val nomod() = 5; # ERROR The scalar/list context for the subroutine and for the right-hand side of assignment is determined as if the subroutine call were replaced by a scalar. For example, consider: data(2,3) = get_data(3,4); Both subroutines here are called in a scalar context, while in: (data(2,3)) = get_data(3,4); and in: (data(2),data(3)) = get_data(3,4); all the subroutines are called in a list context. Lvalue subroutines are convenient, but you have to keep in mind that, when used with objects, they may violate encapsulation. A normal mutator can check the supplied argument before setting the attribute it is protecting; an lvalue subroutine cannot. If you require any special processing when storing and retrieving the values, consider using the CPAN module Sentinel or something similar. =head2 Lexical Subroutines X<my sub> X<state sub> X<our sub> X<subroutine, lexical> Beginning with Perl 5.18, you can declare a private subroutine with C<my> or C<state>. As with state variables, the C<state> keyword is only available under C<use feature 'state'> or C<use 5.010> or higher. Prior to Perl 5.26, lexical subroutines were deemed experimental and were available only under the C<use feature 'lexical_subs'> pragma. They also produced a warning unless the "experimental::lexical_subs" warnings category was disabled. These subroutines are only visible within the block in which they are declared, and only after that declaration: # Include these two lines if your code is intended to run under Perl # versions earlier than 5.26. no warnings "experimental::lexical_subs"; use feature 'lexical_subs'; foo(); # calls the package/global subroutine state sub foo { foo(); # also calls the package subroutine } foo(); # calls "state" sub my $ref = \&foo; # take a reference to "state" sub my sub bar { ... 
} bar(); # calls "my" sub You can't (directly) write a recursive lexical subroutine: # WRONG my sub baz { baz(); } This example fails because C<baz()> refers to the package/global subroutine C<baz>, not the lexical subroutine currently being defined. The solution is to use L<C<__SUB__>|perlfunc/__SUB__>: my sub baz { __SUB__->(); # calls itself } It is possible to predeclare a lexical subroutine. The C<sub foo {...}> subroutine definition syntax respects any previous C<my sub;> or C<state sub;> declaration. Using this to define recursive subroutines is a bad idea, however: my sub baz; # predeclaration sub baz { # define the "my" sub baz(); # WRONG: calls itself, but leaks memory } Just like C<< my $f; $f = sub { $f->() } >>, this example leaks memory. The name C<baz> is a reference to the subroutine, and the subroutine uses the name C<baz>; they keep each other alive (see L<perlref/Circular References>). =head3 C<state sub> vs C<my sub> What is the difference between "state" subs and "my" subs? Each time that execution enters a block when "my" subs are declared, a new copy of each sub is created. "State" subroutines persist from one execution of the containing block to the next. So, in general, "state" subroutines are faster. But "my" subs are necessary if you want to create closures: sub whatever { my $x = shift; my sub inner { ... do something with $x ... } inner(); } In this example, a new C<$x> is created when C<whatever> is called, and also a new C<inner>, which can see the new C<$x>. A "state" sub will only see the C<$x> from the first call to C<whatever>. =head3 C<our> subroutines Like C<our $variable>, C<our sub> creates a lexical alias to the package subroutine of the same name. The two main uses for this are to switch back to using the package sub inside an inner scope: sub foo { ... } sub bar { my sub foo { ... 
} { # need to use the outer foo here our sub foo; foo(); } } and to make a subroutine visible to other packages in the same scope: package MySneakyModule; our sub do_something { ... } sub do_something_with_caller { package DB; () = caller 1; # sets @DB::args do_something(@args); # uses MySneakyModule::do_something } =head2 Passing Symbol Table Entries (typeglobs) X<typeglob> X<*> B<WARNING>: The mechanism described in this section was originally the only way to simulate pass-by-reference in older versions of Perl. While it still works fine in modern versions, the new reference mechanism is generally easier to work with. See below. Sometimes you don't want to pass the value of an array to a subroutine but rather the name of it, so that the subroutine can modify the global copy of it rather than working with a local copy. In perl you can refer to all objects of a particular name by prefixing the name with a star: C<*foo>. This is often known as a "typeglob", because the star on the front can be thought of as a wildcard match for all the funny prefix characters on variables and subroutines and such. When evaluated, the typeglob produces a scalar value that represents all the objects of that name, including any filehandle, format, or subroutine. When assigned to, it causes the name mentioned to refer to whatever C<*> value was assigned to it. Example: sub doubleary { local(*someary) = @_; foreach $elem (@someary) { $elem *= 2; } } doubleary(*foo); doubleary(*bar); Scalars are already passed by reference, so you can modify scalar arguments without using this mechanism by referring explicitly to C<$_[0]> etc. You can modify all the elements of an array by passing all the elements as scalars, but you have to use the C<*> mechanism (or the equivalent reference mechanism) to C<push>, C<pop>, or change the size of an array. It will certainly be faster to pass the typeglob (or reference). 
Even if you don't want to modify an array, this mechanism is useful for passing multiple arrays in a single LIST, because normally the LIST mechanism will merge all the array values so that you can't extract out the individual arrays. For more on typeglobs, see L<perldata/"Typeglobs and Filehandles">.

=head2 When to Still Use local()
X<local> X<variable, local>

Despite the existence of C<my>, there are still three places where the C<local> operator shines. In fact, in these three places, you I<must> use C<local> instead of C<my>.

=over 4

=item 1.

You need to give a global variable a temporary value, especially $_.

The global variables, like C<@ARGV> or the punctuation variables, must be C<local>ized with C<local()>. This block reads in F</etc/motd>, and splits it up into chunks separated by lines of equal signs, which are placed in C<@Fields>.

    {
        local @ARGV = ("/etc/motd");
        local $/ = undef;
        local $_ = <>;
        @Fields = split /^\s*=+\s*$/m;
    }

In particular, it's important to C<local>ize $_ in any routine that assigns to it. Look out for implicit assignments in C<while> conditionals.

=item 2.

You need to create a local file or directory handle or a local function.

A function that needs a filehandle of its own must use C<local()> on a complete typeglob. This can be used to create new symbol table entries:

    sub ioqueue {
        local  (*READER, *WRITER);    # not my!
        pipe    (READER,  WRITER) or die "pipe: $!";
        return (*READER, *WRITER);
    }
    ($head, $tail) = ioqueue();

See the Symbol module for a way to create anonymous symbol table entries.

Because assignment of a reference to a typeglob creates an alias, this can be used to create what is effectively a local function, or at least, a local alias.

    {
        local *grow = \&shrink; # only until this block exits
        grow();                 # really calls shrink()
        move();                 # if move() grow()s, it shrink()s too
    }
    grow();                     # get the real grow() again

See L<perlref/"Function Templates"> for more about manipulating functions by name in this way.

=item 3.
You want to temporarily change just one element of an array or hash.

You can C<local>ize just one element of an aggregate. Usually this is done on dynamics:

    {
        local $SIG{INT} = 'IGNORE';
        funct();                    # uninterruptible
    }
    # interruptibility automatically restored here

But it also works on lexically declared aggregates.

=back

=head2 Pass by Reference
X<pass by reference> X<pass-by-reference> X<reference>

If you want to pass more than one array or hash into a function--or return them from it--and have them maintain their integrity, then you're going to have to use an explicit pass-by-reference. Before you do that, you need to understand references as detailed in L<perlref>. This section may not make much sense to you otherwise.

Here are a few simple examples. First, let's pass in several arrays to a function and have it C<pop> all of them, returning a new list of all their former last elements:

    @tailings = popmany ( \@a, \@b, \@c, \@d );

    sub popmany {
        my $aref;
        my @retlist;
        foreach $aref ( @_ ) {
            push @retlist, pop @$aref;
        }
        return @retlist;
    }

Here's how you might write a function that returns a list of keys occurring in all the hashes passed to it:

    @common = inter( \%foo, \%bar, \%joe );

    sub inter {
        my ($k, $href, %seen); # locals
        foreach $href (@_) {
            while ( $k = each %$href ) {
                $seen{$k}++;
            }
        }
        return grep { $seen{$_} == @_ } keys %seen;
    }

So far, we're using just the normal list return mechanism. What happens if you want to pass or return a hash? Well, if you're using only one of them, or you don't mind them concatenating, then the normal calling convention is ok, although a little expensive.

Where people get into trouble is here:

    (@a, @b) = func(@c, @d);

or

    (%a, %b) = func(%c, %d);

That syntax simply won't work. It sets just C<@a> or C<%a> and clears the C<@b> or C<%b>. Plus the function didn't receive two separate arrays or hashes: it got one long list in C<@_>, as always.
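The flattening described above is easy to observe directly. The following is a minimal sketch, not part of the original text: C<passthru> is a hypothetical helper that simply returns its arguments, and the variable names are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper for illustration: hands back whatever it was given.
sub passthru { return @_ }

my @c = (1, 2);
my @d = (3, 4, 5);

# Both arrays are flattened into one list in @_. On the way back, the
# first array on the left-hand side absorbs every returned element.
my (@first, @second);
(@first, @second) = passthru(@c, @d);
print scalar(@first), " ", scalar(@second), "\n";    # prints "5 0"

# Passing references instead keeps each array's identity intact.
my ($cref, $dref) = passthru(\@c, \@d);
print scalar(@$cref), " ", scalar(@$dref), "\n";     # prints "2 3"
```

The second call returns two scalars (the references), so the list assignment on the left can split them cleanly.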
If you can arrange for everyone to deal with this through references, it's cleaner code, although not so nice to look at. Here's a function that takes two array references as arguments, returning the two array references in order of how many elements the arrays hold:

    ($aref, $bref) = func(\@c, \@d);
    print "@$aref has more than @$bref\n";

    sub func {
        my ($cref, $dref) = @_;
        if (@$cref > @$dref) {
            return ($cref, $dref);
        } else {
            return ($dref, $cref);
        }
    }

It turns out that you can actually do this also:

    (*a, *b) = func(\@c, \@d);
    print "@a has more than @b\n";

    sub func {
        local (*c, *d) = @_;
        if (@c > @d) {
            return (\@c, \@d);
        } else {
            return (\@d, \@c);
        }
    }

Here we're using the typeglobs to do symbol table aliasing. It's a tad subtle, though, and also won't work if you're using C<my> variables, because only globals (even in disguise as C<local>s) are in the symbol table.

If you're passing around filehandles, you could usually just use the bare typeglob, like C<*STDOUT>, but typeglob references work, too. For example:

    splutter(\*STDOUT);
    sub splutter {
        my $fh = shift;
        print $fh "her um well a hmmm\n";
    }

    $rec = get_rec(\*STDIN);
    sub get_rec {
        my $fh = shift;
        return scalar <$fh>;
    }

If you're planning on generating new filehandles, you could do this. Note that it passes back just the bare *FH, not a reference to it.

    sub openit {
        my $path = shift;
        local *FH;
        return open (FH, $path) ? *FH : undef;
    }

=head2 Prototypes
X<prototype> X<subroutine, prototype>

Perl supports a very limited kind of compile-time argument checking using function prototyping. This can be declared in either the PROTO section or with a L<prototype attribute|attributes/Built-in Attributes>. If you declare either of

    sub mypush (\@@)
    sub mypush :prototype(\@@)

then C<mypush()> takes arguments exactly like C<push()> does.

If subroutine signatures are enabled (see L</Signatures>), then the shorter PROTO syntax is unavailable, because it would clash with signatures.
In that case, a prototype can only be declared in the form of an attribute. The function declaration must be visible at compile time. The prototype affects only interpretation of new-style calls to the function, where new-style is defined as not using the C<&> character. In other words, if you call it like a built-in function, then it behaves like a built-in function. If you call it like an old-fashioned subroutine, then it behaves like an old-fashioned subroutine. It naturally falls out from this rule that prototypes have no influence on subroutine references like C<\&foo> or on indirect subroutine calls like C<&{$subref}> or C<< $subref->() >>. Method calls are not influenced by prototypes either, because the function to be called is indeterminate at compile time, since the exact code called depends on inheritance. Because the intent of this feature is primarily to let you define subroutines that work like built-in functions, here are prototypes for some other functions that parse almost exactly like the corresponding built-in. Declared as Called as sub mylink ($$) mylink $old, $new sub myvec ($$$) myvec $var, $offset, 1 sub myindex ($$;$) myindex &getstring, "substr" sub mysyswrite ($$$;$) mysyswrite $buf, 0, length($buf) - $off, $off sub myreverse (@) myreverse $a, $b, $c sub myjoin ($@) myjoin ":", $a, $b, $c sub mypop (\@) mypop @array sub mysplice (\@$$@) mysplice @array, 0, 2, @pushme sub mykeys (\[%@]) mykeys %{$hashref} sub myopen (*;$) myopen HANDLE, $name sub mypipe (**) mypipe READHANDLE, WRITEHANDLE sub mygrep (&@) mygrep { /foo/ } $a, $b, $c sub myrand (;$) myrand 42 sub mytime () mytime Any backslashed prototype character represents an actual argument that must start with that character (optionally preceded by C<my>, C<our> or C<local>), with the exception of C<$>, which will accept any scalar lvalue expression, such as C<$foo = 7> or C<< my_function()->[0] >>. 
The value passed as part of C<@_> will be a reference to the actual argument given in the subroutine call, obtained by applying C<\> to that argument. You can use the C<\[]> backslash group notation to specify more than one allowed argument type. For example: sub myref (\[$@%&*]) will allow calling myref() as myref $var myref @array myref %hash myref &sub myref *glob and the first argument of myref() will be a reference to a scalar, an array, a hash, a code, or a glob. Unbackslashed prototype characters have special meanings. Any unbackslashed C<@> or C<%> eats all remaining arguments, and forces list context. An argument represented by C<$> forces scalar context. An C<&> requires an anonymous subroutine, which, if passed as the first argument, does not require the C<sub> keyword or a subsequent comma. A C<*> allows the subroutine to accept a bareword, constant, scalar expression, typeglob, or a reference to a typeglob in that slot. The value will be available to the subroutine either as a simple scalar, or (in the latter two cases) as a reference to the typeglob. If you wish to always convert such arguments to a typeglob reference, use Symbol::qualify_to_ref() as follows: use Symbol 'qualify_to_ref'; sub foo (*) { my $fh = qualify_to_ref(shift, caller); ... } The C<+> prototype is a special alternative to C<$> that will act like C<\[@%]> when given a literal array or hash variable, but will otherwise force scalar context on the argument. This is useful for functions which should accept either a literal array or an array reference as the argument: sub mypush (+@) { my $aref = shift; die "Not an array or arrayref" unless ref $aref eq 'ARRAY'; push @$aref, @_; } When using the C<+> prototype, your function must check that the argument is of an acceptable type. A semicolon (C<;>) separates mandatory arguments from optional arguments. It is redundant before C<@> or C<%>, which gobble up everything else. 
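To make the effect of C<$> and C<;> concrete, here is a small sketch. The function C<repeat_str> and its default count are invented for illustration only; the point is that the first slot forces scalar context and the second slot may be omitted.

```perl
use strict;
use warnings;

# Hypothetical function: one mandatory scalar, one optional scalar.
sub repeat_str ($;$) {
    my ($str, $count) = @_;
    $count = 2 unless defined $count;   # assumed default, for illustration
    return $str x $count;
}

my @words = ('a', 'b', 'c');

print repeat_str('ab'), "\n";       # prints "abab"
print repeat_str('ab', 3), "\n";    # prints "ababab"

# The "$" slot imposes scalar context, so the array contributes its
# element count (3) rather than its elements.
print repeat_str(@words), "\n";     # prints "33"
```

The last call shows why the prototype has to be visible before the call is compiled: without it, C<@words> would be flattened and the function would receive C<'a'> and C<'b'> as its two arguments.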
As the last character of a prototype, or just before a semicolon, a C<@> or a C<%>, you can use C<_> in place of C<$>: if this argument is not provided, C<$_> will be used instead. Note how the last three examples in the table above are treated specially by the parser. C<mygrep()> is parsed as a true list operator, C<myrand()> is parsed as a true unary operator with unary precedence the same as C<rand()>, and C<mytime()> is truly without arguments, just like C<time()>. That is, if you say mytime +2; you'll get C<mytime() + 2>, not C<mytime(2)>, which is how it would be parsed without a prototype. If you want to force a unary function to have the same precedence as a list operator, add C<;> to the end of the prototype: sub mygetprotobynumber($;); mygetprotobynumber $a > $b; # parsed as mygetprotobynumber($a > $b) The interesting thing about C<&> is that you can generate new syntax with it, provided it's in the initial position: X<&> sub try (&@) { my($try,$catch) = @_; eval { &$try }; if ($@) { local $_ = $@; &$catch; } } sub catch (&) { $_[0] } try { die "phooey"; } catch { /phooey/ and print "unphooey\n"; }; That prints C<"unphooey">. (Yes, there are still unresolved issues having to do with visibility of C<@_>. I'm ignoring that question for the moment. (But note that if we make C<@_> lexically scoped, those anonymous subroutines can act like closures... (Gee, is this sounding a little Lispish? (Never mind.)))) And here's a reimplementation of the Perl C<grep> operator: X<grep> sub mygrep (&@) { my $code = shift; my @result; foreach $_ (@_) { push(@result, $_) if &$code; } @result; } Some folks would prefer full alphanumeric prototypes. Alphanumerics have been intentionally left out of prototypes for the express purpose of someday in the future adding named, formal parameters. The current mechanism's main goal is to let module writers provide better diagnostics for module users. 
Larry feels the notation quite understandable to Perl programmers, and that it will not intrude greatly upon the meat of the module, nor make it harder to read. The line noise is visually encapsulated into a small pill that's easy to swallow. If you try to use an alphanumeric sequence in a prototype you will generate an optional warning - "Illegal character in prototype...". Unfortunately earlier versions of Perl allowed the prototype to be used as long as its prefix was a valid prototype. The warning may be upgraded to a fatal error in a future version of Perl once the majority of offending code is fixed. It's probably best to prototype new functions, not retrofit prototyping into older ones. That's because you must be especially careful about silent impositions of differing list versus scalar contexts. For example, if you decide that a function should take just one parameter, like this: sub func ($) { my $n = shift; print "you gave me $n\n"; } and someone has been calling it with an array or expression returning a list: func(@foo); func( $text =~ /\w+/g ); Then you've just supplied an automatic C<scalar> in front of their argument, which can be more than a bit surprising. The old C<@foo> which used to hold one thing doesn't get passed in. Instead, C<func()> now gets passed in a C<1>; that is, the number of elements in C<@foo>. And the C<m//g> gets called in scalar context so instead of a list of words it returns a boolean result and advances C<pos($text)>. Ouch! If a sub has both a PROTO and a BLOCK, the prototype is not applied until after the BLOCK is completely defined. This means that a recursive function with a prototype has to be predeclared for the prototype to take effect, like so: sub foo($$); sub foo($$) { foo 1, 2; } This is all very powerful, of course, and should be used only in moderation to make the world a better place. =head2 Constant Functions X<constant> Functions with a prototype of C<()> are potential candidates for inlining. 
If the result after optimization and constant folding is either a constant or a lexically-scoped scalar which has no other references, then it will be used in place of function calls made without C<&>. Calls made using C<&> are never inlined. (See F<constant.pm> for an easy way to declare most constants.) The following functions would all be inlined: sub pi () { 3.14159 } # Not exact, but close. sub PI () { 4 * atan2 1, 1 } # As good as it gets, # and it's inlined, too! sub ST_DEV () { 0 } sub ST_INO () { 1 } sub FLAG_FOO () { 1 << 8 } sub FLAG_BAR () { 1 << 9 } sub FLAG_MASK () { FLAG_FOO | FLAG_BAR } sub OPT_BAZ () { not (0x1B58 & FLAG_MASK) } sub N () { int(OPT_BAZ) / 3 } sub FOO_SET () { 1 if FLAG_MASK & FLAG_FOO } sub FOO_SET2 () { if (FLAG_MASK & FLAG_FOO) { 1 } } (Be aware that the last example was not always inlined in Perl 5.20 and earlier, which did not behave consistently with subroutines containing inner scopes.) You can countermand inlining by using an explicit C<return>: sub baz_val () { if (OPT_BAZ) { return 23; } else { return 42; } } sub bonk_val () { return 12345 } As alluded to earlier you can also declare inlined subs dynamically at BEGIN time if their body consists of a lexically-scoped scalar which has no other references. Only the first example here will be inlined: BEGIN { my $var = 1; no strict 'refs'; *INLINED = sub () { $var }; } BEGIN { my $var = 1; my $ref = \$var; no strict 'refs'; *NOT_INLINED = sub () { $var }; } A not so obvious caveat with this (see [RT #79908]) is that the variable will be immediately inlined, and will stop behaving like a normal lexical variable, e.g. this will print C<79907>, not C<79908>: BEGIN { my $x = 79907; *RT_79908 = sub () { $x }; $x++; } print RT_79908(); # prints 79907 As of Perl 5.22, this buggy behavior, while preserved for backward compatibility, is detected and emits a deprecation warning. 
If you want the subroutine to be inlined (with no warning), make sure the variable is not used in a context where it could be modified aside from where it is declared.

    # Fine, no warning
    BEGIN {
        my $x = 54321;
        *INLINED = sub () { $x };
    }
    # Warns. Future Perl versions will stop inlining it.
    BEGIN {
        my $x;
        $x = 54321;
        *ALSO_INLINED = sub () { $x };
    }

Perl 5.22 also introduces the experimental "const" attribute as an alternative. (Disable the "experimental::const_attr" warnings if you want to use it.) When applied to an anonymous subroutine, it forces the sub to be called when the C<sub> expression is evaluated. The return value is captured and turned into a constant subroutine:

    my $x = 54321;
    *INLINED = sub : const { $x };
    $x++;

The return value of C<INLINED> in this example will always be 54321, regardless of later modifications to $x. You can also put any arbitrary code inside the sub, as it will be executed immediately and its return value captured the same way.

If you really want a subroutine with a C<()> prototype that returns a lexical variable you can easily force it to not be inlined by adding an explicit C<return>:

    BEGIN {
        my $x = 79907;
        *RT_79908 = sub () { return $x };
        $x++;
    }
    print RT_79908(); # prints 79908

The easiest way to tell if a subroutine was inlined is by using L<B::Deparse>. Consider this example of two subroutines returning C<1>, one with a C<()> prototype causing it to be inlined, and one without (with deparse output truncated for clarity):

    $ perl -MO=Deparse -le 'sub ONE { 1 } if (ONE) { print ONE if ONE }'
    sub ONE {
        1;
    }
    if (ONE ) {
        print ONE() if ONE ;
    }

    $ perl -MO=Deparse -le 'sub ONE () { 1 } if (ONE) { print ONE if ONE }'
    sub ONE () { 1 }
    do {
        print 1
    };

If you redefine a subroutine that was eligible for inlining, you'll get a warning by default.
You can use this warning to tell whether or not a particular subroutine is considered inlinable, since it's different than the warning for overriding non-inlined subroutines: $ perl -e 'sub one () {1} sub one () {2}' Constant subroutine one redefined at -e line 1. $ perl -we 'sub one {1} sub one {2}' Subroutine one redefined at -e line 1. The warning is considered severe enough not to be affected by the B<-w> switch (or its absence) because previously compiled invocations of the function will still be using the old value of the function. If you need to be able to redefine the subroutine, you need to ensure that it isn't inlined, either by dropping the C<()> prototype (which changes calling semantics, so beware) or by thwarting the inlining mechanism in some other way, e.g. by adding an explicit C<return>, as mentioned above: sub not_inlined () { return 23 } =head2 Overriding Built-in Functions X<built-in> X<override> X<CORE> X<CORE::GLOBAL> Many built-in functions may be overridden, though this should be tried only occasionally and for good reason. Typically this might be done by a package attempting to emulate missing built-in functionality on a non-Unix system. Overriding may be done only by importing the name from a module at compile time--ordinary predeclaration isn't good enough. However, the C<use subs> pragma lets you, in effect, predeclare subs via the import syntax, and these names may then override built-in ones: use subs 'chdir', 'chroot', 'chmod', 'chown'; chdir $somewhere; sub chdir { ... } To unambiguously refer to the built-in form, precede the built-in name with the special package qualifier C<CORE::>. For example, saying C<CORE::open()> always refers to the built-in C<open()>, even if the current package has imported some other subroutine called C<&open()> from elsewhere. 
Even though it looks like a regular function call, it isn't: the CORE:: prefix in that case is part of Perl's syntax, and works for any keyword, regardless of what is in the CORE package. Taking a reference to it, that is, C<\&CORE::open>, only works for some keywords. See L<CORE>.

Library modules should not in general export built-in names like C<open> or C<chdir> as part of their default C<@EXPORT> list, because these may sneak into someone else's namespace and change the semantics unexpectedly. Instead, if the module adds that name to C<@EXPORT_OK>, then it's possible for a user to import the name explicitly, but not implicitly. That is, they could say

    use Module 'open';

and it would import the C<open> override. But if they said

    use Module;

they would get the default imports without overrides.

The foregoing mechanism for overriding built-ins is restricted, quite deliberately, to the package that requests the import. There is a second method that is sometimes applicable when you wish to override a built-in everywhere, without regard to namespace boundaries. This is achieved by importing a sub into the special namespace C<CORE::GLOBAL::>. Here is an example that quite brazenly replaces the C<glob> operator with something that understands regular expressions.

    package REGlob;
    require Exporter;
    @ISA = 'Exporter';
    @EXPORT_OK = 'glob';

    sub import {
        my $pkg = shift;
        return unless @_;
        my $sym = shift;
        my $where = ($sym =~ s/^GLOBAL_// ? 'CORE::GLOBAL' : caller(0));
        $pkg->export($where, $sym, @_);
    }

    sub glob {
        my $pat = shift;
        my @got;
        if (opendir my $d, '.') {
            @got = grep /$pat/, readdir $d;
            closedir $d;
        }
        return @got;
    }
    1;

And here's how it could be (ab)used:

    #use REGlob 'GLOBAL_glob';      # override glob() in ALL namespaces
    package Foo;
    use REGlob 'glob';              # override glob() in Foo:: only
    print for <^[a-z_]+\.pm\$>;     # show all pragmatic modules

The initial comment shows a contrived, even dangerous example.
By overriding C<glob> globally, you would be forcing the new (and subversive) behavior for the C<glob> operator for I<every> namespace, without the complete cognizance or cooperation of the modules that own those namespaces. Naturally, this should be done with extreme caution--if it must be done at all.

The C<REGlob> example above does not implement all the support needed to cleanly override perl's C<glob> operator. The built-in C<glob> has different behaviors depending on whether it appears in a scalar or list context, but our C<REGlob> doesn't. Indeed, many Perl built-ins have such context-sensitive behaviors, and these must be adequately supported by a properly written override. For a fully functional example of overriding C<glob>, study the implementation of C<File::DosGlob> in the standard library.

When you override a built-in, your replacement should be consistent (if possible) with the built-in native syntax. You can achieve this by using a suitable prototype. To get the prototype of an overridable built-in, use the C<prototype> function with an argument of C<"CORE::builtin_name"> (see L<perlfunc/prototype>).

Note however that some built-ins can't have their syntax expressed by a prototype (such as C<system> or C<chomp>). If you override them you won't be able to fully mimic their original syntax.

The built-ins C<do>, C<require> and C<glob> can also be overridden, but due to special magic, their original syntax is preserved, and you don't have to define a prototype for their replacements. (You can't override the C<do BLOCK> syntax, though).

C<require> has special additional dark magic: if you invoke your C<require> replacement as C<require Foo::Bar>, it will actually receive the argument C<"Foo/Bar.pm"> in @_. See L<perlfunc/require>.

And, as you'll have noticed from the previous example, if you override C<glob>, the C<< <*> >> glob operator is overridden as well.
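As a side note to the prototype advice above, the following sketch shows one way to inspect built-in prototypes before writing a replacement. The choice of built-ins in the list is arbitrary, and the exact prototype strings printed can differ between Perl versions, so none are asserted here beyond the documented fact that C<system> has no expressible prototype.

```perl
use strict;
use warnings;

# Look up the prototype of a few built-ins by their CORE:: names.
# prototype() returns undef when the syntax cannot be expressed.
my %proto;
for my $name (qw(push open system)) {
    $proto{$name} = prototype("CORE::$name");
    printf "%-6s %s\n", $name,
        defined $proto{$name} ? $proto{$name} : '(no prototype)';
}
```

A replacement for C<open> could then be declared with whatever string this reports, while a replacement for C<system> has to accept that it cannot fully mimic the built-in's calling syntax.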
In a similar fashion, overriding the C<readline> function also overrides the equivalent I/O operator C<< <FILEHANDLE> >>. Likewise, overriding C<readpipe> overrides the operators C<``> and C<qx//>.

Finally, some built-ins (e.g. C<exists> or C<grep>) can't be overridden.

=head2 Autoloading
X<autoloading> X<AUTOLOAD>

If you call a subroutine that is undefined, you would ordinarily get an immediate, fatal error complaining that the subroutine doesn't exist. (Likewise for subroutines being used as methods, when the method doesn't exist in any base class of the class's package.) However, if an C<AUTOLOAD> subroutine is defined in the package or packages used to locate the original subroutine, then that C<AUTOLOAD> subroutine is called with the arguments that would have been passed to the original subroutine. The fully qualified name of the original subroutine magically appears in the global $AUTOLOAD variable of the same package as the C<AUTOLOAD> routine. The name is not passed as an ordinary argument because, er, well, just because, that's why. (As an exception, a method call to a nonexistent C<import> or C<unimport> method is just skipped instead. Also, if the AUTOLOAD subroutine is an XSUB, there are other ways to retrieve the subroutine name. See L<perlguts/Autoloading with XSUBs> for details.)

Many C<AUTOLOAD> routines load in a definition for the requested subroutine using eval(), then execute that subroutine using a special form of goto() that erases the stack frame of the C<AUTOLOAD> routine without a trace. (See the source to the standard module documented in L<AutoLoader>, for example.) But an C<AUTOLOAD> routine can also just emulate the routine and never define it. For example, let's pretend that a function that wasn't defined should just invoke C<system> with those arguments.
All you'd do is:

    sub AUTOLOAD {
        our $AUTOLOAD;              # keep 'use strict' happy
        my $program = $AUTOLOAD;
        $program =~ s/.*:://;
        system($program, @_);
    }
    date();
    who();
    ls('-l');

In fact, if you predeclare functions you want to call that way, you don't even need parentheses:

    use subs qw(date who ls);
    date;
    who;
    ls '-l';

A more complete example of this is the Shell module on CPAN, which can treat undefined subroutine calls as calls to external programs.

Mechanisms are available to help module writers split their modules into autoloadable files. See the standard AutoLoader module described in L<AutoLoader> and in L<AutoSplit>, the standard SelfLoader module in L<SelfLoader>, and the document on adding C functions to Perl code in L<perlxs>.

=head2 Subroutine Attributes
X<attribute> X<subroutine, attribute> X<attrs>

A subroutine declaration or definition may have a list of attributes associated with it. If such an attribute list is present, it is broken up at space or colon boundaries and treated as though a C<use attributes> had been seen. See L<attributes> for details about what attributes are currently supported. Unlike the limitation with the obsolescent C<use attrs>, the C<sub : ATTRLIST> syntax works to associate the attributes with a pre-declaration, and not just with a subroutine definition.

The attributes must be valid as simple identifier names (without any punctuation other than the '_' character). They may have a parameter list appended, which is only checked for whether its parentheses ('(',')') nest properly.

Examples of valid syntax (even though the attributes are unknown):

    sub fnord (&\%) : switch(10,foo(7,3))  :  expensive;
    sub plugh () : Ugly('\(") :Bad;
    sub xyzzy : _5x5 { ...
    }

Examples of invalid syntax:

    sub fnord : switch(10,foo(); # ()-string not balanced
    sub snoid : Ugly('(');        # ()-string not balanced
    sub xyzzy : 5x5;              # "5x5" not a valid identifier
    sub plugh : Y2::north;        # "Y2::north" not a simple identifier
    sub snurt : foo + bar;        # "+" not a colon or space

The attribute list is passed as a list of constant strings to the code which associates them with the subroutine. In particular, the second example of valid syntax above currently looks like this in terms of how it's parsed and invoked:

    use attributes __PACKAGE__, \&plugh, q[Ugly('\(")], 'Bad';

For further details on attribute lists and their manipulation, see L<attributes> and L<Attribute::Handlers>.

=head1 SEE ALSO

See L<perlref/"Function Templates"> for more about references and closures.
See L<perlxs> if you'd like to learn about calling C subroutines from Perl.
See L<perlembed> if you'd like to learn about calling Perl subroutines from C.
See L<perlmod> to learn about bundling up your functions in separate files.
See L<perlmodlib> to learn what library modules come standard on your system.
See L<perlootut> to learn how to make object method calls.

If you read this file _as_is_, just ignore the funny characters you see. It is written in the POD format (see pod/perlpod.pod) which is specially designed to be readable as is.

=head1 NAME

perlsymbian - Perl version 5 on Symbian OS

=head1 DESCRIPTION

This document describes various features of the Symbian operating system that will affect how Perl version 5 (hereafter just Perl) is compiled and/or runs.

B<NOTE: this port (as of 0.4.1) does not compile into a Symbian OS GUI application, but instead it results in a Symbian DLL.> The DLL includes a C++ class called CPerlBase, which one can then (derive from and) use to embed Perl into applications, see F<symbian/README>.
The base port of Perl to Symbian only implements the basic POSIX-like functionality; it does not implement any further Symbian or Series 60, Series 80, or UIQ bindings for Perl.

It is also possible to generate Symbian executables for "miniperl" and "perl", but since there is no standard command line interface for Symbian (nor full keyboards in the devices), these are useful mainly as demonstrations.

=head2 Compiling Perl on Symbian

(0) You need to have the appropriate Symbian SDK installed.

These instructions have been tested under various Nokia Series 60 Symbian SDKs (1.2 to 2.6, 2.8 should also work, 1.2 compiles but does not work), Series 80 2.0, and Nokia 7710 (Series 90) SDK. You can get the SDKs from Forum Nokia (L<http://www.forum.nokia.com/>). A very rough port ("it compiles") to UIQ 2.1 has also been made.

A prerequisite for any of the SDKs is to install ActivePerl from ActiveState, L<http://www.activestate.com/Products/ActivePerl/>

Having the SDK installed also means that you need to have either the Metrowerks CodeWarrior installed (2.8 and 3.0 were used in testing) or the Microsoft Visual C++ 6.0 installed (SP3 minimum, SP5 recommended).

Note that for example the Series 60 2.0 VC SDK installation talks about ActivePerl build 518, which no longer exists (as of mid-2005) at the ActiveState website. The ActivePerl 5.8.4 build 810 was used successfully for compiling Perl on Symbian. The 5.6.x ActivePerls do not work.

Other SDKs or compilers like Visual.NET, command-line-only Visual.NET, Borland, GnuPoc, or sdk2unix have not been tried.

These instructions almost certainly won't work with older Symbian releases or other SDKs. Patches to get this port running in other releases, SDKs, compilers, platforms, or devices are naturally welcome.

(1) Get a Perl source code distribution (for example the file perl-5.9.2.tar.gz is fine) from L<http://www.cpan.org/src/> and unpack it in the C:/Symbian directory of your Windows system.
(2) Change to the perl source directory. cd c:\Symbian\perl-5.x.x (3) Run the following script using the perl coming with the SDK perl symbian\config.pl You must use the cmd.exe, the Cygwin shell will not work. The PATH must include the SDK tools, including a Perl, which should be the case under cmd.exe. If you do not have that, see the end of symbian\sdk.pl for notes of how your environment should be set up for Symbian compiles. (4) Build the project, either by make all in cmd.exe or by using either the Metrowerks CodeWarrior or the Visual C++ 6.0, or the Visual Studio 8 (the Visual C++ 2005 Express Edition works fine). If you use the VC IDE, you will have to run F<symbian\config.pl> first using the cmd.exe, and then run 'make win.mf vc6.mf' to generate the VC6 makefiles and workspaces. "make vc6" will compile for the VC6, and "make cw" for the CodeWarrior. The following SDK and compiler configurations and Nokia phones were tested at some point in time (+ = compiled and PerlApp run, - = not), both for Perl 5.8.x and 5.9.x: SDK | VC | CW | --------+----+----+--- S60 1.2 | + | + | 3650 (*) S60 2.0 | + | + | 6600 S60 2.1 | - | + | 6670 S60 2.6 | + | + | 6630 S60 2.8 | + | + | (not tested in a device) S80 2.6 | - | + | 9300 S90 1.1 | + | - | 7710 UIQ 2.1 | - | + | (not tested in a device) (*) Compiles but does not work, unfortunately, a problem with Symbian. If you are using the 'make' directly, it is the GNU make from the SDKs, and it will invoke the right make commands for the Windows emulator build and the Arm target builds ('thumb' by default) as necessary. The build scripts assume the 'absolute style' SDK installs under C:, the 'subst style' will not work. If using the VC IDE, to build use for example the File->Open Workspace-> C:\Symbian\8.0a\S60_2nd_FP2\epoc32\build\symbian\perl\perl\wins\perl.dsw The emulator binaries will appear in the same directory. 
If using the VC IDE, you will get a lot of warnings in the beginning of the build because a lot of headers mentioned by the source cannot be found, but this is not serious since those headers are not used. The Metrowerks will give a lot of warnings about unused variables and empty declarations, you can ignore those. When the Windows and Arm DLLs are built do not be scared by the very long messages whizzing by: it is the "export freeze" phase where the whole (rather large) API of Perl is listed. Once the build is completed you need to create the DLL SIS file by make perldll.sis which will create the file perlXYZ.sis (the XYZ being the Perl version) which you can then install into your Symbian device: an easy way to do this is to send it via Bluetooth or infrared and just open the message. Since the total size of all Perl SIS files once installed is over 2 MB, it is recommended to do the installation into a memory card (drive E:) instead of the C: drive. The size of the perlXYZ.SIS is about 370 kB but once it is in the device it is about 750 kB (according to the application manager). The perlXYZ.sis includes only the Perl DLL: to create an additional SIS file which includes some of the standard (pure) Perl libraries, issue the command make perllib.sis Some of the standard Perl libraries are included, but not all: see L</HISTORY> or F<symbian\install.cfg> for more details (250 kB -> 700 kB). Some of the standard Perl XS extensions (see L</HISTORY>) are also available: make perlext.sis which will create perlXYZext.sis (290 kB -> 770 kB). To compile the demonstration application PerlApp you need first to install the Perl headers under the SDK. To install the Perl headers and the class CPerlBase documentation so that you no longer need the Perl sources around to compile Perl applications using the SDK: make sdkinstall The destination directory is C:\Symbian\perl\X.Y.Z. For more details, see F<symbian\PerlBase.pod>.
Once the headers have been installed, you can create a SIS for the PerlApp: make perlapp.sis The perlapp.sis (11 kB -> 16 kB) will be built in the symbian subdirectory, but a copy will also be made to the main directory. If you want to package the Perl DLLs (one for WINS, one for ARMI), the headers, and the documentation: make perlsdk.zip which will create perlXYZsdk.zip that can be used in another Windows system with the SDK, without having to compile Perl in that system. If you want to package the PerlApp sources: make perlapp.zip If you want to package the perl.exe and miniperl.exe, you can use the perlexe.sis and miniperlexe.sis make targets. You also probably want the perllib.sis for the libraries and maybe even the perlapp.sis for the recognizer. The make target 'allsis' combines all the above SIS targets. To clean up after compilation you can use either of make clean make distclean depending on how clean you want to be. =head2 Compilation problems If you see right after "make" this cat makefile.sh >makefile 'cat' is not recognized as an internal or external command, operable program or batch file. it means you need to (re)run the F<symbian\config.pl>. If you get the error 'perl' is not recognized as an internal or external command, operable program or batch file. you may need to reinstall the ActivePerl. If you see this ren makedef.pl nomakedef.pl The system cannot find the file specified. C:\Symbian\...\make.exe: [rename_makedef] Error 1 (ignored) please ignore it since it is nothing serious (the build process renames the Perl makedef.pl as nomakedef.pl to avoid confusing it with a makedef.pl of the SDK). =head2 PerlApp The PerlApp application demonstrates how to embed Perl interpreters into a Symbian application. The "Time" menu item runs the following Perl code: C<print "Running in ", $^O, "\n", scalar localtime>, the "Oneliner" allows one to type in Perl code, and the "Run" opens a file chooser for selecting a Perl file to run.
The PerlApp is also started when the "Perl recognizer" (also included and installed) detects a Perl file being activated through the GUI, and offers either to install it under \Perl (if the Perl file is in the inbox of the messaging application) or to run it (if the Perl file is under \Perl). =head2 sisify.pl In the symbian subdirectory there is the F<sisify.pl> utility which can be used to package Perl scripts and/or Perl library directories into SIS files, which can be installed to the device. To run the sisify.pl utility, you will need to have the 'makesis' and 'uidcrc' utilities already installed. If you don't have the Win32 SDKs, you may try for example L<http://gnupoc.sourceforge.net/> or L<http://symbianos.org/~andreh/>. =head2 Using Perl in Symbian First of all note that you have full access to the Symbian device when using Perl: you can do a lot of damage to your device (like removing system files) unless you are careful. Please do take backups before doing anything. The Perl port has been done for the most part using the Symbian standard POSIX-ish STDLIB library. It is a reasonably complete library, but certain corners of such emulation libraries that tend to be left unimplemented on non-UNIX platforms have been left unimplemented also this time: fork(), signals(), user/group ids, select() working for sockets, non-blocking sockets, and so forth. See the file F<symbian/config.sh> and look for 'undef' to find the unsupported APIs (or from Perl use Config). The filesystem of Symbian devices uses DOSish syntax, "drives" separated from paths by a colon, and backslashes for the path. The exact assignment of the drives probably varies between platforms, but for example in Series 60 you might see C: as the (flash) main memory, D: as the RAM drive, E: as the memory card (MMC), Z: as the ROM. In Series 80 D: is the memory card. As far as the devices go the NUL: is the bit bucket, the COMx: are the serial lines, IRCOMx: are the IR ports, TMP: might be C:\System\Temp.
Remember to double those backslashes in doublequoted strings. The Perl DLL is installed in \System\Libs\. The Perl libraries and extension DLLs are installed in \System\Libs\Perl\X.Y.Z\. The PerlApp is installed in \System\Apps\, and the SIS also installs a couple of demo scripts in \Perl\ (C:\Mydocs\Perl\ on Nokia 7710). Note that the Symbian filesystem is very picky: it strongly prefers the \ instead of the /. When doing XS / Symbian C++ programming include first the Symbian headers, then any standard C/POSIX headers, then Perl headers, and finally any application headers. New() and Copy() are unfortunately used by both Symbian and Perl code so you'll have to play cpp games if you need them. PerlBase.h undefines the Perl definitions and redefines them as PerlNew() and PerlCopy(). =head1 TO DO Lots. See F<symbian/TODO>. =head1 WARNING As of Perl Symbian port version 0.4.1 no part of Perl's standard regression test suite has been run on a real Symbian device using the ported Perl, so innumerable bugs may lie in wait. Therefore there is absolutely no warranty. =head1 NOTE When creating and extending application programming interfaces (APIs) for Symbian or Series 60 or Series 80 or Series 90 it is suggested that trademarks, registered trademarks, or trade names are not used in the API names. Instead, developers should consider basing the API naming on the existing (C++, or maybe Java) public component and API naming, modified as appropriate by the rules of the programming language the new APIs are for. Nokia is a registered trademark of Nokia Corporation. Nokia's product names are trademarks or registered trademarks of Nokia. Other product and company names mentioned herein may be trademarks or trade names of their respective owners. =head1 AUTHOR Jarkko Hietaniemi =head1 COPYRIGHT Copyright (c) 2004-2005 Nokia. All rights reserved. Copyright (c) 2006-2007 Jarkko Hietaniemi. =head1 LICENSE The Symbian port is licensed under the same terms as Perl itself.
=head1 HISTORY =over 4 =item * 0.1.0: April 2005 (This will show as "0.01" in the Symbian Installer.) - The console window is a very simple console indeed: one can get the newline with "000" and the "C" button is a backspace. Do not expect a terminal capable of vt100 or ANSI sequences. The console is also "ASCII", you cannot input e.g. any accented letters. Because of obvious physical constraints the console is also very small: (in Nokia 6600) 22 columns, 17 rows. - The following libraries are available: AnyDBM_File AutoLoader base Carp Config Cwd constant DynaLoader Exporter File::Spec integer lib strict Symbol vars warnings XSLoader - The following extensions are available: attributes Compress::Zlib Cwd Data::Dumper Devel::Peek Digest::MD5 DynaLoader Fcntl File::Glob Filter::Util::Call IO List::Util MIME::Base64 PerlIO::scalar PerlIO::via SDBM_File Socket Storable Time::HiRes - The following extensions are missing for various technical reasons: B ByteLoader Devel::DProf Devel::PPPort Encode GDBM_File IPC::SysV NDBM_File Opcode PerlIO::encoding POSIX re Safe Sys::Hostname Sys::Syslog threads threads::shared Unicode::Normalize - Using MakeMaker or the Module::* to build and install modules is not supported. - Building XS other than the ones in the core is not supported. Since this is 0.something release, any future releases are almost guaranteed to be binary incompatible. As a sign of this the Symbian symbol exports are kept unfrozen and the .def files fully rebuilt every time. =item * 0.2.0: October 2005 - Perl 5.9.3 (patch level 25741) - Compress::Zlib and IO::Zlib supported - sisify.pl added We maintain the binary incompatibility. =item * 0.3.0: October 2005 - Perl 5.9.3 (patch level 25911) - Series 80 2.0 and UIQ 2.1 support We maintain the binary incompatibility. =item * 0.4.0: November 2005 - Perl 5.9.3 (patch level 26052) - adding a sample Symbian extension We maintain the binary incompatibility. 
=item * 0.4.1: December 2006 - Perl 5.9.5-to-be (patch level 30002) - added extensions: Compress/Raw/Zlib, Digest/SHA, Hash/Util, Math/BigInt/FastCalc, Text/Soundex, Time/Piece - port to S90 1.1 by alexander smishlajev We maintain the binary incompatibility. =item * 0.4.2: March 2007 - catchup with Perl 5.9.5-to-be (patch level 30812) - tested to build with Microsoft Visual C++ 2005 Express Edition (which uses Microsoft Visual C 8, instead of the old VC6), SDK used for testing S60_2nd_FP3 aka 8.1a We maintain the binary incompatibility. =back =cut
=head1 NAME perlcall - Perl calling conventions from C =head1 DESCRIPTION The purpose of this document is to show you how to call Perl subroutines directly from C, i.e., how to write I<callbacks>. Apart from discussing the C interface provided by Perl for writing callbacks the document uses a series of examples to show how the interface actually works in practice. In addition some techniques for coding callbacks are covered. Examples where callbacks are necessary include =over 5 =item * An Error Handler You have created an XSUB interface to an application's C API. A fairly common feature in applications is to allow you to define a C function that will be called whenever something nasty occurs. What we would like is to be able to specify a Perl subroutine that will be called instead. =item * An Event-Driven Program The classic example of where callbacks are used is when writing an event driven program, such as for an X11 application. In this case you register functions to be called whenever specific events occur, e.g., a mouse button is pressed, the cursor moves into a window or a menu item is selected. =back Although the techniques described here are applicable when embedding Perl in a C program, this is not the primary goal of this document. There are other details that must be considered and are specific to embedding Perl. For details on embedding Perl in C refer to L<perlembed>.
Before you launch yourself head first into the rest of this document, it would be a good idea to have read the following two documents--L<perlxs> and L<perlguts>. =head1 THE CALL_ FUNCTIONS Although this stuff is easier to explain using examples, you first need to be aware of a few important definitions. Perl has a number of C functions that allow you to call Perl subroutines. They are I32 call_sv(SV* sv, I32 flags); I32 call_pv(char *subname, I32 flags); I32 call_method(char *methname, I32 flags); I32 call_argv(char *subname, I32 flags, char **argv); The key function is I<call_sv>. All the other functions are fairly simple wrappers which make it easier to call Perl subroutines in special cases. At the end of the day they will all call I<call_sv> to invoke the Perl subroutine. All the I<call_*> functions have a C<flags> parameter which is used to pass a bit mask of options to Perl. This bit mask operates identically for each of the functions. The settings available in the bit mask are discussed in L</FLAG VALUES>. Each of the functions will now be discussed in turn. =over 5 =item call_sv I<call_sv> takes two parameters. The first, C<sv>, is an SV*. This allows you to specify the Perl subroutine to be called either as a C string (which has first been converted to an SV) or a reference to a subroutine. The section, L</Using call_sv>, shows how you can make use of I<call_sv>. =item call_pv The function, I<call_pv>, is similar to I<call_sv> except it expects its first parameter to be a C char* which identifies the Perl subroutine you want to call, e.g., C<call_pv("fred", 0)>. If the subroutine you want to call is in another package, just include the package name in the string, e.g., C<"pkg::fred">. =item call_method The function I<call_method> is used to call a method from a Perl class. The parameter C<methname> corresponds to the name of the method to be called. Note that the class that the method belongs to is passed on the Perl stack rather than in the parameter list.
This class can be either the name of the class (for a static method) or a reference to an object (for a virtual method). See L<perlobj> for more information on static and virtual methods and L</Using call_method> for an example of using I<call_method>. =item call_argv I<call_argv> calls the Perl subroutine specified by the C string stored in the C<subname> parameter. It also takes the usual C<flags> parameter. The final parameter, C<argv>, consists of a NULL-terminated list of C strings to be passed as parameters to the Perl subroutine. See L</Using call_argv>. =back All the functions return an integer. This is a count of the number of items returned by the Perl subroutine. The actual items returned by the subroutine are stored on the Perl stack. As a general rule you should I<always> check the return value from these functions. Even if you are expecting only a particular number of values to be returned from the Perl subroutine, there is nothing to stop someone from doing something unexpected--don't say you haven't been warned. =head1 FLAG VALUES The C<flags> parameter in all the I<call_*> functions is one of G_VOID, G_SCALAR, or G_ARRAY, which indicate the call context, OR'ed together with a bit mask of any combination of the other G_* symbols defined below. =head2 G_VOID Calls the Perl subroutine in a void context. This flag has 2 effects: =over 5 =item 1. It indicates to the subroutine being called that it is executing in a void context (if it executes I<wantarray> the result will be the undefined value). =item 2. It ensures that nothing is actually returned from the subroutine. =back The value returned by the I<call_*> function indicates how many items have been returned by the Perl subroutine--in this case it will be 0. =head2 G_SCALAR Calls the Perl subroutine in a scalar context. This is the default context flag setting for all the I<call_*> functions. This flag has 2 effects: =over 5 =item 1. 
It indicates to the subroutine being called that it is executing in a scalar context (if it executes I<wantarray> the result will be false). =item 2. It ensures that only a scalar is actually returned from the subroutine. The subroutine can, of course, ignore the I<wantarray> and return a list anyway. If so, then only the last element of the list will be returned. =back The value returned by the I<call_*> function indicates how many items have been returned by the Perl subroutine - in this case it will be either 0 or 1. If 0, then you have specified the G_DISCARD flag. If 1, then the item actually returned by the Perl subroutine will be stored on the Perl stack - the section L</Returning a Scalar> shows how to access this value on the stack. Remember that regardless of how many items the Perl subroutine returns, only the last one will be accessible from the stack - think of the case where only one value is returned as being a list with only one element. Any other items that were returned will not exist by the time control returns from the I<call_*> function. The section L</Returning a List in Scalar Context> shows an example of this behavior. =head2 G_ARRAY Calls the Perl subroutine in a list context. As with G_SCALAR, this flag has 2 effects: =over 5 =item 1. It indicates to the subroutine being called that it is executing in a list context (if it executes I<wantarray> the result will be true). =item 2. It ensures that all items returned from the subroutine will be accessible when control returns from the I<call_*> function. =back The value returned by the I<call_*> function indicates how many items have been returned by the Perl subroutine. If 0, then you have specified the G_DISCARD flag. If not 0, then it will be a count of the number of items returned by the subroutine. These items will be stored on the Perl stack. 
The section L</Returning a List of Values> gives an example of using the G_ARRAY flag and the mechanics of accessing the returned items from the Perl stack. =head2 G_DISCARD By default, the I<call_*> functions place the items returned by the Perl subroutine on the stack. If you are not interested in these items, then setting this flag will make Perl get rid of them automatically for you. Note that it is still possible to indicate a context to the Perl subroutine by using either G_SCALAR or G_ARRAY. If you do not set this flag then it is I<very> important that you make sure that any temporaries (i.e., parameters passed to the Perl subroutine and values returned from the subroutine) are disposed of yourself. The section L</Returning a Scalar> gives details of how to dispose of these temporaries explicitly and the section L</Using Perl to Dispose of Temporaries> discusses the specific circumstances where you can ignore the problem and let Perl deal with it for you. =head2 G_NOARGS Whenever a Perl subroutine is called using one of the I<call_*> functions, it is assumed by default that parameters are to be passed to the subroutine. If you are not passing any parameters to the Perl subroutine, you can save a bit of time by setting this flag. It has the effect of not creating the C<@_> array for the Perl subroutine. Although the functionality provided by this flag may seem straightforward, it should be used only if there is a good reason to do so. The reason for being cautious is that, even if you have specified the G_NOARGS flag, it is still possible for the Perl subroutine that has been called to think that you have passed it parameters. In fact, what can happen is that the Perl subroutine you have called can access the C<@_> array from a previous Perl subroutine. This will occur when the code that is executing the I<call_*> function has itself been called from another Perl subroutine.
The code below illustrates this sub fred { print "@_\n" } sub joe { &fred } &joe(1,2,3); This will print 1 2 3 What has happened is that C<fred> accesses the C<@_> array which belongs to C<joe>. =head2 G_EVAL It is possible for the Perl subroutine you are calling to terminate abnormally, e.g., by calling I<die> explicitly or by not actually existing. By default, when either of these events occurs, the process will terminate immediately. If you want to trap this type of event, specify the G_EVAL flag. It will put an I<eval { }> around the subroutine call. Whenever control returns from the I<call_*> function you need to check the C<$@> variable as you would in a normal Perl script. The value returned from the I<call_*> function is dependent on what other flags have been specified and whether an error has occurred. Here are all the different cases that can occur: =over 5 =item * If the I<call_*> function returns normally, then the value returned is as specified in the previous sections. =item * If G_DISCARD is specified, the return value will always be 0. =item * If G_ARRAY is specified I<and> an error has occurred, the return value will always be 0. =item * If G_SCALAR is specified I<and> an error has occurred, the return value will be 1 and the value on the top of the stack will be I<undef>. This means that if you have already detected the error by checking C<$@> and you want the program to continue, you must remember to pop the I<undef> from the stack. =back See L</Using G_EVAL> for details on using G_EVAL. =head2 G_KEEPERR Using the G_EVAL flag described above will always set C<$@>: clearing it if there was no error, and setting it to describe the error if there was an error in the called code. This is what you want if your intention is to handle possible errors, but sometimes you just want to trap errors and stop them interfering with the rest of the program. 
This scenario will mostly be applicable to code that is meant to be called from within destructors, asynchronous callbacks, and signal handlers. In such situations, where the code being called has little relation to the surrounding dynamic context, the main program needs to be insulated from errors in the called code, even if they can't be handled intelligently. It may also be useful to do this with code for C<__DIE__> or C<__WARN__> hooks, and C<tie> functions. The G_KEEPERR flag is meant to be used in conjunction with G_EVAL in I<call_*> functions that are used to implement such code, or with C<eval_sv>. This flag has no effect on the C<call_*> functions when G_EVAL is not used. When G_KEEPERR is used, any error in the called code will terminate the call as usual, and the error will not propagate beyond the call (as usual for G_EVAL), but it will not go into C<$@>. Instead the error will be converted into a warning, prefixed with the string "\t(in cleanup)". This can be disabled using C<no warnings 'misc'>. If there is no error, C<$@> will not be cleared. Note that the G_KEEPERR flag does not propagate into inner evals; these may still set C<$@>. The G_KEEPERR flag was introduced in Perl version 5.002. See L</Using G_KEEPERR> for an example of a situation that warrants the use of this flag. =head2 Determining the Context As mentioned above, you can determine the context of the currently executing subroutine in Perl with I<wantarray>. The equivalent test can be made in C by using the C<GIMME_V> macro, which returns C<G_ARRAY> if you have been called in a list context, C<G_SCALAR> if in a scalar context, or C<G_VOID> if in a void context (i.e., the return value will not be used). An older version of this macro is called C<GIMME>; in a void context it returns C<G_SCALAR> instead of C<G_VOID>. An example of using the C<GIMME_V> macro is shown in section L</Using GIMME_V>. =head1 EXAMPLES Enough of the definition talk! Let's have a few examples. 
Perl provides many macros to assist in accessing the Perl stack. Wherever possible, these macros should always be used when interfacing to Perl internals. We hope this should make the code less vulnerable to any changes made to Perl in the future. Another point worth noting is that in the first series of examples I have made use of only the I<call_pv> function. This has been done to keep the code simpler and ease you into the topic. Wherever possible, if the choice is between using I<call_pv> and I<call_sv>, you should always try to use I<call_sv>. See L</Using call_sv> for details. =head2 No Parameters, Nothing Returned This first trivial example will call a Perl subroutine, I<PrintUID>, to print out the UID of the process. sub PrintUID { print "UID is $<\n"; } and here is a C function to call it static void call_PrintUID() { dSP; PUSHMARK(SP); call_pv("PrintUID", G_DISCARD|G_NOARGS); } Simple, eh? A few points to note about this example: =over 5 =item 1. Ignore C<dSP> and C<PUSHMARK(SP)> for now. They will be discussed in the next example. =item 2. We aren't passing any parameters to I<PrintUID> so G_NOARGS can be specified. =item 3. We aren't interested in anything returned from I<PrintUID>, so G_DISCARD is specified. Even if I<PrintUID> was changed to return some value(s), having specified G_DISCARD will mean that they will be wiped by the time control returns from I<call_pv>. =item 4. As I<call_pv> is being used, the Perl subroutine is specified as a C string. In this case the subroutine name has been 'hard-wired' into the code. =item 5. Because we specified G_DISCARD, it is not necessary to check the value returned from I<call_pv>. It will always be 0. =back =head2 Passing Parameters Now let's make a slightly more complex example. This time we want to call a Perl subroutine, C<LeftString>, which will take 2 parameters--a string ($s) and an integer ($n). The subroutine will simply print the first $n characters of the string. 
So the Perl subroutine would look like this: sub LeftString { my($s, $n) = @_; print substr($s, 0, $n), "\n"; } The C function required to call I<LeftString> would look like this: static void call_LeftString(a, b) char * a; int b; { dSP; ENTER; SAVETMPS; PUSHMARK(SP); EXTEND(SP, 2); PUSHs(sv_2mortal(newSVpv(a, 0))); PUSHs(sv_2mortal(newSViv(b))); PUTBACK; call_pv("LeftString", G_DISCARD); FREETMPS; LEAVE; } Here are a few notes on the C function I<call_LeftString>. =over 5 =item 1. Parameters are passed to the Perl subroutine using the Perl stack. This is the purpose of the code beginning with the line C<dSP> and ending with the line C<PUTBACK>. The C<dSP> declares a local copy of the stack pointer. This local copy should B<always> be accessed as C<SP>. =item 2. If you are going to put something onto the Perl stack, you need to know where to put it. This is the purpose of the macro C<dSP>--it declares and initializes a I<local> copy of the Perl stack pointer. All the other macros which will be used in this example require you to have used this macro. The exception to this rule is if you are calling a Perl subroutine directly from an XSUB function. In this case it is not necessary to use the C<dSP> macro explicitly--it will be declared for you automatically. =item 3. Any parameters to be pushed onto the stack should be bracketed by the C<PUSHMARK> and C<PUTBACK> macros. The purpose of these two macros, in this context, is to count the number of parameters you are pushing automatically. Then whenever Perl is creating the C<@_> array for the subroutine, it knows how big to make it. The C<PUSHMARK> macro tells Perl to make a mental note of the current stack pointer. Even if you aren't passing any parameters (like the example shown in the section L</No Parameters, Nothing Returned>) you must still call the C<PUSHMARK> macro before you can call any of the I<call_*> functions--Perl still needs to know that there are no parameters. 
The C<PUTBACK> macro sets the global copy of the stack pointer to be the same as our local copy. If we didn't do this, I<call_pv> wouldn't know where the two parameters we pushed were--remember that up to now all the stack pointer manipulation we have done is with our local copy, I<not> the global copy. =item 4. Next, we come to EXTEND and PUSHs. This is where the parameters actually get pushed onto the stack. In this case we are pushing a string and an integer. Alternatively you can use the XPUSHs() macro, which combines a C<EXTEND(SP, 1)> and C<PUSHs()>. This is less efficient if you're pushing multiple values. See L<perlguts/"XSUBs and the Argument Stack"> for details on how the PUSH macros work. =item 5. Because we created temporary values (by means of sv_2mortal() calls) we will have to tidy up the Perl stack and dispose of mortal SVs. This is the purpose of ENTER; SAVETMPS; at the start of the function, and FREETMPS; LEAVE; at the end. The C<ENTER>/C<SAVETMPS> pair creates a boundary for any temporaries we create. This means that the temporaries we get rid of will be limited to those which were created after these calls. The C<FREETMPS>/C<LEAVE> pair will get rid of any values returned by the Perl subroutine (see next example), plus it will also dump the mortal SVs we have created. Having C<ENTER>/C<SAVETMPS> at the beginning of the code makes sure that no other mortals are destroyed. Think of these macros as working a bit like C<{> and C<}> in Perl to limit the scope of local variables. See the section L</Using Perl to Dispose of Temporaries> for details of an alternative to using these macros. =item 6. Finally, I<LeftString> can now be called via the I<call_pv> function. The only flag specified this time is G_DISCARD. Because we are passing 2 parameters to the Perl subroutine this time, we have not specified G_NOARGS. =back =head2 Returning a Scalar Now for an example of dealing with the items returned from a Perl subroutine. 
Here is a Perl subroutine, I<Adder>, that takes 2 integer parameters and simply returns their sum. sub Adder { my($a, $b) = @_; $a + $b; } Because we are now concerned with the return value from I<Adder>, the C function required to call it is now a bit more complex. static void call_Adder(a, b) int a; int b; { dSP; int count; ENTER; SAVETMPS; PUSHMARK(SP); EXTEND(SP, 2); PUSHs(sv_2mortal(newSViv(a))); PUSHs(sv_2mortal(newSViv(b))); PUTBACK; count = call_pv("Adder", G_SCALAR); SPAGAIN; if (count != 1) croak("Big trouble\n"); printf ("The sum of %d and %d is %d\n", a, b, POPi); PUTBACK; FREETMPS; LEAVE; } Points to note this time are =over 5 =item 1. The only flag specified this time was G_SCALAR. That means that the C<@_> array will be created and that the value returned by I<Adder> will still exist after the call to I<call_pv>. =item 2. The purpose of the macro C<SPAGAIN> is to refresh the local copy of the stack pointer. This is necessary because it is possible that the memory allocated to the Perl stack has been reallocated during the I<call_pv> call. If you are making use of the Perl stack pointer in your code you must always refresh the local copy using SPAGAIN whenever you make use of the I<call_*> functions or any other Perl internal function. =item 3. Although only a single value was expected to be returned from I<Adder>, it is still good practice to check the return code from I<call_pv> anyway. Expecting a single value is not quite the same as knowing that there will be one. If someone modified I<Adder> to return a list and we didn't check for that possibility and take appropriate action the Perl stack would end up in an inconsistent state. That is something you I<really> don't want to happen ever. =item 4. The C<POPi> macro is used here to pop the return value from the stack. In this case we wanted an integer, so C<POPi> was used. Here is the complete list of POP macros available, along with the types they return. 
POPs SV POPp pointer (PV) POPpbytex pointer to bytes (PV) POPn double (NV) POPi integer (IV) POPu unsigned integer (UV) POPl long POPul unsigned long Since these macros have side-effects don't use them as arguments to macros that may evaluate their argument several times, for example: /* Bad idea, don't do this */ STRLEN len; const char *s = SvPV(POPs, len); Instead, use a temporary: STRLEN len; SV *sv = POPs; const char *s = SvPV(sv, len); or a macro that guarantees it will evaluate its arguments only once: STRLEN len; const char *s = SvPVx(POPs, len); =item 5. The final C<PUTBACK> is used to leave the Perl stack in a consistent state before exiting the function. This is necessary because when we popped the return value from the stack with C<POPi> it updated only our local copy of the stack pointer. Remember, C<PUTBACK> sets the global stack pointer to be the same as our local copy. =back =head2 Returning a List of Values Now, let's extend the previous example to return both the sum of the parameters and the difference. Here is the Perl subroutine sub AddSubtract { my($a, $b) = @_; ($a+$b, $a-$b); } and this is the C function static void call_AddSubtract(a, b) int a; int b; { dSP; int count; ENTER; SAVETMPS; PUSHMARK(SP); EXTEND(SP, 2); PUSHs(sv_2mortal(newSViv(a))); PUSHs(sv_2mortal(newSViv(b))); PUTBACK; count = call_pv("AddSubtract", G_ARRAY); SPAGAIN; if (count != 2) croak("Big trouble\n"); printf ("%d - %d = %d\n", a, b, POPi); printf ("%d + %d = %d\n", a, b, POPi); PUTBACK; FREETMPS; LEAVE; } If I<call_AddSubtract> is called like this call_AddSubtract(7, 4); then here is the output 7 - 4 = 3 7 + 4 = 11 Notes =over 5 =item 1. We wanted list context, so G_ARRAY was used. =item 2. Not surprisingly C<POPi> is used twice this time because we were retrieving 2 values from the stack. The important thing to note is that when using the C<POP*> macros they come off the stack in I<reverse> order. 
=back =head2 Returning a List in Scalar Context Say the Perl subroutine in the previous section was called in a scalar context, like this static void call_AddSubScalar(a, b) int a; int b; { dSP; int count; int i; ENTER; SAVETMPS; PUSHMARK(SP); EXTEND(SP, 2); PUSHs(sv_2mortal(newSViv(a))); PUSHs(sv_2mortal(newSViv(b))); PUTBACK; count = call_pv("AddSubtract", G_SCALAR); SPAGAIN; printf ("Items Returned = %d\n", count); for (i = 1; i <= count; ++i) printf ("Value %d = %d\n", i, POPi); PUTBACK; FREETMPS; LEAVE; } Apart from using G_SCALAR, the other modification made is that I<call_AddSubScalar> will print the number of items returned from the Perl subroutine and their values (for simplicity it assumes that they are integers). So if I<call_AddSubScalar> is called call_AddSubScalar(7, 4); then the output will be Items Returned = 1 Value 1 = 3 In this case the main point to note is that only the last item in the list returned by the subroutine I<AddSubtract> actually made it back to I<call_AddSubScalar>. =head2 Returning Data from Perl via the Parameter List It is also possible to return values directly via the parameter list--whether it is actually desirable to do it is another matter entirely. The Perl subroutine, I<Inc>, below takes 2 parameters and increments each directly. sub Inc { ++ $_[0]; ++ $_[1]; } and here is a C function to call it. static void call_Inc(a, b) int a; int b; { dSP; int count; SV * sva; SV * svb; ENTER; SAVETMPS; sva = sv_2mortal(newSViv(a)); svb = sv_2mortal(newSViv(b)); PUSHMARK(SP); EXTEND(SP, 2); PUSHs(sva); PUSHs(svb); PUTBACK; count = call_pv("Inc", G_DISCARD); if (count != 0) croak ("call_Inc: expected 0 values from 'Inc', got %d\n", count); printf ("%d + 1 = %d\n", a, SvIV(sva)); printf ("%d + 1 = %d\n", b, SvIV(svb)); FREETMPS; LEAVE; } To be able to access the two parameters that were pushed onto the stack after they return from I<call_pv> it is necessary to make a note of their addresses--thus the two variables C<sva> and C<svb>.
The reason this is necessary is that the area of the Perl stack which held them will very likely have been overwritten by something else by the time control returns from I<call_pv>. =head2 Using G_EVAL Now an example using G_EVAL. Below is a Perl subroutine which computes the difference of its 2 parameters. If this would result in a negative result, the subroutine calls I<die>. sub Subtract { my ($a, $b) = @_; die "death can be fatal\n" if $a < $b; $a - $b; } and some C to call it static void call_Subtract(a, b) int a; int b; { dSP; int count; SV *err_tmp; ENTER; SAVETMPS; PUSHMARK(SP); EXTEND(SP, 2); PUSHs(sv_2mortal(newSViv(a))); PUSHs(sv_2mortal(newSViv(b))); PUTBACK; count = call_pv("Subtract", G_EVAL|G_SCALAR); SPAGAIN; /* Check the eval first */ err_tmp = ERRSV; if (SvTRUE(err_tmp)) { printf ("Uh oh - %s\n", SvPV_nolen(err_tmp)); POPs; } else { if (count != 1) croak("call_Subtract: wanted 1 value from 'Subtract', got %d\n", count); printf ("%d - %d = %d\n", a, b, POPi); } PUTBACK; FREETMPS; LEAVE; } If I<call_Subtract> is called thus call_Subtract(4, 5) the following will be printed Uh oh - death can be fatal Notes =over 5 =item 1. We want to be able to catch the I<die> so we have used the G_EVAL flag. Not specifying this flag would mean that the program would terminate immediately at the I<die> statement in the subroutine I<Subtract>. =item 2. The code err_tmp = ERRSV; if (SvTRUE(err_tmp)) { printf ("Uh oh - %s\n", SvPV_nolen(err_tmp)); POPs; } is the direct equivalent of this bit of Perl print "Uh oh - $@\n" if $@; C<PL_errgv> is a perl global of type C<GV *> that points to the symbol table entry containing the error. C<ERRSV> therefore refers to the C equivalent of C<$@>. We use a local temporary, C<err_tmp>, since C<ERRSV> is a macro that calls a function, and C<SvTRUE(ERRSV)> would end up calling that function multiple times. =for apidoc Amnh|GV *|PL_errgv =item 3. Note that the stack is popped using C<POPs> in the block where C<SvTRUE(err_tmp)> is true. 
This is necessary because whenever a I<call_*> function invoked with G_EVAL|G_SCALAR returns an error, the top of the stack holds the value I<undef>. Because we want the program to continue after detecting this error, it is essential that the stack be tidied up by removing the I<undef>. =back =head2 Using G_KEEPERR Consider this rather facetious example, where we have used an XS version of the call_Subtract example above inside a destructor: package Foo; sub new { bless {}, $_[0] } sub Subtract { my($a,$b) = @_; die "death can be fatal" if $a < $b; $a - $b; } sub DESTROY { call_Subtract(5, 4); } sub foo { die "foo dies"; } package main; { my $foo = Foo->new; eval { $foo->foo }; } print "Saw: $@" if $@; # should be, but isn't This example will fail to recognize that an error occurred inside the C<eval {}>. Here's why: the call_Subtract code got executed while perl was cleaning up temporaries when exiting the outer braced block, and because call_Subtract is implemented with I<call_pv> using the G_EVAL flag, it promptly reset C<$@>. This results in the failure of the outermost test for C<$@>, and thereby the failure of the error trap. Appending the G_KEEPERR flag, so that the I<call_pv> call in call_Subtract reads: count = call_pv("Subtract", G_EVAL|G_SCALAR|G_KEEPERR); will preserve the error and restore reliable error handling. =head2 Using call_sv In all the previous examples I have 'hard-wired' the name of the Perl subroutine to be called from C. Most of the time though, it is more convenient to be able to specify the name of the Perl subroutine from within the Perl script, and you'll want to use L<call_sv|perlapi/call_sv>. Consider the Perl code below sub fred { print "Hello there\n"; } CallSubPV("fred"); Here is a snippet of XSUB which defines I<CallSubPV>. void CallSubPV(name) char * name CODE: PUSHMARK(SP); call_pv(name, G_DISCARD|G_NOARGS); That is fine as far as it goes. 
The thing is, with I<call_pv> the Perl subroutine can be specified only as a string, whereas Perl also allows references to subroutines and anonymous subroutines. This is where I<call_sv> is useful. The code below for I<CallSubSV> is identical to I<CallSubPV> except that the C<name> parameter is now defined as an SV* and we use I<call_sv> instead of I<call_pv>. void CallSubSV(name) SV * name CODE: PUSHMARK(SP); call_sv(name, G_DISCARD|G_NOARGS); Because we are using an SV to call I<fred> the following can all be used: CallSubSV("fred"); CallSubSV(\&fred); $ref = \&fred; CallSubSV($ref); CallSubSV( sub { print "Hello there\n" } ); As you can see, I<call_sv> gives you much greater flexibility in how you can specify the Perl subroutine. You should note that, if it is necessary to store the SV (C<name> in the example above) which corresponds to the Perl subroutine so that it can be used later in the program, it is not enough just to store a copy of the pointer to the SV. Say the code above had been like this: static SV * rememberSub; void SaveSub1(name) SV * name CODE: rememberSub = name; void CallSavedSub1() CODE: PUSHMARK(SP); call_sv(rememberSub, G_DISCARD|G_NOARGS); The reason this is wrong is that, by the time you come to use the pointer C<rememberSub> in C<CallSavedSub1>, it may or may not still refer to the Perl subroutine that was recorded in C<SaveSub1>. This is particularly true for these cases: SaveSub1(\&fred); CallSavedSub1(); SaveSub1( sub { print "Hello there\n" } ); CallSavedSub1(); By the time each of the C<SaveSub1> statements above has been executed, the SV*s which corresponded to the parameters will no longer exist. Expect an error message from Perl of the form Can't use an undefined value as a subroutine reference at ... for each of the C<CallSavedSub1> lines.
Similarly, with this code $ref = \&fred; SaveSub1($ref); $ref = 47; CallSavedSub1(); you can expect one of these messages (which you actually get is dependent on the version of Perl you are using) Not a CODE reference at ... Undefined subroutine &main::47 called ... The variable $ref may have referred to the subroutine C<fred> whenever the call to C<SaveSub1> was made but by the time C<CallSavedSub1> gets called it now holds the number C<47>. Because we saved only a pointer to the original SV in C<SaveSub1>, any changes to $ref will be tracked by the pointer C<rememberSub>. This means that whenever C<CallSavedSub1> gets called, it will attempt to execute the code which is referenced by the SV* C<rememberSub>. In this case though, it now refers to the integer C<47>, so expect Perl to complain loudly. A similar but more subtle problem is illustrated with this code: $ref = \&fred; SaveSub1($ref); $ref = \&joe; CallSavedSub1(); This time whenever C<CallSavedSub1> gets called it will execute the Perl subroutine C<joe> (assuming it exists) rather than C<fred> as was originally requested in the call to C<SaveSub1>. To get around these problems it is necessary to take a full copy of the SV. The code below shows C<SaveSub2> modified to do that. /* this isn't thread-safe */ static SV * keepSub = (SV*)NULL; void SaveSub2(name) SV * name CODE: /* Take a copy of the callback */ if (keepSub == (SV*)NULL) /* First time, so create a new SV */ keepSub = newSVsv(name); else /* Been here before, so overwrite */ SvSetSV(keepSub, name); void CallSavedSub2() CODE: PUSHMARK(SP); call_sv(keepSub, G_DISCARD|G_NOARGS); To avoid creating a new SV every time C<SaveSub2> is called, the function first checks to see if it has been called before. If not, then space for a new SV is allocated and the reference to the Perl subroutine C<name> is copied to the variable C<keepSub> in one operation using C<newSVsv>. 
Thereafter, whenever C<SaveSub2> is called, the existing SV, C<keepSub>, is overwritten with the new value using C<SvSetSV>. Note: using a static or global variable to store the SV isn't thread-safe. You can either use the C<MY_CXT> mechanism documented in L<perlxs/Safely Storing Static Data in XS> which is fast, or store the values in perl global variables, using get_sv(), which is much slower. =head2 Using call_argv Here is a Perl subroutine which prints whatever parameters are passed to it. sub PrintList { my(@list) = @_; foreach (@list) { print "$_\n" } } And here is an example of I<call_argv> which will call I<PrintList>. static char * words[] = {"alpha", "beta", "gamma", "delta", NULL}; static void call_PrintList() { call_argv("PrintList", G_DISCARD, words); } Note that it is not necessary to call C<PUSHMARK> in this instance. This is because I<call_argv> will do it for you. =head2 Using call_method Consider the following Perl code: { package Mine; sub new { my($type) = shift; bless [@_] } sub Display { my ($self, $index) = @_; print "$index: $$self[$index]\n"; } sub PrintID { my($class) = @_; print "This is Class $class version 1.0\n"; } } It implements just a very simple class to manage an array. Apart from the constructor, C<new>, it declares methods, one static and one virtual. The static method, C<PrintID>, prints out simply the class name and a version number. The virtual method, C<Display>, prints out a single element of the array. Here is an all-Perl example of using it. $a = Mine->new('red', 'green', 'blue'); $a->Display(1); Mine->PrintID; will print 1: green This is Class Mine version 1.0 Calling a Perl method from C is fairly straightforward. 
The following things are required: =over 5 =item * A reference to the object for a virtual method or the name of the class for a static method =item * The name of the method =item * Any other parameters specific to the method =back Here is a simple XSUB which illustrates the mechanics of calling both the C<PrintID> and C<Display> methods from C. void call_Method(ref, method, index) SV * ref char * method int index CODE: PUSHMARK(SP); EXTEND(SP, 2); PUSHs(ref); PUSHs(sv_2mortal(newSViv(index))); PUTBACK; call_method(method, G_DISCARD); void call_PrintID(class, method) char * class char * method CODE: PUSHMARK(SP); XPUSHs(sv_2mortal(newSVpv(class, 0))); PUTBACK; call_method(method, G_DISCARD); So the methods C<PrintID> and C<Display> can be invoked like this: $a = Mine->new('red', 'green', 'blue'); call_Method($a, 'Display', 1); call_PrintID('Mine', 'PrintID'); The only thing to note is that, in both the static and virtual methods, the method name is not passed via the stack--it is used as the first parameter to I<call_method>. =head2 Using GIMME_V Here is a trivial XSUB which prints the context in which it is currently executing. void PrintContext() CODE: U8 gimme = GIMME_V; if (gimme == G_VOID) printf ("Context is Void\n"); else if (gimme == G_SCALAR) printf ("Context is Scalar\n"); else printf ("Context is Array\n"); And here is some Perl to test it. 
PrintContext; $a = PrintContext; @a = PrintContext; The output from that will be Context is Void Context is Scalar Context is Array =head2 Using Perl to Dispose of Temporaries In the examples given to date, any temporaries created in the callback (i.e., parameters passed on the stack to the I<call_*> function or values returned via the stack) have been freed by one of these methods: =over 5 =item * Specifying the G_DISCARD flag with I<call_*> =item * Explicitly using the C<ENTER>/C<SAVETMPS>--C<FREETMPS>/C<LEAVE> pairing =back There is another method which can be used, namely letting Perl do it for you automatically whenever it regains control after the callback has terminated. This is done by simply not using the ENTER; SAVETMPS; ... FREETMPS; LEAVE; sequence in the callback (and not, of course, specifying the G_DISCARD flag). If you are going to use this method you have to be aware of a possible memory leak which can arise under very specific circumstances. To explain these circumstances you need to know a bit about the flow of control between Perl and the callback routine. The examples given at the start of the document (an error handler and an event driven program) are typical of the two main sorts of flow control that you are likely to encounter with callbacks. There is a very important distinction between them, so pay attention. In the first example, an error handler, the flow of control could be as follows. You have created an interface to an external library. Control can reach the external library like this perl --> XSUB --> external library Whilst control is in the library, an error condition occurs. You have previously set up a Perl callback to handle this situation, so it will get executed. Once the callback has finished, control will drop back to Perl again. Here is what the flow of control will be like in that situation perl --> XSUB --> external library ... error occurs ... 
external library --> call_* --> perl | perl <-- XSUB <-- external library <-- call_* <----+ After processing of the error using I<call_*> is completed, control reverts back to Perl more or less immediately. In the diagram, the further right you go the more deeply nested the scope is. It is only when control is back with perl on the extreme left of the diagram that you will have dropped back to the enclosing scope and any temporaries you have left hanging around will be freed. In the second example, an event driven program, the flow of control will be more like this perl --> XSUB --> event handler ... event handler --> call_* --> perl | event handler <-- call_* <----+ ... event handler --> call_* --> perl | event handler <-- call_* <----+ ... event handler --> call_* --> perl | event handler <-- call_* <----+ In this case the flow of control can consist of only the repeated sequence event handler --> call_* --> perl for practically the complete duration of the program. This means that control may I<never> drop back to the surrounding scope in Perl at the extreme left. So what is the big problem? Well, if you are expecting Perl to tidy up those temporaries for you, you might be in for a long wait. For Perl to dispose of your temporaries, control must drop back to the enclosing scope at some stage. In the event driven scenario that may never happen. This means that, as time goes on, your program will create more and more temporaries, none of which will ever be freed. As each of these temporaries consumes some memory your program will eventually consume all the available memory in your system--kapow! So here is the bottom line--if you are sure that control will revert back to the enclosing Perl scope fairly quickly after the end of your callback, then it isn't absolutely necessary to dispose explicitly of any temporaries you may have created. Mind you, if you are at all uncertain about what to do, it doesn't do any harm to tidy up anyway. 
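The difference between the two flows of control can be sketched with a toy model in plain C. This is only an illustration under stated assumptions: it does not use the real Perl API, and every name in it (C<toy_mortal>, C<toy_savetmps>, C<toy_freetmps>, C<callback_untidy>, C<callback_tidy>, C<run_event_loop>) is hypothetical. C<toy_mortal> stands in for C<sv_2mortal(newSViv(...))>, and the C<toy_savetmps>/C<toy_freetmps> pair stands in for the C<SAVETMPS>/C<FREETMPS> bracket; the real temporaries stack is considerably more involved.

```c
#include <assert.h>
#include <stdlib.h>

/* Toy stand-in for Perl's temporaries stack; NOT the real implementation. */
#define MAX_TMPS 1024
static int *tmps[MAX_TMPS];
static int tmps_top = 0;            /* number of temporaries still live */

/* Hypothetical analogue of sv_2mortal(newSViv(v)): allocate a value and
 * record it so that a later toy_freetmps() can release it. */
int *toy_mortal(int v)
{
    int *p = malloc(sizeof *p);
    if (p == NULL || tmps_top == MAX_TMPS)
        abort();
    *p = v;
    tmps[tmps_top++] = p;
    return p;
}

/* Hypothetical analogue of SAVETMPS: remember the current boundary. */
int toy_savetmps(void)
{
    return tmps_top;
}

/* Hypothetical analogue of FREETMPS: free everything above the boundary. */
void toy_freetmps(int floor)
{
    while (tmps_top > floor)
        free(tmps[--tmps_top]);
}

/* Callback that relies on "someone else" to tidy up later. */
void callback_untidy(int event)
{
    (void) toy_mortal(event);
    (void) toy_mortal(event * 2);
}

/* Callback that brackets its temporaries explicitly. */
void callback_tidy(int event)
{
    int floor = toy_savetmps();
    (void) toy_mortal(event);
    (void) toy_mortal(event * 2);
    toy_freetmps(floor);
}

/* Event handler that never drops back to the enclosing scope while it
 * runs; returns how many temporaries are still live afterwards. */
int run_event_loop(void (*cb)(int), int events)
{
    int i;
    for (i = 0; i < events; ++i)
        cb(i);
    return tmps_top;
}
```

In this model, C<run_event_loop(callback_untidy, 100)> leaves 200 temporaries live, because nothing frees them while the loop is running, whereas C<run_event_loop(callback_tidy, 100)> started on an empty stack leaves none.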
=head2 Strategies for Storing Callback Context Information Potentially one of the trickiest problems to overcome when designing a callback interface can be figuring out how to store the mapping between the C callback function and the Perl equivalent. To help understand why this can be a real problem first consider how a callback is set up in an all C environment. Typically a C API will provide a function to register a callback. This will expect a pointer to a function as one of its parameters. Below is a call to a hypothetical function C<register_fatal> which registers the C function to get called when a fatal error occurs. register_fatal(cb1); The single parameter C<cb1> is a pointer to a function, so you must have defined C<cb1> in your code, say something like this static void cb1() { printf ("Fatal Error\n"); exit(1); } Now change that to call a Perl subroutine instead static SV * callback = (SV*)NULL; static void cb1() { dSP; PUSHMARK(SP); /* Call the Perl sub to process the callback */ call_sv(callback, G_DISCARD); } void register_fatal(fn) SV * fn CODE: /* Remember the Perl sub */ if (callback == (SV*)NULL) callback = newSVsv(fn); else SvSetSV(callback, fn); /* register the callback with the external library */ register_fatal(cb1); where the Perl equivalent of C<register_fatal> and the callback it registers, C<pcb1>, might look like this # Register the sub pcb1 register_fatal(\&pcb1); sub pcb1 { die "I'm dying...\n"; } The mapping between the C callback and the Perl equivalent is stored in the global variable C<callback>. This will be adequate if you ever need to have only one callback registered at any time. An example could be an error handler like the code sketched out above. Remember though, repeated calls to C<register_fatal> will replace the previously registered callback function with the new one. Say for example you want to interface to a library which allows asynchronous file i/o. 
In this case you may be able to register a callback whenever a read operation has completed. To be of any use we want to be able to call separate Perl subroutines for each file that is opened. As it stands, the error handler example above would not be adequate as it allows only a single callback to be defined at any time. What we require is a means of storing the mapping between the opened file and the Perl subroutine we want to be called for that file. Say the i/o library has a function C<asynch_read> which associates a C function C<ProcessRead> with a file handle C<fh>--this assumes that it has also provided some routine to open the file and so obtain the file handle. asynch_read(fh, ProcessRead) This may expect the C I<ProcessRead> function of this form void ProcessRead(fh, buffer) int fh; char * buffer; { ... } To provide a Perl interface to this library we need to be able to map between the C<fh> parameter and the Perl subroutine we want called. A hash is a convenient mechanism for storing this mapping. The code below shows a possible implementation static HV * Mapping = (HV*)NULL; void asynch_read(fh, callback) int fh SV * callback CODE: /* If the hash doesn't already exist, create it */ if (Mapping == (HV*)NULL) Mapping = newHV(); /* Save the fh -> callback mapping */ hv_store(Mapping, (char*)&fh, sizeof(fh), newSVsv(callback), 0); /* Register with the C Library */ asynch_read(fh, asynch_read_if); and C<asynch_read_if> could look like this static void asynch_read_if(fh, buffer) int fh; char * buffer; { dSP; SV ** sv; /* Get the callback associated with fh */ sv = hv_fetch(Mapping, (char*)&fh , sizeof(fh), FALSE); if (sv == (SV**)NULL) croak("Internal error...\n"); PUSHMARK(SP); EXTEND(SP, 2); PUSHs(sv_2mortal(newSViv(fh))); PUSHs(sv_2mortal(newSVpv(buffer, 0))); PUTBACK; /* Call the Perl sub */ call_sv(*sv, G_DISCARD); } For completeness, here is C<asynch_close>. This shows how to remove the entry from the hash C<Mapping>. 
void asynch_close(fh) int fh CODE: /* Remove the entry from the hash */ (void) hv_delete(Mapping, (char*)&fh, sizeof(fh), G_DISCARD); /* Now call the real asynch_close */ asynch_close(fh); So the Perl interface would look like this sub callback1 { my($handle, $buffer) = @_; } # Register the Perl callback asynch_read($fh, \&callback1); asynch_close($fh); The mapping between the C callback and Perl is stored in the global hash C<Mapping> this time. Using a hash has the distinct advantage that it allows an unlimited number of callbacks to be registered. What if the interface provided by the C callback doesn't contain a parameter which allows the file handle to Perl subroutine mapping? Say in the asynchronous i/o package, the callback function gets passed only the C<buffer> parameter like this void ProcessRead(buffer) char * buffer; { ... } Without the file handle there is no straightforward way to map from the C callback to the Perl subroutine. In this case a possible way around this problem is to predefine a series of C functions to act as the interface to Perl, thus #define MAX_CB 3 #define NULL_HANDLE -1 typedef void (*FnMap)(); struct MapStruct { FnMap Function; SV * PerlSub; int Handle; }; static void fn1(); static void fn2(); static void fn3(); static struct MapStruct Map [MAX_CB] = { { fn1, NULL, NULL_HANDLE }, { fn2, NULL, NULL_HANDLE }, { fn3, NULL, NULL_HANDLE } }; static void Pcb(index, buffer) int index; char * buffer; { dSP; PUSHMARK(SP); XPUSHs(sv_2mortal(newSVpv(buffer, 0))); PUTBACK; /* Call the Perl sub */ call_sv(Map[index].PerlSub, G_DISCARD); } static void fn1(buffer) char * buffer; { Pcb(0, buffer); } static void fn2(buffer) char * buffer; { Pcb(1, buffer); } static void fn3(buffer) char * buffer; { Pcb(2, buffer); } void array_asynch_read(fh, callback) int fh SV * callback CODE: int index; int null_index = MAX_CB; /* Find the same handle or an empty entry */ for (index = 0; index < MAX_CB; ++index) { if (Map[index].Handle == fh) break; if 
(Map[index].Handle == NULL_HANDLE) null_index = index; } if (index == MAX_CB && null_index == MAX_CB) croak ("Too many callback functions registered\n"); if (index == MAX_CB) index = null_index; /* Save the file handle */ Map[index].Handle = fh; /* Remember the Perl sub */ if (Map[index].PerlSub == (SV*)NULL) Map[index].PerlSub = newSVsv(callback); else SvSetSV(Map[index].PerlSub, callback); asynch_read(fh, Map[index].Function); void array_asynch_close(fh) int fh CODE: int index; /* Find the file handle */ for (index = 0; index < MAX_CB; ++ index) if (Map[index].Handle == fh) break; if (index == MAX_CB) croak ("could not close fh %d\n", fh); Map[index].Handle = NULL_HANDLE; SvREFCNT_dec(Map[index].PerlSub); Map[index].PerlSub = (SV*)NULL; asynch_close(fh); In this case the functions C<fn1>, C<fn2>, and C<fn3> are used to remember the Perl subroutine to be called. Each of the functions holds a separate hard-wired index which is used in the function C<Pcb> to access the C<Map> array and actually call the Perl subroutine. There are some obvious disadvantages with this technique. Firstly, the code is considerably more complex than with the previous example. Secondly, there is a hard-wired limit (in this case 3) to the number of callbacks that can exist simultaneously. The only way to increase the limit is by modifying the code to add more functions and then recompiling. None the less, as long as the number of functions is chosen with some care, it is still a workable solution and in some cases is the only one available. To summarize, here are a number of possible methods for you to consider for storing the mapping between C and the Perl callback =over 5 =item 1. Ignore the problem - Allow only 1 callback For a lot of situations, like interfacing to an error handler, this may be a perfectly adequate solution. =item 2. 
Create a sequence of callbacks - hard wired limit If it is impossible to tell from the parameters passed back from the C callback what the context is, then you may need to create a sequence of C callback interface functions, and store pointers to each in an array. =item 3. Use a parameter to map to the Perl callback A hash is an ideal mechanism to store the mapping between C and Perl. =back =head2 Alternate Stack Manipulation Although I have made use of only the C<POP*> macros to access values returned from Perl subroutines, it is also possible to bypass these macros and read the stack using the C<ST> macro (See L<perlxs> for a full description of the C<ST> macro). Most of the time the C<POP*> macros should be adequate; the main problem with them is that they force you to process the returned values in sequence. This may not be the most suitable way to process the values in some cases. What we want is to be able to access the stack in a random order. The C<ST> macro as used when coding an XSUB is ideal for this purpose. The code below is the example given in the section L</Returning a List of Values> recoded to use C<ST> instead of C<POP*>. static void call_AddSubtract2(a, b) int a; int b; { dSP; I32 ax; int count; ENTER; SAVETMPS; PUSHMARK(SP); EXTEND(SP, 2); PUSHs(sv_2mortal(newSViv(a))); PUSHs(sv_2mortal(newSViv(b))); PUTBACK; count = call_pv("AddSubtract", G_ARRAY); SPAGAIN; SP -= count; ax = (SP - PL_stack_base) + 1; if (count != 2) croak("Big trouble\n"); printf ("%d + %d = %d\n", a, b, SvIV(ST(0))); printf ("%d - %d = %d\n", a, b, SvIV(ST(1))); PUTBACK; FREETMPS; LEAVE; } Notes =over 5 =item 1. Notice that it was necessary to define the variable C<ax>. This is because the C<ST> macro expects it to exist. If we were in an XSUB it would not be necessary to define C<ax> as it is already defined for us. =item 2. The code SPAGAIN; SP -= count; ax = (SP - PL_stack_base) + 1; sets the stack up so that we can use the C<ST> macro. =item 3. 
Unlike the original coding of this example, the returned values are not accessed in reverse order. So C<ST(0)> refers to the first value returned by the Perl subroutine and C<ST(count-1)> refers to the last. =back =head2 Creating and Calling an Anonymous Subroutine in C As we've already shown, C<call_sv> can be used to invoke an anonymous subroutine. However, our example showed a Perl script invoking an XSUB to perform this operation. Let's see how it can be done inside our C code: ... SV *cvrv = eval_pv("sub { print 'You will not find me cluttering any namespace!' }", TRUE); ... call_sv(cvrv, G_VOID|G_NOARGS); C<eval_pv> is used to compile the anonymous subroutine, which will be the return value as well (read more about C<eval_pv> in L<perlapi/eval_pv>). Once this code reference is in hand, it can be mixed in with all the previous examples we've shown. =head1 LIGHTWEIGHT CALLBACKS Sometimes you need to invoke the same subroutine repeatedly. This usually happens with a function that acts on a list of values, such as Perl's built-in sort(). You can pass a comparison function to sort(), which will then be invoked for every pair of values that needs to be compared. The first() and reduce() functions from L<List::Util> follow a similar pattern. In this case it is possible to speed up the routine (often quite substantially) by using the lightweight callback API. The idea is that the calling context only needs to be created and destroyed once, and the sub can be called arbitrarily many times in between. It is usual to pass parameters using global variables (typically $_ for one parameter, or $a and $b for two parameters) rather than via @_. (It is possible to use the @_ mechanism if you know what you're doing, though there is as yet no supported API for it. It's also inherently slower.) 
The pattern of macro calls is like this: dMULTICALL; /* Declare local variables */ U8 gimme = G_SCALAR; /* context of the call: G_SCALAR, * G_ARRAY, or G_VOID */ PUSH_MULTICALL(cv); /* Set up the context for calling cv, and set local vars appropriately */ /* loop */ { /* set the value(s) of your parameter variables */ MULTICALL; /* Make the actual call */ } /* end of loop */ POP_MULTICALL; /* Tear down the calling context */ For some concrete examples, see the implementation of the first() and reduce() functions of List::Util 1.18. There you will also find a header file that emulates the multicall API on older versions of perl. =head1 SEE ALSO L<perlxs>, L<perlguts>, L<perlembed> =head1 AUTHOR Paul Marquess Special thanks to the following people who assisted in the creation of the document. Jeff Okamoto, Tim Bunce, Nick Gianniotis, Steve Kelem, Gurusamy Sarathy and Larry Wall. =head1 DATE Last updated for perl 5.23.1. =encoding utf8 =head1 NAME perl5124delta - what is new for perl v5.12.4 =head1 DESCRIPTION This document describes differences between the 5.12.3 release and the 5.12.4 release. If you are upgrading from an earlier release such as 5.12.2, first read L<perl5123delta>, which describes differences between 5.12.2 and 5.12.3. The major changes made in 5.12.0 are described in L<perl5120delta>. =head1 Incompatible Changes There are no changes intentionally incompatible with 5.12.3. If any exist, they are bugs and reports are welcome. =head1 Selected Bug Fixes When strict "refs" mode is off, C<%{...}> in rvalue context returns C<undef> if its argument is undefined. An optimisation introduced in Perl 5.12.0 to make C<keys %{...}> faster when used as a boolean did not take this into account, causing C<keys %{+undef}> (and C<keys %$foo> when C<$foo> is undefined) to be an error, which it should be in strict mode only [perl #81750].
C<lc>, C<uc>, C<lcfirst>, and C<ucfirst> no longer return untainted strings when the argument is tainted. This has been broken since perl 5.8.9 [perl #87336]. Fixed a case where it was possible that a freed buffer may have been read from when parsing a here document. =head1 Modules and Pragmata L<Module::CoreList> has been upgraded from version 2.43 to 2.50. =head1 Testing The F<cpan/CGI/t/http.t> test script has been fixed to work when the environment has HTTPS_* environment variables, such as HTTPS_PROXY. =head1 Documentation Updated the documentation for rand() in L<perlfunc> to note that it is not cryptographically secure. =head1 Platform Specific Notes =over 4 =item Linux Support Ubuntu 11.04's new multi-arch library layout. =back =head1 Acknowledgements Perl 5.12.4 represents approximately 5 months of development since Perl 5.12.3 and contains approximately 200 lines of changes across 11 files from 8 authors. Perl continues to flourish into its third decade thanks to a vibrant community of users and developers. The following people are known to have contributed the improvements that became Perl 5.12.4: Andy Dougherty, David Golden, David Leadbeater, Father Chrysostomos, Florian Ragwitz, Jesse Vincent, Leon Brocard, Zsbán Ambrus. =head1 Reporting Bugs If you find what you think is a bug, you might check the articles recently posted to the comp.lang.perl.misc newsgroup and the perl bug database at http://rt.perl.org/perlbug/ . There may also be information at http://www.perl.org/ , the Perl Home Page. If you believe you have an unreported bug, please run the B<perlbug> program included with your release. Be sure to trim your bug down to a tiny but sufficient test case. Your bug report, along with the output of C<perl -V>, will be sent off to perlbug@perl.org to be analysed by the Perl porting team. 
If the bug you are reporting has security implications, which make it
inappropriate to send to a publicly archived mailing list, then please send
it to perl5-security-report@perl.org. This points to a closed subscription
unarchived mailing list, which includes all the core committers, who will be
able to help assess the impact of issues, figure out a resolution, and help
co-ordinate the release of patches to mitigate or fix the problem across all
platforms on which Perl is supported. Please only use this address for
security issues in the Perl core, not for modules independently
distributed on CPAN.

=head1 SEE ALSO

The F<Changes> file for an explanation of how to view exhaustive details
on what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=cut

If you read this file _as_is_, just ignore the funny characters you see.
It is written in the POD format (see pod/perlpod.pod) which is specially
designed to be readable as is.

=head1 NAME

perlhpux - Perl version 5 on Hewlett-Packard Unix (HP-UX) systems

=head1 DESCRIPTION

This document describes various features of HP's Unix operating system
(HP-UX) that will affect how Perl version 5 (hereafter just Perl) is
compiled and/or runs.

=head2 Using perl as shipped with HP-UX

Application release September 2001, HP-UX 11.00 is the first to ship
with Perl. At that time it was perl-5.6.1 in /opt/perl. The first
occurrence is on CD 5012-7954 and can be installed using

  swinstall -s /cdrom perl

assuming you have mounted that CD on /cdrom.

That build was a portable hppa-1.1 multithread build that supports large
files compiled with gcc-2.9-hppa-991112.

If you perform a new installation, then (a newer) Perl will be installed
automatically. Pre-installed HP-UX systems now have more recent versions
of Perl and the updated modules.
The official (threaded) builds from HP, as they are shipped on the Application DVD/CD's are available on L<http://www.software.hp.com/portal/swdepot/displayProductInfo.do?productNumber=PERL> for both PA-RISC and IPF (Itanium Processor Family). They are built with the HP ANSI-C compiler. Up till 5.8.8 that was done by ActiveState. To see what version is included on the DVD (assumed here to be mounted on /cdrom), issue this command: # swlist -s /cdrom perl # perl D.5.8.8.B 5.8.8 Perl Programming Language perl.Perl5-32 D.5.8.8.B 32-bit 5.8.8 Perl Programming Language with Extensions perl.Perl5-64 D.5.8.8.B 64-bit 5.8.8 Perl Programming Language with Extensions To see what is installed on your system: # swlist -R perl # perl E.5.8.8.J Perl Programming Language # perl.Perl5-32 E.5.8.8.J 32-bit Perl Programming Language with Extensions perl.Perl5-32.PERL-MAN E.5.8.8.J 32-bit Perl Man Pages for IA perl.Perl5-32.PERL-RUN E.5.8.8.J 32-bit Perl Binaries for IA # perl.Perl5-64 E.5.8.8.J 64-bit Perl Programming Language with Extensions perl.Perl5-64.PERL-MAN E.5.8.8.J 64-bit Perl Man Pages for IA perl.Perl5-64.PERL-RUN E.5.8.8.J 64-bit Perl Binaries for IA =head2 Using perl from HP's porting centre HP porting centre tries to keep up with customer demand and release updates from the Open Source community. Having precompiled Perl binaries available is obvious, though "up-to-date" is something relative. At the moment of writing perl-5.10.1 and 5.28.0 were available. The HP porting centres are limited in what systems they are allowed to port to and they usually choose the two most recent OS versions available. HP has asked the porting centre to move Open Source binaries from /opt to /usr/local, so binaries produced since the start of July 2002 are located in /usr/local. One of HP porting centres URL's is L<http://hpux.connect.org.uk/> The port currently available is built with GNU gcc. As porting modern GNU gcc is extremely hard on HP-UX, they are stuck at version gcc-4.2.3. 
=head2 Other prebuilt perl binaries To get more perl depots for the whole range of HP-UX, visit H.Merijn Brand's site at L<http://mirrors.develooper.com/hpux/#Perl>. Carefully read the notes to see if the available versions suit your needs. =head2 Compiling Perl 5 on HP-UX When compiling Perl, you must use an ANSI C compiler. The C compiler that ships with all HP-UX systems is a K&R compiler that should only be used to build new kernels. Perl can be compiled with either HP's ANSI C compiler or with gcc. The former is recommended, as not only can it compile Perl with no difficulty, but also can take advantage of features listed later that require the use of HP compiler-specific command-line flags. If you decide to use gcc, make sure your installation is recent and complete, and be sure to read the Perl INSTALL file for more gcc-specific details. =head2 PA-RISC The last and final version of PA-RISC is 2.0, HP no longer sells any system with these CPU's. HP's HP9000 Unix systems run on HP's own Precision Architecture (PA-RISC) chip. HP-UX used to run on the Motorola MC68000 family of chips, but any machine with this chip in it is quite obsolete and this document will not attempt to address issues for compiling Perl on the Motorola chipset. Even though PA-RISC hardware is not sold anymore, a lot of machines still running on these CPU's can be found in the wild. The last order date for HP 9000 systems was December 31, 2008. HP PA-RISC systems are usually referred to with model description "HP 9000". The last CPU in this series is the PA-8900. 
Support for PA-RISC architectured machines officially ended as shown
in the following table:

   PA-RISC End-of-Life Roadmap

 +--------+----------------+----------------+-----------------+
 | HP9000 | Superdome      | PA-8700        | Spring 2011     |
 | 4-128  |                | PA-8800/sx1000 | Summer 2012     |
 | cores  |                | PA-8900/sx1000 | 2014            |
 |        |                | PA-8900/sx2000 | 2015            |
 +--------+----------------+----------------+-----------------+
 | HP9000 | rp7410, rp8400 | PA-8700        | Spring 2011     |
 | 2-32   | rp7420, rp8420 | PA-8800/sx1000 | 2012            |
 | cores  | rp7440, rp8440 | PA-8900/sx1000 | Autumn 2013     |
 |        |                | PA-8900/sx2000 | 2015            |
 +--------+----------------+----------------+-----------------+
 | HP9000 | rp44x0         | PA-8700        | Spring 2011     |
 | 1-8    |                | PA-8800/rp44x0 | 2012            |
 | cores  |                | PA-8900/rp44x0 | 2014            |
 +--------+----------------+----------------+-----------------+
 | HP9000 | rp34x0         | PA-8700        | Spring 2011     |
 | 1-4    |                | PA-8800/rp34x0 | 2012            |
 | cores  |                | PA-8900/rp34x0 | 2014            |
 +--------+----------------+----------------+-----------------+

A complete list of models at the time the OS was built is in the file
/usr/sam/lib/mo/sched.models. The first column corresponds to the last
part of the output of the "model" command. The second column is the
PA-RISC version and the third column is the exact chip type used.
(Start browsing at the bottom to prevent confusion ;-)

  # model
  9000/800/L1000-44
  # grep L1000-44 /usr/sam/lib/mo/sched.models
  L1000-44        2.0     PA8500

=head2 PA-RISC 1.0

The original version of PA-RISC. HP no longer sells any system with this chip.

The following systems contained PA-RISC 1.0 chips:

  600, 635, 645, 808, 815, 822, 825, 832, 834, 835, 840, 842, 845, 850,
  852, 855, 860, 865, 870, 890

=head2 PA-RISC 1.1

An upgrade to the PA-RISC design, it shipped for many years in many
different systems.
The following systems contain PA-RISC 1.1 chips:

  705, 710, 712, 715, 720, 722, 725, 728, 730, 735, 742, 743, 744, 745,
  747, 750, 755, 770, 777, 778, 779, 800, 801, 803, 806, 807, 809, 811,
  813, 816, 817, 819, 821, 826, 827, 829, 831, 837, 839, 841, 847, 849,
  851, 856, 857, 859, 867, 869, 877, 887, 891, 892, 897, A180, A180C,
  B115, B120, B132L, B132L+, B160L, B180L, C100, C110, C115, C120,
  C160L, D200, D210, D220, D230, D250, D260, D310, D320, D330, D350,
  D360, D410, DX0, DX5, DXO, E25, E35, E45, E55, F10, F20, F30, G30,
  G40, G50, G60, G70, H20, H30, H40, H50, H60, H70, I30, I40, I50, I60,
  I70, J200, J210, J210XC, K100, K200, K210, K220, K230, K400, K410,
  K420, S700i, S715, S744, S760, T500, T520

=head2 PA-RISC 2.0

The most recent upgrade to the PA-RISC design, it added support for
64-bit integer data.

As of the date of this document's last update, the following systems
contain PA-RISC 2.0 chips:

  700, 780, 781, 782, 783, 785, 802, 804, 810, 820, 861, 871, 879, 889,
  893, 895, 896, 898, 899, A400, A500, B1000, B2000, C130, C140, C160,
  C180, C180+, C180-XP, C200+, C400+, C3000, C360, C3600, CB260, D270,
  D280, D370, D380, D390, D650, J220, J2240, J280, J282, J400, J410,
  J5000, J5500XM, J5600, J7000, J7600, K250, K260, K260-EG, K270, K360,
  K370, K380, K450, K460, K460-EG, K460-XP, K470, K570, K580, L1000,
  L2000, L3000, N4000, R380, R390, SD16000, SD32000, SD64000, T540,
  T600, V2000, V2200, V2250, V2500, V2600

Just before HP took over Compaq, some systems were renamed. The link
that contained the explanation is dead, so here's a short summary:

  HP 9000 A-Class servers, now renamed HP Server rp2400 series.
  HP 9000 L-Class servers, now renamed HP Server rp5400 series.
  HP 9000 N-Class servers, now renamed HP Server rp7400.
  rp2400, rp2405, rp2430, rp2450, rp2470, rp3410, rp3440, rp4410,
  rp4440, rp5400, rp5405, rp5430, rp5450, rp5470, rp7400, rp7405,
  rp7410, rp7420, rp7440, rp8400, rp8420, rp8440, Superdome

The current naming convention is:

  aadddd
  ||||`+- 00 - 99 relative capacity & newness (upgrades, etc.)
  |||`--- unique number for each architecture to ensure different
  |||     systems do not have the same numbering across
  |||     architectures
  ||`---- 1 - 9 identifies family and/or relative positioning
  ||
  |`----- c = ia32 (cisc)
  |       p = pa-risc
  |       x = ia-64 (Itanium & Itanium 2)
  |       h = housing
  `------ t = tower
          r = rack optimized
          s = super scalable
          b = blade
          sa = appliance

=head2 Portability Between PA-RISC Versions

An executable compiled on a PA-RISC 2.0 platform will not execute on a
PA-RISC 1.1 platform, even if they are running the same version of
HP-UX. If you are building Perl on a PA-RISC 2.0 platform and want that
Perl to also run on a PA-RISC 1.1, the compiler flags +DAportable and
+DS32 should be used.

It is no longer possible to compile PA-RISC 1.0 executables on either
the PA-RISC 1.1 or 2.0 platforms. The command-line flags are accepted,
but the resulting executable will not run when transferred to a PA-RISC
1.0 system.

=head2 Itanium Processor Family (IPF) and HP-UX

HP-UX also runs on the newer Itanium processor. This requires the use
of HP-UX version 11.23 (11i v2) or 11.31 (11i v3), and with the
exception of a few differences detailed below and in later sections,
Perl should compile with no problems.

Although PA-RISC binaries can run on Itanium systems, you should not
attempt to use a PA-RISC version of Perl on an Itanium system. This is
because shared libraries created on an Itanium system cannot be loaded
while running a PA-RISC executable.

HP Itanium 2 systems are usually referred to with model description
"HP Integrity".

=head2 Itanium, Itanium 2 & Madison 6

HP also ships servers with the 128-bit Itanium processor(s). The cx26x0
is said to have Madison 6.
As of the date of this document's last update, the following systems contain Itanium or Itanium 2 chips (this is likely to be out of date): BL60p, BL860c, BL870c, BL890c, cx2600, cx2620, rx1600, rx1620, rx2600, rx2600hptc, rx2620, rx2660, rx2800, rx3600, rx4610, rx4640, rx5670, rx6600, rx7420, rx7620, rx7640, rx8420, rx8620, rx8640, rx9610, sx1000, sx2000 To see all about your machine, type # model ia64 hp server rx2600 # /usr/contrib/bin/machinfo =head2 HP-UX versions Not all architectures (PA = PA-RISC, IPF = Itanium Processor Family) support all versions of HP-UX, here is a short list HP-UX version Kernel Architecture End-of-factory support ------------- ------ ------------ ---------------------------------- 10.20 32 bit PA 30-Jun-2003 11.00 32/64 PA 31-Dec-2006 11.11 11i v1 32/64 PA 31-Dec-2015 11.22 11i v2 64 IPF 30-Apr-2004 11.23 11i v2 64 PA & IPF 31-Dec-2015 11.31 11i v3 64 PA & IPF 31-Dec-2020 (PA) 31-Dec-2025 (IPF) See for the full list of hardware/OS support and expected end-of-life L<https://h20195.www2.hpe.com/V2/getpdf.aspx/4AA4-7673ENW.pdf> =head2 Building Dynamic Extensions on HP-UX HP-UX supports dynamically loadable libraries (shared libraries). Shared libraries end with the suffix .sl. On Itanium systems, they end with the suffix .so. Shared libraries created on a platform using a particular PA-RISC version are not usable on platforms using an earlier PA-RISC version by default. However, this backwards compatibility may be enabled using the same +DAportable compiler flag (with the same PA-RISC 1.0 caveat mentioned above). Shared libraries created on an Itanium platform cannot be loaded on a PA-RISC platform. Shared libraries created on a PA-RISC platform can only be loaded on an Itanium platform if it is a PA-RISC executable that is attempting to load the PA-RISC library. A PA-RISC shared library cannot be loaded into an Itanium executable nor vice-versa. To create a shared library, the following steps must be performed: 1. 
Compile source modules with +z or +Z flag to create a .o module which contains Position-Independent Code (PIC). The linker will tell you in the next step if +Z was needed. (For gcc, the appropriate flag is -fpic or -fPIC.) 2. Link the shared library using the -b flag. If the code calls any functions in other system libraries (e.g., libm), it must be included on this line. (Note that these steps are usually handled automatically by the extension's Makefile). If these dependent libraries are not listed at shared library creation time, you will get fatal "Unresolved symbol" errors at run time when the library is loaded. You may create a shared library that refers to another library, which may be either an archive library or a shared library. If this second library is a shared library, this is called a "dependent library". The dependent library's name is recorded in the main shared library, but it is not linked into the shared library. Instead, it is loaded when the main shared library is loaded. This can cause problems if you build an extension on one system and move it to another system where the libraries may not be located in the same place as on the first system. If the referred library is an archive library, then it is treated as a simple collection of .o modules (all of which must contain PIC). These modules are then linked into the shared library. Note that it is okay to create a library which contains a dependent library that is already linked into perl. Some extensions, like DB_File and Compress::Zlib use/require prebuilt libraries for the perl extensions/modules to work. If these libraries are built using the default configuration, it might happen that you run into an error like "invalid loader fixup" during load phase. HP is aware of this problem. Search the HP-UX cxx-dev forums for discussions about the subject. The short answer is that B<everything> (all libraries, everything) must be compiled with C<+z> or C<+Z> to be PIC (position independent code). 
(For gcc, that would be C<-fpic> or C<-fPIC>). In HP-UX 11.00 or newer the linker error message should tell the name of the offending object file. A more general approach is to intervene manually, as with an example for the DB_File module, which requires SleepyCat's libdb.sl: # cd .../db-3.2.9/build_unix # vi Makefile ... add +Z to all cflags to create shared objects CFLAGS= -c $(CPPFLAGS) +Z -Ae +O2 +Onolimit \ -I/usr/local/include -I/usr/include/X11R6 CXXFLAGS= -c $(CPPFLAGS) +Z -Ae +O2 +Onolimit \ -I/usr/local/include -I/usr/include/X11R6 # make clean # make # mkdir tmp # cd tmp # ar x ../libdb.a # ld -b -o libdb-3.2.sl *.o # mv libdb-3.2.sl /usr/local/lib # rm *.o # cd /usr/local/lib # rm -f libdb.sl # ln -s libdb-3.2.sl libdb.sl # cd .../DB_File-1.76 # make distclean # perl Makefile.PL # make # make test # make install As of db-4.2.x it is no longer needed to do this by hand. Sleepycat has changed the configuration process to add +z on HP-UX automatically. # cd .../db-4.2.25/build_unix # env CFLAGS=+DD64 LDFLAGS=+DD64 ../dist/configure should work to generate 64bit shared libraries for HP-UX 11.00 and 11i. It is no longer possible to link PA-RISC 1.0 shared libraries (even though the command-line flags are still present). PA-RISC and Itanium object files are not interchangeable. Although you may be able to use ar to create an archive library of PA-RISC object files on an Itanium system, you cannot link against it using an Itanium link editor. =head2 The HP ANSI C Compiler When using this compiler to build Perl, you should make sure that the flag -Aa is added to the cpprun and cppstdin variables in the config.sh file (though see the section on 64-bit perl below). If you are using a recent version of the Perl distribution, these flags are set automatically. 
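To see what an already installed perl was configured with, the values recorded in config.sh can be read back through the standard C<Config> module. The following is only a minimal sketch: it prints whatever C<cpprun> and C<cppstdin> were set to at build time, and whether C<-Aa> appears there depends on the compiler and the Perl version, so no particular output is assumed.

```perl
use strict;
use warnings;
use Config;

# cpprun and cppstdin are the config.sh variables named above; %Config
# exposes the values the running perl was built with.
for my $var (qw(cpprun cppstdin)) {
    my $value = defined $Config{$var} ? $Config{$var} : '(not set)';
    print "$var = $value\n";
}
```

The same values are printed by C<perl -V:cpprun -V:cppstdin> on the command line.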
Even though HP-UX 10.20 and 11.00 are not actively maintained by HP
anymore, updates for the HP ANSI C compiler are still available from
time to time, and it might be advisable to see if updates are applicable.
At the moment of writing, the latest available patches for 11.00 that
should be applied are PHSS_35098, PHSS_35175, PHSS_35100, PHSS_33036,
and PHSS_33902. If you have a SUM account, you can use it to search
for updates/patches. Enter "ANSI" as keyword.

=head2 The GNU C Compiler

When you are going to use the GNU C compiler (gcc), and you don't have
gcc yet, you can either build it yourself (if you feel masochistic enough)
from the sources (available from e.g. L<http://gcc.gnu.org/mirrors.html>)
or fetch a prebuilt binary from the HP porting center at
L<http://hpux.connect.org.uk/hppd/cgi-bin/search?term=gcc&Search=Search>
or from the DSPP (you need to be a member) at
L<http://h21007.www2.hp.com/portal/site/dspp/menuitem.863c3e4cbcdc3f3515b49c108973a801?ciid=2a08725cc2f02110725cc2f02110275d6e10RCRD&jumpid=reg_r1002_usen_c-001_title_r0001>
(Browse through the list, because there are often multiple versions of
the same package available).

Most mentioned distributions are depots. H.Merijn Brand has made prebuilt
gcc binaries available on L<http://mirrors.develooper.com/hpux/> and/or
L<http://www.cmve.net/~merijn/> for HP-UX 10.20 (only 32bit), HP-UX 11.00,
HP-UX 11.11 (HP-UX 11i v1), and HP-UX 11.23 (HP-UX 11i v2 PA-RISC) in both
32- and 64-bit versions. For HP-UX 11.23 IPF and HP-UX 11.31 IPF depots are
available too. The IPF versions do not need two versions of GNU gcc.

On PA-RISC you need a different compiler for 32-bit applications and for
64-bit applications. On PA-RISC, 32-bit objects and 64-bit objects do
not mix. Period. There is no different behaviour for HP C-ANSI-C or GNU
gcc. So if you require your perl binary to use 64-bit libraries, like
Oracle-64bit, you MUST build a 64-bit perl.
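Before linking against 64-bit libraries it can help to confirm how an existing perl binary was built. The sketch below uses only the standard C<Config> module; the variable names C<$ptr_bits>, C<$iv_bits> and C<$all64> are illustrative, and the script reports the build settings rather than proving that any particular library will load.

```perl
use strict;
use warnings;
use Config;

# Sizes are recorded in bytes in %Config; convert to bits for display.
my $ptr_bits = 8 * $Config{ptrsize};
my $iv_bits  = 8 * $Config{ivsize};

# use64bitall is 'define' when perl was configured with -Duse64bitall.
my $all64 = ($Config{use64bitall} || '') eq 'define' ? 'yes' : 'no';

print "pointer size:  $ptr_bits bits\n";
print "integer (IV):  $iv_bits bits\n";
print "use64bitall:   $all64\n";
```

A perl whose pointer size is reported as 32 bits cannot load 64-bit PA-RISC shared libraries, which is the situation the paragraph above warns about.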
Building a 64-bit capable gcc on PA-RISC from source is possible only when
you have the HP C-ANSI C compiler or an already working 64-bit binary of
gcc available. Best performance for perl is achieved with HP's native
compiler.

=head2 Using Large Files with Perl on HP-UX

Beginning with HP-UX version 10.20, files larger than 2GB (2^31 bytes)
may be created and manipulated. Three separate methods of doing this
are available. Of these methods, the best method for Perl is to compile
using the -Duselargefiles flag to Configure. This causes Perl to be
compiled using structures and functions in which these are 64 bits wide,
rather than 32 bits wide. (Note that this will only work with HP's ANSI
C compiler. If you want to compile Perl using gcc, you will have to get
a version of the compiler that supports 64-bit operations. See above for
where to find it.)

There are some drawbacks to this approach. One is that any extension
which calls any file-manipulating C function will need to be recompiled
(just follow the usual "perl Makefile.PL; make; make test; make install"
procedure).

The list of functions that will need to be recompiled is:

  creat,          fgetpos,        fopen,
  freopen,        fsetpos,        fstat,
  fstatvfs,       fstatvfsdev,    ftruncate,
  ftw,            lockf,          lseek,
  lstat,          mmap,           nftw,
  open,           prealloc,       stat,
  statvfs,        statvfsdev,     tmpfile,
  truncate,       getrlimit,      setrlimit

Another drawback is only valid for Perl versions before 5.6.0. This
drawback is that the seek and tell functions (both the builtin version
and POSIX module version) will not perform correctly.

It is strongly recommended that you use this flag when you run
Configure. If you do not do this, but later answer the question about
large files when Configure asks you, you may get a configuration that
cannot be compiled, or that does not function as expected.

=head2 Threaded Perl on HP-UX

It is possible to compile a version of threaded Perl on any version of
HP-UX before 10.30, but it is strongly suggested that you be running on
HP-UX 11.00 at least.
To compile Perl with threads, add -Dusethreads to the arguments of
Configure. Verify that the -D_POSIX_C_SOURCE=199506L compiler flag is
automatically added to the list of flags. Also make sure that -lpthread
is listed before -lc in the list of libraries to link Perl with. The
hints provided for HP-UX during Configure will try very hard to get
this right for you.

HP-UX versions before 10.30 require a separate installation of a POSIX
threads library package. Two examples are the HP DCE package, available
on "HP-UX Hardware Extensions 3.0, Install and Core OS, Release 10.20,
April 1999 (B3920-13941)" or the freely available PTH package, available
on H.Merijn's site (L<http://mirrors.develooper.com/hpux/>). The use of PTH
will be unsupported in perl-5.12 and up and is rather buggy in 5.11.x.

If you are going to use the HP DCE package, the library used for threading
is /usr/lib/libcma.sl, but there have been multiple updates of that
library over time. Perl will build with the first version, but it
will not pass the test suite. Older Oracle versions might be a compelling
reason not to update that library, otherwise please find a newer version
in one of the following patches: PHSS_19739, PHSS_20608, or PHSS_23672

Reformatted output:

  d3:/usr/lib 106 > what libcma-*.1
  libcma-00000.1:
     HP DCE/9000 1.5               Module: libcma.sl (Export)
                                   Date: Apr 29 1996 22:11:24
  libcma-19739.1:
     HP DCE/9000 1.5 PHSS_19739-40 Module: libcma.sl (Export)
                                   Date: Sep 4 1999 01:59:07
  libcma-20608.1:
     HP DCE/9000 1.5 PHSS_20608    Module: libcma.1 (Export)
                                   Date: Dec 8 1999 18:41:23
  libcma-23672.1:
     HP DCE/9000 1.5 PHSS_23672    Module: libcma.1 (Export)
                                   Date: Apr 9 2001 10:01:06
  d3:/usr/lib 107 >

If you choose the PTH package, use swinstall to install pth in
the default location (/opt/pth), and then make symbolic links to the
libraries from /usr/lib

  # cd /usr/lib
  # ln -s /opt/pth/lib/libpth* .

For building perl to support Oracle, it needs to be linked with libcl
and libpthread.
So even if your perl is an unthreaded build, these
libraries might be required. See "Oracle on HP-UX" below.

=head2 64-bit Perl on HP-UX

Beginning with HP-UX 11.00, programs compiled under HP-UX can take
advantage of the LP64 programming environment (LP64 means Longs and
Pointers are 64 bits wide), in which scalar variables will be able
to hold numbers larger than 2^32 with complete precision. Perl has
proven to be consistent and reliable in 64bit mode since 5.8.1 on
all HP-UX 11.xx.

As of the date of this document, Perl is fully 64-bit compliant on
HP-UX 11.00 and up for both cc- and gcc builds. If you are about to
build a 64-bit perl with GNU gcc, please read the gcc section carefully.

Should a user have the need for compiling Perl in the LP64 environment,
use the -Duse64bitall flag to Configure. This will force Perl to be
compiled in a pure LP64 environment (with the +DD64 flag for HP C-ANSI-C,
with no additional options for GNU gcc 64-bit on PA-RISC, and with
-mlp64 for GNU gcc on Itanium).
If you want to compile Perl using gcc, you will have to get a version of
the compiler that supports 64-bit operations.

You can also use the -Duse64bitint flag to Configure. Although there
are some minor differences between compiling Perl with this flag versus
the -Duse64bitall flag, they should not be noticeable from a Perl user's
perspective. When configuring -Duse64bitint using a 64bit gcc on a
pa-risc architecture, -Duse64bitint is silently promoted to -Duse64bitall.

In both cases, it is strongly recommended that you use these flags when
you run Configure. If you do not do this, but later answer the
questions about 64-bit numbers when Configure asks you, you may get a
configuration that cannot be compiled, or that does not function as
expected.

=head2 Oracle on HP-UX

Using perl to connect to Oracle databases through DBI and DBD::Oracle
has caused a lot of people many headaches. Read README.hpux in the
DBD::Oracle distribution for much more information.
The reason to mention it
here is that Oracle requires a perl built with libcl and libpthread, the
latter even when perl is built without threads. Building perl using all
defaults, while still allowing DBD::Oracle to be built later on, can be
achieved using

  Configure -A prepend:libswanted='cl pthread ' ...

Do not forget the space before the trailing quote.

Also note that this does not (yet) work with all configurations;
it is known to fail with 64-bit versions of GCC.

=head2 GDBM and Threads on HP-UX

If you attempt to compile Perl with (POSIX) threads on an 11.X system
and also link in the GDBM library, then Perl will immediately core dump
when it starts up. The only workaround at this point is to relink the
GDBM library under 11.X, then relink it into Perl.

The error might show something like:

  Pthread internal error: message: __libc_reinit() failed, file: ../pthreads/pthread.c, line: 1096
  Return Pointer is 0xc082bf33
  sh: 5345 Quit(coredump)

and Configure will give up.

=head2 NFS filesystems and utime(2) on HP-UX

If you are compiling Perl on a remotely-mounted NFS filesystem, the test
io/fs.t may fail on test #18. This appears to be a bug in HP-UX and no
fix is currently available.

=head2 HP-UX Kernel Parameters (maxdsiz) for Compiling Perl

By default, HP-UX comes configured with a maximum data segment size of
64MB. This is too small to correctly compile Perl with the maximum
optimization levels. You can increase the size of the maxdsiz kernel
parameter through the use of SAM.

When using the GUI version of SAM, click on the Kernel Configuration
icon, then the Configurable Parameters icon. Scroll down and select
the maxdsiz line. From the Actions menu, select the Modify Configurable
Parameter item. Insert the new formula into the Formula/Value box.
Then follow the instructions to rebuild your kernel and reboot your
system.

In general, a value of 256MB (or "256*1024*1024") is sufficient for
Perl to compile at maximum optimization.
=head1 nss_delete core dump from op/pwent or op/grent

You may get a bus error core dump from the op/pwent or op/grent
tests. If compiled with -g you will see a stack trace much like
the following:

  #0  0xc004216c in  () from /usr/lib/libc.2
  #1  0xc00d7550 in __nss_src_state_destr () from /usr/lib/libc.2
  #2  0xc00d7768 in __nss_src_state_destr () from /usr/lib/libc.2
  #3  0xc00d78a8 in nss_delete () from /usr/lib/libc.2
  #4  0xc01126d8 in endpwent () from /usr/lib/libc.2
  #5  0xd1950 in Perl_pp_epwent () from ./perl
  #6  0x94d3c in Perl_runops_standard () from ./perl
  #7  0x23728 in S_run_body () from ./perl
  #8  0x23428 in perl_run () from ./perl
  #9  0x2005c in main () from ./perl

The key here is the C<nss_delete> call. One workaround for this
bug seems to be to add to the file F</etc/nsswitch.conf>
(at least) the following lines

  group: files
  passwd: files

Whether you are using NIS does not matter. Amazingly enough,
the same bug also affects Solaris.

=head1 error: pasting ")" and "l" does not give a valid preprocessing token

There seems to be a broken system header file in HP-UX 11.00 that
breaks perl building in 32bit mode with GNU gcc-4.x causing this
error.
The same file for HP-UX 11.11 (even though the file is older) does not show this failure, and has the correct definition, so the best fix is to patch the header to match: --- /usr/include/inttypes.h 2001-04-20 18:42:14 +0200 +++ /usr/include/inttypes.h 2000-11-14 09:00:00 +0200 @@ -72,7 +72,7 @@ #define UINT32_C(__c) __CONCAT_U__(__c) #else /* __LP64 */ #define INT32_C(__c) __CONCAT__(__c,l) -#define UINT32_C(__c) __CONCAT__(__CONCAT_U__(__c),l) +#define UINT32_C(__c) __CONCAT__(__c,ul) #endif /* __LP64 */ #define INT64_C(__c) __CONCAT_L__(__c,l) =head1 Redeclaration of "sendpath" with a different storage class specifier The following compilation warnings may happen in HP-UX releases earlier than 11.31 but are harmless: cc: "/usr/include/sys/socket.h", line 535: warning 562: Redeclaration of "sendfile" with a different storage class specifier: "sendfile" will have internal linkage. cc: "/usr/include/sys/socket.h", line 536: warning 562: Redeclaration of "sendpath" with a different storage class specifier: "sendpath" will have internal linkage. They seem to be caused by broken system header files, and also other open source projects are seeing them. The following HP-UX patches should make the warnings go away: CR JAGae12001: PHNE_27063 Warning 562 on sys/socket.h due to redeclaration of prototypes CR JAGae16787: Warning 562 from socket.h sendpath/sendfile -D_FILEFFSET_BITS=64 CR JAGae73470 (11.23) ER: Compiling socket.h with cc -D_FILEFFSET_BITS=64 warning 267/562 =head1 Miscellaneous HP-UX 11 Y2K patch "Y2K-1100 B.11.00.B0125 HP-UX Core OS Year 2000 Patch Bundle" has been reported to break the io/fs test #18 which tests whether utime() can change timestamps. The Y2K patch seems to break utime() so that over NFS the timestamps do not get changed (on local filesystems utime() still works). This has probably been fixed on your system by now. 
=head1 AUTHOR

  H.Merijn Brand <h.m.brand@xs4all.nl>
  Jeff Okamoto <okamoto@corp.hp.com>

With much assistance regarding shared libraries from Marc Sabatella.

=cut

=head1 NAME

perldbmfilter - Perl DBM Filters

=head1 SYNOPSIS

    $db = tie %hash, 'DBM', ...

    $old_filter = $db->filter_store_key  ( sub { ... } );
    $old_filter = $db->filter_store_value( sub { ... } );
    $old_filter = $db->filter_fetch_key  ( sub { ... } );
    $old_filter = $db->filter_fetch_value( sub { ... } );

=head1 DESCRIPTION

The four C<filter_*> methods shown above are available in all the DBM
modules that ship with Perl, namely DB_File, GDBM_File, NDBM_File,
ODBM_File and SDBM_File.

Each of the methods works identically, and is used to install (or
uninstall) a single DBM Filter. The only difference between them is the
place that the filter is installed.

To summarise:

=over 5

=item B<filter_store_key>

If a filter has been installed with this method, it will be invoked
every time you write a key to a DBM database.

=item B<filter_store_value>

If a filter has been installed with this method, it will be invoked
every time you write a value to a DBM database.

=item B<filter_fetch_key>

If a filter has been installed with this method, it will be invoked
every time you read a key from a DBM database.

=item B<filter_fetch_value>

If a filter has been installed with this method, it will be invoked
every time you read a value from a DBM database.

=back

You can use any combination of the methods from none to all four.

All filter methods return the existing filter, if present, or C<undef>
if not.

To delete a filter pass C<undef> to it.

=head2 The Filter

When each filter is called by Perl, a local copy of C<$_> will contain
the key or value to be filtered. Filtering is achieved by modifying
the contents of C<$_>. The return code from the filter is ignored.

=head2 An Example: the NULL termination problem.
DBM Filters are useful for a class of problems where you I<always> want to make the same transformation to all keys, all values or both. For example, consider the following scenario. You have a DBM database that you need to share with a third-party C application. The C application assumes that I<all> keys and values are NULL terminated. Unfortunately when Perl writes to DBM databases it doesn't use NULL termination, so your Perl application will have to manage NULL termination itself. When you write to the database you will have to use something like this: $hash{"$key\0"} = "$value\0"; Similarly the NULL needs to be taken into account when you are considering the length of existing keys/values. It would be much better if you could ignore the NULL terminations issue in the main application code and have a mechanism that automatically added the terminating NULL to all keys and values whenever you write to the database and have them removed when you read from the database. As I'm sure you have already guessed, this is a problem that DBM Filters can fix very easily. use strict; use warnings; use SDBM_File; use Fcntl; my %hash; my $filename = "filt"; unlink $filename; my $db = tie(%hash, 'SDBM_File', $filename, O_RDWR|O_CREAT, 0640) or die "Cannot open $filename: $!\n"; # Install DBM Filters $db->filter_fetch_key ( sub { s/\0$// } ); $db->filter_store_key ( sub { $_ .= "\0" } ); $db->filter_fetch_value( sub { no warnings 'uninitialized'; s/\0$// } ); $db->filter_store_value( sub { $_ .= "\0" } ); $hash{"abc"} = "def"; my $a = $hash{"ABC"}; # ... undef $db; untie %hash; The code above uses SDBM_File, but it will work with any of the DBM modules. Hopefully the contents of each of the filters should be self-explanatory. Both "fetch" filters remove the terminating NULL, and both "store" filters add a terminating NULL. =head2 Another Example: Key is a C int. Here is another real-life example. 
By default, whenever Perl writes to a DBM database it always writes
the key and value as strings. So when you use this:

    $hash{12345} = "something";

the key 12345 will get stored in the DBM database as the 5 byte string
"12345". If you actually want the key to be stored in the DBM database
as a C int, you will have to use C<pack> when writing, and C<unpack>
when reading.

Here is a DBM Filter that does it:

    use strict;
    use warnings;
    use DB_File;
    my %hash;
    my $filename = "filt";
    unlink $filename;

    my $db = tie %hash, 'DB_File', $filename, O_CREAT|O_RDWR, 0666,
        $DB_HASH or die "Cannot open $filename: $!\n";

    $db->filter_fetch_key  ( sub { $_ = unpack("i", $_) } );
    $db->filter_store_key  ( sub { $_ = pack ("i", $_) } );
    $hash{123} = "def";
    # ...
    undef $db;
    untie %hash;

The code above uses DB_File, but again it will work with any of the
DBM modules.

This time only two filters have been used; we only need to manipulate
the contents of the key, so it wasn't necessary to install any value
filters.

=head1 SEE ALSO

L<DB_File>, L<GDBM_File>, L<NDBM_File>, L<ODBM_File> and L<SDBM_File>.

=head1 AUTHOR

Paul Marquess

=head1 NAME

perldiag - various Perl diagnostics

=head1 DESCRIPTION

These messages are classified as follows (listed in increasing order of
desperation):

    (W) A warning (optional).
    (D) A deprecation (enabled by default).
    (S) A severe warning (enabled by default).
    (F) A fatal error (trappable).
    (P) An internal error you should never see (trappable).
    (X) A very fatal error (nontrappable).
    (A) An alien error message (not generated by Perl).

The majority of messages from the first three classifications above
(W, D & S) can be controlled using the C<warnings> pragma.

If a message can be controlled by the C<warnings> pragma, its warning
category is included with the classification letter in the description
below. E.g. C<(W closed)> means a warning in the C<closed> category.
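As a minimal sketch of how a category named in a description maps onto the
C<warnings> pragma, the snippet below provokes a C<(W closed)> message
(C<readline() on closed filehandle>) once with the category enabled and once
with only that category disabled. The in-memory filehandle and the C<@seen>
array are illustrative choices, not anything perldiag prescribes.

```perl
use strict;
use warnings;

# Collect warnings instead of printing them (illustrative handler).
my @seen;
local $SIG{__WARN__} = sub { push @seen, $_[0] };

# An in-memory filehandle keeps the sketch self-contained.
my $data = "some data\n";
open(my $fh, '<', \$data) or die "Cannot open in-memory file: $!";
close($fh);

# The 'closed' category is enabled here, so this read is reported.
my $first = <$fh>;

{
    # Disable only the 'closed' category inside this block.
    no warnings 'closed';
    my $second = <$fh>;
}

printf "%d warning(s) captured\n", scalar @seen;
```

Only the first read is recorded; the second is silenced because the category
shown in parentheses is the same name that C<no warnings> accepts.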
Optional warnings are enabled by using the C<warnings> pragma or the B<-w> and B<-W> switches. Warnings may be captured by setting C<$SIG{__WARN__}> to a reference to a routine that will be called on each warning instead of printing it. See L<perlvar>. Severe warnings are always enabled, unless they are explicitly disabled with the C<warnings> pragma or the B<-X> switch. Trappable errors may be trapped using the eval operator. See L<perlfunc/eval>. In almost all cases, warnings may be selectively disabled or promoted to fatal errors using the C<warnings> pragma. See L<warnings>. The messages are in alphabetical order, without regard to upper or lower-case. Some of these messages are generic. Spots that vary are denoted with a %s or other printf-style escape. These escapes are ignored by the alphabetical order, as are all characters other than letters. To look up your message, just ignore anything that is not a letter. =over 4 =item accept() on closed socket %s (W closed) You tried to do an accept on a closed socket. Did you forget to check the return value of your socket() call? See L<perlfunc/accept>. =item Aliasing via reference is experimental (S experimental::refaliasing) This warning is emitted if you use a reference constructor on the left-hand side of an assignment to alias one variable to another. Simply suppress the warning if you want to use the feature, but know that in doing so you are taking the risk of using an experimental feature which may change or be removed in a future Perl version: no warnings "experimental::refaliasing"; use feature "refaliasing"; \$x = \$y; =item Allocation too large: %x (X) You can't allocate more than 64K on an MS-DOS machine. =item '%c' allowed only after types %s in %s (F) The modifiers '!', '<' and '>' are allowed in pack() or unpack() only after certain types. See L<perlfunc/pack>. =item alpha->numify() is lossy (W numeric) An alpha version can not be numified without losing information. 
=item Ambiguous call resolved as CORE::%s(), qualify as such or use & (W ambiguous) A subroutine you have declared has the same name as a Perl keyword, and you have used the name without qualification for calling one or the other. Perl decided to call the builtin because the subroutine is not imported. To force interpretation as a subroutine call, either put an ampersand before the subroutine name, or qualify the name with its package. Alternatively, you can import the subroutine (or pretend that it's imported with the C<use subs> pragma). To silently interpret it as the Perl operator, use the C<CORE::> prefix on the operator (e.g. C<CORE::log($x)>) or declare the subroutine to be an object method (see L<perlsub/"Subroutine Attributes"> or L<attributes>). =item Ambiguous range in transliteration operator (F) You wrote something like C<tr/a-z-0//> which doesn't mean anything at all. To include a C<-> character in a transliteration, put it either first or last. (In the past, C<tr/a-z-0//> was synonymous with C<tr/a-y//>, which was probably not what you would have expected.) =item Ambiguous use of %s resolved as %s (S ambiguous) You said something that may not be interpreted the way you thought. Normally it's pretty easy to disambiguate it by supplying a missing quote, operator, parenthesis pair or declaration. =item Ambiguous use of -%s resolved as -&%s() (S ambiguous) You wrote something like C<-foo>, which might be the string C<"-foo">, or a call to the function C<foo>, negated. If you meant the string, just write C<"-foo">. If you meant the function call, write C<-foo()>. =item Ambiguous use of %c resolved as operator %c (S ambiguous) C<%>, C<&>, and C<*> are both infix operators (modulus, bitwise and, and multiplication) I<and> initial special characters (denoting hashes, subroutines and typeglobs), and you said something like C<*foo * foo> that might be interpreted as either of them. 
We assumed you meant the infix operator, but please try to make it more clear -- in the example given, you might write C<*foo * foo()> if you really meant to multiply a glob by the result of calling a function. =item Ambiguous use of %c{%s} resolved to %c%s (W ambiguous) You wrote something like C<@{foo}>, which might be asking for the variable C<@foo>, or it might be calling a function named foo, and dereferencing it as an array reference. If you wanted the variable, you can just write C<@foo>. If you wanted to call the function, write C<@{foo()}> ... or you could just not have a variable and a function with the same name, and save yourself a lot of trouble. =item Ambiguous use of %c{%s[...]} resolved to %c%s[...] =item Ambiguous use of %c{%s{...}} resolved to %c%s{...} (W ambiguous) You wrote something like C<${foo[2]}> (where foo represents the name of a Perl keyword), which might be looking for element number 2 of the array named C<@foo>, in which case please write C<$foo[2]>, or you might have meant to pass an anonymous arrayref to the function named foo, and then do a scalar deref on the value it returns. If you meant that, write C<${foo([2])}>. In regular expressions, the C<${foo[2]}> syntax is sometimes necessary to disambiguate between array subscripts and character classes. C</$length[2345]/>, for instance, will be interpreted as C<$length> followed by the character class C<[2345]>. If an array subscript is what you want, you can avoid the warning by changing C</${length[2345]}/> to the unsightly C</${\$length[2345]}/>, by renaming your array to something that does not coincide with a built-in keyword, or by simply turning off warnings with C<no warnings 'ambiguous';>. =item '|' and '<' may not both be specified on command line (F) An error peculiar to VMS. Perl does its own command line redirection, and found that STDIN was a pipe, and that you also tried to redirect STDIN using '<'. Only one STDIN stream to a customer, please. 
=item '|' and '>' may not both be specified on command line (F) An error peculiar to VMS. Perl does its own command line redirection, and thinks you tried to redirect stdout both to a file and into a pipe to another command. You need to choose one or the other, though nothing's stopping you from piping into a program or Perl script which 'splits' output into two streams, such as open(OUT,">$ARGV[0]") or die "Can't write to $ARGV[0]: $!"; while (<STDIN>) { print; print OUT; } close OUT; =item Applying %s to %s will act on scalar(%s) (W misc) The pattern match (C<//>), substitution (C<s///>), and transliteration (C<tr///>) operators work on scalar values. If you apply one of them to an array or a hash, it will convert the array or hash to a scalar value (the length of an array, or the population info of a hash) and then work on that scalar value. This is probably not what you meant to do. See L<perlfunc/grep> and L<perlfunc/map> for alternatives. =item Arg too short for msgsnd (F) msgsnd() requires a string at least as long as sizeof(long). =item Argument "%s" isn't numeric%s (W numeric) The indicated string was fed as an argument to an operator that expected a numeric value instead. If you're fortunate the message will identify which operator was so unfortunate. Note that for the C<Inf> and C<NaN> (infinity and not-a-number) the definition of "numeric" is somewhat unusual: the strings themselves (like "Inf") are considered numeric, and anything following them is considered non-numeric. =item Argument list not closed for PerlIO layer "%s" (W layer) When pushing a layer with arguments onto the Perl I/O system you forgot the ) that closes the argument list. (Layers take care of transforming data between external and internal representations.) Perl stopped parsing the layer list at this point and did not attempt to push this layer. If your program didn't explicitly request the failing operation, it may be the result of the value of the environment variable PERLIO. 
=item Argument "%s" treated as 0 in increment (++)

(W numeric) The indicated string was fed as an argument to
the C<++> operator which expects either a number or a string matching
C</^[a-zA-Z]*[0-9]*\z/>. See L<perlop/Auto-increment and
Auto-decrement> for details.

=item Array passed to stat will be coerced to a scalar%s

(W syntax) You called stat() on an array, but the array will be
coerced to a scalar - the number of elements in the array.

=item A signature parameter must start with '$', '@' or '%'

(F) Each subroutine signature parameter declaration must start with a valid
sigil; for example:

    sub foo ($a, $, $b = 1, @c) {}

=item A slurpy parameter may not have a default value

(F) Only scalar subroutine signature parameters may have a default value;
for example:

    sub foo ($a = 1)        {} # legal
    sub foo (@a = (1))      {} # invalid
    sub foo (%a = (a => b)) {} # invalid

=item assertion botched: %s

(X) The malloc package that comes with Perl had an internal failure.

=item Assertion %s failed: file "%s", line %d

(X) A general assertion failed. The file in question must be examined.

=item Assigned value is not a reference

(F) You tried to assign something that was not a reference to an lvalue
reference (e.g., C<\$x = $y>). If you meant to make $x an alias to $y, use
C<\$x = \$y>.

=item Assigned value is not %s reference

(F) You tried to assign a reference to a reference constructor, but the
two references were not of the same type. You cannot alias a scalar to
an array, or an array to a hash; the two types must match.

    \$x = \@y;  # error
    \@x = \%y;  # error
     $y = [];
    \$x = $y;   # error; did you mean \$y?

=item Assigning non-zero to $[ is no longer possible

(F) When the "array_base" feature is disabled (e.g., under
C<use v5.16;>, and as of Perl 5.30) the special variable C<$[>, which is
deprecated, is now a fixed zero value.
=item Assignment to both a list and a scalar

(F) If you assign to a conditional operator, the 2nd and 3rd arguments
must either both be scalars or both be lists. Otherwise Perl won't
know which context to supply to the right side.

=item Assuming NOT a POSIX class since %s in regex; marked by S<<-- HERE> in m/%s/

(W regexp) You had something like these:

    [[:alnum]]
    [[:digit:xyz]

They look like they might have been meant to be the POSIX classes
C<[:alnum:]> or C<[:digit:]>. If so, they should be written:

    [[:alnum:]]
    [[:digit:]xyz]

Since these aren't legal POSIX class specifications, but are legal
bracketed character classes, Perl treats them as the latter. In the
first example, it matches the characters C<":">, C<"[">, C<"a">, C<"l">,
C<"m">, C<"n">, and C<"u">.

If these weren't meant to be POSIX classes, this warning message is
spurious, and can be suppressed by reordering things, such as

    [[al:num]]

or

    [[:munla]]

=item <> at require-statement should be quotes

(F) You wrote C<< require <file> >> when you should have written
C<require 'file'>.

=item Attempt to access disallowed key '%s' in a restricted hash

(F) The failing code has attempted to get or set a key which is not in
the current set of allowed keys of a restricted hash.

=item Attempt to bless into a freed package

(F) You wrote C<bless $foo> with one argument after somehow causing
the current package to be freed. Perl cannot figure out what to
do, so it throws up its hands in despair.

=item Attempt to bless into a reference

(F) The CLASSNAME argument to the bless() operator is expected to be
the name of the package to bless the resulting object into.
You've supplied instead a reference to something: perhaps you wrote bless $self, $proto; when you intended bless $self, ref($proto) || $proto; If you actually want to bless into the stringified version of the reference supplied, you need to stringify it yourself, for example by: bless $self, "$proto"; =item Attempt to clear deleted array (S debugging) An array was assigned to when it was being freed. Freed values are not supposed to be visible to Perl code. This can also happen if XS code calls C<av_clear> from a custom magic callback on the array. =item Attempt to delete disallowed key '%s' from a restricted hash (F) The failing code attempted to delete from a restricted hash a key which is not in its key set. =item Attempt to delete readonly key '%s' from a restricted hash (F) The failing code attempted to delete a key whose value has been declared readonly from a restricted hash. =item Attempt to free non-arena SV: 0x%x (S internal) All SV objects are supposed to be allocated from arenas that will be garbage collected on exit. An SV was discovered to be outside any of those arenas. =item Attempt to free nonexistent shared string '%s'%s (S internal) Perl maintains a reference-counted internal table of strings to optimize the storage and access of hash keys and other strings. This indicates someone tried to decrement the reference count of a string that can no longer be found in the table. =item Attempt to free temp prematurely: SV 0x%x (S debugging) Mortalized values are supposed to be freed by the free_tmps() routine. This indicates that something else is freeing the SV before the free_tmps() routine gets a chance, which means that the free_tmps() routine will be freeing an unreferenced scalar when it does try to free it. =item Attempt to free unreferenced glob pointers (S internal) The reference counts got screwed up on symbol aliases. 
=item Attempt to free unreferenced scalar: SV 0x%x (S internal) Perl went to decrement the reference count of a scalar to see if it would go to 0, and discovered that it had already gone to 0 earlier, and should have been freed, and in fact, probably was freed. This could indicate that SvREFCNT_dec() was called too many times, or that SvREFCNT_inc() was called too few times, or that the SV was mortalized when it shouldn't have been, or that memory has been corrupted. =item Attempt to pack pointer to temporary value (W pack) You tried to pass a temporary value (like the result of a function, or a computed expression) to the "p" pack() template. This means the result contains a pointer to a location that could become invalid anytime, even before the end of the current statement. Use literals or global values as arguments to the "p" pack() template to avoid this warning. =item Attempt to reload %s aborted. (F) You tried to load a file with C<use> or C<require> that failed to compile once already. Perl will not try to compile this file again unless you delete its entry from %INC. See L<perlfunc/require> and L<perlvar/%INC>. =item Attempt to set length of freed array (W misc) You tried to set the length of an array which has been freed. You can do this by storing a reference to the scalar representing the last index of an array and later assigning through that reference. For example $r = do {my @a; \$#a}; $$r = 503 =item Attempt to use reference as lvalue in substr (W substr) You supplied a reference as the first argument to substr() used as an lvalue, which is pretty strange. Perhaps you forgot to dereference it first. See L<perlfunc/substr>. =item Attribute prototype(%s) discards earlier prototype attribute in same sub (W misc) A sub was declared as sub foo : prototype(A) : prototype(B) {}, for example. Since each sub can only have one prototype, the earlier declaration(s) are discarded while the last one is applied. 
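A minimal sketch of this behaviour, assuming a perl recent enough to support
the C<:prototype(...)> attribute (5.20 or later). The sub name
C<demo_two_protos> and the C<@warnings> array are made up for illustration;
the declaration is compiled with a string C<eval> only so that the
compile-time warning reaches the handler.

```perl
use strict;
use warnings;

# Collect warnings instead of printing them (illustrative handler).
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

# Compile at run time so the compile-time warning is seen by the handler.
my $code = eval q{
    sub demo_two_protos : prototype($) : prototype($$) { return scalar @_ }
    \&demo_two_protos;
};
die $@ if $@;

# Only the last prototype attribute remains in effect.
print "effective prototype: ", prototype($code), "\n";
print "warning: $_" for @warnings;
```

Declaring the prototype once is the straightforward way to avoid the warning.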
=item av_reify called on tied array (S debugging) This indicates that something went wrong and Perl got I<very> confused about C<@_> or C<@DB::args> being tied. =item Bad arg length for %s, is %u, should be %d (F) You passed a buffer of the wrong size to one of msgctl(), semctl() or shmctl(). In C parlance, the correct sizes are, respectively, S<sizeof(struct msqid_ds *)>, S<sizeof(struct semid_ds *)>, and S<sizeof(struct shmid_ds *)>. =item Bad evalled substitution pattern (F) You've used the C</e> switch to evaluate the replacement for a substitution, but perl found a syntax error in the code to evaluate, most likely an unexpected right brace '}'. =item Bad filehandle: %s (F) A symbol was passed to something wanting a filehandle, but the symbol has no filehandle associated with it. Perhaps you didn't do an open(), or did it in another package. =item Bad free() ignored (S malloc) An internal routine called free() on something that had never been malloc()ed in the first place. Mandatory, but can be disabled by setting environment variable C<PERL_BADFREE> to 0. This message can be seen quite often with DB_File on systems with "hard" dynamic linking, like C<AIX> and C<OS/2>. It is a bug of C<Berkeley DB> which is left unnoticed if C<DB> uses I<forgiving> system malloc(). =item Bad hash (P) One of the internal hash routines was passed a null HV pointer. =item Badly placed ()'s (A) You've accidentally run your script through B<csh> instead of Perl. Check the #! line, or manually feed your script into Perl yourself. =item Bad name after %s (F) You started to name a symbol by using a package prefix, and then didn't finish the symbol. In particular, you can't interpolate outside of quotes, so $var = 'myvar'; $sym = mypack::$var; is not the same as $var = 'myvar'; $sym = "mypack::$var"; =item Bad plugin affecting keyword '%s' (F) An extension using the keyword plugin mechanism violated the plugin API. 
=item Bad realloc() ignored (S malloc) An internal routine called realloc() on something that had never been malloc()ed in the first place. Mandatory, but can be disabled by setting the environment variable C<PERL_BADFREE> to 1. =item Bad symbol for array (P) An internal request asked to add an array entry to something that wasn't a symbol table entry. =item Bad symbol for dirhandle (P) An internal request asked to add a dirhandle entry to something that wasn't a symbol table entry. =item Bad symbol for filehandle (P) An internal request asked to add a filehandle entry to something that wasn't a symbol table entry. =item Bad symbol for hash (P) An internal request asked to add a hash entry to something that wasn't a symbol table entry. =item Bad symbol for scalar (P) An internal request asked to add a scalar entry to something that wasn't a symbol table entry. =item Bareword found in conditional (W bareword) The compiler found a bareword where it expected a conditional, which often indicates that an || or && was parsed as part of the last argument of the previous construct, for example: open FOO || die; It may also indicate a misspelled constant that has been interpreted as a bareword: use constant TYPO => 1; if (TYOP) { print "foo" } The C<strict> pragma is useful in avoiding such errors. =item Bareword in require contains "%s" =item Bareword in require maps to disallowed filename "%s" =item Bareword in require maps to empty filename (F) The bareword form of require has been invoked with a filename which could not have been generated by a valid bareword permitted by the parser. You shouldn't be able to get this error from Perl code, but XS code may throw it if it passes an invalid module name to C<Perl_load_module>. =item Bareword in require must not start with a double-colon: "%s" (F) In C<require Bare::Word>, the bareword is not allowed to start with a double-colon. Write C<require ::Foo::Bar> as C<require Foo::Bar> instead. 
=item Bareword "%s" not allowed while "strict subs" in use (F) With "strict subs" in use, a bareword is only allowed as a subroutine identifier, in curly brackets or to the left of the "=>" symbol. Perhaps you need to predeclare a subroutine? =item Bareword "%s" refers to nonexistent package (W bareword) You used a qualified bareword of the form C<Foo::>, but the compiler saw no other uses of that namespace before that point. Perhaps you need to predeclare a package? =item BEGIN failed--compilation aborted (F) An untrapped exception was raised while executing a BEGIN subroutine. Compilation stops immediately and the interpreter is exited. =item BEGIN not safe after errors--compilation aborted (F) Perl found a C<BEGIN {}> subroutine (or a C<use> directive, which implies a C<BEGIN {}>) after one or more compilation errors had already occurred. Since the intended environment for the C<BEGIN {}> could not be guaranteed (due to the errors), and since subsequent code likely depends on its correct operation, Perl just gave up. =item \%d better written as $%d (W syntax) Outside of patterns, backreferences live on as variables. The use of backslashes is grandfathered on the right-hand side of a substitution, but stylistically it's better to use the variable form because other Perl programmers will expect it, and it works better if there are more than 9 backreferences. =item Binary number > 0b11111111111111111111111111111111 non-portable (W portable) The binary number you specified is larger than 2**32-1 (4294967295) and therefore non-portable between systems. See L<perlport> for more on portability concerns. =item bind() on closed socket %s (W closed) You tried to do a bind on a closed socket. Did you forget to check the return value of your socket() call? See L<perlfunc/bind>. =item binmode() on closed filehandle %s (W unopened) You tried binmode() on a filehandle that was never opened. Check your control flow and number of arguments. 
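One way to avoid this warning is to call binmode() only after open() has
succeeded. The following is a minimal sketch under that assumption; the helper
name C<read_binary> and the temporary file are hypothetical and exist only to
make the example self-contained.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: returns the file's raw bytes, or undef if the
# file cannot be opened.
sub read_binary {
    my ($path) = @_;
    open(my $fh, '<', $path) or return;   # binmode() is never reached on failure
    binmode($fh) or die "binmode failed on $path: $!";
    local $/;                              # slurp mode
    my $bytes = <$fh>;
    close($fh);
    return $bytes;
}

# Create a small file to read back.
my ($tmp_fh, $tmp_name) = tempfile(UNLINK => 1);
binmode($tmp_fh);
print {$tmp_fh} "a\r\nb";
close($tmp_fh) or die "close failed: $!";

my $bytes   = read_binary($tmp_name);
my $missing = read_binary("$tmp_name.does-not-exist");
```

Because the helper returns as soon as open() fails, binmode() only ever sees a
handle that was actually opened.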
=item Bit vector size > 32 non-portable (W portable) Using bit vector sizes larger than 32 is non-portable. =item Bizarre copy of %s (P) Perl detected an attempt to copy an internal value that is not copiable. =item Bizarre SvTYPE [%d] (P) When starting a new thread or returning values from a thread, Perl encountered an invalid data type. =item Both or neither range ends should be Unicode in regex; marked by S<<-- HERE> in m/%s/ (W regexp) (only under C<S<use re 'strict'>> or within C<(?[...])>) In a bracketed character class in a regular expression pattern, you had a range which has exactly one end of it specified using C<\N{}>, and the other end is specified using a non-portable mechanism. Perl treats the range as a Unicode range, that is, all the characters in it are considered to be the Unicode characters, and which may be different code points on some platforms Perl runs on. For example, C<[\N{U+06}-\x08]> is treated as if you had instead said C<[\N{U+06}-\N{U+08}]>, that is it matches the characters whose code points in Unicode are 6, 7, and 8. But that C<\x08> might indicate that you meant something different, so the warning gets raised. =item Buffer overflow in prime_env_iter: %s (W internal) A warning peculiar to VMS. While Perl was preparing to iterate over %ENV, it encountered a logical name or symbol definition which was too long, so it was truncated to the string shown. =item Callback called exit (F) A subroutine invoked from an external package via call_sv() exited by calling exit. =item %s() called too early to check prototype (W prototype) You've called a function that has a prototype before the parser saw a definition or declaration for it, and Perl could not check that the call conforms to the prototype. You need to either add an early prototype declaration for the subroutine in question, or move the subroutine definition ahead of the call to get proper prototype checking. 
Alternatively, if you are certain that you're calling the function
correctly, you may put an ampersand before the name to avoid the
warning. See L<perlsub>.

=item Cannot chr %f

(F) You passed an invalid number (like an infinity or not-a-number) to C<chr>.

=item Cannot complete in-place edit of %s: %s

(F) Your perl script appears to have changed directory while
performing an in-place edit of a file specified by a relative path,
and your system doesn't include the directory relative POSIX functions
needed to handle that.

=item Cannot compress %f in pack

(F) You tried compressing an infinity or not-a-number as an unsigned
integer with BER, which makes no sense.

=item Cannot compress integer in pack

(F) An argument to pack("w",...) was too large to compress.
The BER compressed integer format can only be used with positive
integers, and you attempted to compress a very large number (> 1e308).
See L<perlfunc/pack>.

=item Cannot compress negative numbers in pack

(F) An argument to pack("w",...) was negative. The BER compressed integer
format can only be used with positive integers. See L<perlfunc/pack>.

=item Cannot convert a reference to %s to typeglob

(F) You manipulated Perl's symbol table directly, stored a reference
in it, then tried to access that symbol via conventional Perl syntax.
The access triggers Perl to autovivify that typeglob, but there is
no legal conversion from that type of reference to a typeglob.

=item Cannot copy to %s

(P) Perl detected an attempt to copy a value to an internal type that cannot
be directly assigned to.

=item Cannot find encoding "%s"

(S io) You tried to apply an encoding that did not exist to a filehandle,
either with open() or binmode().

=item Cannot open %s as a dirhandle: it is already open as a filehandle

(F) You tried to use opendir() to associate a dirhandle to a symbol (glob
or scalar) that already holds a filehandle. Since this idiom might render
your code confusing, it was deprecated in Perl 5.10.
As of Perl 5.28, it is a fatal error. =item Cannot open %s as a filehandle: it is already open as a dirhandle (F) You tried to use open() to associate a filehandle to a symbol (glob or scalar) that already holds a dirhandle. Since this idiom might render your code confusing, it was deprecated in Perl 5.10. As of Perl 5.28, it is a fatal error. =item Cannot pack %f with '%c' (F) You tried converting an infinity or not-a-number to an integer, which makes no sense. =item Cannot printf %f with '%c' (F) You tried printing an infinity or not-a-number as a character (%c), which makes no sense. Maybe you meant '%s', or just stringifying it? =item Cannot set tied @DB::args (F) C<caller> tried to set C<@DB::args>, but found it tied. Tying C<@DB::args> is not supported. (Before this error was added, it used to crash.) =item Cannot tie unreifiable array (P) You somehow managed to call C<tie> on an array that does not keep a reference count on its arguments and cannot be made to do so. Such arrays are not even supposed to be accessible to Perl code, but are only used internally. =item Cannot yet reorder sv_vcatpvfn() arguments from va_list (F) Some XS code tried to use C<sv_vcatpvfn()> or a related function with a format string that specifies explicit indexes for some of the elements, and using a C-style variable-argument list (a C<va_list>). This is not currently supported. XS authors wanting to do this must instead construct a C array of C<SV*> scalars containing the arguments. =item Can only compress unsigned integers in pack (F) An argument to pack("w",...) was not an integer. The BER compressed integer format can only be used with positive integers, and you attempted to compress something else. See L<perlfunc/pack>. =item Can't bless non-reference value (F) Only hard references may be blessed. This is how Perl "enforces" encapsulation of objects. See L<perlobj>. 
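Because this is an (F) error, it can be trapped with C<eval>. A minimal
sketch, with C<My::Class> as a made-up package name:

```perl
use strict;
use warnings;

my $plain = 'just a string';    # not a reference

# bless() dies here; eval traps the (F) error and leaves the message in $@.
my $ok  = eval { bless $plain, 'My::Class'; 1 };
my $err = $@;

# Blessing a real (hard) reference succeeds.
my $obj = bless {}, 'My::Class';

print "trapped: $err" unless $ok;
print "blessed into: ", ref($obj), "\n";
```

Copying C<$@> into C<$err> right after the C<eval> keeps the message from
being overwritten by later code.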
=item Can't "break" in a loop topicalizer

(F) You called C<break>, but you're in a C<foreach> block rather than
a C<given> block. You probably meant to use C<next> or C<last>.

=item Can't "break" outside a given block

(F) You called C<break>, but you're not inside a C<given> block.

=item Can't call method "%s" on an undefined value

(F) You used the syntax of a method call, but the slot filled by the
object reference or package name contains an undefined value. Something
like this will reproduce the error:

    $BADREF = undef;
    process $BADREF 1,2,3;
    $BADREF->process(1,2,3);

=item Can't call method "%s" on unblessed reference

(F) A method call must know in what package it's supposed to run. It
ordinarily finds this out from the object reference you supply, but you
didn't supply an object reference in this case. A reference isn't an
object reference until it has been blessed. See L<perlobj>.

=item Can't call method "%s" without a package or object reference

(F) You used the syntax of a method call, but the slot filled by the
object reference or package name contains an expression that returns a
defined value which is neither an object reference nor a package name.
Something like this will reproduce the error:

    $BADREF = 42;
    process $BADREF 1,2,3;
    $BADREF->process(1,2,3);

=item Can't call mro_isa_changed_in() on anonymous symbol table

(P) Perl got confused as to whether a hash was a plain hash or a
symbol table hash when trying to update @ISA caches.

=item Can't call mro_method_changed_in() on anonymous symbol table

(F) An XS module tried to call C<mro_method_changed_in> on a hash that was
not attached to the symbol table.

=item Can't chdir to %s

(F) You called C<perl -x/foo/bar>, but F</foo/bar> is not a directory
that you can chdir to, possibly because it doesn't exist.

=item Can't check filesystem of script "%s" for nosuid

(P) For some reason you can't check the filesystem of the script for
nosuid.
=item Can't coerce %s to %s in %s (F) Certain types of SVs, in particular real symbol table entries (typeglobs), can't be forced to stop being what they are. So you can't say things like: *foo += 1; You CAN say $foo = *foo; $foo += 1; but then $foo no longer contains a glob. =item Can't "continue" outside a when block (F) You called C<continue>, but you're not inside a C<when> or C<default> block. =item Can't create pipe mailbox (P) An error peculiar to VMS. The process is suffering from exhausted quotas or other plumbing problems. =item Can't declare %s in "%s" (F) Only scalar, array, and hash variables may be declared as "my", "our" or "state" variables. They must have ordinary identifiers as names. =item Can't "default" outside a topicalizer (F) You have used a C<default> block that is neither inside a C<foreach> loop nor a C<given> block. (Note that this error is issued on exit from the C<default> block, so you won't get the error if you use an explicit C<continue>.) =item Can't determine class of operator %s, assuming BASEOP (S) This warning indicates something wrong in the internals of perl. Perl was trying to find the class (e.g. LISTOP) of a particular OP, and was unable to do so. This is likely to be due to a bug in the perl internals, or due to a bug in XS code which manipulates perl optrees. =item Can't do inplace edit: %s is not a regular file (S inplace) You tried to use the B<-i> switch on a special file, such as a file in /dev, a FIFO or an uneditable directory. The file was ignored. =item Can't do inplace edit on %s: %s (S inplace) The creation of the new file failed for the indicated reason. =item Can't do inplace edit: %s would not be unique (S inplace) Your filesystem does not support filenames longer than 14 characters and Perl was unable to create a unique filename during inplace editing with the B<-i> switch. The file was ignored. =item Can't do %s("%s") on non-UTF-8 locale; resolved to "%s". 
(W locale) You are 1) running under "C<use locale>"; 2) the current locale is not a UTF-8 one; 3) you tried to do the designated case-change operation on the specified Unicode character; and 4) the result of this operation would mix Unicode and locale rules, which likely conflict. Mixing of different rule types is forbidden, so the operation was not done; instead the result is the indicated value, which is the best available that uses entirely Unicode rules. That turns out to almost always be the original character, unchanged. It is generally a bad idea to mix non-UTF-8 locales and Unicode, and this issue is one of the reasons why. This warning is raised when Unicode rules would normally cause the result of this operation to contain a character that is in the range specified by the locale, 0..255, and hence is subject to the locale's rules, not Unicode's. If you are using locale purely for its characteristics related to things like its numeric and time formatting (and not C<LC_CTYPE>), consider using a restricted form of the locale pragma (see L<perllocale/The "use locale" pragma>) like "S<C<use locale ':not_characters'>>". Note that failed case-changing operations done as a result of case-insensitive C</i> regular expression matching will show up in this warning as having the C<fc> operation (as that is what the regular expression engine calls behind the scenes.) =item Can't do waitpid with flags (F) This machine doesn't have either waitpid() or wait4(), so only waitpid() without flags is emulated. =item Can't emulate -%s on #! line (F) The #! line specifies a switch that doesn't make sense at this point. For example, it'd be kind of silly to put a B<-x> on the #! line. =item Can't %s %s-endian %ss on this platform (F) Your platform's byte-order is neither big-endian nor little-endian, or it has a very strange pointer size. Packing and unpacking big- or little-endian floating point values and pointers may not be possible. See L<perlfunc/pack>. 
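The byte-order modifiers that the C<Can't %s %s-endian %ss on this platform> entry refers to can be sketched as follows. The values packed are arbitrary, and the floating point calls are wrapped in C<eval> on the assumption that they could fail on an unusual platform; on common platforms they are expected to succeed.

```perl
use strict;
use warnings;

# Integer formats accept the byte-order modifiers on every platform.
my $big    = pack('L>', 1);    # same bytes as the "N" format
my $little = pack('L<', 1);    # same bytes as the "V" format

# Floating point formats may croak on unusual platforms, so trap the error.
my $float_be = eval { pack('d>', 1.5) };
my $float_le = eval { pack('d<', 1.5) };

printf "L> %s\nL< %s\n", unpack('H*', $big), unpack('H*', $little);

if (defined $float_be && defined $float_le) {
    printf "d> %s\nd< %s\n", unpack('H*', $float_be), unpack('H*', $float_le);
}
else {
    print "float byte-order modifiers unsupported here: $@";
}
```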
=item Can't exec "%s": %s (W exec) A system(), exec(), or piped open call could not execute the named program for the indicated reason. Typical reasons include: the permissions were wrong on the file, the file wasn't found in C<$ENV{PATH}>, the executable in question was compiled for another architecture, or the #! line in a script points to an interpreter that can't be run for similar reasons. (Or maybe your system doesn't support #! at all.) =item Can't exec %s (F) Perl was trying to execute the indicated program for you because that's what the #! line said. If that's not what you wanted, you may need to mention "perl" on the #! line somewhere. =item Can't execute %s (F) You used the B<-S> switch, but the copies of the script to execute found in the PATH did not have correct permissions. =item Can't find an opnumber for "%s" (F) A string of a form C<CORE::word> was given to prototype(), but there is no builtin with the name C<word>. =item Can't find label %s (F) You said to goto a label that isn't mentioned anywhere that it's possible for us to go to. See L<perlfunc/goto>. =item Can't find %s on PATH (F) You used the B<-S> switch, but the script to execute could not be found in the PATH. =item Can't find %s on PATH, '.' not in PATH (F) You used the B<-S> switch, but the script to execute could not be found in the PATH, or at least not with the correct permissions. The script exists in the current directory, but PATH prohibits running it. =item Can't find string terminator %s anywhere before EOF (F) Perl strings can stretch over multiple lines. This message means that the closing delimiter was omitted. Because bracketed quotes count nesting levels, the following is missing its final parenthesis: print q(The character '(' starts a side comment.); If you're getting this error from a here-document, you may have included unseen whitespace before or after your closing tag or there may not be a linebreak after it. 
A good programmer's editor will have a way to help you find these characters (or lack of characters). See L<perlop> for the full details on here-documents. =item Can't find Unicode property definition "%s" =item Can't find Unicode property definition "%s" in regex; marked by <-- HERE in m/%s/ (F) The named property which you specified via C<\p> or C<\P> is not one known to Perl. Perhaps you misspelled the name? See L<perluniprops/Properties accessible through \p{} and \P{}> for a complete list of available official properties. If it is a L<user-defined property|perlunicode/User-Defined Character Properties> it must have been defined by the time the regular expression is matched. If you didn't mean to use a Unicode property, escape the C<\p>, either by C<\\p> (just the C<\p>) or by C<\Q\p> (the rest of the string, or until C<\E>). =item Can't fork: %s (F) A fatal error occurred while trying to fork while opening a pipeline. =item Can't fork, trying again in 5 seconds (W pipe) A fork in a piped open failed with EAGAIN and will be retried after five seconds. =item Can't get filespec - stale stat buffer? (S) A warning peculiar to VMS. This arises because of the difference between access checks under VMS and under the Unix model Perl assumes. Under VMS, access checks are done by filename, rather than by bits in the stat buffer, so that ACLs and other protections can be taken into account. Unfortunately, Perl assumes that the stat buffer contains all the necessary information, and passes it, instead of the filespec, to the access-checking routine. It will try to retrieve the filespec using the device name and FID present in the stat buffer, but this works only if you haven't made a subsequent call to the CRTL stat() routine, because the device name is overwritten with each call. If this warning appears, the name lookup failed, and the access-checking routine gave up and returned FALSE, just to be conservative. 
(Note: The access-checking routine knows about the Perl C<stat> operator and file tests, so you shouldn't ever see this warning in response to a Perl command; it arises only if some internal code takes stat buffers lightly.)

=item Can't get pipe mailbox device name

(P) An error peculiar to VMS. After creating a mailbox to act as a pipe, Perl can't retrieve its name for later use.

=item Can't get SYSGEN parameter value for MAXBUF

(P) An error peculiar to VMS. Perl asked $GETSYI how big you want your mailbox buffers to be, and didn't get an answer.

=item Can't "goto" into a binary or list expression

(F) A "goto" statement was executed to jump into the middle of a binary or list expression. You can't get there from here. The reason for this restriction is that the interpreter would get confused as to how many arguments there are, resulting in stack corruption or crashes. This error occurs in cases such as these:

    goto F;
    print do { F: }; # Can't jump into the arguments to print

    goto G;
    $x + do { G: $y }; # How is + supposed to get its first operand?

=item Can't "goto" into a "given" block

(F) A "goto" statement was executed to jump into the middle of a C<given> block. You can't get there from here. See L<perlfunc/goto>.

=item Can't "goto" into the middle of a foreach loop

(F) A "goto" statement was executed to jump into the middle of a foreach loop. You can't get there from here. See L<perlfunc/goto>.

=item Can't "goto" out of a pseudo block

(F) A "goto" statement was executed to jump out of what might look like a block, except that it isn't a proper block. This usually occurs if you tried to jump out of a sort() block or subroutine, which is a no-no. See L<perlfunc/goto>.

=item Can't goto subroutine from an eval-%s

(F) The "goto subroutine" call can't be used to jump out of an eval "string" or block.
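A minimal sketch of the C<eval> restriction just described: the first wrapper attempts C<goto &target> from inside a block C<eval> and captures the error, while the second makes its decision inside the C<eval> and jumps only after the block has finished. The subroutine names are hypothetical and exist only for this example.

```perl
use strict;
use warnings;

sub target { return "target got: @_" }

# Fails: the goto would have to leave the eval block first.
sub bad_wrapper {
    my $ok = eval { goto &target; 1 };
    return $ok ? 'no error' : $@;
}

# Works: decide inside the eval, jump after it has finished.
sub good_wrapper {
    my $ready = eval { 1 };
    goto &target if $ready;
    return;
}

my $error  = bad_wrapper('a');
my $result = good_wrapper('b', 'c');

print $error, $result, "\n";
```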
=item Can't goto subroutine from a sort sub (or similar callback) (F) The "goto subroutine" call can't be used to jump out of the comparison sub for a sort(), or from a similar callback (such as the reduce() function in List::Util). =item Can't goto subroutine outside a subroutine (F) The deeply magical "goto subroutine" call can only replace one subroutine call for another. It can't manufacture one out of whole cloth. In general you should be calling it out of only an AUTOLOAD routine anyway. See L<perlfunc/goto>. =item Can't ignore signal CHLD, forcing to default (W signal) Perl has detected that it is being run with the SIGCHLD signal (sometimes known as SIGCLD) disabled. Since disabling this signal will interfere with proper determination of exit status of child processes, Perl has reset the signal to its default value. This situation typically indicates that the parent program under which Perl may be running (e.g. cron) is being very careless. =item Can't kill a non-numeric process ID (F) Process identifiers must be (signed) integers. It is a fatal error to attempt to kill() an undefined, empty-string or otherwise non-numeric process identifier. =item Can't "last" outside a loop block (F) A "last" statement was executed to break out of the current block, except that there's this itty bitty problem called there isn't a current block. Note that an "if" or "else" block doesn't count as a "loopish" block, as doesn't a block given to sort(), map() or grep(). You can usually double the curlies to get the same effect though, because the inner curlies will be considered a block that loops once. See L<perlfunc/last>. =item Can't linearize anonymous symbol table (F) Perl tried to calculate the method resolution order (MRO) of a package, but failed because the package stash has no name. =item Can't load '%s' for module %s (F) The module you tried to load failed to load a dynamic extension. 
This may either mean that you upgraded your version of perl to one that is incompatible with your old dynamic extensions (which is known to happen between major versions of perl), or (more likely) that your dynamic extension was built against an older version of the library that is installed on your system. You may need to rebuild your old dynamic extensions. =item Can't localize lexical variable %s (F) You used local on a variable name that was previously declared as a lexical variable using "my" or "state". This is not allowed. If you want to localize a package variable of the same name, qualify it with the package name. =item Can't localize through a reference (F) You said something like C<local $$ref>, which Perl can't currently handle, because when it goes to restore the old value of whatever $ref pointed to after the scope of the local() is finished, it can't be sure that $ref will still be a reference. =item Can't locate %s (F) You said to C<do> (or C<require>, or C<use>) a file that couldn't be found. Perl looks for the file in all the locations mentioned in @INC, unless the file name included the full path to the file. Perhaps you need to set the PERL5LIB or PERL5OPT environment variable to say where the extra library is, or maybe the script needs to add the library name to @INC. Or maybe you just misspelled the name of the file. See L<perlfunc/require> and L<lib>. =item Can't locate auto/%s.al in @INC (F) A function (or method) was called in a package which allows autoload, but there is no function to autoload. Most probable causes are a misprint in a function/method name or a failure to C<AutoSplit> the file, say, by doing C<make install>. =item Can't locate loadable object for module %s in @INC (F) The module you loaded is trying to load an external library, like for example, F<foo.so> or F<bar.dll>, but the L<DynaLoader> module was unable to locate this library. See L<DynaLoader>. 
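For the C<Can't locate %s> entry above, one common pattern is to trap the failure of C<require> and inspect the message. The sketch below assumes that the module named in C<$module> is not installed; the name is invented for this example.

```perl
use strict;
use warnings;

# Hypothetical module name that is assumed not to be installed.
my $module = 'Acme::Surely::Not::Installed';

# Convert Foo::Bar to Foo/Bar.pm, the form that require searches @INC for.
(my $file = "$module.pm") =~ s{::}{/}g;

my $loaded = eval { require $file; 1 };
my $error  = $loaded ? '' : $@;

if (!$loaded && $error =~ /^Can't locate \Q$file\E/) {
    print "$module is missing; searched: @INC\n";
}
```

If the module lives outside the default search path, C<use lib> or the PERL5LIB environment variable adds the directory to @INC, as the entry notes.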
=item Can't locate object method "%s" via package "%s" (F) You called a method correctly, and it correctly indicated a package functioning as a class, but that package doesn't define that particular method, nor does any of its base classes. See L<perlobj>. =item Can't locate object method "%s" via package "%s" (perhaps you forgot to load "%s"?) (F) You called a method on a class that did not exist, and the method could not be found in UNIVERSAL. This often means that a method requires a package that has not been loaded. =item Can't locate package %s for @%s::ISA (W syntax) The @ISA array contained the name of another package that doesn't seem to exist. =item Can't locate PerlIO%s (F) You tried to use in open() a PerlIO layer that does not exist, e.g. open(FH, ">:nosuchlayer", "somefile"). =item Can't make list assignment to %ENV on this system (F) List assignment to %ENV is not supported on some systems, notably VMS. =item Can't make loaded symbols global on this platform while loading %s (S) A module passed the flag 0x01 to DynaLoader::dl_load_file() to request that symbols from the stated file are made available globally within the process, but that functionality is not available on this platform. Whilst the module likely will still work, this may prevent the perl interpreter from loading other XS-based extensions which need to link directly to functions defined in the C or XS code in the stated file. =item Can't modify %s in %s (F) You aren't allowed to assign to the item indicated, or otherwise try to change it, such as with an auto-increment. =item Can't modify nonexistent substring (P) The internal routine that does assignment to a substr() was handed a NULL. =item Can't modify non-lvalue subroutine call of &%s =item Can't modify non-lvalue subroutine call of &%s in %s (F) Subroutines meant to be used in lvalue context should be declared as such. See L<perlsub/"Lvalue subroutines">. 
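The difference between an lvalue subroutine and an ordinary one, which the C<Can't modify non-lvalue subroutine call> entry turns on, might be sketched like this. The names C<slot> and C<plain> are placeholders, and the failing assignment is compiled through a string C<eval> so that the error can be captured rather than aborting the script.

```perl
use strict;
use warnings;

my $stored = 0;

# Declared with the :lvalue attribute, so the call can be assigned to.
sub slot :lvalue { $stored }

# An ordinary subroutine; assigning to a call of it is an error.
sub plain { $stored }

slot() = 5;

# The string eval delays compilation so the error can be trapped.
my $ok    = eval q{ plain() = 7; 1 };
my $error = $ok ? '' : $@;

print "stored = $stored\n", $error;
```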
=item Can't modify reference to %s in %s assignment (F) Only a limited number of constructs can be used as the argument to a reference constructor on the left-hand side of an assignment, and what you used was not one of them. See L<perlref/Assigning to References>. =item Can't modify reference to localized parenthesized array in list assignment (F) Assigning to C<\local(@array)> or C<\(local @array)> is not supported, as it is not clear exactly what it should do. If you meant to make @array refer to some other array, use C<\@array = \@other_array>. If you want to make the elements of @array aliases of the scalars referenced on the right-hand side, use C<\(@array) = @scalar_refs>. =item Can't modify reference to parenthesized hash in list assignment (F) Assigning to C<\(%hash)> is not supported. If you meant to make %hash refer to some other hash, use C<\%hash = \%other_hash>. If you want to make the elements of %hash into aliases of the scalars referenced on the right-hand side, use a hash slice: C<\@hash{@keys} = @those_scalar_refs>. =item Can't msgrcv to read-only var (F) The target of a msgrcv must be modifiable to be used as a receive buffer. =item Can't "next" outside a loop block (F) A "next" statement was executed to reiterate the current block, but there isn't a current block. Note that an "if" or "else" block doesn't count as a "loopish" block, as doesn't a block given to sort(), map() or grep(). You can usually double the curlies to get the same effect though, because the inner curlies will be considered a block that loops once. See L<perlfunc/next>. =item Can't open %s: %s (S inplace) The implicit opening of a file through use of the C<< <> >> filehandle, either implicitly under the C<-n> or C<-p> command-line switches, or explicitly, failed for the indicated reason. Usually this is because you don't have read permission for a file which you named on the command line. 
(F) You tried to call perl with the B<-e> switch, but F</dev/null> (or your operating system's equivalent) could not be opened. =item Can't open a reference (W io) You tried to open a scalar reference for reading or writing, using the 3-arg open() syntax: open FH, '>', $ref; but your version of perl is compiled without perlio, and this form of open is not supported. =item Can't open bidirectional pipe (W pipe) You tried to say C<open(CMD, "|cmd|")>, which is not supported. You can try any of several modules in the Perl library to do this, such as IPC::Open2. Alternately, direct the pipe's output to a file using ">", and then read it in under a different file handle. =item Can't open error file %s as stderr (F) An error peculiar to VMS. Perl does its own command line redirection, and couldn't open the file specified after '2>' or '2>>' on the command line for writing. =item Can't open input file %s as stdin (F) An error peculiar to VMS. Perl does its own command line redirection, and couldn't open the file specified after '<' on the command line for reading. =item Can't open output file %s as stdout (F) An error peculiar to VMS. Perl does its own command line redirection, and couldn't open the file specified after '>' or '>>' on the command line for writing. =item Can't open output pipe (name: %s) (P) An error peculiar to VMS. Perl does its own command line redirection, and couldn't open the pipe into which to send data destined for stdout. =item Can't open perl script "%s": %s (F) The script you specified can't be opened for the indicated reason. If you're debugging a script that uses #!, and normally relies on the shell's $PATH search, the -S option causes perl to do that search, so you don't have to type the path or C<`which $scriptname`>. =item Can't read CRTL environ (S) A warning peculiar to VMS. Perl tried to read an element of %ENV from the CRTL's internal environment array and discovered the array was missing. 
You need to figure out where your CRTL misplaced its environ or define F<PERL_ENV_TABLES> (see L<perlvms>) so that environ is not searched. =item Can't redeclare "%s" in "%s" (F) A "my", "our" or "state" declaration was found within another declaration, such as C<my ($x, my($y), $z)> or C<our (my $x)>. =item Can't "redo" outside a loop block (F) A "redo" statement was executed to restart the current block, but there isn't a current block. Note that an "if" or "else" block doesn't count as a "loopish" block, as doesn't a block given to sort(), map() or grep(). You can usually double the curlies to get the same effect though, because the inner curlies will be considered a block that loops once. See L<perlfunc/redo>. =item Can't remove %s: %s, skipping file (S inplace) You requested an inplace edit without creating a backup file. Perl was unable to remove the original file to replace it with the modified file. The file was left unmodified. =item Can't rename in-place work file '%s' to '%s': %s (F) When closed implicitly, the temporary file for in-place editing couldn't be renamed to the original filename. =item Can't rename %s to %s: %s, skipping file (F) The rename done by the B<-i> switch failed for some reason, probably because you don't have write permission to the directory. =item Can't reopen input pipe (name: %s) in binary mode (P) An error peculiar to VMS. Perl thought stdin was a pipe, and tried to reopen it to accept binary data. Alas, it failed. =item Can't represent character for Ox%X on this platform (F) There is a hard limit to how big a character code point can be due to the fundamental properties of UTF-8, especially on EBCDIC platforms. The given code point exceeds that. The only work-around is to not use such a large code point. =item Can't reset %ENV on this system (F) You called C<reset('E')> or similar, which tried to reset all variables in the current package beginning with "E". In the main package, that includes %ENV. 
Resetting %ENV is not supported on some systems, notably VMS. =item Can't resolve method "%s" overloading "%s" in package "%s" (F)(P) Error resolving overloading specified by a method name (as opposed to a subroutine reference): no such method callable via the package. If the method name is C<???>, this is an internal error. =item Can't return %s from lvalue subroutine (F) Perl detected an attempt to return illegal lvalues (such as temporary or readonly values) from a subroutine used as an lvalue. This is not allowed. =item Can't return outside a subroutine (F) The return statement was executed in mainline code, that is, where there was no subroutine call to return out of. See L<perlsub>. =item Can't return %s to lvalue scalar context (F) You tried to return a complete array or hash from an lvalue subroutine, but you called the subroutine in a way that made Perl think you meant to return only one value. You probably meant to write parentheses around the call to the subroutine, which tell Perl that the call should be in list context. =item Can't stat script "%s" (P) For some reason you can't fstat() the script even though you have it open already. Bizarre. =item Can't take log of %g (F) For ordinary real numbers, you can't take the logarithm of a negative number or zero. There's a Math::Complex package that comes standard with Perl, though, if you really want to do that for the negative numbers. =item Can't take sqrt of %g (F) For ordinary real numbers, you can't take the square root of a negative number. There's a Math::Complex package that comes standard with Perl, though, if you really want to do that. =item Can't undef active subroutine (F) You can't undefine a routine that's currently running. You can, however, redefine it while it's running, and you can even undef the redefined subroutine while the old routine is running. Go figure. =item Can't unweaken a nonreference (F) You attempted to unweaken something that was not a reference. 
Only references can be unweakened. =item Can't upgrade %s (%d) to %d (P) The internal sv_upgrade routine adds "members" to an SV, making it into a more specialized kind of SV. The top several SV types are so specialized, however, that they cannot be interconverted. This message indicates that such a conversion was attempted. =item Can't use '%c' after -mname (F) You tried to call perl with the B<-m> switch, but you put something other than "=" after the module name. =item Can't use a hash as a reference (F) You tried to use a hash as a reference, as in C<< %foo->{"bar"} >> or C<< %$ref->{"hello"} >>. Versions of perl <= 5.22.0 used to allow this syntax, but shouldn't have. This was deprecated in perl 5.6.1. =item Can't use an array as a reference (F) You tried to use an array as a reference, as in C<< @foo->[23] >> or C<< @$ref->[99] >>. Versions of perl <= 5.22.0 used to allow this syntax, but shouldn't have. This was deprecated in perl 5.6.1. =item Can't use anonymous symbol table for method lookup (F) The internal routine that does method lookup was handed a symbol table that doesn't have a name. Symbol tables can become anonymous for example by undefining stashes: C<undef %Some::Package::>. =item Can't use an undefined value as %s reference (F) A value used as either a hard reference or a symbolic reference must be a defined value. This helps to delurk some insidious errors. =item Can't use bareword ("%s") as %s ref while "strict refs" in use (F) Only hard references are allowed by "strict refs". Symbolic references are disallowed. See L<perlref>. =item Can't use %! because Errno.pm is not available (F) The first time the C<%!> hash is used, perl automatically loads the Errno.pm module. The Errno module is expected to tie the %! hash to provide symbolic names for C<$!> errno values. 
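A short sketch of how the C<%!> hash is typically consulted follows. It assumes that the path in C<$path> does not exist and that the platform defines C<ENOENT>, which is the case on common systems.

```perl
use strict;
use warnings;

# Hypothetical path that is assumed not to exist.
my $path = '/no/such/directory/for/this/example.txt';

my ($missing, $message);
if (!open(my $fh, '<', $path)) {
    # Using %! loads Errno.pm behind the scenes.
    $missing = $!{ENOENT} ? 1 : 0;
    $message = "$!";
}

print "ENOENT set: $missing ($message)\n" if defined $missing;
```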
=item Can't use both '<' and '>' after type '%c' in %s (F) A type cannot be forced to have both big-endian and little-endian byte-order at the same time, so this combination of modifiers is not allowed. See L<perlfunc/pack>. =item Can't use 'defined(@array)' (Maybe you should just omit the defined()?) (F) defined() is not useful on arrays because it checks for an undefined I<scalar> value. If you want to see if the array is empty, just use C<if (@array) { # not empty }> for example. =item Can't use 'defined(%hash)' (Maybe you should just omit the defined()?) (F) C<defined()> is not usually right on hashes. Although C<defined %hash> is false on a plain not-yet-used hash, it becomes true in several non-obvious circumstances, including iterators, weak references, stash names, even remaining true after C<undef %hash>. These things make C<defined %hash> fairly useless in practice, so it now generates a fatal error. If a check for non-empty is what you wanted then just put it in boolean context (see L<perldata/Scalar values>): if (%hash) { # not empty } If you had C<defined %Foo::Bar::QUUX> to check whether such a package variable exists then that's never really been reliable, and isn't a good way to enquire about the features of a package, or whether it's loaded, etc. =item Can't use %s for loop variable (P) The parser got confused when trying to parse a C<foreach> loop. =item Can't use global %s in %s (F) You tried to declare a magical variable as a lexical variable. This is not allowed, because the magic can be tied to only one location (namely the global variable) and it would be incredibly confusing to have variables in your program that looked like magical variables but weren't. =item Can't use '%c' in a group with different byte-order in %s (F) You attempted to force a different byte-order on a type that is already inside a group with a byte-order modifier. For example you cannot force little-endianness on a type that is inside a big-endian group. 
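Returning to the two C<defined()> entries above, the emptiness checks they recommend can be sketched as below. The variable names are illustrative only.

```perl
use strict;
use warnings;

my @queue = ();
my %seen  = ();

# Boolean (scalar) context is true only when the aggregate has elements.
my $queue_empty = @queue ? 0 : 1;
my $seen_empty  = %seen  ? 0 : 1;

push @queue, 'job';
$seen{job} = 1;

my $queue_now = @queue ? 'not empty' : 'empty';
my $seen_now  = %seen  ? 'not empty' : 'empty';

print "before: $queue_empty $seen_empty; after: $queue_now, $seen_now\n";
```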
=item Can't use "my %s" in sort comparison (F) The global variables $a and $b are reserved for sort comparisons. You mentioned $a or $b in the same line as the <=> or cmp operator, and the variable had earlier been declared as a lexical variable. Either qualify the sort variable with the package name, or rename the lexical variable. =item Can't use %s ref as %s ref (F) You've mixed up your reference types. You have to dereference a reference of the type needed. You can use the ref() function to test the type of the reference, if need be. =item Can't use string ("%s") as %s ref while "strict refs" in use =item Can't use string ("%s"...) as %s ref while "strict refs" in use (F) You've told Perl to dereference a string, something which C<use strict> blocks to prevent it happening accidentally. See L<perlref/"Symbolic references">. This can be triggered by an C<@> or C<$> in a double-quoted string immediately before interpolating a variable, for example in C<"user @$twitter_id">, which says to treat the contents of C<$twitter_id> as an array reference; use a C<\> to have a literal C<@> symbol followed by the contents of C<$twitter_id>: C<"user \@$twitter_id">. =item Can't use subscript on %s (F) The compiler tried to interpret a bracketed expression as a subscript. But to the left of the brackets was an expression that didn't look like a hash or array reference, or anything else subscriptable. =item Can't use \%c to mean $%c in expression (W syntax) In an ordinary expression, backslash is a unary operator that creates a reference to its argument. The use of backslash to indicate a backreference to a matched substring is valid only as part of a regular expression pattern. Trying to do this in ordinary Perl code produces a value that prints out looking like SCALAR(0xdecaf). Use the $1 form instead. =item Can't weaken a nonreference (F) You attempted to weaken something that was not a reference. Only references can be weakened. 
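The C<Can't weaken a nonreference> entry can be illustrated with C<weaken> and C<isweak> from L<Scalar::Util>. This is a sketch under the assumption that Scalar::Util is the interface in use; the variable names are placeholders.

```perl
use strict;
use warnings;
use Scalar::Util qw(weaken isweak);

my $data = { name => 'example' };
my $copy = $data;

weaken($copy);                            # fine: $copy holds a reference
my $is_weak = isweak($copy) ? 1 : 0;

my $plain = 42;
my $ok    = eval { weaken($plain); 1 };   # fatal: 42 is not a reference
my $error = $ok ? '' : $@;

print "weak: $is_weak\n", $error;
```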
=item Can't "when" outside a topicalizer (F) You have used a when() block that is neither inside a C<foreach> loop nor a C<given> block. (Note that this error is issued on exit from the C<when> block, so you won't get the error if the match fails, or if you use an explicit C<continue>.) =item Can't x= to read-only value (F) You tried to repeat a constant value (often the undefined value) with an assignment operator, which implies modifying the value itself. Perhaps you need to copy the value to a temporary, and repeat that. =item Character following "\c" must be printable ASCII (F) In C<\cI<X>>, I<X> must be a printable (non-control) ASCII character. Note that ASCII characters that don't map to control characters are discouraged, and will generate the warning (when enabled) L</""\c%c" is more clearly written simply as "%s"">. =item Character following \%c must be '{' or a single-character Unicode property name in regex; marked by <-- HERE in m/%s/ (F) (In the above the C<%c> is replaced by either C<p> or C<P>.) You specified something that isn't a legal Unicode property name. Most Unicode properties are specified by C<\p{...}>. But if the name is a single character one, the braces may be omitted. =item Character in 'C' format wrapped in pack (W pack) You said pack("C", $x) where $x is either less than 0 or more than 255; the C<"C"> format is only for encoding native operating system characters (ASCII, EBCDIC, and so on) and not for Unicode characters, so Perl behaved as if you meant pack("C", $x & 255) If you actually want to pack Unicode codepoints, use the C<"U"> format instead. 
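The wrapping behaviour described for the C<"C"> format can be observed with a small sketch. The value 300 is arbitrary, and a C<__WARN__> handler collects the warning so that it can be inspected instead of being printed.

```perl
use strict;
use warnings;

my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };

my $wrapped = pack('C', 300);    # warns; behaves like pack('C', 300 & 255)
my $char    = pack('U', 300);    # one character, code point U+012C

printf "C: %d, U: %d, warnings: %d\n",
    ord($wrapped), ord($char), scalar @warnings;
```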
=item Character in 'c' format wrapped in pack (W pack) You said pack("c", $x) where $x is either less than -128 or more than 127; the C<"c"> format is only for encoding native operating system characters (ASCII, EBCDIC, and so on) and not for Unicode characters, so Perl behaved as if you meant pack("c", $x & 255); If you actually want to pack Unicode codepoints, use the C<"U"> format instead. =item Character in '%c' format wrapped in unpack (W unpack) You tried something like unpack("H", "\x{2a1}") where the format expects to process a byte (a character with a value below 256), but a higher value was provided instead. Perl uses the value modulus 256 instead, as if you had provided: unpack("H", "\x{a1}") =item Character in 'W' format wrapped in pack (W pack) You said pack("U0W", $x) where $x is either less than 0 or more than 255. However, C<U0>-mode expects all values to fall in the interval [0, 255], so Perl behaved as if you meant: pack("U0W", $x & 255) =item Character(s) in '%c' format wrapped in pack (W pack) You tried something like pack("u", "\x{1f3}b") where the format expects to process a sequence of bytes (character with a value below 256), but some of the characters had a higher value. Perl uses the character values modulus 256 instead, as if you had provided: pack("u", "\x{f3}b") =item Character(s) in '%c' format wrapped in unpack (W unpack) You tried something like unpack("s", "\x{1f3}b") where the format expects to process a sequence of bytes (character with a value below 256), but some of the characters had a higher value. Perl uses the character values modulus 256 instead, as if you had provided: unpack("s", "\x{f3}b") =item charnames alias definitions may not contain a sequence of multiple spaces; marked by S<<-- HERE> in %s (F) You defined a character name which had multiple space characters in a row. Change them to single spaces. 
Usually these names are defined in the C<:alias> import argument to C<use charnames>, but they could be defined by a translator installed into C<$^H{charnames}>. See L<charnames/CUSTOM ALIASES>. =item charnames alias definitions may not contain trailing white-space; marked by S<<-- HERE> in %s (F) You defined a character name which ended in a space character. Remove the trailing space(s). Usually these names are defined in the C<:alias> import argument to C<use charnames>, but they could be defined by a translator installed into C<$^H{charnames}>. See L<charnames/CUSTOM ALIASES>. =item chdir() on unopened filehandle %s (W unopened) You tried chdir() on a filehandle that was never opened. =item "\c%c" is more clearly written simply as "%s" (W syntax) The C<\cI<X>> construct is intended to be a way to specify non-printable characters. You used it for a printable one, which is better written as simply itself, perhaps preceded by a backslash for non-word characters. Doing it the way you did is not portable between ASCII and EBCDIC platforms. =item Cloning substitution context is unimplemented (F) Creating a new thread inside the C<s///> operator is not supported. =item closedir() attempted on invalid dirhandle %s (W io) The dirhandle you tried to close is either closed or not really a dirhandle. Check your control flow. =item close() on unopened filehandle %s (W unopened) You tried to close a filehandle that was never opened. =item Closure prototype called (F) If a closure has attributes, the subroutine passed to an attribute handler is the prototype that is cloned when a new closure is created. This subroutine cannot be called. =item \C no longer supported in regex; marked by S<<-- HERE> in m/%s/ (F) The \C character class used to allow a match of single byte within a multi-byte utf-8 character, but was removed in v5.24 as it broke encapsulation and its implementation was extremely buggy. 
If you really need to process the individual bytes, you probably want to convert your string to one where each underlying byte is stored as a character, with utf8::encode(). =item Code missing after '/' (F) You had a (sub-)template that ends with a '/'. There must be another template code following the slash. See L<perlfunc/pack>. =item Code point 0x%X is not Unicode, and not portable (S non_unicode portable) You had a code point that has never been in any standard, so it is likely that languages other than Perl will NOT understand it. This code point also will not fit in a 32-bit word on ASCII platforms and therefore is non-portable between systems. At one time, it was legal in some standards to have code points up to 0x7FFF_FFFF, but not higher, and this code point is higher. Acceptance of these code points is a Perl extension, and you should expect that nothing other than Perl can handle them; Perl itself on EBCDIC platforms before v5.24 does not handle them. Perl also makes no guarantees that the representation of these code points won't change at some point in the future, say when machines become available that have larger than a 64-bit word. At that time, files containing any of these, written by an older Perl might require conversion before being readable by a newer Perl. =item Code point 0x%X is not Unicode, may not be portable (S non_unicode) You had a code point above the Unicode maximum of U+10FFFF. Perl allows strings to contain a superset of Unicode code points, but these may not be accepted by other languages/systems. Further, even if these languages/systems accept these large code points, they may have chosen a different representation for them than the UTF-8-like one that Perl has, which would mean files are not exchangeable between them and Perl. 
On EBCDIC platforms, code points above 0x3FFF_FFFF have a different representation in Perl v5.24 than before, so any file containing these that was written before that version will require conversion before being readable by a later Perl. =item %s: Command not found (A) You've accidentally run your script through B<csh> or another shell instead of Perl. Check the #! line, or manually feed your script into Perl yourself. The #! line at the top of your file could look like #!/usr/bin/perl =item %s: command not found (A) You've accidentally run your script through B<bash> or another shell instead of Perl. Check the #! line, or manually feed your script into Perl yourself. The #! line at the top of your file could look like #!/usr/bin/perl =item %s: command not found: %s (A) You've accidentally run your script through B<zsh> or another shell instead of Perl. Check the #! line, or manually feed your script into Perl yourself. The #! line at the top of your file could look like #!/usr/bin/perl =item Compilation failed in require (F) Perl could not compile a file specified in a C<require> statement. Perl uses this generic message when none of the errors that it encountered were severe enough to halt compilation immediately. =item Complex regular subexpression recursion limit (%d) exceeded (W regexp) The regular expression engine uses recursion in complex situations where back-tracking is required. Recursion depth is limited to 32766, or perhaps less in architectures where the stack cannot grow arbitrarily. ("Simple" and "medium" situations are handled without recursion and are not subject to a limit.) Try shortening the string under examination; looping in Perl code (e.g. with C<while>) rather than in the regular expression engine; or rewriting the regular expression so that it is simpler or backtracks less. (See L<perlfaq2> for information on I<Mastering Regular Expressions>.) =item connect() on closed socket %s (W closed) You tried to do a connect on a closed socket. 
Did you forget to check the return value of your socket() call? See L<perlfunc/connect>. =item Constant(%s): Call to &{$^H{%s}} did not return a defined value (F) The subroutine registered to handle constant overloading (see L<overload>) or a custom charnames handler (see L<charnames/CUSTOM TRANSLATORS>) returned an undefined value. =item Constant(%s): $^H{%s} is not defined (F) The parser found inconsistencies while attempting to define an overloaded constant. Perhaps you forgot to load the corresponding L<overload> pragma? =item Constant is not %s reference (F) A constant value (perhaps declared using the C<use constant> pragma) is being dereferenced, but it amounts to the wrong type of reference. The message indicates the type of reference that was expected. This usually indicates a syntax error in dereferencing the constant value. See L<perlsub/"Constant Functions"> and L<constant>. =item Constants from lexical variables potentially modified elsewhere are no longer permitted (F) You wrote something like my $var; $sub = sub () { $var }; but $var is referenced elsewhere and could be modified after the C<sub> expression is evaluated. Either it is explicitly modified elsewhere (C<$var = 3>) or it is passed to a subroutine or to an operator like C<printf> or C<map>, which may or may not modify the variable. Traditionally, Perl has captured the value of the variable at that point and turned the subroutine into a constant eligible for inlining. In those cases where the variable can be modified elsewhere, this breaks the behavior of closures, in which the subroutine captures the variable itself, rather than its value, so future changes to the variable are reflected in the subroutine's return value. This usage was deprecated, and as of Perl 5.32 is no longer allowed, making it possible to change the behavior in the future. 
If you intended for the subroutine to be eligible for inlining, then make sure the variable is not referenced elsewhere, possibly by copying it: my $var2 = $var; $sub = sub () { $var2 }; If you do want this subroutine to be a closure that reflects future changes to the variable that it closes over, add an explicit C<return>: my $var; $sub = sub () { return $var }; =item Constant subroutine %s redefined (W redefine)(S) You redefined a subroutine which had previously been eligible for inlining. See L<perlsub/"Constant Functions"> for commentary and workarounds. =item Constant subroutine %s undefined (W misc) You undefined a subroutine which had previously been eligible for inlining. See L<perlsub/"Constant Functions"> for commentary and workarounds. =item Constant(%s) unknown (F) The parser found inconsistencies either while attempting to define an overloaded constant, or when trying to find the character name specified in the C<\N{...}> escape. Perhaps you forgot to load the corresponding L<overload> pragma? =item :const is experimental (S experimental::const_attr) The "const" attribute is experimental. If you want to use the feature, disable the warning with C<no warnings 'experimental::const_attr'>, but know that in doing so you are taking the risk that your code may break in a future Perl version. =item :const is not permitted on named subroutines (F) The "const" attribute causes an anonymous subroutine to be run and its value captured at the time that it is cloned. Named subroutines are not cloned like this, so the attribute does not make sense on them. =item Copy method did not return a reference (F) The method which overloads "=" is buggy. See L<overload/Copy Constructor>. =item &CORE::%s cannot be called directly (F) You tried to call a subroutine in the C<CORE::> namespace with C<&foo> syntax or through a reference. Some subroutines in this package cannot yet be called that way, but must be called as barewords. 
Something like this will work: BEGIN { *shove = \&CORE::push; } shove @array, 1,2,3; # pushes on to @array =item CORE::%s is not a keyword (F) The CORE:: namespace is reserved for Perl keywords. =item Corrupted regexp opcode %d > %d (P) This is either an error in Perl, or, if you're using one, your L<custom regular expression engine|perlreapi>. If not the latter, report the problem to L<https://github.com/Perl/perl5/issues>. =item corrupted regexp pointers (P) The regular expression engine got confused by what the regular expression compiler gave it. =item corrupted regexp program (P) The regular expression engine got passed a regexp program without a valid magic number. =item Corrupt malloc ptr 0x%x at 0x%x (P) The malloc package that comes with Perl had an internal failure. =item Count after length/code in unpack (F) You had an unpack template indicating a counted-length string, but you have also specified an explicit size for the string. See L<perlfunc/pack>. =item Declaring references is experimental (S experimental::declared_refs) This warning is emitted if you use a reference constructor on the right-hand side of C<my>, C<state>, C<our>, or C<local>. Simply suppress the warning if you want to use the feature, but know that in doing so you are taking the risk of using an experimental feature which may change or be removed in a future Perl version: no warnings "experimental::declared_refs"; use feature "declared_refs"; $fooref = my \$foo; =for comment The following are used in lib/diagnostics.t for testing two =items that share the same description. Changes here need to be propagated to there =item Deep recursion on anonymous subroutine =item Deep recursion on subroutine "%s" (W recursion) This subroutine has called itself (directly or indirectly) 100 times more than it has returned. This probably indicates an infinite recursion, unless you're writing strange benchmark programs, in which case it indicates something else. 
This threshold can be changed from 100, by recompiling the F<perl> binary, setting the C pre-processor macro C<PERL_SUB_DEPTH_WARN> to the desired value. =item (?(DEFINE)....) does not allow branches in regex; marked by S<<-- HERE> in m/%s/ (F) You used something like C<(?(DEFINE)...|..)> which is illegal. The most likely cause of this error is that you left out a parenthesis inside of the C<....> part. The S<<-- HERE> shows whereabouts in the regular expression the problem was discovered. =item %s defines neither package nor VERSION--version check failed (F) You said something like "use Module 42" but in the Module file there are neither package declarations nor a C<$VERSION>. =item delete argument is not a HASH or ARRAY element or slice (F) The argument to C<delete> must be either a hash or array element, such as: $foo{$bar} $ref->{"susie"}[12] or a hash or array slice, such as: @foo[$bar, $baz, $xyzzy] @{$ref->[12]}{"susie", "queue"} or a hash key/value or array index/value slice, such as: %foo[$bar, $baz, $xyzzy] %{$ref->[12]}{"susie", "queue"} =item Delimiter for here document is too long (F) In a here document construct like C<<<FOO>, the label C<FOO> is too long for Perl to handle. You have to be seriously twisted to write code that triggers this error. =item Deprecated use of my() in false conditional. This will be a fatal error in Perl 5.30 (D deprecated) You used a declaration similar to C<my $x if 0>. There has been a long-standing bug in Perl that causes a lexical variable not to be cleared at scope exit when its declaration includes a false conditional. Some people have exploited this bug to achieve a kind of static variable. Since we intend to fix this bug, we don't want people relying on this behavior. 
You can achieve a similar static effect by declaring the variable in a
separate block outside the function, e.g.,

    sub f { my $x if 0; return $x++ }

becomes

    { my $x; sub f { return $x++ } }

Beginning with perl 5.10.0, you can also use C<state> variables to have
lexicals that are initialized only once (see L<feature>):

    sub f { state $x; return $x++ }

This use of C<my()> in a false conditional has been deprecated since
Perl 5.10, and it will become a fatal error in Perl 5.30.

=item DESTROY created new reference to dead object '%s'

(F) A DESTROY() method created a new reference to the object which is
just being DESTROYed. Perl is confused, and prefers to abort rather
than to create a dangling reference.

=item Did not produce a valid header

See L</500 Server error>.

=item %s did not return a true value

(F) A required (or used) file must return a true value to indicate that
it compiled correctly and ran its initialization code correctly. It's
traditional to end such a file with a "1;", though any true value would
do. See L<perlfunc/require>.

=item (Did you mean &%s instead?)

(W misc) You probably referred to an imported subroutine &FOO as $FOO or
some such.

=item (Did you mean "local" instead of "our"?)

(W shadow) Remember that "our" does not localize the declared global
variable. You have declared it again in the same lexical scope, which
seems superfluous.

=item (Did you mean $ or @ instead of %?)

(W) You probably said %hash{$key} when you meant $hash{$key} or
@hash{@keys}. On the other hand, maybe you just meant %hash and got
carried away.

=item Died

(F) You passed die() an empty string (the equivalent of C<die "">) or
you called it with no args and C<$@> was empty.

=item Document contains no data

See L</500 Server error>.

=item %s does not define %s::VERSION--version check failed

(F) You said something like "use Module 42" but the Module did not
define a C<$VERSION>.
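
The version check described above can be reproduced in a few lines. This is only a sketch: the package name C<My::Mod> is hypothetical, and the C<VERSION> method is called directly (it is the same check that C<use My::Mod 42> performs after loading a module) so that the failure can be trapped with C<eval>.

```perl
use strict;
use warnings;

# Hypothetical package that deliberately never sets $VERSION.
package My::Mod;
sub hello { return "hello" }

package main;

# "use My::Mod 42" would call My::Mod->VERSION(42) once the module is
# loaded; calling the method directly lets us trap the fatal error.
my $ok  = eval { My::Mod->VERSION(42); 1 };
my $err = $ok ? '' : $@;

print "version check failed: $err" unless $ok;
```

Adding a line such as C<our $VERSION = '42.0';> to the package is assumed to be the usual way to satisfy the check.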
=item '/' does not take a repeat count (F) You cannot put a repeat count of any kind right after the '/' code. See L<perlfunc/pack>. =item do "%s" failed, '.' is no longer in @INC; did you mean do "./%s"? (D deprecated) Previously C< do "somefile"; > would search the current directory for the specified file. Since perl v5.26.0, F<.> has been removed from C<@INC> by default, so this is no longer true. To search the current directory (and only the current directory) you can write C< do "./somefile"; >. =item Don't know how to get file name (P) C<PerlIO_getname>, a perl internal I/O function specific to VMS, was somehow called on another platform. This should not happen. =item Don't know how to handle magic of type \%o (P) The internal handling of magical variables has been cursed. =item do_study: out of memory (P) This should have been caught by safemalloc() instead. =item (Do you need to predeclare %s?) (S syntax) This is an educated guess made in conjunction with the message "%s found where operator expected". It often means a subroutine or module name is being referenced that hasn't been declared yet. This may be because of ordering problems in your file, or because of a missing "sub", "package", "require", or "use" statement. If you're referencing something that isn't defined yet, you don't actually have to define the subroutine or package before the current location. You can use an empty "sub foo;" or "package FOO;" to enter a "forward" declaration. =item dump() must be written as CORE::dump() as of Perl 5.30 (F) You used the obsolete C<dump()> built-in function. That was deprecated in Perl 5.8.0. As of Perl 5.30 it must be written in fully qualified format: C<CORE::dump()>. See L<perlfunc/dump>. =item dump is not supported (F) Your machine doesn't support dump/undump. =item Duplicate free() ignored (S malloc) An internal routine called free() on something that had already been freed. 
=item Duplicate modifier '%c' after '%c' in %s (W unpack) You have applied the same modifier more than once after a type in a pack template. See L<perlfunc/pack>. =item elseif should be elsif (S syntax) There is no keyword "elseif" in Perl because Larry thinks it's ugly. Your code will be interpreted as an attempt to call a method named "elseif" for the class returned by the following block. This is unlikely to be what you want. =item Empty \%c in regex; marked by S<<-- HERE> in m/%s/ =item Empty \%c{} =item Empty \%c{} in regex; marked by S<<-- HERE> in m/%s/ (F) You used something like C<\b{}>, C<\B{}>, C<\o{}>, C<\p>, C<\P>, or C<\x> without specifying anything for it to operate on. Unfortunately, for backwards compatibility reasons, an empty C<\x> is legal outside S<C<use re 'strict'>> and expands to a NUL character. =item Empty (?) without any modifiers in regex; marked by <-- HERE in m/%s/ (W regexp) (only under C<S<use re 'strict'>>) C<(?)> does nothing, so perhaps this is a typo. =item ${^ENCODING} is no longer supported (F) The special variable C<${^ENCODING}>, formerly used to implement the C<encoding> pragma, is no longer supported as of Perl 5.26.0. Setting it to anything other than C<undef> is a fatal error as of Perl 5.28. =item entering effective %s failed (F) While under the C<use filetest> pragma, switching the real and effective uids or gids failed. =item %ENV is aliased to %s (F) You're running under taint mode, and the C<%ENV> variable has been aliased to another hash, so it doesn't reflect anymore the state of the program's environment. This is potentially insecure. =item Error converting file specification %s (F) An error peculiar to VMS. Because Perl may have to deal with file specifications in either VMS or Unix syntax, it converts them to a single form when it must operate on them directly. Either you've passed an invalid file specification to Perl, or you've found a case the conversion routines don't handle. Drat. 
=item Error %s in expansion of %s (F) An error was encountered in handling a user-defined property (L<perlunicode/User-Defined Character Properties>). These are programmer written subroutines, hence subject to errors that may prevent them from compiling or running. The calls to these subs are C<eval>'d, and if there is a failure, this message is raised, using the contents of C<$@> from the failed C<eval>. Another possibility is that tainted data was encountered somewhere in the chain of expanding the property. If so, the message wording will indicate that this is the problem. See L</Insecure user-defined property %s>. =item Eval-group in insecure regular expression (F) Perl detected tainted data when trying to compile a regular expression that contains the C<(?{ ... })> zero-width assertion, which is unsafe. See L<perlre/(?{ code })>, and L<perlsec>. =item Eval-group not allowed at runtime, use re 'eval' in regex m/%s/ (F) Perl tried to compile a regular expression containing the C<(?{ ... })> zero-width assertion at run time, as it would when the pattern contains interpolated values. Since that is a security risk, it is not allowed. If you insist, you may still do this by using the C<re 'eval'> pragma or by explicitly building the pattern from an interpolated string at run time and using that in an eval(). See L<perlre/(?{ code })>. =item Eval-group not allowed, use re 'eval' in regex m/%s/ (F) A regular expression contained the C<(?{ ... })> zero-width assertion, but that construct is only allowed when the C<use re 'eval'> pragma is in effect. See L<perlre/(?{ code })>. =item EVAL without pos change exceeded limit in regex; marked by S<<-- HERE> in m/%s/ (F) You used a pattern that nested too many EVAL calls without consuming any text. Restructure the pattern so that text is consumed. The S<<-- HERE> shows whereabouts in the regular expression the problem was discovered. 
=item Excessively long <> operator (F) The contents of a <> operator may not exceed the maximum size of a Perl identifier. If you're just trying to glob a long list of filenames, try using the glob() operator, or put the filenames into a variable and glob that. =item exec? I'm not *that* kind of operating system (F) The C<exec> function is not implemented on some systems, e.g., Symbian OS. See L<perlport>. =item %sExecution of %s aborted due to compilation errors. (F) The final summary message when a Perl compilation fails. =item exists argument is not a HASH or ARRAY element or a subroutine (F) The argument to C<exists> must be a hash or array element or a subroutine with an ampersand, such as: $foo{$bar} $ref->{"susie"}[12] &do_something =item exists argument is not a subroutine name (F) The argument to C<exists> for C<exists &sub> must be a subroutine name, and not a subroutine call. C<exists &sub()> will generate this error. =item Exiting eval via %s (W exiting) You are exiting an eval by unconventional means, such as a goto, or a loop control statement. =item Exiting format via %s (W exiting) You are exiting a format by unconventional means, such as a goto, or a loop control statement. =item Exiting pseudo-block via %s (W exiting) You are exiting a rather special block construct (like a sort block or subroutine) by unconventional means, such as a goto, or a loop control statement. See L<perlfunc/sort>. =item Exiting subroutine via %s (W exiting) You are exiting a subroutine by unconventional means, such as a goto, or a loop control statement. =item Exiting substitution via %s (W exiting) You are exiting a substitution by unconventional means, such as a return, a goto, or a loop control statement. =item Expecting close bracket in regex; marked by S<<-- HERE> in m/%s/ (F) You wrote something like (?13 to denote a capturing group of the form L<C<(?I<PARNO>)>|perlre/(?PARNO) (?-PARNO) (?+PARNO) (?R) (?0)>, but omitted the C<")">. 
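
For comparison, here is a minimal sketch of a well-formed C<(?I<PARNO>)> recursion, with C<(?1)> used to match balanced parentheses. The pattern and the test strings are illustrative assumptions, not taken from the Perl sources. The malformed variant is compiled from a string inside C<eval> so that the fatal error can be trapped; its exact message text is not asserted here.

```perl
use strict;
use warnings;

# Group 1 matches one parenthesised chunk; (?1) recurses into group 1.
my $balanced = qr/^(\((?:[^()]++|(?1))*\))$/;

print "balanced\n"   if "(a(b)c)" =~ $balanced;
print "unbalanced\n" if "(a(b c)" !~ $balanced;

# Omitting the ")" after the group number makes the pattern fail to compile.
my $bad      = '(a)(?1';
my $compiled = eval { qr/$bad/ };
print "malformed pattern rejected\n" unless defined $compiled;
```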
=item Expecting interpolated extended charclass in regex; marked by <-- HERE in m/%s/ (F) It looked like you were attempting to interpolate an already-compiled extended character class, like so: my $thai_or_lao = qr/(?[ \p{Thai} + \p{Lao} ])/; ... qr/(?[ \p{Digit} & $thai_or_lao ])/; But the marked code isn't syntactically correct to be such an interpolated class. =item Experimental aliasing via reference not enabled (F) To do aliasing via references, you must first enable the feature: no warnings "experimental::refaliasing"; use feature "refaliasing"; \$x = \$y; =item Experimental %s on scalar is now forbidden (F) An experimental feature added in Perl 5.14 allowed C<each>, C<keys>, C<push>, C<pop>, C<shift>, C<splice>, C<unshift>, and C<values> to be called with a scalar argument. This experiment is considered unsuccessful, and has been removed. The C<postderef> feature may meet your needs better. =item Experimental subroutine signatures not enabled (F) To use subroutine signatures, you must first enable them: no warnings "experimental::signatures"; use feature "signatures"; sub foo ($left, $right) { ... } =item Explicit blessing to '' (assuming package main) (W misc) You are blessing a reference to a zero length string. This has the effect of blessing the reference into the package main. This is usually not what you want. Consider providing a default target package, e.g. bless($ref, $p || 'MyPackage'); =item %s: Expression syntax (A) You've accidentally run your script through B<csh> instead of Perl. Check the #! line, or manually feed your script into Perl yourself. =item %s failed--call queue aborted (F) An untrapped exception was raised while executing a UNITCHECK, CHECK, INIT, or END subroutine. Processing of the remainder of the queue of such routines has been prematurely ended. =item Failed to close in-place work file %s: %s (F) Closing an output file from in-place editing, as with the C<-i> command-line switch, failed. 
=item False [] range "%s" in regex; marked by S<<-- HERE> in m/%s/ (W regexp)(F) A character class range must start and end at a literal character, not another character class like C<\d> or C<[:alpha:]>. The "-" in your false range is interpreted as a literal "-". In a C<(?[...])> construct, this is an error, rather than a warning. Consider quoting the "-", "\-". The S<<-- HERE> shows whereabouts in the regular expression the problem was discovered. See L<perlre>. =item Fatal VMS error (status=%d) at %s, line %d (P) An error peculiar to VMS. Something untoward happened in a VMS system service or RTL routine; Perl's exit status should provide more details. The filename in "at %s" and the line number in "line %d" tell you which section of the Perl source code is distressed. =item fcntl is not implemented (F) Your machine apparently doesn't implement fcntl(). What is this, a PDP-11 or something? =item FETCHSIZE returned a negative value (F) A tied array claimed to have a negative number of elements, which is not possible. =item Field too wide in 'u' format in pack (W pack) Each line in an uuencoded string starts with a length indicator which can't encode values above 63. So there is no point in asking for a line length bigger than that. Perl will behave as if you specified C<u63> as the format. =item File::Glob::glob() will disappear in perl 5.30. Use File::Glob::bsd_glob() instead. (D deprecated) C<< File::Glob >> has a function called C<< glob >>, which just calls C<< bsd_glob >>. However, its prototype is different from the prototype of C<< CORE::glob >>, and hence, C<< File::Glob::glob >> should not be used. C<< File::Glob::glob() >> was deprecated in perl 5.8.0. A deprecation message was issued from perl 5.26.0 onwards, and the function will disappear in perl 5.30.0. Code using C<< File::Glob::glob() >> should call C<< File::Glob::bsd_glob() >> instead. =item Filehandle %s opened only for input (W io) You tried to write on a read-only filehandle. 
If you intended it to be a read-write filehandle, you needed to open it
with "+<" or "+>" or "+>>" instead of with "<" or nothing. If you
intended only to write the file, use ">" or ">>". See
L<perlfunc/open>.

=item Filehandle %s opened only for output

(W io) You tried to read from a filehandle opened only for writing. If
you intended it to be a read/write filehandle, you needed to open it
with "+<" or "+>" or "+>>" instead of with ">". If you intended only to
read from the file, use "<". See L<perlfunc/open>. Another possibility
is that you attempted to open file descriptor 0 (also known as STDIN)
for output (maybe you closed STDIN earlier?).

=item Filehandle %s reopened as %s only for input

(W io) You opened for reading a filehandle that got the same filehandle
id as STDOUT or STDERR. This occurred because you closed STDOUT or
STDERR previously.

=item Filehandle STDIN reopened as %s only for output

(W io) You opened for writing a filehandle that got the same filehandle
id as STDIN. This occurred because you closed STDIN previously.

=item Final $ should be \$ or $name

(F) You must now decide whether the final $ in a string was meant to be
a literal dollar sign, or was meant to introduce a variable name that
happens to be missing. So you have to put either the backslash or the
name.

=item flock() on closed filehandle %s

(W closed) The filehandle you're attempting to flock() got itself closed
some time before now. Check your control flow. flock() operates on
filehandles. Are you attempting to call flock() on a dirhandle by the
same name?

=item Format not terminated

(F) A format must be terminated by a line with a solitary dot. Perl got
to the end of your file without finding such a line.

=item Format %s redefined

(W redefine) You redefined a format. To suppress this warning, say

    {
        no warnings 'redefine';
        eval "format NAME =...";
    }

=item Found = in conditional, should be ==

(W syntax) You said

    if ($foo = 123)

when you meant

    if ($foo == 123)

(or something like that).
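
The effect of the mistaken assignment can be observed with a short sketch. The variable names are illustrative assumptions; the conditional is compiled through a string C<eval> only so that a C<$SIG{__WARN__}> handler installed at run time can capture the compile-time warning.

```perl
use strict;
use warnings;

my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };

my $foo   = 0;
my $taken = 0;

# The check happens at compile time, so the snippet is compiled with a
# string eval to let the handler above see the warning.
eval q{
    if ($foo = 123) { $taken++ }    # assignment: always true, clobbers $foo
    1;
} or die $@;

print "foo is now $foo, branch taken $taken time(s)\n";
print "warning: $_" for @warnings;
```

Writing C<$foo == 123> compares without modifying C<$foo>, and the warning is not emitted.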
=item %s found where operator expected (S syntax) The Perl lexer knows whether to expect a term or an operator. If it sees what it knows to be a term when it was expecting to see an operator, it gives you this warning. Usually it indicates that an operator or delimiter was omitted, such as a semicolon. =item gdbm store returned %d, errno %d, key "%s" (S) A warning from the GDBM_File extension that a store failed. =item gethostent not implemented (F) Your C library apparently doesn't implement gethostent(), probably because if it did, it'd feel morally obligated to return every hostname on the Internet. =item get%sname() on closed socket %s (W closed) You tried to get a socket or peer socket name on a closed socket. Did you forget to check the return value of your socket() call? =item getpwnam returned invalid UIC %#o for user "%s" (S) A warning peculiar to VMS. The call to C<sys$getuai> underlying the C<getpwnam> operator returned an invalid UIC. =item getsockopt() on closed socket %s (W closed) You tried to get a socket option on a closed socket. Did you forget to check the return value of your socket() call? See L<perlfunc/getsockopt>. =item given is experimental (S experimental::smartmatch) C<given> depends on smartmatch, which is experimental, so its behavior may change or even be removed in any future release of perl. See the explanation under L<perlsyn/Experimental Details on given and when>. =item Global symbol "%s" requires explicit package name (did you forget to declare "my %s"?) (F) You've said "use strict" or "use strict vars", which indicates that all variables must either be lexically scoped (using "my" or "state"), declared beforehand using "our", or explicitly qualified to say which package the global variable is in (using "::"). =item glob failed (%s) (S glob) Something went wrong with the external program(s) used for C<glob> and C<< <*.c> >>. 
Usually, this means that you supplied a C<glob> pattern that caused the external program to fail and exit with a nonzero status. If the message indicates that the abnormal exit resulted in a coredump, this may also mean that your csh (C shell) is broken. If so, you should change all of the csh-related variables in config.sh: If you have tcsh, make the variables refer to it as if it were csh (e.g. C<full_csh='/usr/bin/tcsh'>); otherwise, make them all empty (except that C<d_csh> should be C<'undef'>) so that Perl will think csh is missing. In either case, after editing config.sh, run C<./Configure -S> and rebuild Perl. =item Glob not terminated (F) The lexer saw a left angle bracket in a place where it was expecting a term, so it's looking for the corresponding right angle bracket, and not finding it. Chances are you left some needed parentheses out earlier in the line, and you really meant a "less than". =item gmtime(%f) failed (W overflow) You called C<gmtime> with a number that it could not handle: too large, too small, or NaN. The returned value is C<undef>. =item gmtime(%f) too large (W overflow) You called C<gmtime> with a number that was larger than it can reliably handle and C<gmtime> probably returned the wrong date. This warning is also triggered with NaN (the special not-a-number value). =item gmtime(%f) too small (W overflow) You called C<gmtime> with a number that was smaller than it can reliably handle and C<gmtime> probably returned the wrong date. =item Got an error from DosAllocMem (P) An error peculiar to OS/2. Most probably you're using an obsolete version of Perl, and this should not happen anyway. =item goto must have label (F) Unlike with "next" or "last", you're not allowed to goto an unspecified destination. See L<perlfunc/goto>. =item Goto undefined subroutine%s (F) You tried to call a subroutine with C<goto &sub> syntax, but the indicated subroutine hasn't been defined, or if it was, it has since been undefined. 
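
A minimal sketch of how this error arises and how it might be guarded against. The subroutine names are hypothetical, and C<missing_target> is deliberately never defined.

```perl
use strict;
use warnings;

# "missing_target" is a hypothetical name that is deliberately never defined.
sub wrapper { goto &missing_target }

my $ok  = eval { wrapper(); 1 };
my $err = $ok ? '' : $@;
print "caught: $err" unless $ok;

# Checking defined(&name) first avoids the fatal error.
sub safe_wrapper {
    goto &missing_target if defined &missing_target;
    return "fallback";
}
print safe_wrapper(), "\n";
```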
=item Group name must start with a non-digit word character in regex; marked by S<<-- HERE> in m/%s/ (F) Group names must follow the rules for perl identifiers, meaning they must start with a non-digit word character. A common cause of this error is using (?&0) instead of (?0). See L<perlre>. =item ()-group starts with a count (F) A ()-group started with a count. A count is supposed to follow something: a template character or a ()-group. See L<perlfunc/pack>. =item %s had compilation errors. (F) The final summary message when a C<perl -c> fails. =item Had to create %s unexpectedly (S internal) A routine asked for a symbol from a symbol table that ought to have existed already, but for some reason it didn't, and had to be created on an emergency basis to prevent a core dump. =item %s has too many errors (F) The parser has given up trying to parse the program after 10 errors. Further error messages would likely be uninformative. =item Hexadecimal float: exponent overflow (W overflow) The hexadecimal floating point has a larger exponent than the floating point supports. =item Hexadecimal float: exponent underflow (W overflow) The hexadecimal floating point has a smaller exponent than the floating point supports. With the IEEE 754 floating point, this may also mean that the subnormals (formerly known as denormals) are being used, which may or may not be an error. =item Hexadecimal float: internal error (%s) (F) Something went horribly bad in hexadecimal float handling. =item Hexadecimal float: mantissa overflow (W overflow) The hexadecimal floating point literal had more bits in the mantissa (the part between the 0x and the exponent, also known as the fraction or the significand) than the floating point supports. =item Hexadecimal float: precision loss (W overflow) The hexadecimal floating point had internally more digits than could be output. 
This can be caused by unsupported long double formats, or by 64-bit integers not being available (needed to retrieve the digits under some configurations). =item Hexadecimal float: unsupported long double format (F) You have configured Perl to use long doubles but the internals of the long double format are unknown; therefore the hexadecimal float output is impossible. =item Hexadecimal number > 0xffffffff non-portable (W portable) The hexadecimal number you specified is larger than 2**32-1 (4294967295) and therefore non-portable between systems. See L<perlport> for more on portability concerns. =item Identifier too long (F) Perl limits identifiers (names for variables, functions, etc.) to about 250 characters for simple names, and somewhat more for compound names (like C<$A::B>). You've exceeded Perl's limits. Future versions of Perl are likely to eliminate these arbitrary limitations. =item Ignoring zero length \N{} in character class in regex; marked by S<<-- HERE> in m/%s/ (W regexp) Named Unicode character escapes (C<\N{...}>) may return a zero-length sequence. When such an escape is used in a character class its behavior is not well defined. Check that the correct escape has been used, and the correct charname handler is in scope. =item Illegal %s digit '%c' ignored (W digit) Here C<%s> is one of "binary", "octal", or "hex". You may have tried to use a digit other than one that is legal for the given type, such as only 0 and 1 for binary. For octals, this is raised only if the illegal character is an '8' or '9'. For hex, 'A' - 'F' and 'a' - 'f' are legal. Interpretation of the number stopped just before the offending digit or character. =item Illegal binary digit '%c' (F) You used a digit other than 0 or 1 in a binary number. =item Illegal character after '_' in prototype for %s : %s (W illegalproto) An illegal character was found in a prototype declaration. 
The '_' in a prototype must be followed by a ';', indicating the rest of the parameters are optional, or one of '@' or '%', since those two will accept 0 or more final parameters. =item Illegal character \%o (carriage return) (F) Perl normally treats carriage returns in the program text as it would any other whitespace, which means you should never see this error when Perl was built using standard options. For some reason, your version of Perl appears to have been built without this support. Talk to your Perl administrator. =item Illegal character following sigil in a subroutine signature (F) A parameter in a subroutine signature contained an unexpected character following the C<$>, C<@> or C<%> sigil character. Normally the sigil should be followed by the variable name or C<=> etc. Perhaps you are trying to use a prototype while in the scope of C<use feature 'signatures'>? For example: sub foo ($$) {} # legal - a prototype use feature 'signatures'; sub foo ($$) {} # illegal - was expecting a signature sub foo ($a, $b) :prototype($$) {} # legal =item Illegal character in prototype for %s : %s (W illegalproto) An illegal character was found in a prototype declaration. Legal characters in prototypes are $, @, %, *, ;, [, ], &, \, and +. Perhaps you were trying to write a subroutine signature but didn't enable that feature first (C<use feature 'signatures'>), so your signature was instead interpreted as a bad prototype. =item Illegal declaration of anonymous subroutine (F) When using the C<sub> keyword to construct an anonymous subroutine, you must always specify a block of code. See L<perlsub>. =item Illegal declaration of subroutine %s (F) A subroutine was not declared correctly. See L<perlsub>. =item Illegal division by zero (F) You tried to divide a number by 0. Either something was wrong in your logic, or you need to put a conditional in to guard against meaningless input. =item Illegal modulus zero (F) You tried to divide a number by 0 to get the remainder. 
Most numbers don't take to this kindly. =item Illegal number of bits in vec (F) The number of bits in vec() (the third argument) must be a power of two from 1 to 32 (or 64, if your platform supports that). =item Illegal octal digit '%c' (F) You used an 8 or 9 in an octal number. =item Illegal operator following parameter in a subroutine signature (F) A parameter in a subroutine signature was followed by something other than C<=> introducing a default, C<,> or C<)>. use feature 'signatures'; sub foo ($=1) {} # legal sub foo ($a = 1) {} # legal sub foo ($a += 1) {} # illegal sub foo ($a == 1) {} # illegal =item Illegal pattern in regex; marked by S<<-- HERE> in m/%s/ (F) You wrote something like (?+foo) The C<"+"> is valid only when followed by digits, indicating a capturing group. See L<C<(?I<PARNO>)>|perlre/(?PARNO) (?-PARNO) (?+PARNO) (?R) (?0)>. =item Illegal suidscript (F) The script run under suidperl was somehow illegal. =item Illegal switch in PERL5OPT: -%c (X) The PERL5OPT environment variable may only be used to set the following switches: B<-[CDIMUdmtw]>. =item Illegal user-defined property name (F) You specified a Unicode-like property name in a regular expression pattern (using C<\p{}> or C<\P{}>) that Perl knows isn't an official Unicode property, and was likely meant to be a user-defined property name, but it can't be one of those, as they must begin with either C<In> or C<Is>. Check the spelling. See also L</Can't find Unicode property definition "%s">. =item Ill-formed CRTL environ value "%s" (W internal) A warning peculiar to VMS. Perl tried to read the CRTL's internal environ array, and encountered an element without the C<=> delimiter used to separate keys from values. The element is ignored. =item Ill-formed message in prime_env_iter: |%s| (W internal) A warning peculiar to VMS. 
Perl tried to read a logical name or CLI symbol definition when preparing to iterate over %ENV, and didn't see the expected delimiter between key and value, so the line was ignored. =item (in cleanup) %s (W misc) This prefix usually indicates that a DESTROY() method raised the indicated exception. Since destructors are usually called by the system at arbitrary points during execution, and often a vast number of times, the warning is issued only once for any number of failures that would otherwise result in the same message being repeated. Failure of user callbacks dispatched using the C<G_KEEPERR> flag could also result in this warning. See L<perlcall/G_KEEPERR>. =item Incomplete expression within '(?[ ])' in regex; marked by S<<-- HERE> in m/%s/ (F) There was a syntax error within the C<(?[ ])>. This can happen if the expression inside the construct was completely empty, or if there are too many or few operands for the number of operators. Perl is not smart enough to give you a more precise indication as to what is wrong. =item Inconsistent hierarchy during C3 merge of class '%s': merging failed on parent '%s' (F) The method resolution order (MRO) of the given class is not C3-consistent, and you have enabled the C3 MRO for this class. See the C3 documentation in L<mro> for more information. =item Indentation on line %d of here-doc doesn't match delimiter (F) You have an indented here-document where one or more of its lines have whitespace at the beginning that does not match the closing delimiter. For example, line 2 below is wrong because it does not have at least 2 spaces, but lines 1 and 3 are fine because they have at least 2: if ($something) { print <<~EOF; Line 1 Line 2 not Line 3 EOF } Note that tabs and spaces are compared strictly, meaning 1 tab will not match 8 spaces. =item Infinite recursion in regex (F) You used a pattern that references itself without consuming any input text. 
You should check the pattern to ensure that recursive patterns either consume text or fail. =item Infinite recursion in user-defined property (F) A user-defined property (L<perlunicode/User-Defined Character Properties>) can depend on the definitions of other user-defined properties. If the chain of dependencies leads back to this property, infinite recursion would occur, were it not for the check that raised this error. Restructure your property definitions to avoid this. =item Infinite recursion via empty pattern (F) You tried to use the empty pattern inside of a regex code block, for instance C</(?{ s!!! })/>, which resulted in re-executing the same pattern, which is an infinite loop which is broken by throwing an exception. =item Initialization of state variables in list currently forbidden (F) C<state> only permits initializing a single variable, specified without parentheses. So C<state $a = 42> and C<state @a = qw(a b c)> are allowed, but not C<state ($a) = 42> or C<(state $a) = 42>. To initialize more than one C<state> variable, initialize them one at a time. =item %%s[%s] in scalar context better written as $%s[%s] (W syntax) In scalar context, you've used an array index/value slice (indicated by %) to select a single element of an array. Generally it's better to ask for a scalar value (indicated by $). The difference is that C<$foo[&bar]> always behaves like a scalar, both in the value it returns and when evaluating its argument, while C<%foo[&bar]> provides a list context to its subscript, which can do weird things if you're expecting only one subscript. When called in list context, it also returns the index (what C<&bar> returns) in addition to the value. =item %%s{%s} in scalar context better written as $%s{%s} (W syntax) In scalar context, you've used a hash key/value slice (indicated by %) to select a single element of a hash. Generally it's better to ask for a scalar value (indicated by $). 
The difference is that C<$foo{&bar}> always behaves like a scalar, both in the value it returns and when evaluating its argument, while C<%foo{&bar}> provides a list context to its subscript, which can do weird things if you're expecting only one subscript. When called in list context, it also returns the key in addition to the value. =item Insecure dependency in %s (F) You tried to do something that the tainting mechanism didn't like. The tainting mechanism is turned on when you're running setuid or setgid, or when you specify B<-T> to turn it on explicitly. The tainting mechanism labels all data that's derived directly or indirectly from the user, who is considered to be unworthy of your trust. If any such data is used in a "dangerous" operation, you get this error. See L<perlsec> for more information. =item Insecure directory in %s (F) You can't use system(), exec(), or a piped open in a setuid or setgid script if C<$ENV{PATH}> contains a directory that is writable by the world. Also, the PATH must not contain any relative directory. See L<perlsec>. =item Insecure $ENV{%s} while running %s (F) You can't use system(), exec(), or a piped open in a setuid or setgid script if any of C<$ENV{PATH}>, C<$ENV{IFS}>, C<$ENV{CDPATH}>, C<$ENV{ENV}>, C<$ENV{BASH_ENV}> or C<$ENV{TERM}> are derived from data supplied (or potentially supplied) by the user. The script must set the path to a known value, using trustworthy data. See L<perlsec>. =item Insecure user-defined property %s (F) Perl detected tainted data when trying to compile a regular expression that contains a call to a user-defined character property function, i.e. C<\p{IsFoo}> or C<\p{InFoo}>. See L<perlunicode/User-Defined Character Properties> and L<perlsec>. =item Integer overflow in format string for %s (F) The indexes and widths specified in the format string of C<printf()> or C<sprintf()> are too large. The numbers must not overflow the size of integers for your architecture. 
=item Integer overflow in %s number (S overflow) The hexadecimal, octal or binary number you have specified either as a literal or as an argument to hex() or oct() is too big for your architecture, and has been converted to a floating point number. On a 32-bit architecture the largest hexadecimal, octal or binary number representable without overflow is 0xFFFFFFFF, 037777777777, or 0b11111111111111111111111111111111 respectively. Note that Perl transparently promotes all numbers to a floating point representation internally--subject to loss of precision errors in subsequent operations. =item Integer overflow in srand (S overflow) The number you have passed to srand is too big to fit in your architecture's integer representation. The number has been replaced with the largest integer supported (0xFFFFFFFF on 32-bit architectures). This means you may be getting less randomness than you expect, because different random seeds above the maximum will return the same sequence of random numbers. =item Integer overflow in version =item Integer overflow in version %d (W overflow) Some portion of a version initialization is too large for the size of integers for your architecture. This is not a warning because there is no rational reason for a version to try and use an element larger than typically 2**32. This is usually caused by trying to use some odd mathematical operation as a version, like 100/9. =item Internal disaster in regex; marked by S<<-- HERE> in m/%s/ (P) Something went badly wrong in the regular expression parser. The S<<-- HERE> shows whereabouts in the regular expression the problem was discovered. =item Internal inconsistency in tracking vforks (S) A warning peculiar to VMS. Perl keeps track of the number of times you've called C<fork> and C<exec>, to determine whether the current call to C<exec> should affect the current script or a subprocess (see L<perlvms/"exec LIST">). 
Somehow, this count has become scrambled, so Perl is making a guess and treating this C<exec> as a request to terminate the Perl script and execute the specified command. =item internal %<num>p might conflict with future printf extensions (S internal) Perl's internal routine that handles C<printf> and C<sprintf> formatting follows a slightly different set of rules when called from C or XS code. Specifically, formats consisting of digits followed by "p" (e.g., "%7p") are reserved for future use. If you see this message, then an XS module tried to call that routine with one such reserved format. =item Internal urp in regex; marked by S<<-- HERE> in m/%s/ (P) Something went badly awry in the regular expression parser. The S<<-- HERE> shows whereabouts in the regular expression the problem was discovered. =item %s (...) interpreted as function (W syntax) You've run afoul of the rule that says that any list operator followed by parentheses turns into a function, with all the list operator's arguments found inside the parentheses. See L<perlop/Terms and List Operators (Leftward)>. =item In '(?...)', the '(' and '?' must be adjacent in regex; marked by S<<-- HERE> in m/%s/ (F) The two-character sequence C<"(?"> in this context in a regular expression pattern should be an indivisible token, with nothing intervening between the C<"("> and the C<"?">, but you separated them with whitespace. =item In '(*...)', the '(' and '*' must be adjacent in regex; marked by S<<-- HERE> in m/%s/ (F) The two-character sequence C<"(*"> in this context in a regular expression pattern should be an indivisible token, with nothing intervening between the C<"("> and the C<"*">, but you separated them. Fix the pattern and retry. =item Invalid %s attribute: %s (F) The indicated attribute for a subroutine or variable was not recognized by Perl or by a user-supplied handler. See L<attributes>. 
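The "Invalid %s attribute" error is raised at compile time, so it can be observed by compiling the declaration inside a string C<eval>. This is a small sketch under that assumption; the subroutine names and the C<:bogus> attribute are made up for the illustration, while C<:method> is one of the attributes Perl itself recognizes.

```perl
use strict;
use warnings;

# String eval so the compile-time failure can be trapped.
my $ok  = eval q{ sub with_method :method { 1 } 1 };
my $bad = eval q{ sub with_bogus :bogus { 1 } 1 };
my $err = $@;

print "built-in attribute accepted\n" if $ok;
print "rejected: $err" unless $bad;
```

The C<%s> in the message is filled in with the kind of thing being declared, which is C<CODE> for a subroutine.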
=item Invalid %s attributes: %s (F) The indicated attributes for a subroutine or variable were not recognized by Perl or by a user-supplied handler. See L<attributes>. =item Invalid character in charnames alias definition; marked by S<<-- HERE> in '%s (F) You tried to create a custom alias for a character name, with the C<:alias> option to C<use charnames> and the specified character in the indicated name isn't valid. See L<charnames/CUSTOM ALIASES>. =item Invalid \0 character in %s for %s: %s\0%s (W syscalls) Embedded \0 characters in pathnames or other system call arguments produce a warning as of 5.20. The parts after the \0 were formerly ignored by system calls. =item Invalid character in \N{...}; marked by S<<-- HERE> in \N{%s} (F) Only certain characters are valid for character names. The indicated one isn't. See L<charnames/CUSTOM ALIASES>. =item Invalid conversion in %s: "%s" (W printf) Perl does not understand the given format conversion. See L<perlfunc/sprintf>. =item Invalid escape in the specified encoding in regex; marked by S<<-- HERE> in m/%s/ (W regexp)(F) The numeric escape (for example C<\xHH>) of value < 256 didn't correspond to a single character through the conversion from the encoding specified by the encoding pragma. The escape was replaced with REPLACEMENT CHARACTER (U+FFFD) instead, except within S<C<(?[ ])>>, where it is a fatal error. The S<<-- HERE> shows whereabouts in the regular expression the escape was discovered. =item Invalid hexadecimal number in \N{U+...} =item Invalid hexadecimal number in \N{U+...} in regex; marked by S<<-- HERE> in m/%s/ (F) The character constant represented by C<...> is not a valid hexadecimal number. Either it is empty, or you tried to use a character other than 0 - 9 or A - F, a - f in a hexadecimal number. 
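As a rough illustration of the C<\N{U+...}> rule above, the sketch below compiles one well-formed escape directly and one malformed escape inside a string C<eval>, so that the failure can be inspected rather than aborting the program. The variable names are arbitrary.

```perl
use strict;
use warnings;

my $smiley = "\N{U+263A}";              # valid: hex digits only
my $bad    = eval q{ "\N{U+26ZA}" };    # 'Z' is not a hex digit
my $err    = $@;

printf "U+%04X\n", ord $smiley;
print "compile failed: $err" unless defined $bad;
```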
=item Invalid module name %s with -%c option: contains single ':' (F) The module argument to perl's B<-m> and B<-M> command-line options cannot contain single colons in the module name, but only in the arguments after "=". In other words, B<-MFoo::Bar=:baz> is ok, but B<-MFoo:Bar=baz> is not. =item Invalid mro name: '%s' (F) You tried to C<mro::set_mro("classname", "foo")> or C<use mro 'foo'>, where C<foo> is not a valid method resolution order (MRO). Currently, the only valid ones supported are C<dfs> and C<c3>, unless you have loaded a module that is a MRO plugin. See L<mro> and L<perlmroapi>. =item Invalid negative number (%s) in chr (W utf8) You passed a negative number to C<chr>. Negative numbers are not valid character numbers, so it returns the Unicode replacement character (U+FFFD). =item Invalid number '%s' for -C option. (F) You supplied a number to the -C option that either has extra leading zeroes or overflows perl's unsigned integer representation. =item invalid option -D%c, use -D'' to see choices (S debugging) Perl was called with invalid debugger flags. Call perl with the B<-D> option with no flags to see the list of acceptable values. See also L<perlrun/-Dletters>. =item Invalid quantifier in {,} in regex; marked by S<<-- HERE> in m/%s/ (F) The pattern looks like a {min,max} quantifier, but the min or max could not be parsed as a valid number - either it has leading zeroes, or it represents too big a number to cope with. The S<<-- HERE> shows where in the regular expression the problem was discovered. See L<perlre>. =item Invalid [] range "%s" in regex; marked by S<<-- HERE> in m/%s/ (F) The range specified in a character class had a minimum character greater than the maximum character. One possibility is that you forgot the C<{}> from your ending C<\x{}> - C<\x> without the curly braces can go only up to C<ff>. The S<<-- HERE> shows whereabouts in the regular expression the problem was discovered. See L<perlre>. 
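The C<\x> pitfall mentioned for the "Invalid [] range" error can be reproduced with a pair of patterns that differ only in the braces. This is a sketch for illustration; the patterns are kept in strings so that they are compiled at run time, where C<eval> can trap the failure.

```perl
use strict;
use warnings;

# Patterns are held in strings so they are compiled at run time inside eval.
my $braced   = '[\x{80}-\x{100}]';   # U+0080 through U+0100
my $unbraced = '[\x{80}-\x100]';     # parsed as \x80-\x10 followed by a literal "0"

my $ok  = eval { qr/$braced/ };
my $bad = eval { qr/$unbraced/ };
my $err = $@;

print "braced form compiles\n" if $ok;
print "unbraced form fails: $err" unless $bad;
```

Without braces C<\x> consumes at most two hex digits, so the intended upper bound of C<\x{100}> turns into C<\x10>, which is below the lower bound.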
=item Invalid range "%s" in transliteration operator (F) The range specified in the tr/// or y/// operator had a minimum character greater than the maximum character. See L<perlop>. =item Invalid reference to group in regex; marked by S<<-- HERE> in m/%s/ (F) The capture group you specified can't possibly exist because the number you used is not within the legal range of possible values for this machine. =item Invalid separator character %s in attribute list (F) Something other than a colon or whitespace was seen between the elements of an attribute list. If the previous attribute had a parenthesised parameter list, perhaps that list was terminated too soon. See L<attributes>. =item Invalid separator character %s in PerlIO layer specification %s (W layer) When pushing layers onto the Perl I/O system, something other than a colon or whitespace was seen between the elements of a layer list. If the previous attribute had a parenthesised parameter list, perhaps that list was terminated too soon. =item Invalid strict version format (%s) (F) A version number did not meet the "strict" criteria for versions. A "strict" version number is a positive decimal number (integer or decimal-fraction) without exponentiation or else a dotted-decimal v-string with a leading 'v' character and at least three components. The parenthesized text indicates which criteria were not met. See the L<version> module for more details on allowed version formats. =item Invalid type '%s' in %s (F) The given character is not a valid pack or unpack type. See L<perlfunc/pack>. (W) The given character is not a valid pack or unpack type but used to be silently ignored. =item Invalid version format (%s) (F) A version number did not meet the "lax" criteria for versions. A "lax" version number is a positive decimal number (integer or decimal-fraction) without exponentiation or else a dotted-decimal v-string. If the v-string has fewer than three components, it must have a leading 'v' character. 
Otherwise, the leading 'v' is optional. Both decimal and dotted-decimal versions may have a trailing "alpha" component separated by an underscore character after a fractional or dotted-decimal component. The parenthesized text indicates which criteria were not met. See the L<version> module for more details on allowed version formats. =item Invalid version object (F) The internal structure of the version object was invalid. Perhaps the internals were modified directly in some way or an arbitrary reference was blessed into the "version" class. =item In '(*VERB...)', the '(' and '*' must be adjacent in regex; marked by S<<-- HERE> in m/%s/ (F) The two-character sequence C<"(*"> in this context in a regular expression pattern should be an indivisible token, with nothing intervening between the C<"("> and the C<"*">, but you separated them. =item Inverting a character class which contains a multi-character sequence is illegal in regex; marked by <-- HERE in m/%s/ (F) You wrote something like qr/\P{name=KATAKANA LETTER AINU P}/ qr/[^\p{name=KATAKANA LETTER AINU P}]/ This name actually evaluates to a sequence of two Katakana characters, not just a single one, and it is illegal to try to take the complement of a sequence. (Mathematically it would mean any sequence of characters from 0 to infinity in length that weren't these two in a row, and that is likely not of any real use.) =item ioctl is not implemented (F) Your machine apparently doesn't implement ioctl(), which is pretty strange for a machine that supports C. =item ioctl() on unopened %s (W unopened) You tried ioctl() on a filehandle that was never opened. Check your control flow and number of arguments. =item IO layers (like '%s') unavailable (F) Your Perl has not been configured to have PerlIO, and therefore you cannot use IO layers. To have PerlIO, Perl must be configured with 'useperlio'. 
=item IO::Socket::atmark not implemented on this architecture (F) Your machine doesn't implement the sockatmark() functionality, neither as a system call nor an ioctl call (SIOCATMARK). =item '%s' is an unknown bound type in regex; marked by S<<-- HERE> in m/%s/ (F) You used C<\b{...}> or C<\B{...}> and the C<...> is not known to Perl. The current valid ones are given in L<perlrebackslash/\b{}, \b, \B{}, \B>. =item %s is forbidden - matches null string many times in regex; marked by S<<-- HERE> in m/%s/ (F) The pattern you've specified might cause the regular expression to loop infinitely, so it is forbidden. The S<<-- HERE> shows whereabouts in the regular expression the problem was discovered. See L<perlre>. =item %s() isn't allowed on :utf8 handles (F) The sysread(), recv(), syswrite() and send() operators are not allowed on handles that have the C<:utf8> layer, either explicitly, or implicitly, e.g., with the C<:encoding(UTF-16LE)> layer. Previously, sysread() and recv() used only the C<:utf8> flag for the stream, ignoring the actual layers. Since sysread() and recv() did no UTF-8 validation, they could end up creating invalidly encoded scalars. Similarly, syswrite() and send() used only the C<:utf8> flag, otherwise ignoring any layers. If the flag was set, both wrote the value UTF-8 encoded, even if the layer was some different encoding, such as the example above. Ideally, all of these operators would completely ignore the C<:utf8> state, working only with bytes, but this would result in silently breaking existing code. =item "%s" is more clearly written simply as "%s" in regex; marked by S<<-- HERE> in m/%s/ (W regexp) (only under C<S<use re 'strict'>> or within C<(?[...])>) You specified a character that has the given plainer way of writing it, and which is also portable to platforms running with different character sets. 
=item $* is no longer supported as of Perl 5.30 (F) The special variable C<$*>, deprecated in older perls, was removed in 5.10.0, is no longer supported and is a fatal error as of Perl 5.30. In previous versions of perl the use of C<$*> enabled or disabled multi-line matching within a string. Instead of using C<$*> you should use the C</m> (and maybe C</s>) regexp modifiers. You can enable C</m> for a lexical scope (even a whole file) with C<use re '/m'>. (In older versions: when C<$*> was set to a true value then all regular expressions behaved as if they were written using C</m>.) =item $# is no longer supported as of Perl 5.30 (F) The special variable C<$#>, deprecated in older perls, was removed as of 5.10.0, is no longer supported and is a fatal error as of Perl 5.30. You should use the printf/sprintf functions instead. =item '%s' is not a code reference (W overload) The second (fourth, sixth, ...) argument of overload::constant needs to be a code reference: either an anonymous subroutine, or a reference to a subroutine. =item '%s' is not an overloadable type (W overload) You tried to overload a constant type the overload package is unaware of. =item isa is experimental (S experimental::isa) This warning is emitted if you use the C<isa> operator. This operator is currently experimental and its behaviour may change in future releases of Perl. =item -i used with no filenames on the command line, reading from STDIN (S inplace) The C<-i> option was passed on the command line, indicating that the script is intended to edit files in place, but no files were given. This is usually a mistake, since editing STDIN in place doesn't make sense, and can be confusing because it can make perl look like it is hanging when it is really just trying to read from STDIN. You should either pass a filename to edit, or remove C<-i> from the command line. See L<perlrun|perlrun/-i[extension]> for more details. 
=item Junk on end of regexp in regex m/%s/ (P) The regular expression parser is confused. =item \K not permitted in lookahead/lookbehind in regex; marked by <-- HERE in m/%s/ (F) Your regular expression used C<\K> in a lookahead or lookbehind assertion, which currently isn't permitted. This may change in the future, see L<Support \K in lookarounds|https://github.com/Perl/perl5/issues/18134>. =item Label not found for "last %s" (F) You named a loop to break out of, but you're not currently in a loop of that name, not even if you count where you were called from. See L<perlfunc/last>. =item Label not found for "next %s" (F) You named a loop to continue, but you're not currently in a loop of that name, not even if you count where you were called from. See L<perlfunc/next>. =item Label not found for "redo %s" (F) You named a loop to restart, but you're not currently in a loop of that name, not even if you count where you were called from. See L<perlfunc/redo>. =item leaving effective %s failed (F) While under the C<use filetest> pragma, switching the real and effective uids or gids failed. =item length/code after end of string in unpack (F) While unpacking, the string buffer was already used up when an unpack length/code combination tried to obtain more data. This results in an undefined value for the length. See L<perlfunc/pack>. =item length() used on %s (did you mean "scalar(%s)"?) (W syntax) You used length() on either an array or a hash when you probably wanted a count of the items. Array size can be obtained by doing: scalar(@array); The number of items in a hash can be obtained by doing: scalar(keys %hash); =item Lexing code attempted to stuff non-Latin-1 character into Latin-1 input (F) An extension is attempting to insert text into the current parse (using L<lex_stuff_pvn|perlapi/lex_stuff_pvn> or similar), but tried to insert a character that couldn't be part of the current input. 
This is an inherent pitfall of the stuffing mechanism, and one of the reasons to avoid it. Where it is necessary to stuff, stuffing only plain ASCII is recommended. =item Lexing code internal error (%s) (F) Lexing code supplied by an extension violated the lexer's API in a detectable way. =item listen() on closed socket %s (W closed) You tried to do a listen on a closed socket. Did you forget to check the return value of your socket() call? See L<perlfunc/listen>. =item List form of piped open not implemented (F) On some platforms, notably Windows, the three-or-more-arguments form of C<open> does not support pipes, such as C<open($pipe, '|-', @args)>. Use the two-argument C<open($pipe, '|prog arg1 arg2...')> form instead. =item Literal vertical space in [] is illegal except under /x in regex; marked by S<<-- HERE> in m/%s/ (F) (only under C<S<use re 'strict'>> or within C<(?[...])>) Likely you forgot the C</x> modifier or there was a typo in the pattern. For example, did you really mean to match a form-feed? If so, all the ASCII vertical space control characters are representable by escape sequences which won't present such a jarring appearance as your pattern does when displayed. \r carriage return \f form feed \n line feed \cK vertical tab =item %s: loadable library and perl binaries are mismatched (got handshake key %p, needed %p) (P) A dynamic loading library C<.so> or C<.dll> was being loaded into the process that was built against a different build of perl than the said library was compiled against. Reinstalling the XS module will likely fix this error. =item Locale '%s' contains (at least) the following characters which have unexpected meanings: %s The Perl program will use the expected meanings (W locale) You are using the named UTF-8 locale. UTF-8 locales are expected to have very particular behavior, which most do. 
This message arises when perl found some departures from the expectations, and is notifying you that the expected behavior overrides these differences. In some cases the differences are caused by the locale definition being defective, but the most common causes of this warning are when there are ambiguities and conflicts in following the Standard, and the locale has chosen an approach that differs from Perl's. One of these is that, contrary to the claims, Unicode is not completely locale insensitive. Turkish and some related languages have two types of C<"I"> characters. One is dotted in both upper- and lowercase, and the other is dotless in both cases. Unicode allows a locale to use either the Turkish rules, or the rules used in all other instances, where there is only one type of C<"I">, which is dotless in the uppercase, and dotted in the lower. The perl core does not (yet) handle the Turkish case, and this message warns you of that. Instead, the L<Unicode::Casing> module allows you to mostly implement the Turkish casing rules. The other common cause is for the characters $ + < = > ^ ` | ~ These are problematic. The C standard says that these should be considered punctuation in the C locale (and the POSIX standard defers to the C standard), and Unicode is generally considered a superset of the C locale. But Unicode has added an extra category, "Symbol", and classifies these particular characters as being symbols. Most UTF-8 locales have them treated as punctuation, so that L<ispunct(3)> returns non-zero for them. But a few locales have it return 0. Perl takes the first approach, not using C<ispunct()> at all (see L<Note [5] in perlrecharclass|perlrecharclass/[5]>), and this message is raised to notify you that you are getting Perl's approach, not the locale's. =item Locale '%s' may not work well.%s (W locale) You are using the named locale, which is a non-UTF-8 one, and which perl has determined is not fully compatible with what it can handle. 
The second C<%s> gives a reason. By far the most common reason is that the locale has characters in it that are represented by more than one byte. The only such locales that Perl can handle are the UTF-8 locales. Most likely the specified locale is a non-UTF-8 one for an East Asian language such as Chinese or Japanese. If the locale is a superset of ASCII, the ASCII portion of it may work in Perl. Some essentially obsolete locales that aren't supersets of ASCII, mainly those in ISO 646 or other 7-bit locales, such as ASMO 449, can also have problems, depending on what portions of the ASCII character set get changed by the locale and are also used by the program. The warning message lists the determinable conflicting characters. Note that not all incompatibilities are found. If this happens to you, there's not much you can do except switch to use a different locale or use L<Encode> to translate from the locale into UTF-8; if that's impracticable, you have been warned that some things may break. This message is output once each time a bad locale is switched into within the scope of C<S<use locale>>, or on the first possibly-affected operation if the C<S<use locale>> inherits a bad one. It is not raised for any operations from the L<POSIX> module. =item localtime(%f) failed (W overflow) You called C<localtime> with a number that it could not handle: too large, too small, or NaN. The returned value is C<undef>. =item localtime(%f) too large (W overflow) You called C<localtime> with a number that was larger than it can reliably handle and C<localtime> probably returned the wrong date. This warning is also triggered with NaN (the special not-a-number value). =item localtime(%f) too small (W overflow) You called C<localtime> with a number that was smaller than it can reliably handle and C<localtime> probably returned the wrong date. 
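The three C<localtime> entries above can be observed from a script by trapping warnings. The following is a minimal sketch, assuming a perl recent enough to have the C<overflow> warnings category; the value C<1e300> is only an illustrative out-of-range input, and which of the messages you see may depend on the perl version and platform.

```perl
use strict;
use warnings;

# Collect warnings instead of printing them, so they can be inspected.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

# Hypothetical out-of-range input, far beyond what the time
# functions are expected to represent.
my $huge = 1e300;

# In scalar context localtime normally returns a date string.
my $stamp = localtime($huge);

print defined $stamp ? "got: $stamp\n" : "localtime returned undef\n";
print "warning: $_" for @warnings;
```

Checking C<defined> on the result, as done here, is a simple way to avoid acting on a date that could not be computed.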
=item Lookbehind longer than %d not implemented in regex m/%s/ (F) There is currently a limit on the length of string which lookbehind can handle. This restriction may be eased in a future release. =item Lost precision when %s %f by 1 (W imprecision) The value you attempted to increment or decrement by one is too large for the underlying floating point representation to store accurately, hence the target of C<++> or C<--> is unchanged. Perl issues this warning because it has already switched from integers to floating point when values are too large for integers, and now even floating point is insufficient. You may wish to switch to using L<Math::BigInt> explicitly. =item lstat() on filehandle%s (W io) You tried to do an lstat on a filehandle. What did you mean by that? lstat() makes sense only on filenames. (Perl did a fstat() instead on the filehandle.) =item lvalue attribute %s already-defined subroutine (W misc) Although L<attributes.pm|attributes> allows this, turning the lvalue attribute on or off on a Perl subroutine that is already defined does not always work properly. It may or may not do what you want, depending on what code is inside the subroutine, with exact details subject to change between Perl versions. Only do this if you really know what you are doing. =item lvalue attribute ignored after the subroutine has been defined (W misc) Using the C<:lvalue> declarative syntax to make a Perl subroutine an lvalue subroutine after it has been defined is not permitted. To make the subroutine an lvalue subroutine, add the lvalue attribute to the definition, or put the C<sub foo :lvalue;> declaration before the definition. See also L<attributes.pm|attributes>. =item Magical list constants are not supported (F) You assigned a magical array to a stash element, and then tried to use the subroutine from the same slot. You are asking Perl to do something it cannot do, details subject to change between Perl versions. 
=item Malformed integer in [] in pack (F) Between the brackets enclosing a numeric repeat count only digits are permitted. See L<perlfunc/pack>. =item Malformed integer in [] in unpack (F) Between the brackets enclosing a numeric repeat count only digits are permitted. See L<perlfunc/pack>. =item Malformed PERLLIB_PREFIX (F) An error peculiar to OS/2. PERLLIB_PREFIX should be of the form prefix1;prefix2 or prefix1 prefix2 with nonempty prefix1 and prefix2. If C<prefix1> is indeed a prefix of a builtin library search path, prefix2 is substituted. The error may appear if components are not found, or are too long. See "PERLLIB_PREFIX" in L<perlos2>. =item Malformed prototype for %s: %s (F) You tried to use a function with a malformed prototype. The syntax of function prototypes is given a brief compile-time check for obvious errors like invalid characters. A more rigorous check is run when the function is called. Perhaps the function's author was trying to write a subroutine signature but didn't enable that feature first (C<use feature 'signatures'>), so the signature was instead interpreted as a bad prototype. =item Malformed UTF-8 character%s (S utf8)(F) Perl detected a string that should be UTF-8, but didn't comply with UTF-8 encoding rules, or represents a code point whose ordinal integer value doesn't fit into the word size of the current platform (overflows). Details as to the exact malformation are given in the variable, C<%s>, part of the message. One possible cause is that you set the UTF8 flag yourself for data that you thought to be in UTF-8 but it wasn't (it was for example legacy 8-bit data). To guard against this, you can use C<Encode::decode('UTF-8', ...)>. If you use the C<:encoding(UTF-8)> PerlIO layer for input, invalid byte sequences are handled gracefully, but if you use C<:utf8>, the flag is set without validating the data, possibly resulting in this error message. See also L<Encode/"Handling Malformed Data">. 
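As a sketch of the guard suggested above, the following validates incoming bytes with C<Encode::decode> rather than setting the UTF8 flag by hand. The helper name C<try_decode_utf8> is hypothetical and not part of L<Encode>; the C<FB_CROAK> and C<LEAVE_SRC> check flags are documented in L<Encode/"Handling Malformed Data">.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper: returns the decoded string, or undef if
# $bytes is not valid UTF-8.
sub try_decode_utf8 {
    my ($bytes) = @_;
    # FB_CROAK makes decode die on malformed input; LEAVE_SRC keeps
    # decode from modifying $bytes in place.
    my $text = eval {
        Encode::decode('UTF-8', $bytes,
            Encode::FB_CROAK() | Encode::LEAVE_SRC());
    };
    return $text;
}

my $good = try_decode_utf8("caf\xC3\xA9");  # valid UTF-8 (e-acute as two bytes)
my $bad  = try_decode_utf8("caf\xE9");      # Latin-1 byte, malformed as UTF-8

printf "good: %d characters\n", length $good;
print  "bad: ", (defined $bad ? "decoded" : "rejected"), "\n";
```

Rejecting the data at the point of input tends to be easier to debug than a malformation reported later by some unrelated operation.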
=item Malformed UTF-8 returned by \N{%s} immediately after '%s' (F) The charnames handler returned malformed UTF-8. =item Malformed UTF-8 string in "%s" (F) This message indicates a bug either in the Perl core or in XS code. Such code was trying to find out if a character, allegedly stored internally encoded as UTF-8, was of a given type, such as being punctuation or a digit. But the character was not encoded in legal UTF-8. The C<%s> is replaced by a string that can be used by knowledgeable people to determine what the type being checked against was. Passing malformed strings was deprecated in Perl 5.18, and became fatal in Perl 5.26. =item Malformed UTF-8 string in '%c' format in unpack (F) You tried to unpack something that didn't comply with UTF-8 encoding rules and perl was unable to guess how to make more progress. =item Malformed UTF-8 string in pack (F) You tried to pack something that didn't comply with UTF-8 encoding rules and perl was unable to guess how to make more progress. =item Malformed UTF-8 string in unpack (F) You tried to unpack something that didn't comply with UTF-8 encoding rules and perl was unable to guess how to make more progress. =item Malformed UTF-16 surrogate (F) Perl thought it was reading UTF-16 encoded character data but while doing it Perl met a malformed Unicode surrogate. =item Mandatory parameter follows optional parameter (F) In a subroutine signature, you wrote something like "$a = undef, $b", making an earlier parameter optional and a later one mandatory. Parameters are filled from left to right, so it's impossible for the caller to omit an earlier one and pass a later one. If you want to act as if the parameters are filled from right to left, declare the rightmost optional and then shuffle the parameters around in the subroutine's body. 
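The workaround described above (declare the rightmost parameter optional, then shuffle in the body) might look like the following sketch. It assumes perl 5.20 or later for signatures; the subroutine C<label> and its C<'item'> default are invented for illustration.

```perl
use strict;
use warnings;
use feature 'signatures';
no warnings 'experimental::signatures';

# The caller's view: label($name) or label($prefix, $name).
# The signature itself must keep the optional parameter last.
sub label ($first, $second = undef) {
    my ($prefix, $name) = defined $second
        ? ($first, $second)      # both supplied: prefix, then name
        : ('item', $first);      # one supplied: it is the name
    return "$prefix:$name";
}

print label('widget'), "\n";
print label('box', 'widget'), "\n";
```

Note that this sketch treats an explicit C<undef> second argument the same as an omitted one; whether that is acceptable depends on the interface.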
=item Matched non-Unicode code point 0x%X against Unicode property; may not be portable (S non_unicode) Perl allows strings to contain a superset of Unicode code points; each code point may be as large as what is storable in a signed integer on your system, but these may not be accepted by other languages/systems. This message occurs when you matched a string containing such a code point against a regular expression pattern, and the code point was matched against a Unicode property, C<\p{...}> or C<\P{...}>. Unicode properties are only defined on Unicode code points, so the result of this match is undefined by Unicode, but Perl (starting in v5.20) treats non-Unicode code points as if they were typical unassigned Unicode ones, and matched this one accordingly. Whether a given property matches these code points or not is specified in L<perluniprops/Properties accessible through \p{} and \P{}>. This message is suppressed (unless it has been made fatal) if it is immaterial to the results of the match if the code point is Unicode or not. For example, the property C<\p{ASCII_Hex_Digit}> only can match the 22 characters C<[0-9A-Fa-f]>, so obviously all other code points, Unicode or not, won't match it. (And C<\P{ASCII_Hex_Digit}> will match every code point except these 22.) Getting this message indicates that the outcome of the match arguably should have been the opposite of what actually happened. If you think that is the case, you may wish to make the C<non_unicode> warnings category fatal; if you agree with Perl's decision, you may wish to turn off this category. See L<perlunicode/Beyond Unicode code points> for more information. =item %s matches null string many times in regex; marked by S<<-- HERE> in m/%s/ (W regexp) The pattern you've specified would be an infinite loop if the regular expression engine didn't specifically check for that. The S<<-- HERE> shows whereabouts in the regular expression the problem was discovered. See L<perlre>. 
=item Maximal count of pending signals (%u) exceeded (F) Perl aborted due to too high a number of signals pending. This usually indicates that your operating system tried to deliver signals too fast (with a very high priority), starving the perl process from resources it would need to reach a point where it can process signals safely. (See L<perlipc/"Deferred Signals (Safe Signals)">.) =item "%s" may clash with future reserved word (W) This warning may be due to running a perl5 script through a perl4 interpreter, especially if the word that is being warned about is "use" or "my". =item '%' may not be used in pack (F) You can't pack a string by supplying a checksum, because the checksumming process loses information, and you can't go the other way. See L<perlfunc/unpack>. =item Method for operation %s not found in package %s during blessing (F) An attempt was made to specify an entry in an overloading table that doesn't resolve to a valid subroutine. See L<overload>. =item Method %s not permitted See L</500 Server error>. =item Might be a runaway multi-line %s string starting on line %d (S) An advisory indicating that the previous error may have been caused by a missing delimiter on a string or pattern, because it eventually ended earlier on the current line. =item Misplaced _ in number (W syntax) An underscore (underbar) in a numeric constant did not separate two digits. =item Missing argument for %n in %s (F) A C<%n> was used in a format string with no corresponding argument for perl to write the current string length to. =item Missing argument in %s (W missing) You called a function with fewer arguments than other arguments you supplied indicated would be needed. Currently only emitted when a printf-type format required more arguments than were supplied, but might be used in the future for other cases where we can statically determine that arguments to functions are missing, e.g. for the L<perlfunc/pack> function. 
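A small sketch of how the "Missing argument" warning arises with a printf-type format: the format below asks for two values but only one is supplied. The variable names are illustrative only, and the arguments are passed through an array so that the call is evaluated at run time.

```perl
use strict;
use warnings;

# Collect warnings so the message can be examined.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, $_[0] };

my @args = ('only-one');              # one value for a two-field format
my $out  = sprintf('%s-%s', @args);   # second %s has no argument

print "result: '$out'\n";
print "warning: $_" for @warnings;
```

The missing field is formatted as an empty string, so the output is still produced; the warning is the only sign that the format and the argument list disagree.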
=item Missing argument to -%c (F) The argument to the indicated command line switch must follow immediately after the switch, without intervening spaces. =item Missing braces on \N{} =item Missing braces on \N{} in regex; marked by S<<-- HERE> in m/%s/ (F) Wrong syntax of character name literal C<\N{charname}> within double-quotish context. This can also happen when there is a space (or comment) between the C<\N> and the C<{> in a regex with the C</x> modifier. This modifier does not change the requirement that the brace immediately follow the C<\N>. =item Missing braces on \o{} (F) A C<\o> must be followed immediately by a C<{> in double-quotish context. =item Missing comma after first argument to %s function (F) While certain functions allow you to specify a filehandle or an "indirect object" before the argument list, this ain't one of them. =item Missing command in piped open (W pipe) You used the C<open(FH, "| command")> or C<open(FH, "command |")> construction, but the command was missing or blank. =item Missing control char name in \c (F) A double-quoted string ended with "\c", without the required control character name. =item Missing ']' in prototype for %s : %s (W illegalproto) A grouping was started with C<[> but never closed with C<]>. =item Missing name in "%s sub" (F) The syntax for lexically scoped subroutines requires that they have a name with which they can be found. =item Missing $ on loop variable (F) Apparently you've been programming in B<csh> too much. Variables are always mentioned with the $ in Perl, unlike in the shells, where it can vary from one line to the next. =item (Missing operator before %s?) (S syntax) This is an educated guess made in conjunction with the message "%s found where operator expected". Often the missing operator is a comma. =item Missing or undefined argument to %s (F) You tried to call require or do with no argument or with an undefined value as an argument. 
Require expects either a package name or a file-specification as an argument; do expects a filename. See L<perlfunc/require EXPR> and L<perlfunc/do EXPR>. =item Missing right brace on \%c{} in regex; marked by S<<-- HERE> in m/%s/ (F) Missing right brace in C<\x{...}>, C<\p{...}>, C<\P{...}>, or C<\N{...}>. =item Missing right brace on \N{} =item Missing right brace on \N{} or unescaped left brace after \N (F) C<\N> has two meanings. The traditional one has it followed by a name enclosed in braces, meaning the character (or sequence of characters) given by that name. Thus C<\N{ASTERISK}> is another way of writing C<*>, valid in both double-quoted strings and regular expression patterns. In patterns, it doesn't have the meaning an unescaped C<*> does. Starting in Perl 5.12.0, C<\N> also can have an additional meaning (only) in patterns, namely to match a non-newline character. (This is short for C<[^\n]>, and like C<.> but is not affected by the C</s> regex modifier.) This can lead to some ambiguities. When C<\N> is not followed immediately by a left brace, Perl assumes the C<[^\n]> meaning. Also, if the braces form a valid quantifier such as C<\N{3}> or C<\N{5,}>, Perl assumes that this means to match the given quantity of non-newlines (in these examples, 3; and 5 or more, respectively). In all other cases, where there is a C<\N{> and a matching C<}>, Perl assumes that a character name is desired. However, if there is no matching C<}>, Perl doesn't know if it was mistakenly omitted, or if C<[^\n]{> was desired, and raises this error. If you meant the former, add the right brace; if you meant the latter, escape the brace with a backslash, like so: C<\N\{> =item Missing right curly or square bracket (F) The lexer counted more opening curly or square brackets than closing ones. As a general rule, you'll find it's missing near the place you were last editing. =item (Missing semicolon on previous line?) 
(S syntax) This is an educated guess made in conjunction with the message "%s found where operator expected". Don't automatically put a semicolon on the previous line just because you saw this message. =item Modification of a read-only value attempted (F) You tried, directly or indirectly, to change the value of a constant. You didn't, of course, try "2 = 1", because the compiler catches that. But an easy way to do the same thing is: sub mod { $_[0] = 1 } mod(2); Another way is to assign to a substr() that's off the end of the string. Yet another way is to assign to a C<foreach> loop I<VAR> when I<VAR> is aliased to a constant in the loop I<LIST>: $x = 1; foreach my $n ($x, 2) { $n *= 2; # modifies the $x, but fails on attempt to } # modify the 2 =item Modification of non-creatable array value attempted, %s (F) You tried to make an array value spring into existence, and the subscript was probably negative, even counting from end of the array backwards. =item Modification of non-creatable hash value attempted, %s (P) You tried to make a hash value spring into existence, and it couldn't be created for some peculiar reason. =item Module name must be constant (F) Only a bare module name is allowed as the first argument to a "use". =item Module name required with -%c option (F) The C<-M> or C<-m> options say that Perl should load some module, but you omitted the name of the module. Consult L<perlrun|perlrun/-m[-]module> for full details about C<-M> and C<-m>. =item More than one argument to '%s' open (F) The C<open> function has been asked to open multiple files. This can happen if you are trying to open a pipe to a command that takes a list of arguments, but have forgotten to specify a piped open mode. See L<perlfunc/open> for details. =item mprotect for COW string %p %u failed with %d (S) You compiled perl with B<-D>PERL_DEBUG_READONLY_COW (see L<perlguts/"Copy on Write">), but a shared string buffer could not be made read-only. 
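Returning to the "Modification of a read-only value attempted" entry earlier in this section: because the elements of C<@_> are aliases to the caller's arguments, the error depends on what the caller passed. The sketch below traps the failure with C<eval>; the subroutine name C<set_to_one> is illustrative.

```perl
use strict;
use warnings;

# Assigning to $_[0] writes through to the caller's argument.
sub set_to_one { $_[0] = 1 }

my $x = 2;
set_to_one($x);                       # fine: $x is an ordinary variable

# Passing a literal means the alias points at a constant.
my $ok    = eval { set_to_one(2); 1 };
my $error = $ok ? '' : $@;

print "x is now $x\n";
print "literal call failed: $error" unless $ok;
```

Copying the argument first (C<my ($value) = @_;>) avoids the problem when the subroutine only needs a private, modifiable value.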
=item mprotect for %p %u failed with %d (S) You compiled perl with B<-D>PERL_DEBUG_READONLY_OPS (see L<perlhacktips>), but an op tree could not be made read-only. =item mprotect RW for COW string %p %u failed with %d (S) You compiled perl with B<-D>PERL_DEBUG_READONLY_COW (see L<perlguts/"Copy on Write">), but a read-only shared string buffer could not be made mutable. =item mprotect RW for %p %u failed with %d (S) You compiled perl with B<-D>PERL_DEBUG_READONLY_OPS (see L<perlhacktips>), but a read-only op tree could not be made mutable before freeing the ops. =item msg%s not implemented (F) You don't have System V message IPC on your system. =item Multidimensional syntax %s not supported (W syntax) Multidimensional arrays aren't written like C<$foo[1,2,3]>. They're written like C<$foo[1][2][3]>, as in C. =item Multiple slurpy parameters not allowed (F) In subroutine signatures, a slurpy parameter (C<@> or C<%>) must be the last parameter, and there must not be more than one of them; for example: sub foo ($a, @b) {} # legal sub foo ($a, @b, %) {} # invalid =item '/' must follow a numeric type in unpack (F) You had an unpack template that contained a '/', but this did not follow some unpack specification producing a numeric value. See L<perlfunc/pack>. =item %s must not be a named sequence in transliteration operator (F) Transliteration (C<tr///> and C<y///>) transliterates individual characters. But a named sequence by definition is more than an individual character, and hence doing this operation on it doesn't make sense. =item "my sub" not yet implemented (F) Lexically scoped subroutines are not yet implemented. Don't try that yet. =item "my" subroutine %s can't be in a package (F) Lexically scoped subroutines aren't in a package, so it doesn't make sense to try to declare one with a package qualifier on the front. =item "my %s" used in sort comparison (W syntax) The package variables $a and $b are used for sort comparisons. 
You used $a or $b as an operand to the C<< <=> >> or C<cmp> operator inside a sort comparison block, and the variable had earlier been declared as a lexical variable. Either qualify the sort variable with the package name, or rename the lexical variable. =item "my" variable %s can't be in a package (F) Lexically scoped variables aren't in a package, so it doesn't make sense to try to declare one with a package qualifier on the front. Use local() if you want to localize a package variable. =item Name "%s::%s" used only once: possible typo (W once) Typographical errors often show up as unique variable names. If you had a good reason for having a unique name, then just mention it again somehow to suppress the message. The C<our> declaration is also provided for this purpose. NOTE: This warning detects package symbols that have been used only once. This means lexical variables will never trigger this warning. It also means that all of the package variables $c, @c, %c, as well as *c, &c, sub c{}, c(), and c (the filehandle or format) are considered the same; if a program uses $c only once but also uses any of the others it will not trigger this warning. Symbols beginning with an underscore and symbols using special identifiers (q.v. L<perldata>) are exempt from this warning. =item Need exactly 3 octal digits in regex; marked by S<<-- HERE> in m/%s/ (F) Within S<C<(?[ ])>>, all constants interpreted as octal need to be exactly 3 digits long. This helps catch some ambiguities. If your constant is too short, add leading zeros, like (?[ [ \078 ] ]) # Syntax error! (?[ [ \0078 ] ]) # Works (?[ [ \007 8 ] ]) # Clearer The maximum number this construct can express is C<\777>. If you need a larger one, you need to use L<\o{}|perlrebackslash/Octal escapes> instead. If you meant two separate things, you need to separate them: (?[ [ \7776 ] ]) # Syntax error! 
(?[ [ \o{7776} ] ]) # One meaning (?[ [ \777 6 ] ]) # Another meaning (?[ [ \777 \006 ] ]) # Still another =item Negative '/' count in unpack (F) The length count obtained from a length/code unpack operation was negative. See L<perlfunc/pack>. =item Negative length (F) You tried to do a read/write/send/recv operation with a buffer length that is less than 0. This is difficult to imagine. =item Negative offset to vec in lvalue context (F) When C<vec> is called in an lvalue context, the second argument must be greater than or equal to zero. =item Negative repeat count does nothing (W numeric) You tried to execute the L<C<x>|perlop/Multiplicative Operators> repetition operator fewer than 0 times, which doesn't make sense. =item Nested quantifiers in regex; marked by S<<-- HERE> in m/%s/ (F) You can't quantify a quantifier without intervening parentheses. So things like ** or +* or ?* are illegal. The S<<-- HERE> shows whereabouts in the regular expression the problem was discovered. Note that the minimal matching quantifiers, C<*?>, C<+?>, and C<??> appear to be nested quantifiers, but aren't. See L<perlre>. =item %s never introduced (S internal) The symbol in question was declared but somehow went out of scope before it could possibly have been used. =item next::method/next::can/maybe::next::method cannot find enclosing method (F) C<next::method> needs to be called within the context of a real method in a real package, and it could not find such a context. See L<mro>. =item \N in a character class must be a named character: \N{...} in regex; marked by S<<-- HERE> in m/%s/ (F) The new (as of Perl 5.12) meaning of C<\N> as C<[^\n]> is not valid in a bracketed character class, for the same reason that C<.> in a character class loses its specialness: it matches almost everything, which is probably not what you want. 
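The different readings of C<\N> discussed in this entry can be compared side by side. This is a sketch that assumes perl 5.16 or later, so that named characters such as C<\N{ASTERISK}> load L<charnames> automatically; the sample strings are invented.

```perl
use strict;
use warnings;

my $text = "first line\nsecond line";

# Outside a bracketed class, \N matches any non-newline character.
my ($head)  = $text =~ /^(\N+)/;

# A valid quantifier in braces means "that many non-newlines".
my ($three) = $text =~ /(\N{3})/;

# A name in braces is a named character, here a literal asterisk.
my ($star)  = "a*b" =~ /(\N{ASTERISK})/;

# Inside a bracketed class, spell the non-newline meaning out
# as a negated class instead of using a bare \N.
my ($word)  = $text =~ /([^\n ]+)/;

print "$head|$three|$star|$word\n";
```

Only the named-character form is accepted inside C<[...]>, which is why the last pattern uses C<[^\n ]> rather than a bare C<\N>.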
=item \N{} here is restricted to one character in regex; marked by <-- HERE in m/%s/ (F) Named Unicode character escapes (C<\N{...}>) may return a multi-character sequence. Even though a character class is supposed to match just one character of input, perl will match the whole thing correctly, except under certain conditions. These currently are =over 4 =item When the class is inverted (C<[^...]>) The mathematically logical behavior for what matches when inverting is very different from what people expect, so we have decided to forbid it. =item The escape is the beginning or final end point of a range Similarly unclear is what should be generated when the C<\N{...}> is used as one of the end points of the range, such as in [\x{41}-\N{ARABIC SEQUENCE YEH WITH HAMZA ABOVE WITH AE}] What is meant here is unclear, as the C<\N{...}> escape is a sequence of code points, so this is made an error. =item In a regex set The syntax S<C<(?[ ])>> in a regular expression yields a list of single code points, none can be a sequence. =back =item No %s allowed while running setuid (F) Certain operations are deemed to be too insecure for a setuid or setgid script to even be allowed to attempt. Generally speaking there will be another way to do what you want that is, if not secure, at least securable. See L<perlsec>. =item No code specified for -%c (F) Perl's B<-e> and B<-E> command-line options require an argument. If you want to run an empty program, pass the empty string as a separate argument or run a program consisting of a single 0 or 1: perl -e "" perl -e0 perl -e1 =item No comma allowed after %s (F) A list operator that has a filehandle or "indirect object" is not allowed to have a comma between that and the following arguments. Otherwise it'd be just another one of the arguments. 
One possible cause for this is that you expected to have imported a constant to your name space with B<use> or B<import> while no such importing took place, it may for example be that your operating system does not support that particular constant. Hopefully you did use an explicit import list for the constants you expect to see; please see L<perlfunc/use> and L<perlfunc/import>. While an explicit import list would probably have caught this error earlier it naturally does not remedy the fact that your operating system still does not support that constant. Maybe you have a typo in the constants of the symbol import list of B<use> or B<import> or in the constant name at the line where this error was triggered? =item No command into which to pipe on command line (F) An error peculiar to VMS. Perl handles its own command line redirection, and found a '|' at the end of the command line, so it doesn't know where you want to pipe the output from this command. =item No DB::DB routine defined (F) The currently executing code was compiled with the B<-d> switch, but for some reason the current debugger (e.g. F<perl5db.pl> or a C<Devel::> module) didn't define a routine to be called at the beginning of each statement. =item No dbm on this machine (P) This is counted as an internal error, because every machine should supply dbm nowadays, because Perl comes with SDBM. See L<SDBM_File>. =item No DB::sub routine defined (F) The currently executing code was compiled with the B<-d> switch, but for some reason the current debugger (e.g. F<perl5db.pl> or a C<Devel::> module) didn't define a C<DB::sub> routine to be called at the beginning of each ordinary subroutine call. =item No digits found for %s literal (F) No hexadecimal digits were found following C<0x> or no binary digits were found following C<0b>. =item No directory specified for -I (F) The B<-I> command-line switch requires a directory name as part of the I<same> argument. Use B<-Ilib>, for instance. B<-I lib> won't work. 
=item No error file after 2> or 2>> on command line (F) An error peculiar to VMS. Perl handles its own command line redirection, and found a '2>' or a '2>>' on the command line, but can't find the name of the file to which to write data destined for stderr. =item No group ending character '%c' found in template (F) A pack or unpack template has an opening '(' or '[' without its matching counterpart. See L<perlfunc/pack>. =item No input file after < on command line (F) An error peculiar to VMS. Perl handles its own command line redirection, and found a '<' on the command line, but can't find the name of the file from which to read data for stdin. =item No next::method '%s' found for %s (F) C<next::method> found no further instances of this method name in the remaining packages of the MRO of this class. If you don't want it throwing an exception, use C<maybe::next::method> or C<next::can>. See L<mro>. =item Non-finite repeat count does nothing (W numeric) You tried to execute the L<C<x>|perlop/Multiplicative Operators> repetition operator C<Inf> (or C<-Inf>) or C<NaN> times, which doesn't make sense. =item Non-hex character in regex; marked by S<<-- HERE> in m/%s/ (F) In a regular expression, there was a non-hexadecimal character where a hex one was expected, like (?[ [ \xDG ] ]) (?[ [ \x{DEKA} ] ]) =item Non-hex character '%c' terminates \x early. Resolved as "%s" (W digit) In parsing a hexadecimal numeric constant, a character was unexpectedly encountered that isn't hexadecimal. The resulting value is as indicated. Note that, within braces, every character starting with the first non-hexadecimal up to the ending brace is ignored. =item Non-octal character in regex; marked by S<<-- HERE> in m/%s/ (F) In a regular expression, there was a non-octal character where an octal one was expected, like (?[ [ \o{1278} ] ]) =item Non-octal character '%c' terminates \o early. 
Resolved as "%s" (W digit) In parsing an octal numeric constant, a character was unexpectedly encountered that isn't octal. The resulting value is as indicated. When not using C<\o{...}>, you wrote something like C<\08>, or C<\179> in a double-quotish string. The resolution is as indicated, with all but the last digit treated as a single character, specified in octal. The last digit is the next character in the string. To tell Perl that this is indeed what you want, you can use the C<\o{ }> syntax, or use exactly three digits to specify the octal for the character. Note that, within braces, every character starting with the first non-octal up to the ending brace is ignored. =item "no" not allowed in expression (F) The "no" keyword is recognized and executed at compile time, and returns no useful value. See L<perlmod>. =item Non-string passed as bitmask (W misc) A number has been passed as a bitmask argument to select(). Use the vec() function to construct the file descriptor bitmasks for select. See L<perlfunc/select>. =item No output file after > on command line (F) An error peculiar to VMS. Perl handles its own command line redirection, and found a lone '>' at the end of the command line, so it doesn't know where you wanted to redirect stdout. =item No output file after > or >> on command line (F) An error peculiar to VMS. Perl handles its own command line redirection, and found a '>' or a '>>' on the command line, but can't find the name of the file to which to write data destined for stdout. =item No package name allowed for subroutine %s in "our" =item No package name allowed for variable %s in "our" (F) Fully qualified subroutine and variable names are not allowed in "our" declarations, because that doesn't make much sense under existing rules. Such syntax is reserved for future extensions. =item No Perl script found in input (F) You called C<perl -x>, but no line was found in the file beginning with #! and containing the word "perl". 
=item No setregid available (F) Configure didn't find anything resembling the setregid() call for your system. =item No setreuid available (F) Configure didn't find anything resembling the setreuid() call for your system. =item No such class %s (F) You provided a class qualifier in a "my", "our" or "state" declaration, but this class doesn't exist at this point in your program. =item No such class field "%s" in variable %s of type %s (F) You tried to access a key from a hash through the indicated typed variable but that key is not allowed by the package of the same type. The indicated package has restricted the set of allowed keys using the L<fields> pragma. =item No such hook: %s (F) You specified a signal hook that was not recognized by Perl. Currently, Perl accepts C<__DIE__> and C<__WARN__> as valid signal hooks. =item No such pipe open (P) An error peculiar to VMS. The internal routine my_pclose() tried to close a pipe which hadn't been opened. This should have been caught earlier as an attempt to close an unopened filehandle. =item No such signal: SIG%s (W signal) You specified a signal name as a subscript to %SIG that was not recognized. Say C<kill -l> in your shell to see the valid signal names on your system. =item No Unicode property value wildcard matches: (W regexp) You specified a wildcard for a Unicode property value, but there is no property value in the current Unicode release that matches it. Check your spelling. =item Not a CODE reference (F) Perl was trying to evaluate a reference to a code value (that is, a subroutine), but found a reference to something else instead. You can use the ref() function to find out what kind of ref it really was. See also L<perlref>. =item Not a GLOB reference (F) Perl was trying to evaluate a reference to a "typeglob" (that is, a symbol table entry that looks like C<*foo>), but found a reference to something else instead. You can use the ref() function to find out what kind of ref it really was. See L<perlref>. 
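The "Not a CODE reference" and "Not a GLOB reference" entries both suggest calling C<ref()> to find out what a value really is. A minimal sketch of checking before dereferencing follows; C<describe> is a hypothetical helper. Keep in mind that for a blessed reference C<ref()> returns the class name rather than the underlying type.

```perl
use strict;
use warnings;

# Hypothetical helper: report what kind of thing was passed, and
# only call it when it really is a code reference.
sub describe {
    my ($thing) = @_;
    my $kind = ref $thing;      # '' for non-references
    return 'not a reference'       if $kind eq '';
    return 'code -> ' . $thing->() if $kind eq 'CODE';
    return 'glob'                  if $kind eq 'GLOB';
    return "other ($kind)";
}

print describe(sub { 'called' }), "\n";
print describe(\*STDOUT),         "\n";
print describe([1, 2, 3]),        "\n";
print describe('plain string'),   "\n";
```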
=item Not a HASH reference (F) Perl was trying to evaluate a reference to a hash value, but found a reference to something else instead. You can use the ref() function to find out what kind of ref it really was. See L<perlref>. =item '#' not allowed immediately following a sigil in a subroutine signature (F) In a subroutine signature definition, a comment following a sigil (C<$>, C<@> or C<%>), needs to be separated by whitespace or a comma etc., in particular to avoid confusion with the C<$#> variable. For example: # bad sub f ($# ignore first arg , $b) {} # good sub f ($, # ignore first arg $b) {} =item Not an ARRAY reference (F) Perl was trying to evaluate a reference to an array value, but found a reference to something else instead. You can use the ref() function to find out what kind of ref it really was. See L<perlref>. =item Not a SCALAR reference (F) Perl was trying to evaluate a reference to a scalar value, but found a reference to something else instead. You can use the ref() function to find out what kind of ref it really was. See L<perlref>. =item Not a subroutine reference (F) Perl was trying to evaluate a reference to a code value (that is, a subroutine), but found a reference to something else instead. You can use the ref() function to find out what kind of ref it really was. See also L<perlref>. =item Not a subroutine reference in overload table (F) An attempt was made to specify an entry in an overloading table that doesn't somehow point to a valid subroutine. See L<overload>. =item Not enough arguments for %s (F) The function requires more arguments than you specified. =item Not enough format arguments (W syntax) A format specified more picture fields than the next line supplied. See L<perlform>. =item %s: not found (A) You've accidentally run your script through the Bourne shell instead of Perl. Check the #! line, or manually feed your script into Perl yourself. 
=item no UTC offset information; assuming local time is UTC (S) A warning peculiar to VMS. Perl was unable to find the local timezone offset, so it's assuming that local system time is equivalent to UTC. If it's not, define the logical name F<SYS$TIMEZONE_DIFFERENTIAL> to translate to the number of seconds which need to be added to UTC to get local time. =item NULL OP IN RUN (S debugging) Some internal routine called run() with a null opcode pointer. =item Null picture in formline (F) The first argument to formline must be a valid format picture specification. It was found to be empty, which probably means you supplied it an uninitialized value. See L<perlform>. =item Null realloc (P) An attempt was made to realloc NULL. =item NULL regexp argument (P) The internal pattern matching routines blew it big time. =item NULL regexp parameter (P) The internal pattern matching routines are out of their gourd. =item Number too long (F) Perl limits the representation of decimal numbers in programs to about 250 characters. You've exceeded that length. Future versions of Perl are likely to eliminate this arbitrary limitation. In the meantime, try using scientific notation (e.g. "1e6" instead of "1_000_000"). =item Number with no digits (F) Perl was looking for a number but found nothing that looked like a number. This happens, for example, with C<\o{}>, which has no number between the braces. =item Numeric format result too large (F) The length of the result of a numeric format supplied to sprintf() or printf() would have been too large for the underlying C function to report. This limit is typically 2GB. =item Numeric variables with more than one digit may not start with '0' (F) The only numeric variable which is allowed to start with a 0 is C<$0>, and you mentioned a variable that starts with 0 that has more than one digit. You probably want to remove the leading 0, or, if the intent was to express a variable name in octal, you should convert it to decimal.
=item Octal number > 037777777777 non-portable (W portable) The octal number you specified is larger than 2**32-1 (4294967295) and therefore non-portable between systems. See L<perlport> for more on portability concerns. =item Odd name/value argument for subroutine '%s' (F) A subroutine using a slurpy hash parameter in its signature received an odd number of arguments to populate the hash. It requires the arguments to be paired, with the same number of keys as values. The caller of the subroutine is presumably at fault. The message attempts to include the name of the called subroutine. If the subroutine has been aliased, the subroutine's original name will be shown, regardless of what name the caller used. =item Odd number of arguments for overload::constant (W overload) The call to overload::constant contained an odd number of arguments. The arguments should come in pairs. =item Odd number of elements in anonymous hash (W misc) You specified an odd number of elements to initialize a hash, which is odd, because hashes come in key/value pairs. =item Odd number of elements in hash assignment (W misc) You specified an odd number of elements to initialize a hash, which is odd, because hashes come in key/value pairs. =item Offset outside string (F)(W layer) You tried to do a read/write/send/recv/seek operation with an offset pointing outside the buffer. This is difficult to imagine. The sole exceptions to this are that zero padding will take place when going past the end of the string when either C<sysread()>ing a file, or when seeking past the end of a scalar opened for I/O (in anticipation of future reads and to imitate the behavior with real files). =item Old package separator used in string (W syntax) You used the old package separator, "'", in a variable named inside a double-quoted string; e.g., C<"In $name's house">. This is equivalent to C<"In $name::s house">. If you meant the former, put a backslash before the apostrophe (C<"In $name\'s house">). 
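Besides the backslash, wrapping the variable name in braces also ends the name before the apostrophe. A minimal sketch follows; the variable names are illustrative only.

```perl
use strict;
use warnings;

my $name = "Alice";

# Braces delimit the variable name, so the apostrophe is literal text.
my $with_braces = "In ${name}'s house";

# A backslash before the apostrophe has the same effect.
my $with_backslash = "In $name\'s house";

print "$with_braces\n";
print "$with_backslash\n";
```

Both strings interpolate C<$name> and keep the apostrophe as ordinary text, so neither form depends on how the old package separator is treated.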
=item %s() on unopened %s (W unopened) An I/O operation was attempted on a filehandle that was never initialized. You need to do an open(), a sysopen(), or a socket() call, or call a constructor from the FileHandle package. =item -%s on unopened filehandle %s (W unopened) You tried to invoke a file test operator on a filehandle that isn't open. Check your control flow. See also L<perlfunc/-X>. =item oops: oopsAV (S internal) An internal warning that the grammar is screwed up. =item oops: oopsHV (S internal) An internal warning that the grammar is screwed up. =item Operand with no preceding operator in regex; marked by S<<-- HERE> in m/%s/ (F) You wrote something like (?[ \p{Digit} \p{Thai} ]) There are two operands, but no operator giving how you want to combine them. =item Operation "%s": no method found, %s (F) An attempt was made to perform an overloaded operation for which no handler was defined. While some handlers can be autogenerated in terms of other handlers, there is no default handler for any operation, unless the C<fallback> overloading key is specified to be true. See L<overload>. =item Operation "%s" returns its argument for non-Unicode code point 0x%X (S non_unicode) You performed an operation requiring Unicode rules on a code point that is not in Unicode, so what it should do is not defined. Perl has chosen to have it do nothing, and warn you. If the operation shown is "ToFold", it means that case-insensitive matching in a regular expression was done on the code point. If you know what you are doing you can turn off this warning by C<no warnings 'non_unicode';>. =item Operation "%s" returns its argument for UTF-16 surrogate U+%X (S surrogate) You performed an operation requiring Unicode rules on a Unicode surrogate. Unicode frowns upon the use of surrogates for anything but storing strings in UTF-16, but rules are (reluctantly) defined for the surrogates, and they are to do nothing for this operation. 
Because the use of surrogates can be dangerous, Perl warns. If the operation shown is "ToFold", it means that case-insensitive matching in a regular expression was done on the code point. If you know what you are doing you can turn off this warning by C<no warnings 'surrogate';>. =item Operator or semicolon missing before %s (S ambiguous) You used a variable or subroutine call where the parser was expecting an operator. The parser has assumed you really meant to use an operator, but this is highly likely to be incorrect. For example, if you say "*foo *foo" it will be interpreted as if you said "*foo * 'foo'". =item Optional parameter lacks default expression (F) In a subroutine signature, you wrote something like "$a =", making a named optional parameter without a default value. A nameless optional parameter is permitted to have no default value, but a named one must have a specific default. You probably want "$a = undef". =item "our" variable %s redeclared (W shadow) You seem to have already declared the same global once before in the current lexical scope. =item Out of memory! (X) The malloc() function returned 0, indicating there was insufficient remaining memory (or virtual memory) to satisfy the request. Perl has no option but to exit immediately. At least in Unix you may be able to get past this by increasing your process datasize limits: in csh/tcsh use C<limit> and C<limit datasize n> (where C<n> is the number of kilobytes) to check the current limits and change them, and in ksh/bash/zsh use C<ulimit -a> and C<ulimit -d n>, respectively. =item Out of memory during %s extend (X) An attempt was made to extend an array, a list, or a string beyond the largest possible memory allocation. =item Out of memory during "large" request for %s (F) The malloc() function returned 0, indicating there was insufficient remaining memory (or virtual memory) to satisfy the request. 
However, the request was judged large enough (compile-time default is 64K), so you are given a chance to shut down cleanly by trapping this error. =item Out of memory during request for %s (X)(F) The malloc() function returned 0, indicating there was insufficient remaining memory (or virtual memory) to satisfy the request. The request was judged to be small, so the possibility to trap it depends on the way perl was compiled. By default it is not trappable. However, if compiled for this, Perl may use the contents of C<$^M> as an emergency pool after die()ing with this message. In this case the error is trappable I<once>, and the error message will include the line and file where the failed request happened. =item Out of memory during ridiculously large request (F) You can't allocate more than 2^31+"small amount" bytes. This error is most likely to be caused by a typo in the Perl program, e.g., C<$arr[time]> instead of C<$arr[$time]>. =item Out of memory for yacc stack (F) The yacc parser wanted to grow its stack so it could continue parsing, but realloc() wouldn't give it more memory, virtual or otherwise. =item '.' outside of string in pack (F) The argument to a '.' in your template tried to move the working position to before the start of the packed string being built. =item '@' outside of string in unpack (F) You had a template that specified an absolute position outside the string being unpacked. See L<perlfunc/pack>. =item '@' outside of string with malformed UTF-8 in unpack (F) You had a template that specified an absolute position outside the string being unpacked. The string being unpacked was also invalid UTF-8. See L<perlfunc/pack>. =item overload arg '%s' is invalid (W overload) The L<overload> pragma was passed an argument it did not recognize. Did you mistype an operator? =item Overloaded dereference did not return a reference (F) An object with an overloaded dereference operator was dereferenced, but the overloaded operation did not return a reference.
See L<overload>. =item Overloaded qr did not return a REGEXP (F) An object with a C<qr> overload was used as part of a match, but the overloaded operation didn't return a compiled regexp. See L<overload>. =item %s package attribute may clash with future reserved word: %s (W reserved) A lowercase attribute name was used that had a package-specific handler. That name might have a meaning to Perl itself some day, even though it doesn't yet. Perhaps you should use a mixed-case attribute name, instead. See L<attributes>. =item pack/unpack repeat count overflow (F) You can't specify a repeat count so large that it overflows your signed integers. See L<perlfunc/pack>. =item page overflow (W io) A single call to write() produced more lines than can fit on a page. See L<perlform>. =item panic: %s (P) An internal error. =item panic: attempt to call %s in %s (P) One of the file test operators entered a code branch that calls an ACL related-function, but that function is not available on this platform. Earlier checks mean that it should not be possible to enter this branch on this platform. =item panic: child pseudo-process was never scheduled (P) A child pseudo-process in the ithreads implementation on Windows was not scheduled within the time period allowed and therefore was not able to initialize properly. =item panic: ck_grep, type=%u (P) Failed an internal consistency check trying to compile a grep. =item panic: corrupt saved stack index %ld (P) The savestack was requested to restore more localized values than there are in the savestack. =item panic: del_backref (P) Failed an internal consistency check while trying to reset a weak reference. =item panic: do_subst (P) The internal pp_subst() routine was called with invalid operational data. =item panic: do_trans_%s (P) The internal do_trans routines were called with invalid operational data. 
=item panic: fold_constants JMPENV_PUSH returned %d (P) While attempting to fold constants, an exception other than an C<eval> failure was caught. =item panic: frexp: %f (P) The library function frexp() failed, making printf("%f") impossible. =item panic: goto, type=%u, ix=%ld (P) We popped the context stack to a context with the specified label, and then discovered it wasn't a context we know how to do a goto in. =item panic: gp_free failed to free glob pointer (P) The internal routine used to clear a typeglob's entries tried repeatedly, but each time something re-created entries in the glob. Most likely the glob contains an object with a reference back to the glob and a destructor that adds a new object to the glob. =item panic: INTERPCASEMOD, %s (P) The lexer got into a bad state at a case modifier. =item panic: INTERPCONCAT, %s (P) The lexer got into a bad state parsing a string with brackets. =item panic: kid popen errno read (F) A forked child returned an incomprehensible message about its errno. =item panic: last, type=%u (P) We popped the context stack to a block context, and then discovered it wasn't a block context. =item panic: leave_scope clearsv (P) A writable lexical variable became read-only somehow within the scope. =item panic: leave_scope inconsistency %u (P) The savestack probably got out of sync. At least, there was an invalid enum on the top of it. =item panic: magic_killbackrefs (P) Failed an internal consistency check while trying to reset all weak references to an object. =item panic: malloc, %s (P) Something requested a negative number of bytes of malloc. =item panic: memory wrap (P) Something tried to allocate either more memory than possible or a negative amount. =item panic: pad_alloc, %p!=%p (P) The compiler got confused about which scratch pad it was allocating and freeing temporaries and lexicals from.
=item panic: pad_free curpad, %p!=%p (P) The compiler got confused about which scratch pad it was allocating and freeing temporaries and lexicals from. =item panic: pad_free po (P) A zero scratch pad offset was detected internally. An attempt was made to free a target that had not been allocated to begin with. =item panic: pad_reset curpad, %p!=%p (P) The compiler got confused about which scratch pad it was allocating and freeing temporaries and lexicals from. =item panic: pad_sv po (P) A zero scratch pad offset was detected internally. Most likely an operator needed a target but that target had not been allocated for whatever reason. =item panic: pad_swipe curpad, %p!=%p (P) The compiler got confused about which scratch pad it was allocating and freeing temporaries and lexicals from. =item panic: pad_swipe po (P) An invalid scratch pad offset was detected internally. =item panic: pp_iter, type=%u (P) The foreach iterator got called in a non-loop context frame. =item panic: pp_match%s (P) The internal pp_match() routine was called with invalid operational data. =item panic: realloc, %s (P) Something requested a negative number of bytes of realloc. =item panic: reference miscount on nsv in sv_replace() (%d != 1) (P) The internal sv_replace() function was handed a new SV with a reference count other than 1. =item panic: restartop in %s (P) Some internal routine requested a goto (or something like it), and didn't supply the destination. =item panic: return, type=%u (P) We popped the context stack to a subroutine or eval context, and then discovered it wasn't a subroutine or eval context. =item panic: scan_num, %s (P) scan_num() got called on something that wasn't a number. =item panic: Sequence (?{...}): no code block found in regex m/%s/ (P) While compiling a pattern that has embedded (?{}) or (??{}) code blocks, perl couldn't locate the code block that should have already been seen and compiled by perl before control passed to the regex compiler. 
=item panic: strxfrm() gets absurd - a => %u, ab => %u (P) The interpreter's sanity check of the C function strxfrm() failed. In your current locale the returned transformation of the string "ab" is shorter than that of the string "a", which makes no sense. =item panic: sv_chop %s (P) The sv_chop() routine was passed a position that is not within the scalar's string buffer. =item panic: sv_insert, midend=%p, bigend=%p (P) The sv_insert() routine was told to remove more string than there was string. =item panic: top_env (P) The compiler attempted to do a goto, or something weird like that. =item panic: unimplemented op %s (#%d) called (P) The compiler is screwed up and attempted to use an op that isn't permitted at run time. =item panic: unknown OA_*: %x (P) The internal routine that handles arguments to C<&CORE::foo()> subroutine calls was unable to determine what type of arguments were expected. =item panic: utf16_to_utf8: odd bytelen (P) Something tried to call utf16_to_utf8 with an odd (as opposed to even) byte length. =item panic: utf16_to_utf8_reversed: odd bytelen (P) Something tried to call utf16_to_utf8_reversed with an odd (as opposed to even) byte length. =item panic: yylex, %s (P) The lexer got into a bad state while processing a case modifier. =item Parentheses missing around "%s" list (W parenthesis) You said something like my $foo, $bar = @_; when you meant my ($foo, $bar) = @_; Remember that "my", "our", "local" and "state" bind tighter than comma. =item Parsing code internal error (%s) (F) Parsing code supplied by an extension violated the parser's API in a detectable way. =item Pattern subroutine nesting without pos change exceeded limit in regex (F) You used a pattern that uses too many nested subpattern calls without consuming any text. Restructure the pattern so text is consumed before the nesting limit is exceeded. =item C<-p> destination: %s (F) An error occurred during the implicit output invoked by the C<-p> command-line switch. 
(This output goes to STDOUT unless you've redirected it with select().) =item Perl API version %s of %s does not match %s (F) The XS module in question was compiled against a different incompatible version of Perl than the one that has loaded the XS module. =item Perl folding rules are not up-to-date for 0x%X; please use the perlbug utility to report; in regex; marked by S<<-- HERE> in m/%s/ (S regexp) You used a regular expression with case-insensitive matching, and there is a bug in Perl in which the built-in regular expression folding rules are not accurate. This may lead to incorrect results. Please report this as a bug to L<https://github.com/Perl/perl5/issues>. =item PerlIO layer ':win32' is experimental (S experimental::win32_perlio) The C<:win32> PerlIO layer is experimental. If you want to take the risk of using this layer, simply disable this warning: no warnings "experimental::win32_perlio"; =item Perl_my_%s() not available (F) Your platform has very uncommon byte-order and integer size, so it was not possible to set up some or all fixed-width byte-order conversion functions. This is only a problem when you're using the '<' or '>' modifiers in (un)pack templates. See L<perlfunc/pack>. =item Perl %s required (did you mean %s?)--this is only %s, stopped (F) The code you are trying to run has asked for a newer version of Perl than you are running. Perhaps C<use 5.10> was written instead of C<use 5.010> or C<use v5.10>. Without the leading C<v>, the number is interpreted as a decimal, with every three digits after the decimal point representing a part of the version number. So 5.10 is equivalent to v5.100. =item Perl %s required--this is only %s, stopped (F) The module in question uses features of a version of Perl more recent than the currently running version. How long has it been since you upgraded, anyway? See L<perlfunc/require>. =item PERL_SH_DIR too long (F) An error peculiar to OS/2. PERL_SH_DIR is the directory to find the C<sh>-shell in. 
See "PERL_SH_DIR" in L<perlos2>. =item PERL_SIGNALS illegal: "%s" (X) See L<perlrun/PERL_SIGNALS> for legal values. =item Perls since %s too modern--this is %s, stopped (F) The code you are trying to run claims it will not run on the version of Perl you are using because it is too new. Maybe the code needs to be updated, or maybe it is simply wrong and the version check should just be removed. =item perl: warning: Non hex character in '$ENV{PERL_HASH_SEED}', seed only partially set (S) PERL_HASH_SEED should match /^\s*(?:0x)?[0-9a-fA-F]+\s*\z/ but it contained a non hex character. This could mean you are not using the hash seed you think you are. =item perl: warning: Setting locale failed. (S) The whole warning message will look something like: perl: warning: Setting locale failed. perl: warning: Please check that your locale settings: LC_ALL = "En_US", LANG = (unset) are supported and installed on your system. perl: warning: Falling back to the standard locale ("C"). Exactly what were the failed locale settings varies. In the above the settings were that the LC_ALL was "En_US" and the LANG had no value. This error means that Perl detected that you and/or your operating system supplier and/or system administrator have set up the so-called locale system but Perl could not use those settings. This was not dead serious, fortunately: there is a "default locale" called "C" that Perl can and will use, and the script will be run. Before you really fix the problem, however, you will get the same error message each time you run Perl. How to really fix the problem can be found in L<perllocale> section B<LOCALE PROBLEMS>. =item perl: warning: strange setting in '$ENV{PERL_PERTURB_KEYS}': '%s' (S) Perl was run with the environment variable PERL_PERTURB_KEYS defined but containing an unexpected value. The legal values of this setting are as follows. 

  Numeric | String        | Result
  --------+---------------+-----------------------------------------
  0       | NO            | Disables key traversal randomization
  1       | RANDOM        | Enables full key traversal randomization
  2       | DETERMINISTIC | Enables repeatable key traversal
          |               | randomization

Both numeric and string values are accepted, but note that string values are case sensitive. The default for this setting is "RANDOM" or 1. =item pid %x not a child (W exec) A warning peculiar to VMS. Waitpid() was asked to wait for a process which isn't a subprocess of the current process. While this is fine from VMS' perspective, it's probably not what you intended. =item 'P' must have an explicit size in unpack (F) The unpack format P must have an explicit size, not "*". =item POSIX class [:%s:] unknown in regex; marked by S<<-- HERE> in m/%s/ (F) The class in the character class [: :] syntax is unknown. The S<<-- HERE> shows whereabouts in the regular expression the problem was discovered. Note that the POSIX character classes do B<not> have the C<is> prefix the corresponding C interfaces have: in other words, it's C<[[:print:]]>, not C<isprint>. See L<perlre>. =item POSIX getpgrp can't take an argument (F) Your system has POSIX getpgrp(), which takes no argument, unlike the BSD version, which takes a pid. =item POSIX syntax [%c %c] belongs inside character classes%s in regex; marked by S<<-- HERE> in m/%s/ (W regexp) Perl thinks that you intended to write a POSIX character class, but didn't use enough brackets. These POSIX class constructs [: :], [= =], and [. .] go I<inside> character classes; the [] are part of the construct. For example: C<qr/[012[:alpha:]345]/>. What the regular expression pattern compiled to is probably not what you were intending. For example, C<qr/[:alpha:]/> compiles to a regular bracketed character class consisting of the five characters C<":">, C<"a">, C<"l">, C<"h">, and C<"p">. To specify the POSIX class, it should have been written C<qr/[[:alpha:]]/>.
Note that [= =] and [. .] are not currently implemented; they are simply placeholders for future extensions and will cause fatal errors. The S<<-- HERE> shows whereabouts in the regular expression the problem was discovered. See L<perlre>. If the specification of the class was not completely valid, the message indicates that. =item POSIX syntax [. .] is reserved for future extensions in regex; marked by S<<-- HERE> in m/%s/ (F) Within regular expression character classes ([]) the syntax beginning with "[." and ending with ".]" is reserved for future extensions. If you need to represent those character sequences inside a regular expression character class, just quote the square brackets with the backslash: "\[." and ".\]". The S<<-- HERE> shows whereabouts in the regular expression the problem was discovered. See L<perlre>. =item POSIX syntax [= =] is reserved for future extensions in regex; marked by S<<-- HERE> in m/%s/ (F) Within regular expression character classes ([]) the syntax beginning with "[=" and ending with "=]" is reserved for future extensions. If you need to represent those character sequences inside a regular expression character class, just quote the square brackets with the backslash: "\[=" and "=\]". The S<<-- HERE> shows whereabouts in the regular expression the problem was discovered. See L<perlre>. =item Possible attempt to put comments in qw() list (W qw) qw() lists contain items separated by whitespace; as with literal strings, comment characters are not ignored, but are instead treated as literal data. (You may have used different delimiters than the parentheses shown here; braces are also frequently used.) 
You probably wrote something like this: @list = qw( a # a comment b # another comment ); when you should have written this: @list = qw( a b ); If you really want comments, build your list the old-fashioned way, with quotes and commas: @list = ( 'a', # a comment 'b', # another comment ); =item Possible attempt to separate words with commas (W qw) qw() lists contain items separated by whitespace; therefore commas aren't needed to separate the items. (You may have used different delimiters than the parentheses shown here; braces are also frequently used.) You probably wrote something like this: qw! a, b, c !; which puts literal commas into some of the list items. Write it without commas if you don't want them to appear in your data: qw! a b c !; =item Possible memory corruption: %s overflowed 3rd argument (F) An ioctl() or fcntl() returned more than Perl was bargaining for. Perl guesses a reasonable buffer size, but puts a sentinel byte at the end of the buffer just in case. This sentinel byte got clobbered, and Perl assumes that memory is now corrupted. See L<perlfunc/ioctl>. =item Possible precedence issue with control flow operator (W syntax) There is a possible problem with the mixing of a control flow operator (e.g. C<return>) and a low-precedence operator like C<or>. Consider: sub { return $a or $b; } This is parsed as: sub { (return $a) or $b; } Which is effectively just: sub { return $a; } Either use parentheses or the high-precedence variant of the operator. Note this may be also triggered for constructs like: sub { 1 if die; } =item Possible precedence problem on bitwise %s operator (W precedence) Your program uses a bitwise logical operator in conjunction with a numeric comparison operator, like this : if ($x & $y == 0) { ... } This expression is actually equivalent to C<$x & ($y == 0)>, due to the higher precedence of C<==>. This is probably not what you want. 
(If you really meant to write this, disable the warning, or, better, put the parentheses explicitly and write C<$x & ($y == 0)>). =item Possible unintended interpolation of $\ in regex (W ambiguous) You said something like C<m/$\/> in a regex. The regex C<m/foo$\s+bar/m> translates to: match the word 'foo', the output record separator (see L<perlvar/$\>) and the letter 's' (one time or more) followed by the word 'bar'. If this is what you intended then you can silence the warning by using C<m/${\}/> (for example: C<m/foo${\}s+bar/>). If instead you intended to match the word 'foo' at the end of the line followed by whitespace and the word 'bar' on the next line then you can use C<m/$(?)\/> (for example: C<m/foo$(?)\s+bar/>). =item Possible unintended interpolation of %s in string (W ambiguous) You said something like '@foo' in a double-quoted string but there was no array C<@foo> in scope at the time. If you wanted a literal @foo, then write it as \@foo; otherwise find out what happened to the array you apparently lost track of. =item Precedence problem: open %s should be open(%s) (S precedence) The old irregular construct open FOO || die; is now misinterpreted as open(FOO || die); because of the strict regularization of Perl 5's grammar into unary and list operators. (The old open was a little of both.) You must put parentheses around the filehandle, or use the new "or" operator instead of "||". =item Premature end of script headers See L</500 Server error>. =item printf() on closed filehandle %s (W closed) The filehandle you're writing to got itself closed sometime before now. Check your control flow. =item print() on closed filehandle %s (W closed) The filehandle you're printing on got itself closed sometime before now. Check your control flow. =item Process terminated by SIG%s (W) This is a standard message issued by OS/2 applications, while *nix applications die in silence. It is considered a feature of the OS/2 port. 
One can easily disable this by appropriate sighandlers, see L<perlipc/"Signals">. See also "Process terminated by SIGTERM/SIGINT" in L<perlos2>. =item Prototype after '%c' for %s : %s (W illegalproto) A character follows % or @ in a prototype. This is useless, since % and @ gobble the rest of the subroutine arguments. =item Prototype mismatch: %s vs %s (S prototype) The subroutine being declared or defined had previously been declared or defined with a different function prototype. =item Prototype not terminated (F) You've omitted the closing parenthesis in a function prototype definition. =item Prototype '%s' overridden by attribute 'prototype(%s)' in %s (W prototype) A prototype was declared in both the parentheses after the sub name and via the prototype attribute. The prototype in parentheses is useless, since it will be replaced by the prototype from the attribute before it's ever used. =item Quantifier follows nothing in regex; marked by S<<-- HERE> in m/%s/ (F) You started a regular expression with a quantifier. Backslash it if you meant it literally. The S<<-- HERE> shows whereabouts in the regular expression the problem was discovered. See L<perlre>. =item Quantifier in {,} bigger than %d in regex; marked by S<<-- HERE> in m/%s/ (F) There is currently a limit to the size of the min and max values of the {min,max} construct. The S<<-- HERE> shows whereabouts in the regular expression the problem was discovered. See L<perlre>. =item Quantifier {n,m} with n > m can't match in regex =item Quantifier {n,m} with n > m can't match in regex; marked by S<<-- HERE> in m/%s/ (W regexp) Minima should be less than or equal to maxima. If you really want your regexp to match something 0 times, just put {0}. =item Quantifier unexpected on zero-length expression in regex m/%s/ (W regexp) You applied a regular expression quantifier in a place where it makes no sense, such as on a zero-width assertion. Try putting the quantifier inside the assertion instead. 
For example, the way to match "abc" provided that it is followed by
three repetitions of "xyz" is C</abc(?=(?:xyz){3})/>, not
C</abc(?=xyz){3}/>.

=item Range iterator outside integer range

(F) One (or both) of the numeric arguments to the range operator ".."
are outside the range which can be represented by integers internally.
One possible workaround is to force Perl to use magical string increment
by prepending "0" to your numbers.

=item Ranges of ASCII printables should be some subset of "0-9",
"A-Z", or "a-z" in regex; marked by S<<-- HERE> in m/%s/

(W regexp) (only under C<S<use re 'strict'>> or within C<(?[...])>)

Stricter rules help to find typos and other errors. Perhaps you didn't
even intend a range here, if the C<"-"> was meant to be some other
character, or should have been escaped (like C<"\-">). If you did
intend a range, the one that was used is not portable between ASCII and
EBCDIC platforms, and doesn't have an obvious meaning to a casual
reader.

    [3-7]    # OK; Obvious and portable
    [d-g]    # OK; Obvious and portable
    [A-Y]    # OK; Obvious and portable
    [A-z]    # WRONG; Not portable; not clear what is meant
    [a-Z]    # WRONG; Not portable; not clear what is meant
    [%-.]    # WRONG; Not portable; not clear what is meant
    [\x41-Z] # WRONG; Not portable; not obvious to non-geek

(You can force portability by specifying a Unicode range, which means
that the endpoints are specified by
L<C<\N{...}>|perlrecharclass/Character Ranges>, but the meaning may
still not be obvious.)
The stricter rules require that ranges that start or stop with an ASCII
character that is not a control have all their endpoints be the literal
character, and not some escape sequence (like C<"\x41">), and the ranges
must be all digits, or all uppercase letters, or all lowercase letters.

=item Ranges of digits should be from the same group in regex; marked by
S<<-- HERE> in m/%s/

(W regexp) (only under C<S<use re 'strict'>> or within C<(?[...])>)

Stricter rules help to find typos and other errors.
You included a range, and at least one of the end points is a decimal digit. Under the stricter rules, when this happens, both end points should be digits in the same group of 10 consecutive digits. =item readdir() attempted on invalid dirhandle %s (W io) The dirhandle you're reading from is either closed or not really a dirhandle. Check your control flow. =item readline() on closed filehandle %s (W closed) The filehandle you're reading from got itself closed sometime before now. Check your control flow. =item read() on closed filehandle %s (W closed) You tried to read from a closed filehandle. =item read() on unopened filehandle %s (W unopened) You tried to read from a filehandle that was never opened. =item Reallocation too large: %x (F) You can't allocate more than 64K on an MS-DOS machine. =item realloc() of freed memory ignored (S malloc) An internal routine called realloc() on something that had already been freed. =item Recompile perl with B<-D>DEBUGGING to use B<-D> switch (S debugging) You can't use the B<-D> option unless the code to produce the desired output is compiled into Perl, which entails some overhead, which is why it's currently left out of your copy. =item Recursive call to Perl_load_module in PerlIO_find_layer (P) It is currently not permitted to load modules when creating a filehandle inside an %INC hook. This can happen with C<open my $fh, '<', \$scalar>, which implicitly loads PerlIO::scalar. Try loading PerlIO::scalar explicitly first. =item Recursive inheritance detected in package '%s' (F) While calculating the method resolution order (MRO) of a package, Perl believes it found an infinite loop in the C<@ISA> hierarchy. This is a crude check that bails out after 100 levels of C<@ISA> depth. =item Redundant argument in %s (W redundant) You called a function with more arguments than other arguments you supplied indicated would be needed. 
Currently only emitted when a printf-type format required fewer
arguments than were supplied, but might be used in the future for
e.g. L<perlfunc/pack>.

=item refcnt_dec: fd %d%s

=item refcnt: fd %d%s

=item refcnt_inc: fd %d%s

(P) Perl's I/O implementation failed an internal consistency check. If
you see this message, something is very wrong.

=item Reference found where even-sized list expected

(W misc) You gave a single reference where Perl was expecting a list
with an even number of elements (for assignment to a hash). This
usually means that you used the anon hash constructor when you meant
to use parens. In any case, a hash requires key/value B<pairs>.

    %hash = { one => 1, two => 2, };    # WRONG
    %hash = [ qw/ an anon array / ];    # WRONG
    %hash = ( one => 1, two => 2, );    # right
    %hash = qw( one 1 two 2 );          # also fine

=item Reference is already weak

(W misc) You have attempted to weaken a reference that is already weak.
Doing so has no effect.

=item Reference is not weak

(W misc) You have attempted to unweaken a reference that is not weak.
Doing so has no effect.

=item Reference to invalid group 0 in regex; marked by S<<-- HERE> in m/%s/

(F) You used C<\g0> or similar in a regular expression. You may refer
to capturing parentheses only with strictly positive integers
(normal backreferences) or with strictly negative integers (relative
backreferences). Using 0 does not make sense.

=item Reference to nonexistent group in regex; marked by S<<-- HERE> in
m/%s/

(F) You used something like C<\7> in your regular expression, but there
are not at least seven sets of capturing parentheses in the expression.
If you wanted to have the character with ordinal 7 inserted into the
regular expression, prepend zeroes to make it three digits long: C<\007>

The S<<-- HERE> shows whereabouts in the regular expression the problem
was discovered.
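
As a minimal sketch of the distinction just described (the variable
names are illustrative only), the following builds the pattern from a
variable so that it is compiled at run time, where the fatal error can
be trapped with C<eval>, and then uses the three-digit octal form to
match the character with ordinal 7:

```perl
use strict;
use warnings;

# Interpolating a variable defers regex compilation to run time,
# so the fatal error can be trapped by the eval block.
my $pattern  = '(a)\7';    # one capture group, but refers to group 7
my $compiled = eval { qr/$pattern/ };
my $error    = $@;
print "compile failed: $error" unless defined $compiled;

# Three octal digits name the character with ordinal 7 (BEL) instead.
my $matches_bel = ( chr(7) =~ /\007/ ) ? 1 : 0;
print "\\007 matched BEL: $matches_bel\n";
```

A pattern written literally in the source, such as C<qr/(a)\7/>, is
compiled when the program itself is compiled, so the error would be
reported before any C<eval> block could run.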
=item Reference to nonexistent named group in regex; marked by S<<-- HERE> in m/%s/ (F) You used something like C<\k'NAME'> or C<< \k<NAME> >> in your regular expression, but there is no corresponding named capturing parentheses such as C<(?'NAME'...)> or C<< (?<NAME>...) >>. Check if the name has been spelled correctly both in the backreference and the declaration. The S<<-- HERE> shows whereabouts in the regular expression the problem was discovered. =item Reference to nonexistent or unclosed group in regex; marked by S<<-- HERE> in m/%s/ (F) You used something like C<\g{-7}> in your regular expression, but there are not at least seven sets of closed capturing parentheses in the expression before where the C<\g{-7}> was located. The S<<-- HERE> shows whereabouts in the regular expression the problem was discovered. =item regexp memory corruption (P) The regular expression engine got confused by what the regular expression compiler gave it. =item Regexp modifier "/%c" may appear a maximum of twice =item Regexp modifier "%c" may appear a maximum of twice in regex; marked by S<<-- HERE> in m/%s/ (F) The regular expression pattern had too many occurrences of the specified modifier. Remove the extraneous ones. =item Regexp modifier "%c" may not appear after the "-" in regex; marked by <-- HERE in m/%s/ (F) Turning off the given modifier has the side effect of turning on another one. Perl currently doesn't allow this. Reword the regular expression to use the modifier you want to turn on (and place it before the minus), instead of the one you want to turn off. =item Regexp modifier "/%c" may not appear twice =item Regexp modifier "%c" may not appear twice in regex; marked by <-- HERE in m/%s/ (F) The regular expression pattern had too many occurrences of the specified modifier. Remove the extraneous ones. 
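
A small sketch of how this error can be provoked and trapped, assuming
a perl that supports the C<u> character-set modifier (5.14 or later).
The names C<$pattern> and C<$fixed> are illustrative only:

```perl
use strict;
use warnings;

# The "u" character-set modifier is given twice in the same group.
my $pattern  = '(?uu)abc';
my $compiled = eval { qr/$pattern/ };
my $error    = $@;
print "compile failed: $error" unless defined $compiled;

# Naming the modifier once is accepted.
my $fixed      = qr/(?u)abc/;
my $fixed_hits = ( 'abc' =~ $fixed ) ? 1 : 0;
print "fixed pattern matches: $fixed_hits\n";
```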
=item Regexp modifiers "/%c" and "/%c" are mutually exclusive =item Regexp modifiers "%c" and "%c" are mutually exclusive in regex; marked by S<<-- HERE> in m/%s/ (F) The regular expression pattern had more than one of these mutually exclusive modifiers. Retain only the modifier that is supposed to be there. =item Regexp out of space in regex m/%s/ (P) A "can't happen" error, because safemalloc() should have caught it earlier. =item Repeated format line will never terminate (~~ and @#) (F) Your format contains the ~~ repeat-until-blank sequence and a numeric field that will never go blank so that the repetition never terminates. You might use ^# instead. See L<perlform>. =item Replacement list is longer than search list (W misc) You have used a replacement list that is longer than the search list. So the additional elements in the replacement list are meaningless. =item '(*%s' requires a terminating ':' in regex; marked by <-- HERE in m/%s/ (F) You used a construct that needs a colon and pattern argument. Supply these or check that you are using the right construct. =item '%s' resolved to '\o{%s}%d' As of Perl 5.32, this message is no longer generated. Instead, see L</Non-octal character '%c' terminates \o early. Resolved as "%s">. (W misc, regexp) You wrote something like C<\08>, or C<\179> in a double-quotish string. All but the last digit is treated as a single character, specified in octal. The last digit is the next character in the string. To tell Perl that this is indeed what you want, you can use the C<\o{ }> syntax, or use exactly three digits to specify the octal for the character. =item Reversed %s= operator (W syntax) You wrote your assignment operator backwards. The = must always come last, to avoid ambiguity with subsequent unary operators. =item rewinddir() attempted on invalid dirhandle %s (W io) The dirhandle you tried to do a rewinddir() on is either closed or not really a dirhandle. Check your control flow. 
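
The following sketch shows one way the C<rewinddir()> warning above can
arise: the directory handle is closed before it is rewound. The
C<__WARN__> handler is only there to collect the warning text for
display, and the use of the current directory is an arbitrary choice:

```perl
use strict;
use warnings;

my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };

opendir( my $dh, '.' ) or die "Cannot open current directory: $!";
closedir($dh);

# The handle is closed by now, so this call fails and, with the
# "io" warning category enabled, emits the warning.
my $ok = rewinddir($dh);

print "rewinddir failed\n" unless $ok;
print "warning: $_" for @warnings;
```

Checking the control flow here means making sure C<closedir> is not
reached before the last use of the handle.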
=item Scalars leaked: %d (S internal) Something went wrong in Perl's internal bookkeeping of scalars: not all scalar variables were deallocated by the time Perl exited. What this usually indicates is a memory leak, which is of course bad, especially if the Perl program is intended to be long-running. =item Scalar value @%s[%s] better written as $%s[%s] (W syntax) You've used an array slice (indicated by @) to select a single element of an array. Generally it's better to ask for a scalar value (indicated by $). The difference is that C<$foo[&bar]> always behaves like a scalar, both when assigning to it and when evaluating its argument, while C<@foo[&bar]> behaves like a list when you assign to it, and provides a list context to its subscript, which can do weird things if you're expecting only one subscript. On the other hand, if you were actually hoping to treat the array element as a list, you need to look into how references work, because Perl will not magically convert between scalars and lists for you. See L<perlref>. =item Scalar value @%s{%s} better written as $%s{%s} (W syntax) You've used a hash slice (indicated by @) to select a single element of a hash. Generally it's better to ask for a scalar value (indicated by $). The difference is that C<$foo{&bar}> always behaves like a scalar, both when assigning to it and when evaluating its argument, while C<@foo{&bar}> behaves like a list when you assign to it, and provides a list context to its subscript, which can do weird things if you're expecting only one subscript. On the other hand, if you were actually hoping to treat the hash element as a list, you need to look into how references work, because Perl will not magically convert between scalars and lists for you. See L<perlref>. =item Search pattern not terminated (F) The lexer couldn't find the final delimiter of a // or m{} construct. Remember that bracketing delimiters count nesting level. 
Missing the leading C<$> from a variable C<$m> may cause this error.

Note that since Perl 5.10.0 a // can also be the I<defined-or>
construct, not just the empty search pattern. Therefore code written
in Perl 5.10.0 or later that uses the // as the I<defined-or> can be
misparsed by pre-5.10.0 Perls as a non-terminated search pattern.

=item seekdir() attempted on invalid dirhandle %s

(W io) The dirhandle you are doing a seekdir() on is either closed or not
really a dirhandle. Check your control flow.

=item %sseek() on unopened filehandle

(W unopened) You tried to use the seek() or sysseek() function on a
filehandle that was either never opened or has since been closed.

=item select not implemented

(F) This machine doesn't implement the select() system call.

=item Self-ties of arrays and hashes are not supported

(F) Self-ties of arrays and hashes are not supported in
the current implementation.

=item Semicolon seems to be missing

(W semicolon) A nearby syntax error was probably caused by a missing
semicolon, or possibly some other missing operator, such as a comma.

=item semi-panic: attempt to dup freed string

(S internal) The internal newSVsv() routine was called to duplicate a
scalar that had previously been marked as free.

=item sem%s not implemented

(F) You don't have System V semaphore IPC on your system.

=item send() on closed socket %s

(W closed) The socket you're sending to got itself closed sometime
before now. Check your control flow.

=item Sequence "\c{" invalid

(F) These three characters may not appear in sequence in a
double-quotish context. This message is raised only on non-ASCII
platforms (a different error message is output on ASCII ones). If you
were intending to specify a control character with this sequence, you'll
have to use a different way to specify it.

=item Sequence (? incomplete in regex; marked by S<<-- HERE> in m/%s/

(F) A regular expression ended with an incomplete extension (?.
The S<<-- HERE> shows whereabouts in the regular expression the problem was discovered. See L<perlre>. =item Sequence (?%c...) not implemented in regex; marked by S<<-- HERE> in m/%s/ (F) A proposed regular expression extension has the character reserved but has not yet been written. The S<<-- HERE> shows whereabouts in the regular expression the problem was discovered. See L<perlre>. =item Sequence (?%s...) not recognized in regex; marked by S<<-- HERE> in m/%s/ (F) You used a regular expression extension that doesn't make sense. The S<<-- HERE> shows whereabouts in the regular expression the problem was discovered. This may happen when using the C<(?^...)> construct to tell Perl to use the default regular expression modifiers, and you redundantly specify a default modifier. For other causes, see L<perlre>. =item Sequence (?#... not terminated in regex m/%s/ (F) A regular expression comment must be terminated by a closing parenthesis. Embedded parentheses aren't allowed. See L<perlre>. =item Sequence (?&... not terminated in regex; marked by S<<-- HERE> in m/%s/ (F) A named reference of the form C<(?&...)> was missing the final closing parenthesis after the name. The S<<-- HERE> shows whereabouts in the regular expression the problem was discovered. =item Sequence (?%c... not terminated in regex; marked by S<<-- HERE> in m/%s/ (F) A named group of the form C<(?'...')> or C<< (?<...>) >> was missing the final closing quote or angle bracket. The S<<-- HERE> shows whereabouts in the regular expression the problem was discovered. =item Sequence (?(%c... not terminated in regex; marked by S<<-- HERE> in m/%s/ (F) A named reference of the form C<(?('...')...)> or C<< (?(<...>)...) >> was missing the final closing quote or angle bracket after the name. The S<<-- HERE> shows whereabouts in the regular expression the problem was discovered. =item Sequence (?... 
not terminated in regex; marked by S<<-- HERE> in m/%s/

(F) There was no matching closing parenthesis for the '('. The
S<<-- HERE> shows whereabouts in the regular expression the problem was
discovered.

=item Sequence \%s... not terminated in regex; marked by S<<-- HERE> in
m/%s/

(F) The regular expression expects a mandatory argument following the
escape sequence and this has been omitted or incorrectly written.

=item Sequence (?{...}) not terminated with ')'

(F) The end of the perl code contained within the {...} must be
followed immediately by a ')'.

=item Sequence (?PE<gt>... not terminated in regex; marked by
S<<-- HERE> in m/%s/

(F) A named reference of the form C<(?PE<gt>...)> was missing the final
closing parenthesis after the name. The S<<-- HERE> shows whereabouts
in the regular expression the problem was discovered.

=item Sequence (?PE<lt>... not terminated in regex; marked by
S<<-- HERE> in m/%s/

(F) A named group of the form C<(?PE<lt>...E<gt>)> was missing the final
closing angle bracket. The S<<-- HERE> shows whereabouts in the
regular expression the problem was discovered.

=item Sequence ?P=... not terminated in regex; marked by S<<-- HERE> in
m/%s/

(F) A named reference of the form C<(?P=...)> was missing the final
closing parenthesis after the name. The S<<-- HERE> shows whereabouts
in the regular expression the problem was discovered.

=item Sequence (?R) not terminated in regex m/%s/

(F) An C<(?R)> or C<(?0)> sequence in a regular expression was missing the
final parenthesis.

=item Z<>500 Server error

(A) This is the error message generally seen in a browser window
when trying to run a CGI program (including SSI) over the web. The
actual error text varies widely from server to server. The most
frequently-seen variants are "500 Server error", "Method (something)
not permitted", "Document contains no data", "Premature end of script
headers", and "Did not produce a valid header".

B<This is a CGI error, not a Perl error>.
You need to make sure your script is executable, is accessible by the user CGI is running the script under (which is probably not the user account you tested it under), does not rely on any environment variables (like PATH) from the user it isn't running under, and isn't in a location where the CGI server can't find it, basically, more or less. Please see the following for more information: https://www.perl.org/CGI_MetaFAQ.html http://www.htmlhelp.org/faq/cgifaq.html http://www.w3.org/Security/Faq/ You should also look at L<perlfaq9>. =item setegid() not implemented (F) You tried to assign to C<$)>, and your operating system doesn't support the setegid() system call (or equivalent), or at least Configure didn't think so. =item seteuid() not implemented (F) You tried to assign to C<< $> >>, and your operating system doesn't support the seteuid() system call (or equivalent), or at least Configure didn't think so. =item setpgrp can't take arguments (F) Your system has the setpgrp() from BSD 4.2, which takes no arguments, unlike POSIX setpgid(), which takes a process ID and process group ID. =item setrgid() not implemented (F) You tried to assign to C<$(>, and your operating system doesn't support the setrgid() system call (or equivalent), or at least Configure didn't think so. =item setruid() not implemented (F) You tried to assign to C<$<>, and your operating system doesn't support the setruid() system call (or equivalent), or at least Configure didn't think so. =item setsockopt() on closed socket %s (W closed) You tried to set a socket option on a closed socket. Did you forget to check the return value of your socket() call? See L<perlfunc/setsockopt>. =item Setting $/ to a reference to %s is forbidden (F) You assigned a reference to a scalar to C<$/> where the referenced item is not a positive integer. 
In older perls this B<appeared> to work the same as setting it to C<undef> but was in fact internally different, less efficient and with very bad luck could have resulted in your file being split by a stringified form of the reference. In Perl 5.20.0 this was changed so that it would be B<exactly> the same as setting C<$/> to undef, with the exception that this warning would be thrown. You are recommended to change your code to set C<$/> to C<undef> explicitly if you wish to slurp the file. As of Perl 5.28 assigning C<$/> to a reference to an integer which isn't positive is a fatal error. =item Setting $/ to %s reference is forbidden (F) You tried to assign a reference to a non integer to C<$/>. In older Perls this would have behaved similarly to setting it to a reference to a positive integer, where the integer was the address of the reference. As of Perl 5.20.0 this is a fatal error, to allow future versions of Perl to use non-integer refs for more interesting purposes. =item shm%s not implemented (F) You don't have System V shared memory IPC on your system. =item !=~ should be !~ (W syntax) The non-matching operator is !~, not !=~. !=~ will be interpreted as the != (numeric not equal) and ~ (1's complement) operators: probably not what you intended. =item /%s/ should probably be written as "%s" (W syntax) You have used a pattern where Perl expected to find a string, as in the first argument to C<join>. Perl will treat the true or false result of matching the pattern against $_ as the string, which is probably not what you had in mind. =item shutdown() on closed socket %s (W closed) You tried to do a shutdown on a closed socket. Seems a bit superfluous. =item SIG%s handler "%s" not defined (W signal) The signal handler named in %SIG doesn't, in fact, exist. Perhaps you put it into the wrong package? =item Slab leaked from cv %p (S) If you see this message, then something is seriously wrong with the internal bookkeeping of op trees. 
An op tree needed to be freed after a compilation error, but could not
be found, so it was leaked instead.

=item sleep(%u) too large

(W overflow) You called C<sleep> with a number that was larger than
it can reliably handle and C<sleep> probably slept for less time than
requested.

=item Slurpy parameter not last

(F) In a subroutine signature, you put something after a slurpy (array or
hash) parameter. The slurpy parameter takes all the available arguments,
so there can't be any left to fill later parameters.

=item Smart matching a non-overloaded object breaks encapsulation

(F) You should not use the C<~~> operator on an object that does not
overload it: Perl refuses to use the object's underlying structure
for the smart match.

=item Smartmatch is experimental

(S experimental::smartmatch) This warning is emitted if you
use the smartmatch (C<~~>) operator. This is currently an experimental
feature, and its details are subject to change in future releases of
Perl. Particularly, its current behavior is noted for being
unnecessarily complex and unintuitive, and is very likely to be
overhauled.

=item Sorry, hash keys must be smaller than 2**31 bytes

(F) You tried to create a hash containing a very large key, where "very
large" means that it needs at least 2 gigabytes to store. Unfortunately,
Perl doesn't yet handle such large hash keys. You should
reconsider your design to avoid hashing such a long string directly.

=item sort is now a reserved word

(F) An ancient error message that almost nobody ever runs into anymore.
But before sort was a keyword, people sometimes used it as a filehandle.

=item Source filters apply only to byte streams

(F) You tried to activate a source filter (usually by loading a
source filter module) within a string passed to C<eval>. This is
not permitted under the C<unicode_eval> feature. Consider using
C<evalbytes> instead. See L<feature>.
=item splice() offset past end of array

(W misc) You attempted to specify an offset that was past the end of
the array passed to splice(). Splicing will instead commence at the
end of the array, rather than past it. If this isn't what you want,
try explicitly pre-extending the array by assigning $#array = $offset.
See L<perlfunc/splice>.

=item Split loop

(P) The split was looping infinitely. (Obviously, a split shouldn't
iterate more times than there are characters of input, which is what
happened.) See L<perlfunc/split>.

=item Statement unlikely to be reached

(W exec) You did an exec() with some statement after it other than a
die(). This is almost always an error, because exec() never returns
unless there was a failure. You probably wanted to use system()
instead, which does return. To suppress this warning, put the exec() in
a block by itself.

=item "state" subroutine %s can't be in a package

(F) Lexically scoped subroutines aren't in a package, so it doesn't make
sense to try to declare one with a package qualifier on the front.

=item "state %s" used in sort comparison

(W syntax) The package variables $a and $b are used for sort comparisons.
You used $a or $b as an operand to the C<< <=> >> or C<cmp> operator inside a
sort comparison block, and the variable had earlier been declared as a
lexical variable. Either qualify the sort variable with the package
name, or rename the lexical variable.

=item "state" variable %s can't be in a package

(F) Lexically scoped variables aren't in a package, so it doesn't make
sense to try to declare one with a package qualifier on the front. Use
local() if you want to localize a package variable.

=item stat() on unopened filehandle %s

(W unopened) You tried to use the stat() function on a filehandle that
was either never opened or has since been closed.
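
As a hedged illustration of the C<stat()> warning above, the sketch
below calls C<stat> on a glob that has never been opened. The use of
C<Symbol::gensym> to obtain such a glob and the C<__WARN__> handler
that collects the message are choices made for this example only:

```perl
use strict;
use warnings;
use Symbol qw(gensym);

my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };

my $fh     = gensym();     # an anonymous glob that was never opened
my @fields = stat($fh);    # fails: there is no file behind the handle

print "stat returned nothing\n" unless @fields;
print "warning: $_" for @warnings;
```

Testing the return value of C<open> before calling C<stat> on the
handle avoids reaching this state.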
=item Strings with code points over 0xFF may not be mapped into in-memory
file handles

(W utf8) You tried to open a reference to a scalar for read or append
where the scalar contained code points over 0xFF. In-memory files
model on-disk files and can only contain bytes.

=item Stub found while resolving method "%s" overloading "%s" in package
"%s"

(P) Overloading resolution over @ISA tree may be broken by importation
stubs. Stubs should never be implicitly created, but explicit calls to
C<can> may break this.

=item Subroutine attributes must come before the signature

(F) When subroutine signatures are enabled, any subroutine attributes must
come before the signature. Note that this order was the opposite in
versions 5.22..5.26. So:

    sub foo :lvalue ($a, $b) { ... }  # 5.20 and 5.28 +
    sub foo ($a, $b) :lvalue { ... }  # 5.22 .. 5.26

=item Subroutine "&%s" is not available

(W closure) During compilation, an inner named subroutine or eval is
attempting to capture an outer lexical subroutine that is not currently
available. This can happen for one of two reasons. First, the lexical
subroutine may be declared in an outer anonymous subroutine that has
not yet been created. (Remember that named subs are created at compile
time, while anonymous subs are created at run-time.) For example,

    sub {
        my sub a {...}
        sub f { \&a }
    }

At the time that f is created, it can't capture the current "a" sub,
since the anonymous subroutine hasn't been created yet. Conversely, the
following won't give a warning since the anonymous subroutine has by now
been created and is live:

    sub {
        my sub a {...}
        eval 'sub f { \&a }'
    }->();

The second situation is caused by an eval accessing a lexical subroutine
that has gone out of scope, for example,

    sub f {
        my sub a {...}
        sub { eval '\&a' }
    }
    f()->();

Here, when the '\&a' in the eval is being compiled, f() is not currently
being executed, so its &a is not available for capture.
=item "%s" subroutine &%s masks earlier declaration in same %s

(W shadow) A "my" or "state" subroutine has been redeclared in the
current scope or statement, effectively eliminating all access to
the previous instance. This is almost always a typographical error.
Note that the earlier subroutine will still exist until the end of
the scope or until all closure references to it are destroyed.

=item Subroutine %s redefined

(W redefine) You redefined a subroutine. To suppress this warning, say

    {
        no warnings 'redefine';
        eval "sub name { ... }";
    }

=item Subroutine "%s" will not stay shared

(W closure) An inner (nested) I<named> subroutine is referencing a "my"
subroutine defined in an outer named subroutine.

When the inner subroutine is called, it will see the value of the outer
subroutine's lexical subroutine as it was before and during the *first*
call to the outer subroutine; in this case, after the first call to the
outer subroutine is complete, the inner and outer subroutines will no
longer share a common value for the lexical subroutine. In other words,
it will no longer be shared. This will especially make a difference
if the lexical subroutine accesses lexical variables declared in its
surrounding scope.

This problem can usually be solved by making the inner subroutine
anonymous, using the C<sub {}> syntax. When inner anonymous subs that
reference lexical subroutines in outer subroutines are created, they
are automatically rebound to the current values of such lexical subs.

=item Substitution loop

(P) The substitution was looping infinitely. (Obviously, a substitution
shouldn't iterate more times than there are characters of input, which
is what happened.) See the discussion of substitution in
L<perlop/"Regexp Quote-Like Operators">.

=item Substitution pattern not terminated

(F) The lexer couldn't find the interior delimiter of an s/// or s{}{}
construct. Remember that bracketing delimiters count nesting level.
Missing the leading C<$> from variable C<$s> may cause this error.

=item Substitution replacement not terminated

(F) The lexer couldn't find the final delimiter of an s/// or s{}{}
construct. Remember that bracketing delimiters count nesting level.
Missing the leading C<$> from variable C<$s> may cause this error.

=item substr outside of string

(W substr)(F) You tried to reference a substr() that pointed outside of
a string. That is, the absolute value of the offset was larger than the
length of the string. See L<perlfunc/substr>. This warning is fatal if
substr is used in an lvalue context (as the left hand side of an
assignment or as a subroutine argument for example).

=item sv_upgrade from type %d down to type %d

(P) Perl tried to force the upgrade of an SV to a type which was actually
inferior to its current type.

=item Switch (?(condition)... contains too many branches in regex; marked by
S<<-- HERE> in m/%s/

(F) A (?(condition)if-clause|else-clause) construct can have at most
two branches (the if-clause and the else-clause). If you want one or
both to contain alternation, such as using C<this|that|other>, enclose
it in clustering parentheses:

    (?(condition)(?:this|that|other)|else-clause)

The S<<-- HERE> shows whereabouts in the regular expression the problem
was discovered. See L<perlre>.

=item Switch condition not recognized in regex; marked by S<<-- HERE> in
m/%s/

(F) The condition part of a (?(condition)if-clause|else-clause) construct
is not known. The condition must be one of the following:

    (1) (2) ...        true if 1st, 2nd, etc., capture matched
    (<NAME>) ('NAME')  true if named capture matched
    (?=...) (?<=...)   true if subpattern matches
    (?!...) (?<!...)   true if subpattern fails to match
    (?{ CODE })        true if code returns a true value
    (R)                true if evaluating inside recursion
    (R1) (R2) ...      true if directly inside capture group 1, 2, etc.
    (R&NAME)           true if directly inside named capture
    (DEFINE)           always false; for defining named subpatterns

The S<<-- HERE> shows whereabouts in the regular expression the problem was
discovered. See L<perlre>.

=item Switch (?(condition)... not terminated in regex; marked by
S<<-- HERE> in m/%s/

(F) You omitted to close a (?(condition)...) block somewhere
in the pattern. Add a closing parenthesis in the appropriate
position. See L<perlre>.

=item switching effective %s is not implemented

(F) While under the C<use filetest> pragma, we cannot switch the real
and effective uids or gids.

=item syntax error

(F) Probably means you had a syntax error. Common reasons include:

    A keyword is misspelled.
    A semicolon is missing.
    A comma is missing.
    An opening or closing parenthesis is missing.
    An opening or closing brace is missing.
    A closing quote is missing.

Often there will be another error message associated with the syntax
error giving more information. (Sometimes it helps to turn on B<-w>.)
The error message itself often tells you where it was in the line when
it decided to give up. Sometimes the actual error is several tokens
before this, because Perl is good at understanding random input.
Occasionally the line number may be misleading, and once in a blue moon
the only way to figure out what's triggering the error is to call
C<perl -c> repeatedly, chopping away half the program each time to see
if the error went away. Sort of the cybernetic version of S<20 questions>.

=item syntax error at line %d: '%s' unexpected

(A) You've accidentally run your script through the Bourne shell instead
of Perl. Check the #! line, or manually feed your script into Perl
yourself.

=item syntax error in file %s at line %d, next 2 tokens "%s"

(F) This error is likely to occur if you run a perl5 script through
a perl4 interpreter, especially if the next 2 tokens are "use strict"
or "my $var" or "our $var".
=item Syntax error in (?[...]) in regex; marked by <-- HERE in m/%s/ (F) Perl could not figure out what you meant inside this construct; this notifies you that it is giving up trying. =item %s syntax OK (F) The final summary message when a C<perl -c> succeeds. =item sysread() on closed filehandle %s (W closed) You tried to read from a closed filehandle. =item sysread() on unopened filehandle %s (W unopened) You tried to read from a filehandle that was never opened. =item System V %s is not implemented on this machine (F) You tried to do something with a function beginning with "sem", "shm", or "msg" but that System V IPC is not implemented in your machine. In some machines the functionality can exist but be unconfigured. Consult your system support. =item syswrite() on closed filehandle %s (W closed) The filehandle you're writing to got itself closed sometime before now. Check your control flow. =item C<-T> and C<-B> not implemented on filehandles (F) Perl can't peek at the stdio buffer of filehandles when it doesn't know about your kind of stdio. You'll have to use a filename instead. =item Target of goto is too deeply nested (F) You tried to use C<goto> to reach a label that was too deeply nested for Perl to reach. Perl is doing you a favor by refusing. =item telldir() attempted on invalid dirhandle %s (W io) The dirhandle you tried to telldir() is either closed or not really a dirhandle. Check your control flow. =item tell() on unopened filehandle (W unopened) You tried to use the tell() function on a filehandle that was either never opened or has since been closed. =item The crypt() function is unimplemented due to excessive paranoia. (F) Configure couldn't find the crypt() function on your machine, probably because your vendor didn't supply it, probably because they think the U.S. Government thinks it's a secret, or at least that they will continue to pretend that it is. And if you quote me on that, I will deny it. 
=item The experimental declared_refs feature is not enabled (F) To declare references to variables, as in C<my \%x>, you must first enable the feature: no warnings "experimental::declared_refs"; use feature "declared_refs"; =item The %s function is unimplemented (F) The function indicated isn't implemented on this architecture, according to the probings of Configure. =item The private_use feature is experimental (S experimental::private_use) This feature is actually a hook for future use. =item The regex_sets feature is experimental (S experimental::regex_sets) This warning is emitted if you use the syntax S<C<(?[ ])>> in a regular expression. The details of this feature are subject to change. If you want to use it, but know that in doing so you are taking the risk of using an experimental feature which may change in a future Perl version, you can do this to silence the warning: no warnings "experimental::regex_sets"; =item The signatures feature is experimental (S experimental::signatures) This warning is emitted if you unwrap a subroutine's arguments using a signature. Simply suppress the warning if you want to use the feature, but know that in doing so you are taking the risk of using an experimental feature which may change or be removed in a future Perl version: no warnings "experimental::signatures"; use feature "signatures"; sub foo ($left, $right) { ... } =item The stat preceding %s wasn't an lstat (F) It makes no sense to test the current stat buffer for symbolic linkhood if the last stat that wrote to the stat buffer already went past the symlink to get to the real file. Use an actual filename instead. =item The Unicode property wildcards feature is experimental (S experimental::uniprop_wildcards) This feature is experimental and its behavior may change in any future release of perl. See L<perlunicode/Wildcards in Property Values>.
=item The 'unique' attribute may only be applied to 'our' variables (F) This attribute was never supported on C<my> or C<sub> declarations. =item This Perl can't reset CRTL environ elements (%s) =item This Perl can't set CRTL environ elements (%s=%s) (W internal) Warnings peculiar to VMS. You tried to change or delete an element of the CRTL's internal environ array, but your copy of Perl wasn't built with a CRTL that contained the setenv() function. You'll need to rebuild Perl with a CRTL that does, or redefine F<PERL_ENV_TABLES> (see L<perlvms>) so that the environ array isn't the target of the change to %ENV which produced the warning. =item This Perl has not been built with support for randomized hash key traversal but something called Perl_hv_rand_set(). (F) Something has attempted to use an internal API call which depends on Perl being compiled with the default support for randomized hash key traversal, but this Perl has been compiled without it. You should report this warning to the relevant upstream party, or recompile perl with default options. =item This use of my() in false conditional is no longer allowed (F) You used a declaration similar to C<my $x if 0>. There has been a long-standing bug in Perl that causes a lexical variable not to be cleared at scope exit when its declaration includes a false conditional. Some people have exploited this bug to achieve a kind of static variable. Since we intend to fix this bug, we don't want people relying on this behavior. You can achieve a similar static effect by declaring the variable in a separate block outside the function, eg sub f { my $x if 0; return $x++ } becomes { my $x; sub f { return $x++ } } Beginning with perl 5.10.0, you can also use C<state> variables to have lexicals that are initialized only once (see L<feature>): sub f { state $x; return $x++ } This use of C<my()> in a false conditional was deprecated beginning in Perl 5.10 and became a fatal error in Perl 5.30. 
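The two replacements for C<my $x if 0> described above can be compared side by side. This is a small sketch; the subroutine names C<next_id_closure> and C<next_id_state> are invented for the illustration.

```perl
use strict;
use warnings;
use feature 'state';

# Closure form: $count lives in the enclosing block and keeps its value
# between calls because the named sub captures it.
{
    my $count = 0;
    sub next_id_closure { return $count++ }
}

# state form (perl 5.10.0 and later): $count is initialized only once,
# the first time the declaration is executed.
sub next_id_state {
    state $count = 0;
    return $count++;
}

my @closure_ids = map { next_id_closure() } 1 .. 3;
my @state_ids   = map { next_id_state() } 1 .. 3;

print "closure: @closure_ids\n";
print "state:   @state_ids\n";
```

Both forms give each subroutine a private counter that survives between calls, without relying on the old false-conditional behavior.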
=item Timeout waiting for another thread to define \p{%s} (F) The first time a user-defined property (L<perlunicode/User-Defined Character Properties>) is used, its definition is looked up and converted into an internal form for more efficient handling in subsequent uses. There could be a race if two or more threads tried to do this processing nearly simultaneously. Instead, a critical section is created around this task, locking out all but one thread from doing it. This message indicates that the thread that is doing the conversion is taking an unexpectedly long time. The timeout exists solely to prevent deadlock; it's long enough that the system was likely thrashing and about to crash. There is no real remedy but rebooting. =item times not implemented (F) Your version of the C library apparently doesn't do times(). I suspect you're not running on Unix. =item "-T" is on the #! line, it must also be used on the command line (X) The #! line (or local equivalent) in a Perl script contains the B<-T> option (or the B<-t> option), but Perl was not invoked with B<-T> in its command line. This is an error because, by the time Perl discovers a B<-T> in a script, it's too late to properly taint everything from the environment. So Perl gives up. If the Perl script is being executed as a command using the #! mechanism (or its local equivalent), this error can usually be fixed by editing the #! line so that the B<-%c> option is a part of Perl's first argument: e.g. change C<perl -n -%c> to C<perl -%c -n>. If the Perl script is being executed as C<perl scriptname>, then the B<-%c> option must appear on the command line: C<perl -%c scriptname>. =item To%s: illegal mapping '%s' (F) You tried to define a customized To-mapping for lc(), lcfirst, uc(), or ucfirst() (or their string-inlined versions), but you specified an illegal mapping. See L<perlunicode/"User-Defined Character Properties">. 
=item Too deeply nested ()-groups (F) Your template contains ()-groups with a ridiculously deep nesting level. =item Too few args to syscall (F) There has to be at least one argument to syscall() to specify the system call to call, silly dilly. =item Too few arguments for subroutine '%s' (F) A subroutine using a signature received fewer arguments than required by the signature. The caller of the subroutine is presumably at fault. The message attempts to include the name of the called subroutine. If the subroutine has been aliased, the subroutine's original name will be shown, regardless of what name the caller used. =item Too late for "-%s" option (X) The #! line (or local equivalent) in a Perl script contains the B<-M>, B<-m> or B<-C> option. In the case of B<-M> and B<-m>, this is an error because those options are not intended for use inside scripts. Use the C<use> pragma instead. The B<-C> option only works if it is specified on the command line as well (with the same sequence of letters or numbers following). Either specify this option on the command line, or, if your system supports it, make your script executable and run it directly instead of passing it to perl. =item Too late to run %s block (W void) A CHECK or INIT block is being defined during run time proper, when the opportunity to run them has already passed. Perhaps you are loading a file with C<require> or C<do> when you should be using C<use> instead. Or perhaps you should put the C<require> or C<do> inside a BEGIN block. =item Too many args to syscall (F) Perl supports a maximum of only 14 args to syscall(). =item Too many arguments for %s (F) The function requires fewer arguments than you specified. =item Too many arguments for subroutine '%s' (F) A subroutine using a signature received more arguments than permitted by the signature. The caller of the subroutine is presumably at fault. The message attempts to include the name of the called subroutine.
If the subroutine has been aliased, the subroutine's original name will be shown, regardless of what name the caller used. =item Too many nested open parens in regex; marked by <-- HERE in m/%s/ (F) You have exceeded the number of open C<"("> parentheses that haven't been matched by corresponding closing ones. This limit prevents eating up too much memory. It is initially set to 1000, but may be changed by setting C<${^RE_COMPILE_RECURSION_LIMIT}> to some other value. This may need to be done in a BEGIN block before the regular expression pattern is compiled. =item Too many )'s (A) You've accidentally run your script through B<csh> instead of Perl. Check the #! line, or manually feed your script into Perl yourself. =item Too many ('s (A) You've accidentally run your script through B<csh> instead of Perl. Check the #! line, or manually feed your script into Perl yourself. =item Trailing \ in regex m/%s/ (F) The regular expression ends with an unbackslashed backslash. Backslash it. See L<perlre>. =item Transliteration pattern not terminated (F) The lexer couldn't find the interior delimiter of a tr/// or tr[][] or y/// or y[][] construct. Missing the leading C<$> from variables C<$tr> or C<$y> may cause this error. =item Transliteration replacement not terminated (F) The lexer couldn't find the final delimiter of a tr///, tr[][], y/// or y[][] construct. =item '%s' trapped by operation mask (F) You tried to use an operator from a Safe compartment in which it's disallowed. See L<Safe>. =item truncate not implemented (F) Your machine doesn't implement a file truncation mechanism that Configure knows about. =item Type of arg %d to &CORE::%s must be %s (F) The subroutine in question in the CORE package requires its argument to be a hard reference to data of the specified type. Overloading is ignored, so a reference to an object that is not the specified type, but nonetheless has overloading to handle it, will still not be accepted. 
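The "Too few arguments for subroutine" and "Too many arguments for subroutine" errors above are raised at run time, so a caller can trap them with C<eval>. The sketch below assumes a perl recent enough to support signatures; the subroutine names C<add> and C<try_call> are hypothetical. The exact text after the fixed prefix varies between perl versions, so the example only looks at the prefix.

```perl
use strict;
use warnings;
use feature 'signatures';
no warnings 'experimental::signatures';

# A subroutine whose signature requires exactly two arguments.
sub add ($left, $right) { return $left + $right }

# Call add() with whatever was passed in and report the outcome.
sub try_call {
    my @args = @_;
    my $result = eval { add(@args) };
    return defined $result ? "ok: $result" : "error: $@";
}

my $exact    = try_call(1, 2);
my $too_few  = try_call(1);
my $too_many = try_call(1, 2, 3);

print "$exact\n";
print $too_few;
print $too_many;
```

Because the argument count is checked when the call happens, the fault lies with the call site, which is why the diagnostics say the caller is presumably at fault.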
=item Type of arg %d to %s must be %s (not %s) (F) This function requires the argument in that position to be of a certain type. Arrays must be @NAME or C<@{EXPR}>. Hashes must be %NAME or C<%{EXPR}>. No implicit dereferencing is allowed--use the {EXPR} forms as an explicit dereference. See L<perlref>. =item umask not implemented (F) Your machine doesn't implement the umask function and you tried to use it to restrict permissions for yourself (EXPR & 0700). =item Unbalanced context: %d more PUSHes than POPs (S internal) The exit code detected an internal inconsistency in how many execution contexts were entered and left. =item Unbalanced saves: %d more saves than restores (S internal) The exit code detected an internal inconsistency in how many values were temporarily localized. =item Unbalanced scopes: %d more ENTERs than LEAVEs (S internal) The exit code detected an internal inconsistency in how many blocks were entered and left. =item Unbalanced string table refcount: (%d) for "%s" (S internal) On exit, Perl found some strings remaining in the shared string table used for copy on write and for hash keys. The entries should have been freed, so this indicates a bug somewhere. =item Unbalanced tmps: %d more allocs than frees (S internal) The exit code detected an internal inconsistency in how many mortal scalars were allocated and freed. =item Undefined format "%s" called (F) The format indicated doesn't seem to exist. Perhaps it's really in another package? See L<perlform>. =item Undefined sort subroutine "%s" called (F) The sort comparison routine specified doesn't seem to exist. Perhaps it's in a different package? See L<perlfunc/sort>. =item Undefined subroutine &%s called (F) The subroutine indicated hasn't been defined, or if it was, it has since been undefined. =item Undefined subroutine called (F) The anonymous subroutine you're trying to call hasn't been defined, or if it was, it has since been undefined. 
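The "Undefined subroutine &%s called" error above is fatal but trappable. The following sketch shows one way to trigger it deliberately and one way to check for a subroutine before calling it; the names C<no_such_routine> and C<greet> are hypothetical and exist only for this example.

```perl
use strict;
use warnings;

sub greet { return "hello" }

# Calling a subroutine by name when no such subroutine has been defined.
my $missing = 'no_such_routine';
my $called  = eval {
    no strict 'refs';          # symbolic call, for demonstration only
    &{"main::$missing"}();
    1;
};
my $error = $called ? '' : $@;

# UNIVERSAL::can returns a code reference when the routine exists and a
# false value otherwise, so the call can be guarded.
my $greet_ref   = main->can('greet');
my $missing_ref = main->can($missing);
my $greeting    = $greet_ref ? $greet_ref->() : 'not available';

print "error: $error";
print "greeting: $greeting\n";
```

Checking with C<can> first avoids the fatal error when a routine is optional, for example when it is provided by a module that may not be loaded.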
=item Undefined subroutine in sort (F) The sort comparison routine specified is declared but doesn't seem to have been defined yet. See L<perlfunc/sort>. =item Undefined top format "%s" called (F) The format indicated doesn't seem to exist. Perhaps it's really in another package? See L<perlform>. =item Undefined value assigned to typeglob (W misc) An undefined value was assigned to a typeglob, a la C<*foo = undef>. This does nothing. It's possible that you really mean C<undef *foo>. =item %s: Undefined variable (A) You've accidentally run your script through B<csh> instead of Perl. Check the #! line, or manually feed your script into Perl yourself. =item Unescaped left brace in regex is illegal here in regex; marked by S<<-- HERE> in m/%s/ (F) The simple rule to remember, if you want to match a literal C<"{"> character (U+007B C<LEFT CURLY BRACKET>) in a regular expression pattern, is to escape each literal instance of it in some way. Generally easiest is to precede it with a backslash, like C<"\{"> or enclose it in square brackets (C<"[{]">). If the pattern delimiters are also braces, any matching right brace (C<"}">) should also be escaped to avoid confusing the parser, for example, qr{abc\{def\}ghi} Forcing literal C<"{"> characters to be escaped enables the Perl language to be extended in various ways in future releases. To avoid needlessly breaking existing code, the restriction is not enforced in contexts where there are unlikely to ever be extensions that could conflict with the use there of C<"{"> as a literal. Those that are not potentially ambiguous do not warn; those that are do raise a non-deprecation warning. The contexts where no warnings or errors are raised are: =over 4 =item * as the first character in a pattern, or following C<"^"> indicating to anchor the match to the beginning of a line. =item * as the first character following a C<"|"> indicating alternation. 
=item * as the first character in a parenthesized grouping like /foo({bar)/ /foo(?:{bar)/ =item * as the first character following a quantifier /\s*{/ =back =for comment The text of the message above is mostly duplicated below (with changes) to allow splain (and 'use diagnostics') to work. Since one is fatal, and one not, they can't be combined as one message. Perhaps perldiag could be enhanced to handle this case. =item Unescaped left brace in regex is passed through in regex; marked by S<<-- HERE> in m/%s/ (W regexp) The simple rule to remember, if you want to match a literal C<"{"> character (U+007B C<LEFT CURLY BRACKET>) in a regular expression pattern, is to escape each literal instance of it in some way. Generally easiest is to precede it with a backslash, like C<"\{"> or enclose it in square brackets (C<"[{]">). If the pattern delimiters are also braces, any matching right brace (C<"}">) should also be escaped to avoid confusing the parser, for example, qr{abc\{def\}ghi} Forcing literal C<"{"> characters to be escaped enables the Perl language to be extended in various ways in future releases. To avoid needlessly breaking existing code, the restriction is not enforced in contexts where there are unlikely to ever be extensions that could conflict with the use there of C<"{"> as a literal. Those that are not potentially ambiguous do not warn; those that are raise this warning. This makes sure that an inadvertent typo doesn't silently cause the pattern to compile to something unintended. The contexts where no warnings or errors are raised are: =over 4 =item * as the first character in a pattern, or following C<"^"> indicating to anchor the match to the beginning of a line. =item * as the first character following a C<"|"> indicating alternation. 
=item * as the first character in a parenthesized grouping like /foo({bar)/ /foo(?:{bar)/ =item * as the first character following a quantifier /\s*{/ =back =item Unescaped literal '%c' in regex; marked by <-- HERE in m/%s/ (W regexp) (only under C<S<use re 'strict'>>) Within the scope of C<S<use re 'strict'>> in a regular expression pattern, you included an unescaped C<}> or C<]> which was interpreted literally. These two characters are sometimes metacharacters, and sometimes literals, depending on what precedes them in the pattern. This is unlike the similar C<)> which is always a metacharacter unless escaped. This action at a distance, perhaps a large distance, can lead to Perl silently misinterpreting what you meant, so when you specify that you want extra checking by C<S<use re 'strict'>>, this warning is generated. If you meant the character as a literal, simply confirm that to Perl by preceding the character with a backslash, or make it into a bracketed character class (like C<[}]>). If you meant it as closing a corresponding C<[> or C<{>, you'll need to look back through the pattern to find out why that isn't happening. =item unexec of %s into %s failed! (F) The unexec() routine failed for some reason. See your local FSF representative, who probably put it there in the first place. =item Unexpected binary operator '%c' with no preceding operand in regex; marked by S<<-- HERE> in m/%s/ (F) You had something like this: (?[ | \p{Digit} ]) where the C<"|"> is a binary operator with an operand on the right, but no operand on the left. =item Unexpected character in regex; marked by S<<-- HERE> in m/%s/ (F) You had something like this: (?[ z ]) Within C<(?[ ])>, no literal characters are allowed unless they are within an inner pair of square brackets, like (?[ [ z ] ]) Another possibility is that you forgot a backslash. Perl isn't smart enough to figure out what you really meant. 
=item Unexpected constant lvalue entersub entry via type/targ %d:%d (P) When compiling a subroutine call in lvalue context, Perl failed an internal consistency check. It encountered a malformed op tree. =item Unexpected exit %u (S) exit() was called or the script otherwise finished gracefully when C<PERL_EXIT_WARN> was set in C<PL_exit_flags>. =item Unexpected exit failure %d (S) An uncaught die() was called when C<PERL_EXIT_WARN> was set in C<PL_exit_flags>. =item Unexpected ')' in regex; marked by S<<-- HERE> in m/%s/ (F) You had something like this: (?[ ( \p{Digit} + ) ]) The C<")"> is out-of-place. Something apparently was supposed to be combined with the digits, or the C<"+"> shouldn't be there, or something like that. Perl can't figure out what was intended. =item Unexpected ']' with no following ')' in (?[... in regex; marked by <-- HERE in m/%s/ (F) While parsing an extended character class a ']' character was encountered at a point in the definition where the only legal use of ']' is to close the character class definition as part of a '])', you may have forgotten the close paren, or otherwise confused the parser. =item Unexpected '(' with no preceding operator in regex; marked by S<<-- HERE> in m/%s/ (F) You had something like this: (?[ \p{Digit} ( \p{Lao} + \p{Thai} ) ]) There should be an operator before the C<"(">, as there's no indication as to how the digits are to be combined with the characters in the Lao and Thai scripts. =item Unicode non-character U+%X is not recommended for open interchange (S nonchar) Certain codepoints, such as U+FFFE and U+FFFF, are defined by the Unicode standard to be non-characters. Those are legal codepoints, but are reserved for internal use; so, applications shouldn't attempt to exchange them. An application may not be expecting any of these characters at all, and receiving them may lead to bugs. If you know what you are doing you can turn off this warning by C<no warnings 'nonchar';>. 
This is not really a "severe" error, but it is supposed to be raised by default even if warnings are not enabled, and currently the only way to do that in Perl is to mark it as serious. =item Unicode property wildcard not terminated (F) A Unicode property wildcard looks like a delimited regular expression pattern (all within the braces of the enclosing C<\p{...}>). The closing delimiter to match the opening one was not found. If the opening one is escaped by preceding it with a backslash, the closing one must also be so escaped. =item Unicode string properties are not implemented in (?[...]) in regex; marked by <-- HERE in m/%s/ (F) A Unicode string property is one which expands to a sequence of multiple characters. An example is C<\p{name=KATAKANA LETTER AINU P}>, which is comprised of the sequence C<\N{KATAKANA LETTER SMALL H}> followed by C<\N{COMBINING KATAKANA-HIRAGANA SEMI-VOICED SOUND MARK}>. Extended character classes, C<(?[...])> currently cannot handle these. =item Unicode surrogate U+%X is illegal in UTF-8 (S surrogate) You had a UTF-16 surrogate in a context where they are not considered acceptable. These code points, between U+D800 and U+DFFF (inclusive), are used by Unicode only for UTF-16. However, Perl internally allows all unsigned integer code points (up to the size limit available on your platform), including surrogates. But these can cause problems when being input or output, which is likely where this message came from. If you really really know what you are doing you can turn off this warning by C<no warnings 'surrogate';>. =item Unknown charname '%s' (F) The name you used inside C<\N{}> is unknown to Perl. Check the spelling. You can say C<use charnames ":loose"> to not have to be so precise about spaces, hyphens, and capitalization on standard Unicode names. (Any custom aliases that have been created must be specified exactly, regardless of whether C<:loose> is used or not.)
This error may also happen if the C<\N{}> is not in the scope of the corresponding C<S<use charnames>>. =item Unknown '(*...)' construct '%s' in regex; marked by <-- HERE in m/%s/ (F) The C<(*> was followed by something that the regular expression compiler does not recognize. Check your spelling. =item Unknown error (P) Perl was about to print an error message in C<$@>, but the C<$@> variable did not exist, even after an attempt to create it. =item Unknown locale category %d; can't set it to %s (W locale) You used a locale category that perl doesn't recognize, so it cannot carry out your request. Check that you are using a valid category. If so, see L<perllocale/Multi-threaded> for advice on reporting this as a bug, and for modifying perl locally to accommodate your needs. =item Unknown open() mode '%s' (F) The second argument of 3-argument open() is not among the list of valid modes: C<< < >>, C<< > >>, C<<< >> >>>, C<< +< >>, C<< +> >>, C<<< +>> >>>, C<-|>, C<|->, C<< <& >>, C<< >& >>. =item Unknown PerlIO layer "%s" (W layer) An attempt was made to push an unknown layer onto the Perl I/O system. (Layers take care of transforming data between external and internal representations.) Note that some layers, such as C<mmap>, are not supported in all environments. If your program didn't explicitly request the failing operation, it may be the result of the value of the environment variable PERLIO. =item Unknown process %x sent message to prime_env_iter: %s (P) An error peculiar to VMS. Perl was reading values for %ENV before iterating over it, and someone else stuck a message in the stream of data Perl expected. Someone's very confused, or perhaps trying to subvert Perl's population of %ENV for nefarious purposes. =item Unknown regexp modifier "/%s" (F) Alphanumerics immediately following the closing delimiter of a regular expression pattern are interpreted by Perl as modifier flags for the regex. One of the ones you specified is invalid. 
One way this can happen is if you didn't put in white space between the end of the regex and a following alphanumeric operator: if ($a =~ /foo/and $bar == 3) { ... } The C<"a"> is a valid modifier flag, but the C<"n"> is not, and raises this error. Likely what was meant instead was: if ($a =~ /foo/ and $bar == 3) { ... } =item Unknown "re" subpragma '%s' (known ones are: %s) (W) You tried to use an unknown subpragma of the "re" pragma. =item Unknown switch condition (?(...)) in regex; marked by S<<-- HERE> in m/%s/ (F) The condition part of a (?(condition)if-clause|else-clause) construct is not known. The condition must be one of the following: (1) (2) ... true if 1st, 2nd, etc., capture matched (<NAME>) ('NAME') true if named capture matched (?=...) (?<=...) true if subpattern matches (*pla:...) (*plb:...) true if subpattern matches; also (*positive_lookahead:...) (*positive_lookbehind:...) (*nla:...) (*nlb:...) true if subpattern fails to match; also (*negative_lookahead:...) (*negative_lookbehind:...) (?{ CODE }) true if code returns a true value (R) true if evaluating inside recursion (R1) (R2) ... true if directly inside capture group 1, 2, etc. (R&NAME) true if directly inside named capture (DEFINE) always false; for defining named subpatterns The S<<-- HERE> shows whereabouts in the regular expression the problem was discovered. See L<perlre>. =item Unknown Unicode option letter '%c' (F) You specified an unknown Unicode option. See L<perlrun|perlrun/-C [numberE<sol>list]> documentation of the C<-C> switch for the list of known options. =item Unknown Unicode option value %d (F) You specified an unknown Unicode option. See L<perlrun|perlrun/-C [numberE<sol>list]> documentation of the C<-C> switch for the list of known options. =item Unknown user-defined property name \p{%s} (F) You specified to use a property within the C<\p{...}> which was a syntactically valid user-defined property, but no definition was found for it by the time one was required to proceed. 
Check your spelling. See L<perlunicode/User-Defined Character Properties>. =item Unknown verb pattern '%s' in regex; marked by S<<-- HERE> in m/%s/ (F) You either made a typo or have incorrectly put a C<*> quantifier after an open brace in your pattern. Check the pattern and review L<perlre> for details on legal verb patterns. =item Unknown warnings category '%s' (F) An error issued by the C<warnings> pragma. You specified a warnings category that is unknown to perl at this point. Note that if you want to enable a warnings category registered by a module (e.g. C<use warnings 'File::Find'>), you must have loaded this module first. =item Unmatched [ in regex; marked by S<<-- HERE> in m/%s/ (F) The brackets around a character class must match. If you wish to include a closing bracket in a character class, backslash it or put it first. The S<<-- HERE> shows whereabouts in the regular expression the problem was discovered. See L<perlre>. =item Unmatched ( in regex; marked by S<<-- HERE> in m/%s/ =item Unmatched ) in regex; marked by S<<-- HERE> in m/%s/ (F) Unbackslashed parentheses must always be balanced in regular expressions. If you're a vi user, the % key is valuable for finding the matching parenthesis. The S<<-- HERE> shows whereabouts in the regular expression the problem was discovered. See L<perlre>. =item Unmatched right %s bracket (F) The lexer counted more closing curly or square brackets than opening ones, so you're probably missing a matching opening bracket. As a general rule, you'll find the missing one (so to speak) near the place you were last editing. =item Unquoted string "%s" may clash with future reserved word (W reserved) You used a bareword that might someday be claimed as a reserved word. It's best to put such a word in quotes, or capitalize it somehow, or insert an underbar into it. You might also declare it as a subroutine. 
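The "Unmatched ( in regex" error above often shows up when a pattern is built at run time from text that was never meant to be a regular expression. As a sketch (the variable names and the sample text are made up), the error can be trapped with C<eval>, and C<\Q...\E> can be used when the text should match literally:

```perl
use strict;
use warnings;

# Text that contains an unbalanced parenthesis.
my $text = 'foo(bar';

# Compiling it as a pattern fails with the fatal "Unmatched (" error,
# which eval traps and leaves in $@.
my $compiled = eval { qr/$text/ };
my $error    = defined $compiled ? '' : $@;

# Quoting metacharacters makes the text match itself literally.
my $literal = qr/\Q$text\E/;
my $found   = 'call foo(bar now' =~ $literal ? 1 : 0;

print "error: $error";
print "literal match: $found\n";
```

The S<<-- HERE> marker in the trapped message points at the position in the pattern where the problem was discovered.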
=item Unrecognized character %s; marked by S<<-- HERE> after %s near column %d (F) The Perl parser has no idea what to do with the specified character in your Perl script (or eval) near the specified column. Perhaps you tried to run a compressed script, a binary program, or a directory as a Perl program. =item Unrecognized escape \%c in character class in regex; marked by S<<-- HERE> in m/%s/ (F) You used a backslash-character combination which is not recognized by Perl inside character classes. This is a fatal error when the character class is used within C<(?[ ])>. =item Unrecognized escape \%c in character class passed through in regex; marked by S<<-- HERE> in m/%s/ (W regexp) You used a backslash-character combination which is not recognized by Perl inside character classes. The character was understood literally, but this may change in a future version of Perl. The S<<-- HERE> shows whereabouts in the regular expression the escape was discovered. =item Unrecognized escape \%c passed through (W misc) You used a backslash-character combination which is not recognized by Perl. The character was understood literally, but this may change in a future version of Perl. =item Unrecognized escape \%s passed through in regex; marked by S<<-- HERE> in m/%s/ (W regexp) You used a backslash-character combination which is not recognized by Perl. The character(s) were understood literally, but this may change in a future version of Perl. The S<<-- HERE> shows whereabouts in the regular expression the escape was discovered. =item Unrecognized signal name "%s" (F) You specified a signal name to the kill() function that was not recognized. Say C<kill -l> in your shell to see the valid signal names on your system. =item Unrecognized switch: -%s (-h will show valid options) (F) You specified an illegal option to Perl. Don't do that. (If you think you didn't do that, check the #! line to see if it's supplying the bad switch on your behalf.) 
=item Unsuccessful %s on filename containing newline (W newline) A file operation was attempted on a filename, and that operation failed, PROBABLY because the filename contained a newline, PROBABLY because you forgot to chomp() it off. See L<perlfunc/chomp>. =item Unsupported directory function "%s" called (F) Your machine doesn't support opendir() and readdir(). =item Unsupported function %s (F) This machine doesn't implement the indicated function, apparently. At least, Configure doesn't think so. =item Unsupported function fork (F) Your version of executable does not support forking. Note that under some systems, like OS/2, there may be different flavors of Perl executables, some of which may support fork, some not. Try changing the name you call Perl by to C<perl_>, C<perl__>, and so on. =item Unsupported script encoding %s (F) Your program file begins with a Unicode Byte Order Mark (BOM) which declares it to be in a Unicode encoding that Perl cannot read. =item Unsupported socket function "%s" called (F) Your machine doesn't support the Berkeley socket mechanism, or at least that's what Configure thought. =item Unterminated '(*...' argument in regex; marked by <-- HERE in m/%s/ (F) You used a pattern of the form C<(*...:...)> but did not terminate the pattern with a C<)>. Fix the pattern and retry. =item Unterminated attribute list (F) The lexer found something other than a simple identifier at the start of an attribute, and it wasn't a semicolon or the start of a block. Perhaps you terminated the parameter list of the previous attribute too soon. See L<attributes>. =item Unterminated attribute parameter in attribute list (F) The lexer saw an opening (left) parenthesis character while parsing an attribute list, but the matching closing (right) parenthesis character was not found. You may need to add (or remove) a backslash character to get your parentheses to balance. See L<attributes>. =item Unterminated compressed integer (F) An argument to unpack("w",...) 
was incompatible with the BER compressed integer format and could not be converted to an integer. See L<perlfunc/pack>. =item Unterminated '(*...' construct in regex; marked by <-- HERE in m/%s/ (F) You used a pattern of the form C<(*...)> but did not terminate the pattern with a C<)>. Fix the pattern and retry. =item Unterminated delimiter for here document (F) This message occurs when a here document label has an initial quotation mark but the final quotation mark is missing. Perhaps you wrote: <<"foo instead of: <<"foo" =item Unterminated \g... pattern in regex; marked by S<<-- HERE> in m/%s/ =item Unterminated \g{...} pattern in regex; marked by S<<-- HERE> in m/%s/ (F) In a regular expression, you had a C<\g> that wasn't followed by a proper group reference. In the case of C<\g{>, the closing brace is missing; otherwise the C<\g> must be followed by an integer. Fix the pattern and retry. =item Unterminated <> operator (F) The lexer saw a left angle bracket in a place where it was expecting a term, so it's looking for the corresponding right angle bracket, and not finding it. Chances are you left some needed parentheses out earlier in the line, and you really meant a "less than". =item Unterminated verb pattern argument in regex; marked by S<<-- HERE> in m/%s/ (F) You used a pattern of the form C<(*VERB:ARG)> but did not terminate the pattern with a C<)>. Fix the pattern and retry. =item Unterminated verb pattern in regex; marked by S<<-- HERE> in m/%s/ (F) You used a pattern of the form C<(*VERB)> but did not terminate the pattern with a C<)>. Fix the pattern and retry. =item untie attempted while %d inner references still exist (W untie) A copy of the object returned from C<tie> (or C<tied>) was still valid when C<untie> was called. =item Usage: POSIX::%s(%s) (F) You called a POSIX function with incorrect arguments. See L<POSIX/FUNCTIONS> for more information. =item Usage: Win32::%s(%s) (F) You called a Win32 function with incorrect arguments. 
See L<Win32> for more information. =item $[ used in %s (did you mean $] ?) (W syntax) You used C<$[> in a comparison, such as: if ($[ > 5.006) { ... } You probably meant to use C<$]> instead. C<$[> is the base for indexing arrays. C<$]> is the Perl version number in decimal. =item Use "%s" instead of "%s" (F) The second listed construct is no longer legal. Use the first one instead. =item Useless assignment to a temporary (W misc) You assigned to an lvalue subroutine, but what the subroutine returned was a temporary scalar about to be discarded, so the assignment had no effect. =item Useless (?-%s) - don't use /%s modifier in regex; marked by S<<-- HERE> in m/%s/ (W regexp) You have used an internal modifier such as (?-o) that has no meaning unless removed from the entire regexp: if ($string =~ /(?-o)$pattern/o) { ... } must be written as if ($string =~ /$pattern/) { ... } The S<<-- HERE> shows whereabouts in the regular expression the problem was discovered. See L<perlre>. =item Useless localization of %s (W syntax) The localization of lvalues such as C<local($x=10)> is legal, but in fact the local() currently has no effect. This may change at some point in the future, but in the meantime such code is discouraged. =item Useless (?%s) - use /%s modifier in regex; marked by S<<-- HERE> in m/%s/ (W regexp) You have used an internal modifier such as (?o) that has no meaning unless applied to the entire regexp: if ($string =~ /(?o)$pattern/) { ... } must be written as if ($string =~ /$pattern/o) { ... } The S<<-- HERE> shows whereabouts in the regular expression the problem was discovered. See L<perlre>. =item Useless use of attribute "const" (W misc) The C<const> attribute has no effect except on anonymous closure prototypes. You applied it to a subroutine via L<attributes.pm|attributes>. This is only useful inside an attribute handler for an anonymous subroutine. 
=item Useless use of /d modifier in transliteration operator (W misc) You have used the /d modifier where the searchlist has the same length as the replacelist. See L<perlop> for more information about the /d modifier. =item Useless use of \E (W misc) You have a \E in a double-quotish string without a C<\U>, C<\L> or C<\Q> preceding it. =item Useless use of greediness modifier '%c' in regex; marked by S<<-- HERE> in m/%s/ (W regexp) You specified something like these: qr/a{3}?/ qr/b{1,1}+/ The C<"?"> and C<"+"> don't have any effect, as they modify whether to match more or fewer when there is a choice, and by specifying to match exactly a given number, there is no room left for a choice. =item Useless use of %s in void context (W void) You did something without a side effect in a context that does nothing with the return value, such as a statement that doesn't return a value from a block, or the left side of a scalar comma operator. Very often this points not to stupidity on your part, but a failure of Perl to parse your program the way you thought it would. For example, you'd get this if you mixed up your C precedence with Python precedence and said $one, $two = 1, 2; when you meant to say ($one, $two) = (1, 2); Another common error is to use ordinary parentheses to construct a list reference when you should be using square or curly brackets, for example, if you say $array = (1,2); when you should have said $array = [1,2]; The square brackets explicitly turn a list value into a scalar value, while parentheses do not. So when a parenthesized list is evaluated in a scalar context, the comma is treated like C's comma operator, which throws away the left argument, which is not what you want. See L<perlref> for more on this. This warning will not be issued for numerical constants equal to 0 or 1 since they are often used in statements like 1 while sub_with_side_effects(); String constants that would normally evaluate to 0 or 1 are warned about. 
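The difference between a parenthesized list and a bracketed list in scalar assignment, described under "Useless use of %s in void context", can be sketched as follows. This is only an illustration: the variable names and the C<$SIG{__WARN__}> handler used to collect the warning are assumptions of the example, and the constants 4, 5 and 6 are chosen because 0 and 1 are exempt from the warning.

```perl
use strict;
use warnings;

# Collect warnings instead of printing them, so they can be inspected.
my @seen;
local $SIG{__WARN__} = sub { push @seen, $_[0] };

# Compiled at run time via string eval so that the compile-time warning
# reaches the handler above. The comma operator discards 4 and 5.
my $scalar = eval q{ my $array = (4, 5, 6); $array };

# Square brackets build an array reference holding all three values.
my $aref = [4, 5, 6];

print "parenthesized list gave: $scalar\n";
print "bracketed list holds: ", scalar(@$aref), " elements\n";
print "warnings captured: ", scalar(@seen), "\n";
```

Only the bracketed form keeps every element. The parenthesized form silently keeps the last one, which is why the warning is worth heeding.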
=item Useless use of (?-p) in regex; marked by S<<-- HERE> in m/%s/ (W regexp) The C<p> modifier cannot be turned off once set. Trying to do so is futile. =item Useless use of "re" pragma (W) You did C<use re;> without any arguments. That isn't very useful. =item Useless use of sort in scalar context (W void) You used sort in scalar context, as in : my $x = sort @y; This is not very useful, and perl currently optimizes this away. =item Useless use of %s with no values (W syntax) You used the push() or unshift() function with no arguments apart from the array, like C<push(@x)> or C<unshift(@foo)>. That won't usually have any effect on the array, so is completely useless. It's possible in principle that push(@tied_array) could have some effect if the array is tied to a class which implements a PUSH method. If so, you can write it as C<push(@tied_array,())> to avoid this warning. =item "use" not allowed in expression (F) The "use" keyword is recognized and executed at compile time, and returns no useful value. See L<perlmod>. =item Use of bare << to mean <<"" is forbidden (F) You are now required to use the explicitly quoted form if you wish to use an empty line as the terminator of the here-document. Use of a bare terminator was deprecated in Perl 5.000, and is a fatal error as of Perl 5.28. =item Use of /c modifier is meaningless in s/// (W regexp) You used the /c modifier in a substitution. The /c modifier is not presently meaningful in substitutions. =item Use of /c modifier is meaningless without /g (W regexp) You used the /c modifier with a regex operand, but didn't use the /g modifier. Currently, /c is meaningful only when /g is used. (This may change in the future.) =item Use of code point 0x%s is not allowed; the permissible max is 0x%X =item Use of code point 0x%s is not allowed; the permissible max is 0x%X in regex; marked by <-- HERE in m/%s/ (F) You used a code point that is not allowed, because it is too large. 
Unicode only allows code points up to 0x10FFFF, but Perl allows much larger ones. Earlier versions of Perl allowed code points above IV_MAX (0x7FFFFFFF on 32-bit platforms, 0x7FFFFFFFFFFFFFFF on 64-bit platforms); however, this could break the perl interpreter in some constructs, including causing it to hang in a few cases. If your code is to run on various platforms, keep in mind that the upper limit depends on the platform. It is much larger on 64-bit word sizes than 32-bit ones. The use of out of range code points was deprecated in Perl 5.24, and became a fatal error in Perl 5.28. =item Use of each() on hash after insertion without resetting hash iterator results in undefined behavior (S internal) The behavior of C<each()> after insertion is undefined; it may skip items, or visit items more than once. Consider using C<keys()> instead of C<each()>. =item Use of := for an empty attribute list is not allowed (F) The construction C<my $x := 42> used to parse as equivalent to C<my $x : = 42> (applying an empty attribute list to C<$x>). This construct was deprecated in 5.12.0, and has now been made a syntax error, so C<:=> can be reclaimed as a new operator in the future. If you need an empty attribute list, for example in a code generator, add a space before the C<=>. =item Use of %s for non-UTF-8 locale is wrong. Assuming a UTF-8 locale (W locale) You are matching a regular expression using locale rules, and the specified construct was encountered. This construct is only valid for UTF-8 locales, which the current locale isn't. This doesn't make sense. Perl will continue, assuming a Unicode (UTF-8) locale, but the results are likely to be wrong. =item Use of freed value in iteration (F) Perhaps you modified the iterated array within the loop? This error is typically caused by code like the following: @a = (3,4); @a = () for (1,2,@a); You are not supposed to modify arrays while they are being iterated over. 
For speed and efficiency reasons, Perl internally does not do full reference-counting of iterated items, hence deleting such an item in the middle of an iteration causes Perl to see a freed value. =item Use of /g modifier is meaningless in split (W regexp) You used the /g modifier on the pattern for a C<split> operator. Since C<split> always tries to match the pattern repeatedly, the C</g> has no effect. =item Use of "goto" to jump into a construct is deprecated (D deprecated) Using C<goto> to jump from an outer scope into an inner scope is deprecated and should be avoided. This was deprecated in Perl 5.12. =item Use of '%s' in \p{} or \P{} is deprecated because: %s (D deprecated) Certain properties are deprecated by Unicode, and may eventually be removed from the Standard, at which time Perl will follow along. In the meantime, this message is raised to notify you. =item Use of inherited AUTOLOAD for non-method %s::%s() is no longer allowed (F) As an accidental feature, C<AUTOLOAD> subroutines were looked up as methods (using the C<@ISA> hierarchy), even when the subroutines to be autoloaded were called as plain functions (e.g. C<Foo::bar()>), not as methods (e.g. C<< Foo->bar() >> or C<< $obj->bar() >>). This was deprecated in Perl 5.004, and was made fatal in Perl 5.28. =item Use of %s in printf format not supported (F) You attempted to use a feature of printf that is accessible from only C. This usually means there's a better way to do it in Perl. =item Use of %s is not allowed in Unicode property wildcard subpatterns in regex; marked by S<<-- HERE> in m/%s/ (F) You were using a wildcard subpattern in a Unicode property value, and the subpattern contained something that is illegal. Not all regular expression capabilities are legal in such subpatterns, and this is one. Rewrite your subpattern to not use the offending construct. See L<perlunicode/Wildcards in Property Values>. 
=item Use of -l on filehandle%s (W io) A filehandle represents an opened file, and when you opened the file it already went past any symlink you are presumably trying to look for. The operation returned C<undef>. Use a filename instead. =item Use of reference "%s" as array index (W misc) You tried to use a reference as an array index; this probably isn't what you mean, because references in numerical context tend to be huge numbers, and so usually indicates programmer error. If you really do mean it, explicitly numify your reference, like so: C<$array[0+$ref]>. This warning is not given for overloaded objects, however, because you can overload the numification and stringification operators and then you presumably know what you are doing. =item Use of strings with code points over 0xFF as arguments to %s operator is not allowed (F) You tried to use one of the string bitwise operators (C<&> or C<|> or C<^> or C<~>) on a string containing a code point over 0xFF. The string bitwise operators treat their operands as strings of bytes, and values beyond 0xFF are nonsensical in this context. Certain instances became fatal in Perl 5.28; others in perl 5.32. =item Use of strings with code points over 0xFF as arguments to vec is forbidden (F) You tried to use L<C<vec>|perlfunc/vec EXPR,OFFSET,BITS> on a string containing a code point over 0xFF, which is nonsensical here. This became fatal in Perl 5.32. =item Use of tainted arguments in %s is deprecated (W taint, deprecated) You have supplied C<system()> or C<exec()> with multiple arguments and at least one of them is tainted. This used to be allowed but will become a fatal error in a future version of perl. Untaint your arguments. See L<perlsec>. =item Use of unassigned code point or non-standalone grapheme for a delimiter is not allowed (F) A grapheme is what appears to a native-speaker of a language to be a character. 
In Unicode (and hence Perl) a grapheme may actually be several adjacent characters that together form a complete grapheme. For example, there can be a base character, like "R" and an accent, like a circumflex "^", that appear when displayed to be a single character with the circumflex hovering over the "R". Perl currently allows things like that circumflex to be delimiters of strings, patterns, I<etc>. When displayed, the circumflex would look like it belongs to the character just to the left of it. In order to move the language to be able to accept graphemes as delimiters, we cannot allow the use of delimiters which aren't graphemes by themselves. Also, a delimiter must already be assigned (or known to be never going to be assigned) to try to future-proof code, for otherwise code that works today would fail to compile if the currently unassigned delimiter ends up being something that isn't a stand-alone grapheme. Because Unicode is never going to assign L<non-character code points|perlunicode/Noncharacter code points>, nor L<code points that are above the legal Unicode maximum| perlunicode/Beyond Unicode code points>, those can be delimiters, and their use is legal. =item Use of uninitialized value%s (W uninitialized) An undefined value was used as if it were already defined. It was interpreted as a "" or a 0, but maybe it was a mistake. To suppress this warning assign a defined value to your variables. To help you figure out what was undefined, perl will try to tell you the name of the variable (if any) that was undefined. In some cases it cannot do this, so it also tells you what operation you used the undefined value in. Note, however, that perl optimizes your program and the operation displayed in the warning may not necessarily appear literally in your program. For example, C<"that $foo"> is usually optimized into C<"that " . $foo>, and the warning will refer to the C<concatenation (.)> operator, even though there is no C<.> in your program. 
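A minimal sketch of how the "Use of uninitialized value%s" warning arises, and one way to avoid it, follows. The variable names and the C<$SIG{__WARN__}> handler that captures the message are assumptions made for illustration. The defined-or operator (C<//>) needs Perl 5.10 or later.

```perl
use strict;
use warnings;

# Capture warnings so the message text can be examined.
my @seen;
local $SIG{__WARN__} = sub { push @seen, $_[0] };

my $name;    # deliberately left undefined

# Interpolating undef triggers the warning; perl reports it against the
# concatenation operator even though no "." is written here.
my $careless = "Hello, $name";

# Supplying a default with the defined-or operator avoids the warning.
my $careful = "Hello, " . ($name // 'stranger');

print $seen[0];
print "$careful\n";
```

The captured message names both the variable and the operation, which is usually enough to locate the statement at fault.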
=item "use re 'strict'" is experimental (S experimental::re_strict) The things that are different when a regular expression pattern is compiled under C<'strict'> are subject to change in future Perl releases in incompatible ways. This means that a pattern that compiles today may not in a future Perl release. This warning is to alert you to that risk. =item Use \x{...} for more than two hex characters in regex; marked by S<<-- HERE> in m/%s/ (F) In a regular expression, you said something like (?[ [ \xBEEF ] ]) Perl isn't sure if you meant this (?[ [ \x{BEEF} ] ]) or if you meant this (?[ [ \x{BE} E F ] ]) You need to add either braces or blanks to disambiguate. =item Using just the first character returned by \N{} in character class in regex; marked by S<<-- HERE> in m/%s/ (W regexp) Named Unicode character escapes C<(\N{...})> may return a multi-character sequence. Even though a character class is supposed to match just one character of input, perl will match the whole thing correctly, except when the class is inverted (C<[^...]>), or the escape is the beginning or final end point of a range. For these, what should happen isn't clear at all. In these circumstances, Perl discards all but the first character of the returned sequence, which is not likely what you want. =item Using just the single character results returned by \p{} in (?[...]) in regex; marked by S<<-- HERE> in m/%s/ (W regexp) Extended character classes currently cannot handle operands that evaluate to more than one character. These are removed from the results of the expansion of the C<\p{}>. This situation can happen, for example, in (?[ \p{name=/KATAKANA/} ]) "KATAKANA LETTER AINU P" is a legal Unicode name (technically a "named sequence"), but it is actually two characters. The above expression will match only the Unicode names containing KATAKANA that represent single characters. 
=item Using /u for '%s' instead of /%s in regex; marked by S<<-- HERE> in m/%s/ (W regexp) You used a Unicode boundary (C<\b{...}> or C<\B{...}>) in a portion of a regular expression where the character set modifiers C</a> or C</aa> are in effect. These two modifiers indicate an ASCII interpretation, and this doesn't make sense for a Unicode definition. The generated regular expression will compile so that the boundary uses all of Unicode. No other portion of the regular expression is affected. =item Using !~ with %s doesn't make sense (F) Using the C<!~> operator with C<s///r>, C<tr///r> or C<y///r> is currently reserved for future use, as the exact behavior has not been decided. (Simply returning the boolean opposite of the modified string is usually not particularly useful.) =item UTF-16 surrogate U+%X (S surrogate) You had a UTF-16 surrogate in a context where they are not considered acceptable. These code points, between U+D800 and U+DFFF (inclusive), are used by Unicode only for UTF-16. However, Perl internally allows all unsigned integer code points (up to the size limit available on your platform), including surrogates. But these can cause problems when being input or output, which is likely where this message came from. If you really really know what you are doing you can turn off this warning by C<no warnings 'surrogate';>. =item Value of %s can be "0"; test with defined() (W misc) In a conditional expression, you used <HANDLE>, <*> (glob), C<each()>, or C<readdir()> as a boolean value. Each of these constructs can return a value of "0"; that would make the conditional expression false, which is probably not what you intended. When using these constructs in conditional expressions, test their values with the C<defined> operator. =item Value of CLI symbol "%s" too long (W misc) A warning peculiar to VMS. Perl tried to read the value of an %ENV element from a CLI symbol table, and found a resultant string longer than 1024 characters. 
The return value has been truncated to 1024 characters. =item Variable "%s" is not available (W closure) During compilation, an inner named subroutine or eval is attempting to capture an outer lexical that is not currently available. This can happen for one of two reasons. First, the outer lexical may be declared in an outer anonymous subroutine that has not yet been created. (Remember that named subs are created at compile time, while anonymous subs are created at run-time.) For example, sub { my $a; sub f { $a } } At the time that f is created, it can't capture the current value of $a, since the anonymous subroutine hasn't been created yet. Conversely, the following won't give a warning since the anonymous subroutine has by now been created and is live: sub { my $a; eval 'sub f { $a }' }->(); The second situation is caused by an eval accessing a variable that has gone out of scope, for example, sub f { my $a; sub { eval '$a' } } f()->(); Here, when the '$a' in the eval is being compiled, f() is not currently being executed, so its $a is not available for capture. =item Variable "%s" is not imported%s (S misc) With "use strict" in effect, you referred to a global variable that you apparently thought was imported from another module, because something else of the same name (usually a subroutine) is exported by that module. It usually means you put the wrong funny character on the front of your variable. =item Variable length lookbehind not implemented in regex m/%s/ (F) B<This message no longer should be raised as of Perl 5.30.> It is retained in this document as a convenience for people using an earlier Perl version. In Perl 5.30 and earlier, lookbehind is allowed only for subexpressions whose length is fixed and known at compile time. For positive lookbehind, you can use the C<\K> regex construct as a way to get the equivalent functionality. See L<(?<=pattern) and \K in perlre|perlre/\K>. 
Starting in Perl 5.18, there are non-obvious Unicode rules under C</i> that can match variably, but which you might not think could. For example, the substring C<"ss"> can match the single character LATIN SMALL LETTER SHARP S. Here's a complete list of the current ones affecting ASCII characters: ASCII sequence Matches single letter under /i FF U+FB00 LATIN SMALL LIGATURE FF FFI U+FB03 LATIN SMALL LIGATURE FFI FFL U+FB04 LATIN SMALL LIGATURE FFL FI U+FB01 LATIN SMALL LIGATURE FI FL U+FB02 LATIN SMALL LIGATURE FL SS U+00DF LATIN SMALL LETTER SHARP S U+1E9E LATIN CAPITAL LETTER SHARP S ST U+FB06 LATIN SMALL LIGATURE ST U+FB05 LATIN SMALL LIGATURE LONG S T This list is subject to change, but is quite unlikely to. Each ASCII sequence can be any combination of upper- and lowercase. You can avoid this by using a bracketed character class in the lookbehind assertion, like (?<![sS]t) (?<![fF]f[iI]) This fools Perl into not matching the ligatures. Another option for Perls starting with 5.16, if you only care about ASCII matches, is to add the C</aa> modifier to the regex. This will exclude all these non-obvious matches, thus getting rid of this message. You can also say use if $] ge 5.016, re => '/aa'; to apply C</aa> to all regular expressions compiled within its scope. See L<re>. =item "%s" variable %s masks earlier declaration in same %s (W shadow) A "my", "our" or "state" variable has been redeclared in the current scope or statement, effectively eliminating all access to the previous instance. This is almost always a typographical error. Note that the earlier variable will still exist until the end of the scope or until all closure references to it are destroyed. =item Variable syntax (A) You've accidentally run your script through B<csh> instead of Perl. Check the #! line, or manually feed your script into Perl yourself. 
=item Variable "%s" will not stay shared (W closure) An inner (nested) I<named> subroutine is referencing a lexical variable defined in an outer named subroutine. When the inner subroutine is called, it will see the value of the outer subroutine's variable as it was before and during the *first* call to the outer subroutine; in this case, after the first call to the outer subroutine is complete, the inner and outer subroutines will no longer share a common value for the variable. In other words, the variable will no longer be shared. This problem can usually be solved by making the inner subroutine anonymous, using the C<sub {}> syntax. When inner anonymous subs that reference variables in outer subroutines are created, they are automatically rebound to the current values of such variables. =item vector argument not supported with alpha versions (S printf) The %vd (s)printf format does not support version objects with alpha parts. =item Verb pattern '%s' has a mandatory argument in regex; marked by S<<-- HERE> in m/%s/ (F) You used a verb pattern that requires an argument. Supply an argument or check that you are using the right verb. =item Verb pattern '%s' may not have an argument in regex; marked by S<<-- HERE> in m/%s/ (F) You used a verb pattern that is not allowed an argument. Remove the argument or check that you are using the right verb. =item Version control conflict marker (F) The parser found a line starting with C<E<lt><<<<<<>, C<E<gt>E<gt>E<gt>E<gt>E<gt>E<gt>E<gt>>, or C<=======>. These may be left by a version control system to mark conflicts after a failed merge operation. =item Version number must be a constant number (P) The attempt to translate a C<use Module n.n LIST> statement into its equivalent C<BEGIN> block found an internal inconsistency with the version number. =item Version string '%s' contains invalid data; ignoring: '%s' (W misc) The version string contains invalid characters at the end, which are being ignored. 
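The remedy suggested under "Variable "%s" will not stay shared", making the inner subroutine anonymous, can be sketched roughly as below. The helper name C<make_counter> and its behavior are hypothetical, chosen only to show that each anonymous sub is bound to the variable instance that was current when the sub was created.

```perl
use strict;
use warnings;

# Hypothetical helper: each call creates a fresh $count, and the
# anonymous sub it returns is bound to that particular instance.
sub make_counter {
    my ($count) = @_;
    return sub { return $count++ };
}

my $from_ten    = make_counter(10);
my $from_twenty = make_counter(20);

# Each closure advances its own counter independently.
my @got = ($from_ten->(), $from_ten->(), $from_twenty->());
print "@got\n";
```

Had the inner subroutine been a named sub declared inside C<make_counter>, it would have kept seeing the C<$count> from the first call only, which is the situation the warning describes.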
=item Warning: something's wrong (W) You passed warn() an empty string (the equivalent of C<warn "">) or you called it with no args and C<$@> was empty. =item Warning: unable to close filehandle %s properly (S) The implicit close() done by an open() got an error indication on the close(). This usually indicates your file system ran out of disk space. =item Warning: unable to close filehandle properly: %s =item Warning: unable to close filehandle %s properly: %s (S io) There were errors during the implicit close() done on a filehandle when its reference count reached zero while it was still open, e.g.: { open my $fh, '>', $file or die "open: '$file': $!\n"; print $fh $data or die "print: $!"; } # implicit close here Because various errors may only be detected by close() (e.g. buffering could allow the C<print> in this example to return true even when the disk is full), it is dangerous to ignore its result. So when it happens implicitly, perl will signal errors by warning. B<Prior to version 5.22.0, perl ignored such errors>, so the common idiom shown above was liable to cause B<silent data loss>. =item Warning: Use of "%s" without parentheses is ambiguous (S ambiguous) You wrote a unary operator followed by something that looks like a binary operator that could also have been interpreted as a term or unary operator. For instance, if you know that the rand function has a default argument of 1.0, and you write rand + 5; you may THINK you wrote the same thing as rand() + 5; but in actual fact, you got rand(+5); So put in parentheses to say what you really mean. =item when is experimental (S experimental::smartmatch) C<when> depends on smartmatch, which is experimental. Additionally, it has several special cases that may not be immediately obvious, and their behavior may change or even be removed in any future release of perl. See the explanation under L<perlsyn/Experimental Details on given and when>. 
=item Wide character in %s (S utf8) Perl met a wide character (ordinal >255) when it wasn't expecting one. This warning is by default on for I/O (like print). If this warning does come from I/O, the easiest way to quiet it is simply to add the C<:utf8> layer, I<e.g.>, S<C<binmode STDOUT, ':utf8'>>. Another way to turn off the warning is to add S<C<no warnings 'utf8';>> but that is often closer to cheating. In general, you are supposed to explicitly mark the filehandle with an encoding, see L<open> and L<perlfunc/binmode>. If the warning comes from other than I/O, this diagnostic probably indicates that incorrect results are being obtained. You should examine your code to determine how a wide character is getting to an operation that doesn't handle them. =item Wide character (U+%X) in %s (W locale) While in a single-byte locale (I<i.e.>, a non-UTF-8 one), a multi-byte character was encountered. Perl considers this character to be the specified Unicode code point. Combining non-UTF-8 locales and Unicode is dangerous. Almost certainly some characters will have two different representations. For example, in the ISO 8859-7 (Greek) locale, the code point 0xC3 represents a Capital Gamma. But so also does 0x393. This will make string comparisons unreliable. You likely need to figure out how this multi-byte character got mixed up with your single-byte locale (or perhaps you thought you had a UTF-8 locale, but Perl disagrees). =item Within []-length '%c' not allowed (F) The count in the (un)pack template may be replaced by C<[TEMPLATE]> only if C<TEMPLATE> always matches the same amount of packed bytes that can be determined from the template alone. This is not possible if it contains any of the codes @, /, U, u, w or a *-length. Redesign the template. =item While trying to resolve method call %s->%s() can not locate package "%s" yet it is mentioned in @%s::ISA (perhaps you forgot to load "%s"?) 
(W syntax) It is possible that the C<@ISA> contains a misspelled or never loaded package name, which can result in perl choosing an unexpected parent class's method to resolve the method call. If this is deliberate you can do something like @Missing::Package::ISA = (); to silence the warnings, otherwise you should correct the package name, or ensure that the package is loaded prior to the method call. =item %s() with negative argument (S misc) Certain operations make no sense with negative arguments. Warning is given and the operation is not done. =item write() on closed filehandle %s (W closed) The filehandle you're writing to got itself closed sometime before now. Check your control flow. =item %s "\x%X" does not map to Unicode (S utf8) When reading in different encodings, Perl tries to map everything into Unicode characters. The bytes you read in are not legal in this encoding. For example utf8 "\xE4" does not map to Unicode if you try to read in the a-diaereses Latin-1 as UTF-8. =item 'X' outside of string (F) You had a (un)pack template that specified a relative position before the beginning of the string being (un)packed. See L<perlfunc/pack>. =item 'x' outside of string in unpack (F) You had a pack template that specified a relative position after the end of the string being unpacked. See L<perlfunc/pack>. =item YOU HAVEN'T DISABLED SET-ID SCRIPTS IN THE KERNEL YET! (F) And you probably never will, because you probably don't have the sources to your kernel, and your vendor probably doesn't give a rip about what you want. There is a vulnerability anywhere that you have a set-id script, and to close it you need to remove the set-id bit from the script that you're attempting to run. To actually run the script set-id, your best bet is to put a set-id C wrapper around your script. =item You need to quote "%s" (W syntax) You assigned a bareword as a signal handler name. 
Unfortunately, you already have a subroutine of that name declared, which means that Perl 5 will try to call the subroutine when the assignment is executed, which is probably not what you want. (If it IS what you want, put an & in front.) =item Your random numbers are not that random (F) When trying to initialize the random seed for hashes, Perl could not get any randomness out of your system. This usually indicates Something Very Wrong. =item Zero length \N{} in regex; marked by S<<-- HERE> in m/%s/ (F) Named Unicode character escapes (C<\N{...}>) may return a zero-length sequence. Such an escape was used in an extended character class, i.e. C<(?[...])>, or under C<use re 'strict'>, which is not permitted. Check that the correct escape has been used, and the correct charnames handler is in scope. The S<<-- HERE> shows whereabouts in the regular expression the problem was discovered. =back =head1 SEE ALSO L<warnings>, L<diagnostics>. =cut =encoding utf8 =head1 NAME perl5161delta - what is new for perl v5.16.1 =head1 DESCRIPTION This document describes differences between the 5.16.0 release and the 5.16.1 release. If you are upgrading from an earlier release such as 5.14.0, first read L<perl5160delta>, which describes differences between 5.14.0 and 5.16.0. =head1 Security =head2 an off-by-two error in Scalar-List-Util has been fixed The bugfix was in Scalar-List-Util 1.23_04, and perl 5.16.1 includes Scalar-List-Util 1.25. =head1 Incompatible Changes There are no changes intentionally incompatible with 5.16.0. If any exist, they are bugs, and we request that you submit a report. See L</Reporting Bugs> below. =head1 Modules and Pragmata =head2 Updated Modules and Pragmata =over 4 =item * L<Scalar::Util> and L<List::Util> have been upgraded from version 1.23 to version 1.25. =item * L<B::Deparse> has been updated from version 1.14 to 1.14_01. An "uninitialized" warning emitted by B::Deparse has been squashed [perl #113464]. 

=back

=head1 Configuration and Compilation

=over

=item *

Building perl with some Windows compilers used to fail due to a problem
with miniperl's C<glob> operator (which uses the C<perlglob> program)
deleting the PATH environment variable [perl #113798].

=back

=head1 Platform Support

=head2 Platform-Specific Notes

=over 4

=item VMS

All C header files from the top-level directory of the distribution are now
installed on VMS, providing consistency with a long-standing practice on other
platforms. Previously only a subset were installed, which broke non-core
extension builds for extensions that depended on the missing include files.

=back

=head1 Selected Bug Fixes

=over 4

=item *

A regression introduced in Perl v5.16.0 involving
C<tr/I<SEARCHLIST>/I<REPLACEMENTLIST>/> has been fixed. Only the first
instance is supposed to be meaningful if a character appears more than
once in C<I<SEARCHLIST>>. Under some circumstances, the final instance
was overriding all earlier ones. [perl #113584]

=item *

C<B::COP::stashlen> has been added. This provides access to an internal
field added in perl 5.16 under threaded builds. It was broken at the last
minute before 5.16 was released [perl #113034].

=item *

The L<re> pragma will no longer clobber C<$_>. [perl #113750]

=item *

Unicode 6.1 published an incorrect alias for one of the
Canonical_Combining_Class property's values (which range between 0 and
254). The alias C<CCC133> should have been C<CCC132>. Perl now
overrides the data file furnished by Unicode to give the correct value.

=item *

Duplicating scalar filehandles works again. [perl #113764]

=item *

Under threaded perls, a runtime code block in a regular expression could
corrupt the package name stored in the op tree, resulting in bad reads
in C<caller>, and possibly crashes [perl #113060].

=item *

For efficiency's sake, many operators and built-in functions return the
same scalar each time.
Lvalue subroutines and subroutines in the CORE::
namespace were allowing this implementation detail to leak through.
C<print &CORE::uc("a"), &CORE::uc("b")> used to print "BB". The same thing
would happen with an lvalue subroutine returning the return value of C<uc>.
Now the value is copied in such cases [perl #113044].

=item *

C<__SUB__> now works in special blocks (C<BEGIN>, C<END>, etc.).

=item *

Formats that reference lexical variables from outside no longer result
in crashes.

=back

=head1 Known Problems

There are no new known problems, but consult L<perl5160delta/Known
Problems> to see those identified in the 5.16.0 release.

=head1 Acknowledgements

Perl 5.16.1 represents approximately 2 months of development since Perl
5.16.0 and contains approximately 14,000 lines of changes across 96
files from 8 authors.

Perl continues to flourish into its third decade thanks to a vibrant
community of users and developers. The following people are known to
have contributed the improvements that became Perl 5.16.1:

Chris 'BinGOs' Williams, Craig A. Berry, Father Chrysostomos, Karl
Williamson, Paul Johnson, Reini Urban, Ricardo Signes, Tony Cook.

The list above is almost certainly incomplete as it is automatically
generated from version control history. In particular, it does not
include the names of the (very much appreciated) contributors who
reported issues to the Perl bug tracker.

Many of the changes included in this version originated in the CPAN
modules included in Perl's core. We're grateful to the entire CPAN
community for helping Perl to flourish.

For a more complete list of all of Perl's historical contributors,
please see the F<AUTHORS> file in the Perl source distribution.

=head1 Reporting Bugs

If you find what you think is a bug, you might check the articles
recently posted to the comp.lang.perl.misc newsgroup and the perl
bug database at http://rt.perl.org/perlbug/ . There may also be
information at http://www.perl.org/ , the Perl Home Page.

If you believe you have an unreported bug, please run the L<perlbug>
program included with your release. Be sure to trim your bug down
to a tiny but sufficient test case. Your bug report, along with the
output of C<perl -V>, will be sent off to perlbug@perl.org to be
analysed by the Perl porting team.

If the bug you are reporting has security implications, which make it
inappropriate to send to a publicly archived mailing list, then please
send it to perl5-security-report@perl.org. This points to a closed
subscription unarchived mailing list, which includes all the core
committers, who will be able to help assess the impact of issues, figure
out a resolution, and help co-ordinate the release of patches to
mitigate or fix the problem across all platforms on which Perl is
supported. Please only use this address for security issues in the Perl
core, not for modules independently distributed on CPAN.

=head1 SEE ALSO

The F<Changes> file for an explanation of how to view exhaustive details
on what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=cut

=encoding utf8

=for comment
Consistent formatting of this file is achieved with:
  perl ./Porting/podtidy pod/perlobj.pod

=head1 NAME
X<object> X<OOP>

perlobj - Perl object reference

=head1 DESCRIPTION

This document provides a reference for Perl's object orientation
features. If you're looking for an introduction to object-oriented
programming in Perl, please see L<perlootut>.

In order to understand Perl objects, you first need to understand
references in Perl. See L<perlreftut> for details.

This document describes all of Perl's object-oriented (OO) features
from the ground up. If you're just looking to write some
object-oriented code of your own, you are probably better served by
using one of the object systems from CPAN described in L<perlootut>.

If you're looking to write your own object system, or you need to
maintain code which implements objects from scratch then this document
will help you understand exactly how Perl does object orientation.

There are a few basic principles which define object-oriented Perl:

=over 4

=item 1.

An object is simply a data structure that knows to which class it
belongs.

=item 2.

A class is simply a package. A class provides methods that expect to
operate on objects.

=item 3.

A method is simply a subroutine that expects a reference to an object
(or a package name, for class methods) as the first argument.

=back

Let's look at each of these principles in depth.

=head2 An Object is Simply a Data Structure
X<object> X<bless> X<constructor> X<new>

Unlike many other languages which support object orientation, Perl does
not provide any special syntax for constructing an object. Objects are
merely Perl data structures (hashes, arrays, scalars, filehandles,
etc.) that have been explicitly associated with a particular class.

That explicit association is created by the built-in C<bless> function,
which is typically used within the I<constructor> subroutine of the
class.

Here is a simple constructor:

    package File;

    sub new {
        my $class = shift;

        return bless {}, $class;
    }

The name C<new> isn't special. We could name our constructor something
else:

    package File;

    sub load {
        my $class = shift;

        return bless {}, $class;
    }

The modern convention for OO modules is to always use C<new> as the
name for the constructor, but there is no requirement to do so. Any
subroutine that blesses a data structure into a class is a valid
constructor in Perl.

In the previous examples, the C<{}> code creates a reference to an
empty anonymous hash. The C<bless> function then takes that reference
and associates the hash with the class in C<$class>. In the simplest
case, the C<$class> variable will end up containing the string "File".
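As a minimal sketch of the two constructors above, the following script puts them in one file and checks what each returns. The C<main> code and the variable names C<$from_new> and C<$from_load> are illustrative assumptions, not part of the C<File> class.

```perl
use strict;
use warnings;

package File;

# Conventional constructor name: bless an empty anonymous hash into $class.
sub new {
    my $class = shift;
    return bless {}, $class;
}

# "new" is only a convention; any sub that blesses a structure is a
# constructor.
sub load {
    my $class = shift;
    return bless {}, $class;
}

package main;

my $from_new  = File->new;     # $class is the string "File"
my $from_load = File->load;

# ref() on a reference to a blessed hash reports the class name.
print ref($from_new),  "\n";    # File
print ref($from_load), "\n";    # File
```

Both calls produce objects of the same class, which is why the choice of constructor name is purely a matter of convention.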

We can also use a variable to store a reference to the data structure
that is being blessed as our object:

    sub new {
        my $class = shift;

        my $self = {};
        bless $self, $class;

        return $self;
    }

Once we've blessed the hash referred to by C<$self> we can start
calling methods on it. This is useful if you want to put object
initialization in its own separate method:

    sub new {
        my $class = shift;

        my $self = {};
        bless $self, $class;

        $self->_initialize();

        return $self;
    }

Since the object is also a hash, you can treat it as one, using it to
store data associated with the object. Typically, code inside the class
can treat the hash as an accessible data structure, while code outside
the class should always treat the object as opaque. This is called
B<encapsulation>. Encapsulation means that the user of an object does
not have to know how it is implemented. The user simply calls
documented methods on the object.

Note, however, that (unlike most other OO languages) Perl does not
ensure or enforce encapsulation in any way. If you want objects to
actually I<be> opaque you need to arrange for that yourself. This can
be done in a variety of ways, including using L</"Inside-Out objects">
or modules from CPAN.

=head3 Objects Are Blessed; Variables Are Not

When we bless something, we are not blessing the variable which
contains a reference to that thing, nor are we blessing the reference
that the variable stores; we are blessing the thing that the variable
refers to (sometimes known as the I<referent>). This is best
demonstrated with this code:

    use Scalar::Util 'blessed';

    my $foo = {};
    my $bar = $foo;

    bless $foo, 'Class';
    print blessed( $bar ) // 'not blessed';    # prints "Class"

    $bar = "some other value";
    print blessed( $bar ) // 'not blessed';    # prints "not blessed"

When we call C<bless> on a variable, we are actually blessing the
underlying data structure that the variable refers to. We are not
blessing the reference itself, nor the variable that contains that
reference.
That's why the second call to C<blessed( $bar )> returns
false. At that point C<$bar> is no longer storing a reference to an
object.

You will sometimes see older books or documentation mention "blessing a
reference" or describe an object as a "blessed reference", but this is
incorrect. It isn't the reference that is blessed as an object; it's
the thing the reference refers to (i.e. the referent).

=head2 A Class is Simply a Package
X<class> X<package> X<@ISA> X<inheritance>

Perl does not provide any special syntax for class definitions. A
package is simply a namespace containing variables and subroutines. The
only difference is that in a class, the subroutines may expect a
reference to an object or the name of a class as the first argument.
This is purely a matter of convention, so a class may contain both
methods and subroutines which I<don't> operate on an object or class.

Each package contains a special array called C<@ISA>. The C<@ISA> array
contains a list of that class's parent classes, if any. This array is
examined when Perl does method resolution, which we will cover later.

Calling methods from a package means it must be loaded, of course, so
you will often want to load a module and add it to C<@ISA> at the same
time. You can do so in a single step using the L<parent> pragma.
(In older code you may encounter the L<base> pragma, which is nowadays
discouraged except when you have to work with the equally discouraged
L<fields> pragma.)

However the parent classes are set, the package's C<@ISA> variable will
contain a list of those parents. This is simply a list of scalars, each
of which is a string that corresponds to a package name.

All classes inherit from the L<UNIVERSAL> class implicitly. The
L<UNIVERSAL> class is implemented by the Perl core, and provides
several default methods, such as C<isa()>, C<can()>, and C<VERSION()>.
The C<UNIVERSAL> class will I<never> appear in a package's C<@ISA>
variable.
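The relationship between the L<parent> pragma and C<@ISA> can be seen in a short sketch. The class names C<Animal> and C<Dog> are hypothetical, and C<-norequire> is used only because both packages live in the same file, so there is no F<Animal.pm> to load.

```perl
use strict;
use warnings;

package Animal;
sub new   { my $class = shift; return bless {}, $class }
sub speak { return 'generic noise' }

package Dog;
# Adds 'Animal' to @Dog::ISA without trying to load Animal.pm.
use parent -norequire, 'Animal';

package main;

my @parents = @Dog::ISA;        # plain strings naming the parent packages
my $dog     = Dog->new;         # new() is inherited from Animal

# UNIVERSAL is inherited implicitly and is not listed in @ISA.
my $lists_universal = grep { $_ eq 'UNIVERSAL' } @Dog::ISA;

print "@parents\n";             # Animal
print $dog->speak, "\n";        # generic noise
print "$lists_universal\n";     # 0
```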

Perl I<only> provides method inheritance as a built-in feature.
Attribute inheritance is left up to the class to implement. See the
L</Writing Accessors> section for details.

=head2 A Method is Simply a Subroutine
X<method>

Perl does not provide any special syntax for defining a method. A
method is simply a regular subroutine, and is declared with C<sub>.
What makes a method special is that it expects to receive either an
object or a class name as its first argument.

Perl I<does> provide special syntax for method invocation, the C<< ->
>> operator. We will cover this in more detail later.

Most methods you write will expect to operate on objects:

    sub save {
        my $self = shift;

        open my $fh, '>', $self->path() or die $!;
        print {$fh} $self->data()       or die $!;
        close $fh                       or die $!;
    }

=head2 Method Invocation
X<invocation> X<method> X<arrow> X<< -> >>

Calling a method on an object is written as C<< $object->method >>.

The left hand side of the method invocation (or arrow) operator is the
object (or class name), and the right hand side is the method name.

    my $pod = File->new( 'perlobj.pod', $data );
    $pod->save();

The C<< -> >> syntax is also used when dereferencing a reference. It
looks like the same operator, but these are two different operations.

When you call a method, the thing on the left side of the arrow is
passed as the first argument to the method. That means when we call C<<
Critter->new() >>, the C<new()> method receives the string C<"Critter">
as its first argument. When we call C<< $fred->speak() >>, the C<$fred>
variable is passed as the first argument to C<speak()>.

Just as with any Perl subroutine, all of the arguments passed in C<@_>
are aliases to the original argument. This includes the object itself.
If you assign directly to C<$_[0]> you will change the contents of the
variable that holds the reference to the object. We recommend that you
don't do this unless you know exactly what you're doing.
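To make the "first argument" rule concrete, here is a small sketch built around the C<Critter> class mentioned above. The C<name> attribute and the C<invocant> helper are hypothetical additions for illustration only.

```perl
use strict;
use warnings;

package Critter;

sub new {
    my ( $class, $name ) = @_;    # $class is "Critter" for Critter->new()
    return bless { name => $name }, $class;
}

sub speak {
    my $self = shift;             # the object on the left of the arrow
    return "$self->{name} says hi";
}

# Hypothetical helper: hands back whatever arrived as the first argument.
sub invocant { return $_[0] }

package main;

my $fred           = Critter->new('Fred');
my $class_invocant = Critter->invocant;    # the string "Critter"
my $obj_invocant   = $fred->invocant;      # the same reference as $fred

print "$class_invocant\n";                                          # Critter
print( ( $obj_invocant == $fred ? 'same' : 'different' ), "\n" );   # same
print $fred->speak, "\n";                                           # Fred says hi
```

The sketch only reads C<$_[0]>; it deliberately avoids assigning to it, in line with the recommendation above.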

Perl knows what package the method is in by looking at the left side of
the arrow. If the left hand side is a package name, it looks for the
method in that package. If the left hand side is an object, then Perl
looks for the method in the package that the object has been blessed
into.

If the left hand side is neither a package name nor an object, then the
method call will cause an error, but see the section on L</Method Call
Variations> for more nuances.

=head2 Inheritance
X<inheritance>

We already talked about the special C<@ISA> array and the L<parent>
pragma.

When a class inherits from another class, any methods defined in the
parent class are available to the child class. If you attempt to call a
method on an object that isn't defined in its own class, Perl will also
look for that method in any parent classes it may have.

    package File::MP3;

    use parent 'File';    # sets @File::MP3::ISA = ('File');

    my $mp3 = File::MP3->new( 'Andvari.mp3', $data );
    $mp3->save();

Since we didn't define a C<save()> method in the C<File::MP3> class,
Perl will look at the C<File::MP3> class's parent classes to find the
C<save()> method. If Perl cannot find a C<save()> method anywhere in
the inheritance hierarchy, it will die.

In this case, it finds a C<save()> method in the C<File> class. Note
that the object passed to C<save()> in this case is still a
C<File::MP3> object, even though the method is found in the C<File>
class.

We can override a parent's method in a child class. When we do so, we
can still call the parent class's method with the C<SUPER>
pseudo-class.

    sub save {
        my $self = shift;

        say 'Prepare to rock';
        $self->SUPER::save();
    }

The C<SUPER> modifier can I<only> be used for method calls.
You can't use it for regular subroutine calls or class methods:

    SUPER::save($thing);     # FAIL: looks for save() sub in package SUPER

    SUPER->save($thing);     # FAIL: looks for save() method in class
                             #       SUPER

    $thing->SUPER::save();   # Okay: looks for save() method in parent
                             #       classes

=head3 How SUPER is Resolved
X<SUPER>

The C<SUPER> pseudo-class is resolved from the package where the call
is made. It is I<not> resolved based on the object's class. This is
important, because it lets methods at different levels within a deep
inheritance hierarchy each correctly call their respective parent
methods.

    package A;

    sub new {
        return bless {}, shift;
    }

    sub speak {
        my $self = shift;

        say 'A';
    }

    package B;

    use parent -norequire, 'A';

    sub speak {
        my $self = shift;

        $self->SUPER::speak();

        say 'B';
    }

    package C;

    use parent -norequire, 'B';

    sub speak {
        my $self = shift;

        $self->SUPER::speak();

        say 'C';
    }

    my $c = C->new();
    $c->speak();

In this example, we will get the following output:

    A
    B
    C

This demonstrates how C<SUPER> is resolved. Even though the object is
blessed into the C<C> class, the C<speak()> method in the C<B> class
can still call C<SUPER::speak()> and expect it to correctly look in the
parent class of C<B> (i.e. the class the method call is in), not in the
parent class of C<C> (i.e. the class the object belongs to).

There are rare cases where this package-based resolution can be a
problem. If you copy a subroutine from one package to another, C<SUPER>
resolution will be done based on the original package.

=head3 Multiple Inheritance
X<multiple inheritance>

Multiple inheritance often indicates a design problem, but Perl always
gives you enough rope to hang yourself with if you ask for it.

To declare multiple parents, you simply need to pass multiple class
names to C<use parent>:

    package MultiChild;

    use parent 'Parent1', 'Parent2';

=head3 Method Resolution Order
X<method resolution order> X<mro>

Method resolution order only matters in the case of multiple
inheritance.
In the case of single inheritance, Perl simply looks up the
inheritance chain to find a method:

      Grandparent
          |
        Parent
          |
        Child

If we call a method on a C<Child> object and that method is not defined
in the C<Child> class, Perl will look for that method in the C<Parent>
class and then, if necessary, in the C<Grandparent> class.

If Perl cannot find the method in any of these classes, it will die
with an error message.

When a class has multiple parents, the method lookup order becomes more
complicated.

By default, Perl does a depth-first left-to-right search for a method.
That means it starts with the first parent in the C<@ISA> array, and
then searches all of its parents, grandparents, etc. If it fails to
find the method, it then goes to the next parent in the original
class's C<@ISA> array and searches from there.

              SharedGreatGrandParent
              /                    \
    PaternalGrandparent       MaternalGrandparent
              \                    /
               Father        Mother
                     \      /
                      Child

So given the diagram above, Perl will search C<Child>, C<Father>,
C<PaternalGrandparent>, C<SharedGreatGrandParent>, C<Mother>, and
finally C<MaternalGrandparent>. This may be a problem because now we're
looking in C<SharedGreatGrandParent> I<before> we've checked all its
derived classes (i.e. before we tried C<Mother> and
C<MaternalGrandparent>).

It is possible to ask for a different method resolution order with the
L<mro> pragma.

    package Child;

    use mro 'c3';
    use parent 'Father', 'Mother';

This pragma lets you switch to the "C3" resolution order. In simple
terms, "C3" order ensures that shared parent classes are never searched
before child classes, so Perl will now search: C<Child>, C<Father>,
C<PaternalGrandparent>, C<Mother>, C<MaternalGrandparent>, and finally
C<SharedGreatGrandParent>. Note however that this is not
"breadth-first" searching: All the C<Father> ancestors (except the
common ancestor) are searched before any of the C<Mother> ancestors are
considered.

The C3 order also lets you call methods in sibling classes with the
C<next> pseudo-class.
See the L<mro> documentation for more details on this feature.

=head3 Method Resolution Caching

When Perl searches for a method, it caches the lookup so that future
calls to the method do not need to search for it again. Changing a
class's parent class or adding subroutines to a class will invalidate
the cache for that class.

The L<mro> pragma provides some functions for manipulating the method
cache directly.

=head2 Writing Constructors
X<constructor>

As we mentioned earlier, Perl provides no special constructor syntax.
This means that a class must implement its own constructor. A
constructor is simply a class method that returns a reference to a new
object.

The constructor can also accept additional parameters that define the
object. Let's write a real constructor for the C<File> class we used
earlier:

    package File;

    sub new {
        my $class = shift;
        my ( $path, $data ) = @_;

        my $self = bless {
            path => $path,
            data => $data,
        }, $class;

        return $self;
    }

As you can see, we've stored the path and file data in the object
itself. Remember, under the hood, this object is still just a hash.
Later, we'll write accessors to manipulate this data.

For our C<File::MP3> class, we can check to make sure that the path
we're given ends with ".mp3":

    package File::MP3;

    sub new {
        my $class = shift;
        my ( $path, $data ) = @_;

        die "You cannot create a File::MP3 without an mp3 extension\n"
            unless $path =~ /\.mp3\z/;

        return $class->SUPER::new(@_);
    }

This constructor lets its parent class do the actual object
construction.

=head2 Attributes
X<attribute>

An attribute is a piece of data belonging to a particular object.
Unlike most object-oriented languages, Perl provides no special syntax
or support for declaring and manipulating attributes.

Attributes are often stored in the object itself. For example, if the
object is an anonymous hash, we can store the attribute values in the
hash using the attribute name as the key.
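As a rough sketch of attributes stored under hash keys, the script below reuses the C<File> constructor from the previous section. The sample path and data values, the C<path> reader, and the C<@attribute_names> variable are assumptions made for the example.

```perl
use strict;
use warnings;

package File;

sub new {
    my ( $class, $path, $data ) = @_;

    # Each attribute lives under a hash key named after the attribute.
    return bless { path => $path, data => $data }, $class;
}

# A reader method, so callers need not touch the hash directly.
sub path { my $self = shift; return $self->{path} }

package main;

my $file            = File->new( 'perlobj.pod', 'some contents' );
my @attribute_names = sort keys %$file;

print join( ', ', @attribute_names ), "\n";    # data, path
print $file->path, "\n";                       # perlobj.pod
```

Reading C<%$file> from outside the class works because Perl does not enforce encapsulation, but it ties the caller to the hash layout.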

While it's possible to refer directly to these hash keys outside of the
class, it's considered a best practice to wrap all access to the
attribute with accessor methods.

This has several advantages. Accessors make it easier to change the
implementation of an object later while still preserving the original
API.

An accessor lets you add additional code around attribute access. For
example, you could apply a default to an attribute that wasn't set in
the constructor, or you could validate that a new value for the
attribute is acceptable.

Finally, using accessors makes inheritance much simpler. Subclasses can
use the accessors rather than having to know how a parent class is
implemented internally.

=head3 Writing Accessors
X<accessor>

As with constructors, Perl provides no special accessor declaration
syntax, so classes must provide explicitly written accessor methods.
There are two common types of accessors, read-only and read-write.

A simple read-only accessor simply gets the value of a single
attribute:

    sub path {
        my $self = shift;

        return $self->{path};
    }

A read-write accessor will allow the caller to set the value as well as
get it:

    sub path {
        my $self = shift;

        if (@_) {
            $self->{path} = shift;
        }

        return $self->{path};
    }

=head2 An Aside About Smarter and Safer Code

Our constructor and accessors are not very smart. They don't check that
a C<$path> is defined, nor do they check that a C<$path> is a valid
filesystem path.

Doing these checks by hand can quickly become tedious. Writing a bunch
of accessors by hand is also incredibly tedious. There are a lot of
modules on CPAN that can help you write safer and more concise code,
including the modules we recommend in L<perlootut>.

=head2 Method Call Variations
X<method>

Perl supports several other ways to call methods besides the C<<
$object->method() >> usage we've seen so far.

=head3 Method Names with a Fully Qualified Name

Perl allows you to call methods using their fully qualified name (the
package and method name):

    my $mp3 = File::MP3->new( 'Regin.mp3', $data );

    $mp3->File::save();

When you call a fully qualified method name like C<File::save>, the
method resolution search for the C<save> method starts in the C<File>
class, skipping any C<save> method the C<File::MP3> class may have
defined. It still searches the C<File> class's parents if necessary.

While this feature is most commonly used to explicitly call methods
inherited from an ancestor class, there is no technical restriction
that enforces this:

    my $obj = Tree->new();
    $obj->Dog::bark();

This calls the C<bark> method from class C<Dog> on an object of class
C<Tree>, even if the two classes are completely unrelated. Use this
with great care.

The C<SUPER> pseudo-class that was described earlier is I<not> the same
as calling a method with a fully-qualified name. See the earlier
L</Inheritance> section for details.

=head3 Method Names as Strings

Perl lets you use a scalar variable containing a string as a method
name:

    my $file = File->new( $path, $data );

    my $method = 'save';
    $file->$method();

This works exactly like calling C<< $file->save() >>. This can be very
useful for writing dynamic code. For example, it allows you to pass a
method name to be called as a parameter to another method.

=head3 Class Names as Strings

Perl also lets you use a scalar containing a string as a class name:

    my $class = 'File';

    my $file = $class->new( $path, $data );

Again, this allows for very dynamic code.

=head3 Subroutine References as Methods

You can also use a subroutine reference as a method:

    my $sub = sub {
        my $self = shift;

        $self->save();
    };

    $file->$sub();

This is exactly equivalent to writing C<< $sub->($file) >>.
You may see this idiom in the wild combined with a call to C<can>:

    if ( my $meth = $object->can('foo') ) {
        $object->$meth();
    }

=head3 Dereferencing Method Call

Perl also lets you use a dereferenced scalar reference in a method
call. That's a mouthful, so let's look at some code:

    $file->${ \'save' };
    $file->${ returns_scalar_ref() };
    $file->${ \( returns_scalar() ) };
    $file->${ returns_ref_to_sub_ref() };

This works if the dereference produces a string I<or> a subroutine
reference.

=head3 Method Calls on Filehandles

Under the hood, Perl filehandles are instances of the C<IO::Handle> or
C<IO::File> class. Once you have an open filehandle, you can call
methods on it. Additionally, you can call methods on the C<STDIN>,
C<STDOUT>, and C<STDERR> filehandles.

    open my $fh, '>', 'path/to/file';
    $fh->autoflush();
    $fh->print('content');

    STDOUT->autoflush();

=head2 Invoking Class Methods
X<invocation>

Because Perl allows you to use barewords for package names and
subroutine names, it sometimes interprets a bareword's meaning
incorrectly. For example, the construct C<< Class->new() >> can be
interpreted as either C<< 'Class'->new() >> or C<< Class()->new() >>.
In English, that second interpretation reads as "call a subroutine
named Class(), then call new() as a method on the return value of
Class()". If there is a subroutine named C<Class()> in the current
namespace, Perl will always interpret C<< Class->new() >> as the second
alternative: a call to C<new()> on the object returned by a call to
C<Class()>.

You can force Perl to use the first interpretation (i.e. as a method
call on the class named "Class") in two ways. First, you can append a
C<::> to the class name:

    Class::->new()

Perl will always interpret this as a method call.

Alternatively, you can quote the class name:

    'Class'->new()

Of course, if the class name is in a scalar Perl will do the right
thing as well:

    my $class = 'Class';
    $class->new();

=head3 Indirect Object Syntax
X<indirect object>

B<Outside of the file handle case, use of this syntax is discouraged as
it can confuse the Perl interpreter. See below for more details.>

Perl supports another method invocation syntax called "indirect object"
notation. This syntax is called "indirect" because the method comes
before the object it is being invoked on.

This syntax can be used with any class or object method:

    my $file = new File $path, $data;
    save $file;

We recommend that you avoid this syntax, for several reasons.

First, it can be confusing to read. In the above example, it's not
clear if C<save> is a method provided by the C<File> class or simply a
subroutine that expects a file object as its first argument.

When used with class methods, the problem is even worse. Because Perl
allows subroutine names to be written as barewords, Perl has to guess
whether the bareword after the method is a class name or subroutine
name. In other words, Perl can resolve the syntax as either C<<
File->new( $path, $data ) >> B<or> C<< new( File( $path, $data ) ) >>.

To parse this code, Perl uses a heuristic based on what package names
it has seen, what subroutines exist in the current package, what
barewords it has previously seen, and other input. Needless to say,
heuristics can produce very surprising results!

Older documentation (and some CPAN modules) encouraged this syntax,
particularly for constructors, so you may still find it in the wild.
However, we encourage you to avoid using it in new code.

You can force Perl to interpret the bareword as a class name by
appending "::" to it, like we saw earlier:

    my $file = new File:: $path, $data;

=head2 C<bless>, C<blessed>, and C<ref>

As we saw earlier, an object is simply a data structure that has been
blessed into a class via the C<bless> function.
The C<bless> function
can take either one or two arguments:

    my $object = bless {}, $class;
    my $object = bless {};

In the first form, the anonymous hash is being blessed into the class
in C<$class>. In the second form, the anonymous hash is blessed into
the current package.

The second form is strongly discouraged, because it breaks the ability
of a subclass to reuse the parent's constructor, but you may still run
across it in existing code.

If you want to know whether a particular scalar refers to an object,
you can use the C<blessed> function exported by L<Scalar::Util>, which
is shipped with the Perl core.

    use Scalar::Util 'blessed';

    if ( defined blessed($thing) ) { ... }

If C<$thing> refers to an object, then this function returns the name
of the package the object has been blessed into. If C<$thing> doesn't
contain a reference to a blessed object, the C<blessed> function
returns C<undef>.

Note that C<blessed($thing)> will also return false if C<$thing> has
been blessed into a class named "0". This is possible, but quite
pathological. Don't create a class named "0" unless you know what
you're doing.

Similarly, Perl's built-in C<ref> function treats a reference to a
blessed object specially. If you call C<ref($thing)> and C<$thing>
holds a reference to an object, it will return the name of the class
that the object has been blessed into.

If you simply want to check that a variable contains an object
reference, we recommend that you use C<defined blessed($object)>, since
C<ref> returns true values for all references, not just objects.

=head2 The UNIVERSAL Class
X<UNIVERSAL>

All classes automatically inherit from the L<UNIVERSAL> class, which is
built into the Perl core. This class provides a number of methods, all
of which can be called on either a class or an object. You can also
choose to override some of these methods in your class. If you do so,
we recommend that you follow the built-in semantics described below.
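The claim that these methods accept either a class or an object can be checked with a short sketch. The C<File> class here is a stripped-down stand-in, and its C<$VERSION> value, the C<save> stub, and the C<@results> variable are assumptions made for the example.

```perl
use strict;
use warnings;

package File;
our $VERSION = '1.02';
sub new  { my $class = shift; return bless {}, $class }
sub save { return 1 }

package main;

my $file = File->new;
my @results;

# Run the same UNIVERSAL methods with a class name, then with an object.
for my $invocant ( 'File', $file ) {
    push @results, join ' ',
        ( $invocant->isa('File') ? 'isa'      : 'not-isa' ),
        ( $invocant->can('save') ? 'can-save' : 'no-save' ),
        ( $invocant->can('nope') ? 'can-nope' : 'no-nope' ),
        $invocant->VERSION;
}

print "$_\n" for @results;
# Both lines read: isa can-save no-nope 1.02
```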

=over 4

=item isa($class)
X<isa>

The C<isa> method returns I<true> if the object is a member of the
class in C<$class>, or a member of a subclass of C<$class>.

If you override this method, it should never throw an exception.

=item DOES($role)
X<DOES>

The C<DOES> method returns I<true> if its object claims to perform the
role C<$role>. By default, this is equivalent to C<isa>. This method is
provided for use by object system extensions that implement roles, like
C<Moose> and C<Role::Tiny>.

You can also override C<DOES> directly in your own classes. If you
override this method, it should never throw an exception.

=item can($method)
X<can>

The C<can> method checks to see if the class or object it was called on
has a method named C<$method>. This checks for the method in the class
and all of its parents. If the method exists, then a reference to the
subroutine is returned. If it does not then C<undef> is returned.

If your class responds to method calls via C<AUTOLOAD>, you may want to
override C<can> to return a subroutine reference for methods which your
C<AUTOLOAD> method handles.

If you override this method, it should never throw an exception.

=item VERSION($need)
X<VERSION>

The C<VERSION> method returns the version number of the class
(package).

If the C<$need> argument is given then it will check that the current
version (as defined by the $VERSION variable in the package) is greater
than or equal to C<$need>; it will die if this is not the case. This
method is called automatically by the C<VERSION> form of C<use>.

    use Package 1.2 qw(some imported subs);
    # implies:
    Package->VERSION(1.2);

We recommend that you use this method to access another package's
version, rather than looking directly at C<$Package::VERSION>. The
package you are looking at could have overridden the C<VERSION> method.

We also recommend using this method to check whether a module has a
sufficient version.
The internal implementation uses the L<version> module to make sure that different types of version numbers are compared correctly.

=back

=head2 AUTOLOAD
X<AUTOLOAD>

If you call a method that doesn't exist in a class, Perl will throw an error. However, if that class or any of its parent classes defines an C<AUTOLOAD> method, that C<AUTOLOAD> method is called instead.

C<AUTOLOAD> is called as a regular method, and the caller will not know the difference. Whatever value your C<AUTOLOAD> method returns is returned to the caller.

The fully qualified method name that was called is available in the C<$AUTOLOAD> package global for your class. Since this is a global, if you want to refer to it without a package name prefix under C<strict 'vars'>, you need to declare it.

  # XXX - this is a terrible way to implement accessors, but it makes
  # for a simple example.
  our $AUTOLOAD;
  sub AUTOLOAD {
      my $self = shift;

      # Remove qualifier from original method name...
      my $called = $AUTOLOAD =~ s/.*:://r;

      # Is there an attribute of that name?
      die "No such attribute: $called"
          unless exists $self->{$called};

      # If so, return it...
      return $self->{$called};
  }

  sub DESTROY { } # see below

Without the C<our $AUTOLOAD> declaration, this code will not compile under the L<strict> pragma.

As the comment says, this is not a good way to implement accessors. It's slow and too clever by far. However, you may see this as a way to provide accessors in older Perl code. See L<perlootut> for recommendations on OO coding in Perl.

If your class does have an C<AUTOLOAD> method, we strongly recommend that you override C<can> in your class as well. Your overridden C<can> method should return a subroutine reference for any method that your C<AUTOLOAD> responds to.

=head2 Destructors
X<destructor> X<DESTROY>

When the last reference to an object goes away, the object is destroyed. If you only have one reference to an object stored in a lexical scalar, the object is destroyed when that scalar goes out of scope.
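A minimal sketch of scope-based destruction follows. It uses a hypothetical C<Resource> class whose C<DESTROY> method (the destructor hook covered later in this section) appends to a log array, so the moment of destruction can be observed. The class name and the C<@log> array are assumptions made for the illustration.

```perl
use strict;
use warnings;

my @log;    # records the order of events, for illustration only

{
    package Resource;    # hypothetical class
    sub new {
        my ( $class, $name ) = @_;
        return bless { name => $name }, $class;
    }
    sub DESTROY {
        my $self = shift;
        push @log, "destroyed $self->{name}";
    }
}

{
    my $r = Resource->new('inner');    # the only reference to this object
    push @log, 'inside block';
}    # $r goes out of scope here, so the object is destroyed now
push @log, 'after block';

print "$_\n" for @log;
```

The log shows the destructor running between the two other entries, that is, at the closing brace of the block rather than at program exit.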
If you store the object in a package global, that object may not go out of scope until the program exits. If you want to do something when the object is destroyed, you can define a C<DESTROY> method in your class. This method will always be called by Perl at the appropriate time, unless the method is empty. This is called just like any other method, with the object as the first argument. It does not receive any additional arguments. However, the C<$_[0]> variable will be read-only in the destructor, so you cannot assign a value to it. If your C<DESTROY> method throws an exception, this will not cause any control transfer beyond exiting the method. The exception will be reported to C<STDERR> as a warning, marked "(in cleanup)", and Perl will continue with whatever it was doing before. Because C<DESTROY> methods can be called at any time, you should localize any global status variables that might be set by anything you do in your C<DESTROY> method. If you are in doubt about a particular status variable, it doesn't hurt to localize it. There are five global status variables, and the safest way is to localize all five of them: sub DESTROY { local($., $@, $!, $^E, $?); my $self = shift; ...; } If you define an C<AUTOLOAD> in your class, then Perl will call your C<AUTOLOAD> to handle the C<DESTROY> method. You can prevent this by defining an empty C<DESTROY>, like we did in the autoloading example. You can also check the value of C<$AUTOLOAD> and return without doing anything when called to handle C<DESTROY>. =head3 Global Destruction The order in which objects are destroyed during the global destruction before the program exits is unpredictable. This means that any objects contained by your object may already have been destroyed. 
You should check that a contained object is defined before calling a method on it: sub DESTROY { my $self = shift; $self->{handle}->close() if $self->{handle}; } You can use the C<${^GLOBAL_PHASE}> variable to detect if you are currently in the global destruction phase: sub DESTROY { my $self = shift; return if ${^GLOBAL_PHASE} eq 'DESTRUCT'; $self->{handle}->close(); } Note that this variable was added in Perl 5.14.0. If you want to detect the global destruction phase on older versions of Perl, you can use the C<Devel::GlobalDestruction> module on CPAN. If your C<DESTROY> method issues a warning during global destruction, the Perl interpreter will append the string " during global destruction" to the warning. During global destruction, Perl will always garbage collect objects before unblessed references. See L<perlhacktips/PERL_DESTRUCT_LEVEL> for more information about global destruction. =head2 Non-Hash Objects All the examples so far have shown objects based on a blessed hash. However, it's possible to bless any type of data structure or referent, including scalars, globs, and subroutines. You may see this sort of thing when looking at code in the wild. Here's an example of a module as a blessed scalar: package Time; use strict; use warnings; sub new { my $class = shift; my $time = time; return bless \$time, $class; } sub epoch { my $self = shift; return ${ $self }; } my $time = Time->new(); print $time->epoch(); =head2 Inside-Out objects In the past, the Perl community experimented with a technique called "inside-out objects". An inside-out object stores its data outside of the object's reference, indexed on a unique property of the object, such as its memory address, rather than in the object itself. This has the advantage of enforcing the encapsulation of object attributes, since their data is not stored in the object itself. 
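To see the encapsulation problem this addresses, consider a conventional hash-based object. In this minimal sketch, C<HashTime> is a hypothetical class; any code holding the object can reach into the underlying hash and change an attribute without going through the class's methods.

```perl
use strict;
use warnings;

{
    package HashTime;    # hypothetical hash-based class
    sub new   { my $class = shift; return bless { epoch => time }, $class }
    sub epoch { my $self  = shift; return $self->{epoch} }
}

my $t = HashTime->new;

# Nothing stops outside code from bypassing the accessor
# and overwriting the attribute directly.
$t->{epoch} = 0;

print $t->epoch, "\n";
```

With an inside-out object the same assignment has nothing to overwrite, because the attribute data is kept in a lexical hash owned by the class rather than in the referent.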
This technique was popular for a while (and was recommended in Damian Conway's I<Perl Best Practices>), but never achieved universal adoption. The L<Object::InsideOut> module on CPAN provides a comprehensive implementation of this technique, and you may see it or other inside-out modules in the wild.

Here is a simple example of the technique, using the L<Hash::Util::FieldHash> core module. This module was added to the core to support inside-out object implementations.

  package Time;

  use strict;
  use warnings;

  use Hash::Util::FieldHash 'fieldhash';

  fieldhash my %time_for;

  sub new {
      my $class = shift;

      my $self = bless \( my $object ), $class;

      $time_for{$self} = time;

      return $self;
  }

  sub epoch {
      my $self = shift;

      return $time_for{$self};
  }

  my $time = Time->new;
  print $time->epoch;

=head2 Pseudo-hashes

The pseudo-hash feature was an experimental feature introduced in earlier versions of Perl and removed in 5.10.0. A pseudo-hash is an array reference which can be accessed using named keys like a hash. You may run into some code in the wild which uses it. See the L<fields> pragma for more information.

=head1 SEE ALSO

A kinder, gentler tutorial on object-oriented programming in Perl can be found in L<perlootut>. You should also check out L<perlmodlib> for some style guides on constructing both modules and classes.

If you read this file _as_is_, just ignore the funny characters you see. It is written in the POD format (see perlpod manpage) which is specially designed to be readable as is.

=head1 NAME

perldos - Perl under DOS, W31, W95.

=head1 SYNOPSIS

These are instructions for building Perl under DOS (or w??), using DJGPP v2.03 or later. Under w95 long filenames are supported.

=head1 DESCRIPTION

Before you start, you should glance through the README file found in the top-level directory where the Perl distribution was extracted. Make sure you read and understand the terms under which this software is being distributed.

This port currently supports MakeMaker (the set of modules that is used to build extensions to perl). Therefore, you should be able to build and install most extensions found in the CPAN sites.

Detailed instructions on how to build and install perl extension modules, including XS-type modules, are included. See 'BUILDING AND INSTALLING MODULES'.

=head2 Prerequisites for Compiling Perl on DOS

=over 4

=item DJGPP

DJGPP is a port of GNU C/C++ compiler and development tools to 32-bit, protected-mode environment on Intel 32-bit CPUs running MS-DOS and compatible operating systems, by DJ Delorie <dj@delorie.com> and friends.

For more details (FAQ), check out the home of DJGPP at:

        http://www.delorie.com/djgpp/

If you have questions about DJGPP, try posting to the DJGPP newsgroup: comp.os.msdos.djgpp, or use the email gateway djgpp@delorie.com.

You can find the full DJGPP distribution on any of the mirrors listed here:

        http://www.delorie.com/djgpp/getting.html

You need the following files to build perl (or add new modules):

        v2/djdev203.zip
        v2gnu/bnu2112b.zip
        v2gnu/gcc2953b.zip
        v2gnu/bsh204b.zip
        v2gnu/mak3791b.zip
        v2gnu/fil40b.zip
        v2gnu/sed3028b.zip
        v2gnu/txt20b.zip
        v2gnu/dif272b.zip
        v2gnu/grep24b.zip
        v2gnu/shl20jb.zip
        v2gnu/gwk306b.zip
        v2misc/csdpmi5b.zip

or possibly any newer version.

=item Pthreads

Thread support is not tested in this version of the djgpp perl.

=back

=head2 Shortcomings of Perl under DOS

Perl under DOS lacks some features of perl under UNIX because of deficiencies in the UNIX-emulation, most notably:

=over 4

=item *

fork() and pipe()

=item *

some features of the UNIX filesystem regarding link count and file dates

=item *

in-place operation is a little bit broken with short filenames

=item *

sockets

=back

=head2 Building Perl on DOS

=over 4

=item *

Unpack the source package F<perl5.8*.tar.gz> with djtarx.
If you want to use long file names under w95 and also to get Perl to pass all its tests, don't forget to use set LFN=y set FNCASE=y before unpacking the archive. =item * Create a "symlink" or copy your bash.exe to sh.exe in your C<($DJDIR)/bin> directory. ln -s bash.exe sh.exe [If you have the recommended version of bash for DJGPP, this is already done for you.] And make the C<SHELL> environment variable point to this F<sh.exe>: set SHELL=c:/djgpp/bin/sh.exe (use full path name!) You can do this in F<djgpp.env> too. Add this line BEFORE any section definition: +SHELL=%DJDIR%/bin/sh.exe =item * If you have F<split.exe> and F<gsplit.exe> in your path, then rename F<split.exe> to F<djsplit.exe>, and F<gsplit.exe> to F<split.exe>. Copy or link F<gecho.exe> to F<echo.exe> if you don't have F<echo.exe>. Copy or link F<gawk.exe> to F<awk.exe> if you don't have F<awk.exe>. [If you have the recommended versions of djdev, shell utilities and gawk, all these are already done for you, and you will not need to do anything.] =item * Chdir to the djgpp subdirectory of perl toplevel and type the following commands: set FNCASE=y configure.bat This will do some preprocessing then run the Configure script for you. The Configure script is interactive, but in most cases you just need to press ENTER. The "set" command ensures that DJGPP preserves the letter case of file names when reading directories. If you already issued this set command when unpacking the archive, and you are in the same DOS session as when you unpacked the archive, you don't have to issue the set command again. This command is necessary *before* you start to (re)configure or (re)build perl in order to ensure both that perl builds correctly and that building XS-type modules can succeed. 
See the DJGPP info entry for "_preserve_fncase" for more information: info libc alphabetical _preserve_fncase If the script says that your package is incomplete, and asks whether to continue, just answer with Y (this can only happen if you don't use long filenames or forget to issue "set FNCASE=y" first). When Configure asks about the extensions, I suggest IO and Fcntl, and if you want database handling then SDBM_File or GDBM_File (you need to install gdbm for this one). If you want to use the POSIX extension (this is the default), make sure that the stack size of your F<cc1.exe> is at least 512kbyte (you can check this with: C<stubedit cc1.exe>). You can use the Configure script in non-interactive mode too. When I built my F<perl.exe>, I used something like this: configure.bat -des You can find more info about Configure's command line switches in the F<INSTALL> file. When the script ends, and you want to change some values in the generated F<config.sh> file, then run sh Configure -S after you made your modifications. IMPORTANT: if you use this C<-S> switch, be sure to delete the CONFIG environment variable before running the script: set CONFIG= =item * Now you can compile Perl. Type: make =back =head2 Testing Perl on DOS Type: make test If you're lucky you should see "All tests successful". But there can be a few failed subtests (less than 5 hopefully) depending on some external conditions (e.g. some subtests fail under linux/dosemu or plain dos with short filenames only). =head2 Installation of Perl on DOS Type: make install This will copy the newly compiled perl and libraries into your DJGPP directory structure. Perl.exe and the utilities go into C<($DJDIR)/bin>, and the library goes under C<($DJDIR)/lib/perl5>. The pod documentation goes under C<($DJDIR)/lib/perl5/pod>. =head1 BUILDING AND INSTALLING MODULES ON DOS =head2 Building Prerequisites for Perl on DOS For building and installing non-XS modules, all you need is a working perl under DJGPP. 
Non-XS modules do not require re-linking the perl binary, and so are simpler to build and install. XS-type modules do require re-linking the perl binary, because part of an XS module is written in "C", and has to be linked together with the perl binary to be executed. This is required because perl under DJGPP is built with the "static link" option, due to the lack of "dynamic linking" in the DJGPP environment. Because XS modules require re-linking of the perl binary, you need both the perl binary distribution and the perl source distribution to build an XS extension module. In addition, you will have to have built your perl binary from the source distribution so that all of the components of the perl binary are available for the required link step. =head2 Unpacking CPAN Modules on DOS First, download the module package from CPAN (e.g., the "Comma Separated Value" text package, Text-CSV-0.01.tar.gz). Then expand the contents of the package into some location on your disk. Most CPAN modules are built with an internal directory structure, so it is usually safe to expand it in the root of your DJGPP installation. Some people prefer to locate source trees under /usr/src (i.e., C<($DJDIR)/usr/src>), but you may put it wherever seems most logical to you, *EXCEPT* under the same directory as your perl source code. There are special rules that apply to modules which live in the perl source tree that do not apply to most of the modules in CPAN. Unlike other DJGPP packages, which are normal "zip" files, most CPAN module packages are "gzipped tarballs". Recent versions of WinZip will safely unpack and expand them, *UNLESS* they have zero-length files. It is a known WinZip bug (as of v7.0) that it will not extract zero-length files. From the command line, you can use the djtar utility provided with DJGPP to unpack and expand these files. 
For example: C:\djgpp>djtarx -v Text-CSV-0.01.tar.gz This will create the new directory C<($DJDIR)/Text-CSV-0.01>, filling it with the source for this module. =head2 Building Non-XS Modules on DOS To build a non-XS module, you can use the standard module-building instructions distributed with perl modules. perl Makefile.PL make make test make install This is sufficient because non-XS modules install only ".pm" files and (sometimes) pod and/or man documentation. No re-linking of the perl binary is needed to build, install or use non-XS modules. =head2 Building XS Modules on DOS To build an XS module, you must use the standard module-building instructions distributed with perl modules *PLUS* three extra instructions specific to the DJGPP "static link" build environment. set FNCASE=y perl Makefile.PL make make perl make test make -f Makefile.aperl inst_perl MAP_TARGET=perl.exe make install The first extra instruction sets DJGPP's FNCASE environment variable so that the new perl binary which you must build for an XS-type module will build correctly. The second extra instruction re-builds the perl binary in your module directory before you run "make test", so that you are testing with the new module code you built with "make". The third extra instruction installs the perl binary from your module directory into the standard DJGPP binary directory, C<($DJDIR)/bin>, replacing your previous perl binary. Note that the MAP_TARGET value *must* have the ".exe" extension or you will not create a "perl.exe" to replace the one in C<($DJDIR)/bin>. When you are done, the XS-module install process will have added information to your "perllocal" information telling that the perl binary has been replaced, and what module was installed. You can view this information at any time by using the command: perl -S perldoc perllocal =head1 AUTHOR Laszlo Molnar, F<laszlo.molnar@eth.ericsson.se> [Installing/building perl] Peter J. 
Farley III F<pjfarley@banet.net> [Building/installing modules]

=head1 SEE ALSO

perl(1).

=cut

If you read this file _as_is_, just ignore the funny characters you see. It is written in the POD format (see pod/perlpod.pod) which is specially designed to be readable as is.

=head1 NAME

perlmacosx - Perl under Mac OS X

=head1 SYNOPSIS

This document briefly describes Perl under Mac OS X.

  curl -O https://www.cpan.org/src/perl-5.32.1.tar.gz
  tar -xzf perl-5.32.1.tar.gz
  cd perl-5.32.1
  ./Configure -des -Dprefix=/usr/local/
  make
  make test
  sudo make install

=head1 DESCRIPTION

The latest Perl release (5.32.1 as of this writing) builds without changes under all versions of Mac OS X from 10.3 "Panther" onwards.

In order to build your own version of Perl you will need 'make', which is part of Apple's developer tools - also known as Xcode. From Mac OS X 10.7 "Lion" onwards, it can be downloaded separately as the 'Command Line Tools' bundle directly from L<https://developer.apple.com/downloads/> (you will need a free account to log in), or as a part of the Xcode suite, freely available at the App Store. Xcode is a pretty big app, so unless you already have it or really want it, you are advised to get the 'Command Line Tools' bundle separately from the link above. If you want to do it from within Xcode, go to Xcode -> Preferences -> Downloads and select the 'Command Line Tools' option.

Between Mac OS X 10.3 "Panther" and 10.6 "Snow Leopard", the 'Command Line Tools' bundle was called 'unix tools', and was usually supplied with Mac OS install DVDs.

Earlier Mac OS X releases (10.2 "Jaguar" and older) did not include a completely thread-safe libc, so threading is not fully supported. Also, earlier releases included a buggy libdb, so some of the DB_File tests are known to fail on those releases.

=head2 Installation Prefix

The default installation location for this release uses the traditional UNIX directory layout under /usr/local.
This is the recommended location for most users, and will leave the Apple-supplied Perl and its modules undisturbed. Using an installation prefix of '/usr' will result in a directory layout that mirrors that of Apple's default Perl, with core modules stored in '/System/Library/Perl/${version}', CPAN modules stored in '/Library/Perl/${version}', and the addition of '/Network/Library/Perl/${version}' to @INC for modules that are stored on a file server and used by many Macs. =head2 SDK support First, export the path to the SDK into the build environment: export SDK=/Applications/Xcode.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX10.8.sdk Please make sure the SDK version (i.e. the numbers right before '.sdk') matches your system's (in this case, Mac OS X 10.8 "Mountain Lion"), as it is possible to have more than one SDK installed. Also make sure the path exists in your system, and if it doesn't please make sure the SDK is properly installed, as it should come with the 'Command Line Tools' bundle mentioned above. Finally, if you have an older Mac OS X (10.6 "Snow Leopard" and below) running Xcode 4.2 or lower, the SDK path might be something like C<'/Developer/SDKs/MacOSX10.3.9.sdk'>. You can use the SDK by exporting some additions to Perl's 'ccflags' and '..flags' config variables: ./Configure -Accflags="-nostdinc -B$SDK/usr/include/gcc \ -B$SDK/usr/lib/gcc -isystem$SDK/usr/include \ -F$SDK/System/Library/Frameworks" \ -Aldflags="-Wl,-syslibroot,$SDK" \ -de =head2 Universal Binary support Note: From Mac OS X 10.6 "Snow Leopard" onwards, Apple only supports Intel-based hardware. This means you can safely skip this section unless you have an older Apple computer running on ppc or wish to create a perl binary with backwards compatibility. You can compile perl as a universal binary (built for both ppc and intel). 
In Mac OS X 10.4 "Tiger", you must export the 'u' variant of the SDK: export SDK=/Developer/SDKs/MacOSX10.4u.sdk Mac OS X 10.5 "Leopard" and above do not require the 'u' variant. In addition to the compiler flags used to select the SDK, also add the flags for creating a universal binary: ./Configure -Accflags="-arch i686 -arch ppc -nostdinc \ -B$SDK/usr/include/gcc \ -B$SDK/usr/lib/gcc -isystem$SDK/usr/include \ -F$SDK/System/Library/Frameworks" \ -Aldflags="-arch i686 -arch ppc -Wl,-syslibroot,$SDK" \ -de Keep in mind that these compiler and linker settings will also be used when building CPAN modules. For XS modules to be compiled as a universal binary, any libraries it links to must also be universal binaries. The system libraries that Apple includes with the 10.4u SDK are all universal, but user-installed libraries may need to be re-installed as universal binaries. =head2 64-bit PPC support Follow the instructions in F<INSTALL> to build perl with support for 64-bit integers (C<use64bitint>) or both 64-bit integers and 64-bit addressing (C<use64bitall>). In the latter case, the resulting binary will run only on G5-based hosts. Support for 64-bit addressing is experimental: some aspects of Perl may be omitted or buggy. Note the messages output by F<Configure> for further information. Please use L<https://github.com/Perl/perl5/issues> to submit a problem report in the event that you encounter difficulties. When building 64-bit modules, it is your responsibility to ensure that linked external libraries and frameworks provide 64-bit support: if they do not, module building may appear to succeed, but attempts to use the module will result in run-time dynamic linking errors, and subsequent test failures. 
You can use C<file> to discover the architectures supported by a library: $ file libgdbm.3.0.0.dylib libgdbm.3.0.0.dylib: Mach-O fat file with 2 architectures libgdbm.3.0.0.dylib (for architecture ppc): Mach-O dynamically linked shared library ppc libgdbm.3.0.0.dylib (for architecture ppc64): Mach-O 64-bit dynamically linked shared library ppc64 Note that this issue precludes the building of many Macintosh-specific CPAN modules (C<Mac::*>), as the required Apple frameworks do not provide PPC64 support. Similarly, downloads from Fink or Darwinports are unlikely to provide 64-bit support; the libraries must be rebuilt from source with the appropriate compiler and linker flags. For further information, see Apple's I<64-Bit Transition Guide> at L<https://developer.apple.com/library/archive/documentation/Darwin/Conceptual/64bitPorting/transition/transition.html>. =head2 libperl and Prebinding Mac OS X ships with a dynamically-loaded libperl, but the default for this release is to compile a static libperl. The reason for this is pre-binding. Dynamic libraries can be pre-bound to a specific address in memory in order to decrease load time. To do this, one needs to be aware of the location and size of all previously-loaded libraries. Apple collects this information as part of their overall OS build process, and thus has easy access to it when building Perl, but ordinary users would need to go to a great deal of effort to obtain the information needed for pre-binding. You can override the default and build a shared libperl if you wish (S<Configure ... -Duseshrplib>). With Mac OS X 10.4 "Tiger" and newer, there is almost no performance penalty for non-prebound libraries. Earlier releases will suffer a greater load time than either the static library, or Apple's pre-bound dynamic library. =head2 Updating Apple's Perl In a word - don't, at least not without a *very* good reason. Your scripts can just as easily begin with "#!/usr/local/bin/perl" as with "#!/usr/bin/perl". 
Scripts supplied by Apple and other third parties as part of installation packages and such have generally only been tested with the /usr/bin/perl that's installed by Apple. If you find that you do need to update the system Perl, one issue worth keeping in mind is the question of static vs. dynamic libraries. If you upgrade using the default static libperl, you will find that the dynamic libperl supplied by Apple will not be deleted. If both libraries are present when an application that links against libperl is built, ld will link against the dynamic library by default. So, if you need to replace Apple's dynamic libperl with a static libperl, you need to be sure to delete the older dynamic library after you've installed the update. =head2 Known problems If you have installed extra libraries such as GDBM through Fink (in other words, you have libraries under F</sw/lib>), or libdlcompat to F</usr/local/lib>, you may need to be extra careful when running Configure to not to confuse Configure and Perl about which libraries to use. Being confused will show up for example as "dyld" errors about symbol problems, for example during "make test". The safest bet is to run Configure as Configure ... -Uloclibpth -Dlibpth=/usr/lib to make Configure look only into the system libraries. If you have some extra library directories that you really want to use (such as newer Berkeley DB libraries in pre-Panther systems), add those to the libpth: Configure ... -Uloclibpth -Dlibpth='/usr/lib /opt/lib' The default of building Perl statically may cause problems with complex applications like Tk: in that case consider building shared Perl Configure ... -Duseshrplib but remember that there's a startup cost to pay in that case (see above "libperl and Prebinding"). Starting with Tiger (Mac OS X 10.4), Apple shipped broken locale files for the eu_ES locale (Basque-Spain). In previous releases of Perl, this resulted in failures in the F<lib/locale> test. 
These failures have been suppressed in the current release of Perl by making the test ignore the broken locale. If you need to use the eu_ES locale, you should contact Apple support. =head2 Cocoa There are two ways to use Cocoa from Perl. Apple's PerlObjCBridge module, included with Mac OS X, can be used by standalone scripts to access Foundation (i.e. non-GUI) classes and objects. An alternative is CamelBones, a framework that allows access to both Foundation and AppKit classes and objects, so that full GUI applications can be built in Perl. CamelBones can be found on SourceForge, at L<https://www.sourceforge.net/projects/camelbones/>. =head1 Starting From Scratch Unfortunately it is not that difficult somehow manage to break one's Mac OS X Perl rather severely. If all else fails and you want to really, B<REALLY>, start from scratch and remove even your Apple Perl installation (which has become corrupted somehow), the following instructions should do it. B<Please think twice before following these instructions: they are much like conducting brain surgery to yourself. Without anesthesia.> We will B<not> come to fix your system if you do this. First, get rid of the libperl.dylib: # cd /System/Library/Perl/darwin/CORE # rm libperl.dylib Then delete every .bundle file found anywhere in the folders: /System/Library/Perl /Library/Perl You can find them for example by # find /System/Library/Perl /Library/Perl -name '*.bundle' -print After this you can either copy Perl from your operating system media (you will need at least the /System/Library/Perl and /usr/bin/perl), or rebuild Perl from the source code with C<Configure -Dprefix=/usr -Duseshrplib> NOTE: the C<-Dprefix=/usr> to replace the system Perl works much better with Perl 5.8.1 and later, in Perl 5.8.0 the settings were not quite right. "Pacifist" from CharlesSoft (L<https://www.charlessoft.com/>) is a nice way to extract the Perl binaries from the OS media, without having to reinstall the entire OS. 
=head1 AUTHOR This README was written by Sherm Pendley E<lt>sherm@dot-app.orgE<gt>, and subsequently updated by Dominic Dunlop E<lt>domo@computer.orgE<gt> and Breno G. de Oliveira E<lt>garu@cpan.orgE<gt>. The "Starting From Scratch" recipe was contributed by John Montbriand E<lt>montbriand@apple.comE<gt>. =head1 DATE Last modified 2013-04-29. PK �=�[�q] perlplan9.podnu �[��� If you read this file _as_is_, just ignore the funny characters you see. It is written in the POD format (see pod/perlpod.pod) which is specially designed to be readable as is. =head1 NAME perlplan9 - Plan 9-specific documentation for Perl =head1 DESCRIPTION These are a few notes describing features peculiar to Plan 9 Perl. As such, it is not intended to be a replacement for the rest of the Perl 5 documentation (which is both copious and excellent). If you have any questions to which you can't find answers in these man pages, contact Luther Huffman at lutherh@stratcom.com and we'll try to answer them. =head2 Invoking Perl Perl is invoked from the command line as described in L<perl>. Most perl scripts, however, do have a first line such as "#!/usr/local/bin/perl". This is known as a shebang (shell-bang) statement and tells the OS shell where to find the perl interpreter. In Plan 9 Perl this statement should be "#!/bin/perl" if you wish to be able to directly invoke the script by its name. Alternatively, you may invoke perl with the command "Perl" instead of "perl". This will produce Acme-friendly error messages of the form "filename:18". Some scripts, usually identified with a *.PL extension, are self-configuring and are able to correctly create their own shebang path from config information located in Plan 9 Perl. These you won't need to be worried about. =head2 What's in Plan 9 Perl Although Plan 9 Perl currently only provides static loading, it is built with a number of useful extensions. These include Opcode, FileHandle, Fcntl, and POSIX. Expect to see others (and DynaLoading!) 
in the future. =head2 What's not in Plan 9 Perl As mentioned previously, dynamic loading isn't currently available nor is MakeMaker. Both are high-priority items. =head2 Perl5 Functions not currently supported in Plan 9 Perl Some, such as C<chown> and C<umask> aren't provided because the concept does not exist within Plan 9. Others, such as some of the socket-related functions, simply haven't been written yet. Many in the latter category may be supported in the future. The functions not currently implemented include: chown, chroot, dbmclose, dbmopen, getsockopt, setsockopt, recvmsg, sendmsg, getnetbyname, getnetbyaddr, getnetent, getprotoent, getservent, sethostent, setnetent, setprotoent, setservent, endservent, endnetent, endprotoent, umask There may be several other functions that have undefined behavior so this list shouldn't be considered complete. =head2 Signals in Plan 9 Perl For compatibility with perl scripts written for the Unix environment, Plan 9 Perl uses the POSIX signal emulation provided in Plan 9's ANSI POSIX Environment (APE). Signal stacking isn't supported. The signals provided are: SIGHUP, SIGINT, SIGQUIT, SIGILL, SIGABRT, SIGFPE, SIGKILL, SIGSEGV, SIGPIPE, SIGPIPE, SIGALRM, SIGTERM, SIGUSR1, SIGUSR2, SIGCHLD, SIGCONT, SIGSTOP, SIGTSTP, SIGTTIN, SIGTTOU =head1 COMPILING AND INSTALLING PERL ON PLAN 9 WELCOME to Plan 9 Perl, brave soul! This is a preliminary alpha version of Plan 9 Perl. Still to be implemented are MakeMaker and DynaLoader. Many perl commands are missing or currently behave in an inscrutable manner. These gaps will, with perseverance and a modicum of luck, be remedied in the near future.To install this software: 1. Create the source directories and libraries for perl by running the plan9/setup.rc command (i.e., located in the plan9 subdirectory). Note: the setup routine assumes that you haven't dearchived these files into /sys/src/cmd/perl. 
After running setup.rc you may delete the
copy of the source you originally detarred, as source code has now been
installed in /sys/src/cmd/perl. If you plan on installing perl binaries for
all architectures, run "setup.rc -a".

2. After making sure that you have adequate privileges to build system
software, from /sys/src/cmd/perl/5.00301 (adjust version
appropriately) run:

    mk install

If you wish to install perl versions for all architectures (68020, mips,
sparc and 386) run:

    mk installall

3. Wait. The build process will take a *long* time because perl
bootstraps itself. A 75MHz Pentium, 16MB RAM machine takes roughly 30
minutes to build the distribution from scratch.

=head2 Installing Perl Documentation on Plan 9

This perl distribution comes with a tremendous amount of
documentation. To add these to the built-in manuals that come with
Plan 9, from /sys/src/cmd/perl/5.00301 (adjust version appropriately)
run:

    mk man

To begin your reading, start with:

    man perl

This is a good introduction and will direct you towards other man
pages that may interest you.

(Note: "mk man" may produce some extraneous noise. Fear not.)

=head1 BUGS

"As many as there are grains of sand on all the beaches of the
world . . ." - Carl Sagan

=head1 Revision date

This document was revised 09-October-1996 for Perl 5.003_7.

=head1 AUTHOR

Direct questions, comments, and the unlikely bug report (ahem) toward:

Luther Huffman, lutherh@stratcom.com,
Strategic Computer Solutions, Inc.

=pod

=encoding utf8

=head1 NAME

perl5320delta - what is new for perl v5.32.0

=head1 DESCRIPTION

This document describes differences between the 5.30.0 release and the 5.32.0
release.

If you are upgrading from an earlier release such as 5.28.0, first read
L<perl5300delta>, which describes differences between 5.28.0 and 5.30.0.
=head1 Core Enhancements

=head2 The isa Operator

A new experimental infix operator called C<isa> tests whether a given object
is an instance of a given class or a class derived from it:

    if( $obj isa Package::Name ) { ... }

For more detail see L<perlop/Class Instance Operator>.

=head2 Unicode 13.0 is supported

See L<https://www.unicode.org/versions/Unicode13.0.0/> for details.

=head2 Chained comparisons capability

Some comparison operators, as their associativity, I<chain> with some
operators of the same precedence (but never with operators of different
precedence).

    if ( $x < $y <= $z ) {...}

behaves exactly like:

    if ( $x < $y && $y <= $z ) {...}

(assuming that C<"$y"> is as simple a scalar as it looks.)

You can read more about this in L<perlop> under
L<perlop/Operator Precedence and Associativity>.

=head2 New Unicode properties C<Identifier_Status> and C<Identifier_Type> supported

Unicode has revised its regular expression requirements:
L<https://www.unicode.org/reports/tr18/tr18-21.html>.
As part of that revision, it asks for more properties to be exposed, ones
that aren't part of the strict UCD (Unicode Character Database). These two
are used for examining inputs for security purposes. Details on their usage
are at L<https://www.unicode.org/reports/tr39/>.

=head2 It is now possible to write C<qr/\p{Name=...}/>, or C<qr!\p{na=/(SMILING|GRINNING) FACE/}!>

The Unicode Name property is now accessible in regular expression
patterns, as an alternative to C<\N{...}>.
A comparison of the two methods is given in
L<perlunicode/Comparison of \N{...} and \p{name=...}>.

The second example above shows that wildcard subpatterns are also usable
in this property. See L<perlunicode/Wildcards in Property Values>.
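As a minimal sketch of the Name property in patterns, assuming Perl 5.32 or later, the following matches a character first by its exact name and then with the wildcard subpattern shown in the heading above. The variable names are illustrative only, and the C<no warnings> line assumes that wildcard property values still warn as experimental on your perl.

```perl
use v5.32;       # \p{Name=...} in patterns needs Perl 5.32 or later
use warnings;

# Assumption: wildcard property values warn as experimental on this perl.
no warnings 'experimental::uniprop_wildcards';

# Illustrative test characters, written as code points to avoid "use utf8".
my $grinning = "\N{U+1F600}";    # GRINNING FACE
my $letter   = "a";              # LATIN SMALL LETTER A

# Exact name: matches only the character with this Unicode name.
my $exact = qr/\p{Name=GRINNING FACE}/;

# Wildcard subpattern: any character whose name matches the inner regex.
my $wild = qr!\p{na=/(SMILING|GRINNING) FACE/}!;

for my $char ($grinning, $letter) {
    printf "U+%04X exact=%s wildcard=%s\n",
        ord($char),
        ($char =~ $exact ? "yes" : "no"),
        ($char =~ $wild  ? "yes" : "no");
}
```

The alternate C<!> delimiters on the second pattern keep the inner C</.../> subpattern from ending the outer C<qr> early.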
=head2 Improvement of C<POSIX::mblen()>, C<mbtowc>, and C<wctomb>

The C<POSIX::mblen()>, C<mbtowc>, and C<wctomb> functions now
work on shift state locales and are thread-safe on C99 and above
compilers when executed on a platform that has locale thread-safety; the
length parameters are now optional.

These functions are always executed under the current C language locale.
(See L<perllocale>.) Most locales are stateless, but a few, notably the
very rarely encountered ISO 2022, maintain a state between calls to
these functions. Previously the state was cleared on every call, but
now the state is not reset unless the appropriate parameter is C<undef>.

On threaded perls, the C99 functions L<mbrlen(3)>, L<mbrtowc(3)>, and
L<wcrtomb(3)>, when available, are substituted for the plain functions.
This makes these functions thread-safe when executing on a locale
thread-safe platform.

The string length parameters in C<mblen> and C<mbtowc> are now optional;
they are useful only if you wish to restrict the length parsed in the
source string to less than the actual length.

=head2 Alpha assertions are no longer experimental

See L<perlre/(*pla:pattern)>, L<perlre/(*plb:pattern)>,
L<perlre/(*nla:pattern)>, and L<perlre/(*nlb:pattern)>.
Use of these no longer generates a warning; existing code that disables
the warning category C<experimental::alpha_assertions> will continue to work
without any changes needed. Enabling the category has no effect.

=head2 Script runs are no longer experimental

See L<perlre/Script Runs>. Use of these no longer generates a warning;
existing code that disables the warning category
C<experimental::script_run> will continue to work without any
changes needed. Enabling the category has no effect.

=head2 Feature checks are now faster

Previously, feature checks in the parser required a hash lookup when
features were set outside of a feature bundle. This has been optimized
to a bit mask check.
[L<GH #17229|https://github.com/Perl/perl5/issues/17229>] =head2 Perl is now developed on GitHub Perl is now developed on GitHub. You can find us at L<https://github.com/Perl/perl5>. Non-security bugs should now be reported via GitHub. Security issues should continue to be reported as documented in L<perlsec>. =head2 Compiled patterns can now be dumped before optimization This is primarily useful for tracking down bugs in the regular expression compiler. This dump happens on C<-DDEBUGGING> perls, if you specify C<-Drv> on the command line; or on any perl if the pattern is compiled within the scope of S<C<use re qw(Debug DUMP_PRE_OPTIMIZE)>> or S<C<use re qw(Debug COMPILE EXTRA)>>. (All but the second case display other information as well.) =head1 Security =head2 [CVE-2020-10543] Buffer overflow caused by a crafted regular expression A signed C<size_t> integer overflow in the storage space calculations for nested regular expression quantifiers could cause a heap buffer overflow in Perl's regular expression compiler that overwrites memory allocated after the regular expression storage space with attacker supplied data. The target system needs a sufficient amount of memory to allocate partial expansions of the nested quantifiers prior to the overflow occurring. This requirement is unlikely to be met on 64-bit systems. Discovered by: ManhND of The Tarantula Team, VinCSS (a member of Vingroup). =head2 [CVE-2020-10878] Integer overflow via malformed bytecode produced by a crafted regular expression Integer overflows in the calculation of offsets between instructions for the regular expression engine could cause corruption of the intermediate language state of a compiled regular expression. An attacker could abuse this behaviour to insert instructions into the compiled form of a Perl regular expression. Discovered by: Hugo van der Sanden and Slaven Rezic. 
=head2 [CVE-2020-12723] Buffer overflow caused by a crafted regular expression Recursive calls to C<S_study_chunk()> by Perl's regular expression compiler to optimize the intermediate language representation of a regular expression could cause corruption of the intermediate language state of a compiled regular expression. Discovered by: Sergey Aleynikov. =head2 Additional Note An application written in Perl would only be vulnerable to any of the above flaws if it evaluates regular expressions supplied by the attacker. Evaluating regular expressions in this fashion is known to be dangerous since the regular expression engine does not protect against denial of service attacks in this usage scenario. =head1 Incompatible Changes =head2 Certain pattern matching features are now prohibited in compiling Unicode property value wildcard subpatterns These few features are either inappropriate or interfere with the algorithm used to accomplish this task. The complete list is in L<perlunicode/Wildcards in Property Values>. =head2 Unused functions C<POSIX::mbstowcs> and C<POSIX::wcstombs> are removed These functions could never have worked due to a defective interface specification. There is clearly no demand for them, given that no one has ever complained in the many years the functions were claimed to be available, hence so-called "support" for them is now dropped. =head2 A bug fix for C<(?[...])> may have caused some patterns to no longer compile See L</Selected Bug Fixes>. The heuristics previously used may have let some constructs compile (perhaps not with the programmer's intended effect) that should have been errors. None are known, but it is possible that some erroneous constructs no longer compile. 
=head2 C<\p{I<user-defined>}> properties now always override official Unicode ones Previously, if and only if a user-defined property was declared prior to the compilation of the regular expression pattern that contains it, its definition was used instead of any official Unicode property with the same name. Now, it always overrides the official property. This change could break existing code that relied (likely unwittingly) on the previous behavior. Without this fix, if Unicode released a new version with a new property that happens to have the same name as the one you had long been using, your program would break when you upgraded to a perl that used that new Unicode version. See L<perlunicode/User-Defined Character Properties>. [L<GH #17205|https://github.com/Perl/perl5/issues/17205>] =head2 Modifiable variables are no longer permitted in constants Code like: my $var; $sub = sub () { $var }; where C<$var> is referenced elsewhere in some sort of modifiable context now produces an exception when the sub is defined. This error can be avoided by adding a return to the sub definition: $sub = sub () { return $var }; This has been deprecated since Perl 5.22. [L<perl #134138|https://rt.perl.org/Ticket/Display.html?id=134138>] =head2 Use of L<C<vec>|perlfunc/vec EXPR,OFFSET,BITS> on strings with code points above 0xFF is forbidden Such strings are represented internally in UTF-8, and C<vec> is a bit-oriented operation that will likely give unexpected results on those strings. This was deprecated in perl 5.28.0. =head2 Use of code points over 0xFF in string bitwise operators Some uses of these were already illegal after a previous deprecation cycle. The remaining uses are now prohibited, having been deprecated in perl 5.28.0. See L<perldeprecation>. =head2 C<Sys::Hostname::hostname()> does not accept arguments This usage was deprecated in perl 5.28.0 and is now fatal. =head2 Plain "0" string now treated as a number for range operator Previously a range C<"0" .. 
"-1"> would produce a range of numeric strings from "0" through "99"; this now produces an empty list, just as C<0 .. -1> does. This also means that C<"0" .. "9"> now produces a list of integers, where previously it would produce a list of strings. This was due to a special case that treated strings starting with "0" as strings so ranges like C<"00" .. "03"> produced C<"00", "01", "02", "03">, but didn't specially handle the string C<"0">. [L<perl #133695|https://rt.perl.org/Ticket/Display.html?id=133695>] =head2 C<\K> now disallowed in look-ahead and look-behind assertions This was disallowed because it causes unexpected behaviour, and no-one could define what the desired behaviour should be. [L<perl #124256|https://rt.perl.org/Ticket/Display.html?id=124256>] =head1 Performance Enhancements =over 4 =item * C<my_strnlen> has been sped up for systems that don't have their own C<strnlen> implementation. =item * C<grok_bin_oct_hex> (and so, C<grok_bin>, C<grok_oct>, and C<grok_hex>) have been sped up. =item * C<grok_number_flags> has been sped up. =item * C<sort> is now noticeably faster in cases such as C<< sort {$a <=> $b} >> or C<< sort {$b <=> $a} >>. [L<GH #17608|https://github.com/Perl/perl5/pull/17608>] =back =head1 Modules and Pragmata =head2 Updated Modules and Pragmata =over 4 =item * L<Archive::Tar> has been upgraded from version 2.32 to 2.36. =item * L<autodie> has been upgraded from version 2.29 to 2.32. =item * L<B> has been upgraded from version 1.76 to 1.80. =item * L<B::Deparse> has been upgraded from version 1.49 to 1.54. =item * L<Benchmark> has been upgraded from version 1.22 to 1.23. =item * L<charnames> has been upgraded from version 1.45 to 1.48. =item * L<Class::Struct> has been upgraded from version 0.65 to 0.66. =item * L<Compress::Raw::Bzip2> has been upgraded from version 2.084 to 2.093. =item * L<Compress::Raw::Zlib> has been upgraded from version 2.084 to 2.093. =item * L<CPAN> has been upgraded from version 2.22 to 2.27. 
=item * L<DB_File> has been upgraded from version 1.843 to 1.853. =item * L<Devel::PPPort> has been upgraded from version 3.52 to 3.57. The test files generated on Win32 are now identical to when they are generated on POSIX-like systems. =item * L<diagnostics> has been upgraded from version 1.36 to 1.37. =item * L<Digest::MD5> has been upgraded from version 2.55 to 2.55_01. =item * L<Dumpvalue> has been upgraded from version 1.18 to 1.21. Previously, when dumping elements of an array and encountering an undefined value, the string printed would have been C<empty array>. This has been changed to what was apparently originally intended: C<empty slot>. =item * L<DynaLoader> has been upgraded from version 1.45 to 1.47. =item * L<Encode> has been upgraded from version 3.01 to 3.06. =item * L<encoding> has been upgraded from version 2.22 to 3.00. =item * L<English> has been upgraded from version 1.10 to 1.11. =item * L<Exporter> has been upgraded from version 5.73 to 5.74. =item * L<ExtUtils::CBuilder> has been upgraded from version 0.280231 to 0.280234. =item * L<ExtUtils::MakeMaker> has been upgraded from version 7.34 to 7.44. =item * L<feature> has been upgraded from version 1.54 to 1.58. A new C<indirect> feature has been added, which is enabled by default but allows turning off L<indirect object syntax|perlobj/Indirect Object Syntax>. =item * L<File::Find> has been upgraded from version 1.36 to 1.37. On Win32, the tests no longer require either a file in the drive root directory, or a writable root directory. =item * L<File::Glob> has been upgraded from version 1.32 to 1.33. =item * L<File::stat> has been upgraded from version 1.08 to 1.09. =item * L<Filter::Simple> has been upgraded from version 0.95 to 0.96. =item * L<Getopt::Long> has been upgraded from version 2.5 to 2.51. =item * L<Hash::Util> has been upgraded from version 0.22 to 0.23. The Synopsis has been updated as the example code stopped working with newer perls. 
[L<GH #17399|https://github.com/Perl/perl5/issues/17399>] =item * L<I18N::Langinfo> has been upgraded from version 0.18 to 0.19. =item * L<I18N::LangTags> has been upgraded from version 0.43 to 0.44. Document the C<IGNORE_WIN32_LOCALE> environment variable. =item * L<IO> has been upgraded from version 1.40 to 1.43. L<IO::Socket> no longer caches a zero protocol value, since this indicates that the implementation will select a protocol. This means that on platforms that don't implement C<SO_PROTOCOL> for a given socket type the protocol method may return C<undef>. The supplied I<TO> is now always honoured on calls to the C<send()> method. [L<perl #133936|https://rt.perl.org/Ticket/Display.html?id=133936>] =item * IO-Compress has been upgraded from version 2.084 to 2.093. =item * L<IPC::Cmd> has been upgraded from version 1.02 to 1.04. =item * L<IPC::Open3> has been upgraded from version 1.20 to 1.21. =item * L<JSON::PP> has been upgraded from version 4.02 to 4.04. =item * L<Math::BigInt> has been upgraded from version 1.999816 to 1.999818. =item * L<Math::BigInt::FastCalc> has been upgraded from version 0.5008 to 0.5009. =item * L<Module::CoreList> has been upgraded from version 5.20190522 to 5.20200620. =item * L<Module::Load::Conditional> has been upgraded from version 0.68 to 0.70. =item * L<Module::Metadata> has been upgraded from version 1.000036 to 1.000037. =item * L<mro> has been upgraded from version 1.22 to 1.23. =item * L<Net::Ping> has been upgraded from version 2.71 to 2.72. =item * L<Opcode> has been upgraded from version 1.43 to 1.47. =item * L<open> has been upgraded from version 1.11 to 1.12. =item * L<overload> has been upgraded from version 1.30 to 1.31. =item * L<parent> has been upgraded from version 0.237 to 0.238. =item * L<perlfaq> has been upgraded from version 5.20190126 to 5.20200523. =item * L<PerlIO> has been upgraded from version 1.10 to 1.11. =item * L<PerlIO::encoding> has been upgraded from version 0.27 to 0.28. 
=item * L<PerlIO::via> has been upgraded from version 0.17 to 0.18. =item * L<Pod::Html> has been upgraded from version 1.24 to 1.25. =item * L<Pod::Simple> has been upgraded from version 3.35 to 3.40. =item * L<podlators> has been upgraded from version 4.11 to 4.14. =item * L<POSIX> has been upgraded from version 1.88 to 1.94. =item * L<re> has been upgraded from version 0.37 to 0.40. =item * L<Safe> has been upgraded from version 2.40 to 2.41. =item * L<Scalar::Util> has been upgraded from version 1.50 to 1.55. =item * L<SelfLoader> has been upgraded from version 1.25 to 1.26. =item * L<Socket> has been upgraded from version 2.027 to 2.029. =item * L<Storable> has been upgraded from version 3.15 to 3.21. Use of C<note()> from L<Test::More> is now optional in tests. This works around a circular dependency with L<Test::More> when installing on very old perls from CPAN. Vstring magic strings over 2GB are now disallowed. Regular expressions objects weren't properly counted for object id purposes on retrieve. This would corrupt the resulting structure, or cause a runtime error in some cases. [L<perl #134179|https://rt.perl.org/Ticket/Display.html?id=134179>] =item * L<Sys::Hostname> has been upgraded from version 1.22 to 1.23. =item * L<Sys::Syslog> has been upgraded from version 0.35 to 0.36. =item * L<Term::ANSIColor> has been upgraded from version 4.06 to 5.01. =item * L<Test::Simple> has been upgraded from version 1.302162 to 1.302175. =item * L<Thread> has been upgraded from version 3.04 to 3.05. =item * L<Thread::Queue> has been upgraded from version 3.13 to 3.14. =item * L<threads> has been upgraded from version 2.22 to 2.25. =item * L<threads::shared> has been upgraded from version 1.60 to 1.61. =item * L<Tie::File> has been upgraded from version 1.02 to 1.06. =item * L<Tie::Hash::NamedCapture> has been upgraded from version 0.10 to 0.13. =item * L<Tie::Scalar> has been upgraded from version 1.04 to 1.05. 
=item * L<Tie::StdHandle> has been upgraded from version 4.5 to 4.6. =item * L<Time::HiRes> has been upgraded from version 1.9760 to 1.9764. Removed obsolete code such as support for pre-5.6 perl and classic MacOS. [L<perl #134288|https://rt.perl.org/Ticket/Display.html?id=134288>] =item * L<Time::Piece> has been upgraded from version 1.33 to 1.3401. =item * L<Unicode::Normalize> has been upgraded from version 1.26 to 1.27. =item * L<Unicode::UCD> has been upgraded from version 0.72 to 0.75. =item * L<VMS::Stdio> has been upgraded from version 2.44 to 2.45. =item * L<warnings> has been upgraded from version 1.44 to 1.47. =item * L<Win32> has been upgraded from version 0.52 to 0.53. =item * L<Win32API::File> has been upgraded from version 0.1203 to 0.1203_01. =item * L<XS::APItest> has been upgraded from version 1.00 to 1.09. =back =head2 Removed Modules and Pragmata =over 4 =item * Pod::Parser has been removed from the core distribution. It still is available for download from CPAN. This resolves [L<perl #119439|https://rt.perl.org/Ticket/Display.html?id=119439>]. =back =head1 Documentation =head2 Changes to Existing Documentation We have attempted to update the documentation to reflect the changes listed in this document. If you find any we have missed, open an issue at L<https://github.com/Perl/perl5/issues>. Additionally, the following selected changes have been made: =head3 L<perldebguts> =over 4 =item * Simplify a few regnode definitions Update C<BOUND> and C<NBOUND> definitions. =item * Add ANYOFHs regnode This node is like C<ANYOFHb>, but is used when more than one leading byte is the same in all the matched code points. C<ANYOFHb> is used to avoid having to convert from UTF-8 to code point for something that won't match. It checks that the first byte in the UTF-8 encoded target is the desired one, thus ruling out most of the possible code points. 
=back

=head3 L<perlapi>

=over 4

=item *

C<sv_2pvbyte> updated to mention it will croak if the SV cannot be
downgraded.

=item *

C<sv_setpvn> updated to mention that the UTF-8 flag will not be changed by
this function, and a terminating NUL byte is guaranteed.

=item *

Documentation for C<PL_phase> has been added.

=item *

The documentation for C<grok_bin>, C<grok_oct>, and C<grok_hex> has been
updated and clarified.

=back

=head3 L<perldiag>

=over 4

=item *

Add documentation for experimental 'isa' operator

(S experimental::isa) This warning is emitted if you use the (C<isa>)
operator. This operator is currently experimental and its behaviour may
change in future releases of Perl.

=back

=head3 L<perlfunc>

=over 4

=item C<caller>

Like L<C<__FILE__>|/__FILE__> and L<C<__LINE__>|/__LINE__>, the filename and
line number returned here may be altered by the mechanism described at
L<perlsyn/"Plain Old Comments (Not!)">.

=item C<__FILE__>

It can be altered by the mechanism described at
L<perlsyn/"Plain Old Comments (Not!)">.

=item C<__LINE__>

It can be altered by the mechanism described at
L<perlsyn/"Plain Old Comments (Not!)">.

=item C<return>

Now mentions that you cannot return from C<do BLOCK>.

=item C<open>

The C<open()> section has been renovated significantly.

=back

=head3 L<perlguts>

=over 4

=item *

No longer suggesting using perl's C<malloc>. Modern system C<malloc> is
assumed to be much better than perl's implementation now.

=item *

Documentation about F<embed.fnc> flags has been removed. F<embed.fnc> now has
sufficient comments within it. Anyone changing that file will see those
comments first, so entries here are now redundant.

=item *

Updated documentation for C<UTF8f>

=item *

Added missing C<=for apidoc> lines

=back

=head3 L<perlhacktips>

=over 4

=item *

The differences between Perl strings and C strings are now detailed.

=back

=head3 L<perlintro>

=over 4

=item *

The documentation for the repetition operator C<x> has been clarified.
[L<GH #17335|https://github.com/Perl/perl5/issues/17335>]

=back

=head3 L<perlipc>

=over 4

=item *

The documentation surrounding C<open> and handle usage has been modernized
to prefer 3-arg open and lexical variables instead of barewords.

=item *

Various updates and fixes including making all examples strict-safe and
replacing C<-w> with C<use warnings>.

=back

=head3 L<perlop>

=over 4

=item *

'isa' operator is experimental

This is an experimental feature and is available when enabled
by C<use feature 'isa'>. It emits a warning in the C<experimental::isa>
category.

=back

=head3 L<perlpod>

=over 4

=item *

Details of the various stacks within the perl interpreter are now explained
here.

=item *

Advice has been added regarding the usage of C<< ZE<lt>E<gt> >>.

=back

=head3 L<perlport>

=over 4

=item *

Update C<timegm> example to use the correct year format I<1970> instead of I<70>.
[L<GH #16431|https://github.com/Perl/perl5/issues/16431>]

=back

=head3 L<perlreref>

=over 4

=item *

Fix some typos.

=back

=head3 L<perlvar>

=over 4

=item *

Now recommends stringifying C<$]> and comparing it numerically.

=back

=head3 L<perlapi>, L<perlintern>

=over 4

=item *

Documentation has been added for several functions that were
lacking it before.

=back

=head3 L<perlxs>

=over 4

=item *

Suggest using C<libffi> for simple library bindings via CPAN modules
like L<FFI::Platypus> or L<FFI::Raw>.

=back

=head3 L<POSIX>

=over 4

=item *

C<setlocale> warning about threaded builds updated to note it does not
apply on Perl 5.28.X and later.

=item *

C<< POSIX::SigSet->new(...) >> updated to state it throws an error if any of
the supplied signals cannot be added to the set.

=back

Additionally, the following selected changes have been made:

=head3 Updating of links

=over 4

=item *

Links to the now defunct L<https://search.cpan.org> site now point at
the equivalent L<https://metacpan.org> URL.
[L<GH #17393|https://github.com/Perl/perl5/issues/17393>] =item * The man page for L<ExtUtils::XSSymSet> is now only installed on VMS, which is the only platform the module is installed on. [L<GH #17424|https://github.com/Perl/perl5/issues/17424>] =item * URLs have been changed to C<https://> and stale links have been updated. Where applicable, the URLs in the documentation have been moved from using the C<http://> protocol to C<https://>. This also affects the location of the bug tracker at L<https://rt.perl.org>. =item * Some links to OS/2 libraries, Address Sanitizer and other system tools had gone stale. These have been updated with working links. =item * Some links to old email addresses on perl5-porters had gone stale. These have been updated with working links. =back =head1 Diagnostics The following additions or changes have been made to diagnostic output, including warnings and fatal error messages. For the complete list of diagnostic messages, see L<perldiag>. =head2 New Diagnostics =head3 New Errors =over 4 =item * L<Expecting interpolated extended charclass in regex; marked by <-- HERE in mE<sol>%sE<sol> |perldiag/"Expecting interpolated extended charclass in regex; marked by <-- HERE in mE<sol>%sE<sol>"> This is a replacement for several error messages listed under L</Changes to Existing Diagnostics>. =item * C<L<No digits found for %s literal|perldiag/"No digits found for %s literal">> (F) No hexadecimal digits were found following C<0x> or no binary digits were found following C<0b>. =back =head3 New Warnings =over 4 =item * L<Code point 0x%X is not Unicode, and not portable|perldiag/"Code point 0x%X is not Unicode, and not portable"> This is actually not a new message, but it is now output when the warnings category C<portable> is enabled. When raised during regular expression pattern compilation, the warning has extra text added at the end marking where precisely in the pattern it occurred. =item * L<Non-hex character '%c' terminates \x early. 
Resolved as "%s"|perldiag/"Non-hex character '%c' terminates \x early. Resolved as "%s""> This replaces a warning that was much less specific, and which gave false information. This new warning parallels the similar already-existing one raised for C<\o{}>. =back =head2 Changes to Existing Diagnostics =over 4 =item * L<Character following "\c" must be printable ASCII|perldiag/"Character following "\c" must be printable ASCII"> ...now has extra text added at the end, when raised during regular expression pattern compilation, marking where precisely in the pattern it occurred. =item * L<Use "%s" instead of "%s"|perldiag/"Use "%s" instead of "%s""> ...now has extra text added at the end, when raised during regular expression pattern compilation, marking where precisely in the pattern it occurred. =item * L<Sequence "\c{" invalid|perldiag/"Sequence "\c{" invalid"> ...now has extra text added at the end, when raised during regular expression pattern compilation, marking where precisely in the pattern it occurred. =item * L<"\c%c" is more clearly written simply as "%s"|perldiag/""\c%c" is more clearly written simply as "%s""> ...now has extra text added at the end, when raised during regular expression pattern compilation, marking where precisely in the pattern it occurred. =item * L<Non-octal character '%c' terminates \o early. Resolved as "%s"|perldiag/"Non-octal character '%c' terminates \o early. Resolved as "%s""> ...now includes the phrase "terminates \o early", and has extra text added at the end, when raised during regular expression pattern compilation, marking where precisely in the pattern it occurred. In some instances the text of the resolution has been clarified. =item * L<'%s' resolved to '\o{%s}%d'|perldiag/'%s' resolved to '\o{%s}%d'> As of Perl 5.32, this message is no longer generated. Instead, L<perldiag/Non-octal character '%c' terminates \o early. Resolved as "%s"> is used instead. 
=item * L<Use of code point 0x%s is not allowed; the permissible max is 0x%X|perldiag/"Use of code point 0x%s is not allowed; the permissible max is 0x%X"> Some instances of this message previously output the hex digits C<A>, C<B>, C<C>, C<D>, C<E>, and C<F> in lower case. Now they are all consistently upper case. =item * The following three diagnostics have been removed, and replaced by L<C<Expecting interpolated extended charclass in regex; marked by <-- HERE in mE<sol>%sE<sol>> |perldiag/"Expecting interpolated extended charclass in regex; marked by <-- HERE in mE<sol>%sE<sol>">: C<Expecting close paren for nested extended charclass in regex; marked by <-- HERE in mE<sol>%sE<sol>>, C<Expecting close paren for wrapper for nested extended charclass in regex; marked by <-- HERE in mE<sol>%sE<sol>>, and C<Expecting '(?flags:(?[...' in regex; marked by S<<-- HERE> in mE<sol>%sE<sol>>. =item * The C<Code point 0x%X is not Unicode, and not portable> warning removed the line C<Code points above 0xFFFF_FFFF require larger than a 32 bit word.> as code points that large are no longer legal on 32-bit platforms. =item * L<Can't use global %s in %s|perldiag/"Can't use global %s in %s"> This error message has been slightly reformatted from the original C<Can't use global %s in "%s">, and in particular misleading error messages like C<Can't use global $_ in "my"> are now rendered as C<Can't use global $_ in subroutine signature>. =item * L<Constants from lexical variables potentially modified elsewhere are no longer permitted|perldiag/"Constants from lexical variables potentially modified elsewhere are no longer permitted"> This error message replaces the former C<Constants from lexical variables potentially modified elsewhere are deprecated. This will not be allowed in Perl 5.32> to reflect the fact that this previously deprecated usage has now been transformed into an exception. The message's classification has also been updated from D (deprecated) to F (fatal). 
See also L</Incompatible Changes>. =item * C<\N{} here is restricted to one character> is now emitted in the same circumstances where previously C<\N{} in inverted character class or as a range end-point is restricted to one character> was. This is due to new circumstances having been added in Perl 5.30 that weren't covered by the earlier wording. =back =head1 Utility Changes =head2 L<perlbug> =over 4 =item * The bug tracker homepage URL now points to GitHub. =back =head2 L<streamzip> =over 4 =item * This is a new utility, included as part of an L<IO::Compress::Base> upgrade. L<streamzip> creates a zip file from stdin. The program will read data from stdin, compress it into a zip container and, by default, write a streamed zip file to stdout. =back =head1 Configuration and Compilation =head2 F<Configure> =over 4 =item * For clang++, add C<< #include <stdlib.h> >> to Configure's probes for C<futimes>, C<strtoll>, C<strtoul>, C<strtoull>, C<strtouq>, otherwise the probes would fail to compile. =item * Use a compile and run test for C<lchown> to satisfy clang++ which should more reliably detect it. =item * For C++ compilers, add C<< #include <stdio.h> >> to Configure's probes for C<getpgrp> and C<setpgrp> as they use printf and C++ compilers may fail compilation instead of just warning. =item * Check if the compiler can handle inline attribute. =item * Check for character data alignment. =item * F<Configure> now correctly handles gcc-10. Previously it was interpreting it as gcc-1 and turned on C<-fpcc-struct-return>. =item * Perl now no longer probes for C<d_u32align>, defaulting to C<define> on all platforms. This check was error-prone when it was done, which was on 32-bit platforms only. [L<perl #133495|https://rt.perl.org/Ticket/Display.html?id=133495>] =item * Documentation and hints for building perl on Z/OS (native EBCDIC) have been updated. This is still a work in progress. =item * A new probe for C<malloc_usable_size> has been added. 
=item * Improvements in F<Configure> to detection in C++ and clang++. Work ongoing by Andy Dougherty. [L<perl #134171|https://rt.perl.org/Ticket/Display.html?id=134171>] =item * F<autodoc.pl> This tool that regenerates L<perlintern> and L<perlapi> has been overhauled significantly, restoring consistency in flags used in F<embed.fnc> and L<Devel::PPPort> and allowing removal of many redundant C<=for apidoc> entries in code. =item * The C<ECHO> macro is now defined. This is used in a C<dtrace> rule that was originally changed for FreeBSD, and the FreeBSD make apparently predefines it. The Solaris make does not predefine C<ECHO> which broke this rule on Solaris. [L<perl #134218|https://rt.perl.org/Ticket/Display.html?id=134218>] =item * Bison versions 3.1 through 3.4 are now supported. =back =head1 Testing Tests were added and changed to reflect the other additions and changes in this release. Furthermore, these significant changes were made: =over 4 =item * F<t/run/switches.t> no longer uses (and re-uses) the F<tmpinplace/> directory under F<t/>. This may prevent spurious failures. [L<GH #17424|https://github.com/Perl/perl5/issues/17424>] =item * Various bugs in C<POSIX::mbtowc> were fixed. Potential races with other threads are now avoided, and previously the returned wide character could well be garbage. =item * Various bugs in C<POSIX::wctomb> were fixed. Potential races with other threads are now avoided, and previously it would segfault if the string parameter was shared or hadn't been pre-allocated with a string of sufficient length to hold the result. =item * Certain test output of scalars containing control characters and Unicode has been fixed on EBCDIC. =item * F<t/charset_tools.pl>: Avoid some work on ASCII platforms. =item * F<t/re/regexp.t>: Speed up many regex tests on ASCII platform =item * F<t/re/pat.t>: Skip tests that don't work on EBCDIC. 
=back =head1 Platform Support =head2 Discontinued Platforms =over 4 =item Windows CE Support for building perl on Windows CE has now been removed. =back =head2 Platform-Specific Notes =over 4 =item Linux C<cc> will be used to populate C<plibpth> if C<cc> is C<clang>. [L<perl #134189|https://rt.perl.org/Ticket/Display.html?id=134189>] =item NetBSD 8.0 Fix compilation of Perl on NetBSD 8.0 with g++. [L<GH #17381|https://github.com/Perl/perl5/issues/17381>] =item Windows =over 4 =item * The configuration for C<ccflags> and C<optimize> are now separate, as with POSIX platforms. [L<GH #17156|https://github.com/Perl/perl5/issues/17156>] =item * Support for building perl with Visual C++ 6.0 has now been removed. =item * The locale tests could crash on Win32 due to a Windows bug, and separately due to the CRT throwing an exception if the locale name wasn't validly encoded in the current code page. For the second we now decode the locale name ourselves, and always decode it as UTF-8. [L<perl #133981|https://rt.perl.org/Ticket/Display.html?id=133981>] =item * F<t/op/magic.t> could fail if environment variables starting with C<FOO> already existed. =item * MYMALLOC (PERL_MALLOC) build has been fixed. =back =item Solaris =over 4 =item * C<Configure> will now find recent versions of the Oracle Developer Studio compiler, which are found under C</opt/developerstudio*>. =item * C<Configure> now uses the detected types for C<gethostby*> functions, allowing Perl to once again compile on certain configurations of Solaris. =back =item VMS =over 4 =item * With the release of the patch kit C99 V2.0, VSI has provided support for a number of previously-missing C99 features. On systems with that patch kit installed, Perl's configuration process will now detect the presence of the header C<stdint.h> and the following functions: C<fpclassify>, C<isblank>, C<isless>, C<llrint>, C<llrintl>, C<llround>, C<llroundl>, C<nearbyint>, C<round>, C<scalbn>, and C<scalbnl>. 

=item *

C<-Duse64bitint> is now the default on VMS.

=back

=item z/OS

Perl 5.32 has been tested on z/OS 2.4, with the following caveats:

=over 4

=item *

Only static builds (the default) build reliably.

=item *

When using locales, z/OS does not handle the C<LC_MESSAGES> category
properly, so when compiling perl, you should add the following to your
F<Configure> options:

  ./Configure <other options> -Accflags=-DNO_LOCALE_MESSAGES

=item *

z/OS does not support locales with threads, so when compiling a threaded
perl, you should add the following to your F<Configure> options:

  ./Configure <other Configure options> -Accflags=-DNO_LOCALE

=item *

Some CPAN modules that are shipped with perl fail at least one of their
self-tests. These are:
Archive::Tar,
Config::Perl::V,
CPAN::Meta,
CPAN::Meta::YAML,
Digest::MD5,
Digest::SHA,
Encode,
ExtUtils::MakeMaker,
ExtUtils::Manifest,
HTTP::Tiny,
IO::Compress,
IPC::Cmd,
JSON::PP,
libnet,
MIME::Base64,
Module::Metadata,
PerlIO::via-QuotedPrint,
Pod::Checker,
podlators,
Pod::Simple,
Socket,
and Test::Harness.

The causes of the failures range from a flawed self-test for a module
that actually works fine, up to a module that doesn't work at all on
EBCDIC platforms.

=back

=back

=head1 Internal Changes

=over 4

=item *

C<savepvn>'s len parameter is now a C<Size_t> instead of an C<I32> since we
can handle longer strings than 31 bits.

=item *

The lexer (C<Perl_yylex()> in F<toke.c>) was previously a single 4100-line
function, relying heavily on C<goto> and a lot of widely-scoped local
variables to do its work. It has now been pulled apart into a few dozen
smaller static functions; the largest remaining chunk
(C<yyl_word_or_keyword()>) is a little over 900 lines, and consists of a
single C<switch> statement, all of whose C<case> groups are independent.
This should be much easier to understand and maintain.
=item * The OS-level signal handlers and type (Sighandler_t) used by the perl core were declared as having three parameters, but the OS was always told to call them with one argument. This has been fixed by declaring them to have one parameter. See the merge commit C<v5.31.5-346-g116e19abbf> for full details. =item * The code that handles C<tr///> has been extensively revised, fixing various bugs, especially when the source and/or replacement strings contain characters whose code points are above 255. Some of the bugs were undocumented, one being that under some circumstances (but not all) with C</s>, the squeezing was done based on the source, rather than the replacement. A documented bug that got fixed was [L<perl #125493|https://rt.perl.org/Ticket/Display.html?id=125493>]. =item * A new macro for XS writers dealing with UTF-8-encoded Unicode strings has been created L<perlapi/C<UTF8_CHK_SKIP>> that is safer in the face of malformed UTF-8 input than L<perlapi/C<UTF8_SKIP>> (but not as safe as L<perlapi/C<UTF8_SAFE_SKIP>>). It won't read past a NUL character. It has been backported in L<Devel::PPPort> 3.55 and later. =item * Added the C<< PL_curstackinfo->si_cxsubix >> field. This records the stack index of the most recently pushed sub/format/eval context. It is set and restored automatically by C<cx_pushsub()>, C<cx_popsub()> etc., but would need to be manually managed if you do any unusual manipulation of the context stack. =item * Various macros dealing with character type classification and changing case where the input is encoded in UTF-8 now require an extra parameter to prevent potential reads beyond the end of the buffer. Use of these has generated a deprecation warning since Perl 5.26. Details are in L<perldeprecation/In XS code, use of various macros dealing with UTF-8.> =item * A new parser function L<parse_subsignature()|perlapi/parse_subsignature> allows a keyword plugin to parse a subroutine signature while C<use feature 'signatures'> is in effect. 
This allows custom keywords to implement semantics similar to regular C<sub> declarations that include signatures. [L<perl #132474|https://rt.perl.org/Ticket/Display.html?id=132474>] =item * Since on some platforms we need to hold a mutex when temporarily switching locales, new macros (C<STORE_LC_NUMERIC_SET_TO_NEEDED_IN>, C<WITH_LC_NUMERIC_SET_TO_NEEDED> and C<WITH_LC_NUMERIC_SET_TO_NEEDED_IN>) have been added to make it easier to do this safely and efficiently as part of [L<perl #134172|https://rt.perl.org/Ticket/Display.html?id=134172>]. =item * The memory bookkeeping overhead for allocating an OP structure has been reduced by 8 bytes per OP on 64-bit systems. =item * L<eval_pv()|perlapi/eval_pv> no longer stringifies the exception when C<croak_on_error> is true. [L<perl #134175|https://rt.perl.org/Ticket/Display.html?id=134175>] =item * The PERL_DESTRUCT_LEVEL environment variable was formerly only honoured on perl binaries built with DEBUGGING support. It is now checked on all perl builds. Its normal use is to force perl to individually free every block of memory which it has allocated before exiting, which is useful when using automated leak detection tools such as valgrind. =item * The API eval_sv() now accepts a C<G_RETHROW> flag. If this flag is set and an exception is thrown while compiling or executing the supplied code, it will be rethrown, and eval_sv() will not return. [L<perl #134177|https://rt.perl.org/Ticket/Display.html?id=134177>] =item * As part of the fix for [L<perl #2754|https://rt.perl.org/Ticket/Display.html?id=2754>] perl_parse() now returns non-zero if exit(0) is called in a C<BEGIN>, C<UNITCHECK> or C<CHECK> block. =item * Most functions which recursively walked an op tree during compilation have been made non-recursive. This avoids SEGVs from stack overflow when the op tree is deeply nested, such as C<$n == 1 ? "one" : $n == 2 ? "two" : ....> (especially in code which is auto-generated). 
This is particularly noticeable where the code is compiled within a separate thread, as threads tend to have small stacks by default. =back =head1 Selected Bug Fixes =over 4 =item * Previously L<perlfunc/require> would only treat the special built-in SV C<&PL_sv_undef> as a value in C<%INC> as if a previous C<require> has failed, treating other undefined SVs as if the previous C<require> has succeeded. This could cause unexpected success from C<require> e.g., on C<local %INC = %INC;>. This has been fixed. [L<GH #17428|https://github.com/Perl/perl5/issues/17428>] =item * C<(?{...})> eval groups in regular expressions no longer unintentionally trigger "EVAL without pos change exceeded limit in regex" [L<GH #17490|https://github.com/Perl/perl5/issues/17490>]. =item * C<(?[...])> extended bracketed character classes do not wrongly raise an error on some cases where a previously-compiled such class is interpolated into another. The heuristics previously used have been replaced by a reliable method, and hence the diagnostics generated have changed. See L</Diagnostics>. =item * The debug display (say by specifying C<-Dr> or S<C<use re>> (with appropriate options) of compiled Unicode property wildcard subpatterns no longer has extraneous output. =item * Fix an assertion failure in the regular expression engine. [L<GH #17372|https://github.com/Perl/perl5/issues/17372>] =item * Fix coredump in pp_hot.c after C<B::UNOP_AUX::aux_list()>. [L<GH #17301|https://github.com/Perl/perl5/issues/17301>] =item * Loading IO is now threadsafe. [L<GH #14816|https://github.com/Perl/perl5/issues/14816>] =item * C<\p{user-defined}> overrides official Unicode [L<GH #17025|https://github.com/Perl/perl5/issues/17025>] Prior to this patch, the override was only sometimes in effect. 
=item * Properly handle filled C</il> regnodes and multi-char folds =item * Compilation error during make minitest [L<GH #17293|https://github.com/Perl/perl5/issues/17293>] =item * Move the implementation of C<%->, C<%+> into core. =item * Read beyond buffer in C<grok_inf_nan> [L<GH #17370|https://github.com/Perl/perl5/issues/17370>] =item * Workaround glibc bug with C<LC_MESSAGES> [L<GH #17081|https://github.com/Perl/perl5/issues/17081>] =item * C<printf()> or C<sprintf()> with the C<%n> format could cause a panic on debugging builds, or report an incorrectly cached length value when producing C<SVfUTF8> flagged strings. [L<GH #17221|https://github.com/Perl/perl5/issues/17221>] =item * The tokenizer has been extensively refactored. [L<GH #17241|https://github.com/Perl/perl5/issues/17241>] [L<GH #17189|https://github.com/Perl/perl5/issues/17189>] =item * C<use strict "subs"> is now enforced for bareword constants optimized into a C<multiconcat> operator. [L<GH #17254|https://github.com/Perl/perl5/issues/17254>] =item * A memory leak in regular expression patterns has been fixed. [L<GH #17218|https://github.com/Perl/perl5/issues/17218>] =item * Perl no longer treats strings starting with "0x" or "0b" as hex or binary numbers respectively when converting a string to a number. This reverts a change in behaviour inadvertently introduced in perl 5.30.0 intended to improve precision when converting a string to a floating point number. [L<perl #134230|https://rt.perl.org/Ticket/Display.html?id=134230>] =item * Matching a non-C<SVf_UTF8> string against a regular expression containing unicode literals could leak a SV on each match attempt. [L<perl #134390|https://rt.perl.org/Ticket/Display.html?id=134390>] =item * Overloads for octal and binary floating point literals were always passed a string with a C<0x> prefix instead of the appropriate C<0> or C<0b> prefix. 
[L<perl #125557|https://rt.perl.org/Ticket/Display.html?id=125557>] =item * C<< $@ = 100; die; >> now correctly propagates the 100 as an exception instead of ignoring it. [L<perl #134291|https://rt.perl.org/Ticket/Display.html?id=134291>] =item * C<< 0 0x@ >> no longer asserts in S_no_op(). [L<perl #134310|https://rt.perl.org/Ticket/Display.html?id=134310>] =item * Exceptions thrown while C<$@> is read-only could result in infinite recursion as perl tried to update C<$@>, which throws another exception, resulting in a stack overflow. Perl now replaces C<$@> with a copy if it's not a simple writable SV. [L<perl #134266|https://rt.perl.org/Ticket/Display.html?id=134266>] =item * Setting C<$)> now properly sets supplementary group ids if you have the necessary privileges. [L<perl #134169|https://rt.perl.org/Ticket/Display.html?id=134169>] =item * close() on a pipe now preemptively clears the PerlIO object from the IO SV. This prevents a second attempt to close the already closed PerlIO object if a signal handler calls die() or exit() while close() is waiting for the child process to complete. [L<perl #122112|https://rt.perl.org/Ticket/Display.html?id=122112>] =item * C<< sprintf("%.*a", -10000, $x) >> would cause a buffer overflow due to mishandling of the negative precision value. [L<perl #134008|https://rt.perl.org/Ticket/Display.html?id=134008>] =item * scalar() on a reference could cause an erroneous assertion failure during compilation. [L<perl #134045|https://rt.perl.org/Ticket/Display.html?id=134045>] =item * C<%{^CAPTURE_ALL}> is now an alias to C<%-> as documented, rather than incorrectly an alias for C<%+>. [L<perl #131867|https://rt.perl.org/Ticket/Display.html?id=131867>] =item * C<%{^CAPTURE}> didn't work if C<@{^CAPTURE}> was mentioned first. Similarly for C<%{^CAPTURE_ALL}> and C<@{^CAPTURE_ALL}>, though C<@{^CAPTURE_ALL}> currently isn't used. 
[L<perl #134193|https://rt.perl.org/Ticket/Display.html?id=134193>]

=item *

Extraordinarily large (over 2GB) floating point format widths could
cause an integer overflow in the underlying call to snprintf(),
resulting in an assertion. Formatted floating point widths are now
limited to the range of int, the return value of snprintf().
[L<perl #133913|https://rt.perl.org/Ticket/Display.html?id=133913>]

=item *

Parsing the following constructs within a sub-parse (such as with
C<"${code here}"> or C<s/.../code here/e>) has changed to match how
they're parsed normally:

=over

=item *

C<print $fh ...> no longer produces a syntax error.

=item *

Code like C<s/.../ ${time} /e> now properly produces an "Ambiguous use
of ${time} resolved to $time at ..." warning when warnings are enabled.

=item *

C<@x {"a"}> (with the space) in a sub-parse now properly produces a
"better written as" warning when warnings are enabled.

=item *

Attributes can now be used in a sub-parse.
[L<perl #133850|https://rt.perl.org/Ticket/Display.html?id=133850>]

=back

=item *

Incomplete hex and binary literals like C<0x> and C<0b> are now
treated as if the C<x> or C<b> is part of the next token.
[L<perl #134125|https://rt.perl.org/Ticket/Display.html?id=134125>]

=item *

A spurious C<)> in a subparse, such as in C<s/.../code here/e> or
C<"...${code here}">, no longer confuses the parser.

Previously a subparse was bracketed with generated C<(> and C<)>
tokens, so a spurious C<)> would close the construct without doing the
normal subparse clean up, confusing the parser and possibly causing an
assertion failure.

Such constructs are now surrounded by artificial tokens that can't be
included in the source.
[L<perl #130585|https://rt.perl.org/Ticket/Display.html?id=130585>]

=item *

Reference assignment of a sub, such as C<\&foo = \&bar;>, silently did
nothing in the C<main::> package.
[L<perl #134072|https://rt.perl.org/Ticket/Display.html?id=134072>] =item * sv_gets() now recovers better if the target SV is modified by a signal handler. [L<perl #134035|https://rt.perl.org/Ticket/Display.html?id=134035>] =item * C<readline @foo> now evaluates C<@foo> in scalar context. Previously it would be evaluated in list context, and since readline() pops only one argument from the stack, the stack could underflow, or be left with unexpected values on the stack. [L<perl #133989|https://rt.perl.org/Ticket/Display.html?id=133989>] =item * Parsing incomplete hex or binary literals was changed in 5.31.1 to treat such a literal as just the 0, leaving the following C<x> or C<b> to be parsed as part of the next token. This could lead to some silent changes in behaviour, so now incomplete hex or binary literals produce a fatal error. [L<perl #134125|https://rt.perl.org/Ticket/Display.html?id=134125>] =item * eval_pv()'s I<croak_on_error> flag will now throw even if the exception is a false overloaded value. [L<perl #134177|https://rt.perl.org/Ticket/Display.html?id=134177>] =item * C<INIT> blocks and the program itself are no longer run if exit(0) is called within a C<BEGIN>, C<UNITCHECK> or C<CHECK> block. [L<perl #2754|https://rt.perl.org/Ticket/Display.html?id=2754>] =item * C<< open my $fh, ">>+", undef >> now opens the temporary file in append mode: writes will seek to the end of file before writing. [L<perl #134221|https://rt.perl.org/Ticket/Display.html?id=134221>] =item * Fixed a SEGV when searching for the source of an uninitialized value warning on an op whose subtree includes an OP_MULTIDEREF. [L<perl #134275|https://rt.perl.org/Ticket/Display.html?id=134275>] =back =head1 Obituary Jeff Goff (JGOFF or DrForr), an integral part of the Perl and Raku communities and a dear friend to all of us, has passed away on March 13th, 2020. 
DrForr was a prominent member of the communities, attending and speaking at countless events, contributing to numerous projects, and assisting and helping in any way he could. His passing leaves a hole in our hearts and in our communities and he will be sorely missed. =head1 Acknowledgements Perl 5.32.0 represents approximately 13 months of development since Perl 5.30.0 and contains approximately 220,000 lines of changes across 1,800 files from 89 authors. Excluding auto-generated files, documentation and release tools, there were approximately 140,000 lines of changes to 880 .pm, .t, .c and .h files. Perl continues to flourish into its fourth decade thanks to a vibrant community of users and developers. The following people are known to have contributed the improvements that became Perl 5.32.0: Aaron Crane, Alberto Simões, Alexandr Savca, Andreas König, Andrew Fresh, Andy Dougherty, Ask Bjørn Hansen, Atsushi Sugawara, Bernhard M. Wiedemann, brian d foy, Bryan Stenson, Chad Granum, Chase Whitener, Chris 'BinGOs' Williams, Craig A. Berry, Dagfinn Ilmari Mannsåker, Dan Book, Daniel Dragan, Dan Kogai, Dave Cross, Dave Rolsky, David Cantrell, David Mitchell, Dominic Hargreaves, E. 
Choroba, Felipe Gasper, Florian Weimer, Graham Knop, Håkon Hægland, Hauke D, H.Merijn Brand, Hugo van der Sanden, Ichinose Shogo, James E Keenan, Jason McIntosh, Jerome Duval, Johan Vromans, John Lightsey, John Paul Adrian Glaubitz, Kang-min Liu, Karen Etheridge, Karl Williamson, Leon Timmermans, Manuel Mausz, Marc Green, Matthew Horsfall, Matt Turner, Max Maischein, Michael Haardt, Nicholas Clark, Nicolas R., Niko Tyni, Pali, Paul Evans, Paul Johnson, Paul Marquess, Peter Eisentraut, Peter John Acklam, Peter Oliver, Petr Písař, Renee Baecker, Ricardo Signes, Richard Leach, Russ Allbery, Samuel Smith, Santtu Ojanperä, Sawyer X, Sergey Aleynikov, Sergiy Borodych, Shirakata Kentaro, Shlomi Fish, Sisyphus, Slaven Rezic, Smylers, Stefan Seifert, Steve Hay, Steve Peters, Svyatoslav, Thibault Duponchelle, Todd Rinaldo, Tomasz Konojacki, Tom Hukins, Tony Cook, Unicode Consortium, VanL, Vickenty Fesunov, Vitali Peil, Yves Orton, Zefram. The list above is almost certainly incomplete as it is automatically generated from version control history. In particular, it does not include the names of the (very much appreciated) contributors who reported issues to the Perl bug tracker. Many of the changes included in this version originated in the CPAN modules included in Perl's core. We're grateful to the entire CPAN community for helping Perl to flourish. For a more complete list of all of Perl's historical contributors, please see the F<AUTHORS> file in the Perl source distribution. =head1 Reporting Bugs If you find what you think is a bug, you might check the perl bug database at L<https://github.com/Perl/perl5/issues>. There may also be information at L<http://www.perl.org/>, the Perl Home Page. If you believe you have an unreported bug, please open an issue at L<https://github.com/Perl/perl5/issues>. Be sure to trim your bug down to a tiny but sufficient test case. 
If the bug you are reporting has security implications which make it
inappropriate to send to a public issue tracker, then see
L<perlsec/SECURITY VULNERABILITY CONTACT INFORMATION>
for details of how to report the issue.

=head1 Give Thanks

If you wish to thank the Perl 5 Porters for the work we have done in
Perl 5, you can do so by running the C<perlthanks> program:

    perlthanks

This will send an email to the Perl 5 Porters list with your show of thanks.

=head1 SEE ALSO

The F<Changes> file for an explanation of how to view exhaustive details on
what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=cut

=head1 NAME

perl581delta - what is new for perl v5.8.1

=head1 DESCRIPTION

This document describes differences between the 5.8.0 release and
the 5.8.1 release.

If you are upgrading from an earlier release such as 5.6.1, first read
the L<perl58delta>, which describes differences between 5.6.0 and
5.8.0.

In case you are wondering about 5.6.1, it was bug-fix-wise rather
identical to the development release 5.7.1. Confused? This timeline
hopefully helps a bit: it lists the new major releases, their maintenance
releases, and the development releases.

          New     Maintenance  Development

          5.6.0                             2000-Mar-22
                               5.7.0        2000-Sep-02
                  5.6.1                     2001-Apr-08
                               5.7.1        2001-Apr-09
                               5.7.2        2001-Jul-13
                               5.7.3        2002-Mar-05
          5.8.0                             2002-Jul-18
                  5.8.1                     2003-Sep-25

=head1 Incompatible Changes

=head2 Hash Randomisation

Mainly due to security reasons, the "random ordering" of hashes
has been made even more random. Previously while the order of hash
elements from keys(), values(), and each() was essentially random,
it was still repeatable. Now, however, the order varies between
different runs of Perl.

B<Perl has never guaranteed any ordering of the hash keys>, and the
ordering has already changed several times during the lifetime of
Perl 5.
Also, the ordering of hash keys has always been, and continues to be, affected by the insertion order. The added randomness may affect applications. One possible scenario is when output of an application has included hash data. For example, if you have used the Data::Dumper module to dump data into different files, and then compared the files to see whether the data has changed, now you will have false positives since the order in which hashes are dumped will vary. In general the cure is to sort the keys (or the values); in particular for Data::Dumper to use the C<Sortkeys> option. If some particular order is really important, use tied hashes: for example the Tie::IxHash module which by default preserves the order in which the hash elements were added. More subtle problem is reliance on the order of "global destruction". That is what happens at the end of execution: Perl destroys all data structures, including user data. If your destructors (the DESTROY subroutines) have assumed any particular ordering to the global destruction, there might be problems ahead. For example, in a destructor of one object you cannot assume that objects of any other class are still available, unless you hold a reference to them. If the environment variable PERL_DESTRUCT_LEVEL is set to a non-zero value, or if Perl is exiting a spawned thread, it will also destruct the ordinary references and the symbol tables that are no longer in use. You can't call a class method or an ordinary function on a class that has been collected that way. The hash randomisation is certain to reveal hidden assumptions about some particular ordering of hash elements, and outright bugs: it revealed a few bugs in the Perl core and core modules. 
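The cure described above, sorting the keys before hash data is written
out, might be sketched as follows. This is only an illustration and not
part of the original release notes: the helper names C<sorted_pairs> and
C<stable_dump> and the sample hash are hypothetical, while
C<$Data::Dumper::Sortkeys> is the real Data::Dumper setting mentioned
above.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical helper: walk a hash in a deterministic order by sorting
# its keys, instead of relying on the order returned by keys().
sub sorted_pairs {
    my ($hash) = @_;
    return map { "$_=$hash->{$_}" } sort keys %$hash;
}

# Hypothetical helper: dump a data structure with hash keys sorted, so
# that two dumps of the same data compare equal as strings.
sub stable_dump {
    my ($data) = @_;
    local $Data::Dumper::Sortkeys = 1;   # sort hash keys when dumping
    local $Data::Dumper::Indent   = 1;   # fixed, compact indentation
    return Dumper($data);
}

my %colour = (red => 1, green => 2, blue => 3);   # illustrative data

print join(",", sorted_pairs(\%colour)), "\n";    # blue=3,green=2,red=1
print stable_dump(\%colour);
```

Using C<local> keeps the Data::Dumper settings from leaking into other
code that may expect the defaults.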
To disable the hash randomisation at run time, set the environment
variable PERL_HASH_SEED to 0 (zero) before running Perl (for more
information see L<perlrun/PERL_HASH_SEED>), or to disable the feature
completely at compile time, compile with C<-DNO_HASH_SEED> (see F<INSTALL>).

See L<perlsec/"Algorithmic Complexity Attacks"> for the original
rationale behind this change.

=head2 UTF-8 On Filehandles No Longer Activated By Locale

In Perl 5.8.0 all filehandles, including the standard filehandles,
were implicitly set to be in Unicode UTF-8 if the locale settings
indicated the use of UTF-8. This feature caused too many problems,
so the feature was turned off and redesigned: see L</"Core Enhancements">.

=head2 Single-number v-strings are no longer v-strings before "=>"

The version strings or v-strings (see L<perldata/"Version Strings">)
feature introduced in Perl 5.6.0 has been a source of some confusion,
especially when the user did not want to use it, but Perl thought it
knew better. Especially troublesome has been the feature that before
a "=>" a version string (a "v" followed by digits) has been interpreted
as a v-string instead of a string literal. In other words:

	%h = ( v65 => 42 );

has meant since Perl 5.6.0

	%h = ( 'A' => 42 );

(at least on platforms of ASCII progeny). Perl 5.8.1 restores the
more natural interpretation

	%h = ( 'v65' => 42 );

The multi-number v-strings like v65.66 and 65.66.67 still continue to
be v-strings in Perl 5.8.

=head2 (Win32) The -C Switch Has Been Repurposed

The -C switch has changed in an incompatible way. The old semantics
of this switch only made sense in Win32 and only in the "use utf8"
universe in 5.6.x releases, and do not make sense for the Unicode
implementation in 5.8.0. Since this switch could not have been used
by anyone, it has been repurposed. The behavior that this switch
enabled in 5.6.x releases may be supported in a transparent,
data-dependent fashion in a future release.
For the new life of this switch, see L</"UTF-8 no longer default under UTF-8 locales">, and L<perlrun/-C>. =head2 (Win32) The /d Switch Of cmd.exe Perl 5.8.1 uses the /d switch when running the cmd.exe shell internally for system(), backticks, and when opening pipes to external programs. The extra switch disables the execution of AutoRun commands from the registry, which is generally considered undesirable when running external programs. If you wish to retain compatibility with the older behavior, set PERL5SHELL in your environment to C<cmd /x/c>. =head1 Core Enhancements =head2 UTF-8 no longer default under UTF-8 locales In Perl 5.8.0 many Unicode features were introduced. One of them was found to be of more nuisance than benefit: the automagic (and silent) "UTF-8-ification" of filehandles, including the standard filehandles, if the user's locale settings indicated use of UTF-8. For example, if you had C<en_US.UTF-8> as your locale, your STDIN and STDOUT were automatically "UTF-8", in other words an implicit binmode(..., ":utf8") was made. This meant that trying to print, say, chr(0xff), ended up printing the bytes 0xc3 0xbf. Hardly what you had in mind unless you were aware of this feature of Perl 5.8.0. The problem is that the vast majority of people weren't: for example in RedHat releases 8 and 9 the B<default> locale setting is UTF-8, so all RedHat users got UTF-8 filehandles, whether they wanted it or not. The pain was intensified by the Unicode implementation of Perl 5.8.0 (still) having nasty bugs, especially related to the use of s/// and tr///. (Bugs that have been fixed in 5.8.1) Therefore a decision was made to backtrack the feature and change it from implicit silent default to explicit conscious option. The new Perl command line option C<-C> and its counterpart environment variable PERL_UNICODE can now be used to control how Perl and Unicode interact at interfaces like I/O and for example the command line arguments. 
See L<perlrun/-C> and L<perlrun/PERL_UNICODE> for more information.

=head2 Unsafe signals again available

In Perl 5.8.0 the so-called "safe signals" were introduced. This
means that Perl no longer handles signals immediately but instead
"between opcodes", when it is safe to do so. The earlier immediate
handling could easily corrupt the internal state of Perl, resulting
in mysterious crashes.

However, the new safer model has its problems too. Because an
opcode, a basic unit of Perl execution, is now never interrupted but
is instead left to run to completion, certain operations that can take
a long time now really do take a long time. For example, certain
network operations have their own blocking and timeout mechanisms, and
being able to interrupt them immediately would be nice.

Therefore perl 5.8.1 introduces a "backdoor" to restore the pre-5.8.0
(pre-5.7.3, really) signal behaviour. Just set the environment variable
PERL_SIGNALS to C<unsafe>, and the old immediate (and unsafe)
signal handling behaviour returns. See L<perlrun/PERL_SIGNALS>
and L<perlipc/"Deferred Signals (Safe Signals)">.

In completely unrelated news, you can now use safe signals with
POSIX::SigAction. See L<POSIX/POSIX::SigAction>.

=head2 Tied Arrays with Negative Array Indices

Formerly, the indices passed to C<FETCH>, C<STORE>, C<EXISTS>, and
C<DELETE> methods in a tied array class were always non-negative. If
the actual argument was negative, Perl would call FETCHSIZE implicitly
and add the result to the index before passing the result to the tied
array method. This behaviour is now optional. If the tied array class
contains a package variable named C<$NEGATIVE_INDICES> which is set to
a true value, negative values will be passed to C<FETCH>, C<STORE>,
C<EXISTS>, and C<DELETE> unchanged.

=head2 local ${$x}

The syntaxes

	local ${$x}
	local @{$x}
	local %{$x}

now do localise variables, given that $x is a valid variable name.
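As a rough sketch of the C<local ${$x}> form, assuming that C<$x> holds
the name of a package variable and is used as a symbolic reference (so
C<strict 'refs'> has to be relaxed where it is used): the helper
C<with_local> and the variable C<$greeting> below are made up for the
example.

```perl
use strict;
use warnings;

our $greeting = "outer";          # package variable, $main::greeting

sub current { return $main::greeting }

# Hypothetical helper: temporarily give the scalar named by $name a new
# value, run $code, and let the old value return when the scope ends.
sub with_local {
    my ($name, $value, $code) = @_;
    no strict 'refs';             # $name is a symbolic reference
    local ${$name} = $value;      # localises the named package scalar
    return $code->();
}

my $x = "main::greeting";         # a valid variable name, as a string

print with_local($x, "inner", \&current), "\n";   # inner
print current(), "\n";                            # outer
```

The same pattern applies to C<local @{$x}> and C<local %{$x}> for arrays
and hashes named by C<$x>.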

=head2 Unicode Character Database 4.0.0

The copy of the Unicode Character Database included in Perl 5.8 has
been updated to 4.0.0 from 3.2.0. This means for example that the
Unicode character properties are as in Unicode 4.0.0.

=head2 Deprecation Warnings

There is one new feature deprecation. Perl 5.8.0 forgot to add
some deprecation warnings; these warnings have now been added.
Finally, a reminder of an impending feature removal.

=head3 (Reminder) Pseudo-hashes are deprecated (really)

Pseudo-hashes were deprecated in Perl 5.8.0 and will be removed in
Perl 5.10.0; see L<perl58delta> for details. Each attempt to access
pseudo-hashes will trigger the warning C<Pseudo-hashes are deprecated>.
If you really want to continue using pseudo-hashes but not to see the
deprecation warnings, use:

    no warnings 'deprecated';

Or you can continue to use the L<fields> pragma, but please don't
expect the data structures to be pseudohashes any more.

=head3 (Reminder) 5.005-style threads are deprecated (really)

5.005-style threads (activated by C<use Thread;>) were deprecated in
Perl 5.8.0 and will be removed after Perl 5.8; see L<perl58delta> for
details. Each 5.005-style thread creation will trigger the warning
C<5.005 threads are deprecated>. If you really want to continue
using the 5.005 threads but not to see the deprecation warnings, use:

    no warnings 'deprecated';

=head3 (Reminder) The $* variable is deprecated (really)

The C<$*> variable controlling multi-line matching has been deprecated
and will be removed after 5.8. The variable has been deprecated for a
long time, and a deprecation warning C<Use of $* is deprecated> is given;
now the variable will finally be removed. The functionality has been
supplanted by the C</s> and C</m> modifiers on pattern matching.
If you really want to continue using the C<$*>-variable but not to see
the deprecation warnings, use:

    no warnings 'deprecated';

=head2 Miscellaneous Enhancements

C<map> in void context is no longer expensive.
C<map> is now context aware, and will not construct a list if called in void context. If a socket gets closed by the server while printing to it, the client now gets a SIGPIPE. While this new feature was not planned, it fell naturally out of PerlIO changes, and is to be considered an accidental feature. PerlIO::get_layers(FH) returns the names of the PerlIO layers active on a filehandle. PerlIO::via layers can now have an optional UTF8 method to indicate whether the layer wants to "auto-:utf8" the stream. utf8::is_utf8() has been added as a quick way to test whether a scalar is encoded internally in UTF-8 (Unicode). =head1 Modules and Pragmata =head2 Updated Modules And Pragmata The following modules and pragmata have been updated since Perl 5.8.0: =over 4 =item base =item B::Bytecode In much better shape than it used to be. Still far from perfect, but maybe worth a try. =item B::Concise =item B::Deparse =item Benchmark An optional feature, C<:hireswallclock>, now allows for high resolution wall clock times (uses Time::HiRes). =item ByteLoader See B::Bytecode. =item bytes Now has bytes::substr. =item CGI =item charnames One can now have custom character name aliases. =item CPAN There is now a simple command line frontend to the CPAN.pm module called F<cpan>. =item Data::Dumper A new option, Pair, allows choosing the separator between hash keys and values. =item DB_File =item Devel::PPPort =item Digest::MD5 =item Encode Significant updates on the encoding pragma functionality (tr/// and the DATA filehandle, formats). If a filehandle has been marked as to have an encoding, unmappable characters are detected already during input, not later (when the corrupted data is being used). The ISO 8859-6 conversion table has been corrected (the 0x30..0x39 erroneously mapped to U+0660..U+0669, instead of U+0030..U+0039). The GSM 03.38 conversion did not handle escape sequences correctly. The UTF-7 encoding has been added (making Encode feature-complete with Unicode::String). 
=item fields =item libnet =item Math::BigInt A lot of bugs have been fixed since v1.60, the version included in Perl v5.8.0. Especially noteworthy are the bug in Calc that caused div and mod to fail for some large values, and the fixes to the handling of bad inputs. Some new features were added, e.g. the broot() method, you can now pass parameters to config() to change some settings at runtime, and it is now possible to trap the creation of NaN and infinity. As usual, some optimizations took place and made the math overall a tad faster. In some cases, quite a lot faster, actually. Especially alternative libraries like Math::BigInt::GMP benefit from this. In addition, a lot of the quite clunky routines like fsqrt() and flog() are now much much faster. =item MIME::Base64 =item NEXT Diamond inheritance now works. =item Net::Ping =item PerlIO::scalar Reading from non-string scalars (like the special variables, see L<perlvar>) now works. =item podlators =item Pod::LaTeX =item PodParsers =item Pod::Perldoc Complete rewrite. As a side-effect, no longer refuses to startup when run by root. =item Scalar::Util New utilities: refaddr, isvstring, looks_like_number, set_prototype. =item Storable Can now store code references (via B::Deparse, so not foolproof). =item strict Earlier versions of the strict pragma did not check the parameters implicitly passed to its "import" (use) and "unimport" (no) routine. This caused the false idiom such as: use strict qw(@ISA); @ISA = qw(Foo); This however (probably) raised the false expectation that the strict refs, vars and subs were being enforced (and that @ISA was somehow "declared"). But the strict refs, vars, and subs are B<not> enforced when using this false idiom. Starting from Perl 5.8.1, the above B<will> cause an error to be raised. This may cause programs which used to execute seemingly correctly without warnings and errors to fail when run under 5.8.1. 
This happens because use strict qw(@ISA); will now fail with the error: Unknown 'strict' tag(s) '@ISA' The remedy to this problem is to replace this code with the correct idiom: use strict; use vars qw(@ISA); @ISA = qw(Foo); =item Term::ANSIcolor =item Test::Harness Now much more picky about extra or missing output from test scripts. =item Test::More =item Test::Simple =item Text::Balanced =item Time::HiRes Use of nanosleep(), if available, allows mixing subsecond sleeps with alarms. =item threads Several fixes, for example for join() problems and memory leaks. In some platforms (like Linux) that use glibc the minimum memory footprint of one ithread has been reduced by several hundred kilobytes. =item threads::shared Many memory leaks have been fixed. =item Unicode::Collate =item Unicode::Normalize =item Win32::GetFolderPath =item Win32::GetOSVersion Now returns extra information. =back =head1 Utility Changes The C<h2xs> utility now produces a more modern layout: F<Foo-Bar/lib/Foo/Bar.pm> instead of F<Foo/Bar/Bar.pm>. Also, the boilerplate test is now called F<t/Foo-Bar.t> instead of F<t/1.t>. The Perl debugger (F<lib/perl5db.pl>) has now been extensively documented and bugs found while documenting have been fixed. C<perldoc> has been rewritten from scratch to be more robust and feature rich. C<perlcc -B> works now at least somewhat better, while C<perlcc -c> is rather more broken. (The Perl compiler suite as a whole continues to be experimental.) =head1 New Documentation perl573delta has been added to list the differences between the (now quite obsolete) development releases 5.7.2 and 5.7.3. perl58delta has been added: it is the perldelta of 5.8.0, detailing the differences between 5.6.0 and 5.8.0. perlartistic has been added: it is the Artistic License in pod format, making it easier for modules to refer to it. perlcheat has been added: it is a Perl cheat sheet. 
perlgpl has been added: it is the GNU General Public License in pod format, making it easier for modules to refer to it. perlmacosx has been added to tell about the installation and use of Perl in Mac OS X. perlos400 has been added to tell about the installation and use of Perl in OS/400 PASE. perlreref has been added: it is a regular expressions quick reference. =head1 Installation and Configuration Improvements The Unix standard Perl location, F</usr/bin/perl>, is no longer overwritten by default if it exists. This change was very prudent because so many Unix vendors already provide a F</usr/bin/perl>, but simultaneously many system utilities may depend on that exact version of Perl, so better not to overwrite it. One can now specify installation directories for site and vendor man and HTML pages, and site and vendor scripts. See F<INSTALL>. One can now specify a destination directory for Perl installation by specifying the DESTDIR variable for C<make install>. (This feature is slightly different from the previous C<Configure -Dinstallprefix=...>.) See F<INSTALL>. gcc versions 3.x introduced a new warning that caused a lot of noise during Perl compilation: C<gcc -Ialreadyknowndirectory (warning: changing search order)>. This warning has now been avoided by Configure weeding out such directories before the compilation. One can now build subsets of Perl core modules by using the Configure flags C<-Dnoextensions=...> and C<-Donlyextensions=...>, see F<INSTALL>. =head2 Platform-specific enhancements In Cygwin Perl can now be built with threads (C<Configure -Duseithreads>). This works with both Cygwin 1.3.22 and Cygwin 1.5.3. In newer FreeBSD releases Perl 5.8.0 compilation failed because of trying to use F<malloc.h>, which in FreeBSD is just a dummy file, and a fatal error to even try to use. Now F<malloc.h> is not used. Perl is now known to build also in Hitachi HI-UXMPP. Perl is now known to build again in LynxOS. 
Mac OS X now installs with Perl version number embedded in installation directory names for easier upgrading of user-compiled Perl, and the installation directories in general are more standard. In other words, the default installation no longer breaks the Apple-provided Perl. On the other hand, with C<Configure -Dprefix=/usr> you can now really replace the Apple-supplied Perl (B<please be careful>). Mac OS X now builds Perl statically by default. This change was done mainly for faster startup times. The Apple-provided Perl is still dynamically linked and shared, and you can enable the sharedness for your own Perl builds by C<Configure -Duseshrplib>. Perl has been ported to IBM's OS/400 PASE environment. The best way to build a Perl for PASE is to use an AIX host as a cross-compilation environment. See README.os400. Yet another cross-compilation option has been added: now Perl builds on OpenZaurus, a Linux distribution based on Mandrake + Embedix for the Sharp Zaurus PDA. See the Cross/README file. Tru64 when using gcc 3 drops the optimisation for F<toke.c> to C<-O2> because of gigantic memory use with the default C<-O3>. Tru64 can now build Perl with the newer Berkeley DBs. Building Perl on WinCE has been much enhanced, see F<README.ce> and F<README.perlce>. =head1 Selected Bug Fixes =head2 Closures, eval and lexicals There have been many fixes in the area of anonymous subs, lexicals and closures. Although this means that Perl is now more "correct", it is possible that some existing code will break that happens to rely on the faulty behaviour. In practice this is unlikely unless your code contains a very complex nesting of anonymous subs, evals and lexicals. =head2 Generic fixes If an input filehandle is marked C<:utf8> and Perl sees illegal UTF-8 coming in when doing C<< <FH> >>, if warnings are enabled a warning is immediately given - instead of being silent about it and Perl being unhappy about the broken data later. 
(The C<:encoding(utf8)> layer also works the same way.) binmode(SOCKET, ":utf8") only worked on the input side, not on the output side of the socket. Now it works both ways. For threaded Perls certain system database functions like getpwent() and getgrent() now grow their result buffer dynamically, instead of failing. This means that at sites with lots of users and groups the functions no longer fail by returning only partial results. Perl 5.8.0 had accidentally broken the capability for users to define their own uppercase<->lowercase Unicode mappings (as advertised by the Camel). This feature has been fixed and is also documented better. In 5.8.0 this $some_unicode .= <FH>; didn't work correctly but instead corrupted the data. This has now been fixed. Tied methods like FETCH etc. may now safely access tied values, i.e. resulting in a recursive call to FETCH etc. Remember to break the recursion, though. At startup Perl blocks the SIGFPE signal away since there isn't much Perl can do about it. Previously this blocking was in effect also for programs executed from within Perl. Now Perl restores the original SIGFPE handling routine, whatever it was, before running external programs. Linenumbers in Perl scripts may now be greater than 65536, or 2**16. (Perl scripts have always been able to be larger than that, it's just that the linenumber for reported errors and warnings have "wrapped around".) While scripts that large usually indicate a need to rethink your code a bit, such Perl scripts do exist, for example as results from generated code. Now linenumbers can go all the way to 4294967296, or 2**32. =head2 Platform-specific fixes Linux =over 4 =item * Setting $0 works again (with certain limitations that Perl cannot do much about: see L<perlvar/$0>) =back HP-UX =over 4 =item * Setting $0 now works. =back VMS =over 4 =item * Configuration now tests for the presence of C<poll()>, and IO::Poll now uses the vendor-supplied function if detected. 
=item * A rare access violation at Perl start-up could occur if the Perl image was installed with privileges or if there was an identifier with the subsystem attribute set in the process's rightslist. Either of these circumstances triggered tainting code that contained a pointer bug. The faulty pointer arithmetic has been fixed. =item * The length limit on values (not keys) in the %ENV hash has been raised from 255 bytes to 32640 bytes (except when the PERL_ENV_TABLES setting overrides the default use of logical names for %ENV). If it is necessary to access these long values from outside Perl, be aware that they are implemented using search list logical names that store the value in pieces, each 255-byte piece (up to 128 of them) being an element in the search list. When doing a lookup in %ENV from within Perl, the elements are combined into a single value. The existing VMS-specific ability to access individual elements of a search list logical name via the $ENV{'foo;N'} syntax (where N is the search list index) is unimpaired. =item * The piping implementation now uses local rather than global DCL symbols for inter-process communication. =item * File::Find could become confused when navigating to a relative directory whose name collided with a logical name. This problem has been corrected by adding directory syntax to relative path names, thus preventing logical name translation. =back Win32 =over 4 =item * A memory leak in the fork() emulation has been fixed. =item * The return value of the ioctl() built-in function was accidentally broken in 5.8.0. This has been corrected. =item * The internal message loop executed by perl during blocking operations sometimes interfered with messages that were external to Perl. This often resulted in blocking operations terminating prematurely or returning incorrect results, when Perl was executing under environments that could generate Windows messages. This has been corrected. 
=item *

Pipes and sockets are now automatically in binary mode.

=item *

The four-argument form of select() did not preserve $! (errno) properly
when there were errors in the underlying call.  This is now fixed.

=item *

The "CR CR LF" problem has been fixed; binmode(FH, ":crlf")
is now effectively a no-op.

=back

=head1 New or Changed Diagnostics

All the warnings related to pack() and unpack() were made more
informative and consistent.

=head2 Changed "A thread exited while %d threads were running"

The old version

    A thread exited while %d other threads were still running

was misleading because the "other" also included the thread giving
the warning.

=head2 Removed "Attempt to clear a restricted hash"

It is not illegal to clear a restricted hash, so the warning
was removed.

=head2 New "Illegal declaration of anonymous subroutine"

You must specify the block of code for C<sub>.

=head2 Changed "Invalid range "%s" in transliteration operator"

The old version

    Invalid [] range "%s" in transliteration operator

was simply wrong because there are no "[] ranges" in tr///.

=head2 New "Missing control char name in \c"

Self-explanatory.

=head2 New "Newline in left-justified string for %s"

The padding spaces would appear after the newline, which is
probably not what you had in mind.

=head2 New "Possible precedence problem on bitwise %c operator"

If you think this

    $x & $y == 0

tests whether the bitwise AND of $x and $y is zero,
you will like this warning.

=head2 New "Pseudo-hashes are deprecated"

This warning should have been already in 5.8.0, since they are.

=head2 New "read() on %s filehandle %s"

You cannot read() (or sysread()) from a closed or unopened filehandle.

=head2 New "5.005 threads are deprecated"

This warning should have been already in 5.8.0, since they are.

=head2 New "Tied variable freed while still in use"

Something pulled the plug on a live tied variable; Perl plays
safe by bailing out.
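To make the "Possible precedence problem on bitwise %c operator" diagnostic listed above more concrete, here is a small sketch. The helper names C<naive_test()> and C<intended_test()> are made up for this example. Because C<==> binds more tightly than C<&>, the unparenthesised form is parsed as C<$x & ($y == 0)>, which is rarely what was meant.

```perl
use strict;
use warnings;

# Parsed as $x & ($y == 0).  The warning is silenced here only so that
# the example runs quietly; normally you would want to see it.
sub naive_test {
    my ($x, $y) = @_;
    no warnings 'precedence';
    return $x & $y == 0;
}

# Explicit parentheses give the test most people intend.
sub intended_test {
    my ($x, $y) = @_;
    return ($x & $y) == 0;
}

my $naive    = naive_test(6, 1)    ? "true" : "false";
my $intended = intended_test(6, 1) ? "true" : "false";
print "naive: $naive, intended: $intended\n";
```

With the values 6 and 1 the bitwise AND really is zero, yet only the parenthesised version reports that.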
=head2 New "To%s: illegal mapping '%s'"

An illegal user-defined Unicode casemapping was specified.

=head2 New "Use of freed value in iteration"

Something modified the values being iterated over.  This is not good.

=head1 Changed Internals

These changes matter to you only if you either write XS code or like to
know about or hack Perl internals (using Devel::Peek or any of the
C<B::> modules counts), or like to run Perl with the C<-D> option.

The embedding examples of L<perlembed> have been reviewed to be
up to date and consistent: for example, the correct use of
PERL_SYS_INIT3() and PERL_SYS_TERM().

Extensive reworking of the pad code (the code responsible
for lexical variables) has been conducted by Dave Mitchell.

Extensive work on the v-strings by John Peacock.

UTF-8 length and position cache: to speed up the handling of Unicode
(UTF-8) scalars, a cache was introduced.  Potential problems exist if
an extension bypasses the official APIs and directly modifies the PV
of an SV: the UTF-8 cache does not get cleared as it should.

APIs obsoleted in Perl 5.8.0, like sv_2pv, sv_catpvn, sv_catsv,
sv_setsv, are again available.

Certain Perl core C APIs like cxinc and regatom are no longer
available at all to code outside the Perl core or the Perl core
extensions.  This is intentional.  They never should have been
available with the shorter names, and if your application depends on
them, you should (be ashamed and) contact perl5-porters to discuss
what are the proper APIs.

Certain Perl core C APIs like C<Perl_list> are no longer available
without their C<Perl_> prefix.  If your XS module stops working
because some functions cannot be found, in many cases a simple fix is
to add the C<Perl_> prefix to the function and the thread context
C<aTHX_> as the first argument of the function call.  This is also how
it should always have been done: letting the Perl_-less forms leak
from the core was an accident.
For cleaner embedding you can also force this for all APIs by defining at compile time the cpp define PERL_NO_SHORT_NAMES. Perl_save_bool() has been added. Regexp objects (those created with C<qr>) now have S-magic rather than R-magic. This fixed regexps of the form /...(??{...;$x})/ to no longer ignore changes made to $x. The S-magic avoids dropping the caching optimization and making (??{...}) constructs obscenely slow (and consequently useless). See also L<perlguts/"Magic Variables">. Regexp::Copy was affected by this change. The Perl internal debugging macros DEBUG() and DEB() have been renamed to PERL_DEBUG() and PERL_DEB() to avoid namespace conflicts. C<-DL> removed (the leaktest had been broken and unsupported for years, use alternative debugging mallocs or tools like valgrind and Purify). Verbose modifier C<v> added for C<-DXv> and C<-Dsv>, see L<perlrun>. =head1 New Tests In Perl 5.8.0 there were about 69000 separate tests in about 700 test files, in Perl 5.8.1 there are about 77000 separate tests in about 780 test files. The exact numbers depend on the Perl configuration and on the operating system platform. =head1 Known Problems The hash randomisation mentioned in L</Incompatible Changes> is definitely problematic: it will wake dormant bugs and shake out bad assumptions. If you want to use mod_perl 2.x with Perl 5.8.1, you will need mod_perl-1.99_10 or higher. Earlier versions of mod_perl 2.x do not work with the randomised hashes. (mod_perl 1.x works fine.) You will also need Apache::Test 1.04 or higher. Many of the rarer platforms that worked 100% or pretty close to it with perl 5.8.0 have been left a little bit untended since their maintainers have been otherwise busy lately, and therefore there will be more failures on those platforms. Such platforms include Mac OS Classic, IBM z/OS (and other EBCDIC platforms), and NetWare. 
The most common Perl platforms (Unix and Unix-like, Microsoft platforms, and VMS) have large enough testing and expert population that they are doing well. =head2 Tied hashes in scalar context Tied hashes do not currently return anything useful in scalar context, for example when used as boolean tests: if (%tied_hash) { ... } The current nonsensical behaviour is always to return false, regardless of whether the hash is empty or has elements. The root cause is that there is no interface for the implementors of tied hashes to implement the behaviour of a hash in scalar context. =head2 Net::Ping 450_service and 510_ping_udp failures The subtests 9 and 18 of lib/Net/Ping/t/450_service.t, and the subtest 2 of lib/Net/Ping/t/510_ping_udp.t might fail if you have an unusual networking setup. For example in the latter case the test is trying to send a UDP ping to the IP address 127.0.0.1. =head2 B::C The C-generating compiler backend B::C (the frontend being C<perlcc -c>) is even more broken than it used to be because of the extensive lexical variable changes. (The good news is that B::Bytecode and ByteLoader are better than they used to be.) =head1 Platform Specific Problems =head2 EBCDIC Platforms IBM z/OS and other EBCDIC platforms continue to be problematic regarding Unicode support. Many Unicode tests are skipped when they really should be fixed. =head2 Cygwin 1.5 problems In Cygwin 1.5 the F<io/tell> and F<op/sysio> tests have failures for some yet unknown reason. In 1.5.5 the threads tests stress_cv, stress_re, and stress_string are failing unless the environment variable PERLIO is set to "perlio" (which makes also the io/tell failure go away). Perl 5.8.1 does build and work well with Cygwin 1.3: with (uname -a) C<CYGWIN_NT-5.0 ... 1.3.22(0.78/3/2) 2003-03-18 09:20 i686 ...> a 100% "make test" was achieved with C<Configure -des -Duseithreads>. =head2 HP-UX: HP cc warnings about sendfile and sendpath With certain HP C compiler releases (e.g. 
B.11.11.02) you will get many warnings like this (lines wrapped for easier reading): cc: "/usr/include/sys/socket.h", line 504: warning 562: Redeclaration of "sendfile" with a different storage class specifier: "sendfile" will have internal linkage. cc: "/usr/include/sys/socket.h", line 505: warning 562: Redeclaration of "sendpath" with a different storage class specifier: "sendpath" will have internal linkage. The warnings show up both during the build of Perl and during certain lib/ExtUtils tests that invoke the C compiler. The warning, however, is not serious and can be ignored. =head2 IRIX: t/uni/tr_7jis.t falsely failing The test t/uni/tr_7jis.t is known to report failure under 'make test' or the test harness with certain releases of IRIX (at least IRIX 6.5 and MIPSpro Compilers Version 7.3.1.1m), but if run manually the test fully passes. =head2 Mac OS X: no usemymalloc The Perl malloc (C<-Dusemymalloc>) does not work at all in Mac OS X. This is not that serious, though, since the native malloc works just fine. =head2 Tru64: No threaded builds with GNU cc (gcc) In the latest Tru64 releases (e.g. v5.1B or later) gcc cannot be used to compile a threaded Perl (-Duseithreads) because the system C<< <pthread.h> >> file doesn't know about gcc. =head2 Win32: sysopen, sysread, syswrite As of the 5.8.0 release, sysopen()/sysread()/syswrite() do not behave like they used to in 5.6.1 and earlier with respect to "text" mode. These built-ins now always operate in "binary" mode (even if sysopen() was passed the O_TEXT flag, or if binmode() was used on the file handle). Note that this issue should only make a difference for disk files, as sockets and pipes have always been in "binary" mode in the Windows port. As this behavior is currently considered a bug, compatible behavior may be re-introduced in a future release. Until then, the use of sysopen(), sysread() and syswrite() is not supported for "text" mode operations. 
=head1 Future Directions The following things B<might> happen in future. The first publicly available releases having these characteristics will be the developer releases Perl 5.9.x, culminating in the Perl 5.10.0 release. These are our best guesses at the moment: we reserve the right to rethink. =over 4 =item * PerlIO will become The Default. Currently (in Perl 5.8.x) the stdio library is still used if Perl thinks it can use certain tricks to make stdio go B<really> fast. For future releases our goal is to make PerlIO go even faster. =item * A new feature called I<assertions> will be available. This means that one can have code called assertions sprinkled in the code: usually they are optimised away, but they can be enabled with the C<-A> option. =item * A new operator C<//> (defined-or) will be available. This means that one will be able to say $a // $b instead of defined $a ? $a : $b and $c //= $d; instead of $c = $d unless defined $c; The operator will have the same precedence and associativity as C<||>. A source code patch against the Perl 5.8.1 sources will be available in CPAN as F<authors/id/H/HM/HMBRAND/dor-5.8.1.diff>. =item * C<unpack()> will default to unpacking the C<$_>. =item * Various Copy-On-Write techniques will be investigated in hopes of speeding up Perl. =item * CPANPLUS, Inline, and Module::Build will become core modules. =item * The ability to write true lexically scoped pragmas will be introduced. =item * Work will continue on the bytecompiler and byteloader. =item * v-strings as they currently exist are scheduled to be deprecated. The v-less form (1.2.3) will become a "version object" when used with C<use>, C<require>, and C<$VERSION>. $^V will also be a "version object" so the printf("%vd",...) construct will no longer be needed. The v-ful version (v1.2.3) will become obsolete. The equivalence of strings and v-strings (e.g. that currently 5.8.0 is equal to "\5\8\0") will go away. 
B<There may be no deprecation warning for v-strings>, though: it is quite
hard to detect when v-strings are being used safely, and when they are not.

=item *

5.005 Threads Will Be Removed

=item *

The C<$*> Variable Will Be Removed
(it was deprecated a long time ago)

=item *

Pseudohashes Will Be Removed

=back

=head1 Reporting Bugs

If you find what you think is a bug, you might check the articles
recently posted to the comp.lang.perl.misc newsgroup and the perl
bug database at http://bugs.perl.org/ .  There may also be
information at http://www.perl.com/ , the Perl Home Page.

If you believe you have an unreported bug, please run the B<perlbug>
program included with your release.  Be sure to trim your bug down
to a tiny but sufficient test case.  Your bug report, along with the
output of C<perl -V>, will be sent off to perlbug@perl.org to be
analysed by the Perl porting team.  You can browse and search
the Perl 5 bugs at http://bugs.perl.org/

=head1 SEE ALSO

The F<Changes> file for exhaustive details on what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=cut

=encoding utf8

=head1 NAME

perlebcdic - Considerations for running Perl on EBCDIC platforms

=head1 DESCRIPTION

An exploration of some of the issues facing Perl programmers
on EBCDIC based computers.

Portions of this document that are still incomplete are marked with XXX.

Early Perl versions worked on some EBCDIC machines, but the last known
version that ran on EBCDIC was v5.8.7, until v5.22, when the Perl core
again works on z/OS.  Theoretically, it could work on OS/400 or Siemens'
BS2000 (or their successors), but this is untested.  In v5.22 and 5.24,
not all the modules found on CPAN but shipped with core Perl work on
z/OS.

If you want to use Perl on a non-z/OS EBCDIC machine, please let us know
at L<https://github.com/Perl/perl5/issues>.
Writing Perl on an EBCDIC platform is really no different than writing on an L</ASCII> one, but with different underlying numbers, as we'll see shortly. You'll have to know something about those L</ASCII> platforms because the documentation is biased and will frequently use example numbers that don't apply to EBCDIC. There are also very few CPAN modules that are written for EBCDIC and which don't work on ASCII; instead the vast majority of CPAN modules are written for ASCII, and some may happen to work on EBCDIC, while a few have been designed to portably work on both. If your code just uses the 52 letters A-Z and a-z, plus SPACE, the digits 0-9, and the punctuation characters that Perl uses, plus a few controls that are denoted by escape sequences like C<\n> and C<\t>, then there's nothing special about using Perl, and your code may very well work on an ASCII machine without change. But if you write code that uses C<\005> to mean a TAB or C<\xC1> to mean an "A", or C<\xDF> to mean a "E<yuml>" (small C<"y"> with a diaeresis), then your code may well work on your EBCDIC platform, but not on an ASCII one. That's fine to do if no one will ever want to run your code on an ASCII platform; but the bias in this document will be towards writing code portable between EBCDIC and ASCII systems. Again, if every character you care about is easily enterable from your keyboard, you don't have to know anything about ASCII, but many keyboards don't easily allow you to directly enter, say, the character C<\xDF>, so you have to specify it indirectly, such as by using the C<"\xDF"> escape sequence. In those cases it's easiest to know something about the ASCII/Unicode character sets. If you know that the small "E<yuml>" is C<U+00FF>, then you can instead specify it as C<"\N{U+FF}">, and have the computer automatically translate it to C<\xDF> on your platform, and leave it as C<\xFF> on ASCII ones. 
Or you could specify it by name, C<\N{LATIN SMALL LETTER Y WITH DIAERESIS}>
and not have to know the numbers.  Either way works, but both require
familiarity with Unicode.

=head1 COMMON CHARACTER CODE SETS

=head2 ASCII

The American Standard Code for Information Interchange (ASCII or
US-ASCII) is a set of integers running from 0 to 127 (decimal) that
have standardized interpretations by the computers which use ASCII.
For example, 65 means the letter "A".
The range 0..127 can be covered by setting various bits in a 7-bit
binary number, hence the set is sometimes referred to as "7-bit ASCII".
ASCII was described by the American National Standards Institute
document ANSI X3.4-1986.  It was also described by ISO 646:1991
(with localization for currency symbols).  The full ASCII set is
given in the table L<below|/recipe 3> as the first 128 elements.
Languages that can be written adequately with the characters in ASCII
include English, Hawaiian, Indonesian, Swahili and some Native American
languages.

Most non-EBCDIC character sets are supersets of ASCII.  That is, the
integers 0-127 mean what ASCII says they mean.  But integers 128 and
above are specific to the character set.

Many of these fit entirely into 8 bits, using ASCII as 0-127, while
specifying what 128-255 mean, and not using anything above 255.
Thus, these are single-byte (or octet if you prefer) character sets.
One important one (since Unicode is a superset of it) is the ISO 8859-1
character set.

=head2 ISO 8859

The ISO 8859-I<B<$n>> are a collection of character code sets from the
International Organization for Standardization (ISO), each of which adds
characters to the ASCII set that are typically found in various
languages, many of which are based on the Roman, or Latin, alphabet.
Most are for European languages, but there are also ones for Arabic,
Greek, Hebrew, and Thai.  There are good references on the web about
all these.
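The point that values 128 and above depend on the particular character set can be checked with the core C<Encode> module. The following is only a sketch, assuming an ASCII platform on which C<Encode>'s single-byte tables for ISO 8859-1 and ISO 8859-7 are installed; C<code_point_of()> is a hypothetical helper written for this example, not part of any module.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Return the Unicode code point that a single byte denotes in the
# named character set.
sub code_point_of {
    my ($charset, $byte) = @_;
    my $octets = chr $byte;
    return ord decode($charset, $octets);
}

# 0x41 is in the ASCII range, so both sets agree on it.
my $a_latin1 = code_point_of('iso-8859-1', 0x41);
my $a_greek  = code_point_of('iso-8859-7', 0x41);

# 0xE9 is above 127, so its meaning depends on the set.
my $e9_latin1 = code_point_of('iso-8859-1', 0xE9);
my $e9_greek  = code_point_of('iso-8859-7', 0xE9);

printf "0x41: U+%04X / U+%04X\n", $a_latin1, $a_greek;
printf "0xE9: U+%04X / U+%04X\n", $e9_latin1, $e9_greek;
```

Under these assumptions the byte 0x41 is "A" in both sets, while 0xE9 is LATIN SMALL LETTER E WITH ACUTE in ISO 8859-1 and GREEK SMALL LETTER IOTA in ISO 8859-7.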
=head2 Latin 1 (ISO 8859-1) A particular 8-bit extension to ASCII that includes grave and acute accented Latin characters. Languages that can employ ISO 8859-1 include all the languages covered by ASCII as well as Afrikaans, Albanian, Basque, Catalan, Danish, Faroese, Finnish, Norwegian, Portuguese, Spanish, and Swedish. Dutch is covered albeit without the ij ligature. French is covered too but without the oe ligature. German can use ISO 8859-1 but must do so without German-style quotation marks. This set is based on Western European extensions to ASCII and is commonly encountered in world wide web work. In IBM character code set identification terminology, ISO 8859-1 is also known as CCSID 819 (or sometimes 0819 or even 00819). =head2 EBCDIC The Extended Binary Coded Decimal Interchange Code refers to a large collection of single- and multi-byte coded character sets that are quite different from ASCII and ISO 8859-1, and are all slightly different from each other; they typically run on host computers. The EBCDIC encodings derive from 8-bit byte extensions of Hollerith punched card encodings, which long predate ASCII. The layout on the cards was such that high bits were set for the upper and lower case alphabetic characters C<[a-z]> and C<[A-Z]>, but there were gaps within each Latin alphabet range, visible in the table L<below|/recipe 3>. These gaps can cause complications. Some IBM EBCDIC character sets may be known by character code set identification numbers (CCSID numbers) or code page numbers. Perl can be compiled on platforms that run any of three commonly used EBCDIC character sets, listed below. =head3 The 13 variant characters Among IBM EBCDIC character code sets there are 13 characters that are often mapped to different integer values. Those characters are known as the 13 "variant" characters and are: \ [ ] { } ^ ~ ! 
# | $ @ ` When Perl is compiled for a platform, it looks at all of these characters to guess which EBCDIC character set the platform uses, and adapts itself accordingly to that platform. If the platform uses a character set that is not one of the three Perl knows about, Perl will either fail to compile, or mistakenly and silently choose one of the three. The Line Feed (LF) character is actually a 14th variant character, and Perl checks for that as well. =head3 EBCDIC code sets recognized by Perl =over =item B<0037> Character code set ID 0037 is a mapping of the ASCII plus Latin-1 characters (i.e. ISO 8859-1) to an EBCDIC set. 0037 is used in North American English locales on the OS/400 operating system that runs on AS/400 computers. CCSID 0037 differs from ISO 8859-1 in 236 places; in other words they agree on only 20 code point values. =item B<1047> Character code set ID 1047 is also a mapping of the ASCII plus Latin-1 characters (i.e. ISO 8859-1) to an EBCDIC set. 1047 is used under Unix System Services for OS/390 or z/OS, and OpenEdition for VM/ESA. CCSID 1047 differs from CCSID 0037 in eight places, and from ISO 8859-1 in 236. =item B<POSIX-BC> The EBCDIC code page in use on Siemens' BS2000 system is distinct from 1047 and 0037. It is identified below as the POSIX-BC set. Like 0037 and 1047, it is the same as ISO 8859-1 in 20 code point values. =back =head2 Unicode code points versus EBCDIC code points In Unicode terminology a I<code point> is the number assigned to a character: for example, in EBCDIC the character "A" is usually assigned the number 193. In Unicode, the character "A" is assigned the number 65. All the code points in ASCII and Latin-1 (ISO 8859-1) have the same meaning in Unicode. All three of the recognized EBCDIC code sets have 256 code points, and in each code set, all 256 code points are mapped to equivalent Latin1 code points. 
Obviously, "A" will map to "A", "B" => "B", "%" => "%", etc., for all printable characters in Latin1 and these code pages. It also turns out that EBCDIC has nearly precise equivalents for the ASCII/Latin1 C0 controls and the DELETE control. (The C0 controls are those whose ASCII code points are 0..0x1F; things like TAB, ACK, BEL, etc.) A mapping is set up between these ASCII/EBCDIC controls. There isn't such a precise mapping between the C1 controls on ASCII platforms and the remaining EBCDIC controls. What has been done is to map these controls, mostly arbitrarily, to some otherwise unmatched character in the other character set. Most of these are very very rarely used nowadays in EBCDIC anyway, and their names have been dropped, without much complaint. For example the EO (Eight Ones) EBCDIC control (consisting of eight one bits = 0xFF) is mapped to the C1 APC control (0x9F), and you can't use the name "EO". The EBCDIC controls provide three possible line terminator characters, CR (0x0D), LF (0x25), and NL (0x15). On ASCII platforms, the symbols "NL" and "LF" refer to the same character, but in strict EBCDIC terminology they are different ones. The EBCDIC NL is mapped to the C1 control called "NEL" ("Next Line"; here's a case where the mapping makes quite a bit of sense, and hence isn't just arbitrary). On some EBCDIC platforms, this NL or NEL is the typical line terminator. This is true of z/OS and BS2000. In these platforms, the C compilers will swap the LF and NEL code points, so that C<"\n"> is 0x15, and refers to NL. Perl does that too; you can see it in the code chart L<below|/recipe 3>. This makes things generally "just work" without you even having to be aware that there is a swap. =head2 Unicode and UTF UTF stands for "Unicode Transformation Format". UTF-8 is an encoding of Unicode into a sequence of 8-bit byte chunks, based on ASCII and Latin-1. 
The length of a sequence required to represent a Unicode code point depends on the ordinal number of that code point, with larger numbers requiring more bytes. UTF-EBCDIC is like UTF-8, but based on EBCDIC. They are enough alike that often, casual usage will conflate the two terms, and use "UTF-8" to mean both the UTF-8 found on ASCII platforms, and the UTF-EBCDIC found on EBCDIC ones. You may see the term "invariant" character or code point. This simply means that the character has the same numeric value and representation when encoded in UTF-8 (or UTF-EBCDIC) as when not. (Note that this is a very different concept from L</The 13 variant characters> mentioned above. Careful prose will use the term "UTF-8 invariant" instead of just "invariant", but most often you'll see just "invariant".) For example, the ordinal value of "A" is 193 in most EBCDIC code pages, and also is 193 when encoded in UTF-EBCDIC. All UTF-8 (or UTF-EBCDIC) variant code points occupy at least two bytes when encoded in UTF-8 (or UTF-EBCDIC); by definition, the UTF-8 (or UTF-EBCDIC) invariant code points are exactly one byte whether encoded in UTF-8 (or UTF-EBCDIC), or not. (By now you see why people typically just say "UTF-8" when they also mean "UTF-EBCDIC". For the rest of this document, we'll mostly be casual about it too.) In ASCII UTF-8, the code points corresponding to the lowest 128 ordinal numbers (0 - 127: the ASCII characters) are invariant. In UTF-EBCDIC, there are 160 invariant characters. (If you care, the EBCDIC invariants are those characters which have ASCII equivalents, plus those that correspond to the C1 controls (128 - 159 on ASCII platforms).) A string encoded in UTF-EBCDIC may be longer (very rarely shorter) than one encoded in UTF-8. Perl extends both UTF-8 and UTF-EBCDIC so that they can encode code points above the Unicode maximum of U+10FFFF. Both extensions are constructed to allow encoding of any code point that fits in a 64-bit word. 
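
To see the invariant/variant distinction in practice, here is a minimal
sketch that reports how many bytes a code point occupies once encoded. The
helper name C<encoded_length> is an invention for this example, and the
byte counts in the comments assume an ASCII platform; on an EBCDIC build
the same code would report UTF-EBCDIC lengths instead.

```perl
use strict;
use warnings;

# Hypothetical helper: number of bytes a code point occupies once it is
# encoded in the platform's UTF-8 (UTF-EBCDIC on EBCDIC builds).
sub encoded_length {
    my ($code_point) = @_;
    my $string = chr($code_point);
    utf8::encode($string);    # core function: characters become octets
    return length($string);
}

# Assumption: ASCII platform, where code points 0..127 are invariant
# (one byte) and every other code point needs at least two bytes.
for my $code_point (0x41, 0x7F, 0x80, 0xE9, 0x20AC, 0x1F600) {
    printf "U+%04X -> %d byte(s)\n",
        $code_point, encoded_length($code_point);
}
```

A code point is invariant exactly when this helper returns 1 for it, so
looping over 0..255 and counting the results of 1 is one way to check the
128 (ASCII) or 160 (EBCDIC) figures quoted above on a given platform.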
UTF-EBCDIC is defined by L<Unicode Technical Report #16|https://www.unicode.org/reports/tr16> (often referred to as just TR16). It is defined based on CCSID 1047, not allowing for the differences for other code pages. This allows for easy interchange of text between computers running different code pages, but makes it unusable, without adaptation, for Perl on those other code pages. The reason for this unusability is that a fundamental assumption of Perl is that the characters it cares about for parsing and lexical analysis are the same whether or not the text is in UTF-8. For example, Perl expects the character C<"["> to have the same representation, no matter if the string containing it (or program text) is UTF-8 encoded or not. To ensure this, Perl adapts UTF-EBCDIC to the particular code page so that all characters it expects to be UTF-8 invariant are in fact UTF-8 invariant. This means that text generated on a computer running one version of Perl's UTF-EBCDIC has to be translated to be intelligible to a computer running another. TR16 implies a method to extend UTF-EBCDIC to encode points up through S<C<2 ** 31 - 1>>. Perl uses this method for code points up through S<C<2 ** 30 - 1>>, but uses an incompatible method for larger ones, to enable it to handle much larger code points than otherwise. =head2 Using Encode Starting from Perl 5.8 you can use the standard module Encode to translate from EBCDIC to Latin-1 code points. Encode knows about more EBCDIC character sets than Perl can currently be compiled to run on. 

    use Encode 'from_to';

    my %ebcdic = ( 176 => 'cp37', 95 => 'cp1047', 106 => 'posix-bc' );

    # $a is in EBCDIC code points
    from_to($a, $ebcdic{ord '^'}, 'latin1');
    # $a is ISO 8859-1 code points

and from Latin-1 code points to EBCDIC code points

    use Encode 'from_to';

    my %ebcdic = ( 176 => 'cp37', 95 => 'cp1047', 106 => 'posix-bc' );

    # $a is ISO 8859-1 code points
    from_to($a, 'latin1', $ebcdic{ord '^'});
    # $a is in EBCDIC code points

For doing I/O it is suggested that you use the autotranslating features
of PerlIO, see L<perluniintro>.

Since version 5.8 Perl uses the PerlIO I/O library. This enables
you to use different encodings per IO channel. For example you may use

    use Encode;
    open(my $f, ">:encoding(ascii)", "test.ascii")
        or die "Could not open test.ascii: $!";
    print $f "Hello World!\n";
    open($f, ">:encoding(cp37)", "test.ebcdic")
        or die "Could not open test.ebcdic: $!";
    print $f "Hello World!\n";
    open($f, ">:encoding(latin1)", "test.latin1")
        or die "Could not open test.latin1: $!";
    print $f "Hello World!\n";
    open($f, ">:encoding(utf8)", "test.utf8")
        or die "Could not open test.utf8: $!";
    print $f "Hello World!\n";

to get four files containing "Hello World!\n" in ASCII, CP 0037 EBCDIC,
ISO 8859-1 (Latin-1) (in this example identical to ASCII since only ASCII
characters were printed), and
UTF-EBCDIC (in this example identical to normal EBCDIC since only characters
that don't differ between EBCDIC and UTF-EBCDIC were printed). See the
documentation of L<Encode::PerlIO> for details.

As the PerlIO layer uses raw IO (bytes) internally, all this totally
ignores things like the type of your filesystem (ASCII or EBCDIC).

=head1 SINGLE OCTET TABLES

The following tables list the ASCII and Latin 1 ordered sets including
the subsets: C0 controls (0x00..0x1F), ASCII graphics (0x20..0x7E),
delete (0x7F), C1 controls (0x80..0x9F), and Latin-1 (a.k.a. ISO 8859-1)
(0xA0..0xFF).
In the table, the Latin 1 extensions to ASCII have been labelled with
character names roughly corresponding to I<The Unicode Standard, Version
6.1>, albeit with substitutions such as C<s/LATIN//> and C<s/VULGAR//> in
all cases; S<C<s/CAPITAL LETTER//>> in some cases; and
S<C<s/SMALL LETTER ([A-Z])/\l$1/>> in some other
cases. Controls are listed using their Unicode 6.2 abbreviations.
The differences between the 0037 and 1047 sets are
flagged with C<**>. The differences between the 1047 and POSIX-BC sets
are flagged with C<##>. All C<ord()> numbers listed are decimal. If you
would rather see this table listing octal values, then run the table
(that is, the pod source text of this document, since this recipe may not
work with a pod2_other_format translation) through:

=over 4

=item recipe 0

=back

    perl -ne 'if(/(.{29})(\d+)\s+(\d+)\s+(\d+)\s+(\d+)/)' \
     -e '{printf("%s%-5.03o%-5.03o%-5.03o%.03o\n",$1,$2,$3,$4,$5)}' \
     perlebcdic.pod

If you want to retain the UTF-x code points then in script form you
might want to write:

=over 4

=item recipe 1

=back

    open(my $fh, '<', 'perlebcdic.pod')
        or die "Could not open perlebcdic.pod: $!";
    while (<$fh>) {
        if (/(.{29})(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\.?(\d*)
                                                         \s+(\d+)\.?(\d*)/x)
        {
            if ($7 ne '' && $9 ne '') {
                printf(
                   "%s%-5.03o%-5.03o%-5.03o%-5.03o%-3o.%-5o%-3o.%.03o\n",
                                                $1,$2,$3,$4,$5,$6,$7,$8,$9);
            }
            elsif ($7 ne '') {
                printf("%s%-5.03o%-5.03o%-5.03o%-5.03o%-3o.%-5o%.03o\n",
                                                   $1,$2,$3,$4,$5,$6,$7,$8);
            }
            else {
                printf("%s%-5.03o%-5.03o%-5.03o%-5.03o%-5.03o%.03o\n",
                                                        $1,$2,$3,$4,$5,$6,$8);
            }
        }
    }

If you would rather see this table listing hexadecimal values then
run the table through:

=over 4

=item recipe 2

=back

    perl -ne 'if(/(.{29})(\d+)\s+(\d+)\s+(\d+)\s+(\d+)/)' \
     -e '{printf("%s%-5.02X%-5.02X%-5.02X%.02X\n",$1,$2,$3,$4,$5)}' \
     perlebcdic.pod

Or, in order to retain the UTF-x code points in hexadecimal:

=over 4

=item recipe 3

=back

    open(my $fh, '<', 'perlebcdic.pod')
        or die "Could not open perlebcdic.pod: $!";
    while (<$fh>) {
        if
(/(.{29})(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\s+(\d+)\.?(\d*) \s+(\d+)\.?(\d*)/x) { if ($7 ne '' && $9 ne '') { printf( "%s%-5.02X%-5.02X%-5.02X%-5.02X%-2X.%-6.02X%02X.%02X\n", $1,$2,$3,$4,$5,$6,$7,$8,$9); } elsif ($7 ne '') { printf("%s%-5.02X%-5.02X%-5.02X%-5.02X%-2X.%-6.02X%02X\n", $1,$2,$3,$4,$5,$6,$7,$8); } else { printf("%s%-5.02X%-5.02X%-5.02X%-5.02X%-5.02X%02X\n", $1,$2,$3,$4,$5,$6,$8); } } } ISO 8859-1 POS- CCSID CCSID CCSID CCSID IX- 1047 chr 0819 0037 1047 BC UTF-8 UTF-EBCDIC --------------------------------------------------------------------- <NUL> 0 0 0 0 0 0 <SOH> 1 1 1 1 1 1 <STX> 2 2 2 2 2 2 <ETX> 3 3 3 3 3 3 <EOT> 4 55 55 55 4 55 <ENQ> 5 45 45 45 5 45 <ACK> 6 46 46 46 6 46 <BEL> 7 47 47 47 7 47 <BS> 8 22 22 22 8 22 <HT> 9 5 5 5 9 5 <LF> 10 37 21 21 10 21 ** <VT> 11 11 11 11 11 11 <FF> 12 12 12 12 12 12 <CR> 13 13 13 13 13 13 <SO> 14 14 14 14 14 14 <SI> 15 15 15 15 15 15 <DLE> 16 16 16 16 16 16 <DC1> 17 17 17 17 17 17 <DC2> 18 18 18 18 18 18 <DC3> 19 19 19 19 19 19 <DC4> 20 60 60 60 20 60 <NAK> 21 61 61 61 21 61 <SYN> 22 50 50 50 22 50 <ETB> 23 38 38 38 23 38 <CAN> 24 24 24 24 24 24 <EOM> 25 25 25 25 25 25 <SUB> 26 63 63 63 26 63 <ESC> 27 39 39 39 27 39 <FS> 28 28 28 28 28 28 <GS> 29 29 29 29 29 29 <RS> 30 30 30 30 30 30 <US> 31 31 31 31 31 31 <SPACE> 32 64 64 64 32 64 ! 33 90 90 90 33 90 " 34 127 127 127 34 127 # 35 123 123 123 35 123 $ 36 91 91 91 36 91 % 37 108 108 108 37 108 & 38 80 80 80 38 80 ' 39 125 125 125 39 125 ( 40 77 77 77 40 77 ) 41 93 93 93 41 93 * 42 92 92 92 42 92 + 43 78 78 78 43 78 , 44 107 107 107 44 107 - 45 96 96 96 45 96 . 46 75 75 75 46 75 / 47 97 97 97 47 97 0 48 240 240 240 48 240 1 49 241 241 241 49 241 2 50 242 242 242 50 242 3 51 243 243 243 51 243 4 52 244 244 244 52 244 5 53 245 245 245 53 245 6 54 246 246 246 54 246 7 55 247 247 247 55 247 8 56 248 248 248 56 248 9 57 249 249 249 57 249 : 58 122 122 122 58 122 ; 59 94 94 94 59 94 < 60 76 76 76 60 76 = 61 126 126 126 61 126 > 62 110 110 110 62 110 ? 
63 111 111 111 63 111 @ 64 124 124 124 64 124 A 65 193 193 193 65 193 B 66 194 194 194 66 194 C 67 195 195 195 67 195 D 68 196 196 196 68 196 E 69 197 197 197 69 197 F 70 198 198 198 70 198 G 71 199 199 199 71 199 H 72 200 200 200 72 200 I 73 201 201 201 73 201 J 74 209 209 209 74 209 K 75 210 210 210 75 210 L 76 211 211 211 76 211 M 77 212 212 212 77 212 N 78 213 213 213 78 213 O 79 214 214 214 79 214 P 80 215 215 215 80 215 Q 81 216 216 216 81 216 R 82 217 217 217 82 217 S 83 226 226 226 83 226 T 84 227 227 227 84 227 U 85 228 228 228 85 228 V 86 229 229 229 86 229 W 87 230 230 230 87 230 X 88 231 231 231 88 231 Y 89 232 232 232 89 232 Z 90 233 233 233 90 233 [ 91 186 173 187 91 173 ** ## \ 92 224 224 188 92 224 ## ] 93 187 189 189 93 189 ** ^ 94 176 95 106 94 95 ** ## _ 95 109 109 109 95 109 ` 96 121 121 74 96 121 ## a 97 129 129 129 97 129 b 98 130 130 130 98 130 c 99 131 131 131 99 131 d 100 132 132 132 100 132 e 101 133 133 133 101 133 f 102 134 134 134 102 134 g 103 135 135 135 103 135 h 104 136 136 136 104 136 i 105 137 137 137 105 137 j 106 145 145 145 106 145 k 107 146 146 146 107 146 l 108 147 147 147 108 147 m 109 148 148 148 109 148 n 110 149 149 149 110 149 o 111 150 150 150 111 150 p 112 151 151 151 112 151 q 113 152 152 152 113 152 r 114 153 153 153 114 153 s 115 162 162 162 115 162 t 116 163 163 163 116 163 u 117 164 164 164 117 164 v 118 165 165 165 118 165 w 119 166 166 166 119 166 x 120 167 167 167 120 167 y 121 168 168 168 121 168 z 122 169 169 169 122 169 { 123 192 192 251 123 192 ## | 124 79 79 79 124 79 } 125 208 208 253 125 208 ## ~ 126 161 161 255 126 161 ## <DEL> 127 7 7 7 127 7 <PAD> 128 32 32 32 194.128 32 <HOP> 129 33 33 33 194.129 33 <BPH> 130 34 34 34 194.130 34 <NBH> 131 35 35 35 194.131 35 <IND> 132 36 36 36 194.132 36 <NEL> 133 21 37 37 194.133 37 ** <SSA> 134 6 6 6 194.134 6 <ESA> 135 23 23 23 194.135 23 <HTS> 136 40 40 40 194.136 40 <HTJ> 137 41 41 41 194.137 41 <VTS> 138 42 42 42 194.138 42 <PLD> 139 43 43 43 194.139 43 <PLU> 
140 44 44 44 194.140 44 <RI> 141 9 9 9 194.141 9 <SS2> 142 10 10 10 194.142 10 <SS3> 143 27 27 27 194.143 27 <DCS> 144 48 48 48 194.144 48 <PU1> 145 49 49 49 194.145 49 <PU2> 146 26 26 26 194.146 26 <STS> 147 51 51 51 194.147 51 <CCH> 148 52 52 52 194.148 52 <MW> 149 53 53 53 194.149 53 <SPA> 150 54 54 54 194.150 54 <EPA> 151 8 8 8 194.151 8 <SOS> 152 56 56 56 194.152 56 <SGC> 153 57 57 57 194.153 57 <SCI> 154 58 58 58 194.154 58 <CSI> 155 59 59 59 194.155 59 <ST> 156 4 4 4 194.156 4 <OSC> 157 20 20 20 194.157 20 <PM> 158 62 62 62 194.158 62 <APC> 159 255 255 95 194.159 255 ## <NON-BREAKING SPACE> 160 65 65 65 194.160 128.65 <INVERTED "!" > 161 170 170 170 194.161 128.66 <CENT SIGN> 162 74 74 176 194.162 128.67 ## <POUND SIGN> 163 177 177 177 194.163 128.68 <CURRENCY SIGN> 164 159 159 159 194.164 128.69 <YEN SIGN> 165 178 178 178 194.165 128.70 <BROKEN BAR> 166 106 106 208 194.166 128.71 ## <SECTION SIGN> 167 181 181 181 194.167 128.72 <DIAERESIS> 168 189 187 121 194.168 128.73 ** ## <COPYRIGHT SIGN> 169 180 180 180 194.169 128.74 <FEMININE ORDINAL> 170 154 154 154 194.170 128.81 <LEFT POINTING GUILLEMET> 171 138 138 138 194.171 128.82 <NOT SIGN> 172 95 176 186 194.172 128.83 ** ## <SOFT HYPHEN> 173 202 202 202 194.173 128.84 <REGISTERED TRADE MARK> 174 175 175 175 194.174 128.85 <MACRON> 175 188 188 161 194.175 128.86 ## <DEGREE SIGN> 176 144 144 144 194.176 128.87 <PLUS-OR-MINUS SIGN> 177 143 143 143 194.177 128.88 <SUPERSCRIPT TWO> 178 234 234 234 194.178 128.89 <SUPERSCRIPT THREE> 179 250 250 250 194.179 128.98 <ACUTE ACCENT> 180 190 190 190 194.180 128.99 <MICRO SIGN> 181 160 160 160 194.181 128.100 <PARAGRAPH SIGN> 182 182 182 182 194.182 128.101 <MIDDLE DOT> 183 179 179 179 194.183 128.102 <CEDILLA> 184 157 157 157 194.184 128.103 <SUPERSCRIPT ONE> 185 218 218 218 194.185 128.104 <MASC. 
ORDINAL INDICATOR> 186 155 155 155 194.186 128.105 <RIGHT POINTING GUILLEMET> 187 139 139 139 194.187 128.106 <FRACTION ONE QUARTER> 188 183 183 183 194.188 128.112 <FRACTION ONE HALF> 189 184 184 184 194.189 128.113 <FRACTION THREE QUARTERS> 190 185 185 185 194.190 128.114 <INVERTED QUESTION MARK> 191 171 171 171 194.191 128.115 <A WITH GRAVE> 192 100 100 100 195.128 138.65 <A WITH ACUTE> 193 101 101 101 195.129 138.66 <A WITH CIRCUMFLEX> 194 98 98 98 195.130 138.67 <A WITH TILDE> 195 102 102 102 195.131 138.68 <A WITH DIAERESIS> 196 99 99 99 195.132 138.69 <A WITH RING ABOVE> 197 103 103 103 195.133 138.70 <CAPITAL LIGATURE AE> 198 158 158 158 195.134 138.71 <C WITH CEDILLA> 199 104 104 104 195.135 138.72 <E WITH GRAVE> 200 116 116 116 195.136 138.73 <E WITH ACUTE> 201 113 113 113 195.137 138.74 <E WITH CIRCUMFLEX> 202 114 114 114 195.138 138.81 <E WITH DIAERESIS> 203 115 115 115 195.139 138.82 <I WITH GRAVE> 204 120 120 120 195.140 138.83 <I WITH ACUTE> 205 117 117 117 195.141 138.84 <I WITH CIRCUMFLEX> 206 118 118 118 195.142 138.85 <I WITH DIAERESIS> 207 119 119 119 195.143 138.86 <CAPITAL LETTER ETH> 208 172 172 172 195.144 138.87 <N WITH TILDE> 209 105 105 105 195.145 138.88 <O WITH GRAVE> 210 237 237 237 195.146 138.89 <O WITH ACUTE> 211 238 238 238 195.147 138.98 <O WITH CIRCUMFLEX> 212 235 235 235 195.148 138.99 <O WITH TILDE> 213 239 239 239 195.149 138.100 <O WITH DIAERESIS> 214 236 236 236 195.150 138.101 <MULTIPLICATION SIGN> 215 191 191 191 195.151 138.102 <O WITH STROKE> 216 128 128 128 195.152 138.103 <U WITH GRAVE> 217 253 253 224 195.153 138.104 ## <U WITH ACUTE> 218 254 254 254 195.154 138.105 <U WITH CIRCUMFLEX> 219 251 251 221 195.155 138.106 ## <U WITH DIAERESIS> 220 252 252 252 195.156 138.112 <Y WITH ACUTE> 221 173 186 173 195.157 138.113 ** ## <CAPITAL LETTER THORN> 222 174 174 174 195.158 138.114 <SMALL LETTER SHARP S> 223 89 89 89 195.159 138.115 <a WITH GRAVE> 224 68 68 68 195.160 139.65 <a WITH ACUTE> 225 69 69 69 195.161 139.66 <a 
WITH CIRCUMFLEX> 226 66 66 66 195.162 139.67 <a WITH TILDE> 227 70 70 70 195.163 139.68 <a WITH DIAERESIS> 228 67 67 67 195.164 139.69 <a WITH RING ABOVE> 229 71 71 71 195.165 139.70 <SMALL LIGATURE ae> 230 156 156 156 195.166 139.71 <c WITH CEDILLA> 231 72 72 72 195.167 139.72 <e WITH GRAVE> 232 84 84 84 195.168 139.73 <e WITH ACUTE> 233 81 81 81 195.169 139.74 <e WITH CIRCUMFLEX> 234 82 82 82 195.170 139.81 <e WITH DIAERESIS> 235 83 83 83 195.171 139.82 <i WITH GRAVE> 236 88 88 88 195.172 139.83 <i WITH ACUTE> 237 85 85 85 195.173 139.84 <i WITH CIRCUMFLEX> 238 86 86 86 195.174 139.85 <i WITH DIAERESIS> 239 87 87 87 195.175 139.86 <SMALL LETTER eth> 240 140 140 140 195.176 139.87 <n WITH TILDE> 241 73 73 73 195.177 139.88 <o WITH GRAVE> 242 205 205 205 195.178 139.89 <o WITH ACUTE> 243 206 206 206 195.179 139.98 <o WITH CIRCUMFLEX> 244 203 203 203 195.180 139.99 <o WITH TILDE> 245 207 207 207 195.181 139.100 <o WITH DIAERESIS> 246 204 204 204 195.182 139.101 <DIVISION SIGN> 247 225 225 225 195.183 139.102 <o WITH STROKE> 248 112 112 112 195.184 139.103 <u WITH GRAVE> 249 221 221 192 195.185 139.104 ## <u WITH ACUTE> 250 222 222 222 195.186 139.105 <u WITH CIRCUMFLEX> 251 219 219 219 195.187 139.106 <u WITH DIAERESIS> 252 220 220 220 195.188 139.112 <y WITH ACUTE> 253 141 141 141 195.189 139.113 <SMALL LETTER thorn> 254 142 142 142 195.190 139.114 <y WITH DIAERESIS> 255 223 223 223 195.191 139.115 If you would rather see the above table in CCSID 0037 order rather than ASCII + Latin-1 order then run the table through: =over 4 =item recipe 4 =back perl \ -ne 'if(/.{29}\d{1,3}\s{2,4}\d{1,3}\s{2,4}\d{1,3}\s{2,4}\d{1,3}/)'\ -e '{push(@l,$_)}' \ -e 'END{print map{$_->[0]}' \ -e ' sort{$a->[1] <=> $b->[1]}' \ -e ' map{[$_,substr($_,34,3)]}@l;}' perlebcdic.pod If you would rather see it in CCSID 1047 order then change the number 34 in the last line to 39, like this: =over 4 =item recipe 5 =back perl \ -ne 'if(/.{29}\d{1,3}\s{2,4}\d{1,3}\s{2,4}\d{1,3}\s{2,4}\d{1,3}/)'\ -e 
'{push(@l,$_)}' \ -e 'END{print map{$_->[0]}' \ -e ' sort{$a->[1] <=> $b->[1]}' \ -e ' map{[$_,substr($_,39,3)]}@l;}' perlebcdic.pod If you would rather see it in POSIX-BC order then change the number 34 in the last line to 44, like this: =over 4 =item recipe 6 =back perl \ -ne 'if(/.{29}\d{1,3}\s{2,4}\d{1,3}\s{2,4}\d{1,3}\s{2,4}\d{1,3}/)'\ -e '{push(@l,$_)}' \ -e 'END{print map{$_->[0]}' \ -e ' sort{$a->[1] <=> $b->[1]}' \ -e ' map{[$_,substr($_,44,3)]}@l;}' perlebcdic.pod =head2 Table in hex, sorted in 1047 order Since this document was first written, the convention has become more and more to use hexadecimal notation for code points. To do this with the recipes and to also sort is a multi-step process, so here, for convenience, is the table from above, re-sorted to be in Code Page 1047 order, and using hex notation. ISO 8859-1 POS- CCSID CCSID CCSID CCSID IX- 1047 chr 0819 0037 1047 BC UTF-8 UTF-EBCDIC --------------------------------------------------------------------- <NUL> 00 00 00 00 00 00 <SOH> 01 01 01 01 01 01 <STX> 02 02 02 02 02 02 <ETX> 03 03 03 03 03 03 <ST> 9C 04 04 04 C2.9C 04 <HT> 09 05 05 05 09 05 <SSA> 86 06 06 06 C2.86 06 <DEL> 7F 07 07 07 7F 07 <EPA> 97 08 08 08 C2.97 08 <RI> 8D 09 09 09 C2.8D 09 <SS2> 8E 0A 0A 0A C2.8E 0A <VT> 0B 0B 0B 0B 0B 0B <FF> 0C 0C 0C 0C 0C 0C <CR> 0D 0D 0D 0D 0D 0D <SO> 0E 0E 0E 0E 0E 0E <SI> 0F 0F 0F 0F 0F 0F <DLE> 10 10 10 10 10 10 <DC1> 11 11 11 11 11 11 <DC2> 12 12 12 12 12 12 <DC3> 13 13 13 13 13 13 <OSC> 9D 14 14 14 C2.9D 14 <LF> 0A 25 15 15 0A 15 ** <BS> 08 16 16 16 08 16 <ESA> 87 17 17 17 C2.87 17 <CAN> 18 18 18 18 18 18 <EOM> 19 19 19 19 19 19 <PU2> 92 1A 1A 1A C2.92 1A <SS3> 8F 1B 1B 1B C2.8F 1B <FS> 1C 1C 1C 1C 1C 1C <GS> 1D 1D 1D 1D 1D 1D <RS> 1E 1E 1E 1E 1E 1E <US> 1F 1F 1F 1F 1F 1F <PAD> 80 20 20 20 C2.80 20 <HOP> 81 21 21 21 C2.81 21 <BPH> 82 22 22 22 C2.82 22 <NBH> 83 23 23 23 C2.83 23 <IND> 84 24 24 24 C2.84 24 <NEL> 85 15 25 25 C2.85 25 ** <ETB> 17 26 26 26 17 26 <ESC> 1B 27 27 27 1B 27 <HTS> 88 28 
28 28 C2.88 28 <HTJ> 89 29 29 29 C2.89 29 <VTS> 8A 2A 2A 2A C2.8A 2A <PLD> 8B 2B 2B 2B C2.8B 2B <PLU> 8C 2C 2C 2C C2.8C 2C <ENQ> 05 2D 2D 2D 05 2D <ACK> 06 2E 2E 2E 06 2E <BEL> 07 2F 2F 2F 07 2F <DCS> 90 30 30 30 C2.90 30 <PU1> 91 31 31 31 C2.91 31 <SYN> 16 32 32 32 16 32 <STS> 93 33 33 33 C2.93 33 <CCH> 94 34 34 34 C2.94 34 <MW> 95 35 35 35 C2.95 35 <SPA> 96 36 36 36 C2.96 36 <EOT> 04 37 37 37 04 37 <SOS> 98 38 38 38 C2.98 38 <SGC> 99 39 39 39 C2.99 39 <SCI> 9A 3A 3A 3A C2.9A 3A <CSI> 9B 3B 3B 3B C2.9B 3B <DC4> 14 3C 3C 3C 14 3C <NAK> 15 3D 3D 3D 15 3D <PM> 9E 3E 3E 3E C2.9E 3E <SUB> 1A 3F 3F 3F 1A 3F <SPACE> 20 40 40 40 20 40 <NON-BREAKING SPACE> A0 41 41 41 C2.A0 80.41 <a WITH CIRCUMFLEX> E2 42 42 42 C3.A2 8B.43 <a WITH DIAERESIS> E4 43 43 43 C3.A4 8B.45 <a WITH GRAVE> E0 44 44 44 C3.A0 8B.41 <a WITH ACUTE> E1 45 45 45 C3.A1 8B.42 <a WITH TILDE> E3 46 46 46 C3.A3 8B.44 <a WITH RING ABOVE> E5 47 47 47 C3.A5 8B.46 <c WITH CEDILLA> E7 48 48 48 C3.A7 8B.48 <n WITH TILDE> F1 49 49 49 C3.B1 8B.58 <CENT SIGN> A2 4A 4A B0 C2.A2 80.43 ## . 2E 4B 4B 4B 2E 4B < 3C 4C 4C 4C 3C 4C ( 28 4D 4D 4D 28 4D + 2B 4E 4E 4E 2B 4E | 7C 4F 4F 4F 7C 4F & 26 50 50 50 26 50 <e WITH ACUTE> E9 51 51 51 C3.A9 8B.4A <e WITH CIRCUMFLEX> EA 52 52 52 C3.AA 8B.51 <e WITH DIAERESIS> EB 53 53 53 C3.AB 8B.52 <e WITH GRAVE> E8 54 54 54 C3.A8 8B.49 <i WITH ACUTE> ED 55 55 55 C3.AD 8B.54 <i WITH CIRCUMFLEX> EE 56 56 56 C3.AE 8B.55 <i WITH DIAERESIS> EF 57 57 57 C3.AF 8B.56 <i WITH GRAVE> EC 58 58 58 C3.AC 8B.53 <SMALL LETTER SHARP S> DF 59 59 59 C3.9F 8A.73 ! 
21 5A 5A 5A 21 5A $ 24 5B 5B 5B 24 5B * 2A 5C 5C 5C 2A 5C ) 29 5D 5D 5D 29 5D ; 3B 5E 5E 5E 3B 5E ^ 5E B0 5F 6A 5E 5F ** ## - 2D 60 60 60 2D 60 / 2F 61 61 61 2F 61 <A WITH CIRCUMFLEX> C2 62 62 62 C3.82 8A.43 <A WITH DIAERESIS> C4 63 63 63 C3.84 8A.45 <A WITH GRAVE> C0 64 64 64 C3.80 8A.41 <A WITH ACUTE> C1 65 65 65 C3.81 8A.42 <A WITH TILDE> C3 66 66 66 C3.83 8A.44 <A WITH RING ABOVE> C5 67 67 67 C3.85 8A.46 <C WITH CEDILLA> C7 68 68 68 C3.87 8A.48 <N WITH TILDE> D1 69 69 69 C3.91 8A.58 <BROKEN BAR> A6 6A 6A D0 C2.A6 80.47 ## , 2C 6B 6B 6B 2C 6B % 25 6C 6C 6C 25 6C _ 5F 6D 6D 6D 5F 6D > 3E 6E 6E 6E 3E 6E ? 3F 6F 6F 6F 3F 6F <o WITH STROKE> F8 70 70 70 C3.B8 8B.67 <E WITH ACUTE> C9 71 71 71 C3.89 8A.4A <E WITH CIRCUMFLEX> CA 72 72 72 C3.8A 8A.51 <E WITH DIAERESIS> CB 73 73 73 C3.8B 8A.52 <E WITH GRAVE> C8 74 74 74 C3.88 8A.49 <I WITH ACUTE> CD 75 75 75 C3.8D 8A.54 <I WITH CIRCUMFLEX> CE 76 76 76 C3.8E 8A.55 <I WITH DIAERESIS> CF 77 77 77 C3.8F 8A.56 <I WITH GRAVE> CC 78 78 78 C3.8C 8A.53 ` 60 79 79 4A 60 79 ## : 3A 7A 7A 7A 3A 7A # 23 7B 7B 7B 23 7B @ 40 7C 7C 7C 40 7C ' 27 7D 7D 7D 27 7D = 3D 7E 7E 7E 3D 7E " 22 7F 7F 7F 22 7F <O WITH STROKE> D8 80 80 80 C3.98 8A.67 a 61 81 81 81 61 81 b 62 82 82 82 62 82 c 63 83 83 83 63 83 d 64 84 84 84 64 84 e 65 85 85 85 65 85 f 66 86 86 86 66 86 g 67 87 87 87 67 87 h 68 88 88 88 68 88 i 69 89 89 89 69 89 <LEFT POINTING GUILLEMET> AB 8A 8A 8A C2.AB 80.52 <RIGHT POINTING GUILLEMET> BB 8B 8B 8B C2.BB 80.6A <SMALL LETTER eth> F0 8C 8C 8C C3.B0 8B.57 <y WITH ACUTE> FD 8D 8D 8D C3.BD 8B.71 <SMALL LETTER thorn> FE 8E 8E 8E C3.BE 8B.72 <PLUS-OR-MINUS SIGN> B1 8F 8F 8F C2.B1 80.58 <DEGREE SIGN> B0 90 90 90 C2.B0 80.57 j 6A 91 91 91 6A 91 k 6B 92 92 92 6B 92 l 6C 93 93 93 6C 93 m 6D 94 94 94 6D 94 n 6E 95 95 95 6E 95 o 6F 96 96 96 6F 96 p 70 97 97 97 70 97 q 71 98 98 98 71 98 r 72 99 99 99 72 99 <FEMININE ORDINAL> AA 9A 9A 9A C2.AA 80.51 <MASC. 
ORDINAL INDICATOR> BA 9B 9B 9B C2.BA 80.69 <SMALL LIGATURE ae> E6 9C 9C 9C C3.A6 8B.47 <CEDILLA> B8 9D 9D 9D C2.B8 80.67 <CAPITAL LIGATURE AE> C6 9E 9E 9E C3.86 8A.47 <CURRENCY SIGN> A4 9F 9F 9F C2.A4 80.45 <MICRO SIGN> B5 A0 A0 A0 C2.B5 80.64 ~ 7E A1 A1 FF 7E A1 ## s 73 A2 A2 A2 73 A2 t 74 A3 A3 A3 74 A3 u 75 A4 A4 A4 75 A4 v 76 A5 A5 A5 76 A5 w 77 A6 A6 A6 77 A6 x 78 A7 A7 A7 78 A7 y 79 A8 A8 A8 79 A8 z 7A A9 A9 A9 7A A9 <INVERTED "!" > A1 AA AA AA C2.A1 80.42 <INVERTED QUESTION MARK> BF AB AB AB C2.BF 80.73 <CAPITAL LETTER ETH> D0 AC AC AC C3.90 8A.57 [ 5B BA AD BB 5B AD ** ## <CAPITAL LETTER THORN> DE AE AE AE C3.9E 8A.72 <REGISTERED TRADE MARK> AE AF AF AF C2.AE 80.55 <NOT SIGN> AC 5F B0 BA C2.AC 80.53 ** ## <POUND SIGN> A3 B1 B1 B1 C2.A3 80.44 <YEN SIGN> A5 B2 B2 B2 C2.A5 80.46 <MIDDLE DOT> B7 B3 B3 B3 C2.B7 80.66 <COPYRIGHT SIGN> A9 B4 B4 B4 C2.A9 80.4A <SECTION SIGN> A7 B5 B5 B5 C2.A7 80.48 <PARAGRAPH SIGN> B6 B6 B6 B6 C2.B6 80.65 <FRACTION ONE QUARTER> BC B7 B7 B7 C2.BC 80.70 <FRACTION ONE HALF> BD B8 B8 B8 C2.BD 80.71 <FRACTION THREE QUARTERS> BE B9 B9 B9 C2.BE 80.72 <Y WITH ACUTE> DD AD BA AD C3.9D 8A.71 ** ## <DIAERESIS> A8 BD BB 79 C2.A8 80.49 ** ## <MACRON> AF BC BC A1 C2.AF 80.56 ## ] 5D BB BD BD 5D BD ** <ACUTE ACCENT> B4 BE BE BE C2.B4 80.63 <MULTIPLICATION SIGN> D7 BF BF BF C3.97 8A.66 { 7B C0 C0 FB 7B C0 ## A 41 C1 C1 C1 41 C1 B 42 C2 C2 C2 42 C2 C 43 C3 C3 C3 43 C3 D 44 C4 C4 C4 44 C4 E 45 C5 C5 C5 45 C5 F 46 C6 C6 C6 46 C6 G 47 C7 C7 C7 47 C7 H 48 C8 C8 C8 48 C8 I 49 C9 C9 C9 49 C9 <SOFT HYPHEN> AD CA CA CA C2.AD 80.54 <o WITH CIRCUMFLEX> F4 CB CB CB C3.B4 8B.63 <o WITH DIAERESIS> F6 CC CC CC C3.B6 8B.65 <o WITH GRAVE> F2 CD CD CD C3.B2 8B.59 <o WITH ACUTE> F3 CE CE CE C3.B3 8B.62 <o WITH TILDE> F5 CF CF CF C3.B5 8B.64 } 7D D0 D0 FD 7D D0 ## J 4A D1 D1 D1 4A D1 K 4B D2 D2 D2 4B D2 L 4C D3 D3 D3 4C D3 M 4D D4 D4 D4 4D D4 N 4E D5 D5 D5 4E D5 O 4F D6 D6 D6 4F D6 P 50 D7 D7 D7 50 D7 Q 51 D8 D8 D8 51 D8 R 52 D9 D9 D9 52 D9 <SUPERSCRIPT ONE> B9 DA DA 
DA C2.B9 80.68 <u WITH CIRCUMFLEX> FB DB DB DB C3.BB 8B.6A <u WITH DIAERESIS> FC DC DC DC C3.BC 8B.70 <u WITH GRAVE> F9 DD DD C0 C3.B9 8B.68 ## <u WITH ACUTE> FA DE DE DE C3.BA 8B.69 <y WITH DIAERESIS> FF DF DF DF C3.BF 8B.73 \ 5C E0 E0 BC 5C E0 ## <DIVISION SIGN> F7 E1 E1 E1 C3.B7 8B.66 S 53 E2 E2 E2 53 E2 T 54 E3 E3 E3 54 E3 U 55 E4 E4 E4 55 E4 V 56 E5 E5 E5 56 E5 W 57 E6 E6 E6 57 E6 X 58 E7 E7 E7 58 E7 Y 59 E8 E8 E8 59 E8 Z 5A E9 E9 E9 5A E9 <SUPERSCRIPT TWO> B2 EA EA EA C2.B2 80.59 <O WITH CIRCUMFLEX> D4 EB EB EB C3.94 8A.63 <O WITH DIAERESIS> D6 EC EC EC C3.96 8A.65 <O WITH GRAVE> D2 ED ED ED C3.92 8A.59 <O WITH ACUTE> D3 EE EE EE C3.93 8A.62 <O WITH TILDE> D5 EF EF EF C3.95 8A.64 0 30 F0 F0 F0 30 F0 1 31 F1 F1 F1 31 F1 2 32 F2 F2 F2 32 F2 3 33 F3 F3 F3 33 F3 4 34 F4 F4 F4 34 F4 5 35 F5 F5 F5 35 F5 6 36 F6 F6 F6 36 F6 7 37 F7 F7 F7 37 F7 8 38 F8 F8 F8 38 F8 9 39 F9 F9 F9 39 F9 <SUPERSCRIPT THREE> B3 FA FA FA C2.B3 80.62 <U WITH CIRCUMFLEX> DB FB FB DD C3.9B 8A.6A ## <U WITH DIAERESIS> DC FC FC FC C3.9C 8A.70 <U WITH GRAVE> D9 FD FD E0 C3.99 8A.68 ## <U WITH ACUTE> DA FE FE FE C3.9A 8A.69 <APC> 9F FF FF 5F C2.9F FF ## =head1 IDENTIFYING CHARACTER CODE SETS It is possible to determine which character set you are operating under. But first you need to be really really sure you need to do this. Your code will be simpler and probably just as portable if you don't have to test the character set and do different things, depending. There are actually only very few circumstances where it's not easy to write straight-line code portable to all character sets. See L<perluniintro/Unicode and EBCDIC> for how to portably specify characters. But there are some cases where you may want to know which character set you are running under. One possible example is doing L<sorting|/SORTING> in inner loops where performance is critical. To determine if you are running under ASCII or EBCDIC, you can use the return value of C<ord()> or C<chr()> to test one or more character values. 
For example:

    $is_ascii  = "A" eq chr(65);
    $is_ebcdic = "A" eq chr(193);

    $is_ascii  = ord("A") == 65;
    $is_ebcdic = ord("A") == 193;

There's even less need to distinguish between EBCDIC code pages, but to
do so try looking at one or more of the characters that differ between
them.

    $is_ascii           = ord('[') == 91;
    $is_ebcdic_37       = ord('[') == 186;
    $is_ebcdic_1047     = ord('[') == 173;
    $is_ebcdic_POSIX_BC = ord('[') == 187;

However, it would be unwise to write tests such as:

    $is_ascii = "\r" ne chr(13);  #  WRONG
    $is_ascii = "\n" ne chr(10);  #  ILL ADVISED

Obviously the first of these will fail to distinguish most ASCII
platforms from either a CCSID 0037, a 1047, or a POSIX-BC EBCDIC
platform since S<C<"\r" eq chr(13)>> under all of those coded character
sets. But note too that because C<"\n"> is C<chr(13)> and C<"\r"> is
C<chr(10)> on old Macintosh (which is an ASCII platform) the second
C<$is_ascii> test will lead to trouble there.

To determine whether or not perl was built under an EBCDIC
code page you can use the Config module like so:

    use Config;
    $is_ebcdic = $Config{'ebcdic'} eq 'define';

=head1 CONVERSIONS

=head2 C<utf8::unicode_to_native()> and C<utf8::native_to_unicode()>

These functions take an input numeric code point in one encoding and
return what its equivalent value is in the other.

See L<utf8>.

=head2 tr///

In order to convert a string of characters from one character set to
another a simple list of numbers, such as in the right columns in the
above table, along with Perl's C<tr///> operator is all that is needed.
The data in the table are in ASCII/Latin1 order, hence the EBCDIC columns
provide easy-to-use ASCII/Latin1 to EBCDIC operations that are also easily
reversed.

For example, to convert ASCII/Latin1 to code page 037 take the output of
the second numbers column from the output of recipe 2 (modified to add a
C<\x> prefix to each value), and use it in C<tr///> like so:

    $cp_037 =
    '\x00\x01\x02\x03\x37\x2D\x2E\x2F\x16\x05\x25\x0B\x0C\x0D\x0E\x0F' .
'\x10\x11\x12\x13\x3C\x3D\x32\x26\x18\x19\x3F\x27\x1C\x1D\x1E\x1F' . '\x40\x5A\x7F\x7B\x5B\x6C\x50\x7D\x4D\x5D\x5C\x4E\x6B\x60\x4B\x61' . '\xF0\xF1\xF2\xF3\xF4\xF5\xF6\xF7\xF8\xF9\x7A\x5E\x4C\x7E\x6E\x6F' . '\x7C\xC1\xC2\xC3\xC4\xC5\xC6\xC7\xC8\xC9\xD1\xD2\xD3\xD4\xD5\xD6' . '\xD7\xD8\xD9\xE2\xE3\xE4\xE5\xE6\xE7\xE8\xE9\xBA\xE0\xBB\xB0\x6D' . '\x79\x81\x82\x83\x84\x85\x86\x87\x88\x89\x91\x92\x93\x94\x95\x96' . '\x97\x98\x99\xA2\xA3\xA4\xA5\xA6\xA7\xA8\xA9\xC0\x4F\xD0\xA1\x07' . '\x20\x21\x22\x23\x24\x15\x06\x17\x28\x29\x2A\x2B\x2C\x09\x0A\x1B' . '\x30\x31\x1A\x33\x34\x35\x36\x08\x38\x39\x3A\x3B\x04\x14\x3E\xFF' . '\x41\xAA\x4A\xB1\x9F\xB2\x6A\xB5\xBD\xB4\x9A\x8A\x5F\xCA\xAF\xBC' . '\x90\x8F\xEA\xFA\xBE\xA0\xB6\xB3\x9D\xDA\x9B\x8B\xB7\xB8\xB9\xAB' . '\x64\x65\x62\x66\x63\x67\x9E\x68\x74\x71\x72\x73\x78\x75\x76\x77' . '\xAC\x69\xED\xEE\xEB\xEF\xEC\xBF\x80\xFD\xFE\xFB\xFC\xAD\xAE\x59' . '\x44\x45\x42\x46\x43\x47\x9C\x48\x54\x51\x52\x53\x58\x55\x56\x57' . '\x8C\x49\xCD\xCE\xCB\xCF\xCC\xE1\x70\xDD\xDE\xDB\xDC\x8D\x8E\xDF'; my $ebcdic_string = $ascii_string; eval '$ebcdic_string =~ tr/\000-\377/' . $cp_037 . '/'; To convert from EBCDIC 037 to ASCII just reverse the order of the tr/// arguments like so: my $ascii_string = $ebcdic_string; eval '$ascii_string =~ tr/' . $cp_037 . '/\000-\377/'; Similarly one could take the output of the third numbers column from recipe 2 to obtain a C<$cp_1047> table. The fourth numbers column of the output from recipe 2 could provide a C<$cp_posix_bc> table suitable for transcoding as well. If you wanted to see the inverse tables, you would first have to sort on the desired numbers column as in recipes 4, 5 or 6, then take the output of the first numbers column. =head2 iconv XPG operability often implies the presence of an I<iconv> utility available from the shell or from the C library. Consult your system's documentation for information on iconv. On OS/390 or z/OS see the L<iconv(1)> manpage. 
One way to invoke the C<iconv> shell utility from within perl would be:

    # OS/390 or z/OS example
    $ascii_data = `echo '$ebcdic_data'| iconv -f IBM-1047 -t ISO8859-1`;

or the inverse map:

    # OS/390 or z/OS example
    $ebcdic_data = `echo '$ascii_data'| iconv -f ISO8859-1 -t IBM-1047`;

For other Perl-based conversion options see the C<Convert::*> modules on CPAN.

=head2 C RTL

The OS/390 and z/OS C run-time libraries provide C<_atoe()> and C<_etoa()> functions.

=head1 OPERATOR DIFFERENCES

The C<..> range operator treats certain character ranges with care on EBCDIC platforms. For example the following array will have twenty-six elements on either an EBCDIC platform or an ASCII platform:

    @alphabet = ('A'..'Z');   # $#alphabet == 25

The bitwise operators such as & ^ | may return different results when operating on string or character data in a Perl program running on an EBCDIC platform than when run on an ASCII platform. Here is an example adapted from the one in L<perlop>:

    # EBCDIC-based examples
    print "j p \n" ^ " a h";                      # prints "JAPH\n"
    print "JA" | " ph\n";                         # prints "japh\n"
    print "JAPH\nJunk" & "\277\277\277\277\277";  # prints "japh\n";
    print 'p N$' ^ " E<H\n";                      # prints "Perl\n";

An interesting property of the 32 C0 control characters in the ASCII table is that they can "literally" be constructed as control characters in Perl, e.g. C<chr(0) eq "\c@">, C<chr(1) eq "\cA">, and so on. Perl on EBCDIC platforms has been ported to take C<\c@> to C<chr(0)> and C<\cA> to C<chr(1)>, etc. as well, but the characters that result depend on which code page you are using. The table below uses the standard acronyms for the controls. The POSIX-BC and 1047 sets are identical throughout this range and differ from the 0037 set at only one spot (21 decimal). Note that the line terminator character may be generated by C<\cJ> on ASCII platforms but by C<\cU> on 1047 or POSIX-BC platforms and cannot be generated as a C<"\c.letter."> control character on 0037 platforms.
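To see which of these escapes yields the line terminator on a particular build, a minimal sketch along the following lines may help. The helper name C<newline_escapes> is hypothetical (it is not part of Perl), and only the two escapes discussed above are tested:

```perl
use strict;
use warnings;

# Hypothetical helper: returns the names of the control-letter escapes
# (of the two discussed above) that compare equal to this platform's "\n".
sub newline_escapes {
    my %candidate = ( '\cJ' => "\cJ", '\cU' => "\cU" );
    return grep { $candidate{$_} eq "\n" } sort keys %candidate;
}

my @found = newline_escapes();
print @found
    ? "Line terminator is produced by: @found\n"
    : "Neither \\cJ nor \\cU produces the line terminator here\n";
```

Comparing against C<"\n"> rather than against a hard-coded ordinal keeps the check independent of the code page in use.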
Note also that C<\c\> cannot be the final element in a string or regex, as it will absorb the terminator. But C<\c\I<X>> is a C<FILE SEPARATOR> concatenated with I<X> for all I<X>. The outlier C<\c?> on ASCII, which yields a non-C0 control C<DEL>, yields the outlier control C<APC> on EBCDIC, the one that isn't in the block of contiguous controls. A subtlety of this is that C<\c?> on ASCII platforms is an ASCII character, while it isn't equivalent to any ASCII character on EBCDIC platforms.

 chr   ord   8859-1    0037    1047 && POSIX-BC
 -----------------------------------------------------------------------
 \c@     0   <NUL>     <NUL>        <NUL>
 \cA     1   <SOH>     <SOH>        <SOH>
 \cB     2   <STX>     <STX>        <STX>
 \cC     3   <ETX>     <ETX>        <ETX>
 \cD     4   <EOT>     <ST>         <ST>
 \cE     5   <ENQ>     <HT>         <HT>
 \cF     6   <ACK>     <SSA>        <SSA>
 \cG     7   <BEL>     <DEL>        <DEL>
 \cH     8   <BS>      <EPA>        <EPA>
 \cI     9   <HT>      <RI>         <RI>
 \cJ    10   <LF>      <SS2>        <SS2>
 \cK    11   <VT>      <VT>         <VT>
 \cL    12   <FF>      <FF>         <FF>
 \cM    13   <CR>      <CR>         <CR>
 \cN    14   <SO>      <SO>         <SO>
 \cO    15   <SI>      <SI>         <SI>
 \cP    16   <DLE>     <DLE>        <DLE>
 \cQ    17   <DC1>     <DC1>        <DC1>
 \cR    18   <DC2>     <DC2>        <DC2>
 \cS    19   <DC3>     <DC3>        <DC3>
 \cT    20   <DC4>     <OSC>        <OSC>
 \cU    21   <NAK>     <NEL>        <LF>              **
 \cV    22   <SYN>     <BS>         <BS>
 \cW    23   <ETB>     <ESA>        <ESA>
 \cX    24   <CAN>     <CAN>        <CAN>
 \cY    25   <EOM>     <EOM>        <EOM>
 \cZ    26   <SUB>     <PU2>        <PU2>
 \c[    27   <ESC>     <SS3>        <SS3>
 \c\X   28   <FS>X     <FS>X        <FS>X
 \c]    29   <GS>      <GS>         <GS>
 \c^    30   <RS>      <RS>         <RS>
 \c_    31   <US>      <US>         <US>
 \c?    *    <DEL>     <APC>        <APC>

C<*> Note: C<\c?> maps to ordinal 127 (C<DEL>) on ASCII platforms, but since ordinal 127 is not a control character on EBCDIC machines, C<\c?> instead maps on them to C<APC>, which is 255 in 0037 and 1047, and 95 in POSIX-BC.

=head1 FUNCTION DIFFERENCES

=over 8

=item C<chr()>

C<chr()> must be given an EBCDIC code number argument to yield a desired character return value on an EBCDIC platform. For example:

    $CAPITAL_LETTER_A = chr(193);

=item C<ord()>

C<ord()> will return EBCDIC code number values on an EBCDIC platform.
For example: $the_number_193 = ord("A"); =item C<pack()> The C<"c"> and C<"C"> templates for C<pack()> are dependent upon character set encoding. Examples of usage on EBCDIC include: $foo = pack("CCCC",193,194,195,196); # $foo eq "ABCD" $foo = pack("C4",193,194,195,196); # same thing $foo = pack("ccxxcc",193,194,195,196); # $foo eq "AB\0\0CD" The C<"U"> template has been ported to mean "Unicode" on all platforms so that pack("U", 65) eq 'A' is true on all platforms. If you want native code points for the low 256, use the C<"W"> template. This means that the equivalences pack("W", ord($character)) eq $character unpack("W", $character) == ord $character will hold. =item C<print()> One must be careful with scalars and strings that are passed to print that contain ASCII encodings. One common place for this to occur is in the output of the MIME type header for CGI script writing. For example, many Perl programming guides recommend something similar to: print "Content-type:\ttext/html\015\012\015\012"; # this may be wrong on EBCDIC You can instead write print "Content-type:\ttext/html\r\n\r\n"; # OK for DGW et al and have it work portably. That is because the translation from EBCDIC to ASCII is done by the web server in this case. Consult your web server's documentation for further details. =item C<printf()> The formats that can convert characters to numbers and vice versa will be different from their ASCII counterparts when executed on an EBCDIC platform. Examples include: printf("%c%c%c",193,194,195); # prints ABC =item C<sort()> EBCDIC sort results may differ from ASCII sort results especially for mixed case strings. This is discussed in more detail L<below|/SORTING>. =item C<sprintf()> See the discussion of C<L</printf()>> above. An example of the use of sprintf would be: $CAPITAL_LETTER_A = sprintf("%c",193); =item C<unpack()> See the discussion of C<L</pack()>> above. 
=back Note that it is possible to write portable code for these by specifying things in Unicode numbers, and using a conversion function: printf("%c",utf8::unicode_to_native(65)); # prints A on all # platforms print utf8::native_to_unicode(ord("A")); # Likewise, prints 65 See L<perluniintro/Unicode and EBCDIC> and L</CONVERSIONS> for other options. =head1 REGULAR EXPRESSION DIFFERENCES You can write your regular expressions just like someone on an ASCII platform would do. But keep in mind that using octal or hex notation to specify a particular code point will give you the character that the EBCDIC code page natively maps to it. (This is also true of all double-quoted strings.) If you want to write portably, just use the C<\N{U+...}> notation everywhere where you would have used C<\x{...}>, and don't use octal notation at all. Starting in Perl v5.22, this applies to ranges in bracketed character classes. If you say, for example, C<qr/[\N{U+20}-\N{U+7F}]/>, it means the characters C<\N{U+20}>, C<\N{U+21}>, ..., C<\N{U+7F}>. This range is all the printable characters that the ASCII character set contains. Prior to v5.22, you couldn't specify any ranges portably, except (starting in Perl v5.5.3) all subsets of the C<[A-Z]> and C<[a-z]> ranges are specially coded to not pick up gap characters. For example, characters such as "E<ocirc>" (C<o WITH CIRCUMFLEX>) that lie between "I" and "J" would not be matched by the regular expression range C</[H-K]/>. But if either of the range end points is explicitly numeric (and neither is specified by C<\N{U+...}>), the gap characters are matched: /[\x89-\x91]/ will match C<\x8e>, even though C<\x89> is "i" and C<\x91 > is "j", and C<\x8e> is a gap character, from the alphabetic viewpoint. Another construct to be wary of is the inappropriate use of hex (unless you use C<\N{U+...}>) or octal constants in regular expressions. 
Consider the following set of subs: sub is_c0 { my $char = substr(shift,0,1); $char =~ /[\000-\037]/; } sub is_print_ascii { my $char = substr(shift,0,1); $char =~ /[\040-\176]/; } sub is_delete { my $char = substr(shift,0,1); $char eq "\177"; } sub is_c1 { my $char = substr(shift,0,1); $char =~ /[\200-\237]/; } sub is_latin_1 { # But not ASCII; not C1 my $char = substr(shift,0,1); $char =~ /[\240-\377]/; } These are valid only on ASCII platforms. Starting in Perl v5.22, simply changing the octal constants to equivalent C<\N{U+...}> values makes them portable: sub is_c0 { my $char = substr(shift,0,1); $char =~ /[\N{U+00}-\N{U+1F}]/; } sub is_print_ascii { my $char = substr(shift,0,1); $char =~ /[\N{U+20}-\N{U+7E}]/; } sub is_delete { my $char = substr(shift,0,1); $char eq "\N{U+7F}"; } sub is_c1 { my $char = substr(shift,0,1); $char =~ /[\N{U+80}-\N{U+9F}]/; } sub is_latin_1 { # But not ASCII; not C1 my $char = substr(shift,0,1); $char =~ /[\N{U+A0}-\N{U+FF}]/; } And here are some alternative portable ways to write them: sub Is_c0 { my $char = substr(shift,0,1); return $char =~ /[[:cntrl:]]/a && ! Is_delete($char); # Alternatively: # return $char =~ /[[:cntrl:]]/ # && $char =~ /[[:ascii:]]/ # && ! 
        #            Is_delete($char);
 }

 sub Is_print_ascii {
    my $char = substr(shift,0,1);

    return $char =~ /[[:print:]]/a;

    # Alternatively:
    # return $char =~ /[[:print:]]/ && $char =~ /[[:ascii:]]/;

    # Or
    # return $char
    #      =~ /[ !"\#\$%&'()*+,\-.\/0-9:;<=>?\@A-Z[\\\]^_`a-z{|}~]/;
 }

 sub Is_delete {
    my $char = substr(shift,0,1);
    return utf8::native_to_unicode(ord $char) == 0x7F;
 }

 sub Is_c1 {
    use feature 'unicode_strings';
    my $char = substr(shift,0,1);
    return $char =~ /[[:cntrl:]]/ && $char !~ /[[:ascii:]]/;
 }

 sub Is_latin_1 {    # But not ASCII; not C1
    use feature 'unicode_strings';
    my $char = substr(shift,0,1);
    return ord($char) < 256
           && $char !~ /[[:ascii:]]/
           && $char !~ /[[:cntrl:]]/;
 }

Another way to write C<Is_latin_1()> would be to use the characters in the range explicitly:

 sub Is_latin_1 {
    my $char = substr(shift,0,1);
    $char =~ /[ ¡¢£¤¥¦§¨©ª«¬­®¯°±²³´µ¶·¸¹º»¼½¾¿ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏ]
            | [ÐÑÒÓÔÕÖ×ØÙÚÛÜÝÞßàáâãäåæçèéêëìíîïðñòóôõö÷øùúûüýþÿ]/x;
 }

That form may run into trouble in network transit (due to the presence of 8 bit characters) or on non-ISO-Latin character sets, but it does allow C<Is_c1> to be rewritten so that it works on Perls that don't have C<'unicode_strings'> (earlier than v5.14):

 sub Is_c1 {    # Not ASCII; not a printable Latin-1 character
    my $char = substr(shift,0,1);
    return ord($char) < 256
           && $char !~ /[[:ascii:]]/
           && ! Is_latin_1($char);
 }

=head1 SOCKETS

Most socket programming assumes ASCII character encodings in network byte order. Exceptions can include CGI script writing under a host web server where the server may take care of translation for you. Most host web servers convert EBCDIC data to ISO-8859-1 or Unicode on output.

=head1 SORTING

One big difference between ASCII-based character sets and EBCDIC ones is the relative positions of the characters when sorted in native order. Of most concern are the upper- and lowercase letters, the digits, and the underscore (C<"_">).
On ASCII platforms the native sort order has the digits come before the uppercase letters which come before the underscore which comes before the lowercase letters. On EBCDIC, the underscore comes first, then the lowercase letters, then the uppercase ones, and the digits last. If sorted on an ASCII-based platform, the two-letter abbreviation for a physician comes before the two letter abbreviation for drive; that is: @sorted = sort(qw(Dr. dr.)); # @sorted holds ('Dr.','dr.') on ASCII, # but ('dr.','Dr.') on EBCDIC The property of lowercase before uppercase letters in EBCDIC is even carried to the Latin 1 EBCDIC pages such as 0037 and 1047. An example would be that "E<Euml>" (C<E WITH DIAERESIS>, 203) comes before "E<euml>" (C<e WITH DIAERESIS>, 235) on an ASCII platform, but the latter (83) comes before the former (115) on an EBCDIC platform. (Astute readers will note that the uppercase version of "E<szlig>" C<SMALL LETTER SHARP S> is simply "SS" and that the upper case versions of "E<yuml>" (small C<y WITH DIAERESIS>) and "E<micro>" (C<MICRO SIGN>) are not in the 0..255 range but are in Unicode, in a Unicode enabled Perl). The sort order will cause differences between results obtained on ASCII platforms versus EBCDIC platforms. What follows are some suggestions on how to deal with these differences. =head2 Ignore ASCII vs. EBCDIC sort differences. This is the least computationally expensive strategy. It may require some user education. =head2 Use a sort helper function This is completely general, but the most computationally expensive strategy. Choose one or the other character set and transform to that for every sort comparison. Here's a complete example that transforms to ASCII sort order: sub native_to_uni($) { my $string = shift; # Saves time on an ASCII platform return $string if ord 'A' == 65; my $output = ""; for my $i (0 .. 
            length($string) - 1) {
        $output .= chr(utf8::native_to_unicode(ord(substr($string, $i, 1))));
    }

    # Preserve utf8ness of input onto the output, even if it didn't need
    # to be utf8
    utf8::upgrade($output) if utf8::is_utf8($string);

    return $output;
 }

 sub ascii_order {   # Sort helper
    return native_to_uni($a) cmp native_to_uni($b);
 }

 sort ascii_order @list;

=head2 MONO CASE then sort data (for non-digits, non-underscore)

If you don't care about where digits and underscore sort to, you can do something like this:

 sub case_insensitive_order {   # Sort helper
    return lc($a) cmp lc($b)
 }

 sort case_insensitive_order @list;

If performance is an issue, and you don't care if the output is in the same case as the input, use C<tr///> to transform to the case most employed within the data. If the data are primarily UPPERCASE non-Latin1, then apply C<tr/[a-z]/[A-Z]/>, and then C<sort()>. If the data are primarily lowercase non-Latin1, then apply C<tr/[A-Z]/[a-z]/> before sorting. If the data are primarily UPPERCASE and include Latin-1 characters then apply:

   tr/[a-z]/[A-Z]/;
   tr/[àáâãäåæçèéêëìíîïðñòóôõöøùúûüýþ]/[ÀÁÂÃÄÅÆÇÈÉÊËÌÍÎÏÐÑÒÓÔÕÖØÙÚÛÜÝÞ]/;
   s/ß/SS/g;

then C<sort()>. If you have a choice, it's better to lowercase things to avoid the problems of the two Latin-1 characters whose uppercase is outside Latin-1: "E<yuml>" (small C<y WITH DIAERESIS>) and "E<micro>" (C<MICRO SIGN>). If you do need to uppercase, you can; with a Unicode-enabled Perl, do:

    tr/ÿ/\x{178}/;
    tr/µ/\x{39C}/;

=head2 Perform sorting on one type of platform only.

This strategy can employ a network connection. As such it would be computationally expensive.

=head1 TRANSFORMATION FORMATS

There are a variety of ways of transforming data with an intra character set mapping that serve a variety of purposes. Sorting was discussed in the previous section and a few of the other more popular mapping techniques are discussed next.
=head2 URL decoding and encoding Note that some URLs have hexadecimal ASCII code points in them in an attempt to overcome character or protocol limitation issues. For example the tilde character is not on every keyboard hence a URL of the form: http://www.pvhp.com/~pvhp/ may also be expressed as either of: http://www.pvhp.com/%7Epvhp/ http://www.pvhp.com/%7epvhp/ where 7E is the hexadecimal ASCII code point for "~". Here is an example of decoding such a URL in any EBCDIC code page: $url = 'http://www.pvhp.com/%7Epvhp/'; $url =~ s/%([0-9a-fA-F]{2})/ pack("c",utf8::unicode_to_native(hex($1)))/xge; Conversely, here is a partial solution for the task of encoding such a URL in any EBCDIC code page: $url = 'http://www.pvhp.com/~pvhp/'; # The following regular expression does not address the # mappings for: ('.' => '%2E', '/' => '%2F', ':' => '%3A') $url =~ s/([\t "#%&\(\),;<=>\?\@\[\\\]^`{|}~])/ sprintf("%%%02X",utf8::native_to_unicode(ord($1)))/xge; where a more complete solution would split the URL into components and apply a full s/// substitution only to the appropriate parts. =head2 uu encoding and decoding The C<u> template to C<pack()> or C<unpack()> will render EBCDIC data in EBCDIC characters equivalent to their ASCII counterparts. 
For example, the following will print "Yes indeed\n" on either an ASCII or EBCDIC computer: $all_byte_chrs = ''; for (0..255) { $all_byte_chrs .= chr($_); } $uuencode_byte_chrs = pack('u', $all_byte_chrs); ($uu = <<'ENDOFHEREDOC') =~ s/^\s*//gm; M``$"`P0%!@<("0H+#`T.#Q`1$A,4%187&!D:&QP='A\@(2(C)"4F)R@I*BLL M+2XO,#$R,S0U-C<X.3H[/#T^/T!!0D-$149'2$E*2TQ-3D]045)35%565UA9 M6EM<75Y?8&%B8V1E9F=H:6IK;&UN;W!Q<G-T=79W>'EZ>WQ]?G^`@8*#A(6& MAXB)BHN,C8Z/D)&2DY25EI>8F9J;G)V>GZ"AHJ.DI::GJ*FJJZRMKJ^PL;*S MM+6VM[BYNKN\O;Z_P,'"P\3%QL?(R<K+S,W.S]#1TM/4U=;7V-G:V]S=WM_@ ?X>+CY.7FY^CIZNOL[>[O\/'R\_3U]O?X^?K[_/W^_P`` ENDOFHEREDOC if ($uuencode_byte_chrs eq $uu) { print "Yes "; } $uudecode_byte_chrs = unpack('u', $uuencode_byte_chrs); if ($uudecode_byte_chrs eq $all_byte_chrs) { print "indeed\n"; } Here is a very spartan uudecoder that will work on EBCDIC: #!/usr/local/bin/perl $_ = <> until ($mode,$file) = /^begin\s*(\d*)\s*(\S*)/; open(OUT, "> $file") if $file ne ""; while(<>) { last if /^end/; next if /[a-z]/; next unless int((((utf8::native_to_unicode(ord()) - 32 ) & 077) + 2) / 3) == int(length() / 4); print OUT unpack("u", $_); } close(OUT); chmod oct($mode), $file; =head2 Quoted-Printable encoding and decoding On ASCII-encoded platforms it is possible to strip characters outside of the printable set using: # This QP encoder works on ASCII only $qp_string =~ s/([=\x00-\x1F\x80-\xFF])/ sprintf("=%02X",ord($1))/xge; Starting in Perl v5.22, this is trivially changeable to work portably on both ASCII and EBCDIC platforms. 
    # This QP encoder works on both ASCII and EBCDIC
    $qp_string =~ s/([=\N{U+00}-\N{U+1F}\N{U+80}-\N{U+FF}])/
                                         sprintf("=%02X",ord($1))/xge;

For earlier Perls, a QP encoder that works on both ASCII and EBCDIC platforms would look somewhat like the following:

    $delete = utf8::unicode_to_native(ord("\x7F"));
    $qp_string =~
      s/([^[:print:]$delete])/
         sprintf("=%02X",utf8::native_to_unicode(ord($1)))/xage;

(although in production code the substitutions might be done in the EBCDIC branch with the function call and separately in the ASCII branch without the expense of the identity map; in Perl v5.22, the identity map is optimized out so there is no expense, but the alternative above is simpler and is also available in v5.22).

Such QP strings can be decoded with:

    # This QP decoder is limited to ASCII only
    $string =~ s/=([[:xdigit:]]{2})/chr hex $1/ge;
    $string =~ s/=[\n\r]+$//;

Whereas a QP decoder that works on both ASCII and EBCDIC platforms would look somewhat like the following:

    $string =~ s/=([[:xdigit:]]{2})/
                            chr utf8::unicode_to_native(hex $1)/xge;
    $string =~ s/=[\n\r]+$//;

=head2 Caesarean ciphers

The practice of shifting an alphabet one or more characters for encipherment dates back thousands of years and was explicitly detailed by Gaius Julius Caesar in his B<Gallic Wars> text. A single alphabet shift is sometimes referred to as a rotation and the shift amount is given as a number $n after the string 'rot' or "rot$n". Rot0 and rot26 would designate identity maps on the 26-letter English version of the Latin alphabet. Rot13 has the interesting property that alternate subsequent invocations are identity maps (thus rot13 is its own non-trivial inverse in the group of 26 alphabet rotations).
Hence the following is a rot13 encoder and decoder that will work on ASCII and EBCDIC platforms: #!/usr/local/bin/perl while(<>){ tr/n-za-mN-ZA-M/a-zA-Z/; print; } In one-liner form: perl -ne 'tr/n-za-mN-ZA-M/a-zA-Z/;print' =head1 Hashing order and checksums Perl deliberately randomizes hash order for security purposes on both ASCII and EBCDIC platforms. EBCDIC checksums will differ for the same file translated into ASCII and vice versa. =head1 I18N AND L10N Internationalization (I18N) and localization (L10N) are supported at least in principle even on EBCDIC platforms. The details are system-dependent and discussed under the L</OS ISSUES> section below. =head1 MULTI-OCTET CHARACTER SETS Perl works with UTF-EBCDIC, a multi-byte encoding. In Perls earlier than v5.22, there may be various bugs in this regard. Legacy multi byte EBCDIC code pages XXX. =head1 OS ISSUES There may be a few system-dependent issues of concern to EBCDIC Perl programmers. =head2 OS/400 =over 8 =item PASE The PASE environment is a runtime environment for OS/400 that can run executables built for PowerPC AIX in OS/400; see L<perlos400>. PASE is ASCII-based, not EBCDIC-based as the ILE. =item IFS access XXX. =back =head2 OS/390, z/OS Perl runs under Unix Systems Services or USS. =over 8 =item C<sigaction> C<SA_SIGINFO> can have segmentation faults. =item C<chcp> B<chcp> is supported as a shell utility for displaying and changing one's code page. See also L<chcp(1)>. =item dataset access For sequential data set access try: my @ds_records = `cat //DSNAME`; or: my @ds_records = `cat //'HLQ.DSNAME'`; See also the OS390::Stdio module on CPAN. =item C<iconv> B<iconv> is supported as both a shell utility and a C RTL routine. See also the L<iconv(1)> and L<iconv(3)> manual pages. =item locales Locales are supported. There may be glitches when a locale is another EBCDIC code page which has some of the L<code-page variant characters|/The 13 variant characters> in other positions. 
There aren't currently any real UTF-8 locales, even though some locale names contain the string "UTF-8". See L<perllocale> for information on locales. The L10N files are in F</usr/nls/locale>. C<$Config{d_setlocale}> is C<'define'> on OS/390 or z/OS. =back =head2 POSIX-BC? XXX. =head1 BUGS =over 4 =item * Not all shells will allow multiple C<-e> string arguments to perl to be concatenated together properly as recipes in this document 0, 2, 4, 5, and 6 might seem to imply. =item * There are a significant number of test failures in the CPAN modules shipped with Perl v5.22 and 5.24. These are only in modules not primarily maintained by Perl 5 porters. Some of these are failures in the tests only: they don't realize that it is proper to get different results on EBCDIC platforms. And some of the failures are real bugs. If you compile and do a C<make test> on Perl, all tests on the C</cpan> directory are skipped. L<Encode> partially works. =item * In earlier Perl versions, when byte and character data were concatenated, the new string was sometimes created by decoding the byte strings as I<ISO 8859-1 (Latin-1)>, even if the old Unicode string used EBCDIC. =back =head1 SEE ALSO L<perllocale>, L<perlfunc>, L<perlunicode>, L<utf8>. =head1 REFERENCES L<http://anubis.dkuug.dk/i18n/charmaps> L<https://www.unicode.org/> L<https://www.unicode.org/unicode/reports/tr16/> L<http://www.wps.com/projects/codes/> B<ASCII: American Standard Code for Information Infiltration> Tom Jennings, September 1999. B<The Unicode Standard, Version 3.0> The Unicode Consortium, Lisa Moore ed., ISBN 0-201-61633-5, Addison Wesley Developers Press, February 2000. B<CDRA: IBM - Character Data Representation Architecture - Reference and Registry>, IBM SC09-2190-00, December 1996. "Demystifying Character Sets", Andrea Vine, Multilingual Computing & Technology, B<#26 Vol. 10 Issue 4>, August/September 1999; ISSN 1523-0309; Multilingual Computing Inc. Sandpoint ID, USA. 
B<Codes, Ciphers, and Other Cryptic and Clandestine Communication> Fred B. Wrixon, ISBN 1-57912-040-7, Black Dog & Leventhal Publishers, 1998.

L<http://www.bobbemer.com/P-BIT.HTM> B<IBM - EBCDIC and the P-bit; The biggest Computer Goof Ever> Robert Bemer.

=head1 HISTORY

15 April 2001: added UTF-8 and UTF-EBCDIC to main table, pvhp.

=head1 AUTHOR

Peter Prymmer pvhp@best.com wrote this in 1999 and 2000 with CCSID 0819 and 0037 help from Chris Leach and AndrE<eacute> Pirard A.Pirard@ulg.ac.be as well as POSIX-BC help from Thomas Dorner Thomas.Dorner@start.de. Thanks also to Vickie Cooper, Philip Newton, William Raffloer, and Joe Smith. Trademarks, registered trademarks, service marks and registered service marks used in this document are the property of their respective owners.

Now maintained by Perl5 Porters.

If you read this file _as_is_, just ignore the funny characters you see. It is written in the POD format (see pod/perlpod.pod) which is specifically designed to be readable as is.

=head1 NAME

perlsolaris - Perl version 5 on Solaris systems

=head1 DESCRIPTION

This document describes various features of Sun's Solaris operating system that will affect how Perl version 5 (hereafter just perl) is compiled and/or runs. Some issues relating to the older SunOS 4.x are also discussed, though they may be out of date.

For the most part, everything should just work.

Starting with Solaris 8, perl5.00503 (or higher) is supplied with the operating system, so you might not even need to build a newer version of perl at all. The Sun-supplied version is installed in /usr/perl5 with F</usr/bin/perl> pointing to F</usr/perl5/bin/perl>. Do not disturb that installation unless you really know what you are doing. If you remove the perl supplied with the OS, you will render some bits of your system inoperable. If you wish to install a newer version of perl, install it under a different prefix from /usr/perl5.
Common prefixes to use are /usr/local and /opt/perl. You may wish to put your version of perl in the PATH of all users by changing the link F</usr/bin/perl>. This is probably OK, as most perl scripts shipped with Solaris use an explicit path. (There are a few exceptions, such as F</usr/bin/rpm2cpio> and F</etc/rcm/scripts/README>, but these are also sufficiently generic that the actual version of perl probably doesn't matter too much.) Solaris ships with a range of Solaris-specific modules. If you choose to install your own version of perl you will find the source of many of these modules is available on CPAN under the Sun::Solaris:: namespace. Solaris may include two versions of perl, e.g. Solaris 9 includes both 5.005_03 and 5.6.1. This is to provide stability across Solaris releases, in cases where a later perl version has incompatibilities with the version included in the preceding Solaris release. The default perl version will always be the most recent, and in general the old version will only be retained for one Solaris release. Note also that the default perl will NOT be configured to search for modules in the older version, again due to compatibility/stability concerns. As a consequence if you upgrade Solaris, you will have to rebuild/reinstall any additional CPAN modules that you installed for the previous Solaris version. See the CPAN manpage under 'autobundle' for a quick way of doing this. As an interim measure, you may either change the #! line of your scripts to specifically refer to the old perl version, e.g. on Solaris 9 use #!/usr/perl5/5.00503/bin/perl to use the perl version that was the default for Solaris 8, or if you have a large number of scripts it may be more convenient to make the old version of perl the default on your system. 
You can do this by changing the appropriate symlinks under /usr/perl5 as follows (example for Solaris 9):

    # cd /usr/perl5
    # rm bin man pod
    # ln -s ./5.00503/bin
    # ln -s ./5.00503/man
    # ln -s ./5.00503/lib/pod
    # rm /usr/bin/perl
    # ln -s ../perl5/5.00503/bin/perl /usr/bin/perl

In both cases this should only be considered to be a temporary measure - you should upgrade to the later version of perl as soon as is practicable.

Note also that the perl command-line utilities (e.g. perldoc) and any that are added by modules that you install will be under /usr/perl5/bin, so that directory should be added to your PATH.

=head2 Solaris Version Numbers.

For consistency with common usage, perl's Configure script performs some minor manipulations on the operating system name and version number as reported by uname. Here's a partial translation table:

              Sun:                      perl's Configure:
    uname    uname -r   Name           osname     osvers
    SunOS    4.1.3     Solaris 1.1     sunos      4.1.3
    SunOS    5.6       Solaris 2.6     solaris    2.6
    SunOS    5.8       Solaris 8       solaris    2.8
    SunOS    5.9       Solaris 9       solaris    2.9
    SunOS    5.10      Solaris 10      solaris    2.10

The complete table can be found in the Sun Managers' FAQ L<ftp://ftp.cs.toronto.edu/pub/jdd/sunmanagers/faq> under "9.1) Which Sun models run which versions of SunOS?".

=head1 RESOURCES

There are many, many sources for Solaris information. A few of the important ones for perl:

=over 4

=item Solaris FAQ

The Solaris FAQ is available at L<http://www.science.uva.nl/pub/solaris/solaris2.html>.

The Sun Managers' FAQ is available at L<ftp://ftp.cs.toronto.edu/pub/jdd/sunmanagers/faq>

=item Precompiled Binaries

Precompiled binaries, links to many sites, and much, much more are available at L<http://www.sunfreeware.com/> and L<http://www.blastwave.org/>.

=item Solaris Documentation

All Solaris documentation is available on-line at L<http://docs.sun.com/>.

=back

=head1 SETTING UP

=head2 File Extraction Problems on Solaris.
Be sure to use a tar program compiled under Solaris (not SunOS 4.x) to extract the perl-5.x.x.tar.gz file. Do not use GNU tar compiled for SunOS4 on Solaris. (GNU tar compiled for Solaris should be fine.) When you run SunOS4 binaries on Solaris, the run-time system magically alters pathnames matching m#lib/locale# so that when tar tries to create lib/locale.pm, a file named lib/oldlocale.pm gets created instead. If you found this advice too late and used a SunOS4-compiled tar anyway, you must find the incorrectly renamed file and move it back to lib/locale.pm. =head2 Compiler and Related Tools on Solaris. You must use an ANSI C compiler to build perl. Perl can be compiled with either Sun's add-on C compiler or with gcc. The C compiler that shipped with SunOS4 will not do. =head3 Include /usr/ccs/bin/ in your PATH. Several tools needed to build perl are located in /usr/ccs/bin/: ar, as, ld, and make. Make sure that /usr/ccs/bin/ is in your PATH. On all the released versions of Solaris (8, 9 and 10) you need to make sure the following packages are installed (this info is extracted from the Solaris FAQ): for tools (sccs, lex, yacc, make, nm, truss, ld, as): SUNWbtool, SUNWsprot, SUNWtoo for libraries & headers: SUNWhea, SUNWarc, SUNWlibm, SUNWlibms, SUNWdfbh, SUNWcg6h, SUNWxwinc Additionally, on Solaris 8 and 9 you also need: for 64 bit development: SUNWarcx, SUNWbtoox, SUNWdplx, SUNWscpux, SUNWsprox, SUNWtoox, SUNWlmsx, SUNWlmx, SUNWlibCx And only on Solaris 8 you also need: for libraries & headers: SUNWolinc If you are in doubt which package contains a file you are missing, try to find an installation that has that file. Then do a $ grep /my/missing/file /var/sadm/install/contents This will display a line like this: /usr/include/sys/errno.h f none 0644 root bin 7471 37605 956241356 SUNWhea The last item listed (SUNWhea in this example) is the package you need. =head3 Avoid /usr/ucb/cc. You don't need to have /usr/ucb/ in your PATH to build perl. 
If you want /usr/ucb/ in your PATH anyway, make sure that /usr/ucb/
is NOT in your PATH before the directory containing the right C
compiler.

=head3 Sun's C Compiler

If you use Sun's C compiler, make sure the correct directory
(usually /opt/SUNWspro/bin/) is in your PATH (before /usr/ucb/).

=head3 GCC

If you use gcc, make sure your installation is recent and complete.
perl versions since 5.6.0 build fine with gcc > 2.8.1 on Solaris >=
2.6.

You must Configure perl with

    $ sh Configure -Dcc=gcc

If you don't, you may experience strange build errors.

If you have updated your Solaris version, you may also have to update
your gcc. For example, if you are running Solaris 2.6 and your gcc is
installed under /usr/local, check in /usr/local/lib/gcc-lib and make
sure you have the appropriate directory, sparc-sun-solaris2.6/ or
i386-pc-solaris2.6/. If gcc's directory is for a different version of
Solaris than you are running, then you will need to rebuild gcc for
your new version of Solaris.

You can get a precompiled version of gcc from
L<http://www.sunfreeware.com/> or L<http://www.blastwave.org/>. Make
sure you pick up the package for your Solaris release.

If you wish to use gcc to build add-on modules for use with the perl
shipped with Solaris, you should use the Solaris::PerlGcc module
which is available from CPAN. The perl shipped with Solaris
is configured and built with the Sun compilers, and the compiler
configuration information stored in Config.pm is therefore only
relevant to the Sun compilers. The Solaris::PerlGcc module contains a
replacement Config.pm that is correct for gcc - see the module for
details.

=head3 GNU as and GNU ld

The following information applies to gcc version 2. Volunteers to
update it as appropriate for gcc version 3 would be appreciated.

The versions of as and ld supplied with Solaris work fine for building
perl. There is normally no need to install the GNU versions to
compile perl.
If you decide to ignore this advice and use the GNU versions anyway, then be sure that they are relatively recent. Versions newer than 2.7 are apparently new enough. Older versions may have trouble with dynamic loading. If you wish to use GNU ld, then you need to pass it the -Wl,-E flag. The hints/solaris_2.sh file tries to do this automatically by setting the following Configure variables: ccdlflags="$ccdlflags -Wl,-E" lddlflags="$lddlflags -Wl,-E -G" However, over the years, changes in gcc, GNU ld, and Solaris ld have made it difficult to automatically detect which ld ultimately gets called. You may have to manually edit config.sh and add the -Wl,-E flags yourself, or else run Configure interactively and add the flags at the appropriate prompts. If your gcc is configured to use GNU as and ld but you want to use the Solaris ones instead to build perl, then you'll need to add -B/usr/ccs/bin/ to the gcc command line. One convenient way to do that is with $ sh Configure -Dcc='gcc -B/usr/ccs/bin/' Note that the trailing slash is required. This will result in some harmless warnings as Configure is run: gcc: file path prefix `/usr/ccs/bin/' never used These messages may safely be ignored. (Note that for a SunOS4 system, you must use -B/bin/ instead.) Alternatively, you can use the GCC_EXEC_PREFIX environment variable to ensure that Sun's as and ld are used. Consult your gcc documentation for further information on the -B option and the GCC_EXEC_PREFIX variable. =head3 Sun and GNU make The make under /usr/ccs/bin works fine for building perl. If you have the Sun C compilers, you will also have a parallel version of make (dmake). This works fine to build perl, but can sometimes cause problems when running 'make test' due to underspecified dependencies between the different test harness files. The same problem can also affect the building of some add-on modules, so in those cases either specify '-m serial' on the dmake command line, or use /usr/ccs/bin/make instead. 
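
The choice between dmake and the standard make described above can be captured in a small helper. This is only a sketch: the function name C<test_make_command> is hypothetical, the C<-m serial> option is the one quoted in this section, and a POSIX shell is assumed (on older Solaris releases that means something like /usr/xpg4/bin/sh or ksh rather than /bin/sh).

```shell
#!/bin/sh
# Sketch only: test_make_command is a hypothetical helper, not part of
# the perl distribution.
#
# test_make_command MAKE
#   Print a command line for running the test suite with MAKE.
#   If MAKE is dmake, add "-m serial" so the tests are not run in
#   parallel; any other make is used unchanged.
test_make_command() {
    case ${1##*/} in
        dmake) echo "$1 -m serial test" ;;
        *)     echo "$1 test" ;;
    esac
}

test_make_command /usr/ccs/bin/make
```

The printed command can be inspected first and then run by hand; the helper itself never invokes make.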
If you wish to use GNU make, be sure that the set-group-id bit is not set. If it is, then arrange your PATH so that /usr/ccs/bin/make is before GNU make or else have the system administrator disable the set-group-id bit on GNU make. =head3 Avoid libucb. Solaris provides some BSD-compatibility functions in /usr/ucblib/libucb.a. Perl will not build and run correctly if linked against -lucb since it contains routines that are incompatible with the standard Solaris libc. Normally this is not a problem since the solaris hints file prevents Configure from even looking in /usr/ucblib for libraries, and also explicitly omits -lucb. =head2 Environment for Compiling perl on Solaris =head3 PATH Make sure your PATH includes the compiler (/opt/SUNWspro/bin/ if you're using Sun's compiler) as well as /usr/ccs/bin/ to pick up the other development tools (such as make, ar, as, and ld). Make sure your path either doesn't include /usr/ucb or that it includes it after the compiler and compiler tools and other standard Solaris directories. You definitely don't want /usr/ucb/cc. =head3 LD_LIBRARY_PATH If you have the LD_LIBRARY_PATH environment variable set, be sure that it does NOT include /lib or /usr/lib. If you will be building extensions that call third-party shared libraries (e.g. Berkeley DB) then make sure that your LD_LIBRARY_PATH environment variable includes the directory with that library (e.g. /usr/local/lib). If you get an error message dlopen: stub interception failed it is probably because your LD_LIBRARY_PATH environment variable includes a directory which is a symlink to /usr/lib (such as /lib). The reason this causes a problem is quite subtle. The file libdl.so.1.0 actually *only* contains functions which generate 'stub interception failed' errors! The runtime linker intercepts links to "/usr/lib/libdl.so.1.0" and links in internal implementations of those functions instead. [Thanks to Tim Bunce for this explanation.] =head1 RUN CONFIGURE. 
See the INSTALL file for general information regarding Configure. Only Solaris-specific issues are discussed here. Usually, the defaults should be fine. =head2 64-bit perl on Solaris. See the INSTALL file for general information regarding 64-bit compiles. In general, the defaults should be fine for most people. By default, perl-5.6.0 (or later) is compiled as a 32-bit application with largefile and long-long support. =head3 General 32-bit vs. 64-bit issues. Solaris 7 and above will run in either 32 bit or 64 bit mode on SPARC CPUs, via a reboot. You can build 64 bit apps whilst running 32 bit mode and vice-versa. 32 bit apps will run under Solaris running in either 32 or 64 bit mode. 64 bit apps require Solaris to be running 64 bit mode. Existing 32 bit apps are properly known as LP32, i.e. Longs and Pointers are 32 bit. 64-bit apps are more properly known as LP64. The discriminating feature of a LP64 bit app is its ability to utilise a 64-bit address space. It is perfectly possible to have a LP32 bit app that supports both 64-bit integers (long long) and largefiles (> 2GB), and this is the default for perl-5.6.0. For a more complete explanation of 64-bit issues, see the "Solaris 64-bit Developer's Guide" at L<http://docs.sun.com/> You can detect the OS mode using "isainfo -v", e.g. $ isainfo -v # Ultra 30 in 64 bit mode 64-bit sparcv9 applications 32-bit sparc applications By default, perl will be compiled as a 32-bit application. Unless you want to allocate more than ~ 4GB of memory inside perl, or unless you need more than 255 open file descriptors, you probably don't need perl to be a 64-bit app. =head3 Large File Support For Solaris 2.6 and onwards, there are two different ways for 32-bit applications to manipulate large files (files whose size is > 2GByte). (A 64-bit application automatically has largefile support built in by default.) First is the "transitional compilation environment", described in lfcompile64(5). 
According to the man page, The transitional compilation environment exports all the explicit 64-bit functions (xxx64()) and types in addition to all the regular functions (xxx()) and types. Both xxx() and xxx64() functions are available to the program source. A 32-bit application must use the xxx64() functions in order to access large files. See the lf64(5) manual page for a complete listing of the 64-bit transitional interfaces. The transitional compilation environment is obtained with the following compiler and linker flags: getconf LFS64_CFLAGS -D_LARGEFILE64_SOURCE getconf LFS64_LDFLAG # nothing special needed getconf LFS64_LIBS # nothing special needed Second is the "large file compilation environment", described in lfcompile(5). According to the man page, Each interface named xxx() that needs to access 64-bit entities to access large files maps to a xxx64() call in the resulting binary. All relevant data types are defined to be of correct size (for example, off_t has a typedef definition for a 64-bit entity). An application compiled in this environment is able to use the xxx() source interfaces to access both large and small files, rather than having to explicitly utilize the transitional xxx64() interface calls to access large files. Two exceptions are fseek() and ftell(). 32-bit applications should use fseeko(3C) and ftello(3C). These will get automatically mapped to fseeko64() and ftello64(). The large file compilation environment is obtained with getconf LFS_CFLAGS -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64 getconf LFS_LDFLAGS # nothing special needed getconf LFS_LIBS # nothing special needed By default, perl uses the large file compilation environment and relies on Solaris to do the underlying mapping of interfaces. =head3 Building an LP64 perl To compile a 64-bit application on an UltraSparc with a recent Sun Compiler, you need to use the flag "-xarch=v9". getconf(1) will tell you this, e.g. 

    $ getconf -a | grep v9
    XBS5_LP64_OFF64_CFLAGS:         -xarch=v9
    XBS5_LP64_OFF64_LDFLAGS:        -xarch=v9
    XBS5_LP64_OFF64_LINTFLAGS:      -xarch=v9
    XBS5_LPBIG_OFFBIG_CFLAGS:       -xarch=v9
    XBS5_LPBIG_OFFBIG_LDFLAGS:      -xarch=v9
    XBS5_LPBIG_OFFBIG_LINTFLAGS:    -xarch=v9
    _XBS5_LP64_OFF64_CFLAGS:        -xarch=v9
    _XBS5_LP64_OFF64_LDFLAGS:       -xarch=v9
    _XBS5_LP64_OFF64_LINTFLAGS:     -xarch=v9
    _XBS5_LPBIG_OFFBIG_CFLAGS:      -xarch=v9
    _XBS5_LPBIG_OFFBIG_LDFLAGS:     -xarch=v9
    _XBS5_LPBIG_OFFBIG_LINTFLAGS:   -xarch=v9

This flag is supported in Sun WorkShop Compilers 5.0 and onwards
(now marketed under the name Forte) when used on Solaris 7 or later on
UltraSparc systems.

If you are using gcc, you would need to use -mcpu=v9 -m64 instead. This
option is not yet supported as of gcc 2.95.2; from install/SPECIFIC
in that release:

    GCC version 2.95 is not able to compile code correctly for sparc64
    targets. Users of the Linux kernel, at least, can use the sparc32
    program to start up a new shell invocation with an environment that
    causes configure to recognize (via uname -a) the system as sparc-*-*
    instead.

All this should be handled automatically by the hints file, if
requested.

=head3 Long Doubles.

As of 5.8.1, long doubles are working if you use the Sun compilers
(needed for additional math routines not included in libm).

=head2 Threads in perl on Solaris.

It is possible to build a threaded version of perl on Solaris. The entire
perl thread implementation is still experimental, however, so beware.

=head2 Malloc Issues with perl on Solaris.

Starting from perl 5.7.1 perl uses the Solaris malloc, since the perl
malloc breaks when dealing with more than 2GB of memory, and the Solaris
malloc also seems to be faster.

If for some reason (such as binary backward compatibility) you really
need to use perl's malloc, you can rebuild perl from the sources
and Configure the build with

    $ sh Configure -Dusemymalloc

You should not use perl's malloc if you are building with gcc. There
are reports of core dumps, especially in the PDL module.
The problem appears to go away under -DDEBUGGING, so it has been difficult to track down. Sun's compiler appears to be okay with or without perl's malloc. [XXX further investigation is needed here.] =head1 MAKE PROBLEMS. =over 4 =item Dynamic Loading Problems With GNU as and GNU ld If you have problems with dynamic loading using gcc on SunOS or Solaris, and you are using GNU as and GNU ld, see the section L</"GNU as and GNU ld"> above. =item ld.so.1: ./perl: fatal: relocation error: If you get this message on SunOS or Solaris, and you're using gcc, it's probably the GNU as or GNU ld problem in the previous item L</"GNU as and GNU ld">. =item dlopen: stub interception failed The primary cause of the 'dlopen: stub interception failed' message is that the LD_LIBRARY_PATH environment variable includes a directory which is a symlink to /usr/lib (such as /lib). See L</"LD_LIBRARY_PATH"> above. =item #error "No DATAMODEL_NATIVE specified" This is a common error when trying to build perl on Solaris 2.6 with a gcc installation from Solaris 2.5 or 2.5.1. The Solaris header files changed, so you need to update your gcc installation. You can either rerun the fixincludes script from gcc or take the opportunity to update your gcc installation. =item sh: ar: not found This is a message from your shell telling you that the command 'ar' was not found. You need to check your PATH environment variable to make sure that it includes the directory with the 'ar' command. This is a common problem on Solaris, where 'ar' is in the /usr/ccs/bin/ directory. =back =head1 MAKE TEST =head2 op/stat.t test 4 in Solaris F<op/stat.t> test 4 may fail if you are on a tmpfs of some sort. Building in /tmp sometimes shows this behavior. The test suite detects if you are building in /tmp, but it may not be able to catch all tmpfs situations. =head2 nss_delete core dump from op/pwent or op/grent See L<perlhpux/"nss_delete core dump from op/pwent or op/grent">. 
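
Several of the build problems listed above ('sh: ar: not found', picking up /usr/ucb/cc) come down to the order of directories in PATH. The following is a minimal sketch of a pre-flight check, assuming a POSIX shell (on older Solaris releases, /usr/xpg4/bin/sh or ksh rather than /bin/sh). The function names C<path_position> and C<check_build_path> are hypothetical, and the check is deliberately simplified: it only compares /usr/ccs/bin against /usr/ucb and does not look for the compiler directory.

```shell
#!/bin/sh
# Sketch only: both functions are hypothetical helpers.

# path_position DIR PATHSTRING
#   Print the 1-based position of DIR in the colon-separated
#   PATHSTRING, or 0 if it is absent. A trailing slash is ignored.
#   The body runs in a subshell, so IFS changes do not leak out.
path_position() (
    want=${1%/}
    pos=0
    IFS=:
    set -f
    for entry in $2; do
        pos=$((pos + 1))
        if [ "${entry%/}" = "$want" ]; then
            echo "$pos"
            exit 0
        fi
    done
    echo 0
)

# check_build_path PATHSTRING
#   Report whether /usr/ccs/bin is present and comes before /usr/ucb.
check_build_path() {
    ccs=$(path_position /usr/ccs/bin "$1")
    ucb=$(path_position /usr/ucb "$1")
    if [ "$ccs" -eq 0 ]; then
        echo "missing /usr/ccs/bin"
    elif [ "$ucb" -ne 0 ] && [ "$ucb" -lt "$ccs" ]; then
        echo "/usr/ucb precedes /usr/ccs/bin"
    else
        echo "ok"
    fi
}

check_build_path "$PATH"
```

Anything other than C<ok> suggests fixing PATH before running Configure again.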
=head1 CROSS-COMPILATION

Nothing too unusual here. You can easily do this if you have a
cross-compiler available. A typical Configure invocation when
targeting Solaris x86 looks something like this:

    sh ./Configure -des -Dusecrosscompile \
        -Dcc=i386-pc-solaris2.11-gcc      \
        -Dsysroot=$SYSROOT                \
        -Alddlflags=" -Wl,-z,notext"      \
        -Dtargethost=... # The usual cross-compilation options

The lddlflags addition is the only abnormal bit.

=head1 PREBUILT BINARIES OF PERL FOR SOLARIS.

You can pick up prebuilt binaries for Solaris from
L<http://www.sunfreeware.com/>, L<http://www.blastwave.org>,
ActiveState L<http://www.activestate.com/>, and
L<http://www.perl.com/> under the Binaries list at the top of the
page. There are probably other sources as well. Please note that
these sites are under the control of their respective owners, not the
perl developers.

=head1 RUNTIME ISSUES FOR PERL ON SOLARIS.

=head2 Limits on Numbers of Open Files on Solaris.

The stdio(3C) manpage notes that for LP32 applications, only 255
files may be opened using fopen(), and only file descriptors 0
through 255 can be used in a stream. Since perl calls open() and
then fdopen(3C) with the resulting file descriptor, perl is limited
to 255 simultaneous open files, even if sysopen() is used. If this
proves to be an insurmountable problem, you can compile perl as a
LP64 application; see L</Building an LP64 perl> for details. Note
also that the default resource limit for open file descriptors on
Solaris is 255, so you will have to modify your ulimit or rctl
(Solaris 9 onwards) appropriately.

=head1 SOLARIS-SPECIFIC MODULES.

See the modules under the Solaris:: and Sun::Solaris namespaces on
CPAN; see L<http://www.cpan.org/modules/by-module/Solaris/> and
L<http://www.cpan.org/modules/by-module/Sun/>.

=head1 SOLARIS-SPECIFIC PROBLEMS WITH MODULES.

=head2 Proc::ProcessTable on Solaris

Proc::ProcessTable does not compile on Solaris with perl 5.6.0 and
higher if you have LARGEFILES defined.
Since largefile support is the default in 5.6.0 and later, you have to take special steps to use this module. The problem is that various structures visible via procfs use off_t, and if you compile with largefile support these change from 32 bits to 64 bits. Thus what you get back from procfs doesn't match up with the structures in perl, resulting in garbage. See proc(4) for further discussion. A fix for Proc::ProcessTable is to edit Makefile to explicitly remove the largefile flags from the ones MakeMaker picks up from Config.pm. This will result in Proc::ProcessTable being built under the correct environment. Everything should then be OK as long as Proc::ProcessTable doesn't try to share off_t's with the rest of perl, or if it does they should be explicitly specified as off64_t. =head2 BSD::Resource on Solaris BSD::Resource versions earlier than 1.09 do not compile on Solaris with perl 5.6.0 and higher, for the same reasons as Proc::ProcessTable. BSD::Resource versions starting from 1.09 have a workaround for the problem. =head2 Net::SSLeay on Solaris Net::SSLeay requires a /dev/urandom to be present. This device is available from Solaris 9 onwards. For earlier Solaris versions you can either get the package SUNWski (packaged with several Sun software products, for example the Sun WebServer, which is part of the Solaris Server Intranet Extension, or the Sun Directory Services, part of Solaris for ISPs) or download the ANDIrand package from L<http://www.cosy.sbg.ac.at/~andi/>. If you use SUNWski, make a symbolic link /dev/urandom pointing to /dev/random. For more details, see Document ID27606 entitled "Differing /dev/random support requirements within Solaris[TM] Operating Environments", available at L<http://sunsolve.sun.com> . It may be possible to use the Entropy Gathering Daemon (written in Perl!), available from L<http://www.lothar.com/tech/crypto/>. 
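
Whether the Net::SSLeay requirement above is met can be checked before installing the module. This is a sketch under stated assumptions: the function name C<random_device_status> and its three result words are hypothetical, the optional arguments exist only so the check can be exercised against other paths, and a POSIX shell is assumed (the C<-e> test is not available in the old Solaris /bin/sh).

```shell
#!/bin/sh
# Sketch only: random_device_status is a hypothetical helper.
#
# random_device_status [URANDOM [RANDOM]]
#   Print "present" if the urandom device exists, "link-needed" if only
#   the random device exists (the SUNWski case described above), and
#   "missing" if neither exists.
random_device_status() {
    urandom=${1:-/dev/urandom}
    random=${2:-/dev/random}
    if [ -e "$urandom" ]; then
        echo "present"
    elif [ -e "$random" ]; then
        echo "link-needed"
    else
        echo "missing"
    fi
}

random_device_status
```

A result of C<link-needed> corresponds to the SUNWski situation, where the symbolic link would be created as root with C<ln -s /dev/random /dev/urandom>.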
=head1 SunOS 4.x

In SunOS 4.x you most probably want to use the SunOS ld, /usr/bin/ld,
since the more recent versions of GNU ld (like 2.13) do not seem to
work for building Perl anymore. When linking the extensions, the
GNU ld gets very unhappy and spews a lot of errors like this

    ... relocation truncated to fit: BASE13 ...

and dies. Therefore the SunOS 4.1 hints file explicitly sets the
ld to be F</usr/bin/ld>.

As of Perl 5.8.1 the dynamic loading of libraries (DynaLoader, XSLoader)
also seems to have become broken in SunOS 4.x. Therefore the default
is to build Perl statically.

Running the test suite in SunOS 4.1 is a bit tricky since the
F<dist/Tie-File/t/09_gen_rs.t> test hangs (subtest #51, FWIW) for some
unknown reason. Just stop the test and kill that particular Perl
process.

There are various other failures that, as of SunOS 4.1.4 and gcc 3.2.2,
look a lot like gcc bugs. Many of the failures happen in the Encode
tests, where for example when the test expects "0" you get "&#48;"
which should after a little squinting look very odd indeed.
Another example is earlier in F<t/run/fresh_perl> where chr(0xff) is
expected but the test fails because the result is chr(0xff). Exactly.

This is the "make test" result from the said combination:

    Failed 27 test scripts out of 745, 96.38% okay.

Running the C<harness> is painful because the many failing
Unicode-related tests will output megabytes of failure messages,
but if one patiently waits, one gets these results:

    Failed Test                     Stat Wstat Total Fail  Failed  List of Failed
    -----------------------------------------------------------------------------
    ...
    ../ext/Encode/t/at-cn.t            4  1024    29    4  13.79%  14-17
    ../ext/Encode/t/at-tw.t           10  2560    17   10  58.82%  2 4 6 8 10 12 14-17
    ../ext/Encode/t/enc_data.t        29  7424    ??   ??       %  ??
    ../ext/Encode/t/enc_eucjp.t       29  7424    ??   ??       %  ??
    ../ext/Encode/t/enc_module.t      29  7424    ??   ??       %  ??
    ../ext/Encode/t/encoding.t        29  7424    ??   ??       %  ??
    ../ext/Encode/t/grow.t            12  3072    24   12  50.00%  2 4 6 8 10 12 14 16 18 20 22 24
    ../ext/Encode/t/guess.t          255 65280    29   40 137.93%  10-29
    ../ext/Encode/t/jperl.t           29  7424    15   30 200.00%  1-15
    ../ext/Encode/t/mime-header.t      2   512    10    2  20.00%  2-3
    ../ext/Encode/t/perlio.t          22  5632    38   22  57.89%  1-4 9-16 19-20 23-24 27-32
    ../ext/List/Util/t/shuffle.t       0   139    ??   ??       %  ??
    ../ext/PerlIO/t/encoding.t                    14    1   7.14%  11
    ../ext/PerlIO/t/fallback.t                     9    2  22.22%  3 5
    ../ext/Socket/t/socketpair.t       0     2    45   70 155.56%  11-45
    ../lib/CPAN/t/vcmp.t                          30    1   3.33%  25
    ../lib/Tie/File/t/09_gen_rs.t      0    15    ??   ??       %  ??
    ../lib/Unicode/Collate/t/test.t              199   30  15.08%  7 26-27 71-75 81-88 95 101
                                                                   103-104 106 108-109 122 124
                                                                   161 169-172
    ../lib/sort.t                      0   139   119   26  21.85%  107-119
    op/alarm.t                                     4    1  25.00%  4
    op/utfhash.t                                  97    1   1.03%  31
    run/fresh_perl.t                              91    1   1.10%  32
    uni/tr_7jis.t                                 ??   ??       %  ??
    uni/tr_eucjp.t                    29  7424     6   12 200.00%  1-6
    uni/tr_sjis.t                     29  7424     6   12 200.00%  1-6
    56 tests and 467 subtests skipped.
    Failed 27/811 test scripts, 96.67% okay. 1383/75399 subtests failed, 98.17% okay.

The alarm() test failure is caused by system() apparently blocking
alarm(). That is probably a libc bug, and given that SunOS 4.x
was end-of-lifed years ago, don't hold your breath for a fix.
In addition to that, don't try anything too Unicode-y, especially
with Encode, and you should be fine in SunOS 4.x.

=head1 AUTHOR

The original was written by Andy Dougherty F<doughera@lafayette.edu>
drawing heavily on advice from Alan Burlison, Nick Ing-Simmons, Tim Bunce,
and many other Solaris users over the years.

Please report any errors, updates, or suggestions to
L<https://github.com/Perl/perl5/issues>.

=head1 NAME

perl586delta - what is new for perl v5.8.6

=head1 DESCRIPTION

This document describes differences between the 5.8.5 release and
the 5.8.6 release.

=head1 Incompatible Changes

There are no changes incompatible with 5.8.5.

=head1 Core Enhancements

The perl interpreter is now more tolerant of UTF-16-encoded scripts.

On Win32, Perl can now use non-IFS compatible LSPs, which allows Perl to
work in conjunction with firewalls such as McAfee Guardian. For full details
see the file F<README.win32>, particularly if you're running Win95.

=head1 Modules and Pragmata

=over 4

=item *

With the C<base> pragma, an intermediate class with no fields used to mess
up private fields in the base class. This has been fixed.

=item *

Cwd upgraded to version 3.01 (as part of the new PathTools distribution)

=item *

Devel::PPPort upgraded to version 3.03

=item *

File::Spec upgraded to version 3.01 (as part of the new PathTools distribution)

=item *

Encode upgraded to version 2.08

=item *

ExtUtils::MakeMaker remains at version 6.17, as later stable releases currently
available on CPAN have some issues with core modules on some core platforms.

=item *

I18N::LangTags upgraded to version 0.35

=item *

Math::BigInt upgraded to version 1.73

=item *

Math::BigRat upgraded to version 0.13

=item *

MIME::Base64 upgraded to version 3.05

=item *

POSIX::sigprocmask function can now retrieve the current signal mask without
also setting it.

=item *

Time::HiRes upgraded to version 1.65

=back

=head1 Utility Changes

Perl has a new -dt command-line flag, which enables threads support in the
debugger.

=head1 Performance Enhancements

C<reverse sort ...> is now optimized to sort in reverse, avoiding the generation
of a temporary intermediate list.

C<for (reverse @foo)> now iterates in reverse, avoiding the generation of a
temporary reversed list.

=head1 Selected Bug Fixes

The regexp engine is now more robust when given invalid utf8 input, as is
sometimes generated by buggy XS modules.

C<foreach> on threads::shared array used to be able to crash Perl. This bug
has now been fixed.

A regexp in C<STDOUT>'s destructor used to coredump, because the regexp pad was already freed. This has been fixed. C<goto &> is now more robust - bugs in deep recursion and chained C<goto &> have been fixed. Using C<delete> on an array no longer leaks memory. A C<pop> of an item from a shared array reference no longer causes a leak. C<eval_sv()> failing a taint test could corrupt the stack - this has been fixed. On platforms with 64 bit pointers numeric comparison operators used to erroneously compare the addresses of references that are overloaded, rather than using the overloaded values. This has been fixed. C<read> into a UTF8-encoded buffer with an offset off the end of the buffer no longer mis-calculates buffer lengths. Although Perl has promised since version 5.8 that C<sort()> would be stable, the two cases C<sort {$b cmp $a}> and C<< sort {$b <=> $a} >> could produce non-stable sorts. This is corrected in perl5.8.6. Localising C<$^D> no longer generates a diagnostic message about valid -D flags. =head1 New or Changed Diagnostics For -t and -T, Too late for "-T" option has been changed to the more informative "-T" is on the #! line, it must also be used on the command line =head1 Changed Internals From now on all applications embedding perl will behave as if perl were compiled with -DPERL_USE_SAFE_PUTENV. See "Environment access" in the F<INSTALL> file for details. Most C<C> source files now have comments at the top explaining their purpose, which should help anyone wishing to get an overview of the implementation. =head1 New Tests There are significantly more tests for the C<B> suite of modules. =head1 Reporting Bugs If you find what you think is a bug, you might check the articles recently posted to the comp.lang.perl.misc newsgroup and the perl bug database at http://bugs.perl.org. There may also be information at http://www.perl.org, the Perl Home Page. 
If you believe you have an unreported bug, please run the B<perlbug>
program included with your release. Be sure to trim your bug down
to a tiny but sufficient test case. Your bug report, along with the
output of C<perl -V>, will be sent off to perlbug@perl.org to be
analysed by the Perl porting team. You can browse and search
the Perl 5 bugs at http://bugs.perl.org/

=head1 SEE ALSO

The F<Changes> file for exhaustive details on what changed.

The F<INSTALL> file for how to build Perl.

The F<README> file for general stuff.

The F<Artistic> and F<Copying> files for copyright information.

=cut

# !!!!!!!   DO NOT EDIT THIS FILE   !!!!!!!
# This file is autogenerated by buildtoc from all the other pods.
# Edit those files and run pod/buildtoc to effect changes.

=encoding UTF-8

=head1 NAME

perltoc - perl documentation table of contents

=head1 DESCRIPTION

This page provides a brief table of contents for the rest of the Perl
documentation set. It is meant to be scanned quickly or grepped
through to locate the proper section you're looking for.

=head1 BASIC DOCUMENTATION

=head2 perl - The Perl 5 language interpreter

=over 4

=item SYNOPSIS

=item GETTING HELP

=over 4

=item Overview

=item Tutorials

=item Reference Manual

=item Internals and C Language Interface

=item Miscellaneous

=item Language-Specific

=item Platform-Specific

=item Stubs for Deleted Documents

=back

=item DESCRIPTION

=item AVAILABILITY

=item ENVIRONMENT

=item AUTHOR

=item FILES

=item SEE ALSO

=item DIAGNOSTICS

=item BUGS

=item NOTES

=back

=head2 perlintro -- a brief introduction and overview of Perl

=over 4

=item DESCRIPTION

=over 4

=item What is Perl?

=item Running Perl programs =item Safety net =item Basic syntax overview =item Perl variable types Scalars, Arrays, Hashes =item Variable scoping =item Conditional and looping constructs if, while, for, foreach =item Builtin operators and functions Arithmetic, Numeric comparison, String comparison, Boolean logic, Miscellaneous =item Files and I/O =item Regular expressions Simple matching, Simple substitution, More complex regular expressions, Parentheses for capturing, Other regexp features =item Writing subroutines =item OO Perl =item Using Perl modules =back =item AUTHOR =back =head2 perlrun - how to execute the Perl interpreter =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item #! and quoting on non-Unix systems X<hashbang> X<#!> OS/2, MS-DOS, Win95/NT, VMS =item Location of Perl X<perl, location of interpreter> =item Command Switches X<perl, command switches> X<command switches> B<-0>[I<octal/hexadecimal>] X<-0> X<$/>, B<-a> X<-a> X<autosplit>, B<-C [I<number/list>]> X<-C>, B<-c> X<-c>, B<-d> X<-d> X<-dt>, B<-dt>, B<-d:>I<MOD[=bar,baz]> X<-d> X<-dt>, B<-dt:>I<MOD[=bar,baz]>, B<-D>I<letters> X<-D> X<DEBUGGING> X<-DDEBUGGING>, B<-D>I<number>, B<-e> I<commandline> X<-e>, B<-E> I<commandline> X<-E>, B<-f> X<-f> X<sitecustomize> X<sitecustomize.pl>, B<-F>I<pattern> X<-F>, B<-h> X<-h>, B<-i>[I<extension>] X<-i> X<in-place>, B<-I>I<directory> X<-I> X<@INC>, B<-l>[I<octnum>] X<-l> X<$/> X<$\>, B<-m>[B<->]I<module> X<-m> X<-M>, B<-M>[B<->]I<module>, B<-M>[B<->]I<'module ...'>, B<-[mM]>[B<->]I<module=arg[,arg]...>, B<-n> X<-n>, B<-p> X<-p>, B<-s> X<-s>, B<-S> X<-S>, B<-t> X<-t>, B<-T> X<-T>, B<-u> X<-u>, B<-U> X<-U>, B<-v> X<-v>, B<-V> X<-V>, B<-V:>I<configvar>, B<-w> X<-w>, B<-W> X<-W>, B<-X> X<-X>, B<-x> X<-x>, B<-x>I<directory> =back =item ENVIRONMENT X<perl, environment variables> HOME X<HOME>, LOGDIR X<LOGDIR>, PATH X<PATH>, PERL5LIB X<PERL5LIB>, PERL5OPT X<PERL5OPT>, PERLIO X<PERLIO>, :crlf X<:crlf>, :perlio X<:perlio>, :stdio X<:stdio>, :unix X<:unix>, :win32 
X<:win32>, PERLIO_DEBUG X<PERLIO_DEBUG>, PERLLIB X<PERLLIB>, PERL5DB X<PERL5DB>, PERL5DB_THREADED X<PERL5DB_THREADED>, PERL5SHELL (specific to the Win32 port) X<PERL5SHELL>, PERL_ALLOW_NON_IFS_LSP (specific to the Win32 port) X<PERL_ALLOW_NON_IFS_LSP>, PERL_DEBUG_MSTATS X<PERL_DEBUG_MSTATS>, PERL_DESTRUCT_LEVEL X<PERL_DESTRUCT_LEVEL>, PERL_DL_NONLAZY X<PERL_DL_NONLAZY>, PERL_ENCODING X<PERL_ENCODING>, PERL_HASH_SEED X<PERL_HASH_SEED>, PERL_PERTURB_KEYS X<PERL_PERTURB_KEYS>, PERL_HASH_SEED_DEBUG X<PERL_HASH_SEED_DEBUG>, PERL_MEM_LOG X<PERL_MEM_LOG>, PERL_ROOT (specific to the VMS port) X<PERL_ROOT>, PERL_SIGNALS X<PERL_SIGNALS>, PERL_UNICODE X<PERL_UNICODE>, PERL_USE_UNSAFE_INC X<PERL_USE_UNSAFE_INC>, SYS$LOGIN (specific to the VMS port) X<SYS$LOGIN>, PERL_INTERNAL_RAND_SEED X<PERL_INTERNAL_RAND_SEED> =item ORDER OF APPLICATION -I, -M, the PERL5LIB environment variable, combinations of -I, -M and PERL5LIB, the PERL5OPT environment variable, Other complications, arch and version subdirs, sitecustomize.pl =back =head2 perlreftut - Mark's very short tutorial about references =over 4 =item DESCRIPTION =item Who Needs Complicated Data Structures? 
=item The Solution =item Syntax =over 4 =item Making References =item Using References =item An Example =item Arrow Rule =back =item Solution =item The Rest =item Summary =item Credits =over 4 =item Distribution Conditions =back =back =head2 perldsc - Perl Data Structures Cookbook =over 4 =item DESCRIPTION arrays of arrays, hashes of arrays, arrays of hashes, hashes of hashes, more elaborate constructs =item REFERENCES X<reference> X<dereference> X<dereferencing> X<pointer> =item COMMON MISTAKES =item CAVEAT ON PRECEDENCE X<dereference, precedence> X<dereferencing, precedence> =item WHY YOU SHOULD ALWAYS C<use strict> =item DEBUGGING X<data structure, debugging> X<complex data structure, debugging> X<AoA, debugging> X<HoA, debugging> X<AoH, debugging> X<HoH, debugging> X<array of arrays, debugging> X<hash of arrays, debugging> X<array of hashes, debugging> X<hash of hashes, debugging> =item CODE EXAMPLES =item ARRAYS OF ARRAYS X<array of arrays> X<AoA> =over 4 =item Declaration of an ARRAY OF ARRAYS =item Generation of an ARRAY OF ARRAYS =item Access and Printing of an ARRAY OF ARRAYS =back =item HASHES OF ARRAYS X<hash of arrays> X<HoA> =over 4 =item Declaration of a HASH OF ARRAYS =item Generation of a HASH OF ARRAYS =item Access and Printing of a HASH OF ARRAYS =back =item ARRAYS OF HASHES X<array of hashes> X<AoH> =over 4 =item Declaration of an ARRAY OF HASHES =item Generation of an ARRAY OF HASHES =item Access and Printing of an ARRAY OF HASHES =back =item HASHES OF HASHES X<hash of hashes> X<HoH> =over 4 =item Declaration of a HASH OF HASHES =item Generation of a HASH OF HASHES =item Access and Printing of a HASH OF HASHES =back =item MORE ELABORATE RECORDS X<record> X<structure> X<struct> =over 4 =item Declaration of MORE ELABORATE RECORDS =item Declaration of a HASH OF COMPLEX RECORDS =item Generation of a HASH OF COMPLEX RECORDS =back =item Database Ties =item SEE ALSO =item AUTHOR =back =head2 perllol - Manipulating Arrays of Arrays in Perl =over 4 =item 
DESCRIPTION =over 4 =item Declaration and Access of Arrays of Arrays =item Growing Your Own =item Access and Printing =item Slices =back =item SEE ALSO =item AUTHOR =back =head2 perlrequick - Perl regular expressions quick start =over 4 =item DESCRIPTION =item The Guide =over 4 =item Simple word matching =item Using character classes =item Matching this or that =item Grouping things and hierarchical matching =item Extracting matches =item Matching repetitions =item More matching =item Search and replace =item The split operator =item C<use re 'strict'> =back =item BUGS =item SEE ALSO =item AUTHOR AND COPYRIGHT =over 4 =item Acknowledgments =back =back =head2 perlretut - Perl regular expressions tutorial =over 4 =item DESCRIPTION =item Part 1: The basics =over 4 =item Simple word matching =item Using character classes =item Matching this or that =item Grouping things and hierarchical matching Z<>0. Start with the first letter in the string C<'a'>, Z<>1. Try the first alternative in the first group C<'abd'>, Z<>2. Match C<'a'> followed by C<'b'>. So far so good, Z<>3. C<'d'> in the regexp doesn't match C<'c'> in the string - a dead end. So backtrack two characters and pick the second alternative in the first group C<'abc'>, Z<>4. Match C<'a'> followed by C<'b'> followed by C<'c'>. We are on a roll and have satisfied the first group. Set C<$1> to C<'abc'>, Z<>5. Move on to the second group and pick the first alternative C<'df'>, Z<>6. Match the C<'d'>, Z<>7. C<'f'> in the regexp doesn't match C<'e'> in the string, so a dead end. Backtrack one character and pick the second alternative in the second group C<'d'>, Z<>8. C<'d'> matches. The second grouping is satisfied, so set C<$2> to C<'d'>, Z<>9. We are at the end of the regexp, so we are done! 
We have matched C<'abcd'> out of the string C<"abcde"> =item Extracting matches =item Backreferences =item Relative backreferences =item Named backreferences =item Alternative capture group numbering =item Position information =item Non-capturing groupings =item Matching repetitions Z<>0. Start with the first letter in the string C<'t'>, Z<>1. The first quantifier C<'.*'> starts out by matching the whole string "C<the cat in the hat>", Z<>2. C<'a'> in the regexp element C<'at'> doesn't match the end of the string. Backtrack one character, Z<>3. C<'a'> in the regexp element C<'at'> still doesn't match the last letter of the string C<'t'>, so backtrack one more character, Z<>4. Now we can match the C<'a'> and the C<'t'>, Z<>5. Move on to the third element C<'.*'>. Since we are at the end of the string and C<'.*'> can match 0 times, assign it the empty string, Z<>6. We are done! =item Possessive quantifiers =item Building a regexp =item Using regular expressions in Perl =back =item Part 2: Power tools =over 4 =item More on characters, strings, and character classes =item Compiling and saving regular expressions =item Composing regular expressions at runtime =item Embedding comments and modifiers in a regular expression =item Looking ahead and looking behind =item Using independent subexpressions to prevent backtracking =item Conditional expressions =item Defining named patterns =item Recursive patterns =item A bit of magic: executing Perl code in a regular expression =item Backtracking control verbs =item Pragmas and debugging =back =item SEE ALSO =item AUTHOR AND COPYRIGHT =over 4 =item Acknowledgments =back =back =head2 perlootut - Object-Oriented Programming in Perl Tutorial =over 4 =item DATE =item DESCRIPTION =item OBJECT-ORIENTED FUNDAMENTALS =over 4 =item Object =item Class =item Methods =item Attributes =item Polymorphism =item Inheritance =item Encapsulation =item Composition =item Roles =item When to Use OO =back =item PERL OO SYSTEMS =over 4 =item Moose 
Declarative sugar, Roles built-in, A miniature type system, Full introspection and manipulation, Self-hosted and extensible, Rich ecosystem, Many more features =item Class::Accessor =item Class::Tiny =item Role::Tiny =item OO System Summary L<Moose>, L<Class::Accessor>, L<Class::Tiny>, L<Role::Tiny> =item Other OO Systems =back =item CONCLUSION =back =head2 perlperf - Perl Performance and Optimization Techniques =over 4 =item DESCRIPTION =item OVERVIEW =over 4 =item ONE STEP SIDEWAYS =item ONE STEP FORWARD =item ANOTHER STEP SIDEWAYS =back =item GENERAL GUIDELINES =item BENCHMARKS =over 4 =item Assigning and Dereferencing Variables. =item Search and replace or tr =back =item PROFILING TOOLS =over 4 =item Devel::DProf =item Devel::Profiler =item Devel::SmallProf =item Devel::FastProf =item Devel::NYTProf =back =item SORTING Elapsed Real Time, User CPU Time, System CPU Time =item LOGGING =over 4 =item Logging if DEBUG (constant) =back =item POSTSCRIPT =item SEE ALSO =over 4 =item PERLDOCS =item MAN PAGES =item MODULES =item URLS =back =item AUTHOR =back =head2 perlstyle - Perl style guide =over 4 =item DESCRIPTION =back =head2 perlcheat - Perl 5 Cheat Sheet =over 4 =item DESCRIPTION =over 4 =item The sheet =back =item ACKNOWLEDGEMENTS =item AUTHOR =item SEE ALSO =back =head2 perltrap - Perl traps for the unwary =over 4 =item DESCRIPTION =over 4 =item Awk Traps =item C/C++ Traps =item JavaScript Traps =item Sed Traps =item Shell Traps =item Perl Traps =back =back =head2 perldebtut - Perl debugging tutorial =over 4 =item DESCRIPTION =item use strict =item Looking at data and -w and v =item help =item Stepping through code =item Placeholder for a, w, t, T =item REGULAR EXPRESSIONS =item OUTPUT TIPS =item CGI =item GUIs =item SUMMARY =item SEE ALSO =item AUTHOR =item CONTRIBUTORS =back =head2 perlfaq - Frequently asked questions about Perl =over 4 =item VERSION =item DESCRIPTION =over 4 =item Where to find the perlfaq =item How to use the perlfaq =item How to contribute 
to the perlfaq =item What if my question isn't answered in the FAQ? =back =item TABLE OF CONTENTS perlfaq1 - General Questions About Perl, perlfaq2 - Obtaining and Learning about Perl, perlfaq3 - Programming Tools, perlfaq4 - Data Manipulation, perlfaq5 - Files and Formats, perlfaq6 - Regular Expressions, perlfaq7 - General Perl Language Issues, perlfaq8 - System Interaction, perlfaq9 - Web, Email and Networking =item THE QUESTIONS =over 4 =item L<perlfaq1>: General Questions About Perl =item L<perlfaq2>: Obtaining and Learning about Perl =item L<perlfaq3>: Programming Tools =item L<perlfaq4>: Data Manipulation =item L<perlfaq5>: Files and Formats =item L<perlfaq6>: Regular Expressions =item L<perlfaq7>: General Perl Language Issues =item L<perlfaq8>: System Interaction =item L<perlfaq9>: Web, Email and Networking =back =item CREDITS =item AUTHOR AND COPYRIGHT =back =head2 perlfaq1 - General Questions About Perl =over 4 =item VERSION =item DESCRIPTION =over 4 =item What is Perl? =item Who supports Perl? Who develops it? Why is it free? =item Which version of Perl should I use? =item What are Perl 4, Perl 5, or Raku (Perl 6)? =item What is Raku (Perl 6)? =item How stable is Perl? =item How often are new versions of Perl released? =item Is Perl difficult to learn? =item How does Perl compare with other languages like Java, Python, REXX, Scheme, or Tcl? =item Can I do [task] in Perl? =item When shouldn't I program in Perl? =item What's the difference between "perl" and "Perl"? =item What is a JAPH? =item How can I convince others to use Perl? L<http://www.perl.org/about.html>, L<http://perltraining.com.au/whyperl.html> =back =item AUTHOR AND COPYRIGHT =back =head2 perlfaq2 - Obtaining and Learning about Perl =over 4 =item VERSION =item DESCRIPTION =over 4 =item What machines support Perl? Where do I get it? =item How can I get a binary version of Perl? =item I don't have a C compiler. How can I build my own Perl interpreter? 
=item I copied the Perl binary from one machine to another, but scripts don't work. =item I grabbed the sources and tried to compile but gdbm/dynamic loading/malloc/linking/... failed. How do I make it work? =item What modules and extensions are available for Perl? What is CPAN? =item Where can I get information on Perl? L<http://www.perl.org/>, L<http://perldoc.perl.org/>, L<http://learn.perl.org/> =item What is perl.com? Perl Mongers? pm.org? perl.org? cpan.org? L<http://www.perl.org/>, L<http://learn.perl.org/>, L<http://jobs.perl.org/>, L<http://lists.perl.org/> =item Where can I post questions? =item Perl Books =item Which magazines have Perl content? =item Which Perl blogs should I read? =item What mailing lists are there for Perl? =item Where can I buy a commercial version of Perl? =item Where do I send bug reports? =back =item AUTHOR AND COPYRIGHT =back =head2 perlfaq3 - Programming Tools =over 4 =item VERSION =item DESCRIPTION =over 4 =item How do I do (anything)? Basics, L<perldata> - Perl data types, L<perlvar> - Perl pre-defined variables, L<perlsyn> - Perl syntax, L<perlop> - Perl operators and precedence, L<perlsub> - Perl subroutines, Execution, L<perlrun> - how to execute the Perl interpreter, L<perldebug> - Perl debugging, Functions, L<perlfunc> - Perl builtin functions, Objects, L<perlref> - Perl references and nested data structures, L<perlmod> - Perl modules (packages and symbol tables), L<perlobj> - Perl objects, L<perltie> - how to hide an object class in a simple variable, Data Structures, L<perlref> - Perl references and nested data structures, L<perllol> - Manipulating arrays of arrays in Perl, L<perldsc> - Perl Data Structures Cookbook, Modules, L<perlmod> - Perl modules (packages and symbol tables), L<perlmodlib> - constructing new Perl modules and finding existing ones, Regexes, L<perlre> - Perl regular expressions, L<perlfunc> - Perl builtin functions, L<perlop> - Perl operators and precedence, L<perllocale> - Perl locale handling 
(internationalization and localization), Moving to perl5, L<perltrap> - Perl traps for the unwary, L<perl>, Linking with C, L<perlxstut> - Tutorial for writing XSUBs, L<perlxs> - XS language reference manual, L<perlcall> - Perl calling conventions from C, L<perlguts> - Introduction to the Perl API, L<perlembed> - how to embed perl in your C program, Various =item How can I use Perl interactively? =item How do I find which modules are installed on my system? =item How do I debug my Perl programs? =item How do I profile my Perl programs? =item How do I cross-reference my Perl programs? =item Is there a pretty-printer (formatter) for Perl? =item Is there an IDE or Windows Perl Editor? Eclipse, Enginsite, IntelliJ IDEA, Kephra, Komodo, Notepad++, Open Perl IDE, OptiPerl, Padre, PerlBuilder, visiPerl+, Visual Perl, Zeus, GNU Emacs, MicroEMACS, XEmacs, Jed, Vim, Vile, MultiEdit, SlickEdit, ConTEXT, bash, zsh, BBEdit and TextWrangler =item Where can I get Perl macros for vi? =item Where can I get perl-mode or cperl-mode for emacs? X<emacs> =item How can I use curses with Perl? =item How can I write a GUI (X, Tk, Gtk, etc.) in Perl? X<GUI> X<Tk> X<Wx> X<WxWidgets> X<Gtk> X<Gtk2> X<CamelBones> X<Qt> Tk, Wx, Gtk and Gtk2, Win32::GUI, CamelBones, Qt, Athena =item How can I make my Perl program run faster? =item How can I make my Perl program take less memory? Don't slurp!, Use map and grep selectively, Avoid unnecessary quotes and stringification, Pass by reference, Tie large variables to disk =item Is it safe to return a reference to local or lexical data? =item How can I free an array or hash so my program shrinks? =item How can I make my CGI script more efficient? =item How can I hide the source for my Perl program? =item How can I compile my Perl program into byte code or C? =item How can I get C<#!perl> to work on [MS-DOS,NT,...]? =item Can I write useful Perl programs on the command line? =item Why don't Perl one-liners work on my DOS/Mac/VMS system? 
=item Where can I learn about CGI or Web programming in Perl? =item Where can I learn about object-oriented Perl programming? =item Where can I learn about linking C with Perl? =item I've read perlembed, perlguts, etc., but I can't embed perl in my C program; what am I doing wrong? =item When I tried to run my script, I got this message. What does it mean? =item What's MakeMaker? =back =item AUTHOR AND COPYRIGHT =back =head2 perlfaq4 - Data Manipulation =over 4 =item VERSION =item DESCRIPTION =item Data: Numbers =over 4 =item Why am I getting long decimals (eg, 19.9499999999999) instead of the numbers I should be getting (eg, 19.95)? =item Why is int() broken? =item Why isn't my octal data interpreted correctly? =item Does Perl have a round() function? What about ceil() and floor()? Trig functions? =item How do I convert between numeric representations/bases/radixes? How do I convert hexadecimal into decimal, How do I convert from decimal to hexadecimal, How do I convert from octal to decimal, How do I convert from decimal to octal, How do I convert from binary to decimal, How do I convert from decimal to binary =item Why doesn't & work the way I want it to? =item How do I multiply matrices? =item How do I perform an operation on a series of integers? =item How can I output Roman numerals? =item Why aren't my random numbers random? =item How do I get a random number between X and Y? =back =item Data: Dates =over 4 =item How do I find the day or week of the year? =item How do I find the current century or millennium? =item How can I compare two dates and find the difference? =item How can I take a string and turn it into epoch seconds? =item How can I find the Julian Day? =item How do I find yesterday's date? X<date> X<yesterday> X<DateTime> X<Date::Calc> X<Time::Local> X<daylight saving time> X<day> X<Today_and_Now> X<localtime> X<timelocal> =item Does Perl have a Year 2000 or 2038 problem? Is Perl Y2K compliant? 
=back =item Data: Strings =over 4 =item How do I validate input? =item How do I unescape a string? =item How do I remove consecutive pairs of characters? =item How do I expand function calls in a string? =item How do I find matching/nesting anything? =item How do I reverse a string? =item How do I expand tabs in a string? =item How do I reformat a paragraph? =item How can I access or change N characters of a string? =item How do I change the Nth occurrence of something? =item How can I count the number of occurrences of a substring within a string? =item How do I capitalize all the words on one line? X<Text::Autoformat> X<capitalize> X<case, title> X<case, sentence> =item How can I split a [character]-delimited string except when inside [character]? =item How do I strip blank space from the beginning/end of a string? =item How do I pad a string with blanks or pad a number with zeroes? =item How do I extract selected columns from a string? =item How do I find the soundex value of a string? =item How can I expand variables in text strings? =item What's wrong with always quoting "$vars"? =item Why don't my E<lt>E<lt>HERE documents work? There must be no space after the E<lt>E<lt> part, There (probably) should be a semicolon at the end of the opening token, You can't (easily) have any space in front of the tag, There needs to be at least a line separator after the end token =back =item Data: Arrays =over 4 =item What is the difference between a list and an array? =item What is the difference between $array[1] and @array[1]? =item How can I remove duplicate elements from a list or array? =item How can I tell whether a certain element is contained in a list or array? =item How do I compute the difference of two arrays? How do I compute the intersection of two arrays? =item How do I test whether two arrays or hashes are equal? =item How do I find the first array element for which a condition is true? =item How do I handle linked lists? 
=item How do I handle circular lists? X<circular> X<array> X<Tie::Cycle> X<Array::Iterator::Circular> X<cycle> X<modulus> =item How do I shuffle an array randomly? =item How do I process/modify each element of an array? =item How do I select a random element from an array? =item How do I permute N elements of a list? X<List::Permutor> X<permute> X<Algorithm::Loops> X<Knuth> X<The Art of Computer Programming> X<Fischer-Krause> =item How do I sort an array by (anything)? =item How do I manipulate arrays of bits? =item Why does defined() return true on empty arrays and hashes? =back =item Data: Hashes (Associative Arrays) =over 4 =item How do I process an entire hash? =item How do I merge two hashes? X<hash> X<merge> X<slice, hash> =item What happens if I add or remove keys from a hash while iterating over it? =item How do I look up a hash element by value? =item How can I know how many entries are in a hash? =item How do I sort a hash (optionally by value instead of key)? =item How can I always keep my hash sorted? X<hash tie sort DB_File Tie::IxHash> =item What's the difference between "delete" and "undef" with hashes? =item Why don't my tied hashes make the defined/exists distinction? =item How do I reset an each() operation part-way through? =item How can I get the unique keys from two hashes? =item How can I store a multidimensional array in a DBM file? =item How can I make my hash remember the order I put elements into it? =item Why does passing a subroutine an undefined element in a hash create it? =item How can I make the Perl equivalent of a C structure/C++ class/hash or array of hashes or arrays? =item How can I use a reference as a hash key? =item How can I check if a key exists in a multilevel hash? =item How can I prevent addition of unwanted keys into a hash? =back =item Data: Misc =over 4 =item How do I handle binary data correctly? =item How do I determine whether a scalar is a number/whole/integer/float? 
=item How do I keep persistent data across program calls? =item How do I print out or copy a recursive data structure? =item How do I define methods for every class/object? =item How do I verify a credit card checksum? =item How do I pack arrays of doubles or floats for XS code? =back =item AUTHOR AND COPYRIGHT =back =head2 perlfaq5 - Files and Formats =over 4 =item VERSION =item DESCRIPTION =over 4 =item How do I flush/unbuffer an output filehandle? Why must I do this? X<flush> X<buffer> X<unbuffer> X<autoflush> =item How do I change, delete, or insert a line in a file, or append to the beginning of a file? X<file, editing> =item How do I count the number of lines in a file? X<file, counting lines> X<lines> X<line> =item How do I delete the last N lines from a file? X<lines> X<file> =item How can I use Perl's C<-i> option from within a program? X<-i> X<in-place> =item How can I copy a file? X<copy> X<file, copy> X<File::Copy> =item How do I make a temporary file name? X<file, temporary> =item How can I manipulate fixed-record-length files? X<fixed-length> X<file, fixed-length records> =item How can I make a filehandle local to a subroutine? How do I pass filehandles between subroutines? How do I make an array of filehandles? X<filehandle, local> X<filehandle, passing> X<filehandle, reference> =item How can I use a filehandle indirectly? X<filehandle, indirect> =item How can I open a filehandle to a string? X<string> X<open> X<IO::String> X<filehandle> =item How can I set up a footer format to be used with write()? X<footer> =item How can I write() into a string? X<write, into a string> =item How can I output my numbers with commas added? X<number, commify> =item How can I translate tildes (~) in a filename? X<tilde> X<tilde expansion> =item How come when I open a file read-write it wipes it out? X<clobber> X<read-write> X<clobbering> X<truncate> X<truncating> =item Why do I sometimes get an "Argument list too long" when I use E<lt>*E<gt>? 
X<argument list too long> =item How can I open a file named with a leading ">" or trailing blanks? X<filename, special characters> =item How can I reliably rename a file? X<rename> X<mv> X<move> X<file, rename> =item How can I lock a file? X<lock> X<file, lock> X<flock> =item Why can't I just open(FH, "E<gt>file.lock")? X<lock, lockfile race condition> =item I still don't get locking. I just want to increment the number in the file. How can I do this? X<counter> X<file, counter> =item All I want to do is append a small amount of text to the end of a file. Do I still have to use locking? X<append> X<file, append> =item How do I randomly update a binary file? X<file, binary patch> =item How do I get a file's timestamp in perl? X<timestamp> X<file, timestamp> =item How do I set a file's timestamp in perl? X<timestamp> X<file, timestamp> =item How do I print to more than one file at once? X<print, to multiple files> =item How can I read in an entire file all at once? X<slurp> X<file, slurping> =item How can I read in a file by paragraphs? X<file, reading by paragraphs> =item How can I read a single character from a file? From the keyboard? X<getc> X<file, reading one character at a time> =item How can I tell whether there's a character waiting on a filehandle? =item How do I do a C<tail -f> in perl? X<tail> X<IO::Handle> X<File::Tail> X<clearerr> =item How do I dup() a filehandle in Perl? X<dup> =item How do I close a file descriptor by number? X<file, closing file descriptors> X<POSIX> X<close> =item Why can't I use "C:\temp\foo" in DOS paths? Why doesn't `C:\temp\foo.exe` work? X<filename, DOS issues> =item Why doesn't glob("*.*") get all the files? X<glob> =item Why does Perl let me delete read-only files? Why does C<-i> clobber protected files? Isn't this a bug in Perl? =item How do I select a random line from a file? X<file, selecting a random line> =item Why do I get weird spaces when I print an array of lines? =item How do I traverse a directory tree? 
=item How do I delete a directory tree? =item How do I copy an entire directory? =back =item AUTHOR AND COPYRIGHT =back =head2 perlfaq6 - Regular Expressions =over 4 =item VERSION =item DESCRIPTION =over 4 =item How can I hope to use regular expressions without creating illegible and unmaintainable code? X<regex, legibility> X<regexp, legibility> X<regular expression, legibility> X</x> Comments Outside the Regex, Comments Inside the Regex, Different Delimiters =item I'm having trouble matching over more than one line. What's wrong? X<regex, multiline> X<regexp, multiline> X<regular expression, multiline> =item How can I pull out lines between two patterns that are themselves on different lines? X<..> =item How do I match XML, HTML, or other nasty, ugly things with a regex? X<regex, XML> X<regex, HTML> X<XML> X<HTML> X<pain> X<frustration> X<sucking out, will to live> =item I put a regular expression into $/ but it didn't work. What's wrong? X<$/, regexes in> X<$INPUT_RECORD_SEPARATOR, regexes in> X<$RS, regexes in> =item How do I substitute case-insensitively on the LHS while preserving case on the RHS? X<replace, case preserving> X<substitute, case preserving> X<substitution, case preserving> X<s, case preserving> =item How can I make C<\w> match national character sets? X<\w> =item How can I match a locale-smart version of C</[a-zA-Z]/>? X<alpha> =item How can I quote a variable to use in a regex? X<regex, escaping> X<regexp, escaping> X<regular expression, escaping> =item What is C</o> really for? X</o, regular expressions> X<compile, regular expressions> =item How do I use a regular expression to strip C-style comments from a file? =item Can I use Perl regular expressions to match balanced text? X<regex, matching balanced text> X<regexp, matching balanced text> X<regular expression, matching balanced text> X<possessive> X<PARNO> X<Text::Balanced> X<Regexp::Common> X<backtracking> X<recursion> =item What does it mean that regexes are greedy? 
How can I get around it? X<greedy> X<greediness> =item How do I process each word on each line? X<word> =item How can I print out a word-frequency or line-frequency summary? =item How can I do approximate matching? X<match, approximate> X<matching, approximate> =item How do I efficiently match many regular expressions at once? X<regex, efficiency> X<regexp, efficiency> X<regular expression, efficiency> =item Why don't word-boundary searches with C<\b> work for me? X<\b> =item Why does using $&, $`, or $' slow my program down? X<$MATCH> X<$&> X<$POSTMATCH> X<$'> X<$PREMATCH> X<$`> =item What good is C<\G> in a regular expression? X<\G> =item Are Perl regexes DFAs or NFAs? Are they POSIX compliant? X<DFA> X<NFA> X<POSIX> =item What's wrong with using grep in a void context? X<grep> =item How can I match strings with multibyte characters? X<regex, and multibyte characters> X<regexp, and multibyte characters> X<regular expression, and multibyte characters> X<martian> X<encoding, Martian> =item How do I match a regular expression that's in a variable? X<regex, in variable> X<eval> X<regex> X<quotemeta> X<\Q, regex> X<\E, regex> X<qr//> =back =item AUTHOR AND COPYRIGHT =back =head2 perlfaq7 - General Perl Language Issues =over 4 =item VERSION =item DESCRIPTION =over 4 =item Can I get a BNF/yacc/RE for the Perl language? =item What are all these $@%&* punctuation signs, and how do I know when to use them? =item Do I always/never have to quote my strings or use semicolons and commas? =item How do I skip some return values? =item How do I temporarily block warnings? =item What's an extension? =item Why do Perl operators have different precedence than C operators? =item How do I declare/create a structure? =item How do I create a module? =item How do I adopt or take over a module already on CPAN? =item How do I create a class? X<class, creation> X<package> =item How can I tell if a variable is tainted? =item What's a closure? 
=item What is variable suicide and how can I prevent it? =item How can I pass/return a {Function, FileHandle, Array, Hash, Method, Regex}? Passing Variables and Functions, Passing Filehandles, Passing Regexes, Passing Methods =item How do I create a static variable? =item What's the difference between dynamic and lexical (static) scoping? Between local() and my()? =item How can I access a dynamic variable while a similarly named lexical is in scope? =item What's the difference between deep and shallow binding? =item Why doesn't "my($foo) = E<lt>$fhE<gt>;" work right? =item How do I redefine a builtin function, operator, or method? =item What's the difference between calling a function as &foo and foo()? =item How do I create a switch or case statement? =item How can I catch accesses to undefined variables, functions, or methods? =item Why can't a method included in this same file be found? =item How can I find out my current or calling package? =item How can I comment out a large block of Perl code? =item How do I clear a package? =item How can I use a variable as a variable name? =item What does "bad interpreter" mean? =item Do I need to recompile XS modules when there is a change in the C library? =back =item AUTHOR AND COPYRIGHT =back =head2 perlfaq8 - System Interaction =over 4 =item VERSION =item DESCRIPTION =over 4 =item How do I find out which operating system I'm running under? =item How come exec() doesn't return? X<exec> X<system> X<fork> X<open> X<pipe> =item How do I do fancy stuff with the keyboard/screen/mouse? Keyboard, Screen, Mouse =item How do I print something out in color? =item How do I read just one key without waiting for a return key? =item How do I check whether input is ready on the keyboard? =item How do I clear the screen? =item How do I get the screen size? =item How do I ask the user for a password? =item How do I read and write the serial port? 
lockfiles, open mode, end of line, flushing output, non-blocking input =item How do I decode encrypted password files? =item How do I start a process in the background? STDIN, STDOUT, and STDERR are shared, Signals, Zombies =item How do I trap control characters/signals? =item How do I modify the shadow password file on a Unix system? =item How do I set the time and date? =item How can I sleep() or alarm() for under a second? X<Time::HiRes> X<BSD::Itimer> X<sleep> X<select> =item How can I measure time under a second? X<Time::HiRes> X<BSD::Itimer> X<sleep> X<select> =item How can I do an atexit() or setjmp()/longjmp()? (Exception handling) =item Why doesn't my sockets program work under System V (Solaris)? What does the error message "Protocol not supported" mean? =item How can I call my system's unique C functions from Perl? =item Where do I get the include files to do ioctl() or syscall()? =item Why do setuid perl scripts complain about kernel problems? =item How can I open a pipe both to and from a command? =item Why can't I get the output of a command with system()? =item How can I capture STDERR from an external command? =item Why doesn't open() return an error when a pipe open fails? =item What's wrong with using backticks in a void context? =item How can I call backticks without shell processing? =item Why can't my script read from STDIN after I gave it EOF (^D on Unix, ^Z on MS-DOS)? =item How can I convert my shell script to perl? =item Can I use perl to run a telnet or ftp session? =item How can I write expect in Perl? =item Is there a way to hide perl's command line from programs such as "ps"? =item I {changed directory, modified my environment} in a perl script. How come the change disappeared when I exited the script? How do I get my changes to be visible? Unix =item How do I close a process's filehandle without waiting for it to complete? =item How do I fork a daemon process? =item How do I find out if I'm running interactively or not? 
=item How do I timeout a slow event? =item How do I set CPU limits? X<BSD::Resource> X<limit> X<CPU> =item How do I avoid zombies on a Unix system? =item How do I use an SQL database? =item How do I make a system() exit on control-C? =item How do I open a file without blocking? =item How do I tell the difference between errors from the shell and perl? =item How do I install a module from CPAN? =item What's the difference between require and use? =item How do I keep my own module/library directory? =item How do I add the directory my program lives in to the module/library search path? =item How do I add a directory to my include path (@INC) at runtime? the C<PERLLIB> environment variable, the C<PERL5LIB> environment variable, the C<perl -Idir> command line flag, the C<lib> pragma:, the L<local::lib> module: =item Where are modules installed? =item What is socket.ph and where do I get it? =back =item AUTHOR AND COPYRIGHT =back =head2 perlfaq9 - Web, Email and Networking =over 4 =item VERSION =item DESCRIPTION =over 4 =item Should I use a web framework? =item Which web framework should I use? X<framework> X<CGI.pm> X<CGI> X<Catalyst> X<Dancer> L<Catalyst>, L<Dancer2>, L<Mojolicious>, L<Web::Simple> =item What is Plack and PSGI? =item How do I remove HTML from a string? =item How do I extract URLs? =item How do I fetch an HTML file? =item How do I automate an HTML form submission? =item How do I decode or create those %-encodings on the web? X<URI> X<URI::Escape> X<RFC 2396> =item How do I redirect to another page? =item How do I put a password on my web pages? =item How do I make sure users can't enter values into a form that causes my CGI script to do bad things? =item How do I parse a mail header? =item How do I check a valid mail address? =item How do I decode a MIME/BASE64 string? =item How do I find the user's mail address? =item How do I send email? 
L<Email::Sender::Transport::Sendmail>, L<Email::Sender::Transport::SMTP> =item How do I use MIME to make an attachment to a mail message? =item How do I read email? =item How do I find out my hostname, domainname, or IP address? X<hostname, domainname, IP address, host, domain, hostfqdn, inet_ntoa, gethostbyname, Socket, Net::Domain, Sys::Hostname> =item How do I fetch/put an (S)FTP file? =item How can I do RPC in Perl? =back =item AUTHOR AND COPYRIGHT =back =head2 perlsyn - Perl syntax =over 4 =item DESCRIPTION =over 4 =item Declarations X<declaration> X<undef> X<undefined> X<uninitialized> =item Comments X<comment> X<#> =item Simple Statements X<statement> X<semicolon> X<expression> X<;> =item Statement Modifiers X<statement modifier> X<modifier> X<if> X<unless> X<while> X<until> X<when> X<foreach> X<for> =item Compound Statements X<statement, compound> X<block> X<bracket, curly> X<curly bracket> X<brace> X<{> X<}> X<if> X<unless> X<given> X<while> X<until> X<foreach> X<for> X<continue> =item Loop Control X<loop control> X<loop, control> X<next> X<last> X<redo> X<continue> =item For Loops X<for> X<foreach> =item Foreach Loops X<for> X<foreach> =item Basic BLOCKs X<block> =item Switch Statements =item Goto X<goto> =item The Ellipsis Statement X<...> X<... statement> X<ellipsis operator> X<elliptical statement> X<unimplemented statement> X<unimplemented operator> X<yada-yada> X<yada-yada operator> X<... operator> X<whatever operator> X<triple-dot operator> =item PODs: Embedded Documentation X<POD> X<documentation> =item Plain Old Comments (Not!) 
X<comment> X<line> X<#> X<preprocessor> X<eval> =item Experimental Details on given and when Z<>1, Z<>2, Z<>3, Z<>4, Z<>5, Z<>6, Z<>7, Z<>8, Z<>9, Z<>10 =back =back =head2 perldata - Perl data types =over 4 =item DESCRIPTION =over 4 =item Variable names X<variable, name> X<variable name> X<data type> X<type> =item Identifier parsing X<identifiers> =item Context X<context> X<scalar context> X<list context> =item Scalar values X<scalar> X<number> X<string> X<reference> =item Scalar value constructors X<scalar, literal> X<scalar, constant> =item List value constructors X<list> =item Subscripts =item Multi-dimensional array emulation =item Slices X<slice> X<array, slice> X<hash, slice> =item Typeglobs and Filehandles X<typeglob> X<filehandle> X<*> =back =item SEE ALSO =back =head2 perlop - Perl operators and precedence =over 4 =item DESCRIPTION =over 4 =item Operator Precedence and Associativity X<operator, precedence> X<precedence> X<associativity> =item Terms and List Operators (Leftward) X<list operator> X<operator, list> X<term> =item The Arrow Operator X<arrow> X<dereference> X<< -> >> =item Auto-increment and Auto-decrement X<increment> X<auto-increment> X<++> X<decrement> X<auto-decrement> X<--> =item Exponentiation X<**> X<exponentiation> X<power> =item Symbolic Unary Operators X<unary operator> X<operator, unary> =item Binding Operators X<binding> X<operator, binding> X<=~> X<!~> =item Multiplicative Operators X<operator, multiplicative> =item Additive Operators X<operator, additive> =item Shift Operators X<shift operator> X<operator, shift> X<<< << >>> X<<< >> >>> X<right shift> X<left shift> X<bitwise shift> X<shl> X<shr> X<shift, right> X<shift, left> =item Named Unary Operators X<operator, named unary> =item Relational Operators X<relational operator> X<operator, relational> =item Equality Operators X<equality> X<equal> X<equals> X<operator, equality> =item Class Instance Operator X<isa operator> =item Smartmatch Operator 1. 
Empty hashes or arrays match, 2. That is, each element smartmatches the element of the same index in the other array.[3], 3. If a circular reference is found, fall back to referential equality, 4. Either an actual number, or a string that looks like one =item Bitwise And X<operator, bitwise, and> X<bitwise and> X<&> =item Bitwise Or and Exclusive Or X<operator, bitwise, or> X<bitwise or> X<|> X<operator, bitwise, xor> X<bitwise xor> X<^> =item C-style Logical And X<&&> X<logical and> X<operator, logical, and> =item C-style Logical Or X<||> X<operator, logical, or> =item Logical Defined-Or X<//> X<operator, logical, defined-or> =item Range Operators X<operator, range> X<range> X<..> X<...> =item Conditional Operator X<operator, conditional> X<operator, ternary> X<ternary> X<?:> =item Assignment Operators X<assignment> X<operator, assignment> X<=> X<**=> X<+=> X<*=> X<&=> X<<< <<= >>> X<&&=> X<-=> X</=> X<|=> X<<< >>= >>> X<||=> X<//=> X<.=> X<%=> X<^=> X<x=> X<&.=> X<|.=> X<^.=> =item Comma Operator X<comma> X<operator, comma> X<,> =item List Operators (Rightward) X<operator, list, rightward> X<list operator> =item Logical Not X<operator, logical, not> X<not> =item Logical And X<operator, logical, and> X<and> =item Logical or and Exclusive Or X<operator, logical, or> X<operator, logical, xor> X<operator, logical, exclusive or> X<or> X<xor> =item C Operators Missing From Perl X<operator, missing from perl> X<&> X<*> X<typecasting> X<(TYPE)> unary &, unary *, (TYPE) =item Quote and Quote-like Operators X<operator, quote> X<operator, quote-like> X<q> X<qq> X<qx> X<qw> X<m> X<qr> X<s> X<tr> X<'> X<''> X<"> X<""> X<//> X<`> X<``> X<<< << >>> X<escape sequence> X<escape> [1], [2], [3], [4], [5], [6], [7], [8] =item Regexp Quote-Like Operators X<operator, regexp> C<qr/I<STRING>/msixpodualn> X<qr> X</i> X</m> X</o> X</s> X</x> X</p>, C<m/I<PATTERN>/msixpodualngc> X<m> X<operator, match> X<regexp, options> X<regexp> X<regex, options> X<regex> X</m> X</s> X</i> X</x> X</p> 
X</o> X</g> X</c>, C</I<PATTERN>/msixpodualngc>, The empty pattern C<//>, Matching in list context, C<\G I<assertion>>, C<m?I<PATTERN>?msixpodualngc> X<?> X<operator, match-once>, C<s/I<PATTERN>/I<REPLACEMENT>/msixpodualngcer> X<s> X<substitute> X<substitution> X<replace> X<regexp, replace> X<regexp, substitute> X</m> X</s> X</i> X</x> X</p> X</o> X</g> X</c> X</e> X</r> =item Quote-Like Operators X<operator, quote-like> C<q/I<STRING>/> X<q> X<quote, single> X<'> X<''>, C<'I<STRING>'>, C<qq/I<STRING>/> X<qq> X<quote, double> X<"> X<"">, C<"I<STRING>">, C<qx/I<STRING>/> X<qx> X<`> X<``> X<backtick>, C<`I<STRING>`>, C<qw/I<STRING>/> X<qw> X<quote, list> X<quote, words>, C<tr/I<SEARCHLIST>/I<REPLACEMENTLIST>/cdsr> X<tr> X<y> X<transliterate> X</c> X</d> X</s>, C<y/I<SEARCHLIST>/I<REPLACEMENTLIST>/cdsr>, C<< <<I<EOF> >> X<here-doc> X<heredoc> X<here-document> X<<< << >>>, Double Quotes, Single Quotes, Backticks, Indented Here-docs =item Gory details of parsing quoted constructs X<quote, gory details> Finding the end, Interpolation X<interpolation>, C<<<'EOF'>, C<m''>, the pattern of C<s'''>, C<''>, C<q//>, C<tr'''>, C<y'''>, the replacement of C<s'''>, C<tr///>, C<y///>, C<"">, C<``>, C<qq//>, C<qx//>, C<< <file*glob> >>, C<<<"EOF">, the replacement of C<s///>, C<RE> in C<m?RE?>, C</RE/>, C<m/RE/>, C<s/RE/foo/>,, parsing regular expressions X<regexp, parse>, Optimization of regular expressions X<regexp, optimization> =item I/O Operators X<operator, i/o> X<operator, io> X<io> X<while> X<filehandle> X<< <> >> X<< <<>> >> X<@ARGV> =item Constant Folding X<constant folding> X<folding> =item No-ops X<no-op> X<nop> =item Bitwise String Operators X<operator, bitwise, string> X<&.> X<|.> X<^.> X<~.> =item Integer Arithmetic X<integer> =item Floating-point Arithmetic =item Bigger Numbers X<number, arbitrary precision> =back =back =head2 perlsub - Perl subroutines =over 4 =item SYNOPSIS =item DESCRIPTION documented later in this document, documented in L<perlmod>, documented in 
L<perlobj>, documented in L<perltie>, documented in L<PerlIO::via>, documented in L<perlfunc>, documented in L<UNIVERSAL>, documented in L<perldebguts>, undocumented, used internally by the L<overload> feature =over 4 =item Signatures =item Private Variables via my() X<my> X<variable, lexical> X<lexical> X<lexical variable> X<scope, lexical> X<lexical scope> X<attributes, my> =item Persistent Private Variables X<state> X<state variable> X<static> X<variable, persistent> X<variable, static> X<closure> =item Temporary Values via local() X<local> X<scope, dynamic> X<dynamic scope> X<variable, local> X<variable, temporary> =item Lvalue subroutines X<lvalue> X<subroutine, lvalue> =item Lexical Subroutines X<my sub> X<state sub> X<our sub> X<subroutine, lexical> =item Passing Symbol Table Entries (typeglobs) X<typeglob> X<*> =item When to Still Use local() X<local> X<variable, local> =item Pass by Reference X<pass by reference> X<pass-by-reference> X<reference> =item Prototypes X<prototype> X<subroutine, prototype> =item Constant Functions X<constant> =item Overriding Built-in Functions X<built-in> X<override> X<CORE> X<CORE::GLOBAL> =item Autoloading X<autoloading> X<AUTOLOAD> =item Subroutine Attributes X<attribute> X<subroutine, attribute> X<attrs> =back =item SEE ALSO =back =head2 perlfunc - Perl builtin functions =over 4 =item DESCRIPTION =over 4 =item Perl Functions by Category X<function> Functions for SCALARs or strings X<scalar> X<string> X<character>, Regular expressions and pattern matching X<regular expression> X<regex> X<regexp>, Numeric functions X<numeric> X<number> X<trigonometric> X<trigonometry>, Functions for real @ARRAYs X<array>, Functions for list data X<list>, Functions for real %HASHes X<hash>, Input and output functions X<I/O> X<input> X<output> X<dbm>, Functions for fixed-length data or records, Functions for filehandles, files, or directories X<file> X<filehandle> X<directory> X<pipe> X<link> X<symlink>, Keywords related to the control flow of 
your Perl program X<control flow>, Keywords related to scoping, Miscellaneous functions, Functions for processes and process groups X<process> X<pid> X<process id>, Keywords related to Perl modules X<module>, Keywords related to classes and object-orientation X<object> X<class> X<package>, Low-level socket functions X<socket> X<sock>, System V interprocess communication functions X<IPC> X<System V> X<semaphore> X<shared memory> X<memory> X<message>, Fetching user and group info X<user> X<group> X<password> X<uid> X<gid> X<passwd> X</etc/passwd>, Fetching network info X<network> X<protocol> X<host> X<hostname> X<IP> X<address> X<service>, Time-related functions X<time> X<date>, Non-function keywords =item Portability X<portability> X<Unix> X<portable> =item Alphabetical Listing of Perl Functions -I<X> FILEHANDLE X<-r>X<-w>X<-x>X<-o>X<-R>X<-W>X<-X>X<-O>X<-e>X<-z>X<-s>X<-f>X<-d>X<-l>X<-p> X<-S>X<-b>X<-c>X<-t>X<-u>X<-g>X<-k>X<-T>X<-B>X<-M>X<-A>X<-C>, -I<X> EXPR, -I<X> DIRHANDLE, -I<X>, abs VALUE X<abs> X<absolute>, abs, accept NEWSOCKET,GENERICSOCKET X<accept>, alarm SECONDS X<alarm> X<SIGALRM> X<timer>, alarm, atan2 Y,X X<atan2> X<arctangent> X<tan> X<tangent>, bind SOCKET,NAME X<bind>, binmode FILEHANDLE, LAYER X<binmode> X<binary> X<text> X<DOS> X<Windows>, binmode FILEHANDLE, bless REF,CLASSNAME X<bless>, bless REF, break, caller EXPR X<caller> X<call stack> X<stack> X<stack trace>, caller, chdir EXPR X<chdir> X<cd> X<directory, change>, chdir FILEHANDLE, chdir DIRHANDLE, chdir, chmod LIST X<chmod> X<permission> X<mode>, chomp VARIABLE X<chomp> X<INPUT_RECORD_SEPARATOR> X<$/> X<newline> X<eol>, chomp( LIST ), chomp, chop VARIABLE X<chop>, chop( LIST ), chop, chown LIST X<chown> X<owner> X<user> X<group>, chr NUMBER X<chr> X<character> X<ASCII> X<Unicode>, chr, chroot FILENAME X<chroot> X<root>, chroot, close FILEHANDLE X<close>, close, closedir DIRHANDLE X<closedir>, connect SOCKET,NAME X<connect>, continue BLOCK X<continue>, continue, cos EXPR X<cos> X<cosine> 
X<acos> X<arccosine>, cos, crypt PLAINTEXT,SALT X<crypt> X<digest> X<hash> X<salt> X<plaintext> X<password> X<decrypt> X<cryptography> X<passwd> X<encrypt>, dbmclose HASH X<dbmclose>, dbmopen HASH,DBNAME,MASK X<dbmopen> X<dbm> X<ndbm> X<sdbm> X<gdbm>, defined EXPR X<defined> X<undef> X<undefined>, defined, delete EXPR X<delete>, die LIST X<die> X<throw> X<exception> X<raise> X<$@> X<abort>, do BLOCK X<do> X<block>, do EXPR X<do>, dump LABEL X<dump> X<core> X<undump>, dump EXPR, dump, each HASH X<each> X<hash, iterator>, each ARRAY X<array, iterator>, eof FILEHANDLE X<eof> X<end of file> X<end-of-file>, eof (), eof, eval EXPR X<eval> X<try> X<catch> X<evaluate> X<parse> X<execute> X<error, handling> X<exception, handling>, eval BLOCK, eval, String eval, Under the L<C<"unicode_eval"> feature|feature/The 'unicode_eval' and 'evalbytes' features>, Outside the C<"unicode_eval"> feature, Block eval, evalbytes EXPR X<evalbytes>, evalbytes, exec LIST X<exec> X<execute>, exec PROGRAM LIST, exists EXPR X<exists> X<autovivification>, exit EXPR X<exit> X<terminate> X<abort>, exit, exp EXPR X<exp> X<exponential> X<antilog> X<antilogarithm> X<e>, exp, fc EXPR X<fc> X<foldcase> X<casefold> X<fold-case> X<case-fold>, fc, fcntl FILEHANDLE,FUNCTION,SCALAR X<fcntl>, __FILE__ X<__FILE__>, fileno FILEHANDLE X<fileno>, fileno DIRHANDLE, flock FILEHANDLE,OPERATION X<flock> X<lock> X<locking>, fork X<fork> X<child> X<parent>, format X<format>, formline PICTURE,LIST X<formline>, getc FILEHANDLE X<getc> X<getchar> X<character> X<file, read>, getc, getlogin X<getlogin> X<login>, getpeername SOCKET X<getpeername> X<peer>, getpgrp PID X<getpgrp> X<group>, getppid X<getppid> X<parent> X<pid>, getpriority WHICH,WHO X<getpriority> X<priority> X<nice>, getpwnam NAME X<getpwnam> X<getgrnam> X<gethostbyname> X<getnetbyname> X<getprotobyname> X<getpwuid> X<getgrgid> X<getservbyname> X<gethostbyaddr> X<getnetbyaddr> X<getprotobynumber> X<getservbyport> X<getpwent> X<getgrent> X<gethostent> X<getnetent> 
X<getprotoent> X<getservent> X<setpwent> X<setgrent> X<sethostent> X<setnetent> X<setprotoent> X<setservent> X<endpwent> X<endgrent> X<endhostent> X<endnetent> X<endprotoent> X<endservent>, getgrnam NAME, gethostbyname NAME, getnetbyname NAME, getprotobyname NAME, getpwuid UID, getgrgid GID, getservbyname NAME,PROTO, gethostbyaddr ADDR,ADDRTYPE, getnetbyaddr ADDR,ADDRTYPE, getprotobynumber NUMBER, getservbyport PORT,PROTO, getpwent, getgrent, gethostent, getnetent, getprotoent, getservent, setpwent, setgrent, sethostent STAYOPEN, setnetent STAYOPEN, setprotoent STAYOPEN, setservent STAYOPEN, endpwent, endgrent, endhostent, endnetent, endprotoent, endservent, getsockname SOCKET X<getsockname>, getsockopt SOCKET,LEVEL,OPTNAME X<getsockopt>, glob EXPR X<glob> X<wildcard> X<filename, expansion> X<expand>, glob, gmtime EXPR X<gmtime> X<UTC> X<Greenwich>, gmtime, goto LABEL X<goto> X<jump> X<jmp>, goto EXPR, goto &NAME, grep BLOCK LIST X<grep>, grep EXPR,LIST, hex EXPR X<hex> X<hexadecimal>, hex, import LIST X<import>, index STR,SUBSTR,POSITION X<index> X<indexOf> X<InStr>, index STR,SUBSTR, int EXPR X<int> X<integer> X<truncate> X<trunc> X<floor>, int, ioctl FILEHANDLE,FUNCTION,SCALAR X<ioctl>, join EXPR,LIST X<join>, keys HASH X<keys> X<key>, keys ARRAY, kill SIGNAL, LIST, kill SIGNAL X<kill> X<signal>, last LABEL X<last> X<break>, last EXPR, last, lc EXPR X<lc> X<lowercase>, lc, If C<use bytes> is in effect:, Otherwise, if C<use locale> for C<LC_CTYPE> is in effect:, Otherwise, If EXPR has the UTF8 flag set:, Otherwise, if C<use feature 'unicode_strings'> or C<use locale ':not_characters'> is in effect:, Otherwise:, lcfirst EXPR X<lcfirst> X<lowercase>, lcfirst, length EXPR X<length> X<size>, length, __LINE__ X<__LINE__>, link OLDFILE,NEWFILE X<link>, listen SOCKET,QUEUESIZE X<listen>, local EXPR X<local>, localtime EXPR X<localtime> X<ctime>, localtime, lock THING X<lock>, log EXPR X<log> X<logarithm> X<e> X<ln> X<base>, log, lstat FILEHANDLE X<lstat>, lstat EXPR, 
lstat DIRHANDLE, lstat, m//, map BLOCK LIST X<map>, map EXPR,LIST, mkdir FILENAME,MODE X<mkdir> X<md> X<directory, create>, mkdir FILENAME, mkdir, msgctl ID,CMD,ARG X<msgctl>, msgget KEY,FLAGS X<msgget>, msgrcv ID,VAR,SIZE,TYPE,FLAGS X<msgrcv>, msgsnd ID,MSG,FLAGS X<msgsnd>, my VARLIST X<my>, my TYPE VARLIST, my VARLIST : ATTRS, my TYPE VARLIST : ATTRS, next LABEL X<next> X<continue>, next EXPR, next, no MODULE VERSION LIST X<no declarations> X<unimporting>, no MODULE VERSION, no MODULE LIST, no MODULE, no VERSION, oct EXPR X<oct> X<octal> X<hex> X<hexadecimal> X<binary> X<bin>, oct, open FILEHANDLE,MODE,EXPR X<open> X<pipe> X<file, open> X<fopen>, open FILEHANDLE,MODE,EXPR,LIST, open FILEHANDLE,MODE,REFERENCE, open FILEHANDLE,EXPR, open FILEHANDLE, Working with files, Simple examples, About filehandles, About modes, Checking the return value, Specifying I/O layers in MODE, Using C<undef> for temporary files, Opening a filehandle into an in-memory scalar, Opening a filehandle into a command, Duping filehandles, Legacy usage, Specifying mode and filename as a single argument, Calling C<open> with one argument via global variables, Assigning a filehandle to a bareword, Other considerations, Automatic filehandle closure, Automatic pipe flushing, Direct versus by-reference assignment of filehandles, Whitespace and special characters in the filename argument, Invoking C-style C<open>, Portability issues, opendir DIRHANDLE,EXPR X<opendir>, ord EXPR X<ord> X<encoding>, ord, our VARLIST X<our> X<global>, our TYPE VARLIST, our VARLIST : ATTRS, our TYPE VARLIST : ATTRS, pack TEMPLATE,LIST X<pack>, package NAMESPACE, package NAMESPACE VERSION X<package> X<module> X<namespace> X<version>, package NAMESPACE BLOCK, package NAMESPACE VERSION BLOCK X<package> X<module> X<namespace> X<version>, __PACKAGE__ X<__PACKAGE__>, pipe READHANDLE,WRITEHANDLE X<pipe>, pop ARRAY X<pop> X<stack>, pop, pos SCALAR X<pos> X<match, position>, pos, print FILEHANDLE LIST X<print>, print FILEHANDLE, 
print LIST, print, printf FILEHANDLE FORMAT, LIST X<printf>, printf FILEHANDLE, printf FORMAT, LIST, printf, prototype FUNCTION X<prototype>, prototype, push ARRAY,LIST X<push> X<stack>, q/STRING/, qq/STRING/, qw/STRING/, qx/STRING/, qr/STRING/, quotemeta EXPR X<quotemeta> X<metacharacter>, quotemeta, rand EXPR X<rand> X<random>, rand, read FILEHANDLE,SCALAR,LENGTH,OFFSET X<read> X<file, read>, read FILEHANDLE,SCALAR,LENGTH, readdir DIRHANDLE X<readdir>, readline EXPR, readline X<readline> X<gets> X<fgets>, readlink EXPR X<readlink>, readlink, readpipe EXPR, readpipe X<readpipe>, recv SOCKET,SCALAR,LENGTH,FLAGS X<recv>, redo LABEL X<redo>, redo EXPR, redo, ref EXPR X<ref> X<reference>, ref, rename OLDNAME,NEWNAME X<rename> X<move> X<mv> X<ren>, require VERSION X<require>, require EXPR, require, reset EXPR X<reset>, reset, return EXPR X<return>, return, reverse LIST X<reverse> X<rev> X<invert>, rewinddir DIRHANDLE X<rewinddir>, rindex STR,SUBSTR,POSITION X<rindex>, rindex STR,SUBSTR, rmdir FILENAME X<rmdir> X<rd> X<directory, remove>, rmdir, s///, say FILEHANDLE LIST X<say>, say FILEHANDLE, say LIST, say, scalar EXPR X<scalar> X<context>, seek FILEHANDLE,POSITION,WHENCE X<seek> X<fseek> X<filehandle, position>, seekdir DIRHANDLE,POS X<seekdir>, select FILEHANDLE X<select> X<filehandle, default>, select, select RBITS,WBITS,EBITS,TIMEOUT X<select>, semctl ID,SEMNUM,CMD,ARG X<semctl>, semget KEY,NSEMS,FLAGS X<semget>, semop KEY,OPSTRING X<semop>, send SOCKET,MSG,FLAGS,TO X<send>, send SOCKET,MSG,FLAGS, setpgrp PID,PGRP X<setpgrp> X<group>, setpriority WHICH,WHO,PRIORITY X<setpriority> X<priority> X<nice> X<renice>, setsockopt SOCKET,LEVEL,OPTNAME,OPTVAL X<setsockopt>, shift ARRAY X<shift>, shift, shmctl ID,CMD,ARG X<shmctl>, shmget KEY,SIZE,FLAGS X<shmget>, shmread ID,VAR,POS,SIZE X<shmread> X<shmwrite>, shmwrite ID,STRING,POS,SIZE, shutdown SOCKET,HOW X<shutdown>, sin EXPR X<sin> X<sine> X<asin> X<arcsine>, sin, sleep EXPR X<sleep> X<pause>, sleep, socket 
SOCKET,DOMAIN,TYPE,PROTOCOL X<socket>, socketpair SOCKET1,SOCKET2,DOMAIN,TYPE,PROTOCOL X<socketpair>, sort SUBNAME LIST X<sort>, sort BLOCK LIST, sort LIST, splice ARRAY,OFFSET,LENGTH,LIST X<splice>, splice ARRAY,OFFSET,LENGTH, splice ARRAY,OFFSET, splice ARRAY, split /PATTERN/,EXPR,LIMIT X<split>, split /PATTERN/,EXPR, split /PATTERN/, split, sprintf FORMAT, LIST X<sprintf>, format parameter index, flags, vector flag, (minimum) width, precision, or maximum width X<precision>, size, order of arguments, sqrt EXPR X<sqrt> X<root> X<square root>, sqrt, srand EXPR X<srand> X<seed> X<randseed>, srand, stat FILEHANDLE X<stat> X<file, status> X<ctime>, stat EXPR, stat DIRHANDLE, stat, state VARLIST X<state>, state TYPE VARLIST, state VARLIST : ATTRS, state TYPE VARLIST : ATTRS, study SCALAR X<study>, study, sub NAME BLOCK X<sub>, sub NAME (PROTO) BLOCK, sub NAME : ATTRS BLOCK, sub NAME (PROTO) : ATTRS BLOCK, __SUB__ X<__SUB__>, substr EXPR,OFFSET,LENGTH,REPLACEMENT X<substr> X<substring> X<mid> X<left> X<right>, substr EXPR,OFFSET,LENGTH, substr EXPR,OFFSET, symlink OLDFILE,NEWFILE X<symlink> X<link> X<symbolic link> X<link, symbolic>, syscall NUMBER, LIST X<syscall> X<system call>, sysopen FILEHANDLE,FILENAME,MODE X<sysopen>, sysopen FILEHANDLE,FILENAME,MODE,PERMS, sysread FILEHANDLE,SCALAR,LENGTH,OFFSET X<sysread>, sysread FILEHANDLE,SCALAR,LENGTH, sysseek FILEHANDLE,POSITION,WHENCE X<sysseek> X<lseek>, system LIST X<system> X<shell>, system PROGRAM LIST, syswrite FILEHANDLE,SCALAR,LENGTH,OFFSET X<syswrite>, syswrite FILEHANDLE,SCALAR,LENGTH, syswrite FILEHANDLE,SCALAR, tell FILEHANDLE X<tell>, tell, telldir DIRHANDLE X<telldir>, tie VARIABLE,CLASSNAME,LIST X<tie>, tied VARIABLE X<tied>, time X<time> X<epoch>, times X<times>, tr///, truncate FILEHANDLE,LENGTH X<truncate>, truncate EXPR,LENGTH, uc EXPR X<uc> X<uppercase> X<toupper>, uc, ucfirst EXPR X<ucfirst> X<uppercase>, ucfirst, umask EXPR X<umask>, umask, undef EXPR X<undef> X<undefine>, undef, unlink LIST X<unlink> 
X<delete> X<remove> X<rm> X<del>, unlink, unpack TEMPLATE,EXPR X<unpack>, unpack TEMPLATE, unshift ARRAY,LIST X<unshift>, untie VARIABLE X<untie>, use Module VERSION LIST X<use> X<module> X<import>, use Module VERSION, use Module LIST, use Module, use VERSION, utime LIST X<utime>, values HASH X<values>, values ARRAY, vec EXPR,OFFSET,BITS X<vec> X<bit> X<bit vector>, wait X<wait>, waitpid PID,FLAGS X<waitpid>, wantarray X<wantarray> X<context>, warn LIST X<warn> X<warning> X<STDERR>, write FILEHANDLE X<write>, write EXPR, write, y/// =item Non-function Keywords by Cross-reference __DATA__, __END__, BEGIN, CHECK, END, INIT, UNITCHECK, DESTROY, and, cmp, eq, ge, gt, le, lt, ne, not, or, x, xor, AUTOLOAD, else, elsif, for, foreach, if, unless, until, while, elseif, default, given, when =back =back =head2 perlopentut - simple recipes for opening files and pipes in Perl =over 4 =item DESCRIPTION I<OK>, I<HANDLE>, I<MODE>, I<PATHNAME> =item Opening Text Files =over 4 =item Opening Text Files for Reading =item Opening Text Files for Writing =back =item Opening Binary Files =item Opening Pipes =over 4 =item Opening a pipe for reading =item Opening a pipe for writing =item Expressing the command as a list =back =item SEE ALSO =item AUTHOR and COPYRIGHT =back =head2 perlpacktut - tutorial on C<pack> and C<unpack> =over 4 =item DESCRIPTION =item The Basic Principle =item Packing Text =item Packing Numbers =over 4 =item Integers =item Unpacking a Stack Frame =item How to Eat an Egg on a Net =item Byte-order modifiers =item Floating point Numbers =back =item Exotic Templates =over 4 =item Bit Strings =item Uuencoding =item Doing Sums =item Unicode =item Another Portable Binary Encoding =back =item Template Grouping =item Lengths and Widths =over 4 =item String Lengths =item Dynamic Templates =item Counting Repetitions =item Intel HEX =back =item Packing and Unpacking C Structures =over 4 =item The Alignment Pit =item Dealing with Endian-ness =item Alignment, Take 2 =item 
Alignment, Take 3 =item Pointers for How to Use Them =back =item Pack Recipes =item Funnies Section =item Authors =back =head2 perlpod - the Plain Old Documentation format =over 4 =item DESCRIPTION =over 4 =item Ordinary Paragraph X<POD, ordinary paragraph> =item Verbatim Paragraph X<POD, verbatim paragraph> X<verbatim> =item Command Paragraph X<POD, command> C<=head1 I<Heading Text>> X<=head1> X<=head2> X<=head3> X<=head4> X<head1> X<head2> X<head3> X<head4>, C<=head2 I<Heading Text>>, C<=head3 I<Heading Text>>, C<=head4 I<Heading Text>>, C<=over I<indentlevel>> X<=over> X<=item> X<=back> X<over> X<item> X<back>, C<=item I<stuff...>>, C<=back>, C<=cut> X<=cut> X<cut>, C<=pod> X<=pod> X<pod>, C<=begin I<formatname>> X<=begin> X<=end> X<=for> X<begin> X<end> X<for>, C<=end I<formatname>>, C<=for I<formatname> I<text...>>, C<=encoding I<encodingname>> X<=encoding> X<encoding> =item Formatting Codes X<POD, formatting code> X<formatting code> X<POD, interior sequence> X<interior sequence> C<IE<lt>textE<gt>> -- italic text X<I> X<< IZ<><> >> X<POD, formatting code, italic> X<italic>, C<BE<lt>textE<gt>> -- bold text X<B> X<< BZ<><> >> X<POD, formatting code, bold> X<bold>, C<CE<lt>codeE<gt>> -- code text X<C> X<< CZ<><> >> X<POD, formatting code, code> X<code>, C<LE<lt>nameE<gt>> -- a hyperlink X<L> X<< LZ<><> >> X<POD, formatting code, hyperlink> X<hyperlink>, C<EE<lt>escapeE<gt>> -- a character escape X<E> X<< EZ<><> >> X<POD, formatting code, escape> X<escape>, C<FE<lt>filenameE<gt>> -- used for filenames X<F> X<< FZ<><> >> X<POD, formatting code, filename> X<filename>, C<SE<lt>textE<gt>> -- text contains non-breaking spaces X<S> X<< SZ<><> >> X<POD, formatting code, non-breaking space> X<non-breaking space>, C<XE<lt>topic nameE<gt>> -- an index entry X<X> X<< XZ<><> >> X<POD, formatting code, index entry> X<index entry>, C<ZE<lt>E<gt>> -- a null (zero-effect) formatting code X<Z> X<< ZZ<><> >> X<POD, formatting code, null> X<null> =item The Intent X<POD, intent of> 
=item Embedding Pods in Perl Modules X<POD, embedding> =item Hints for Writing Pod X<podchecker> X<POD, validating> =back =item SEE ALSO =item AUTHOR =back =head2 perlpodspec - Plain Old Documentation: format specification and notes =over 4 =item DESCRIPTION =item Pod Definitions =item Pod Commands "=head1", "=head2", "=head3", "=head4", "=pod", "=cut", "=over", "=item", "=back", "=begin formatname", "=begin formatname parameter", "=end formatname", "=for formatname text...", "=encoding encodingname" =item Pod Formatting Codes C<IE<lt>textE<gt>> -- italic text, C<BE<lt>textE<gt>> -- bold text, C<CE<lt>codeE<gt>> -- code text, C<FE<lt>filenameE<gt>> -- style for filenames, C<XE<lt>topic nameE<gt>> -- an index entry, C<ZE<lt>E<gt>> -- a null (zero-effect) formatting code, C<LE<lt>nameE<gt>> -- a hyperlink, C<EE<lt>escapeE<gt>> -- a character escape, C<SE<lt>textE<gt>> -- text contains non-breaking spaces =item Notes on Implementing Pod Processors =item About LE<lt>...E<gt> Codes First:, Second:, Third:, Fourth:, Fifth:, Sixth: =item About =over...=back Regions =item About Data Paragraphs and "=begin/=end" Regions =item SEE ALSO =item AUTHOR =back =head2 perlpodstyle - Perl POD style guide =over 4 =item DESCRIPTION NAME, SYNOPSIS, DESCRIPTION, OPTIONS, RETURN VALUE, ERRORS, DIAGNOSTICS, EXAMPLES, ENVIRONMENT, FILES, CAVEATS, BUGS, RESTRICTIONS, NOTES, AUTHOR, HISTORY, COPYRIGHT AND LICENSE, SEE ALSO =item AUTHOR =item COPYRIGHT AND LICENSE =item SEE ALSO =back =head2 perldiag - various Perl diagnostics =over 4 =item DESCRIPTION =item SEE ALSO =back =head2 perldeprecation - list Perl deprecations =over 4 =item DESCRIPTION =over 4 =item Perl 5.34 =item Perl 5.32 =item Perl 5.30 =item Perl 5.28 =item Perl 5.26 =item Perl 5.24 =item Perl 5.16 =back =item SEE ALSO =back =head2 perllexwarn - Perl Lexical Warnings =over 4 =item DESCRIPTION =back =head2 perldebug - Perl debugging =over 4 =item DESCRIPTION =item The Perl Debugger =over 4 =item Calling the Debugger perl -d 
program_name, perl -d -e 0, perl -d:ptkdb program_name, perl -dt threaded_program_name =item Debugger Commands h X<debugger command, h>, h [command], h h, p expr X<debugger command, p>, x [maxdepth] expr X<debugger command, x>, V [pkg [vars]] X<debugger command, V>, X [vars] X<debugger command, X>, y [level [vars]] X<debugger command, y>, T X<debugger command, T> X<backtrace> X<stack, backtrace>, s [expr] X<debugger command, s> X<step>, n [expr] X<debugger command, n>, r X<debugger command, r>, <CR>, c [line|sub] X<debugger command, c>, l X<debugger command, l>, l min+incr, l min-max, l line, l subname, - X<debugger command, ->, v [line] X<debugger command, v>, . X<debugger command, .>, f filename X<debugger command, f>, /pattern/, ?pattern?, L [abw] X<debugger command, L>, S [[!]regex] X<debugger command, S>, t [n] X<debugger command, t>, t [n] expr X<debugger command, t>, b X<breakpoint> X<debugger command, b>, b [line] [condition] X<breakpoint> X<debugger command, b>, b [file]:[line] [condition] X<breakpoint> X<debugger command, b>, b subname [condition] X<breakpoint> X<debugger command, b>, b postpone subname [condition] X<breakpoint> X<debugger command, b>, b load filename X<breakpoint> X<debugger command, b>, b compile subname X<breakpoint> X<debugger command, b>, B line X<breakpoint> X<debugger command, B>, B * X<breakpoint> X<debugger command, B>, disable [file]:[line] X<breakpoint> X<debugger command, disable> X<disable>, disable [line] X<breakpoint> X<debugger command, disable> X<disable>, enable [file]:[line] X<breakpoint> X<debugger command, disable> X<disable>, enable [line] X<breakpoint> X<debugger command, disable> X<disable>, a [line] command X<debugger command, a>, A line X<debugger command, A>, A * X<debugger command, A>, w expr X<debugger command, w>, W expr X<debugger command, W>, W * X<debugger command, W>, o X<debugger command, o>, o booloption ... X<debugger command, o>, o anyoption? ... X<debugger command, o>, o option=value ... 
X<debugger command, o>, < ? X<< debugger command, < >>, < [ command ] X<< debugger command, < >>, < * X<< debugger command, < >>, << command X<< debugger command, << >>, > ? X<< debugger command, > >>, > command X<< debugger command, > >>, > * X<< debugger command, > >>, >> command X<<< debugger command, >> >>>, { ? X<debugger command, {>, { [ command ], { * X<debugger command, {>, {{ command X<debugger command, {{>, ! number X<debugger command, !>, ! -number X<debugger command, !>, ! pattern X<debugger command, !>, !! cmd X<debugger command, !!>, source file X<debugger command, source>, H -number X<debugger command, H>, q or ^D X<debugger command, q> X<debugger command, ^D>, R X<debugger command, R>, |dbcmd X<debugger command, |>, ||dbcmd X<debugger command, ||>, command, m expr X<debugger command, m>, M X<debugger command, M>, man [manpage] X<debugger command, man> =item Configurable Options C<recallCommand>, C<ShellBang> X<debugger option, recallCommand> X<debugger option, ShellBang>, C<pager> X<debugger option, pager>, C<tkRunning> X<debugger option, tkRunning>, C<signalLevel>, C<warnLevel>, C<dieLevel> X<debugger option, signalLevel> X<debugger option, warnLevel> X<debugger option, dieLevel>, C<AutoTrace> X<debugger option, AutoTrace>, C<LineInfo> X<debugger option, LineInfo>, C<inhibit_exit> X<debugger option, inhibit_exit>, C<PrintRet> X<debugger option, PrintRet>, C<ornaments> X<debugger option, ornaments>, C<frame> X<debugger option, frame>, C<maxTraceLen> X<debugger option, maxTraceLen>, C<windowSize> X<debugger option, windowSize>, C<arrayDepth>, C<hashDepth> X<debugger option, arrayDepth> X<debugger option, hashDepth>, C<dumpDepth> X<debugger option, dumpDepth>, C<compactDump>, C<veryCompact> X<debugger option, compactDump> X<debugger option, veryCompact>, C<globPrint> X<debugger option, globPrint>, C<DumpDBFiles> X<debugger option, DumpDBFiles>, C<DumpPackages> X<debugger option, DumpPackages>, C<DumpReused> X<debugger option, DumpReused>, C<quote>, 
C<HighBit>, C<undefPrint> X<debugger option, quote> X<debugger option, HighBit> X<debugger option, undefPrint>, C<UsageOnly> X<debugger option, UsageOnly>, C<HistFile> X<debugger option, history, HistFile>, C<HistSize> X<debugger option, history, HistSize>, C<TTY> X<debugger option, TTY>, C<noTTY> X<debugger option, noTTY>, C<ReadLine> X<debugger option, ReadLine>, C<NonStop> X<debugger option, NonStop> =item Debugger Input/Output Prompt, Multiline commands, Stack backtrace X<backtrace> X<stack, backtrace>, Line Listing Format, Frame listing =item Debugging Compile-Time Statements =item Debugger Customization =item Readline Support / History in the Debugger =item Editor Support for Debugging =item The Perl Profiler X<profile> X<profiling> X<profiler> =back =item Debugging Regular Expressions X<regular expression, debugging> X<regex, debugging> X<regexp, debugging> =item Debugging Memory Usage X<memory usage> =item SEE ALSO =item BUGS =back =head2 perlvar - Perl predefined variables =over 4 =item DESCRIPTION =over 4 =item The Syntax of Variable Names =back =item SPECIAL VARIABLES =over 4 =item General Variables $ARG, $_ X<$_> X<$ARG>, @ARG, @_ X<@_> X<@ARG>, $LIST_SEPARATOR, $" X<$"> X<$LIST_SEPARATOR>, $PROCESS_ID, $PID, $$ X<$$> X<$PID> X<$PROCESS_ID>, $PROGRAM_NAME, $0 X<$0> X<$PROGRAM_NAME>, $REAL_GROUP_ID, $GID, $( X<$(> X<$GID> X<$REAL_GROUP_ID>, $EFFECTIVE_GROUP_ID, $EGID, $) X<$)> X<$EGID> X<$EFFECTIVE_GROUP_ID>, $REAL_USER_ID, $UID, $< X<< $< >> X<$UID> X<$REAL_USER_ID>, $EFFECTIVE_USER_ID, $EUID, $> X<< $> >> X<$EUID> X<$EFFECTIVE_USER_ID>, $SUBSCRIPT_SEPARATOR, $SUBSEP, $; X<$;> X<$SUBSEP> X<SUBSCRIPT_SEPARATOR>, $a, $b X<$a> X<$b>, %ENV X<%ENV>, $OLD_PERL_VERSION, $] X<$]> X<$OLD_PERL_VERSION>, $SYSTEM_FD_MAX, $^F X<$^F> X<$SYSTEM_FD_MAX>, @F X<@F>, @INC X<@INC>, %INC X<%INC>, $INPLACE_EDIT, $^I X<$^I> X<$INPLACE_EDIT>, @ISA X<@ISA>, $^M X<$^M>, $OSNAME, $^O X<$^O> X<$OSNAME>, %SIG X<%SIG>, $BASETIME, $^T X<$^T> X<$BASETIME>, $PERL_VERSION, $^V X<$^V> 
X<$PERL_VERSION>, ${^WIN32_SLOPPY_STAT} X<${^WIN32_SLOPPY_STAT}> X<sitecustomize> X<sitecustomize.pl>, $EXECUTABLE_NAME, $^X X<$^X> X<$EXECUTABLE_NAME> =item Variables related to regular expressions $<I<digits>> ($1, $2, ...) X<$1> X<$2> X<$3> X<$I<digits>>, @{^CAPTURE} X<@{^CAPTURE}> X<@^CAPTURE>, $MATCH, $& X<$&> X<$MATCH>, ${^MATCH} X<${^MATCH}>, $PREMATCH, $` X<$`> X<$PREMATCH> X<${^PREMATCH}>, ${^PREMATCH} X<$`> X<${^PREMATCH}>, $POSTMATCH, $' X<$'> X<$POSTMATCH> X<${^POSTMATCH}> X<@->, ${^POSTMATCH} X<${^POSTMATCH}> X<$'> X<$POSTMATCH>, $LAST_PAREN_MATCH, $+ X<$+> X<$LAST_PAREN_MATCH>, $LAST_SUBMATCH_RESULT, $^N X<$^N> X<$LAST_SUBMATCH_RESULT>, @LAST_MATCH_END, @+ X<@+> X<@LAST_MATCH_END>, %{^CAPTURE}, %LAST_PAREN_MATCH, %+ X<%+> X<%LAST_PAREN_MATCH> X<%{^CAPTURE}>, @LAST_MATCH_START, @- X<@-> X<@LAST_MATCH_START>, C<$`> is the same as C<substr($var, 0, $-[0])>, C<$&> is the same as C<substr($var, $-[0], $+[0] - $-[0])>, C<$'> is the same as C<substr($var, $+[0])>, C<$1> is the same as C<substr($var, $-[1], $+[1] - $-[1])>, C<$2> is the same as C<substr($var, $-[2], $+[2] - $-[2])>, C<$3> is the same as C<substr($var, $-[3], $+[3] - $-[3])>, %{^CAPTURE_ALL} X<%{^CAPTURE_ALL}>, %- X<%->, $LAST_REGEXP_CODE_RESULT, $^R X<$^R> X<$LAST_REGEXP_CODE_RESULT>, ${^RE_COMPILE_RECURSION_LIMIT} X<${^RE_COMPILE_RECURSION_LIMIT}>, ${^RE_DEBUG_FLAGS} X<${^RE_DEBUG_FLAGS}>, ${^RE_TRIE_MAXBUF} X<${^RE_TRIE_MAXBUF}> =item Variables related to filehandles $ARGV X<$ARGV>, @ARGV X<@ARGV>, ARGV X<ARGV>, ARGVOUT X<ARGVOUT>, IO::Handle->output_field_separator( EXPR ), $OUTPUT_FIELD_SEPARATOR, $OFS, $, X<$,> X<$OFS> X<$OUTPUT_FIELD_SEPARATOR>, HANDLE->input_line_number( EXPR ), $INPUT_LINE_NUMBER, $NR, $. 
X<$.> X<$NR> X<$INPUT_LINE_NUMBER> X<line number>, IO::Handle->input_record_separator( EXPR ), $INPUT_RECORD_SEPARATOR, $RS, $/ X<$/> X<$RS> X<$INPUT_RECORD_SEPARATOR>, IO::Handle->output_record_separator( EXPR ), $OUTPUT_RECORD_SEPARATOR, $ORS, $\ X<$\> X<$ORS> X<$OUTPUT_RECORD_SEPARATOR>, HANDLE->autoflush( EXPR ), $OUTPUT_AUTOFLUSH, $| X<$|> X<autoflush> X<flush> X<$OUTPUT_AUTOFLUSH>, ${^LAST_FH} X<${^LAST_FH}>, $ACCUMULATOR, $^A X<$^A> X<$ACCUMULATOR>, IO::Handle->format_formfeed(EXPR), $FORMAT_FORMFEED, $^L X<$^L> X<$FORMAT_FORMFEED>, HANDLE->format_page_number(EXPR), $FORMAT_PAGE_NUMBER, $% X<$%> X<$FORMAT_PAGE_NUMBER>, HANDLE->format_lines_left(EXPR), $FORMAT_LINES_LEFT, $- X<$-> X<$FORMAT_LINES_LEFT>, IO::Handle->format_line_break_characters EXPR, $FORMAT_LINE_BREAK_CHARACTERS, $: X<$:> X<FORMAT_LINE_BREAK_CHARACTERS>, HANDLE->format_lines_per_page(EXPR), $FORMAT_LINES_PER_PAGE, $= X<$=> X<$FORMAT_LINES_PER_PAGE>, HANDLE->format_top_name(EXPR), $FORMAT_TOP_NAME, $^ X<$^> X<$FORMAT_TOP_NAME>, HANDLE->format_name(EXPR), $FORMAT_NAME, $~ X<$~> X<$FORMAT_NAME> =item Error Variables X<error> X<exception> ${^CHILD_ERROR_NATIVE} X<$^CHILD_ERROR_NATIVE>, $EXTENDED_OS_ERROR, $^E X<$^E> X<$EXTENDED_OS_ERROR>, $EXCEPTIONS_BEING_CAUGHT, $^S X<$^S> X<$EXCEPTIONS_BEING_CAUGHT>, $WARNING, $^W X<$^W> X<$WARNING>, ${^WARNING_BITS} X<${^WARNING_BITS}>, $OS_ERROR, $ERRNO, $! X<$!> X<$ERRNO> X<$OS_ERROR>, %OS_ERROR, %ERRNO, %! X<%!> X<%OS_ERROR> X<%ERRNO>, $CHILD_ERROR, $? 
X<$?> X<$CHILD_ERROR>, $EVAL_ERROR, $@ X<$@> X<$EVAL_ERROR> =item Variables related to the interpreter state $COMPILING, $^C X<$^C> X<$COMPILING>, $DEBUGGING, $^D X<$^D> X<$DEBUGGING>, ${^ENCODING} X<${^ENCODING}>, ${^GLOBAL_PHASE} X<${^GLOBAL_PHASE}>, CONSTRUCT, START, CHECK, INIT, RUN, END, DESTRUCT, $^H X<$^H>, %^H X<%^H>, ${^OPEN} X<${^OPEN}>, $PERLDB, $^P X<$^P> X<$PERLDB>, 0x01, 0x02, 0x04, 0x08, 0x10, 0x20, 0x40, 0x80, 0x100, 0x200, 0x400, 0x800, 0x1000, ${^TAINT} X<${^TAINT}>, ${^SAFE_LOCALES} X<${^SAFE_LOCALES}>, ${^UNICODE} X<${^UNICODE}>, ${^UTF8CACHE} X<${^UTF8CACHE}>, ${^UTF8LOCALE} X<${^UTF8LOCALE}> =item Deprecated and removed variables $# X<$#>, $* X<$*>, $[ X<$[> =back =back =head2 perlre - Perl regular expressions =over 4 =item DESCRIPTION =over 4 =item The Basics X<regular expression, version 8> X<regex, version 8> X<regexp, version 8> =item Modifiers B<C<m>> X</m> X<regex, multiline> X<regexp, multiline> X<regular expression, multiline>, B<C<s>> X</s> X<regex, single-line> X<regexp, single-line> X<regular expression, single-line>, B<C<i>> X</i> X<regex, case-insensitive> X<regexp, case-insensitive> X<regular expression, case-insensitive>, B<C<x>> and B<C<xx>> X</x>, B<C<p>> X</p> X<regex, preserve> X<regexp, preserve>, B<C<a>>, B<C<d>>, B<C<l>>, and B<C<u>> X</a> X</d> X</l> X</u>, B<C<n>> X</n> X<regex, non-capture> X<regexp, non-capture> X<regular expression, non-capture>, Other Modifiers =item Regular Expressions [1], [2], [3], [4], [5], [6], [7], [8] =item Quoting metacharacters =item Extended Patterns C<(?#I<text>)> X<(?#)>, C<(?adlupimnsx-imnsx)>, C<(?^alupimnsx)> X<(?)> X<(?^)>, C<(?:I<pattern>)> X<(?:)>, C<(?adluimnsx-imnsx:I<pattern>)>, C<(?^aluimnsx:I<pattern>)> X<(?^:)>, C<(?|I<pattern>)> X<(?|)> X<Branch reset>, Lookaround Assertions X<look-around assertion> X<lookaround assertion> X<look-around> X<lookaround>, C<(?=I<pattern>)>, C<(*pla:I<pattern>)>, C<(*positive_lookahead:I<pattern>)> X<(?=)> X<(*pla> X<(*positive_lookahead> 
X<look-ahead, positive> X<lookahead, positive>, C<(?!I<pattern>)>, C<(*nla:I<pattern>)>, C<(*negative_lookahead:I<pattern>)> X<(?!)> X<(*nla> X<(*negative_lookahead> X<look-ahead, negative> X<lookahead, negative>, C<(?<=I<pattern>)>, C<\K>, C<(*plb:I<pattern>)>, C<(*positive_lookbehind:I<pattern>)> X<(?<=)> X<(*plb> X<(*positive_lookbehind> X<look-behind, positive> X<lookbehind, positive> X<\K>, C<(?<!I<pattern>)>, C<(*nlb:I<pattern>)>, C<(*negative_lookbehind:I<pattern>)> X<(?<!)> X<(*nlb> X<(*negative_lookbehind> X<look-behind, negative> X<lookbehind, negative>, C<< (?<I<NAME>>I<pattern>) >>, C<(?'I<NAME>'I<pattern>)> X<< (?<NAME>) >> X<(?'NAME')> X<named capture> X<capture>, C<< \k<I<NAME>> >>, C<< \k'I<NAME>' >>, C<(?{ I<code> })> X<(?{})> X<regex, code in> X<regexp, code in> X<regular expression, code in>, C<(??{ I<code> })> X<(??{})> X<regex, postponed> X<regexp, postponed> X<regular expression, postponed>, C<(?I<PARNO>)> C<(?-I<PARNO>)> C<(?+I<PARNO>)> C<(?R)> C<(?0)> X<(?PARNO)> X<(?1)> X<(?R)> X<(?0)> X<(?-1)> X<(?+1)> X<(?-PARNO)> X<(?+PARNO)> X<regex, recursive> X<regexp, recursive> X<regular expression, recursive> X<regex, relative recursion> X<GOSUB> X<GOSTART>, C<(?&I<NAME>)> X<(?&NAME)>, C<(?(I<condition>)I<yes-pattern>|I<no-pattern>)> X<(?()>, C<(?(I<condition>)I<yes-pattern>)>, an integer in parentheses, a lookahead/lookbehind/evaluate zero-width assertion;, a name in angle brackets or single quotes, the special symbol C<(R)>, C<(1)> C<(2)> .., C<(E<lt>I<NAME>E<gt>)> C<('I<NAME>')>, C<(?=...)> C<(?!...)> C<(?<=...)> C<(?<!...)>, C<(?{ I<CODE> })>, C<(R)>, C<(R1)> C<(R2)> .., C<(R&I<NAME>)>, C<(DEFINE)>, C<< (?>I<pattern>) >>, C<< (*atomic:I<pattern>) >> X<(?E<gt>pattern)> X<(*atomic> X<backtrack> X<backtracking> X<atomic> X<possessive>, C<(?[ ])> =item Backtracking X<backtrack> X<backtracking> =item Script Runs X<(*script_run:...)> X<(sr:...)> X<(*atomic_script_run:...)> X<(asr:...)> =item Special Backtracking Control Verbs Verbs, C<(*PRUNE)> 
C<(*PRUNE:I<NAME>)> X<(*PRUNE)> X<(*PRUNE:NAME)>, C<(*SKIP)> C<(*SKIP:I<NAME>)> X<(*SKIP)>, C<(*MARK:I<NAME>)> C<(*:I<NAME>)> X<(*MARK)> X<(*MARK:NAME)> X<(*:NAME)>, C<(*THEN)> C<(*THEN:I<NAME>)>, C<(*COMMIT)> C<(*COMMIT:I<arg>)> X<(*COMMIT)>, C<(*FAIL)> C<(*F)> C<(*FAIL:I<arg>)> X<(*FAIL)> X<(*F)>, C<(*ACCEPT)> C<(*ACCEPT:I<arg>)> X<(*ACCEPT)> =item Warning on C<\1> Instead of C<$1> =item Repeated Patterns Matching a Zero-length Substring =item Combining RE Pieces C<ST>, C<S|T>, C<S{REPEAT_COUNT}>, C<S{min,max}>, C<S{min,max}?>, C<S?>, C<S*>, C<S+>, C<S??>, C<S*?>, C<S+?>, C<< (?>S) >>, C<(?=S)>, C<(?<=S)>, C<(?!S)>, C<(?<!S)>, C<(??{ I<EXPR> })>, C<(?I<PARNO>)>, C<(?(I<condition>)I<yes-pattern>|I<no-pattern>)> =item Creating Custom RE Engines =item Embedded Code Execution Frequency =item PCRE/Python Support C<< (?PE<lt>I<NAME>E<gt>I<pattern>) >>, C<< (?P=I<NAME>) >>, C<< (?P>I<NAME>) >> =back =item BUGS =item SEE ALSO =back =head2 perlrebackslash - Perl Regular Expression Backslash Sequences and Escapes =over 4 =item DESCRIPTION =over 4 =item The backslash [1] =item All the sequences and escapes =item Character Escapes [1], [2] =item Modifiers =item Character classes =item Referencing =item Assertions \A, \z, \Z, \G, \b{}, \b, \B{}, \B, C<\b{gcb}> or C<\b{g}>, C<\b{lb}>, C<\b{sb}>, C<\b{wb}> =item Misc \K, \N, \R X<\R>, \X X<\X> =back =back =head2 perlrecharclass - Perl Regular Expression Character Classes =over 4 =item DESCRIPTION =over 4 =item The dot =item Backslash sequences X<\w> X<\W> X<\s> X<\S> X<\d> X<\D> X<\p> X<\P> X<\N> X<\v> X<\V> X<\h> X<\H> X<word> X<whitespace> If the C</a> modifier is in effect .., otherwise .., For code points above 255 .., For code points below 256 .., if locale rules are in effect .., if, instead, Unicode rules are in effect .., otherwise .., If the C</a> modifier is in effect .., otherwise .., For code points above 255 .., For code points below 256 .., if locale rules are in effect .., if, instead, Unicode rules are in effect 
.., otherwise .., [1], [2] =item Bracketed Character Classes [1], [2], [3], [4], [5], [6], [7], If the C</a> modifier, is in effect .., otherwise .., For code points above 255 .., For code points below 256 .., if locale rules are in effect .., C<word>, C<ascii>, C<blank>, if, instead, Unicode rules are in effect .., otherwise .. =back =back =head2 perlreref - Perl Regular Expressions Reference =over 4 =item DESCRIPTION =over 4 =item OPERATORS =item SYNTAX =item ESCAPE SEQUENCES =item CHARACTER CLASSES =item ANCHORS =item QUANTIFIERS =item EXTENDED CONSTRUCTS =item VARIABLES =item FUNCTIONS =item TERMINOLOGY =back =item AUTHOR =item SEE ALSO =item THANKS =back =head2 perlref - Perl references and nested data structures =over 4 =item NOTE =item DESCRIPTION =over 4 =item Making References X<reference, creation> X<referencing> =item Using References X<reference, use> X<dereferencing> X<dereference> =item Circular References X<circular reference> X<reference, circular> =item Symbolic references X<reference, symbolic> X<reference, soft> X<symbolic reference> X<soft reference> =item Not-so-symbolic references =item Pseudo-hashes: Using an array as a hash X<pseudo-hash> X<pseudo hash> X<pseudohash> =item Function Templates X<scope, lexical> X<closure> X<lexical> X<lexical scope> X<subroutine, nested> X<sub, nested> X<subroutine, local> X<sub, local> =back =item WARNING: Don't use references as hash keys X<reference, string context> X<reference, use as hash key> =over 4 =item Postfix Dereference Syntax =item Postfix Reference Slicing =item Assigning to References =back =item Declaring a Reference to a Variable =item SEE ALSO =back =head2 perlform - Perl formats =over 4 =item DESCRIPTION =over 4 =item Text Fields X<format, text field> =item Numeric Fields X<#> X<format, numeric field> =item The Field @* for Variable-Width Multi-Line Text X<@*> =item The Field ^* for Variable-Width One-line-at-a-time Text X<^*> =item Specifying Values X<format, specifying values> =item Using 
Fill Mode X<format, fill mode> =item Suppressing Lines Where All Fields Are Void X<format, suppressing lines> =item Repeating Format Lines X<format, repeating lines> =item Top of Form Processing X<format, top of form> X<top> X<header> =item Format Variables X<format variables> X<format, variables> =back =item NOTES =over 4 =item Footers X<format, footer> X<footer> =item Accessing Formatting Internals X<format, internals> =back =item WARNINGS =back =head2 perlobj - Perl object reference =over 4 =item DESCRIPTION =over 4 =item An Object is Simply a Data Structure X<object> X<bless> X<constructor> X<new> =item A Class is Simply a Package X<class> X<package> X<@ISA> X<inheritance> =item A Method is Simply a Subroutine X<method> =item Method Invocation X<invocation> X<method> X<arrow> X<< -> >> =item Inheritance X<inheritance> =item Writing Constructors X<constructor> =item Attributes X<attribute> =item An Aside About Smarter and Safer Code =item Method Call Variations X<method> =item Invoking Class Methods X<invocation> =item C<bless>, C<blessed>, and C<ref> =item The UNIVERSAL Class X<UNIVERSAL> isa($class) X<isa>, DOES($role) X<DOES>, can($method) X<can>, VERSION($need) X<VERSION> =item AUTOLOAD X<AUTOLOAD> =item Destructors X<destructor> X<DESTROY> =item Non-Hash Objects =item Inside-Out objects =item Pseudo-hashes =back =item SEE ALSO =back =head2 perltie - how to hide an object class in a simple variable =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item Tying Scalars X<scalar, tying> TIESCALAR classname, LIST X<TIESCALAR>, FETCH this X<FETCH>, STORE this, value X<STORE>, UNTIE this X<UNTIE>, DESTROY this X<DESTROY> =item Tying Arrays X<array, tying> TIEARRAY classname, LIST X<TIEARRAY>, FETCH this, index X<FETCH>, STORE this, index, value X<STORE>, FETCHSIZE this X<FETCHSIZE>, STORESIZE this, count X<STORESIZE>, EXTEND this, count X<EXTEND>, EXISTS this, key X<EXISTS>, DELETE this, key X<DELETE>, CLEAR this X<CLEAR>, PUSH this, LIST X<PUSH>, POP this X<POP>, 
SHIFT this X<SHIFT>, UNSHIFT this, LIST X<UNSHIFT>, SPLICE this, offset, length, LIST X<SPLICE>, UNTIE this X<UNTIE>, DESTROY this X<DESTROY> =item Tying Hashes X<hash, tying> USER, HOME, CLOBBER, LIST, TIEHASH classname, LIST X<TIEHASH>, FETCH this, key X<FETCH>, STORE this, key, value X<STORE>, DELETE this, key X<DELETE>, CLEAR this X<CLEAR>, EXISTS this, key X<EXISTS>, FIRSTKEY this X<FIRSTKEY>, NEXTKEY this, lastkey X<NEXTKEY>, SCALAR this X<SCALAR>, UNTIE this X<UNTIE>, DESTROY this X<DESTROY> =item Tying FileHandles X<filehandle, tying> TIEHANDLE classname, LIST X<TIEHANDLE>, WRITE this, LIST X<WRITE>, PRINT this, LIST X<PRINT>, PRINTF this, LIST X<PRINTF>, READ this, LIST X<READ>, READLINE this X<READLINE>, GETC this X<GETC>, EOF this X<EOF>, CLOSE this X<CLOSE>, UNTIE this X<UNTIE>, DESTROY this X<DESTROY> =item UNTIE this X<UNTIE> =item The C<untie> Gotcha X<untie> =back =item SEE ALSO =item BUGS =item AUTHOR =back =head2 perldbmfilter - Perl DBM Filters =over 4 =item SYNOPSIS =item DESCRIPTION B<filter_store_key>, B<filter_store_value>, B<filter_fetch_key>, B<filter_fetch_value> =over 4 =item The Filter =item An Example: the NULL termination problem. =item Another Example: Key is a C int. 
=back =item SEE ALSO =item AUTHOR =back =head2 perlipc - Perl interprocess communication (signals, fifos, pipes, safe subprocesses, sockets, and semaphores) =over 4 =item DESCRIPTION =item Signals =over 4 =item Handling the SIGHUP Signal in Daemons =item Deferred Signals (Safe Signals) Long-running opcodes, Interrupting IO, Restartable system calls, Signals as "faults", Signals triggered by operating system state =back =item Named Pipes =item Using open() for IPC =over 4 =item Filehandles =item Background Processes =item Complete Dissociation of Child from Parent =item Safe Pipe Opens =item Avoiding Pipe Deadlocks =item Bidirectional Communication with Another Process =item Bidirectional Communication with Yourself =back =item Sockets: Client/Server Communication =over 4 =item Internet Line Terminators =item Internet TCP Clients and Servers =item Unix-Domain TCP Clients and Servers =back =item TCP Clients with IO::Socket =over 4 =item A Simple Client C<Proto>, C<PeerAddr>, C<PeerPort> =item A Webget Client =item Interactive Client with IO::Socket =back =item TCP Servers with IO::Socket Proto, LocalPort, Listen, Reuse =item UDP: Message Passing =item SysV IPC =item NOTES =item BUGS =item AUTHOR =item SEE ALSO =back =head2 perlfork - Perl's fork() emulation =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item Behavior of other Perl features in forked pseudo-processes $$ or $PROCESS_ID, %ENV, chdir() and all other builtins that accept filenames, wait() and waitpid(), kill(), exec(), exit(), Open handles to files, directories and network sockets =item Resource limits =item Killing the parent process =item Lifetime of the parent process and pseudo-processes =back =item CAVEATS AND LIMITATIONS BEGIN blocks, Open filehandles, Open directory handles, Forking pipe open() not yet implemented, Global state maintained by XSUBs, Interpreter embedded in larger application, Thread-safety of extensions =item PORTABILITY CAVEATS =item BUGS =item AUTHOR =item SEE ALSO =back =head2 
perlnumber - semantics of numbers and numeric operations in Perl =over 4 =item SYNOPSIS =item DESCRIPTION =item Storing numbers =item Numeric operators and numeric conversions =item Flavors of Perl numeric operations Arithmetic operators, ++, Arithmetic operators during C<use integer>, Other mathematical operators, Bitwise operators, Bitwise operators during C<use integer>, Operators which expect an integer, Operators which expect a string =item AUTHOR =item SEE ALSO =back =head2 perlthrtut - Tutorial on threads in Perl =over 4 =item DESCRIPTION =item What Is A Thread Anyway? =item Threaded Program Models =over 4 =item Boss/Worker =item Work Crew =item Pipeline =back =item What kind of threads are Perl threads? =item Thread-Safe Modules =item Thread Basics =over 4 =item Basic Thread Support =item A Note about the Examples =item Creating Threads =item Waiting For A Thread To Exit =item Ignoring A Thread =item Process and Thread Termination =back =item Threads And Data =over 4 =item Shared And Unshared Data =item Thread Pitfalls: Races =back =item Synchronization and control =over 4 =item Controlling access: lock() =item A Thread Pitfall: Deadlocks =item Queues: Passing Data Around =item Semaphores: Synchronizing Data Access =item Basic semaphores =item Advanced Semaphores =item Waiting for a Condition =item Giving up control =back =item General Thread Utility Routines =over 4 =item What Thread Am I In? =item Thread IDs =item Are These Threads The Same? =item What Threads Are Running? 
=back =item A Complete Example =item Different implementations of threads =item Performance considerations =item Process-scope Changes =item Thread-Safety of System Libraries =item Conclusion =item SEE ALSO =item Bibliography =over 4 =item Introductory Texts =item OS-Related References =item Other References =back =item Acknowledgements =item AUTHOR =item Copyrights =back =head2 perlport - Writing portable Perl =over 4 =item DESCRIPTION Not all Perl programs have to be portable, Nearly all of Perl already I<is> portable =item ISSUES =over 4 =item Newlines =item Numbers endianness and Width =item Files and Filesystems =item System Interaction =item Command names versus file pathnames =item Networking =item Interprocess Communication (IPC) =item External Subroutines (XS) =item Standard Modules =item Time and Date =item Character sets and character encoding =item Internationalisation =item System Resources =item Security =item Style =back =item CPAN Testers =item PLATFORMS =over 4 =item Unix =item DOS and Derivatives =item VMS =item VOS =item EBCDIC Platforms =item Acorn RISC OS =item Other perls =back =item FUNCTION IMPLEMENTATIONS =over 4 =item Alphabetical Listing of Perl Functions -I<X>, alarm, atan2, binmode, chmod, chown, chroot, crypt, dbmclose, dbmopen, dump, exec, exit, fcntl, flock, fork, getlogin, getpgrp, getppid, getpriority, getpwnam, getgrnam, getnetbyname, getpwuid, getgrgid, getnetbyaddr, getprotobynumber, getpwent, getgrent, gethostbyname, gethostent, getnetent, getprotoent, getservent, seekdir, sethostent, setnetent, setprotoent, setservent, endpwent, endgrent, endhostent, endnetent, endprotoent, endservent, getsockopt, glob, gmtime, ioctl, kill, link, localtime, lstat, msgctl, msgget, msgsnd, msgrcv, open, readlink, rename, rewinddir, select, semctl, semget, semop, setgrent, setpgrp, setpriority, setpwent, setsockopt, shmctl, shmget, shmread, shmwrite, sleep, socketpair, stat, symlink, syscall, sysopen, system, telldir, times, truncate, umask, 
utime, wait, waitpid =back =item Supported Platforms Linux (x86, ARM, IA64), HP-UX, AIX, Win32, Windows 2000, Windows XP, Windows Server 2003, Windows Vista, Windows Server 2008, Windows 7, Cygwin, Solaris (x86, SPARC), OpenVMS, Alpha (7.2 and later), I64 (8.2 and later), Symbian, NetBSD, FreeBSD, Debian GNU/kFreeBSD, Haiku, Irix (6.5. What else?), OpenBSD, Dragonfly BSD, Midnight BSD, QNX Neutrino RTOS (6.5.0), MirOS BSD, Stratus OpenVOS (17.0 or later), time_t issues that may or may not be fixed, Symbian (Series 60 v3, 3.2 and 5 - what else?), Stratus VOS / OpenVOS, AIX, Android, FreeMINT =item EOL Platforms =over 4 =item (Perl 5.20) AT&T 3b1 =item (Perl 5.14) Windows 95, Windows 98, Windows ME, Windows NT4 =item (Perl 5.12) Atari MiNT, Apollo Domain/OS, Apple Mac OS 8/9, Tenon Machten =back =item Supported Platforms (Perl 5.8) =item SEE ALSO =item AUTHORS / CONTRIBUTORS =back =head2 perllocale - Perl locale handling (internationalization and localization) =over 4 =item DESCRIPTION =item WHAT IS A LOCALE Category C<LC_NUMERIC>: Numeric formatting, Category C<LC_MONETARY>: Formatting of monetary amounts, Category C<LC_TIME>: Date/Time formatting, Category C<LC_MESSAGES>: Error and other messages, Category C<LC_COLLATE>: Collation, Category C<LC_CTYPE>: Character Types, Other categories =item PREPARING TO USE LOCALES =item USING LOCALES =over 4 =item The C<"use locale"> pragma B<Not within the scope of C<"use locale">>, B<Lingering effects of C<S<use locale>>>, B<Under C<"use locale";>> =item The setlocale function =item Multi-threaded operation =item Finding locales =item LOCALE PROBLEMS =item Testing for broken locales =item Temporarily fixing locale problems =item Permanently fixing locale problems =item Permanently fixing your system's locale configuration =item Fixing system locale configuration =item The localeconv function =item I18N::Langinfo =back =item LOCALE CATEGORIES =over 4 =item Category C<LC_COLLATE>: Collation: Text Comparisons and Sorting =item 
Category C<LC_CTYPE>: Character Types =item Category C<LC_NUMERIC>: Numeric Formatting =item Category C<LC_MONETARY>: Formatting of monetary amounts =item Category C<LC_TIME>: Representation of time =item Other categories =back =item SECURITY =item ENVIRONMENT PERL_SKIP_LOCALE_INIT, PERL_BADLANG, C<LC_ALL>, C<LANGUAGE>, C<LC_CTYPE>, C<LC_COLLATE>, C<LC_MONETARY>, C<LC_NUMERIC>, C<LC_TIME>, C<LANG> =over 4 =item Examples =back =item NOTES =over 4 =item String C<eval> and C<LC_NUMERIC> =item Backward compatibility =item I18N::Collate obsolete =item Sort speed and memory use impacts =item Freely available locale definitions =item I18n and l10n =item An imperfect standard =back =item Unicode and UTF-8 =item BUGS =over 4 =item Collation of strings containing embedded C<NUL> characters =item Multi-threaded =item Broken systems =back =item SEE ALSO =item HISTORY =back =head2 perluniintro - Perl Unicode introduction =over 4 =item DESCRIPTION =over 4 =item Unicode =item Perl's Unicode Support =item Perl's Unicode Model =item Unicode and EBCDIC =item Creating Unicode =item Handling Unicode =item Legacy Encodings =item Unicode I/O =item Displaying Unicode As Text =item Special Cases =item Advanced Topics =item Miscellaneous =item Questions With Answers =item Hexadecimal Notation =item Further Resources =back =item UNICODE IN OLDER PERLS =item SEE ALSO =item ACKNOWLEDGMENTS =item AUTHOR, COPYRIGHT, AND LICENSE =back =head2 perlunicode - Unicode support in Perl =over 4 =item DESCRIPTION =over 4 =item Important Caveats Safest if you C<use feature 'unicode_strings'>, Input and Output Layers, You must convert your non-ASCII, non-UTF-8 Perl scripts to be UTF-8, C<use utf8> still needed to enable L<UTF-8|/Unicode Encodings> in scripts, L<UTF-16|/Unicode Encodings> scripts autodetected =item Byte and Character Semantics =item ASCII Rules versus Unicode Rules When the string has been upgraded to UTF-8, There are additional methods for regular expression patterns =item Extended
Grapheme Clusters (Logical characters) =item Unicode Character Properties B<C<\p{All}>>, B<C<\p{Alnum}>>, B<C<\p{Any}>>, B<C<\p{ASCII}>>, B<C<\p{Assigned}>>, B<C<\p{Blank}>>, B<C<\p{Decomposition_Type: Non_Canonical}>> (Short: C<\p{Dt=NonCanon}>), B<C<\p{Graph}>>, B<C<\p{HorizSpace}>>, B<C<\p{In=*}>>, B<C<\p{PerlSpace}>>, B<C<\p{PerlWord}>>, B<C<\p{Posix...}>>, B<C<\p{Present_In: *}>> (Short: C<\p{In=*}>), B<C<\p{Print}>>, B<C<\p{SpacePerl}>>, B<C<\p{Title}>> and B<C<\p{Titlecase}>>, B<C<\p{Unicode}>>, B<C<\p{VertSpace}>>, B<C<\p{Word}>>, B<C<\p{XPosix...}>> =item Comparison of C<\N{...}> and C<\p{name=...}> [1], [2], [3], [4], [5] =item Wildcards in Property Values =item User-Defined Character Properties =item User-Defined Case Mappings (for serious hackers only) =item Character Encodings for Input and Output =item Unicode Regular Expression Support Level [1] C<\N{U+...}> and C<\x{...}>, [2] C<\p{...}> C<\P{...}>. This requirement is for a minimal list of properties. Perl supports these. See R2.7 for other properties, [3] Perl has C<\d> C<\D> C<\s> C<\S> C<\w> C<\W> C<\X> C<[:I<prop>:]> C<[:^I<prop>:]>, plus all the properties specified by L<https://www.unicode.org/reports/tr18/#Compatibility_Properties>. These are described above in L</Other Properties>, [4], Regular expression lookahead, [5] C<\b> C<\B> meet most, but not all, the details of this requirement, but C<\b{wb}> and C<\B{wb}> do, as well as the stricter R2.3, [6], [7], [8] UTF-8/UTF-EBCDIC used in Perl allows not only C<U+10000> to C<U+10FFFF> but also beyond C<U+10FFFF>, [9] Unicode has rewritten this portion of UTS#18 to say that getting canonical equivalence (see UAX#15 L<"Unicode Normalization Forms"|https://www.unicode.org/reports/tr15>) is basically to be done at the programmer level. Use NFD to write both your regular expressions and text to match them against (you can use L<Unicode::Normalize>), [10] Perl has C<\X> and C<\b{gcb}>.
Unicode has retracted their "Grapheme Cluster Mode", and recently added string properties, which Perl does not yet support, [11] see L<UAX#29 "Unicode Text Segmentation"|https://www.unicode.org/reports/tr29>,, [12] see L</Wildcards in Property Values> above, [13] Perl supports all the properties in the Unicode Character Database (UCD). It does not yet support the listed properties that come from other Unicode sources, [14] The only optional property that Perl supports is Named Sequence. None of these properties are in the UCD =item Unicode Encodings =item Noncharacter code points =item Beyond Unicode code points =item Security Implications of Unicode =item Unicode in Perl on EBCDIC =item Locales =item When Unicode Does Not Happen =item The "Unicode Bug" =item Forcing Unicode in Perl (Or Unforcing Unicode in Perl) =item Using Unicode in XS =item Hacking Perl to work on earlier Unicode versions (for very serious hackers only) =item Porting code from perl-5.6.X =back =item BUGS =over 4 =item Interaction with Extensions =item Speed =back =item SEE ALSO =back =head2 perlunicook - cookbookish examples of handling Unicode in Perl =over 4 =item DESCRIPTION =item EXAMPLES =over 4 =item ℞ 0: Standard preamble =item ℞ 1: Generic Unicode-savvy filter =item ℞ 2: Fine-tuning Unicode warnings =item ℞ 3: Declare source in utf8 for identifiers and literals =item ℞ 4: Characters and their numbers =item ℞ 5: Unicode literals by character number =item ℞ 6: Get character name by number =item ℞ 7: Get character number by name =item ℞ 8: Unicode named characters =item ℞ 9: Unicode named sequences =item ℞ 10: Custom named characters =item ℞ 11: Names of CJK codepoints =item ℞ 12: Explicit encode/decode =item ℞ 13: Decode program arguments as utf8 =item ℞ 14: Decode program arguments as locale encoding =item ℞ 15: Declare STD{IN,OUT,ERR} to be utf8 =item ℞ 16: Declare STD{IN,OUT,ERR} to be in locale encoding =item ℞ 17: Make file I/O default to utf8 =item ℞ 18: Make all I/O and args 
default to utf8 =item ℞ 19: Open file with specific encoding =item ℞ 20: Unicode casing =item ℞ 21: Unicode case-insensitive comparisons =item ℞ 22: Match Unicode linebreak sequence in regex =item ℞ 23: Get character category =item ℞ 24: Disabling Unicode-awareness in builtin charclasses =item ℞ 25: Match Unicode properties in regex with \p, \P =item ℞ 26: Custom character properties =item ℞ 27: Unicode normalization =item ℞ 28: Convert non-ASCII Unicode numerics =item ℞ 29: Match Unicode grapheme cluster in regex =item ℞ 30: Extract by grapheme instead of by codepoint (regex) =item ℞ 31: Extract by grapheme instead of by codepoint (substr) =item ℞ 32: Reverse string by grapheme =item ℞ 33: String length in graphemes =item ℞ 34: Unicode column-width for printing =item ℞ 35: Unicode collation =item ℞ 36: Case- I<and> accent-insensitive Unicode sort =item ℞ 37: Unicode locale collation =item ℞ 38: Making C<cmp> work on text instead of codepoints =item ℞ 39: Case- I<and> accent-insensitive comparisons =item ℞ 40: Case- I<and> accent-insensitive locale comparisons =item ℞ 41: Unicode linebreaking =item ℞ 42: Unicode text in DBM hashes, the tedious way =item ℞ 43: Unicode text in DBM hashes, the easy way =item ℞ 44: PROGRAM: Demo of Unicode collation and printing =back =item SEE ALSO §3.13 Default Case Algorithms, page 113; §4.2 Case, pages 120–122; Case Mappings, page 166–172, especially Caseless Matching starting on page 170, UAX #44: Unicode Character Database, UTS #18: Unicode Regular Expressions, UAX #15: Unicode Normalization Forms, UTS #10: Unicode Collation Algorithm, UAX #29: Unicode Text Segmentation, UAX #14: Unicode Line Breaking Algorithm, UAX #11: East Asian Width =item AUTHOR =item COPYRIGHT AND LICENCE =item REVISION HISTORY =back =head2 perlunifaq - Perl Unicode FAQ =over 4 =item Q and A =over 4 =item perlunitut isn't really a Unicode tutorial, is it? =item What character encodings does Perl support? =item Which version of perl should I use? 
=item What about binary data, like images? =item When should I decode or encode? =item What if I don't decode? =item What if I don't encode? =item Is there a way to automatically decode or encode? =item What if I don't know which encoding was used? =item Can I use Unicode in my Perl sources? =item Data::Dumper doesn't restore the UTF8 flag; is it broken? =item Why do regex character classes sometimes match only in the ASCII range? =item Why do some characters not uppercase or lowercase correctly? =item How can I determine if a string is a text string or a binary string? =item How do I convert from encoding FOO to encoding BAR? =item What are C<decode_utf8> and C<encode_utf8>? =item What is a "wide character"? =back =item INTERNALS =over 4 =item What is "the UTF8 flag"? =item What about the C<use bytes> pragma? =item What about the C<use encoding> pragma? =item What is the difference between C<:encoding> and C<:utf8>? =item What's the difference between C<UTF-8> and C<utf8>? =item I lost track; what encoding is the internal format really? 
=back =item AUTHOR =item SEE ALSO =back =head2 perluniprops - Index of Unicode Version 13.0.0 character properties in Perl =over 4 =item DESCRIPTION =item Properties accessible through C<\p{}> and C<\P{}> Single form (C<\p{name}>) tighter rules:, white space adjacent to a non-word character, underscores separating digits in numbers, Compound form (C<\p{name=value}> or C<\p{name:value}>) tighter rules:, Stabilized, Deprecated, Obsolete, Discouraged, Z<>B<*> is a wild-card, B<(\d+)> in the info column gives the number of Unicode code points matched by this property, B<D> means this is deprecated, B<O> means this is obsolete, B<S> means this is stabilized, B<T> means tighter (stricter) name matching applies, B<X> means use of this form is discouraged, and may not be stable =over 4 =item Legal C<\p{}> and C<\P{}> constructs that match no characters \p{Canonical_Combining_Class=Attached_Below_Left}, \p{Canonical_Combining_Class=CCC133}, \p{Grapheme_Cluster_Break=E_Base}, \p{Grapheme_Cluster_Break=E_Base_GAZ}, \p{Grapheme_Cluster_Break=E_Modifier}, \p{Grapheme_Cluster_Break=Glue_After_Zwj}, \p{Word_Break=E_Base}, \p{Word_Break=E_Base_GAZ}, \p{Word_Break=E_Modifier}, \p{Word_Break=Glue_After_Zwj} =back =item Properties accessible through Unicode::UCD =item Properties accessible through other means =item Unicode character properties that are NOT accepted by Perl I<Expands_On_NFC> (XO_NFC), I<Expands_On_NFD> (XO_NFD), I<Expands_On_NFKC> (XO_NFKC), I<Expands_On_NFKD> (XO_NFKD), I<Grapheme_Link> (Gr_Link), I<Jamo_Short_Name> (JSN), I<Other_Alphabetic> (OAlpha), I<Other_Default_Ignorable_Code_Point> (ODI), I<Other_Grapheme_Extend> (OGr_Ext), I<Other_ID_Continue> (OIDC), I<Other_ID_Start> (OIDS), I<Other_Lowercase> (OLower), I<Other_Math> (OMath), I<Other_Uppercase> (OUpper), I<Script=Katakana_Or_Hiragana> (sc=Hrkt), I<Script_Extensions=Katakana_Or_Hiragana> (scx=Hrkt) =item Other information in the Unicode data base F<auxiliary/GraphemeBreakTest.html>, 
F<auxiliary/LineBreakTest.html>, F<auxiliary/SentenceBreakTest.html>, F<auxiliary/WordBreakTest.html>, F<BidiCharacterTest.txt>, F<BidiTest.txt>, F<NormTest.txt>, F<CJKRadicals.txt>, F<emoji/ReadMe.txt>, F<ReadMe.txt>, F<EmojiSources.txt>, F<extracted/DName.txt>, F<Index.txt>, F<NamedSqProv.txt>, F<NamesList.html>, F<NamesList.txt>, F<NormalizationCorrections.txt>, F<NushuSources.txt>, F<StandardizedVariants.html>, F<StandardizedVariants.txt>, F<TangutSources.txt>, F<USourceData.txt>, F<USourceGlyphs.pdf> =item SEE ALSO =back =head2 perlunitut - Perl Unicode Tutorial =over 4 =item DESCRIPTION =over 4 =item Definitions =item Your new toolkit =item I/O flow (the actual 5 minute tutorial) =back =item SUMMARY =item Q and A (or FAQ) =item ACKNOWLEDGEMENTS =item AUTHOR =item SEE ALSO =back =head2 perlebcdic - Considerations for running Perl on EBCDIC platforms =over 4 =item DESCRIPTION =item COMMON CHARACTER CODE SETS =over 4 =item ASCII =item ISO 8859 =item Latin 1 (ISO 8859-1) =item EBCDIC B<0037>, B<1047>, B<POSIX-BC> =item Unicode code points versus EBCDIC code points =item Unicode and UTF =item Using Encode =back =item SINGLE OCTET TABLES recipe 0, recipe 1, recipe 2, recipe 3, recipe 4, recipe 5, recipe 6 =over 4 =item Table in hex, sorted in 1047 order =back =item IDENTIFYING CHARACTER CODE SETS =item CONVERSIONS =over 4 =item C<utf8::unicode_to_native()> and C<utf8::native_to_unicode()> =item tr/// =item iconv =item C RTL =back =item OPERATOR DIFFERENCES =item FUNCTION DIFFERENCES C<chr()>, C<ord()>, C<pack()>, C<print()>, C<printf()>, C<sort()>, C<sprintf()>, C<unpack()> =item REGULAR EXPRESSION DIFFERENCES =item SOCKETS =item SORTING =over 4 =item Ignore ASCII vs. EBCDIC sort differences. =item Use a sort helper function =item MONO CASE then sort data (for non-digits, non-underscore) =item Perform sorting on one type of platform only. 
=back =item TRANSFORMATION FORMATS =over 4 =item URL decoding and encoding =item uu encoding and decoding =item Quoted-Printable encoding and decoding =item Caesarean ciphers =back =item Hashing order and checksums =item I18N AND L10N =item MULTI-OCTET CHARACTER SETS =item OS ISSUES =over 4 =item OS/400 PASE, IFS access =item OS/390, z/OS C<sigaction>, C<chcp>, dataset access, C<iconv>, locales =item POSIX-BC? =back =item BUGS =item SEE ALSO =item REFERENCES =item HISTORY =item AUTHOR =back =head2 perlsec - Perl security =over 4 =item DESCRIPTION =item SECURITY VULNERABILITY CONTACT INFORMATION =item SECURITY MECHANISMS AND CONCERNS =over 4 =item Taint mode =item Laundering and Detecting Tainted Data =item Switches On the "#!" Line =item Taint mode and @INC =item Cleaning Up Your Path =item Shebang Race Condition =item Protecting Your Programs =item Unicode =item Algorithmic Complexity Attacks Hash Seed Randomization, Hash Traversal Randomization, Bucket Order Perturbance, New Default Hash Function, Alternative Hash Functions =item Using Sudo =back =item SEE ALSO =back =head2 perlsecpolicy - Perl security report handling policy =over 4 =item DESCRIPTION =item REPORTING SECURITY ISSUES IN PERL =item WHAT ARE SECURITY ISSUES =over 4 =item Software covered by the Perl security team =item Bugs that may qualify as security issues in Perl =item Bugs that do not qualify as security issues in Perl =item Bugs that require special categorization =back =item HOW WE DEAL WITH SECURITY ISSUES =over 4 =item Perl's vulnerability remediation workflow =item Publicly known and zero-day security issues =item Vulnerability credit and bounties =back =back =head2 perlmod - Perl modules (packages and symbol tables) =over 4 =item DESCRIPTION =over 4 =item Is this the document you were after? 
This doc, L<perlnewmod>, L<perlmodstyle> =item Packages X<package> X<namespace> X<variable, global> X<global variable> X<global> =item Symbol Tables X<symbol table> X<stash> X<%::> X<%main::> X<typeglob> X<glob> X<alias> =item BEGIN, UNITCHECK, CHECK, INIT and END X<BEGIN> X<UNITCHECK> X<CHECK> X<INIT> X<END> =item Perl Classes X<class> X<@ISA> =item Perl Modules X<module> =item Making your module threadsafe X<threadsafe> X<thread safe> X<module, threadsafe> X<module, thread safe> X<CLONE> X<CLONE_SKIP> X<thread> X<threads> X<ithread> =back =item SEE ALSO =back =head2 perlmodlib - constructing new Perl modules and finding existing ones =over 4 =item THE PERL MODULE LIBRARY =over 4 =item Pragmatic Modules attributes, autodie, autodie::exception, autodie::exception::system, autodie::hints, autodie::skip, autouse, base, bigint, bignum, bigrat, blib, bytes, charnames, constant, deprecate, diagnostics, encoding, encoding::warnings, experimental, feature, fields, filetest, if, integer, less, lib, locale, mro, ok, open, ops, overload, overloading, parent, re, sigtrap, sort, strict, subs, threads, threads::shared, utf8, vars, version, vmsish, warnings, warnings::register =item Standard Modules Amiga::ARexx, Amiga::Exec, AnyDBM_File, App::Cpan, App::Prove, App::Prove::State, App::Prove::State::Result, App::Prove::State::Result::Test, Archive::Tar, Archive::Tar::File, Attribute::Handlers, AutoLoader, AutoSplit, B, B::Concise, B::Deparse, B::Op_private, B::Showlex, B::Terse, B::Xref, Benchmark, C<IO::Socket::IP>, C<Socket>, CORE, CPAN, CPAN::API::HOWTO, CPAN::Debug, CPAN::Distroprefs, CPAN::FirstTime, CPAN::HandleConfig, CPAN::Kwalify, CPAN::Meta, CPAN::Meta::Converter, CPAN::Meta::Feature, CPAN::Meta::History, CPAN::Meta::History::Meta_1_0, CPAN::Meta::History::Meta_1_1, CPAN::Meta::History::Meta_1_2, CPAN::Meta::History::Meta_1_3, CPAN::Meta::History::Meta_1_4, CPAN::Meta::Merge, CPAN::Meta::Prereqs, CPAN::Meta::Requirements, CPAN::Meta::Spec, CPAN::Meta::Validator, 
CPAN::Meta::YAML, CPAN::Nox, CPAN::Plugin, CPAN::Plugin::Specfile, CPAN::Queue, CPAN::Tarzip, CPAN::Version, Carp, Class::Struct, Compress::Raw::Bzip2, Compress::Raw::Zlib, Compress::Zlib, Config, Config::Extensions, Config::Perl::V, Cwd, DB, DBM_Filter, DBM_Filter::compress, DBM_Filter::encode, DBM_Filter::int32, DBM_Filter::null, DBM_Filter::utf8, DB_File, Data::Dumper, Devel::PPPort, Devel::Peek, Devel::SelfStubber, Digest, Digest::MD5, Digest::SHA, Digest::base, Digest::file, DirHandle, Dumpvalue, DynaLoader, Encode, Encode::Alias, Encode::Byte, Encode::CJKConstants, Encode::CN, Encode::CN::HZ, Encode::Config, Encode::EBCDIC, Encode::Encoder, Encode::Encoding, Encode::GSM0338, Encode::Guess, Encode::JP, Encode::JP::H2Z, Encode::JP::JIS7, Encode::KR, Encode::KR::2022_KR, Encode::MIME::Header, Encode::MIME::Name, Encode::PerlIO, Encode::Supported, Encode::Symbol, Encode::TW, Encode::Unicode, Encode::Unicode::UTF7, English, Env, Errno, Exporter, Exporter::Heavy, ExtUtils::CBuilder, ExtUtils::CBuilder::Platform::Windows, ExtUtils::Command, ExtUtils::Command::MM, ExtUtils::Constant, ExtUtils::Constant::Base, ExtUtils::Constant::Utils, ExtUtils::Constant::XS, ExtUtils::Embed, ExtUtils::Install, ExtUtils::Installed, ExtUtils::Liblist, ExtUtils::MM, ExtUtils::MM::Utils, ExtUtils::MM_AIX, ExtUtils::MM_Any, ExtUtils::MM_BeOS, ExtUtils::MM_Cygwin, ExtUtils::MM_DOS, ExtUtils::MM_Darwin, ExtUtils::MM_MacOS, ExtUtils::MM_NW5, ExtUtils::MM_OS2, ExtUtils::MM_QNX, ExtUtils::MM_UWIN, ExtUtils::MM_Unix, ExtUtils::MM_VMS, ExtUtils::MM_VOS, ExtUtils::MM_Win32, ExtUtils::MM_Win95, ExtUtils::MY, ExtUtils::MakeMaker, ExtUtils::MakeMaker::Config, ExtUtils::MakeMaker::FAQ, ExtUtils::MakeMaker::Locale, ExtUtils::MakeMaker::Tutorial, ExtUtils::Manifest, ExtUtils::Miniperl, ExtUtils::Mkbootstrap, ExtUtils::Mksymlists, ExtUtils::Packlist, ExtUtils::ParseXS, ExtUtils::ParseXS::Constants, ExtUtils::ParseXS::Eval, ExtUtils::ParseXS::Utilities, ExtUtils::Typemaps, ExtUtils::Typemaps::Cmd, 
ExtUtils::Typemaps::InputMap, ExtUtils::Typemaps::OutputMap, ExtUtils::Typemaps::Type, ExtUtils::XSSymSet, ExtUtils::testlib, Fatal, Fcntl, File::Basename, File::Compare, File::Copy, File::DosGlob, File::Fetch, File::Find, File::Glob, File::GlobMapper, File::Path, File::Spec, File::Spec::AmigaOS, File::Spec::Cygwin, File::Spec::Epoc, File::Spec::Functions, File::Spec::Mac, File::Spec::OS2, File::Spec::Unix, File::Spec::VMS, File::Spec::Win32, File::Temp, File::stat, FileCache, FileHandle, Filter::Simple, Filter::Util::Call, FindBin, GDBM_File, Getopt::Long, Getopt::Std, HTTP::Tiny, Hash::Util, Hash::Util::FieldHash, I18N::Collate, I18N::LangTags, I18N::LangTags::Detect, I18N::LangTags::List, I18N::Langinfo, IO, IO::Compress::Base, IO::Compress::Bzip2, IO::Compress::Deflate, IO::Compress::FAQ, IO::Compress::Gzip, IO::Compress::RawDeflate, IO::Compress::Zip, IO::Dir, IO::File, IO::Handle, IO::Pipe, IO::Poll, IO::Seekable, IO::Select, IO::Socket, IO::Socket::INET, IO::Socket::UNIX, IO::Uncompress::AnyInflate, IO::Uncompress::AnyUncompress, IO::Uncompress::Base, IO::Uncompress::Bunzip2, IO::Uncompress::Gunzip, IO::Uncompress::Inflate, IO::Uncompress::RawInflate, IO::Uncompress::Unzip, IO::Zlib, IPC::Cmd, IPC::Msg, IPC::Open2, IPC::Open3, IPC::Semaphore, IPC::SharedMem, IPC::SysV, Internals, JSON::PP, JSON::PP::Boolean, List::Util, List::Util::XS, Locale::Maketext, Locale::Maketext::Cookbook, Locale::Maketext::Guts, Locale::Maketext::GutsLoader, Locale::Maketext::Simple, Locale::Maketext::TPJ13, MIME::Base64, MIME::QuotedPrint, Math::BigFloat, Math::BigInt, Math::BigInt::Calc, Math::BigInt::FastCalc, Math::BigInt::Lib, Math::BigRat, Math::Complex, Math::Trig, Memoize, Memoize::AnyDBM_File, Memoize::Expire, Memoize::ExpireFile, Memoize::ExpireTest, Memoize::NDBM_File, Memoize::SDBM_File, Memoize::Storable, Module::CoreList, Module::CoreList::Utils, Module::Load, Module::Load::Conditional, Module::Loaded, Module::Metadata, NDBM_File, NEXT, Net::Cmd, Net::Config, 
Net::Domain, Net::FTP, Net::FTP::dataconn, Net::NNTP, Net::Netrc, Net::POP3, Net::Ping, Net::SMTP, Net::Time, Net::hostent, Net::libnetFAQ, Net::netent, Net::protoent, Net::servent, O, ODBM_File, Opcode, POSIX, Params::Check, Parse::CPAN::Meta, Perl::OSType, PerlIO, PerlIO::encoding, PerlIO::mmap, PerlIO::scalar, PerlIO::via, PerlIO::via::QuotedPrint, Pod::Checker, Pod::Escapes, Pod::Functions, Pod::Html, Pod::Man, Pod::ParseLink, Pod::Perldoc, Pod::Perldoc::BaseTo, Pod::Perldoc::GetOptsOO, Pod::Perldoc::ToANSI, Pod::Perldoc::ToChecker, Pod::Perldoc::ToMan, Pod::Perldoc::ToNroff, Pod::Perldoc::ToPod, Pod::Perldoc::ToRtf, Pod::Perldoc::ToTerm, Pod::Perldoc::ToText, Pod::Perldoc::ToTk, Pod::Perldoc::ToXml, Pod::Simple, Pod::Simple::Checker, Pod::Simple::Debug, Pod::Simple::DumpAsText, Pod::Simple::DumpAsXML, Pod::Simple::HTML, Pod::Simple::HTMLBatch, Pod::Simple::JustPod, Pod::Simple::LinkSection, Pod::Simple::Methody, Pod::Simple::PullParser, Pod::Simple::PullParserEndToken, Pod::Simple::PullParserStartToken, Pod::Simple::PullParserTextToken, Pod::Simple::PullParserToken, Pod::Simple::RTF, Pod::Simple::Search, Pod::Simple::SimpleTree, Pod::Simple::Subclassing, Pod::Simple::Text, Pod::Simple::TextContent, Pod::Simple::XHTML, Pod::Simple::XMLOutStream, Pod::Text, Pod::Text::Color, Pod::Text::Overstrike, Pod::Text::Termcap, Pod::Usage, SDBM_File, Safe, Scalar::Util, Search::Dict, SelectSaver, SelfLoader, Storable, Sub::Util, Symbol, Sys::Hostname, Sys::Syslog, Sys::Syslog::Win32, TAP::Base, TAP::Formatter::Base, TAP::Formatter::Color, TAP::Formatter::Console, TAP::Formatter::Console::ParallelSession, TAP::Formatter::Console::Session, TAP::Formatter::File, TAP::Formatter::File::Session, TAP::Formatter::Session, TAP::Harness, TAP::Harness::Env, TAP::Object, TAP::Parser, TAP::Parser::Aggregator, TAP::Parser::Grammar, TAP::Parser::Iterator, TAP::Parser::Iterator::Array, TAP::Parser::Iterator::Process, TAP::Parser::Iterator::Stream, TAP::Parser::IteratorFactory, 
TAP::Parser::Multiplexer, TAP::Parser::Result, TAP::Parser::Result::Bailout, TAP::Parser::Result::Comment, TAP::Parser::Result::Plan, TAP::Parser::Result::Pragma, TAP::Parser::Result::Test, TAP::Parser::Result::Unknown, TAP::Parser::Result::Version, TAP::Parser::Result::YAML, TAP::Parser::ResultFactory, TAP::Parser::Scheduler, TAP::Parser::Scheduler::Job, TAP::Parser::Scheduler::Spinner, TAP::Parser::Source, TAP::Parser::SourceHandler, TAP::Parser::SourceHandler::Executable, TAP::Parser::SourceHandler::File, TAP::Parser::SourceHandler::Handle, TAP::Parser::SourceHandler::Perl, TAP::Parser::SourceHandler::RawTAP, TAP::Parser::YAMLish::Reader, TAP::Parser::YAMLish::Writer, Term::ANSIColor, Term::Cap, Term::Complete, Term::ReadLine, Test, Test2, Test2::API, Test2::API::Breakage, Test2::API::Context, Test2::API::Instance, Test2::API::Stack, Test2::Event, Test2::Event::Bail, Test2::Event::Diag, Test2::Event::Encoding, Test2::Event::Exception, Test2::Event::Fail, Test2::Event::Generic, Test2::Event::Note, Test2::Event::Ok, Test2::Event::Pass, Test2::Event::Plan, Test2::Event::Skip, Test2::Event::Subtest, Test2::Event::TAP::Version, Test2::Event::V2, Test2::Event::Waiting, Test2::EventFacet, Test2::EventFacet::About, Test2::EventFacet::Amnesty, Test2::EventFacet::Assert, Test2::EventFacet::Control, Test2::EventFacet::Error, Test2::EventFacet::Hub, Test2::EventFacet::Info, Test2::EventFacet::Info::Table, Test2::EventFacet::Meta, Test2::EventFacet::Parent, Test2::EventFacet::Plan, Test2::EventFacet::Render, Test2::EventFacet::Trace, Test2::Formatter, Test2::Formatter::TAP, Test2::Hub, Test2::Hub::Interceptor, Test2::Hub::Interceptor::Terminator, Test2::Hub::Subtest, Test2::IPC, Test2::IPC::Driver, Test2::IPC::Driver::Files, Test2::Tools::Tiny, Test2::Transition, Test2::Util, Test2::Util::ExternalMeta, Test2::Util::Facets2Legacy, Test2::Util::HashBase, Test2::Util::Trace, Test::Builder, Test::Builder::Formatter, Test::Builder::IO::Scalar, Test::Builder::Module, 
Test::Builder::Tester, Test::Builder::Tester::Color, Test::Builder::TodoDiag, Test::Harness, Test::Harness::Beyond, Test::More, Test::Simple, Test::Tester, Test::Tester::Capture, Test::Tester::CaptureRunner, Test::Tutorial, Test::use::ok, Text::Abbrev, Text::Balanced, Text::ParseWords, Text::Tabs, Text::Wrap, Thread, Thread::Queue, Thread::Semaphore, Tie::Array, Tie::File, Tie::Handle, Tie::Hash, Tie::Hash::NamedCapture, Tie::Memoize, Tie::RefHash, Tie::Scalar, Tie::StdHandle, Tie::SubstrHash, Time::HiRes, Time::Local, Time::Piece, Time::Seconds, Time::gmtime, Time::localtime, Time::tm, UNIVERSAL, Unicode::Collate, Unicode::Collate::CJK::Big5, Unicode::Collate::CJK::GB2312, Unicode::Collate::CJK::JISX0208, Unicode::Collate::CJK::Korean, Unicode::Collate::CJK::Pinyin, Unicode::Collate::CJK::Stroke, Unicode::Collate::CJK::Zhuyin, Unicode::Collate::Locale, Unicode::Normalize, Unicode::UCD, User::grent, User::pwent, VMS::DCLsym, VMS::Filespec, VMS::Stdio, Win32, Win32API::File, Win32CORE, XS::APItest, XS::Typemap, XSLoader, autodie::Scope::Guard, autodie::Scope::GuardStack, autodie::Util, version::Internals =item Extension Modules =back =item CPAN =over 4 =item Africa South Africa, Uganda, Zimbabwe =item Asia Bangladesh, China, India, Indonesia, Iran, Israel, Japan, Kazakhstan, Philippines, Qatar, Republic of Korea, Singapore, Taiwan, Turkey, Viet Nam =item Europe Austria, Belarus, Belgium, Bosnia and Herzegovina, Bulgaria, Croatia, Czech Republic, Denmark, Finland, France, Germany, Greece, Hungary, Ireland, Italy, Latvia, Lithuania, Moldova, Netherlands, Norway, Poland, Portugal, Romania, Russian Federation, Serbia, Slovakia, Slovenia, Spain, Sweden, Switzerland, Ukraine, United Kingdom =item North America Canada, Costa Rica, Mexico, United States, Alabama, Arizona, California, Idaho, Illinois, Indiana, Kansas, Massachusetts, Michigan, New Hampshire, New Jersey, New York, North Carolina, Oregon, Pennsylvania, South Carolina, Texas, Utah, Virginia, Washington, 
Wisconsin =item Oceania Australia, New Caledonia, New Zealand =item South America Argentina, Brazil, Chile =item RSYNC Mirrors =back =item Modules: Creation, Use, and Abuse =over 4 =item Guidelines for Module Creation =item Guidelines for Converting Perl 4 Library Scripts into Modules =item Guidelines for Reusing Application Code =back =item NOTE =back =head2 perlmodstyle - Perl module style guide =over 4 =item INTRODUCTION =item QUICK CHECKLIST =over 4 =item Before you start =item The API =item Stability =item Documentation =item Release considerations =back =item BEFORE YOU START WRITING A MODULE =over 4 =item Has it been done before? =item Do one thing and do it well =item What's in a name? =item Get feedback before publishing =back =item DESIGNING AND WRITING YOUR MODULE =over 4 =item To OO or not to OO? =item Designing your API Write simple routines to do simple things, Separate functionality from output, Provide sensible shortcuts and defaults, Naming conventions, Parameter passing =item Strictness and warnings =item Backwards compatibility =item Error handling and messages =back =item DOCUMENTING YOUR MODULE =over 4 =item POD =item README, INSTALL, release notes, changelogs perl Makefile.PL, make, make test, make install, perl Build.PL, perl Build, perl Build test, perl Build install =back =item RELEASE CONSIDERATIONS =over 4 =item Version numbering =item Pre-requisites =item Testing =item Packaging =item Licensing =back =item COMMON PITFALLS =over 4 =item Reinventing the wheel =item Trying to do too much =item Inappropriate documentation =back =item SEE ALSO L<perlstyle>, L<perlnewmod>, L<perlpod>, L<podchecker>, Packaging Tools, Testing tools, L<https://pause.perl.org/>, Any good book on software engineering =item AUTHOR =back =head2 perlmodinstall - Installing CPAN Modules =over 4 =item DESCRIPTION =over 4 =item PREAMBLE B<DECOMPRESS> the file, B<UNPACK> the file into a directory, B<BUILD> the module (sometimes unnecessary), B<INSTALL> the module =back 
=item PORTABILITY =item HEY =item AUTHOR =item COPYRIGHT =back =head2 perlnewmod - preparing a new module for distribution =over 4 =item DESCRIPTION =over 4 =item Warning =item What should I make into a module? =item Step-by-step: Preparing the ground Look around, Check it's new, Discuss the need, Choose a name, Check again =item Step-by-step: Making the module Start with F<module-starter> or F<h2xs>, Use L<strict|strict> and L<warnings|warnings>, Use L<Carp|Carp>, Use L<Exporter|Exporter> - wisely!, Use L<plain old documentation|perlpod>, Write tests, Write the F<README>, Write F<Changes> =item Step-by-step: Distributing your module Get a CPAN user ID, C<perl Makefile.PL; make test; make distcheck; make dist>, Upload the tarball, Fix bugs! =back =item AUTHOR =item SEE ALSO =back =head2 perlpragma - how to write a user pragma =over 4 =item DESCRIPTION =item A basic example =item Key naming =item Implementation details =back =head2 perlutil - utilities packaged with the Perl distribution =over 4 =item DESCRIPTION =item LIST OF UTILITIES =over 4 =item Documentation L<perldoc|perldoc>, L<pod2man|pod2man> and L<pod2text|pod2text>, L<pod2html|pod2html>, L<pod2usage|pod2usage>, L<podchecker|podchecker>, L<splain|splain>, C<roffitall> =item Converters =item Administration L<libnetcfg|libnetcfg>, L<perlivp> =item Development L<perlbug|perlbug>, L<perlthanks|perlbug>, L<h2ph|h2ph>, L<h2xs|h2xs>, L<enc2xs>, L<xsubpp>, L<prove>, L<corelist> =item General tools L<piconv>, L<ptar>, L<ptardiff>, L<ptargrep>, L<shasum>, L<zipdetails> =item Installation L<cpan>, L<instmodsh> =back =item SEE ALSO =back =head2 perlfilter - Source Filters =over 4 =item DESCRIPTION =item CONCEPTS =item USING FILTERS =item WRITING A SOURCE FILTER =item WRITING A SOURCE FILTER IN C B<Decryption Filters> =item CREATING A SOURCE FILTER AS A SEPARATE EXECUTABLE =item WRITING A SOURCE FILTER IN PERL =item USING CONTEXT: THE DEBUG FILTER =item CONCLUSION =item LIMITATIONS =item THINGS TO LOOK OUT FOR Some 
Filters Clobber the C<DATA> Handle =item REQUIREMENTS =item AUTHOR =item Copyrights =back =head2 perldtrace - Perl's support for DTrace =over 4 =item SYNOPSIS =item DESCRIPTION =item HISTORY =item PROBES sub-entry(SUBNAME, FILE, LINE, PACKAGE), sub-return(SUBNAME, FILE, LINE, PACKAGE), phase-change(NEWPHASE, OLDPHASE), op-entry(OPNAME), loading-file(FILENAME), loaded-file(FILENAME) =item EXAMPLES Most frequently called functions, Trace function calls, Function calls during interpreter cleanup, System calls at compile time, Perl functions that execute the most opcodes =item REFERENCES DTrace Dynamic Tracing Guide, DTrace: Dynamic Tracing in Oracle Solaris, Mac OS X and FreeBSD =item SEE ALSO L<Devel::DTrace::Provider> =item AUTHORS =back =head2 perlglossary - Perl Glossary =over 4 =item VERSION =item DESCRIPTION =over 4 =item A accessor methods, actual arguments, address operator, algorithm, alias, alphabetic, alternatives, anonymous, application, architecture, argument, ARGV, arithmetical operator, array, array context, Artistic License, ASCII, assertion, assignment, assignment operator, associative array, associativity, asynchronous, atom, atomic operation, attribute, autogeneration, autoincrement, autoload, autosplit, autovivification, AV, awk =item B backreference, backtracking, backward compatibility, bareword, base class, big-endian, binary, binary operator, bind, bit, bit shift, bit string, bless, block, BLOCK, block buffering, Boolean, Boolean context, breakpoint, broadcast, BSD, bucket, buffer, built-in, bundle, byte, bytecode =item C C, cache, callback, call by reference, call by value, canonical, capture variables, capturing, cargo cult, case, casefolding, casemapping, character, character class, character property, circumfix operator, class, class method, client, closure, cluster, CODE, code generator, codepoint, code subpattern, collating sequence, co-maintainer, combining character, command, command buffering, command-line arguments, command name, 
comment, compilation unit, compile, compile phase, compiler, compile time, composer, concatenation, conditional, connection, construct, constructor, context, continuation, core dump, CPAN, C preprocessor, cracker, currently selected output channel, current package, current working directory, CV =item D dangling statement, datagram, data structure, data type, DBM, declaration, declarator, decrement, default, defined, delimiter, dereference, derived class, descriptor, destroy, destructor, device, directive, directory, directory handle, discipline, dispatch, distribution, dual-lived, dweomer, dwimmer, dynamic scoping =item E eclectic, element, embedding, empty subclass test, encapsulation, endian, en passant, environment, environment variable, EOF, errno, error, escape sequence, exception, exception handling, exec, executable file, execute, execute bit, exit status, exploit, export, expression, extension =item F false, FAQ, fatal error, feeping creaturism, field, FIFO, file, file descriptor, fileglob, filehandle, filename, filesystem, file test operator, filter, first-come, flag, floating point, flush, FMTEYEWTK, foldcase, fork, formal arguments, format, freely available, freely redistributable, freeware, function, funny character =item G garbage collection, GID, glob, global, global destruction, glue language, granularity, grapheme, greedy, grep, group, GV =item H hacker, handler, hard reference, hash, hash table, header file, here document, hexadecimal, home directory, host, hubris, HV =item I identifier, impatience, implementation, import, increment, indexing, indirect filehandle, indirection, indirect object, indirect object slot, infix, inheritance, instance, instance data, instance method, instance variable, integer, interface, interpolation, interpreter, invocant, invocation, I/O, IO, I/O layer, IPA, IP, IPC, is-a, iteration, iterator, IV =item J JAPH =item K key, keyword =item L label, laziness, leftmost longest, left shift, lexeme, lexer, lexical analysis, 
lexical scoping, lexical variable, library, LIFO, line, linebreak, line buffering, line number, link, LIST, list, list context, list operator, list value, literal, little-endian, local, logical operator, lookahead, lookbehind, loop, loop control statement, loop label, lowercase, lvaluable, lvalue, lvalue modifier =item M magic, magical increment, magical variables, Makefile, man, manpage, matching, member data, memory, metacharacter, metasymbol, method, method resolution order, minicpan, minimalism, mode, modifier, module, modulus, mojibake, monger, mortal, mro, multidimensional array, multiple inheritance =item N named pipe, namespace, NaN, network address, newline, NFS, normalization, null character, null list, null string, numeric context, numification, NV, nybble =item O object, octal, offset, one-liner, open source software, operand, operating system, operator, operator overloading, options, ordinal, overloading, overriding, owner =item P package, pad, parameter, parent class, parse tree, parsing, patch, PATH, pathname, pattern, pattern matching, PAUSE, Perl mongers, permission bits, Pern, pipe, pipeline, platform, pod, pod command, pointer, polymorphism, port, portable, porter, possessive, POSIX, postfix, pp, pragma, precedence, prefix, preprocessing, primary maintainer, procedure, process, program, program generator, progressive matching, property, protocol, prototype, pseudofunction, pseudohash, pseudoliteral, public domain, pumpkin, pumpking, PV =item Q qualified, quantifier =item R race condition, readable, reaping, record, recursion, reference, referent, regex, regular expression, regular expression modifier, regular file, relational operator, reserved words, return value, RFC, right shift, role, root, RTFM, run phase, runtime, runtime pattern, RV, rvalue =item S sandbox, scalar, scalar context, scalar literal, scalar value, scalar variable, scope, scratchpad, script, script kiddie, sed, semaphore, separator, serialization, server, service, setgid, 
setuid, shared memory, shebang, shell, side effects, sigil, signal, signal handler, single inheritance, slice, slurp, socket, soft reference, source filter, stack, standard, standard error, standard input, standard I/O, Standard Library, standard output, statement, statement modifier, static, static method, static scoping, static variable, stat structure, status, STDERR, STDIN, STDIO, STDOUT, stream, string, string context, stringification, struct, structure, subclass, subpattern, subroutine, subscript, substitution, substring, superclass, superuser, SV, switch, switch cluster, switch statement, symbol, symbolic debugger, symbolic link, symbolic reference, symbol table, synchronous, syntactic sugar, syntax, syntax tree, syscall =item T taint checks, tainted, taint mode, TCP, term, terminator, ternary, text, thread, tie, titlecase, TMTOWTDI, token, tokener, tokenizing, toolbox approach, topic, transliterate, trigger, trinary, troff, true, truncating, type, type casting, typedef, typed lexical, typeglob, typemap =item U UDP, UID, umask, unary operator, Unicode, Unix, uppercase =item V value, variable, variable interpolation, variadic, vector, virtual, void context, v-string =item W warning, watch expression, weak reference, whitespace, word, working directory, wrapper, WYSIWYG =item X XS, XSUB =item Y yacc =item Z zero width, zombie =back =item AUTHOR AND COPYRIGHT =back =head2 perlembed - how to embed perl in your C program =over 4 =item DESCRIPTION =over 4 =item PREAMBLE B<Use C from Perl?>, B<Use a Unix program from Perl?>, B<Use Perl from Perl?>, B<Use C from C?>, B<Use Perl from C?> =item ROADMAP =item Compiling your C program =item Adding a Perl interpreter to your C program =item Calling a Perl subroutine from your C program =item Evaluating a Perl statement from your C program =item Performing Perl pattern matches and substitutions from your C program =item Fiddling with the Perl stack from your C program =item Maintaining a persistent interpreter =item 
Execution of END blocks =item $0 assignments =item Maintaining multiple interpreter instances =item Using Perl modules, which themselves use C libraries, from your C program =item Using embedded Perl with POSIX locales =back =item Hiding Perl_ =item MORAL =item AUTHOR =item COPYRIGHT =back =head2 perldebguts - Guts of Perl debugging =over 4 =item DESCRIPTION =item Debugger Internals =over 4 =item Writing Your Own Debugger =back =item Frame Listing Output Examples =item Debugging Regular Expressions =over 4 =item Compile-time Output C<anchored> I<STRING> C<at> I<POS>, C<floating> I<STRING> C<at> I<POS1..POS2>, C<matching floating/anchored>, C<minlen>, C<stclass> I<TYPE>, C<noscan>, C<isall>, C<GPOS>, C<plus>, C<implicit>, C<with eval>, C<anchored(TYPE)> =item Types of Nodes =item Run-time Output =back =item Debugging Perl Memory Usage =over 4 =item Using C<$ENV{PERL_DEBUG_MSTATS}> C<buckets SMALLEST(APPROX)..GREATEST(APPROX)>, Free/Used, C<Total sbrk(): SBRKed/SBRKs:CONTINUOUS>, C<pad: 0>, C<heads: 2192>, C<chain: 0>, C<tail: 6144> =back =item SEE ALSO =back =head2 perlxstut - Tutorial for writing XSUBs =over 4 =item DESCRIPTION =item SPECIAL NOTES =over 4 =item make =item Version caveat =item Dynamic Loading versus Static Loading =item Threads and PERL_NO_GET_CONTEXT =back =item TUTORIAL =over 4 =item EXAMPLE 1 =item EXAMPLE 2 =item What has gone on? =item Writing good test scripts =item EXAMPLE 3 =item What's new here? =item Input and Output Parameters =item The XSUBPP Program =item The TYPEMAP file =item Warning about Output Arguments =item EXAMPLE 4 =item What has happened here? 
=item Anatomy of .xs file =item Getting the fat out of XSUBs =item More about XSUB arguments =item The Argument Stack =item Extending your Extension =item Documenting your Extension =item Installing your Extension =item EXAMPLE 5 =item New Things in this Example =item EXAMPLE 6 =item New Things in this Example =item EXAMPLE 7 (Coming Soon) =item EXAMPLE 8 (Coming Soon) =item EXAMPLE 9 Passing open files to XSes =item Troubleshooting these Examples =back =item See also =item Author =over 4 =item Last Changed =back =back =head2 perlxs - XS language reference manual =over 4 =item DESCRIPTION =over 4 =item Introduction =item On The Road =item The Anatomy of an XSUB =item The Argument Stack =item The RETVAL Variable =item Returning SVs, AVs and HVs through RETVAL =item The MODULE Keyword =item The PACKAGE Keyword =item The PREFIX Keyword =item The OUTPUT: Keyword =item The NO_OUTPUT Keyword =item The CODE: Keyword =item The INIT: Keyword =item The NO_INIT Keyword =item The TYPEMAP: Keyword =item Initializing Function Parameters =item Default Parameter Values =item The PREINIT: Keyword =item The SCOPE: Keyword =item The INPUT: Keyword =item The IN/OUTLIST/IN_OUTLIST/OUT/IN_OUT Keywords =item The C<length(NAME)> Keyword =item Variable-length Parameter Lists =item The C_ARGS: Keyword =item The PPCODE: Keyword =item Returning Undef And Empty Lists =item The REQUIRE: Keyword =item The CLEANUP: Keyword =item The POSTCALL: Keyword =item The BOOT: Keyword =item The VERSIONCHECK: Keyword =item The PROTOTYPES: Keyword =item The PROTOTYPE: Keyword =item The ALIAS: Keyword =item The OVERLOAD: Keyword =item The FALLBACK: Keyword =item The INTERFACE: Keyword =item The INTERFACE_MACRO: Keyword =item The INCLUDE: Keyword =item The INCLUDE_COMMAND: Keyword =item The CASE: Keyword =item The EXPORT_XSUB_SYMBOLS: Keyword =item The & Unary Operator =item Inserting POD, Comments and C Preprocessor Directives =item Using XS With C++ =item Interface Strategy =item Perl Objects And C Structures 
=item Safely Storing Static Data in XS MY_CXT_KEY, typedef my_cxt_t, START_MY_CXT, MY_CXT_INIT, dMY_CXT, MY_CXT, aMY_CXT/pMY_CXT, MY_CXT_CLONE, MY_CXT_INIT_INTERP(my_perl), dMY_CXT_INTERP(my_perl) =item Thread-aware system interfaces =back =item EXAMPLES =item CAVEATS Non-locale-aware XS code, Locale-aware XS code =item XS VERSION =item AUTHOR =back =head2 perlxstypemap - Perl XS C/Perl type mapping =over 4 =item DESCRIPTION =over 4 =item Anatomy of a typemap =item The Role of the typemap File in Your Distribution =item Sharing typemaps Between CPAN Distributions =item Writing typemap Entries =item Full Listing of Core Typemaps T_SV, T_SVREF, T_SVREF_FIXED, T_AVREF, T_AVREF_REFCOUNT_FIXED, T_HVREF, T_HVREF_REFCOUNT_FIXED, T_CVREF, T_CVREF_REFCOUNT_FIXED, T_SYSRET, T_UV, T_IV, T_INT, T_ENUM, T_BOOL, T_U_INT, T_SHORT, T_U_SHORT, T_LONG, T_U_LONG, T_CHAR, T_U_CHAR, T_FLOAT, T_NV, T_DOUBLE, T_PV, T_PTR, T_PTRREF, T_PTROBJ, T_REF_IV_REF, T_REF_IV_PTR, T_PTRDESC, T_REFREF, T_REFOBJ, T_OPAQUEPTR, T_OPAQUE, Implicit array, T_PACKED, T_PACKEDARRAY, T_DATAUNIT, T_CALLBACK, T_ARRAY, T_STDIO, T_INOUT, T_IN, T_OUT =back =back =head2 perlclib - Internal replacements for standard C library functions =over 4 =item DESCRIPTION =over 4 =item Conventions C<t>, C<p>, C<n>, C<s> =item File Operations =item File Input and Output =item File Positioning =item Memory Management and String Handling =item Character Class Tests =item F<stdlib.h> functions =item Miscellaneous functions =back =item SEE ALSO =back =head2 perlguts - Introduction to the Perl API =over 4 =item DESCRIPTION =item Variables =over 4 =item Datatypes =item What is an "IV"? =item Working with SVs =item Offsets =item What's Really Stored in an SV? 
=item Working with AVs =item Working with HVs =item Hash API Extensions =item AVs, HVs and undefined values =item References =item Blessed References and Class Objects =item Creating New Variables GV_ADDMULTI, GV_ADDWARN =item Reference Counts and Mortality =item Stashes and Globs =item Double-Typed SVs =item Read-Only Values =item Copy on Write =item Magic Variables =item Assigning Magic =item Magic Virtual Tables =item Finding Magic =item Understanding the Magic of Tied Hashes and Arrays =item Localizing changes C<SAVEINT(int i)>, C<SAVEIV(IV i)>, C<SAVEI32(I32 i)>, C<SAVELONG(long i)>, C<SAVESPTR(s)>, C<SAVEPPTR(p)>, C<SAVEFREESV(SV *sv)>, C<SAVEMORTALIZESV(SV *sv)>, C<SAVEFREEOP(OP *op)>, C<SAVEFREEPV(p)>, C<SAVECLEARSV(SV *sv)>, C<SAVEDELETE(HV *hv, char *key, I32 length)>, C<SAVEDESTRUCTOR(DESTRUCTORFUNC_NOCONTEXT_t f, void *p)>, C<SAVEDESTRUCTOR_X(DESTRUCTORFUNC_t f, void *p)>, C<SAVESTACK_POS()>, C<SV* save_scalar(GV *gv)>, C<AV* save_ary(GV *gv)>, C<HV* save_hash(GV *gv)>, C<void save_item(SV *item)>, C<void save_list(SV **sarg, I32 maxsarg)>, C<SV* save_svref(SV **sptr)>, C<void save_aptr(AV **aptr)>, C<void save_hptr(HV **hptr)> =back =item Subroutines =over 4 =item XSUBs and the Argument Stack =item Autoloading with XSUBs =item Calling Perl Routines from within C Programs =item Putting a C value on Perl stack =item Scratchpads =item Scratchpads and recursion =back =item Memory Allocation =over 4 =item Allocation =item Reallocation =item Moving =back =item PerlIO =item Compiled code =over 4 =item Code tree =item Examining the tree =item Compile pass 1: check routines =item Compile pass 1a: constant folding =item Compile pass 2: context propagation =item Compile pass 3: peephole optimization =item Pluggable runops =item Compile-time scope hooks C<void bhk_start(pTHX_ int full)>, C<void bhk_pre_end(pTHX_ OP **o)>, C<void bhk_post_end(pTHX_ OP **o)>, C<void bhk_eval(pTHX_ OP *const o)> =back =item Examining internal data structures with the C<dump> 
functions =item How multiple interpreters and concurrency are supported =over 4 =item Background and PERL_IMPLICIT_CONTEXT =item So what happened to dTHR? =item How do I use all this in extensions? =item Should I do anything special if I call perl from multiple threads? =item Future Plans and PERL_IMPLICIT_SYS =back =item Internal Functions =over 4 =item Formatted Printing of IVs, UVs, and NVs =item Formatted Printing of SVs =item Formatted Printing of Strings =item Formatted Printing of C<Size_t> and C<SSize_t> =item Pointer-To-Integer and Integer-To-Pointer =item Exception Handling =item Source Documentation =item Backwards compatibility =back =item Unicode Support =over 4 =item What B<is> Unicode, anyway? =item How can I recognise a UTF-8 string? =item How does UTF-8 represent Unicode characters? =item How does Perl store UTF-8 strings? =item How do I convert a string to UTF-8? =item How do I compare strings? =item Is there anything else I need to know? =back =item Custom Operators xop_name, xop_desc, xop_class, OA_BASEOP, OA_UNOP, OA_BINOP, OA_LOGOP, OA_LISTOP, OA_PMOP, OA_SVOP, OA_PADOP, OA_PVOP_OR_SVOP, OA_LOOP, OA_COP, xop_peep =item Stacks =over 4 =item Value Stack =item Mark Stack =item Temporaries Stack =item Save Stack =item Scope Stack =back =item Dynamic Scope and the Context Stack =over 4 =item Introduction to the context stack =item Pushing contexts =item Popping contexts =item Redoing contexts =back =item Slab-based operator allocation =item AUTHORS =item SEE ALSO =back =head2 perlcall - Perl calling conventions from C =over 4 =item DESCRIPTION An Error Handler, An Event-Driven Program =item THE CALL_ FUNCTIONS call_sv, call_pv, call_method, call_argv =item FLAG VALUES =over 4 =item G_VOID =item G_SCALAR =item G_ARRAY =item G_DISCARD =item G_NOARGS =item G_EVAL =item G_KEEPERR =item Determining the Context =back =item EXAMPLES =over 4 =item No Parameters, Nothing Returned =item Passing Parameters =item Returning a Scalar =item Returning a List of 
Values =item Returning a List in Scalar Context =item Returning Data from Perl via the Parameter List =item Using G_EVAL =item Using G_KEEPERR =item Using call_sv =item Using call_argv =item Using call_method =item Using GIMME_V =item Using Perl to Dispose of Temporaries =item Strategies for Storing Callback Context Information 1. Ignore the problem - Allow only 1 callback, 2. Create a sequence of callbacks - hard wired limit, 3. Use a parameter to map to the Perl callback =item Alternate Stack Manipulation =item Creating and Calling an Anonymous Subroutine in C =back =item LIGHTWEIGHT CALLBACKS =item SEE ALSO =item AUTHOR =item DATE =back =head2 perlmroapi - Perl method resolution plugin interface =over 4 =item DESCRIPTION resolve, name, length, kflags, hash =item Callbacks =item Caching =item Examples =item AUTHORS =back =head2 perlreapi - Perl regular expression plugin interface =over 4 =item DESCRIPTION =item Callbacks =over 4 =item comp C</m> - RXf_PMf_MULTILINE, C</s> - RXf_PMf_SINGLELINE, C</i> - RXf_PMf_FOLD, C</x> - RXf_PMf_EXTENDED, C</p> - RXf_PMf_KEEPCOPY, Character set, RXf_SPLIT, RXf_SKIPWHITE, RXf_START_ONLY, RXf_WHITE, RXf_NULL, RXf_NO_INPLACE_SUBST =item exec rx, sv, strbeg, strend, stringarg, minend, data, flags =item intuit =item checkstr =item free =item Numbered capture callbacks =item Named capture callbacks =item qr_package =item dupe =item op_comp =back =item The REGEXP structure =over 4 =item C<engine> =item C<mother_re> =item C<extflags> =item C<minlen> C<minlenret> =item C<gofs> =item C<substrs> =item C<nparens>, C<lastparen>, and C<lastcloseparen> =item C<intflags> =item C<pprivate> =item C<offs> =item C<precomp> C<prelen> =item C<paren_names> =item C<substrs> =item C<subbeg> C<sublen> C<saved_copy> C<suboffset> C<subcoffset> =item C<wrapped> C<wraplen> =item C<seen_evals> =item C<refcnt> =back =item HISTORY =item AUTHORS =item LICENSE =back =head2 perlreguts - Description of the Perl regular expression engine. 
=over 4 =item DESCRIPTION =item OVERVIEW =over 4 =item A quick note on terms =item What is a regular expression engine? =item Structure of a Regexp Program C<regnode_1>, C<regnode_2>, C<regnode_string>, C<regnode_charclass>, C<regnode_charclass_posixl> =back =item Process Overview A. Compilation, 1. Parsing, 2. Peep-hole optimisation and analysis, B. Execution, 3. Start position and no-match optimisations, 4. Program execution =over 4 =item Compilation anchored fixed strings, floating fixed strings, minimum and maximum length requirements, start class, Beginning/End of line positions =item Execution =back =item MISCELLANEOUS =over 4 =item Unicode and Localisation Support =item Base Structures C<offsets>, C<regstclass>, C<data>, C<program> =back =item SEE ALSO =item AUTHOR =item LICENCE =item REFERENCES =back =head2 perlapi - autogenerated documentation for the perl public API =over 4 =item DESCRIPTION X<Perl API> X<API> X<api> =item Array Manipulation Functions av_clear X<av_clear>, av_create_and_push X<av_create_and_push>, av_create_and_unshift_one X<av_create_and_unshift_one>, av_delete X<av_delete>, av_exists X<av_exists>, av_extend X<av_extend>, av_fetch X<av_fetch>, AvFILL X<AvFILL>, av_fill X<av_fill>, av_len X<av_len>, av_make X<av_make>, av_pop X<av_pop>, av_push X<av_push>, av_shift X<av_shift>, av_store X<av_store>, av_tindex X<av_tindex>, av_top_index X<av_top_index>, av_undef X<av_undef>, av_unshift X<av_unshift>, get_av X<get_av>, newAV X<newAV>, sortsv X<sortsv> =item Callback Functions call_argv X<call_argv>, call_method X<call_method>, call_pv X<call_pv>, call_sv X<call_sv>, ENTER X<ENTER>, ENTER_with_name X<ENTER_with_name>, eval_pv X<eval_pv>, eval_sv X<eval_sv>, FREETMPS X<FREETMPS>, LEAVE X<LEAVE>, LEAVE_with_name X<LEAVE_with_name>, SAVETMPS X<SAVETMPS> =item Character case changing toFOLD X<toFOLD>, toFOLD_utf8 X<toFOLD_utf8>, toFOLD_utf8_safe X<toFOLD_utf8_safe>, toFOLD_uvchr X<toFOLD_uvchr>, toLOWER X<toLOWER>, toLOWER_L1 X<toLOWER_L1>, 
toLOWER_LC X<toLOWER_LC>, toLOWER_utf8 X<toLOWER_utf8>, toLOWER_utf8_safe X<toLOWER_utf8_safe>, toLOWER_uvchr X<toLOWER_uvchr>, toTITLE X<toTITLE>, toTITLE_utf8 X<toTITLE_utf8>, toTITLE_utf8_safe X<toTITLE_utf8_safe>, toTITLE_uvchr X<toTITLE_uvchr>, toUPPER X<toUPPER>, toUPPER_utf8 X<toUPPER_utf8>, toUPPER_utf8_safe X<toUPPER_utf8_safe>, toUPPER_uvchr X<toUPPER_uvchr>, WIDEST_UTYPE X<WIDEST_UTYPE> =item Character classification isALPHA X<isALPHA>, isALPHANUMERIC X<isALPHANUMERIC>, isASCII X<isASCII>, isBLANK X<isBLANK>, isCNTRL X<isCNTRL>, isDIGIT X<isDIGIT>, isGRAPH X<isGRAPH>, isIDCONT X<isIDCONT>, isIDFIRST X<isIDFIRST>, isLOWER X<isLOWER>, isOCTAL X<isOCTAL>, isPRINT X<isPRINT>, isPSXSPC X<isPSXSPC>, isPUNCT X<isPUNCT>, isSPACE X<isSPACE>, isUPPER X<isUPPER>, isWORDCHAR X<isWORDCHAR>, isXDIGIT X<isXDIGIT> =item Cloning an interpreter perl_clone X<perl_clone> =item Compile-time scope hooks BhkDISABLE X<BhkDISABLE>, BhkENABLE X<BhkENABLE>, BhkENTRY_set X<BhkENTRY_set>, blockhook_register X<blockhook_register> =item COP Hint Hashes cophh_2hv X<cophh_2hv>, cophh_copy X<cophh_copy>, cophh_delete_pv X<cophh_delete_pv>, cophh_delete_pvn X<cophh_delete_pvn>, cophh_delete_pvs X<cophh_delete_pvs>, cophh_delete_sv X<cophh_delete_sv>, cophh_fetch_pv X<cophh_fetch_pv>, cophh_fetch_pvn X<cophh_fetch_pvn>, cophh_fetch_pvs X<cophh_fetch_pvs>, cophh_fetch_sv X<cophh_fetch_sv>, cophh_free X<cophh_free>, cophh_new_empty X<cophh_new_empty>, cophh_store_pv X<cophh_store_pv>, cophh_store_pvn X<cophh_store_pvn>, cophh_store_pvs X<cophh_store_pvs>, cophh_store_sv X<cophh_store_sv> =item COP Hint Reading cop_hints_2hv X<cop_hints_2hv>, cop_hints_fetch_pv X<cop_hints_fetch_pv>, cop_hints_fetch_pvn X<cop_hints_fetch_pvn>, cop_hints_fetch_pvs X<cop_hints_fetch_pvs>, cop_hints_fetch_sv X<cop_hints_fetch_sv>, CopLABEL X<CopLABEL>, CopLABEL_len X<CopLABEL_len>, CopLABEL_len_flags X<CopLABEL_len_flags> =item Custom Operators custom_op_register X<custom_op_register>, Perl_custom_op_xop 
X<Perl_custom_op_xop>, XopDISABLE X<XopDISABLE>, XopENABLE X<XopENABLE>, XopENTRY X<XopENTRY>, XopENTRYCUSTOM X<XopENTRYCUSTOM>, XopENTRY_set X<XopENTRY_set>, XopFLAGS X<XopFLAGS> =item CV Manipulation Functions caller_cx X<caller_cx>, CvSTASH X<CvSTASH>, find_runcv X<find_runcv>, get_cv X<get_cv>, get_cvn_flags X<get_cvn_flags> =item C<xsubpp> variables and internal functions ax X<ax>, CLASS X<CLASS>, dAX X<dAX>, dAXMARK X<dAXMARK>, dITEMS X<dITEMS>, dUNDERBAR X<dUNDERBAR>, dXSARGS X<dXSARGS>, dXSI32 X<dXSI32>, items X<items>, ix X<ix>, RETVAL X<RETVAL>, ST X<ST>, THIS X<THIS>, UNDERBAR X<UNDERBAR>, XS X<XS>, XS_EXTERNAL X<XS_EXTERNAL>, XS_INTERNAL X<XS_INTERNAL> =item Debugging Utilities dump_all X<dump_all>, dump_packsubs X<dump_packsubs>, op_class X<op_class>, op_dump X<op_dump>, sv_dump X<sv_dump> =item Display and Dump functions pv_display X<pv_display>, pv_escape X<pv_escape>, pv_pretty X<pv_pretty> =item Embedding Functions cv_clone X<cv_clone>, cv_name X<cv_name>, cv_undef X<cv_undef>, find_rundefsv X<find_rundefsv>, find_rundefsvoffset X<find_rundefsvoffset>, intro_my X<intro_my>, load_module X<load_module>, my_exit X<my_exit>, newPADNAMELIST X<newPADNAMELIST>, newPADNAMEouter X<newPADNAMEouter>, newPADNAMEpvn X<newPADNAMEpvn>, nothreadhook X<nothreadhook>, pad_add_anon X<pad_add_anon>, pad_add_name_pv X<pad_add_name_pv>, pad_add_name_pvn X<pad_add_name_pvn>, pad_add_name_sv X<pad_add_name_sv>, pad_alloc X<pad_alloc>, pad_findmy_pv X<pad_findmy_pv>, pad_findmy_pvn X<pad_findmy_pvn>, pad_findmy_sv X<pad_findmy_sv>, padnamelist_fetch X<padnamelist_fetch>, padnamelist_store X<padnamelist_store>, pad_setsv X<pad_setsv>, pad_sv X<pad_sv>, pad_tidy X<pad_tidy>, perl_alloc X<perl_alloc>, perl_construct X<perl_construct>, perl_destruct X<perl_destruct>, perl_free X<perl_free>, perl_parse X<perl_parse>, perl_run X<perl_run>, require_pv X<require_pv> =item Exception Handling (simple) Macros dXCPT X<dXCPT>, XCPT_CATCH X<XCPT_CATCH>, XCPT_RETHROW X<XCPT_RETHROW>, 
XCPT_TRY_END X<XCPT_TRY_END>, XCPT_TRY_START X<XCPT_TRY_START> =item Functions in file inline.h av_count X<av_count> =item Functions in file vutil.c new_version X<new_version>, prescan_version X<prescan_version>, scan_version X<scan_version>, upg_version X<upg_version>, vcmp X<vcmp>, vnormal X<vnormal>, vnumify X<vnumify>, vstringify X<vstringify>, vverify X<vverify>, The SV is an HV or a reference to an HV, The hash contains a "version" key, The "version" key has a reference to an AV as its value =item "Gimme" Values G_ARRAY X<G_ARRAY>, G_DISCARD X<G_DISCARD>, G_EVAL X<G_EVAL>, GIMME X<GIMME>, GIMME_V X<GIMME_V>, G_NOARGS X<G_NOARGS>, G_SCALAR X<G_SCALAR>, G_VOID X<G_VOID> =item Global Variables PL_check X<PL_check>, PL_keyword_plugin X<PL_keyword_plugin>, PL_phase X<PL_phase> =item GV Functions GvAV X<GvAV>, gv_const_sv X<gv_const_sv>, GvCV X<GvCV>, gv_fetchmeth X<gv_fetchmeth>, gv_fetchmethod_autoload X<gv_fetchmethod_autoload>, gv_fetchmeth_autoload X<gv_fetchmeth_autoload>, gv_fetchmeth_pv X<gv_fetchmeth_pv>, gv_fetchmeth_pvn X<gv_fetchmeth_pvn>, gv_fetchmeth_pvn_autoload X<gv_fetchmeth_pvn_autoload>, gv_fetchmeth_pv_autoload X<gv_fetchmeth_pv_autoload>, gv_fetchmeth_sv X<gv_fetchmeth_sv>, gv_fetchmeth_sv_autoload X<gv_fetchmeth_sv_autoload>, GvHV X<GvHV>, gv_init X<gv_init>, gv_init_pv X<gv_init_pv>, gv_init_pvn X<gv_init_pvn>, gv_init_sv X<gv_init_sv>, gv_stashpv X<gv_stashpv>, gv_stashpvn X<gv_stashpvn>, gv_stashpvs X<gv_stashpvs>, gv_stashsv X<gv_stashsv>, GvSV X<GvSV>, save_gp X<save_gp>, setdefout X<setdefout> =item Handy Values C_ARRAY_END X<C_ARRAY_END>, C_ARRAY_LENGTH X<C_ARRAY_LENGTH>, cBOOL X<cBOOL>, Nullav X<Nullav>, Nullch X<Nullch>, Nullcv X<Nullcv>, Nullhv X<Nullhv>, Nullsv X<Nullsv>, STR_WITH_LEN X<STR_WITH_LEN>, __ASSERT_ X<__ASSERT_> =item Hash Manipulation Functions cop_fetch_label X<cop_fetch_label>, cop_store_label X<cop_store_label>, get_hv X<get_hv>, HEf_SVKEY X<HEf_SVKEY>, HeHASH X<HeHASH>, HeKEY X<HeKEY>, HeKLEN X<HeKLEN>, HePV 
X<HePV>, HeSVKEY X<HeSVKEY>, HeSVKEY_force X<HeSVKEY_force>, HeSVKEY_set X<HeSVKEY_set>, HeUTF8 X<HeUTF8>, HeVAL X<HeVAL>, hv_assert X<hv_assert>, hv_bucket_ratio X<hv_bucket_ratio>, hv_clear X<hv_clear>, hv_clear_placeholders X<hv_clear_placeholders>, hv_copy_hints_hv X<hv_copy_hints_hv>, hv_delete X<hv_delete>, hv_delete_ent X<hv_delete_ent>, HvENAME X<HvENAME>, HvENAMELEN X<HvENAMELEN>, HvENAMEUTF8 X<HvENAMEUTF8>, hv_exists X<hv_exists>, hv_exists_ent X<hv_exists_ent>, hv_fetch X<hv_fetch>, hv_fetchs X<hv_fetchs>, hv_fetch_ent X<hv_fetch_ent>, HvFILL X<HvFILL>, hv_fill X<hv_fill>, hv_iterinit X<hv_iterinit>, hv_iterkey X<hv_iterkey>, hv_iterkeysv X<hv_iterkeysv>, hv_iternext X<hv_iternext>, hv_iternextsv X<hv_iternextsv>, hv_iternext_flags X<hv_iternext_flags>, hv_iterval X<hv_iterval>, hv_magic X<hv_magic>, HvNAME X<HvNAME>, HvNAMELEN X<HvNAMELEN>, HvNAMEUTF8 X<HvNAMEUTF8>, hv_scalar X<hv_scalar>, hv_store X<hv_store>, hv_stores X<hv_stores>, hv_store_ent X<hv_store_ent>, hv_undef X<hv_undef>, newHV X<newHV> =item Hook manipulation wrap_op_checker X<wrap_op_checker> =item Lexer interface lex_bufutf8 X<lex_bufutf8>, lex_discard_to X<lex_discard_to>, lex_grow_linestr X<lex_grow_linestr>, lex_next_chunk X<lex_next_chunk>, lex_peek_unichar X<lex_peek_unichar>, lex_read_space X<lex_read_space>, lex_read_to X<lex_read_to>, lex_read_unichar X<lex_read_unichar>, lex_start X<lex_start>, lex_stuff_pv X<lex_stuff_pv>, lex_stuff_pvn X<lex_stuff_pvn>, lex_stuff_pvs X<lex_stuff_pvs>, lex_stuff_sv X<lex_stuff_sv>, lex_unstuff X<lex_unstuff>, parse_arithexpr X<parse_arithexpr>, parse_barestmt X<parse_barestmt>, parse_block X<parse_block>, parse_fullexpr X<parse_fullexpr>, parse_fullstmt X<parse_fullstmt>, parse_label X<parse_label>, parse_listexpr X<parse_listexpr>, parse_stmtseq X<parse_stmtseq>, parse_subsignature X<parse_subsignature>, parse_termexpr X<parse_termexpr>, PL_parser X<PL_parser>, PL_parser-E<gt>bufend X<PL_parser-E<gt>bufend>, PL_parser-E<gt>bufptr 
X<PL_parser-E<gt>bufptr>, PL_parser-E<gt>linestart X<PL_parser-E<gt>linestart>, PL_parser-E<gt>linestr X<PL_parser-E<gt>linestr>, wrap_keyword_plugin X<wrap_keyword_plugin> =item Locale-related functions and macros DECLARATION_FOR_LC_NUMERIC_MANIPULATION X<DECLARATION_FOR_LC_NUMERIC_MANIPULATION>, IN_LOCALE X<IN_LOCALE>, IN_LOCALE_COMPILETIME X<IN_LOCALE_COMPILETIME>, IN_LOCALE_RUNTIME X<IN_LOCALE_RUNTIME>, Perl_langinfo X<Perl_langinfo>, Perl_setlocale X<Perl_setlocale>, RESTORE_LC_NUMERIC X<RESTORE_LC_NUMERIC>, STORE_LC_NUMERIC_FORCE_TO_UNDERLYING X<STORE_LC_NUMERIC_FORCE_TO_UNDERLYING>, STORE_LC_NUMERIC_SET_TO_NEEDED X<STORE_LC_NUMERIC_SET_TO_NEEDED>, STORE_LC_NUMERIC_SET_TO_NEEDED_IN X<STORE_LC_NUMERIC_SET_TO_NEEDED_IN>, switch_to_global_locale X<switch_to_global_locale>, L<POSIX::localeconv|POSIX/localeconv>, L<I18N::Langinfo>, items C<CRNCYSTR> and C<THOUSEP>, L<perlapi/Perl_langinfo>, items C<CRNCYSTR> and C<THOUSEP>, sync_locale X<sync_locale>, WITH_LC_NUMERIC_SET_TO_NEEDED X<WITH_LC_NUMERIC_SET_TO_NEEDED>, WITH_LC_NUMERIC_SET_TO_NEEDED_IN X<WITH_LC_NUMERIC_SET_TO_NEEDED_IN> =item Magical Functions mg_clear X<mg_clear>, mg_copy X<mg_copy>, mg_find X<mg_find>, mg_findext X<mg_findext>, mg_free X<mg_free>, mg_freeext X<mg_freeext>, mg_free_type X<mg_free_type>, mg_get X<mg_get>, mg_length X<mg_length>, mg_magical X<mg_magical>, mg_set X<mg_set>, SvGETMAGIC X<SvGETMAGIC>, SvLOCK X<SvLOCK>, SvSETMAGIC X<SvSETMAGIC>, SvSetMagicSV X<SvSetMagicSV>, SvSetMagicSV_nosteal X<SvSetMagicSV_nosteal>, SvSetSV X<SvSetSV>, SvSetSV_nosteal X<SvSetSV_nosteal>, SvSHARE X<SvSHARE>, sv_string_from_errnum X<sv_string_from_errnum>, SvUNLOCK X<SvUNLOCK> =item Memory Management Copy X<Copy>, CopyD X<CopyD>, Move X<Move>, MoveD X<MoveD>, Newx X<Newx>, Newxc X<Newxc>, Newxz X<Newxz>, Poison X<Poison>, PoisonFree X<PoisonFree>, PoisonNew X<PoisonNew>, PoisonWith X<PoisonWith>, Renew X<Renew>, Renewc X<Renewc>, Safefree X<Safefree>, savepv X<savepv>, savepvn X<savepvn>, savepvs 
X<savepvs>, savesharedpv X<savesharedpv>, savesharedpvn X<savesharedpvn>, savesharedpvs X<savesharedpvs>, savesharedsvpv X<savesharedsvpv>, savesvpv X<savesvpv>, StructCopy X<StructCopy>, Zero X<Zero>, ZeroD X<ZeroD> =item Miscellaneous Functions dump_c_backtrace X<dump_c_backtrace>, fbm_compile X<fbm_compile>, fbm_instr X<fbm_instr>, foldEQ X<foldEQ>, foldEQ_locale X<foldEQ_locale>, form X<form>, getcwd_sv X<getcwd_sv>, get_c_backtrace_dump X<get_c_backtrace_dump>, ibcmp X<ibcmp>, ibcmp_locale X<ibcmp_locale>, instr X<instr>, IS_SAFE_SYSCALL X<IS_SAFE_SYSCALL>, is_safe_syscall X<is_safe_syscall>, LIKELY X<LIKELY>, memCHRs X<memCHRs>, memEQ X<memEQ>, memEQs X<memEQs>, memNE X<memNE>, memNEs X<memNEs>, mess X<mess>, mess_sv X<mess_sv>, my_snprintf X<my_snprintf>, my_sprintf X<my_sprintf>, my_strlcat X<my_strlcat>, my_strlcpy X<my_strlcpy>, my_strnlen X<my_strnlen>, my_vsnprintf X<my_vsnprintf>, ninstr X<ninstr>, PERL_SYS_INIT X<PERL_SYS_INIT>, PERL_SYS_INIT3 X<PERL_SYS_INIT3>, PERL_SYS_TERM X<PERL_SYS_TERM>, READ_XDIGIT X<READ_XDIGIT>, rninstr X<rninstr>, STMT_START X<STMT_START>, strEQ X<strEQ>, strGE X<strGE>, strGT X<strGT>, strLE X<strLE>, strLT X<strLT>, strNE X<strNE>, strnEQ X<strnEQ>, strnNE X<strnNE>, sv_destroyable X<sv_destroyable>, sv_nosharing X<sv_nosharing>, UNLIKELY X<UNLIKELY>, vmess X<vmess> =item MRO Functions mro_get_linear_isa X<mro_get_linear_isa>, mro_method_changed_in X<mro_method_changed_in>, mro_register X<mro_register> =item Multicall Functions dMULTICALL X<dMULTICALL>, MULTICALL X<MULTICALL>, POP_MULTICALL X<POP_MULTICALL>, PUSH_MULTICALL X<PUSH_MULTICALL> =item Numeric functions grok_bin X<grok_bin>, grok_hex X<grok_hex>, grok_infnan X<grok_infnan>, grok_number X<grok_number>, grok_number_flags X<grok_number_flags>, GROK_NUMERIC_RADIX X<GROK_NUMERIC_RADIX>, grok_numeric_radix X<grok_numeric_radix>, grok_oct X<grok_oct>, isinfnan X<isinfnan>, IS_NUMBER_GREATER_THAN_UV_MAX X<IS_NUMBER_GREATER_THAN_UV_MAX>,
IS_NUMBER_INFINITY X<IS_NUMBER_INFINITY>, IS_NUMBER_IN_UV X<IS_NUMBER_IN_UV>, IS_NUMBER_NAN X<IS_NUMBER_NAN>, IS_NUMBER_NEG X<IS_NUMBER_NEG>, IS_NUMBER_NOT_INT X<IS_NUMBER_NOT_INT>, my_strtod X<my_strtod>, PERL_ABS X<PERL_ABS>, PERL_INT_MAX X<PERL_INT_MAX>, Perl_signbit X<Perl_signbit>, scan_bin X<scan_bin>, scan_hex X<scan_hex>, scan_oct X<scan_oct>, Strtod X<Strtod>, Strtol X<Strtol>, Strtoul X<Strtoul> =item Obsolete backwards compatibility functions custom_op_desc X<custom_op_desc>, custom_op_name X<custom_op_name>, gv_fetchmethod X<gv_fetchmethod>, is_utf8_char X<is_utf8_char>, is_utf8_char_buf X<is_utf8_char_buf>, pack_cat X<pack_cat>, pad_compname_type X<pad_compname_type>, sv_2pvbyte_nolen X<sv_2pvbyte_nolen>, sv_2pvutf8_nolen X<sv_2pvutf8_nolen>, sv_2pv_nolen X<sv_2pv_nolen>, sv_catpvn_mg X<sv_catpvn_mg>, sv_catsv_mg X<sv_catsv_mg>, sv_force_normal X<sv_force_normal>, sv_iv X<sv_iv>, sv_nolocking X<sv_nolocking>, sv_nounlocking X<sv_nounlocking>, sv_nv X<sv_nv>, sv_pv X<sv_pv>, sv_pvbyte X<sv_pvbyte>, sv_pvbyten X<sv_pvbyten>, sv_pvn X<sv_pvn>, sv_pvutf8 X<sv_pvutf8>, sv_pvutf8n X<sv_pvutf8n>, sv_taint X<sv_taint>, sv_unref X<sv_unref>, sv_usepvn X<sv_usepvn>, sv_usepvn_mg X<sv_usepvn_mg>, sv_uv X<sv_uv>, unpack_str X<unpack_str>, utf8_to_uvchr X<utf8_to_uvchr> =item Optree construction newASSIGNOP X<newASSIGNOP>, newBINOP X<newBINOP>, newCONDOP X<newCONDOP>, newDEFSVOP X<newDEFSVOP>, newFOROP X<newFOROP>, newGIVENOP X<newGIVENOP>, newGVOP X<newGVOP>, newLISTOP X<newLISTOP>, newLOGOP X<newLOGOP>, newLOOPEX X<newLOOPEX>, newLOOPOP X<newLOOPOP>, newMETHOP X<newMETHOP>, newMETHOP_named X<newMETHOP_named>, newNULLLIST X<newNULLLIST>, newOP X<newOP>, newPADOP X<newPADOP>, newPMOP X<newPMOP>, newPVOP X<newPVOP>, newRANGE X<newRANGE>, newSLICEOP X<newSLICEOP>, newSTATEOP X<newSTATEOP>, newSVOP X<newSVOP>, newUNOP X<newUNOP>, newUNOP_AUX
X<newUNOP_AUX>, newWHENOP X<newWHENOP>, newWHILEOP X<newWHILEOP> =item Optree Manipulation Functions alloccopstash X<alloccopstash>, block_end X<block_end>, block_start X<block_start>, ck_entersub_args_list X<ck_entersub_args_list>, ck_entersub_args_proto X<ck_entersub_args_proto>, ck_entersub_args_proto_or_list X<ck_entersub_args_proto_or_list>, cv_const_sv X<cv_const_sv>, cv_get_call_checker X<cv_get_call_checker>, cv_get_call_checker_flags X<cv_get_call_checker_flags>, cv_set_call_checker X<cv_set_call_checker>, cv_set_call_checker_flags X<cv_set_call_checker_flags>, LINKLIST X<LINKLIST>, newCONSTSUB X<newCONSTSUB>, newCONSTSUB_flags X<newCONSTSUB_flags>, newXS X<newXS>, op_append_elem X<op_append_elem>, op_append_list X<op_append_list>, OP_CLASS X<OP_CLASS>, op_contextualize X<op_contextualize>, op_convert_list X<op_convert_list>, OP_DESC X<OP_DESC>, op_free X<op_free>, OpHAS_SIBLING X<OpHAS_SIBLING>, OpLASTSIB_set X<OpLASTSIB_set>, op_linklist X<op_linklist>, op_lvalue X<op_lvalue>, OpMAYBESIB_set X<OpMAYBESIB_set>, OpMORESIB_set X<OpMORESIB_set>, OP_NAME X<OP_NAME>, op_null X<op_null>, op_parent X<op_parent>, op_prepend_elem X<op_prepend_elem>, op_scope X<op_scope>, OpSIBLING X<OpSIBLING>, op_sibling_splice X<op_sibling_splice>, OP_TYPE_IS X<OP_TYPE_IS>, OP_TYPE_IS_OR_WAS X<OP_TYPE_IS_OR_WAS>, rv2cv_op_cv X<rv2cv_op_cv> =item Pack and Unpack packlist X<packlist>, unpackstring X<unpackstring> =item Pad Data Structures CvPADLIST X<CvPADLIST>, pad_add_name_pvs X<pad_add_name_pvs>, PadARRAY X<PadARRAY>, pad_findmy_pvs X<pad_findmy_pvs>, PadlistARRAY X<PadlistARRAY>, PadlistMAX X<PadlistMAX>, PadlistNAMES X<PadlistNAMES>, PadlistNAMESARRAY X<PadlistNAMESARRAY>, PadlistNAMESMAX X<PadlistNAMESMAX>, PadlistREFCNT X<PadlistREFCNT>, PadMAX X<PadMAX>, PadnameLEN X<PadnameLEN>, PadnamelistARRAY X<PadnamelistARRAY>, PadnamelistMAX X<PadnamelistMAX>, PadnamelistREFCNT X<PadnamelistREFCNT>, PadnamelistREFCNT_dec X<PadnamelistREFCNT_dec>, PadnamePV X<PadnamePV>, 
PadnameREFCNT X<PadnameREFCNT>, PadnameREFCNT_dec X<PadnameREFCNT_dec>, PadnameSV X<PadnameSV>, PadnameUTF8 X<PadnameUTF8>, pad_new X<pad_new>, PL_comppad X<PL_comppad>, PL_comppad_name X<PL_comppad_name>, PL_curpad X<PL_curpad> =item Per-Interpreter Variables PL_curcop X<PL_curcop>, PL_curstash X<PL_curstash>, PL_defgv X<PL_defgv>, PL_exit_flags X<PL_exit_flags>, C<PERL_EXIT_DESTRUCT_END>, C<PERL_EXIT_ABORT>, C<PERL_EXIT_WARN>, C<PERL_EXIT_EXPECTED>, PL_modglobal X<PL_modglobal>, PL_na X<PL_na>, PL_opfreehook X<PL_opfreehook>, PL_peepp X<PL_peepp>, PL_perl_destruct_level X<PL_perl_destruct_level>, 0 - none, 1 - full, 2 or greater - full with checks, PL_rpeepp X<PL_rpeepp>, PL_runops X<PL_runops>, PL_sv_no X<PL_sv_no>, PL_sv_undef X<PL_sv_undef>, PL_sv_yes X<PL_sv_yes>, PL_sv_zero X<PL_sv_zero> =item REGEXP Functions SvRX X<SvRX>, SvRXOK X<SvRXOK> =item Stack Manipulation Macros dMARK X<dMARK>, dORIGMARK X<dORIGMARK>, dSP X<dSP>, EXTEND X<EXTEND>, MARK X<MARK>, mPUSHi X<mPUSHi>, mPUSHn X<mPUSHn>, mPUSHp X<mPUSHp>, mPUSHs X<mPUSHs>, mPUSHu X<mPUSHu>, mXPUSHi X<mXPUSHi>, mXPUSHn X<mXPUSHn>, mXPUSHp X<mXPUSHp>, mXPUSHs X<mXPUSHs>, mXPUSHu X<mXPUSHu>, ORIGMARK X<ORIGMARK>, POPi X<POPi>, POPl X<POPl>, POPn X<POPn>, POPp X<POPp>, POPpbytex X<POPpbytex>, POPpx X<POPpx>, POPs X<POPs>, POPu X<POPu>, POPul X<POPul>, PUSHi X<PUSHi>, PUSHMARK X<PUSHMARK>, PUSHmortal X<PUSHmortal>, PUSHn X<PUSHn>, PUSHp X<PUSHp>, PUSHs X<PUSHs>, PUSHu X<PUSHu>, PUTBACK X<PUTBACK>, SP X<SP>, SPAGAIN X<SPAGAIN>, XPUSHi X<XPUSHi>, XPUSHmortal X<XPUSHmortal>, XPUSHn X<XPUSHn>, XPUSHp X<XPUSHp>, XPUSHs X<XPUSHs>, XPUSHu X<XPUSHu>, XSRETURN X<XSRETURN>, XSRETURN_EMPTY X<XSRETURN_EMPTY>, XSRETURN_IV X<XSRETURN_IV>, XSRETURN_NO X<XSRETURN_NO>, XSRETURN_NV X<XSRETURN_NV>, XSRETURN_PV X<XSRETURN_PV>, XSRETURN_UNDEF X<XSRETURN_UNDEF>, XSRETURN_UV X<XSRETURN_UV>, XSRETURN_YES X<XSRETURN_YES>, XST_mIV X<XST_mIV>, XST_mNO X<XST_mNO>, XST_mNV X<XST_mNV>, XST_mPV X<XST_mPV>, XST_mUNDEF X<XST_mUNDEF>, XST_mUV 
X<XST_mUV>, XST_mYES X<XST_mYES> =item SV Flags SVt_IV X<SVt_IV>, SVt_NULL X<SVt_NULL>, SVt_NV X<SVt_NV>, SVt_PV X<SVt_PV>, SVt_PVAV X<SVt_PVAV>, SVt_PVCV X<SVt_PVCV>, SVt_PVFM X<SVt_PVFM>, SVt_PVGV X<SVt_PVGV>, SVt_PVHV X<SVt_PVHV>, SVt_PVIO X<SVt_PVIO>, SVt_PVIV X<SVt_PVIV>, SVt_PVLV X<SVt_PVLV>, SVt_PVMG X<SVt_PVMG>, SVt_PVNV X<SVt_PVNV>, SVt_REGEXP X<SVt_REGEXP>, svtype X<svtype> =item SV Manipulation Functions boolSV X<boolSV>, croak_xs_usage X<croak_xs_usage>, get_sv X<get_sv>, looks_like_number X<looks_like_number>, newRV_inc X<newRV_inc>, newRV_noinc X<newRV_noinc>, newSV X<newSV>, newSVhek X<newSVhek>, newSViv X<newSViv>, newSVnv X<newSVnv>, newSVpadname X<newSVpadname>, newSVpv X<newSVpv>, newSVpvf X<newSVpvf>, newSVpvn X<newSVpvn>, newSVpvn_flags X<newSVpvn_flags>, newSVpvn_share X<newSVpvn_share>, newSVpvn_utf8 X<newSVpvn_utf8>, newSVpvs X<newSVpvs>, newSVpvs_flags X<newSVpvs_flags>, newSVpv_share X<newSVpv_share>, newSVpvs_share X<newSVpvs_share>, newSVrv X<newSVrv>, newSVsv X<newSVsv>, newSVsv_nomg X<newSVsv_nomg>, newSV_type X<newSV_type>, newSVuv X<newSVuv>, sortsv_flags X<sortsv_flags>, sv_2bool X<sv_2bool>, sv_2bool_flags X<sv_2bool_flags>, sv_2cv X<sv_2cv>, sv_2io X<sv_2io>, sv_2iv_flags X<sv_2iv_flags>, sv_2mortal X<sv_2mortal>, sv_2nv_flags X<sv_2nv_flags>, sv_2pvbyte X<sv_2pvbyte>, sv_2pvutf8 X<sv_2pvutf8>, sv_2pv_flags X<sv_2pv_flags>, sv_2uv_flags X<sv_2uv_flags>, sv_backoff X<sv_backoff>, sv_bless X<sv_bless>, sv_catpv X<sv_catpv>, sv_catpvf X<sv_catpvf>, sv_catpvf_mg X<sv_catpvf_mg>, sv_catpvn X<sv_catpvn>, sv_catpvn_flags X<sv_catpvn_flags>, sv_catpvn_nomg X<sv_catpvn_nomg>, sv_catpvs X<sv_catpvs>, sv_catpvs_flags X<sv_catpvs_flags>, sv_catpvs_mg X<sv_catpvs_mg>, sv_catpvs_nomg X<sv_catpvs_nomg>, sv_catpv_flags X<sv_catpv_flags>, sv_catpv_mg X<sv_catpv_mg>, sv_catpv_nomg X<sv_catpv_nomg>, sv_catsv X<sv_catsv>, sv_catsv_flags X<sv_catsv_flags>, sv_catsv_nomg X<sv_catsv_nomg>, sv_chop X<sv_chop>, sv_clear X<sv_clear>, sv_cmp X<sv_cmp>, 
sv_cmp_flags X<sv_cmp_flags>, sv_cmp_locale X<sv_cmp_locale>, sv_cmp_locale_flags X<sv_cmp_locale_flags>, sv_collxfrm X<sv_collxfrm>, sv_collxfrm_flags X<sv_collxfrm_flags>, sv_copypv X<sv_copypv>, sv_copypv_flags X<sv_copypv_flags>, sv_copypv_nomg X<sv_copypv_nomg>, SvCUR X<SvCUR>, SvCUR_set X<SvCUR_set>, sv_dec X<sv_dec>, sv_dec_nomg X<sv_dec_nomg>, sv_derived_from X<sv_derived_from>, sv_derived_from_pv X<sv_derived_from_pv>, sv_derived_from_pvn X<sv_derived_from_pvn>, sv_derived_from_sv X<sv_derived_from_sv>, sv_does X<sv_does>, sv_does_pv X<sv_does_pv>, sv_does_pvn X<sv_does_pvn>, sv_does_sv X<sv_does_sv>, SvEND X<SvEND>, sv_eq X<sv_eq>, sv_eq_flags X<sv_eq_flags>, sv_force_normal_flags X<sv_force_normal_flags>, sv_free X<sv_free>, SvGAMAGIC X<SvGAMAGIC>, sv_gets X<sv_gets>, sv_get_backrefs X<sv_get_backrefs>, SvGROW X<SvGROW>, sv_grow X<sv_grow>, sv_inc X<sv_inc>, sv_inc_nomg X<sv_inc_nomg>, sv_insert X<sv_insert>, sv_insert_flags X<sv_insert_flags>, SvIOK X<SvIOK>, SvIOK_notUV X<SvIOK_notUV>, SvIOK_off X<SvIOK_off>, SvIOK_on X<SvIOK_on>, SvIOK_only X<SvIOK_only>, SvIOK_only_UV X<SvIOK_only_UV>, SvIOKp X<SvIOKp>, SvIOK_UV X<SvIOK_UV>, sv_isa X<sv_isa>, sv_isa_sv X<sv_isa_sv>, SvIsCOW X<SvIsCOW>, SvIsCOW_shared_hash X<SvIsCOW_shared_hash>, sv_isobject X<sv_isobject>, SvIV X<SvIV>, SvIV_nomg X<SvIV_nomg>, SvIV_set X<SvIV_set>, SvIVX X<SvIVX>, SvIVx X<SvIVx>, SvLEN X<SvLEN>, sv_len X<sv_len>, SvLEN_set X<SvLEN_set>, sv_len_utf8 X<sv_len_utf8>, sv_magic X<sv_magic>, sv_magicext X<sv_magicext>, SvMAGIC_set X<SvMAGIC_set>, sv_mortalcopy X<sv_mortalcopy>, sv_mortalcopy_flags X<sv_mortalcopy_flags>, sv_newmortal X<sv_newmortal>, sv_newref X<sv_newref>, SvNIOK X<SvNIOK>, SvNIOK_off X<SvNIOK_off>, SvNIOKp X<SvNIOKp>, SvNOK X<SvNOK>, SvNOK_off X<SvNOK_off>, SvNOK_on X<SvNOK_on>, SvNOK_only X<SvNOK_only>, SvNOKp X<SvNOKp>, SvNV X<SvNV>, SvNV_nomg X<SvNV_nomg>, SvNV_set X<SvNV_set>, SvNVX X<SvNVX>, SvNVx X<SvNVx>, SvOK X<SvOK>, SvOOK X<SvOOK>, SvOOK_offset X<SvOOK_offset>, 
SvPOK X<SvPOK>, SvPOK_off X<SvPOK_off>, SvPOK_on X<SvPOK_on>, SvPOK_only X<SvPOK_only>, SvPOK_only_UTF8 X<SvPOK_only_UTF8>, SvPOKp X<SvPOKp>, sv_pos_b2u X<sv_pos_b2u>, sv_pos_b2u_flags X<sv_pos_b2u_flags>, sv_pos_u2b X<sv_pos_u2b>, sv_pos_u2b_flags X<sv_pos_u2b_flags>, SvPV X<SvPV>, SvPVbyte X<SvPVbyte>, SvPVbyte_force X<SvPVbyte_force>, SvPVbyte_nolen X<SvPVbyte_nolen>, SvPVbyte_nomg X<SvPVbyte_nomg>, sv_pvbyten_force X<sv_pvbyten_force>, SvPVbyte_or_null X<SvPVbyte_or_null>, SvPVbyte_or_null_nomg X<SvPVbyte_or_null_nomg>, SvPVbytex X<SvPVbytex>, SvPVbytex_force X<SvPVbytex_force>, SvPVCLEAR X<SvPVCLEAR>, SvPV_force X<SvPV_force>, SvPV_force_nomg X<SvPV_force_nomg>, SvPV_nolen X<SvPV_nolen>, SvPV_nomg X<SvPV_nomg>, SvPV_nomg_nolen X<SvPV_nomg_nolen>, sv_pvn_force X<sv_pvn_force>, sv_pvn_force_flags X<sv_pvn_force_flags>, SvPV_set X<SvPV_set>, SvPVutf8 X<SvPVutf8>, sv_pvutf8n_force X<sv_pvutf8n_force>, SvPVutf8x X<SvPVutf8x>, SvPVutf8x_force X<SvPVutf8x_force>, SvPVutf8_force X<SvPVutf8_force>, SvPVutf8_nolen X<SvPVutf8_nolen>, SvPVutf8_nomg X<SvPVutf8_nomg>, SvPVutf8_or_null X<SvPVutf8_or_null>, SvPVutf8_or_null_nomg X<SvPVutf8_or_null_nomg>, SvPVX X<SvPVX>, SvPVx X<SvPVx>, SvREADONLY X<SvREADONLY>, SvREADONLY_off X<SvREADONLY_off>, SvREADONLY_on X<SvREADONLY_on>, sv_ref X<sv_ref>, SvREFCNT X<SvREFCNT>, SvREFCNT_dec X<SvREFCNT_dec>, SvREFCNT_dec_NN X<SvREFCNT_dec_NN>, SvREFCNT_inc X<SvREFCNT_inc>, SvREFCNT_inc_NN X<SvREFCNT_inc_NN>, SvREFCNT_inc_simple X<SvREFCNT_inc_simple>, SvREFCNT_inc_simple_NN X<SvREFCNT_inc_simple_NN>, SvREFCNT_inc_simple_void X<SvREFCNT_inc_simple_void>, SvREFCNT_inc_simple_void_NN X<SvREFCNT_inc_simple_void_NN>, SvREFCNT_inc_void X<SvREFCNT_inc_void>, SvREFCNT_inc_void_NN X<SvREFCNT_inc_void_NN>, sv_reftype X<sv_reftype>, sv_replace X<sv_replace>, sv_report_used X<sv_report_used>, sv_reset X<sv_reset>, SvROK X<SvROK>, SvROK_off X<SvROK_off>, SvROK_on X<SvROK_on>, SvRV X<SvRV>, SvRV_set X<SvRV_set>, sv_rvunweaken X<sv_rvunweaken>, 
sv_rvweaken X<sv_rvweaken>, sv_setiv X<sv_setiv>, sv_setiv_mg X<sv_setiv_mg>, sv_setnv X<sv_setnv>, sv_setnv_mg X<sv_setnv_mg>, sv_setpv X<sv_setpv>, sv_setpvf X<sv_setpvf>, sv_setpvf_mg X<sv_setpvf_mg>, sv_setpviv X<sv_setpviv>, sv_setpviv_mg X<sv_setpviv_mg>, sv_setpvn X<sv_setpvn>, sv_setpvn_mg X<sv_setpvn_mg>, sv_setpvs X<sv_setpvs>, sv_setpvs_mg X<sv_setpvs_mg>, sv_setpv_bufsize X<sv_setpv_bufsize>, sv_setpv_mg X<sv_setpv_mg>, sv_setref_iv X<sv_setref_iv>, sv_setref_nv X<sv_setref_nv>, sv_setref_pv X<sv_setref_pv>, sv_setref_pvn X<sv_setref_pvn>, sv_setref_pvs X<sv_setref_pvs>, sv_setref_uv X<sv_setref_uv>, sv_setsv X<sv_setsv>, sv_setsv_flags X<sv_setsv_flags>, sv_setsv_mg X<sv_setsv_mg>, sv_setsv_nomg X<sv_setsv_nomg>, sv_setuv X<sv_setuv>, sv_setuv_mg X<sv_setuv_mg>, sv_set_undef X<sv_set_undef>, SvSTASH X<SvSTASH>, SvSTASH_set X<SvSTASH_set>, SvTAINT X<SvTAINT>, SvTAINTED X<SvTAINTED>, sv_tainted X<sv_tainted>, SvTAINTED_off X<SvTAINTED_off>, SvTAINTED_on X<SvTAINTED_on>, SvTRUE X<SvTRUE>, sv_true X<sv_true>, SvTRUE_nomg X<SvTRUE_nomg>, SvTRUEx X<SvTRUEx>, SvTYPE X<SvTYPE>, sv_unmagic X<sv_unmagic>, sv_unmagicext X<sv_unmagicext>, sv_unref_flags X<sv_unref_flags>, sv_untaint X<sv_untaint>, SvUOK X<SvUOK>, SvUPGRADE X<SvUPGRADE>, sv_upgrade X<sv_upgrade>, sv_usepvn_flags X<sv_usepvn_flags>, SvUTF8 X<SvUTF8>, sv_utf8_decode X<sv_utf8_decode>, sv_utf8_downgrade X<sv_utf8_downgrade>, sv_utf8_downgrade_flags X<sv_utf8_downgrade_flags>, sv_utf8_downgrade_nomg X<sv_utf8_downgrade_nomg>, sv_utf8_encode X<sv_utf8_encode>, sv_utf8_upgrade X<sv_utf8_upgrade>, sv_utf8_upgrade_flags X<sv_utf8_upgrade_flags>, sv_utf8_upgrade_flags_grow X<sv_utf8_upgrade_flags_grow>, sv_utf8_upgrade_nomg X<sv_utf8_upgrade_nomg>, SvUTF8_off X<SvUTF8_off>, SvUTF8_on X<SvUTF8_on>, SvUV X<SvUV>, SvUV_nomg X<SvUV_nomg>, SvUV_set X<SvUV_set>, SvUVX X<SvUVX>, SvUVx X<SvUVx>, SvUVXx X<SvUVXx>, sv_vcatpvf X<sv_vcatpvf>, sv_vcatpvfn X<sv_vcatpvfn>, sv_vcatpvfn_flags X<sv_vcatpvfn_flags>, 
sv_vcatpvf_mg X<sv_vcatpvf_mg>, SvVOK X<SvVOK>, sv_vsetpvf X<sv_vsetpvf>, sv_vsetpvfn X<sv_vsetpvfn>, sv_vsetpvf_mg X<sv_vsetpvf_mg> =item Unicode Support BOM_UTF8 X<BOM_UTF8>, bytes_cmp_utf8 X<bytes_cmp_utf8>, bytes_from_utf8 X<bytes_from_utf8>, bytes_to_utf8 X<bytes_to_utf8>, DO_UTF8 X<DO_UTF8>, foldEQ_utf8 X<foldEQ_utf8>, is_ascii_string X<is_ascii_string>, is_c9strict_utf8_string X<is_c9strict_utf8_string>, is_c9strict_utf8_string_loc X<is_c9strict_utf8_string_loc>, is_c9strict_utf8_string_loclen X<is_c9strict_utf8_string_loclen>, isC9_STRICT_UTF8_CHAR X<isC9_STRICT_UTF8_CHAR>, is_invariant_string X<is_invariant_string>, isSTRICT_UTF8_CHAR X<isSTRICT_UTF8_CHAR>, is_strict_utf8_string X<is_strict_utf8_string>, is_strict_utf8_string_loc X<is_strict_utf8_string_loc>, is_strict_utf8_string_loclen X<is_strict_utf8_string_loclen>, is_utf8_fixed_width_buf_flags X<is_utf8_fixed_width_buf_flags>, is_utf8_fixed_width_buf_loclen_flags X<is_utf8_fixed_width_buf_loclen_flags>, is_utf8_fixed_width_buf_loc_flags X<is_utf8_fixed_width_buf_loc_flags>, is_utf8_invariant_string X<is_utf8_invariant_string>, is_utf8_invariant_string_loc X<is_utf8_invariant_string_loc>, is_utf8_string X<is_utf8_string>, is_utf8_string_flags X<is_utf8_string_flags>, is_utf8_string_loc X<is_utf8_string_loc>, is_utf8_string_loclen X<is_utf8_string_loclen>, is_utf8_string_loclen_flags X<is_utf8_string_loclen_flags>, is_utf8_string_loc_flags X<is_utf8_string_loc_flags>, is_utf8_valid_partial_char X<is_utf8_valid_partial_char>, is_utf8_valid_partial_char_flags X<is_utf8_valid_partial_char_flags>, isUTF8_CHAR X<isUTF8_CHAR>, isUTF8_CHAR_flags X<isUTF8_CHAR_flags>, LATIN1_TO_NATIVE X<LATIN1_TO_NATIVE>, NATIVE_TO_LATIN1 X<NATIVE_TO_LATIN1>, NATIVE_TO_UNI X<NATIVE_TO_UNI>, pv_uni_display X<pv_uni_display>, REPLACEMENT_CHARACTER_UTF8 X<REPLACEMENT_CHARACTER_UTF8>, sv_cat_decode X<sv_cat_decode>, sv_recode_to_utf8 X<sv_recode_to_utf8>, sv_uni_display X<sv_uni_display>, UNICODE_REPLACEMENT 
X<UNICODE_REPLACEMENT>, UNI_TO_NATIVE X<UNI_TO_NATIVE>, utf8n_to_uvchr X<utf8n_to_uvchr>, utf8n_to_uvchr_error X<utf8n_to_uvchr_error>, C<UTF8_GOT_PERL_EXTENDED>, C<UTF8_GOT_CONTINUATION>, C<UTF8_GOT_EMPTY>, C<UTF8_GOT_LONG>, C<UTF8_GOT_NONCHAR>, C<UTF8_GOT_NON_CONTINUATION>, C<UTF8_GOT_OVERFLOW>, C<UTF8_GOT_SHORT>, C<UTF8_GOT_SUPER>, C<UTF8_GOT_SURROGATE>, utf8n_to_uvchr_msgs X<utf8n_to_uvchr_msgs>, C<text>, C<warn_categories>, C<flag>, UTF8SKIP X<UTF8SKIP>, L</C<UTF8_SAFE_SKIP>> if you know the maximum ending pointer in the buffer pointed to by C<s>; or, L</C<UTF8_CHK_SKIP>> if you don't know it, UTF8_CHK_SKIP X<UTF8_CHK_SKIP>, utf8_distance X<utf8_distance>, utf8_hop X<utf8_hop>, utf8_hop_back X<utf8_hop_back>, utf8_hop_forward X<utf8_hop_forward>, utf8_hop_safe X<utf8_hop_safe>, UTF8_IS_INVARIANT X<UTF8_IS_INVARIANT>, UTF8_IS_NONCHAR X<UTF8_IS_NONCHAR>, UTF8_IS_SUPER X<UTF8_IS_SUPER>, UTF8_IS_SURROGATE X<UTF8_IS_SURROGATE>, utf8_length X<utf8_length>, UTF8_MAXBYTES X<UTF8_MAXBYTES>, UTF8_MAXBYTES_CASE X<UTF8_MAXBYTES_CASE>, UTF8_SAFE_SKIP X<UTF8_SAFE_SKIP>, UTF8_SKIP X<UTF8_SKIP>, utf8_to_bytes X<utf8_to_bytes>, utf8_to_uvchr_buf X<utf8_to_uvchr_buf>, UVCHR_IS_INVARIANT X<UVCHR_IS_INVARIANT>, UVCHR_SKIP X<UVCHR_SKIP>, uvchr_to_utf8 X<uvchr_to_utf8>, uvchr_to_utf8_flags X<uvchr_to_utf8_flags>, uvchr_to_utf8_flags_msgs X<uvchr_to_utf8_flags_msgs>, C<text>, C<warn_categories>, C<flag> =item Variables created by C<xsubpp> and C<xsubpp> internal functions newXSproto X<newXSproto>, XS_APIVERSION_BOOTCHECK X<XS_APIVERSION_BOOTCHECK>, XS_VERSION X<XS_VERSION>, XS_VERSION_BOOTCHECK X<XS_VERSION_BOOTCHECK> =item Warning and Dieing ckWARN X<ckWARN>, ckWARN2 X<ckWARN2>, ckWARN3 X<ckWARN3>, ckWARN4 X<ckWARN4>, ckWARN_d X<ckWARN_d>, ckWARN2_d X<ckWARN2_d>, ckWARN3_d X<ckWARN3_d>, ckWARN4_d X<ckWARN4_d>, CLEAR_ERRSV X<CLEAR_ERRSV>, croak X<croak>, croak_no_modify X<croak_no_modify>, croak_sv X<croak_sv>, die X<die>, die_sv X<die_sv>, ERRSV X<ERRSV>, my_setenv X<my_setenv>, 
rsignal X<rsignal>, SANE_ERRSV X<SANE_ERRSV>, vcroak X<vcroak>, vwarn X<vwarn>, warn X<warn>, warn_sv X<warn_sv> =item Undocumented functions CvDEPTH X<CvDEPTH>, CvGV X<CvGV>, GetVars X<GetVars>, Gv_AMupdate X<Gv_AMupdate>, PerlIO_close X<PerlIO_close>, PerlIO_context_layers X<PerlIO_context_layers>, PerlIO_error X<PerlIO_error>, PerlIO_fill X<PerlIO_fill>, PerlIO_flush X<PerlIO_flush>, PerlIO_get_bufsiz X<PerlIO_get_bufsiz>, PerlIO_get_ptr X<PerlIO_get_ptr>, PerlIO_read X<PerlIO_read>, PerlIO_seek X<PerlIO_seek>, PerlIO_set_cnt X<PerlIO_set_cnt>, PerlIO_setlinebuf X<PerlIO_setlinebuf>, PerlIO_stdout X<PerlIO_stdout>, PerlIO_unread X<PerlIO_unread>, SvAMAGIC_off X<SvAMAGIC_off>, SvAMAGIC_on X<SvAMAGIC_on>, amagic_call X<amagic_call>, amagic_deref_call X<amagic_deref_call>, any_dup X<any_dup>, atfork_lock X<atfork_lock>, atfork_unlock X<atfork_unlock>, av_arylen_p X<av_arylen_p>, av_iter_p X<av_iter_p>, block_gimme X<block_gimme>, call_atexit X<call_atexit>, call_list X<call_list>, calloc X<calloc>, cast_i32 X<cast_i32>, cast_iv X<cast_iv>, cast_ulong X<cast_ulong>, cast_uv X<cast_uv>, ck_warner X<ck_warner>, ck_warner_d X<ck_warner_d>, ckwarn X<ckwarn>, ckwarn_d X<ckwarn_d>, clear_defarray X<clear_defarray>, clone_params_del X<clone_params_del>, clone_params_new X<clone_params_new>, croak_nocontext X<croak_nocontext>, csighandler X<csighandler>, csighandler1 X<csighandler1>, csighandler3 X<csighandler3>, cx_dump X<cx_dump>, cx_dup X<cx_dup>, cxinc X<cxinc>, deb X<deb>, deb_nocontext X<deb_nocontext>, debop X<debop>, debprofdump X<debprofdump>, debstack X<debstack>, debstackptrs X<debstackptrs>, delimcpy X<delimcpy>, despatch_signals X<despatch_signals>, die_nocontext X<die_nocontext>, dirp_dup X<dirp_dup>, do_aspawn X<do_aspawn>, do_close X<do_close>, do_gv_dump X<do_gv_dump>, do_gvgv_dump X<do_gvgv_dump>, do_hv_dump X<do_hv_dump>, do_join X<do_join>, do_magic_dump X<do_magic_dump>, do_op_dump X<do_op_dump>, do_open X<do_open>, do_openn X<do_openn>, do_pmop_dump 
X<do_pmop_dump>, do_spawn X<do_spawn>, do_spawn_nowait X<do_spawn_nowait>, do_sprintf X<do_sprintf>, do_sv_dump X<do_sv_dump>, doing_taint X<doing_taint>, doref X<doref>, dounwind X<dounwind>, dowantarray X<dowantarray>, dump_eval X<dump_eval>, dump_form X<dump_form>, dump_indent X<dump_indent>, dump_mstats X<dump_mstats>, dump_sub X<dump_sub>, dump_vindent X<dump_vindent>, filter_del X<filter_del>, filter_read X<filter_read>, foldEQ_latin1 X<foldEQ_latin1>, form_nocontext X<form_nocontext>, fp_dup X<fp_dup>, free_global_struct X<free_global_struct>, free_tmps X<free_tmps>, get_context X<get_context>, get_mstats X<get_mstats>, get_op_descs X<get_op_descs>, get_op_names X<get_op_names>, get_ppaddr X<get_ppaddr>, get_vtbl X<get_vtbl>, gp_dup X<gp_dup>, gp_free X<gp_free>, gp_ref X<gp_ref>, gv_AVadd X<gv_AVadd>, gv_HVadd X<gv_HVadd>, gv_IOadd X<gv_IOadd>, gv_SVadd X<gv_SVadd>, gv_add_by_type X<gv_add_by_type>, gv_autoload4 X<gv_autoload4>, gv_autoload_pv X<gv_autoload_pv>, gv_autoload_pvn X<gv_autoload_pvn>, gv_autoload_sv X<gv_autoload_sv>, gv_check X<gv_check>, gv_dump X<gv_dump>, gv_efullname3 X<gv_efullname3>, gv_efullname4 X<gv_efullname4>, gv_fetchfile X<gv_fetchfile>, gv_fetchfile_flags X<gv_fetchfile_flags>, gv_fetchpv X<gv_fetchpv>, gv_fetchpvn_flags X<gv_fetchpvn_flags>, gv_fetchsv X<gv_fetchsv>, gv_fullname3 X<gv_fullname3>, gv_fullname4 X<gv_fullname4>, gv_handler X<gv_handler>, gv_name_set X<gv_name_set>, he_dup X<he_dup>, hek_dup X<hek_dup>, hv_common X<hv_common>, hv_common_key_len X<hv_common_key_len>, hv_delayfree_ent X<hv_delayfree_ent>, hv_eiter_p X<hv_eiter_p>, hv_eiter_set X<hv_eiter_set>, hv_free_ent X<hv_free_ent>, hv_ksplit X<hv_ksplit>, hv_name_set X<hv_name_set>, hv_placeholders_get X<hv_placeholders_get>, hv_placeholders_set X<hv_placeholders_set>, hv_rand_set X<hv_rand_set>, hv_riter_p X<hv_riter_p>, hv_riter_set X<hv_riter_set>, ibcmp_utf8 X<ibcmp_utf8>, init_global_struct X<init_global_struct>, init_stacks X<init_stacks>, init_tm 
X<init_tm>, is_lvalue_sub X<is_lvalue_sub>, leave_scope X<leave_scope>, load_module_nocontext X<load_module_nocontext>, magic_dump X<magic_dump>, markstack_grow X<markstack_grow>, mess_nocontext X<mess_nocontext>, mfree X<mfree>, mg_dup X<mg_dup>, mg_size X<mg_size>, mini_mktime X<mini_mktime>, moreswitches X<moreswitches>, mro_get_from_name X<mro_get_from_name>, mro_set_mro X<mro_set_mro>, mro_set_private_data X<mro_set_private_data>, my_atof X<my_atof>, my_chsize X<my_chsize>, my_cxt_index X<my_cxt_index>, my_cxt_init X<my_cxt_init>, my_dirfd X<my_dirfd>, my_failure_exit X<my_failure_exit>, my_fflush_all X<my_fflush_all>, my_fork X<my_fork>, my_lstat X<my_lstat>, my_pclose X<my_pclose>, my_popen X<my_popen>, my_popen_list X<my_popen_list>, my_socketpair X<my_socketpair>, my_stat X<my_stat>, my_strftime X<my_strftime>, newANONATTRSUB X<newANONATTRSUB>, newANONHASH X<newANONHASH>, newANONLIST X<newANONLIST>, newANONSUB X<newANONSUB>, newATTRSUB X<newATTRSUB>, newAVREF X<newAVREF>, newCVREF X<newCVREF>, newFORM X<newFORM>, newGVREF X<newGVREF>, newGVgen X<newGVgen>, newGVgen_flags X<newGVgen_flags>, newHVREF X<newHVREF>, newHVhv X<newHVhv>, newIO X<newIO>, newMYSUB X<newMYSUB>, newPROG X<newPROG>, newRV X<newRV>, newSUB X<newSUB>, newSVREF X<newSVREF>, newSVpvf_nocontext X<newSVpvf_nocontext>, newSVsv_flags X<newSVsv_flags>, new_stackinfo X<new_stackinfo>, op_refcnt_lock X<op_refcnt_lock>, op_refcnt_unlock X<op_refcnt_unlock>, parser_dup X<parser_dup>, perl_alloc_using X<perl_alloc_using>, perl_clone_using X<perl_clone_using>, perly_sighandler X<perly_sighandler>, pmop_dump X<pmop_dump>, pop_scope X<pop_scope>, pregcomp X<pregcomp>, pregexec X<pregexec>, pregfree X<pregfree>, pregfree2 X<pregfree2>, ptr_table_fetch X<ptr_table_fetch>, ptr_table_free X<ptr_table_free>, ptr_table_new X<ptr_table_new>, ptr_table_split X<ptr_table_split>, ptr_table_store X<ptr_table_store>, push_scope X<push_scope>, re_compile X<re_compile>, re_dup_guts X<re_dup_guts>, reentrant_free 
X<reentrant_free>, reentrant_init X<reentrant_init>, reentrant_retry X<reentrant_retry>, reentrant_size X<reentrant_size>, ref X<ref>, reg_named_buff_all X<reg_named_buff_all>, reg_named_buff_exists X<reg_named_buff_exists>, reg_named_buff_fetch X<reg_named_buff_fetch>, reg_named_buff_firstkey X<reg_named_buff_firstkey>, reg_named_buff_nextkey X<reg_named_buff_nextkey>, reg_named_buff_scalar X<reg_named_buff_scalar>, regdump X<regdump>, regdupe_internal X<regdupe_internal>, regexec_flags X<regexec_flags>, regfree_internal X<regfree_internal>, reginitcolors X<reginitcolors>, regnext X<regnext>, repeatcpy X<repeatcpy>, rsignal_state X<rsignal_state>, runops_debug X<runops_debug>, runops_standard X<runops_standard>, rvpv_dup X<rvpv_dup>, safesyscalloc X<safesyscalloc>, safesysfree X<safesysfree>, safesysmalloc X<safesysmalloc>, safesysrealloc X<safesysrealloc>, save_I16 X<save_I16>, save_I32 X<save_I32>, save_I8 X<save_I8>, save_adelete X<save_adelete>, save_aelem X<save_aelem>, save_aelem_flags X<save_aelem_flags>, save_alloc X<save_alloc>, save_ary X<save_ary>, save_bool X<save_bool>, save_clearsv X<save_clearsv>, save_delete X<save_delete>, save_destructor X<save_destructor>, save_destructor_x X<save_destructor_x>, save_freeop X<save_freeop>, save_freepv X<save_freepv>, save_freesv X<save_freesv>, save_generic_pvref X<save_generic_pvref>, save_generic_svref X<save_generic_svref>, save_hdelete X<save_hdelete>, save_helem X<save_helem>, save_helem_flags X<save_helem_flags>, save_hints X<save_hints>, save_hptr X<save_hptr>, save_int X<save_int>, save_item X<save_item>, save_iv X<save_iv>, save_mortalizesv X<save_mortalizesv>, save_op X<save_op>, save_padsv_and_mortalize X<save_padsv_and_mortalize>, save_pptr X<save_pptr>, save_pushi32ptr X<save_pushi32ptr>, save_pushptr X<save_pushptr>, save_pushptrptr X<save_pushptrptr>, save_re_context X<save_re_context>, save_set_svflags X<save_set_svflags>, save_shared_pvref X<save_shared_pvref>, save_sptr X<save_sptr>, save_svref 
X<save_svref>, save_vptr X<save_vptr>, savestack_grow X<savestack_grow>, savestack_grow_cnt X<savestack_grow_cnt>, scan_num X<scan_num>, scan_vstring X<scan_vstring>, seed X<seed>, set_context X<set_context>, share_hek X<share_hek>, si_dup X<si_dup>, ss_dup X<ss_dup>, stack_grow X<stack_grow>, start_subparse X<start_subparse>, str_to_version X<str_to_version>, sv_2iv X<sv_2iv>, sv_2pv X<sv_2pv>, sv_2pvbyte_flags X<sv_2pvbyte_flags>, sv_2pvutf8_flags X<sv_2pvutf8_flags>, sv_2uv X<sv_2uv>, sv_catpvf_mg_nocontext X<sv_catpvf_mg_nocontext>, sv_catpvf_nocontext X<sv_catpvf_nocontext>, sv_dup X<sv_dup>, sv_dup_inc X<sv_dup_inc>, sv_peek X<sv_peek>, sv_setpvf_mg_nocontext X<sv_setpvf_mg_nocontext>, sv_setpvf_nocontext X<sv_setpvf_nocontext>, sys_init X<sys_init>, sys_init3 X<sys_init3>, sys_intern_clear X<sys_intern_clear>, sys_intern_dup X<sys_intern_dup>, sys_intern_init X<sys_intern_init>, sys_term X<sys_term>, taint_env X<taint_env>, taint_proper X<taint_proper>, unlnk X<unlnk>, unsharepvn X<unsharepvn>, vdeb X<vdeb>, vform X<vform>, vload_module X<vload_module>, vnewSVpvf X<vnewSVpvf>, vwarner X<vwarner>, warn_nocontext X<warn_nocontext>, warner X<warner>, warner_nocontext X<warner_nocontext>, whichsig X<whichsig>, whichsig_pv X<whichsig_pv>, whichsig_pvn X<whichsig_pvn>, whichsig_sv X<whichsig_sv> =item AUTHORS =item SEE ALSO =back =head2 perlintern - autogenerated documentation of purely B<internal> Perl functions =over 4 =item DESCRIPTION X<internal Perl functions> X<interpreter functions> =item Array Manipulation Functions AvFILLp X<AvFILLp> =item Compile-time scope hooks BhkENTRY X<BhkENTRY>, BhkFLAGS X<BhkFLAGS>, CALL_BLOCK_HOOKS X<CALL_BLOCK_HOOKS> =item Custom Operators core_prototype X<core_prototype> =item CV Manipulation Functions docatch X<docatch> =item CV reference counts and CvOUTSIDE CvWEAKOUTSIDE X<CvWEAKOUTSIDE> =item Embedding Functions cv_dump X<cv_dump>, cv_forget_slab X<cv_forget_slab>, do_dump_pad X<do_dump_pad>, pad_alloc_name 
X<pad_alloc_name>, pad_block_start X<pad_block_start>, pad_check_dup X<pad_check_dup>, pad_findlex X<pad_findlex>, pad_fixup_inner_anons X<pad_fixup_inner_anons>, pad_free X<pad_free>, pad_leavemy X<pad_leavemy>, padlist_dup X<padlist_dup>, padname_dup X<padname_dup>, padnamelist_dup X<padnamelist_dup>, pad_push X<pad_push>, pad_reset X<pad_reset>, pad_swipe X<pad_swipe> =item Errno dSAVEDERRNO X<dSAVEDERRNO>, dSAVE_ERRNO X<dSAVE_ERRNO>, RESTORE_ERRNO X<RESTORE_ERRNO>, SAVE_ERRNO X<SAVE_ERRNO>, SETERRNO X<SETERRNO> =item GV Functions gv_try_downgrade X<gv_try_downgrade> =item Hash Manipulation Functions hv_ename_add X<hv_ename_add>, hv_ename_delete X<hv_ename_delete>, refcounted_he_chain_2hv X<refcounted_he_chain_2hv>, refcounted_he_fetch_pv X<refcounted_he_fetch_pv>, refcounted_he_fetch_pvn X<refcounted_he_fetch_pvn>, refcounted_he_fetch_pvs X<refcounted_he_fetch_pvs>, refcounted_he_fetch_sv X<refcounted_he_fetch_sv>, refcounted_he_free X<refcounted_he_free>, refcounted_he_inc X<refcounted_he_inc>, refcounted_he_new_pv X<refcounted_he_new_pv>, refcounted_he_new_pvn X<refcounted_he_new_pvn>, refcounted_he_new_pvs X<refcounted_he_new_pvs>, refcounted_he_new_sv X<refcounted_he_new_sv> =item IO Functions start_glob X<start_glob> =item Lexer interface validate_proto X<validate_proto> =item Magical Functions magic_clearhint X<magic_clearhint>, magic_clearhints X<magic_clearhints>, magic_methcall X<magic_methcall>, magic_sethint X<magic_sethint>, mg_localize X<mg_localize> =item Miscellaneous Functions free_c_backtrace X<free_c_backtrace>, get_c_backtrace X<get_c_backtrace>, quadmath_format_needed X<quadmath_format_needed>, quadmath_format_valid X<quadmath_format_valid> =item MRO Functions mro_get_linear_isa_dfs X<mro_get_linear_isa_dfs>, mro_isa_changed_in X<mro_isa_changed_in>, mro_package_moved X<mro_package_moved> =item Numeric functions grok_atoUV X<grok_atoUV>, isinfnansv X<isinfnansv> =item Obsolete backwards compatibility functions utf8n_to_uvuni 
X<utf8n_to_uvuni>, utf8_to_uvuni X<utf8_to_uvuni>, uvuni_to_utf8_flags X<uvuni_to_utf8_flags> =item Optree Manipulation Functions finalize_optree X<finalize_optree>, newATTRSUB_x X<newATTRSUB_x>, newXS_len_flags X<newXS_len_flags>, optimize_optree X<optimize_optree>, traverse_op_tree X<traverse_op_tree> =item Pad Data Structures CX_CURPAD_SAVE X<CX_CURPAD_SAVE>, CX_CURPAD_SV X<CX_CURPAD_SV>, PAD_BASE_SV X<PAD_BASE_SV>, PAD_CLONE_VARS X<PAD_CLONE_VARS>, PAD_COMPNAME_FLAGS X<PAD_COMPNAME_FLAGS>, PAD_COMPNAME_GEN X<PAD_COMPNAME_GEN>, PAD_COMPNAME_GEN_set X<PAD_COMPNAME_GEN_set>, PAD_COMPNAME_OURSTASH X<PAD_COMPNAME_OURSTASH>, PAD_COMPNAME_PV X<PAD_COMPNAME_PV>, PAD_COMPNAME_TYPE X<PAD_COMPNAME_TYPE>, PadnameIsOUR X<PadnameIsOUR>, PadnameIsSTATE X<PadnameIsSTATE>, PadnameOURSTASH X<PadnameOURSTASH>, PadnameOUTER X<PadnameOUTER>, PadnameTYPE X<PadnameTYPE>, PAD_RESTORE_LOCAL X<PAD_RESTORE_LOCAL>, PAD_SAVE_LOCAL X<PAD_SAVE_LOCAL>, PAD_SAVE_SETNULLPAD X<PAD_SAVE_SETNULLPAD>, PAD_SETSV X<PAD_SETSV>, PAD_SET_CUR X<PAD_SET_CUR>, PAD_SET_CUR_NOSAVE X<PAD_SET_CUR_NOSAVE>, PAD_SV X<PAD_SV>, PAD_SVl X<PAD_SVl>, SAVECLEARSV X<SAVECLEARSV>, SAVECOMPPAD X<SAVECOMPPAD>, SAVEPADSV X<SAVEPADSV> =item Per-Interpreter Variables PL_DBsingle X<PL_DBsingle>, PL_DBsub X<PL_DBsub>, PL_DBtrace X<PL_DBtrace>, PL_dowarn X<PL_dowarn>, PL_last_in_gv X<PL_last_in_gv>, PL_ofsgv X<PL_ofsgv>, PL_rs X<PL_rs> =item Stack Manipulation Macros djSP X<djSP>, LVRET X<LVRET> =item SV Flags SVt_INVLIST X<SVt_INVLIST> =item SV Manipulation Functions sv_2num X<sv_2num>, sv_add_arena X<sv_add_arena>, sv_clean_all X<sv_clean_all>, sv_clean_objs X<sv_clean_objs>, sv_free_arenas X<sv_free_arenas>, SvTHINKFIRST X<SvTHINKFIRST> =item Unicode Support find_uninit_var X<find_uninit_var>, isSCRIPT_RUN X<isSCRIPT_RUN>, is_utf8_non_invariant_string X<is_utf8_non_invariant_string>, report_uninit X<report_uninit>, utf8_to_uvuni_buf X<utf8_to_uvuni_buf>, uvoffuni_to_utf8_flags X<uvoffuni_to_utf8_flags>, valid_utf8_to_uvchr 
X<valid_utf8_to_uvchr>, variant_under_utf8_count X<variant_under_utf8_count> =item Undocumented functions ASCII_TO_NEED X<ASCII_TO_NEED>, NATIVE_TO_NEED X<NATIVE_TO_NEED>, POPMARK X<POPMARK>, PadnameIN_SCOPE X<PadnameIN_SCOPE>, PerlIO_restore_errno X<PerlIO_restore_errno>, PerlIO_save_errno X<PerlIO_save_errno>, PerlLIO_dup2_cloexec X<PerlLIO_dup2_cloexec>, PerlLIO_dup_cloexec X<PerlLIO_dup_cloexec>, PerlLIO_open3_cloexec X<PerlLIO_open3_cloexec>, PerlLIO_open_cloexec X<PerlLIO_open_cloexec>, PerlProc_pipe_cloexec X<PerlProc_pipe_cloexec>, PerlSock_accept_cloexec X<PerlSock_accept_cloexec>, PerlSock_socket_cloexec X<PerlSock_socket_cloexec>, PerlSock_socketpair_cloexec X<PerlSock_socketpair_cloexec>, ReANY X<ReANY>, Slab_Alloc X<Slab_Alloc>, Slab_Free X<Slab_Free>, Slab_to_ro X<Slab_to_ro>, Slab_to_rw X<Slab_to_rw>, TOPMARK X<TOPMARK>, _add_range_to_invlist X<_add_range_to_invlist>, _byte_dump_string X<_byte_dump_string>, _force_out_malformed_utf8_message X<_force_out_malformed_utf8_message>, _inverse_folds X<_inverse_folds>, _invlistEQ X<_invlistEQ>, _invlist_array_init X<_invlist_array_init>, _invlist_contains_cp X<_invlist_contains_cp>, _invlist_dump X<_invlist_dump>, _invlist_intersection X<_invlist_intersection>, _invlist_intersection_maybe_complement_2nd X<_invlist_intersection_maybe_complement_2nd>, _invlist_invert X<_invlist_invert>, _invlist_len X<_invlist_len>, _invlist_search X<_invlist_search>, _invlist_subtract X<_invlist_subtract>, _invlist_union X<_invlist_union>, _invlist_union_maybe_complement_2nd X<_invlist_union_maybe_complement_2nd>, _is_cur_LC_category_utf8 X<_is_cur_LC_category_utf8>, _is_in_locale_category X<_is_in_locale_category>, _is_uni_FOO X<_is_uni_FOO>, _is_uni_perl_idcont X<_is_uni_perl_idcont>, _is_uni_perl_idstart X<_is_uni_perl_idstart>, _is_utf8_FOO X<_is_utf8_FOO>, _is_utf8_perl_idcont X<_is_utf8_perl_idcont>, _is_utf8_perl_idstart X<_is_utf8_perl_idstart>, _mem_collxfrm X<_mem_collxfrm>, _new_invlist X<_new_invlist>, 
_new_invlist_C_array X<_new_invlist_C_array>, _setup_canned_invlist X<_setup_canned_invlist>, _to_fold_latin1 X<_to_fold_latin1>, _to_uni_fold_flags X<_to_uni_fold_flags>, _to_upper_title_latin1 X<_to_upper_title_latin1>, _to_utf8_fold_flags X<_to_utf8_fold_flags>, _to_utf8_lower_flags X<_to_utf8_lower_flags>, _to_utf8_title_flags X<_to_utf8_title_flags>, _to_utf8_upper_flags X<_to_utf8_upper_flags>, _utf8n_to_uvchr_msgs_helper X<_utf8n_to_uvchr_msgs_helper>, _warn_problematic_locale X<_warn_problematic_locale>, abort_execution X<abort_execution>, add_cp_to_invlist X<add_cp_to_invlist>, alloc_LOGOP X<alloc_LOGOP>, allocmy X<allocmy>, amagic_cmp X<amagic_cmp>, amagic_cmp_desc X<amagic_cmp_desc>, amagic_cmp_locale X<amagic_cmp_locale>, amagic_cmp_locale_desc X<amagic_cmp_locale_desc>, amagic_i_ncmp X<amagic_i_ncmp>, amagic_i_ncmp_desc X<amagic_i_ncmp_desc>, amagic_is_enabled X<amagic_is_enabled>, amagic_ncmp X<amagic_ncmp>, amagic_ncmp_desc X<amagic_ncmp_desc>, append_utf8_from_native_byte X<append_utf8_from_native_byte>, apply X<apply>, av_extend_guts X<av_extend_guts>, av_nonelem X<av_nonelem>, av_reify X<av_reify>, bind_match X<bind_match>, boot_core_PerlIO X<boot_core_PerlIO>, boot_core_UNIVERSAL X<boot_core_UNIVERSAL>, boot_core_mro X<boot_core_mro>, cando X<cando>, check_utf8_print X<check_utf8_print>, ck_anoncode X<ck_anoncode>, ck_backtick X<ck_backtick>, ck_bitop X<ck_bitop>, ck_cmp X<ck_cmp>, ck_concat X<ck_concat>, ck_defined X<ck_defined>, ck_delete X<ck_delete>, ck_each X<ck_each>, ck_entersub_args_core X<ck_entersub_args_core>, ck_eof X<ck_eof>, ck_eval X<ck_eval>, ck_exec X<ck_exec>, ck_exists X<ck_exists>, ck_ftst X<ck_ftst>, ck_fun X<ck_fun>, ck_glob X<ck_glob>, ck_grep X<ck_grep>, ck_index X<ck_index>, ck_isa X<ck_isa>, ck_join X<ck_join>, ck_length X<ck_length>, ck_lfun X<ck_lfun>, ck_listiob X<ck_listiob>, ck_match X<ck_match>, ck_method X<ck_method>, ck_null X<ck_null>, ck_open X<ck_open>, ck_prototype X<ck_prototype>, ck_readline X<ck_readline>, 
ck_refassign X<ck_refassign>, ck_repeat X<ck_repeat>, ck_require X<ck_require>, ck_return X<ck_return>, ck_rfun X<ck_rfun>, ck_rvconst X<ck_rvconst>, ck_sassign X<ck_sassign>, ck_select X<ck_select>, ck_shift X<ck_shift>, ck_smartmatch X<ck_smartmatch>, ck_sort X<ck_sort>, ck_spair X<ck_spair>, ck_split X<ck_split>, ck_stringify X<ck_stringify>, ck_subr X<ck_subr>, ck_substr X<ck_substr>, ck_svconst X<ck_svconst>, ck_tell X<ck_tell>, ck_trunc X<ck_trunc>, closest_cop X<closest_cop>, cmp_desc X<cmp_desc>, cmp_locale_desc X<cmp_locale_desc>, cmpchain_extend X<cmpchain_extend>, cmpchain_finish X<cmpchain_finish>, cmpchain_start X<cmpchain_start>, cntrl_to_mnemonic X<cntrl_to_mnemonic>, coresub_op X<coresub_op>, create_eval_scope X<create_eval_scope>, croak_caller X<croak_caller>, croak_memory_wrap X<croak_memory_wrap>, croak_no_mem X<croak_no_mem>, croak_popstack X<croak_popstack>, current_re_engine X<current_re_engine>, custom_op_get_field X<custom_op_get_field>, cv_ckproto_len_flags X<cv_ckproto_len_flags>, cv_clone_into X<cv_clone_into>, cv_const_sv_or_av X<cv_const_sv_or_av>, cv_undef_flags X<cv_undef_flags>, cvgv_from_hek X<cvgv_from_hek>, cvgv_set X<cvgv_set>, cvstash_set X<cvstash_set>, deb_stack_all X<deb_stack_all>, defelem_target X<defelem_target>, delete_eval_scope X<delete_eval_scope>, delimcpy_no_escape X<delimcpy_no_escape>, die_unwind X<die_unwind>, do_aexec X<do_aexec>, do_aexec5 X<do_aexec5>, do_eof X<do_eof>, do_exec X<do_exec>, do_exec3 X<do_exec3>, do_ipcctl X<do_ipcctl>, do_ipcget X<do_ipcget>, do_msgrcv X<do_msgrcv>, do_msgsnd X<do_msgsnd>, do_ncmp X<do_ncmp>, do_open6 X<do_open6>, do_open_raw X<do_open_raw>, do_print X<do_print>, do_readline X<do_readline>, do_seek X<do_seek>, do_semop X<do_semop>, do_shmio X<do_shmio>, do_sysseek X<do_sysseek>, do_tell X<do_tell>, do_trans X<do_trans>, do_uniprop_match X<do_uniprop_match>, do_vecget X<do_vecget>, do_vecset X<do_vecset>, do_vop X<do_vop>, does_utf8_overflow X<does_utf8_overflow>, dofile 
X<dofile>, drand48_init_r X<drand48_init_r>, drand48_r X<drand48_r>, dtrace_probe_call X<dtrace_probe_call>, dtrace_probe_load X<dtrace_probe_load>, dtrace_probe_op X<dtrace_probe_op>, dtrace_probe_phase X<dtrace_probe_phase>, dump_all_perl X<dump_all_perl>, dump_packsubs_perl X<dump_packsubs_perl>, dump_sub_perl X<dump_sub_perl>, dump_sv_child X<dump_sv_child>, dup_warnings X<dup_warnings>, emulate_cop_io X<emulate_cop_io>, find_first_differing_byte_pos X<find_first_differing_byte_pos>, find_lexical_cv X<find_lexical_cv>, find_runcv_where X<find_runcv_where>, find_script X<find_script>, foldEQ_latin1_s2_folded X<foldEQ_latin1_s2_folded>, foldEQ_utf8_flags X<foldEQ_utf8_flags>, form_alien_digit_msg X<form_alien_digit_msg>, form_cp_too_large_msg X<form_cp_too_large_msg>, free_tied_hv_pool X<free_tied_hv_pool>, get_and_check_backslash_N_name X<get_and_check_backslash_N_name>, get_db_sub X<get_db_sub>, get_debug_opts X<get_debug_opts>, get_deprecated_property_msg X<get_deprecated_property_msg>, get_hash_seed X<get_hash_seed>, get_invlist_iter_addr X<get_invlist_iter_addr>, get_invlist_offset_addr X<get_invlist_offset_addr>, get_invlist_previous_index_addr X<get_invlist_previous_index_addr>, get_no_modify X<get_no_modify>, get_opargs X<get_opargs>, get_prop_definition X<get_prop_definition>, get_prop_values X<get_prop_values>, get_re_arg X<get_re_arg>, get_re_gclass_nonbitmap_data X<get_re_gclass_nonbitmap_data>, get_regclass_nonbitmap_data X<get_regclass_nonbitmap_data>, get_regex_charset_name X<get_regex_charset_name>, getenv_len X<getenv_len>, grok_bin_oct_hex X<grok_bin_oct_hex>, grok_bslash_c X<grok_bslash_c>, grok_bslash_o X<grok_bslash_o>, grok_bslash_x X<grok_bslash_x>, gv_fetchmeth_internal X<gv_fetchmeth_internal>, gv_override X<gv_override>, gv_setref X<gv_setref>, gv_stashpvn_internal X<gv_stashpvn_internal>, gv_stashsvpvn_cached X<gv_stashsvpvn_cached>, hfree_next_entry X<hfree_next_entry>, hv_backreferences_p X<hv_backreferences_p>, hv_kill_backrefs 
X<hv_kill_backrefs>, hv_placeholders_p X<hv_placeholders_p>, hv_pushkv X<hv_pushkv>, hv_undef_flags X<hv_undef_flags>, init_argv_symbols X<init_argv_symbols>, init_constants X<init_constants>, init_dbargs X<init_dbargs>, init_debugger X<init_debugger>, init_i18nl10n X<init_i18nl10n>, init_i18nl14n X<init_i18nl14n>, init_named_cv X<init_named_cv>, init_uniprops X<init_uniprops>, invert X<invert>, invlist_array X<invlist_array>, invlist_clear X<invlist_clear>, invlist_clone X<invlist_clone>, invlist_contents X<invlist_contents>, invlist_extend X<invlist_extend>, invlist_highest X<invlist_highest>, invlist_is_iterating X<invlist_is_iterating>, invlist_iterfinish X<invlist_iterfinish>, invlist_iterinit X<invlist_iterinit>, invlist_iternext X<invlist_iternext>, invlist_lowest X<invlist_lowest>, invlist_max X<invlist_max>, invlist_previous_index X<invlist_previous_index>, invlist_set_len X<invlist_set_len>, invlist_set_previous_index X<invlist_set_previous_index>, invlist_trim X<invlist_trim>, invmap_dump X<invmap_dump>, io_close X<io_close>, isFF_OVERLONG X<isFF_OVERLONG>, isFOO_lc X<isFOO_lc>, is_grapheme X<is_grapheme>, is_invlist X<is_invlist>, is_utf8_char_helper X<is_utf8_char_helper>, is_utf8_common X<is_utf8_common>, is_utf8_overlong_given_start_byte_ok X<is_utf8_overlong_given_start_byte_ok>, jmaybe X<jmaybe>, keyword X<keyword>, keyword_plugin_standard X<keyword_plugin_standard>, list X<list>, load_charnames X<load_charnames>, localize X<localize>, lossless_NV_to_IV X<lossless_NV_to_IV>, magic_clear_all_env X<magic_clear_all_env>, magic_cleararylen_p X<magic_cleararylen_p>, magic_clearenv X<magic_clearenv>, magic_clearisa X<magic_clearisa>, magic_clearpack X<magic_clearpack>, magic_clearsig X<magic_clearsig>, magic_copycallchecker X<magic_copycallchecker>, magic_existspack X<magic_existspack>, magic_freearylen_p X<magic_freearylen_p>, magic_freeovrld X<magic_freeovrld>, magic_get X<magic_get>, magic_getarylen X<magic_getarylen>, magic_getdebugvar 
X<magic_getdebugvar>, magic_getdefelem X<magic_getdefelem>, magic_getnkeys X<magic_getnkeys>, magic_getpack X<magic_getpack>, magic_getpos X<magic_getpos>, magic_getsig X<magic_getsig>, magic_getsubstr X<magic_getsubstr>, magic_gettaint X<magic_gettaint>, magic_getuvar X<magic_getuvar>, magic_getvec X<magic_getvec>, magic_killbackrefs X<magic_killbackrefs>, magic_nextpack X<magic_nextpack>, magic_regdata_cnt X<magic_regdata_cnt>, magic_regdatum_get X<magic_regdatum_get>, magic_regdatum_set X<magic_regdatum_set>, magic_scalarpack X<magic_scalarpack>, magic_set X<magic_set>, magic_set_all_env X<magic_set_all_env>, magic_setarylen X<magic_setarylen>, magic_setcollxfrm X<magic_setcollxfrm>, magic_setdbline X<magic_setdbline>, magic_setdebugvar X<magic_setdebugvar>, magic_setdefelem X<magic_setdefelem>, magic_setenv X<magic_setenv>, magic_setisa X<magic_setisa>, magic_setlvref X<magic_setlvref>, magic_setmglob X<magic_setmglob>, magic_setnkeys X<magic_setnkeys>, magic_setnonelem X<magic_setnonelem>, magic_setpack X<magic_setpack>, magic_setpos X<magic_setpos>, magic_setregexp X<magic_setregexp>, magic_setsig X<magic_setsig>, magic_setsubstr X<magic_setsubstr>, magic_settaint X<magic_settaint>, magic_setutf8 X<magic_setutf8>, magic_setuvar X<magic_setuvar>, magic_setvec X<magic_setvec>, magic_sizepack X<magic_sizepack>, magic_wipepack X<magic_wipepack>, malloc_good_size X<malloc_good_size>, malloced_size X<malloced_size>, mem_collxfrm X<mem_collxfrm>, mem_log_alloc X<mem_log_alloc>, mem_log_free X<mem_log_free>, mem_log_realloc X<mem_log_realloc>, mg_find_mglob X<mg_find_mglob>, mode_from_discipline X<mode_from_discipline>, more_bodies X<more_bodies>, mortal_getenv X<mortal_getenv>, mro_meta_dup X<mro_meta_dup>, mro_meta_init X<mro_meta_init>, multiconcat_stringify X<multiconcat_stringify>, multideref_stringify X<multideref_stringify>, my_atof2 X<my_atof2>, my_atof3 X<my_atof3>, my_attrs X<my_attrs>, my_clearenv X<my_clearenv>, my_lstat_flags X<my_lstat_flags>, 
my_memrchr X<my_memrchr>, my_mkostemp X<my_mkostemp>, my_mkostemp_cloexec X<my_mkostemp_cloexec>, my_mkstemp X<my_mkstemp>, my_mkstemp_cloexec X<my_mkstemp_cloexec>, my_stat_flags X<my_stat_flags>, my_strerror X<my_strerror>, my_unexec X<my_unexec>, newGP X<newGP>, newMETHOP_internal X<newMETHOP_internal>, newSTUB X<newSTUB>, newSVavdefelem X<newSVavdefelem>, newXS_deffile X<newXS_deffile>, new_warnings_bitfield X<new_warnings_bitfield>, nextargv X<nextargv>, noperl_die X<noperl_die>, notify_parser_that_changed_to_utf8 X<notify_parser_that_changed_to_utf8>, oopsAV X<oopsAV>, oopsHV X<oopsHV>, op_clear X<op_clear>, op_integerize X<op_integerize>, op_lvalue_flags X<op_lvalue_flags>, op_refcnt_dec X<op_refcnt_dec>, op_refcnt_inc X<op_refcnt_inc>, op_relocate_sv X<op_relocate_sv>, op_std_init X<op_std_init>, op_unscope X<op_unscope>, opmethod_stash X<opmethod_stash>, opslab_force_free X<opslab_force_free>, opslab_free X<opslab_free>, opslab_free_nopad X<opslab_free_nopad>, package X<package>, package_version X<package_version>, pad_add_weakref X<pad_add_weakref>, padlist_store X<padlist_store>, padname_free X<padname_free>, padnamelist_free X<padnamelist_free>, parse_unicode_opts X<parse_unicode_opts>, parser_free X<parser_free>, parser_free_nexttoke_ops X<parser_free_nexttoke_ops>, path_is_searchable X<path_is_searchable>, peep X<peep>, pmruntime X<pmruntime>, populate_isa X<populate_isa>, ptr_hash X<ptr_hash>, qerror X<qerror>, re_exec_indentf X<re_exec_indentf>, re_indentf X<re_indentf>, re_intuit_start X<re_intuit_start>, re_intuit_string X<re_intuit_string>, re_op_compile X<re_op_compile>, re_printf X<re_printf>, reg_named_buff X<reg_named_buff>, reg_named_buff_iter X<reg_named_buff_iter>, reg_numbered_buff_fetch X<reg_numbered_buff_fetch>, reg_numbered_buff_length X<reg_numbered_buff_length>, reg_numbered_buff_store X<reg_numbered_buff_store>, reg_qr_package X<reg_qr_package>, reg_skipcomment X<reg_skipcomment>, reg_temp_copy X<reg_temp_copy>, regcurly 
X<regcurly>, regprop X<regprop>, report_evil_fh X<report_evil_fh>, report_redefined_cv X<report_redefined_cv>, report_wrongway_fh X<report_wrongway_fh>, rpeep X<rpeep>, rsignal_restore X<rsignal_restore>, rsignal_save X<rsignal_save>, rxres_save X<rxres_save>, same_dirent X<same_dirent>, save_strlen X<save_strlen>, save_to_buffer X<save_to_buffer>, sawparens X<sawparens>, scalar X<scalar>, scalarvoid X<scalarvoid>, scan_str X<scan_str>, scan_word X<scan_word>, set_caret_X X<set_caret_X>, set_numeric_standard X<set_numeric_standard>, set_numeric_underlying X<set_numeric_underlying>, set_padlist X<set_padlist>, setfd_cloexec X<setfd_cloexec>, setfd_cloexec_for_nonsysfd X<setfd_cloexec_for_nonsysfd>, setfd_cloexec_or_inhexec_by_sysfdness X<setfd_cloexec_or_inhexec_by_sysfdness>, setfd_inhexec X<setfd_inhexec>, setfd_inhexec_for_sysfd X<setfd_inhexec_for_sysfd>, should_warn_nl X<should_warn_nl>, should_we_output_Debug_r X<should_we_output_Debug_r>, sighandler X<sighandler>, sighandler1 X<sighandler1>, sighandler3 X<sighandler3>, skipspace_flags X<skipspace_flags>, softref2xv X<softref2xv>, sortsv_flags_impl X<sortsv_flags_impl>, sub_crush_depth X<sub_crush_depth>, sv_add_backref X<sv_add_backref>, sv_buf_to_ro X<sv_buf_to_ro>, sv_del_backref X<sv_del_backref>, sv_free2 X<sv_free2>, sv_i_ncmp X<sv_i_ncmp>, sv_i_ncmp_desc X<sv_i_ncmp_desc>, sv_kill_backrefs X<sv_kill_backrefs>, sv_len_utf8_nomg X<sv_len_utf8_nomg>, sv_magicext_mglob X<sv_magicext_mglob>, sv_ncmp X<sv_ncmp>, sv_ncmp_desc X<sv_ncmp_desc>, sv_only_taint_gmagic X<sv_only_taint_gmagic>, sv_or_pv_pos_u2b X<sv_or_pv_pos_u2b>, sv_resetpvn X<sv_resetpvn>, sv_sethek X<sv_sethek>, sv_setsv_cow X<sv_setsv_cow>, sv_unglob X<sv_unglob>, tied_method X<tied_method>, tmps_grow_p X<tmps_grow_p>, to_uni_fold X<to_uni_fold>, to_uni_lower X<to_uni_lower>, to_uni_title X<to_uni_title>, to_uni_upper X<to_uni_upper>, translate_substr_offsets X<translate_substr_offsets>, try_amagic_bin X<try_amagic_bin>, try_amagic_un 
X<try_amagic_un>, uiv_2buf X<uiv_2buf>, unshare_hek X<unshare_hek>, utf16_to_utf8 X<utf16_to_utf8>, utf16_to_utf8_reversed X<utf16_to_utf8_reversed>, utf8_to_uvchr_buf_helper X<utf8_to_uvchr_buf_helper>, utilize X<utilize>, uvoffuni_to_utf8_flags_msgs X<uvoffuni_to_utf8_flags_msgs>, uvuni_to_utf8 X<uvuni_to_utf8>, valid_utf8_to_uvuni X<valid_utf8_to_uvuni>, variant_byte_number X<variant_byte_number>, varname X<varname>, vivify_defelem X<vivify_defelem>, vivify_ref X<vivify_ref>, wait4pid X<wait4pid>, was_lvalue_sub X<was_lvalue_sub>, watch X<watch>, win32_croak_not_implemented X<win32_croak_not_implemented>, write_to_stderr X<write_to_stderr>, xs_boot_epilog X<xs_boot_epilog>, xs_handshake X<xs_handshake>, yyerror X<yyerror>, yyerror_pv X<yyerror_pv>, yyerror_pvn X<yyerror_pvn>, yylex X<yylex>, yyparse X<yyparse>, yyquit X<yyquit>, yyunlex X<yyunlex> =item AUTHORS =item SEE ALSO =back =head2 perliol - C API for Perl's implementation of IO in Layers. =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item History and Background =item Basic Structure =item Layers vs Disciplines =item Data Structures =item Functions and Attributes =item Per-instance Data =item Layers in action. 
=item Per-instance flag bits PERLIO_F_EOF, PERLIO_F_CANWRITE, PERLIO_F_CANREAD, PERLIO_F_ERROR, PERLIO_F_TRUNCATE, PERLIO_F_APPEND, PERLIO_F_CRLF, PERLIO_F_UTF8, PERLIO_F_UNBUF, PERLIO_F_WRBUF, PERLIO_F_RDBUF, PERLIO_F_LINEBUF, PERLIO_F_TEMP, PERLIO_F_OPEN, PERLIO_F_FASTGETS =item Methods in Detail fsize, name, size, kind, PERLIO_K_BUFFERED, PERLIO_K_RAW, PERLIO_K_CANCRLF, PERLIO_K_FASTGETS, PERLIO_K_MULTIARG, Pushed, Popped, Open, Binmode, Getarg, Fileno, Dup, Read, Write, Seek, Tell, Close, Flush, Fill, Eof, Error, Clearerr, Setlinebuf, Get_base, Get_bufsiz, Get_ptr, Get_cnt, Set_ptrcnt =item Utilities =item Implementing PerlIO Layers C implementations, Perl implementations =item Core Layers "unix", "perlio", "stdio", "crlf", "mmap", "pending", "raw", "utf8" =item Extension Layers ":encoding", ":scalar", ":via" =back =item TODO =back =head2 perlapio - perl's IO abstraction interface. =over 4 =item SYNOPSIS =item DESCRIPTION 1. USE_STDIO, 2. USE_PERLIO, B<PerlIO_stdin()>, B<PerlIO_stdout()>, B<PerlIO_stderr()>, B<PerlIO_open(path, mode)>, B<PerlIO_fdopen(fd,mode)>, B<PerlIO_reopen(path,mode,f)>, B<PerlIO_printf(f,fmt,...)>, B<PerlIO_vprintf(f,fmt,a)>, B<PerlIO_stdoutf(fmt,...)>, B<PerlIO_read(f,buf,count)>, B<PerlIO_write(f,buf,count)>, B<PerlIO_close(f)>, B<PerlIO_puts(f,s)>, B<PerlIO_putc(f,c)>, B<PerlIO_ungetc(f,c)>, B<PerlIO_getc(f)>, B<PerlIO_eof(f)>, B<PerlIO_error(f)>, B<PerlIO_fileno(f)>, B<PerlIO_clearerr(f)>, B<PerlIO_flush(f)>, B<PerlIO_seek(f,offset,whence)>, B<PerlIO_tell(f)>, B<PerlIO_getpos(f,p)>, B<PerlIO_setpos(f,p)>, B<PerlIO_rewind(f)>, B<PerlIO_tmpfile()>, B<PerlIO_setlinebuf(f)> =over 4 =item Co-existence with stdio B<PerlIO_importFILE(f,mode)>, B<PerlIO_exportFILE(f,mode)>, B<PerlIO_releaseFILE(p,f)>, B<PerlIO_findFILE(f)> =item "Fast gets" Functions B<PerlIO_fast_gets(f)>, B<PerlIO_has_cntptr(f)>, B<PerlIO_get_cnt(f)>, B<PerlIO_get_ptr(f)>, B<PerlIO_set_ptrcnt(f,p,c)>, B<PerlIO_canset_cnt(f)>, B<PerlIO_set_cnt(f,c)>, B<PerlIO_has_base(f)>, 
B<PerlIO_get_base(f)>, B<PerlIO_get_bufsiz(f)> =item Other Functions PerlIO_apply_layers(f,mode,layers), PerlIO_binmode(f,ptype,imode,layers), 'E<lt>' read, 'E<gt>' write, '+' read/write, PerlIO_debug(fmt,...) =back =back =head2 perlhack - How to hack on Perl =over 4 =item DESCRIPTION =item SUPER QUICK PATCH GUIDE Check out the source repository, Ensure you're following the latest advice, Create a branch for your change, Make your change, Test your change, Commit your change, Send your change to the Perl issue tracker, Thank you, Acknowledgement, Next time =item BUG REPORTING =item PERL 5 PORTERS =over 4 =item perl-changes mailing list =item #p5p on IRC =back =item GETTING THE PERL SOURCE =over 4 =item Read access via Git =item Read access via the web =item Read access via rsync =item Write access via git =back =item PATCHING PERL =over 4 =item Submitting patches =item Getting your patch accepted Why, What, How =item Patching a core module =item Updating perldelta =item What makes for a good patch? =back =item TESTING F<t/base>, F<t/comp> and F<t/opbasic>, All other subdirectories of F<t/>, Test files not found under F<t/> =over 4 =item Special C<make test> targets test_porting, minitest, test.valgrind check.valgrind, test_harness, test-notty test_notty =item Parallel tests =item Running tests by hand =item Using F<t/harness> for testing -v, -torture, -re=PATTERN, -re LIST OF PATTERNS, PERL_CORE=1, PERL_DESTRUCT_LEVEL=2, PERL, PERL_SKIP_TTY_TEST, PERL_TEST_Net_Ping, PERL_TEST_NOVREXX, PERL_TEST_NUMCONVERTS, PERL_TEST_MEMORY =item Performance testing =item Building perl at older commits =back =item MORE READING FOR GUTS HACKERS L<perlsource>, L<perlinterp>, L<perlhacktut>, L<perlhacktips>, L<perlguts>, L<perlxstut> and L<perlxs>, L<perlapi>, F<Porting/pumpkin.pod> =item CPAN TESTERS AND PERL SMOKERS =item WHAT NEXT? =over 4 =item "The Road goes ever on and on, down from the door where it began." 
=item Metaphoric Quotations =back =item AUTHOR =back =head2 perlsource - A guide to the Perl source tree =over 4 =item DESCRIPTION =item FINDING YOUR WAY AROUND =over 4 =item C code =item Core modules F<lib/>, F<ext/>, F<dist/>, F<cpan/> =item Tests Module tests, F<t/base/>, F<t/cmd/>, F<t/comp/>, F<t/io/>, F<t/mro/>, F<t/op/>, F<t/opbasic/>, F<t/re/>, F<t/run/>, F<t/uni/>, F<t/win32/>, F<t/porting/>, F<t/lib/> =item Documentation =item Hacking tools and documentation F<check*>, F<Maintainers>, F<Maintainers.pl>, and F<Maintainers.pm>, F<podtidy> =item Build system =item F<AUTHORS> =item F<MANIFEST> =back =back =head2 perlinterp - An overview of the Perl interpreter =over 4 =item DESCRIPTION =item ELEMENTS OF THE INTERPRETER =over 4 =item Startup =item Parsing =item Optimization =item Running =item Exception handling =item INTERNAL VARIABLE TYPES =back =item OP TREES =item STACKS =over 4 =item Argument stack =item Mark stack =item Save stack =back =item MILLIONS OF MACROS =item FURTHER READING =back =head2 perlhacktut - Walk through the creation of a simple C code patch =over 4 =item DESCRIPTION =item EXAMPLE OF A SIMPLE PATCH =over 4 =item Writing the patch =item Testing the patch =item Documenting the patch =item Submit =back =item AUTHOR =back =head2 perlhacktips - Tips for Perl core C code hacking =over 4 =item DESCRIPTION =item COMMON PROBLEMS =over 4 =item Perl environment problems =item Portability problems =item Problematic System Interfaces =item Security problems =back =item DEBUGGING =over 4 =item Poking at Perl =item Using a source-level debugger run [args], break function_name, break source.c:xxx, step, next, continue, finish, 'enter', ptype, print =item gdb macro support =item Dumping Perl Data Structures =item Using gdb to look at specific parts of a program =item Using gdb to look at what the parser/lexer are doing =back =item SOURCE CODE STATIC ANALYSIS =over 4 =item lint =item Coverity =item HP-UX cadvise (Code Advisor) =item cpd (cut-and-paste 
detector) =item gcc warnings =item Warnings of other C compilers =back =item MEMORY DEBUGGERS =over 4 =item valgrind =item AddressSanitizer -Dcc=clang, -Accflags=-fsanitize=address, -Aldflags=-fsanitize=address, -Alddlflags=-shared\ -fsanitize=address, -fsanitize-blacklist=`pwd`/asan_ignore =back =item PROFILING =over 4 =item Gprof Profiling -a, -b, -e routine, -f routine, -s, -z =item GCC gcov Profiling =back =item MISCELLANEOUS TRICKS =over 4 =item PERL_DESTRUCT_LEVEL =item PERL_MEM_LOG =item DDD over gdb =item C backtrace Linux, OS X, get_c_backtrace, free_c_backtrace, get_c_backtrace_dump, dump_c_backtrace =item Poison =item Read-only optrees =item When is a bool not a bool? =item The .i Targets =back =item AUTHOR =back =head2 perlpolicy - Various and sundry policies and commitments related to the Perl core =over 4 =item DESCRIPTION =item GOVERNANCE =over 4 =item Perl 5 Porters =back =item MAINTENANCE AND SUPPORT =item BACKWARD COMPATIBILITY AND DEPRECATION =over 4 =item Terminology experimental, deprecated, discouraged, removed =back =item MAINTENANCE BRANCHES =over 4 =item Getting changes into a maint branch =back =item CONTRIBUTED MODULES =over 4 =item A Social Contract about Artistic Control =back =item DOCUMENTATION =item STANDARDS OF CONDUCT =item CREDITS =back =head2 perlgov - Perl Rules of Governance =over 4 =item PREAMBLE =item Mandate =item Definitions "Core Team", "Steering Council", "Vote Administrator" =over 4 =item The Core Team =item The Steering Council =item The Vote Administrator =back =item Core Team Members Abhijit Menon-Sen (inactive), Andy Dougherty, Chad Granum, Chris 'BinGOs' Williams, Craig Berry, Dagfinn Ilmari Mannsåker, Dave Mitchell, David Golden, H. 
Merijn Brand, Hugo van der Sanden, James E Keenan, Jan Dubois (inactive), Jesse Vincent (inactive), Karen Etheridge, Karl Williamson, Leon Timmermans, Matthew Horsfall, Max Maischein, Nicholas Clark, Nicolas R, Paul "LeoNerd" Evans, Philippe "BooK" Bruhat, Ricardo Signes, Sawyer X, Steve Hay, Stuart Mackintosh, Todd Rinaldo, Tony Cook =back =head2 perlgit - Detailed information about git and the Perl repository =over 4 =item DESCRIPTION =item CLONING THE REPOSITORY =item WORKING WITH THE REPOSITORY =over 4 =item Finding out your status =item Patch workflow =item A note on derived files =item Cleaning a working directory =item Bisecting =item Topic branches and rewriting history =item Grafts =back =item WRITE ACCESS TO THE GIT REPOSITORY =over 4 =item Accepting a patch =item Committing to blead =item On merging and rebasing =item Committing to maintenance versions =item Using a smoke-me branch to test changes =back =back =head2 perlbook - Books about and related to Perl =over 4 =item DESCRIPTION =over 4 =item The most popular books I<Programming Perl> (the "Camel Book"):, I<The Perl Cookbook> (the "Ram Book"):, I<Learning Perl> (the "Llama Book"), I<Intermediate Perl> (the "Alpaca Book") =item References I<Perl 5 Pocket Reference>, I<Perl Debugger Pocket Reference>, I<Regular Expression Pocket Reference> =item Tutorials I<Beginning Perl>, I<Learning Perl> (the "Llama Book"), I<Intermediate Perl> (the "Alpaca Book"), I<Mastering Perl>, I<Effective Perl Programming> =item Task-Oriented I<Writing Perl Modules for CPAN>, I<The Perl Cookbook>, I<Automating System Administration with Perl>, I<Real World SQL Server Administration with Perl> =item Special Topics I<Regular Expressions Cookbook>, I<Programming the Perl DBI>, I<Perl Best Practices>, I<Higher-Order Perl>, I<Mastering Regular Expressions>, I<Network Programming with Perl>, I<Perl Template Toolkit>, I<Object Oriented Perl>, I<Data Munging with Perl>, I<Mastering Perl/Tk>, I<Extending and Embedding Perl>, I<Pro 
Perl Debugging> =item Free (as in beer) books =item Other interesting, non-Perl books I<Programming Pearls>, I<More Programming Pearls> =item A note on freshness =item Get your book listed =back =back =head2 perlcommunity - a brief overview of the Perl community =over 4 =item DESCRIPTION =over 4 =item Where to Find the Community =item Mailing Lists and Newsgroups =item IRC =item Websites L<https://perl.com/>, L<http://blogs.perl.org/>, L<http://perlsphere.net/>, L<http://perlweekly.com/>, L<https://www.perlmonks.org/>, L<https://stackoverflow.com/>, L<http://prepan.org/> =item User Groups =item Workshops =item Hackathons =item Conventions The Perl Conference, OSCON =item Calendar of Perl Events =back =item AUTHOR =back =head2 perldoc - Look up Perl documentation in Pod format. =over 4 =item SYNOPSIS =item DESCRIPTION =item OPTIONS B<-h>, B<-D>, B<-t>, B<-u>, B<-m> I<module>, B<-l>, B<-U>, B<-F>, B<-f> I<perlfunc>, B<-q> I<perlfaq-search-regexp>, B<-a> I<perlapifunc>, B<-v> I<perlvar>, B<-T>, B<-d> I<destination-filename>, B<-o> I<output-formatname>, B<-M> I<module-name>, B<-w> I<option:value> or B<-w> I<option>, B<-X>, B<-L> I<language_code>, B<PageName|ModuleName|ProgramName|URL>, B<-n> I<some-formatter>, B<-r>, B<-i>, B<-V> =item SECURITY =item ENVIRONMENT =item CHANGES =item SEE ALSO =item AUTHOR =back =head2 perlhist - the Perl history records =over 4 =item DESCRIPTION =item INTRODUCTION =item THE KEEPERS OF THE PUMPKIN =over 4 =item PUMPKIN? 
=back =item THE RECORDS =over 4 =item SELECTED RELEASE SIZES =item SELECTED PATCH SIZES =back =item THE KEEPERS OF THE RECORDS =back =head2 perldelta - what is new for perl v5.32.1 =over 4 =item DESCRIPTION =item Incompatible Changes =item Modules and Pragmata =over 4 =item Updated Modules and Pragmata =back =item Documentation =over 4 =item New Documentation =item Changes to Existing Documentation =back =item Diagnostics =over 4 =item Changes to Existing Diagnostics =back =item Configuration and Compilation =item Testing =item Platform Support =over 4 =item Platform-Specific Notes MacOS (Darwin), Minix =back =item Selected Bug Fixes =item Acknowledgements =item Reporting Bugs =item Give Thanks =item SEE ALSO =back =head2 perl5321delta, perldelta - what is new for perl v5.32.1 =over 4 =item DESCRIPTION =item Incompatible Changes =item Modules and Pragmata =over 4 =item Updated Modules and Pragmata =back =item Documentation =over 4 =item New Documentation =item Changes to Existing Documentation =back =item Diagnostics =over 4 =item Changes to Existing Diagnostics =back =item Configuration and Compilation =item Testing =item Platform Support =over 4 =item Platform-Specific Notes MacOS (Darwin), Minix =back =item Selected Bug Fixes =item Acknowledgements =item Reporting Bugs =item Give Thanks =item SEE ALSO =back =head2 perl5320delta - what is new for perl v5.32.0 =over 4 =item DESCRIPTION =item Core Enhancements =over 4 =item The isa Operator =item Unicode 13.0 is supported =item Chained comparisons capability =item New Unicode properties C<Identifier_Status> and C<Identifier_Type> supported =item It is now possible to write C<qr/\p{Name=...}/>, or C<qr!\p{na=/(SMILING|GRINNING) FACE/}!> =item Improvement of C<POSIX::mblen()>, C<mbtowc>, and C<wctomb> =item Alpha assertions are no longer experimental =item Script runs are no longer experimental =item Feature checks are now faster =item Perl is now developed on GitHub =item Compiled patterns can now be dumped before 
optimization =back =item Security =over 4 =item [CVE-2020-10543] Buffer overflow caused by a crafted regular expression =item [CVE-2020-10878] Integer overflow via malformed bytecode produced by a crafted regular expression =item [CVE-2020-12723] Buffer overflow caused by a crafted regular expression =item Additional Note =back =item Incompatible Changes =over 4 =item Certain pattern matching features are now prohibited in compiling Unicode property value wildcard subpatterns =item Unused functions C<POSIX::mbstowcs> and C<POSIX::wcstombs> are removed =item A bug fix for C<(?[...])> may have caused some patterns to no longer compile =item C<\p{I<user-defined>}> properties now always override official Unicode ones =item Modifiable variables are no longer permitted in constants =item Use of L<C<vec>|perlfunc/vec EXPR,OFFSET,BITS> on strings with code points above 0xFF is forbidden =item Use of code points over 0xFF in string bitwise operators =item C<Sys::Hostname::hostname()> does not accept arguments =item Plain "0" string now treated as a number for range operator =item C<\K> now disallowed in look-ahead and look-behind assertions =back =item Performance Enhancements =item Modules and Pragmata =over 4 =item Updated Modules and Pragmata =item Removed Modules and Pragmata =back =item Documentation =over 4 =item Changes to Existing Documentation C<caller>, C<__FILE__>, C<__LINE__>, C<return>, C<open> =back =item Diagnostics =over 4 =item New Diagnostics =item Changes to Existing Diagnostics =back =item Utility Changes =over 4 =item L<perlbug> The bug tracker homepage URL now points to GitHub =item L<streamzip> =back =item Configuration and Compilation =over 4 =item F<Configure> =back =item Testing =item Platform Support =over 4 =item Discontinued Platforms Windows CE =item Platform-Specific Notes Linux, NetBSD 8.0, Windows, Solaris, VMS, z/OS =back =item Internal Changes =item Selected Bug Fixes =item Obituary =item Acknowledgements =item Reporting Bugs =item Give 
Thanks =item SEE ALSO =back =head2 perl5303delta - what is new for perl v5.30.3 =over 4 =item DESCRIPTION =item Security =over 4 =item [CVE-2020-10543] Buffer overflow caused by a crafted regular expression =item [CVE-2020-10878] Integer overflow via malformed bytecode produced by a crafted regular expression =item [CVE-2020-12723] Buffer overflow caused by a crafted regular expression =item Additional Note =back =item Incompatible Changes =item Modules and Pragmata =over 4 =item Updated Modules and Pragmata =back =item Testing =item Acknowledgements =item Reporting Bugs =item Give Thanks =item SEE ALSO =back =head2 perl5302delta - what is new for perl v5.30.2 =over 4 =item DESCRIPTION =item Incompatible Changes =item Modules and Pragmata =over 4 =item Updated Modules and Pragmata =back =item Documentation =over 4 =item Changes to Existing Documentation =back =item Configuration and Compilation =item Testing =item Platform Support =over 4 =item Platform-Specific Notes Windows =back =item Selected Bug Fixes =item Acknowledgements =item Reporting Bugs =item Give Thanks =item SEE ALSO =back =head2 perl5301delta - what is new for perl v5.30.1 =over 4 =item DESCRIPTION =item Incompatible Changes =item Modules and Pragmata =over 4 =item Updated Modules and Pragmata =back =item Documentation =over 4 =item Changes to Existing Documentation =back =item Configuration and Compilation =item Testing =item Platform Support =over 4 =item Platform-Specific Notes Win32 =back =item Selected Bug Fixes =item Acknowledgements =item Reporting Bugs =item Give Thanks =item SEE ALSO =back =head2 perl5300delta - what is new for perl v5.30.0 =over 4 =item DESCRIPTION =item Notice =item Core Enhancements =over 4 =item Limited variable length lookbehind in regular expression pattern matching is now experimentally supported =item The upper limit C<"n"> specifiable in a regular expression quantifier of the form C<"{m,n}"> has been doubled to 65534 =item Unicode 12.1 is supported =item Wildcards 
in Unicode property value specifications are now partially supported =item qr'\N{name}' is now supported =item Turkic UTF-8 locales are now seamlessly supported =item It is now possible to compile perl to always use thread-safe locale operations. =item Eliminate opASSIGN macro usage from core =item C<-Drv> now means something on C<-DDEBUGGING> builds =back =item Incompatible Changes =over 4 =item Assigning non-zero to C<$[> is fatal =item Delimiters must now be graphemes =item Some formerly deprecated uses of an unescaped left brace C<"{"> in regular expression patterns are now illegal =item Previously deprecated sysread()/syswrite() on :utf8 handles is now fatal =item my() in false conditional prohibited =item Fatalize $* and $# =item Fatalize unqualified use of dump() =item Remove File::Glob::glob() =item C<pack()> no longer can return malformed UTF-8 =item Any set of digits in the Common script are legal in a script run of another script =item JSON::PP enables allow_nonref by default =back =item Deprecations =over 4 =item In XS code, use of various macros dealing with UTF-8. 
=back =item Performance Enhancements =item Modules and Pragmata =over 4 =item Updated Modules and Pragmata =item Removed Modules and Pragmata =back =item Documentation =over 4 =item Changes to Existing Documentation =back =item Diagnostics =over 4 =item Changes to Existing Diagnostics =back =item Utility Changes =over 4 =item L<xsubpp> =back =item Configuration and Compilation =item Testing =item Platform Support =over 4 =item Platform-Specific Notes HP-UX 11.11, Mac OS X, Minix3, Cygwin, Win32 Mingw, Windows =back =item Internal Changes =item Selected Bug Fixes =item Acknowledgements =item Reporting Bugs =item Give Thanks =item SEE ALSO =back =head2 perl5283delta - what is new for perl v5.28.3 =over 4 =item DESCRIPTION =item Security =over 4 =item [CVE-2020-10543] Buffer overflow caused by a crafted regular expression =item [CVE-2020-10878] Integer overflow via malformed bytecode produced by a crafted regular expression =item [CVE-2020-12723] Buffer overflow caused by a crafted regular expression =item Additional Note =back =item Incompatible Changes =item Modules and Pragmata =over 4 =item Updated Modules and Pragmata =back =item Testing =item Acknowledgements =item Reporting Bugs =item Give Thanks =item SEE ALSO =back =head2 perl5282delta - what is new for perl v5.28.2 =over 4 =item DESCRIPTION =item Incompatible Changes =over 4 =item Any set of digits in the Common script are legal in a script run of another script =back =item Modules and Pragmata =over 4 =item Updated Modules and Pragmata =back =item Platform Support =over 4 =item Platform-Specific Notes Windows, Mac OS X =back =item Selected Bug Fixes =item Acknowledgements =item Reporting Bugs =item Give Thanks =item SEE ALSO =back =head2 perl5281delta - what is new for perl v5.28.1 =over 4 =item DESCRIPTION =item Security =over 4 =item [CVE-2018-18311] Integer overflow leading to buffer overflow and segmentation fault =item [CVE-2018-18312] Heap-buffer-overflow write in S_regatom (regcomp.c) =back =item 
Incompatible Changes =item Modules and Pragmata =over 4 =item Updated Modules and Pragmata =back =item Selected Bug Fixes =item Acknowledgements =item Reporting Bugs =item Give Thanks =item SEE ALSO =back =head2 perl5280delta - what is new for perl v5.28.0 =over 4 =item DESCRIPTION =item Core Enhancements =over 4 =item Unicode 10.0 is supported =item L<C<delete>|perlfunc/delete EXPR> on key/value hash slices =item Experimentally, there are now alphabetic synonyms for some regular expression assertions =item Mixed Unicode scripts are now detectable =item In-place editing with C<perl -i> is now safer =item Initialisation of aggregate state variables =item Full-size inode numbers =item The C<sprintf> C<%j> format size modifier is now available with pre-C99 compilers =item Close-on-exec flag set atomically =item String- and number-specific bitwise ops are no longer experimental =item Locales are now thread-safe on systems that support them =item New read-only predefined variable C<${^SAFE_LOCALES}> =back =item Security =over 4 =item [CVE-2017-12837] Heap buffer overflow in regular expression compiler =item [CVE-2017-12883] Buffer over-read in regular expression parser =item [CVE-2017-12814] C<$ENV{$key}> stack buffer overflow on Windows =item Default Hash Function Change =back =item Incompatible Changes =over 4 =item Subroutine attribute and signature order =item Comma-less variable lists in formats are no longer allowed =item The C<:locked> and C<:unique> attributes have been removed =item C<\N{}> with nothing between the braces is now illegal =item Opening the same symbol as both a file and directory handle is no longer allowed =item Use of bare C<< << >> to mean C<< <<"" >> is no longer allowed =item Setting $/ to a reference to a non-positive integer no longer allowed =item Unicode code points with values exceeding C<IV_MAX> are now fatal =item The C<B::OP::terse> method has been removed =item Use of inherited AUTOLOAD for non-methods is no longer allowed =item Use 
of strings with code points over 0xFF is not allowed for bitwise string operators =item Setting C<${^ENCODING}> to a defined value is now illegal =item Backslash no longer escapes colon in PATH for the C<-S> switch =item the -DH (DEBUG_H) misfeature has been removed =item Yada-yada is now strictly a statement =item Sort algorithm can no longer be specified =item Over-radix digits in floating point literals =item Return type of C<unpackstring()> =back =item Deprecations =over 4 =item Use of L<C<vec>|perlfunc/vec EXPR,OFFSET,BITS> on strings with code points above 0xFF is deprecated =item Some uses of unescaped C<"{"> in regexes are no longer fatal =item Use of unescaped C<"{"> immediately after a C<"("> in regular expression patterns is deprecated =item Assignment to C<$[> will be fatal in Perl 5.30 =item hostname() won't accept arguments in Perl 5.32 =item Module removals B::Debug, L<Locale::Codes> and its associated Country, Currency and Language modules =back =item Performance Enhancements =item Modules and Pragmata =over 4 =item Removal of use vars =item Use of DynaLoader changed to XSLoader in many modules =item Updated Modules and Pragmata =item Removed Modules and Pragmata =back =item Documentation =over 4 =item Changes to Existing Documentation L<perldiag/Variable length lookbehind not implemented in regex mE<sol>%sE<sol>>, "Use of state $_ is experimental" in L<perldiag> =back =item Diagnostics =over 4 =item New Diagnostics =item Changes to Existing Diagnostics =back =item Utility Changes =over 4 =item L<perlbug> =back =item Configuration and Compilation C89 requirement, New probes, HAS_BUILTIN_ADD_OVERFLOW, HAS_BUILTIN_MUL_OVERFLOW, HAS_BUILTIN_SUB_OVERFLOW, HAS_THREAD_SAFE_NL_LANGINFO_L, HAS_LOCALECONV_L, HAS_MBRLEN, HAS_MBRTOWC, HAS_MEMRCHR, HAS_NANOSLEEP, HAS_STRNLEN, HAS_STRTOLD_L, I_WCHAR =item Testing =item Packaging =item Platform Support =over 4 =item Discontinued Platforms PowerUX / Power MAX OS =item Platform-Specific Notes CentOS, Cygwin, 
Darwin, FreeBSD, VMS, Windows =back =item Internal Changes =item Selected Bug Fixes =item Acknowledgements =item Reporting Bugs =item Give Thanks =item SEE ALSO =back =head2 perl5263delta - what is new for perl v5.26.3 =over 4 =item DESCRIPTION =item Security =over 4 =item [CVE-2018-12015] Directory traversal in module Archive::Tar =item [CVE-2018-18311] Integer overflow leading to buffer overflow and segmentation fault =item [CVE-2018-18312] Heap-buffer-overflow write in S_regatom (regcomp.c) =item [CVE-2018-18313] Heap-buffer-overflow read in S_grok_bslash_N (regcomp.c) =item [CVE-2018-18314] Heap-buffer-overflow write in S_regatom (regcomp.c) =back =item Incompatible Changes =item Modules and Pragmata =over 4 =item Updated Modules and Pragmata =back =item Diagnostics =over 4 =item New Diagnostics =item Changes to Existing Diagnostics =back =item Acknowledgements =item Reporting Bugs =item Give Thanks =item SEE ALSO =back =head2 perl5262delta - what is new for perl v5.26.2 =over 4 =item DESCRIPTION =item Security =over 4 =item [CVE-2018-6797] heap-buffer-overflow (WRITE of size 1) in S_regatom (regcomp.c) =item [CVE-2018-6798] Heap-buffer-overflow in Perl__byte_dump_string (utf8.c) =item [CVE-2018-6913] heap-buffer-overflow in S_pack_rec =item Assertion failure in Perl__core_swash_init (utf8.c) =back =item Incompatible Changes =item Modules and Pragmata =over 4 =item Updated Modules and Pragmata =back =item Documentation =over 4 =item Changes to Existing Documentation =back =item Platform Support =over 4 =item Platform-Specific Notes Windows =back =item Selected Bug Fixes =item Acknowledgements =item Reporting Bugs =item Give Thanks =item SEE ALSO =back =head2 perl5261delta - what is new for perl v5.26.1 =over 4 =item DESCRIPTION =item Security =over 4 =item [CVE-2017-12837] Heap buffer overflow in regular expression compiler =item [CVE-2017-12883] Buffer over-read in regular expression parser =item [CVE-2017-12814] C<$ENV{$key}> stack buffer overflow on Windows 
=back =item Incompatible Changes =item Modules and Pragmata =over 4 =item Updated Modules and Pragmata =back =item Platform Support =over 4 =item Platform-Specific Notes FreeBSD, Windows =back =item Selected Bug Fixes =item Acknowledgements =item Reporting Bugs =item Give Thanks =item SEE ALSO =back =head2 perl5260delta - what is new for perl v5.26.0 =over 4 =item DESCRIPTION =item Notice C<"."> no longer in C<@INC>, C<do> may now warn, In regular expression patterns, a literal left brace C<"{"> should be escaped =item Core Enhancements =over 4 =item Lexical subroutines are no longer experimental =item Indented Here-documents =item New regular expression modifier C</xx> =item C<@{^CAPTURE}>, C<%{^CAPTURE}>, and C<%{^CAPTURE_ALL}> =item Declaring a reference to a variable =item Unicode 9.0 is now supported =item Use of C<\p{I<script>}> uses the improved Script_Extensions property =item Perl can now do default collation in UTF-8 locales on platforms that support it =item Better locale collation of strings containing embedded C<NUL> characters =item C<CORE> subroutines for hash and array functions callable via reference =item New Hash Function For 64-bit Builds =back =item Security =over 4 =item Removal of the current directory (C<".">) from C<@INC> F<Configure -Udefault_inc_excludes_dot>, C<PERL_USE_UNSAFE_INC>, A new deprecation warning issued by C<do>, Script authors, Installing and using CPAN modules, Module Authors =item Escaped colons and relative paths in PATH =item New C<-Di> switch is now required for PerlIO debugging output =back =item Incompatible Changes =over 4 =item Unescaped literal C<"{"> characters in regular expression patterns are no longer permissible =item C<scalar(%hash)> return signature changed =item C<keys> returned from an lvalue subroutine =item The C<${^ENCODING}> facility has been removed =item C<POSIX::tmpnam()> has been removed =item require ::Foo::Bar is now illegal. 
=item Literal control character variable names are no longer permissible =item C<NBSP> is no longer permissible in C<\N{...}> =back =item Deprecations =over 4 =item String delimiters that aren't stand-alone graphemes are now deprecated =item C<\cI<X>> that maps to a printable is no longer deprecated =back =item Performance Enhancements New Faster Hash Function on 64 bit builds, readline is faster =item Modules and Pragmata =over 4 =item Updated Modules and Pragmata =back =item Documentation =over 4 =item New Documentation =item Changes to Existing Documentation =back =item Diagnostics =over 4 =item New Diagnostics =item Changes to Existing Diagnostics =back =item Utility Changes =over 4 =item F<c2ph> and F<pstruct> =item F<Porting/pod_lib.pl> =item F<Porting/sync-with-cpan> =item F<perf/benchmarks> =item F<Porting/checkAUTHORS.pl> =item F<t/porting/regen.t> =item F<utils/h2xs.PL> =item L<perlbug> =back =item Configuration and Compilation =item Testing =item Platform Support =over 4 =item New Platforms NetBSD/VAX =item Platform-Specific Notes Darwin, EBCDIC, HP-UX, Hurd, VAX, VMS, Windows, Linux, OpenBSD 6, FreeBSD, DragonFly BSD =back =item Internal Changes =item Selected Bug Fixes =item Known Problems =item Errata From Previous Releases =item Obituary =item Acknowledgements =item Reporting Bugs =item Give Thanks =item SEE ALSO =back =head2 perl5244delta - what is new for perl v5.24.4 =over 4 =item DESCRIPTION =item Security =over 4 =item [CVE-2018-6797] heap-buffer-overflow (WRITE of size 1) in S_regatom (regcomp.c) =item [CVE-2018-6798] Heap-buffer-overflow in Perl__byte_dump_string (utf8.c) =item [CVE-2018-6913] heap-buffer-overflow in S_pack_rec =item Assertion failure in Perl__core_swash_init (utf8.c) =back =item Incompatible Changes =item Modules and Pragmata =over 4 =item Updated Modules and Pragmata =back =item Selected Bug Fixes =item Acknowledgements =item Reporting Bugs =item SEE ALSO =back =head2 perl5243delta - what is new for perl v5.24.3 =over 4 
=item DESCRIPTION =item Security =over 4 =item [CVE-2017-12837] Heap buffer overflow in regular expression compiler =item [CVE-2017-12883] Buffer over-read in regular expression parser =item [CVE-2017-12814] C<$ENV{$key}> stack buffer overflow on Windows =back =item Incompatible Changes =item Modules and Pragmata =over 4 =item Updated Modules and Pragmata =back =item Configuration and Compilation =item Platform Support =over 4 =item Platform-Specific Notes VMS, Windows =back =item Selected Bug Fixes =item Acknowledgements =item Reporting Bugs =item SEE ALSO =back =head2 perl5242delta - what is new for perl v5.24.2 =over 4 =item DESCRIPTION =item Security =over 4 =item Improved handling of '.' in @INC in base.pm =item "Escaped" colons and relative paths in PATH =back =item Modules and Pragmata =over 4 =item Updated Modules and Pragmata =back =item Selected Bug Fixes =item Acknowledgements =item Reporting Bugs =item SEE ALSO =back =head2 perl5241delta - what is new for perl v5.24.1 =over 4 =item DESCRIPTION =item Security =over 4 =item B<-Di> switch is now required for PerlIO debugging output =item Core modules and tools no longer search F<"."> for optional modules =back =item Incompatible Changes =item Modules and Pragmata =over 4 =item Updated Modules and Pragmata =back =item Documentation =over 4 =item Changes to Existing Documentation =back =item Testing =item Selected Bug Fixes =item Acknowledgements =item Reporting Bugs =item SEE ALSO =back =head2 perl5240delta - what is new for perl v5.24.0 =over 4 =item DESCRIPTION =item Core Enhancements =over 4 =item Postfix dereferencing is no longer experimental =item Unicode 8.0 is now supported =item perl will now croak when closing an in-place output file fails =item New C<\b{lb}> boundary in regular expressions =item C<qr/(?[ ])/> now works in UTF-8 locales =item Integer shift (C<< << >> and C<< >> >>) now more explicitly defined =item printf and sprintf now allow reordered precision arguments =item More fields 
provided to C<sigaction> callback with C<SA_SIGINFO> =item Hashbang redirection to Perl 6 =back =item Security =over 4 =item Set proper umask before calling C<mkstemp(3)> =item Fix out of boundary access in Win32 path handling =item Fix loss of taint in canonpath =item Avoid accessing uninitialized memory in win32 C<crypt()> =item Remove duplicate environment variables from C<environ> =back =item Incompatible Changes =over 4 =item The C<autoderef> feature has been removed =item Lexical $_ has been removed =item C<qr/\b{wb}/> is now tailored to Perl expectations =item Regular expression compilation errors =item C<qr/\N{}/> now disallowed under C<use re "strict"> =item Nested declarations are now disallowed =item The C</\C/> character class has been removed. =item C<chdir('')> no longer chdirs home =item ASCII characters in variable names must now be all visible =item An off by one issue in C<$Carp::MaxArgNums> has been fixed =item Only blanks and tabs are now allowed within C<[...]> within C<(?[...])>. 
=back =item Deprecations =over 4 =item Using code points above the platform's C<IV_MAX> is now deprecated =item Doing bitwise operations on strings containing code points above 0xFF is deprecated =item C<sysread()>, C<syswrite()>, C<recv()> and C<send()> are deprecated on :utf8 handles =back =item Performance Enhancements =item Modules and Pragmata =over 4 =item Updated Modules and Pragmata =back =item Documentation =over 4 =item Changes to Existing Documentation =back =item Diagnostics =over 4 =item New Diagnostics =item Changes to Existing Diagnostics =back =item Configuration and Compilation =item Testing =item Platform Support =over 4 =item Platform-Specific Notes AmigaOS, Cygwin, EBCDIC, UTF-EBCDIC extended, EBCDIC C<cmp()> and C<sort()> fixed for UTF-EBCDIC strings, EBCDIC C<tr///> and C<y///> fixed for C<\N{}>, and C<S<use utf8>> ranges, FreeBSD, IRIX, MacOS X, Solaris, Tru64, VMS, Win32, ppc64el, floating point =back =item Internal Changes =item Selected Bug Fixes =item Acknowledgements =item Reporting Bugs =item SEE ALSO =back =head2 perl5224delta - what is new for perl v5.22.4 =over 4 =item DESCRIPTION =item Security =over 4 =item Improved handling of '.' 
in @INC in base.pm =item "Escaped" colons and relative paths in PATH =back =item Modules and Pragmata =over 4 =item Updated Modules and Pragmata =back =item Selected Bug Fixes =item Acknowledgements =item Reporting Bugs =item SEE ALSO =back =head2 perl5223delta - what is new for perl v5.22.3 =over 4 =item DESCRIPTION =item Security =over 4 =item B<-Di> switch is now required for PerlIO debugging output =item Core modules and tools no longer search F<"."> for optional modules =back =item Incompatible Changes =item Modules and Pragmata =over 4 =item Updated Modules and Pragmata =back =item Documentation =over 4 =item Changes to Existing Documentation =back =item Testing =item Selected Bug Fixes =item Acknowledgements =item Reporting Bugs =item SEE ALSO =back =head2 perl5222delta - what is new for perl v5.22.2 =over 4 =item DESCRIPTION =item Security =over 4 =item Fix out of boundary access in Win32 path handling =item Fix loss of taint in C<canonpath()> =item Set proper umask before calling C<mkstemp(3)> =item Avoid accessing uninitialized memory in Win32 C<crypt()> =item Remove duplicate environment variables from C<environ> =back =item Incompatible Changes =item Modules and Pragmata =over 4 =item Updated Modules and Pragmata =back =item Documentation =over 4 =item Changes to Existing Documentation =back =item Configuration and Compilation =item Platform Support =over 4 =item Platform-Specific Notes Darwin, OS X/Darwin, ppc64el, Tru64 =back =item Internal Changes =item Selected Bug Fixes =item Acknowledgements =item Reporting Bugs =item SEE ALSO =back =head2 perl5221delta - what is new for perl v5.22.1 =over 4 =item DESCRIPTION =item Incompatible Changes =over 4 =item Bounds Checking Constructs =back =item Modules and Pragmata =over 4 =item Updated Modules and Pragmata =back =item Documentation =over 4 =item Changes to Existing Documentation =back =item Diagnostics =over 4 =item Changes to Existing Diagnostics =back =item Configuration and Compilation =item Platform 
Support =over 4 =item Platform-Specific Notes IRIX =back =item Selected Bug Fixes =item Acknowledgements =item Reporting Bugs =item SEE ALSO =back =head2 perl5220delta - what is new for perl v5.22.0 =over 4 =item DESCRIPTION =item Core Enhancements =over 4 =item New bitwise operators =item New double-diamond operator =item New C<\b> boundaries in regular expressions =item Non-Capturing Regular Expression Flag =item C<use re 'strict'> =item Unicode 7.0 (with correction) is now supported =item S<C<use locale>> can restrict which locale categories are affected =item Perl now supports POSIX 2008 locale currency additions =item Better heuristics on older platforms for determining locale UTF-8ness =item Aliasing via reference =item C<prototype> with no arguments =item New C<:const> subroutine attribute =item C<fileno> now works on directory handles =item List form of pipe open implemented for Win32 =item Assignment to list repetition =item Infinity and NaN (not-a-number) handling improved =item Floating point parsing has been improved =item Packing infinity or not-a-number into a character is now fatal =item Experimental C Backtrace API =back =item Security =over 4 =item Perl is now compiled with C<-fstack-protector-strong> if available =item The L<Safe> module could allow outside packages to be replaced =item Perl is now always compiled with C<-D_FORTIFY_SOURCE=2> if available =back =item Incompatible Changes =over 4 =item Subroutine signatures moved before attributes =item C<&> and C<\&> prototypes accepts only subs =item C<use encoding> is now lexical =item List slices returning empty lists =item C<\N{}> with a sequence of multiple spaces is now a fatal error =item S<C<use UNIVERSAL '...'>> is now a fatal error =item In double-quotish C<\cI<X>>, I<X> must now be a printable ASCII character =item Splitting the tokens C<(?> and C<(*> in regular expressions is now a fatal compilation error. 
=item C<qr/foo/x> now ignores all Unicode pattern white space =item Comment lines within S<C<(?[ ])>> are now ended only by a C<\n> =item C<(?[...])> operators now follow standard Perl precedence =item Omitting C<%> and C<@> on hash and array names is no longer permitted =item C<"$!"> text is now in English outside the scope of C<use locale> =item C<"$!"> text will be returned in UTF-8 when appropriate =item Support for C<?PATTERN?> without explicit operator has been removed =item C<defined(@array)> and C<defined(%hash)> are now fatal errors =item Using a hash or an array as a reference are now fatal errors =item Changes to the C<*> prototype =back =item Deprecations =over 4 =item Setting C<${^ENCODING}> to anything but C<undef> =item Use of non-graphic characters in single-character variable names =item Inlining of C<sub () { $var }> with observable side-effects =item Use of multiple C</x> regexp modifiers =item Using a NO-BREAK space in a character alias for C<\N{...}> is now deprecated =item A literal C<"{"> should now be escaped in a pattern =item Making all warnings fatal is discouraged =back =item Performance Enhancements =item Modules and Pragmata =over 4 =item Updated Modules and Pragmata =item Removed Modules and Pragmata =back =item Documentation =over 4 =item New Documentation =item Changes to Existing Documentation =back =item Diagnostics =over 4 =item New Diagnostics =item Changes to Existing Diagnostics =item Diagnostic Removals =back =item Utility Changes =over 4 =item F<find2perl>, F<s2p> and F<a2p> removal =item L<h2ph> =item L<encguess> =back =item Configuration and Compilation =item Testing =item Platform Support =over 4 =item Regained Platforms IRIX and Tru64 platforms are working again, z/OS running EBCDIC Code Page 1047 =item Discontinued Platforms NeXTSTEP/OPENSTEP =item Platform-Specific Notes EBCDIC, HP-UX, Android, VMS, Win32, OpenBSD, Solaris =back =item Internal Changes =item Selected Bug Fixes =item Known Problems =item Obituary =item 
Acknowledgements =item Reporting Bugs =item SEE ALSO =back =head2 perl5203delta - what is new for perl v5.20.3 =over 4 =item DESCRIPTION =item Incompatible Changes =item Modules and Pragmata =over 4 =item Updated Modules and Pragmata =back =item Documentation =over 4 =item Changes to Existing Documentation =back =item Utility Changes =over 4 =item L<h2ph> =back =item Testing =item Platform Support =over 4 =item Platform-Specific Notes Win32 =back =item Selected Bug Fixes =item Acknowledgements =item Reporting Bugs =item SEE ALSO =back =head2 perl5202delta - what is new for perl v5.20.2 =over 4 =item DESCRIPTION =item Incompatible Changes =item Modules and Pragmata =over 4 =item Updated Modules and Pragmata =back =item Documentation =over 4 =item New Documentation =item Changes to Existing Documentation =back =item Diagnostics =over 4 =item Changes to Existing Diagnostics =back =item Testing =item Platform Support =over 4 =item Regained Platforms =back =item Selected Bug Fixes =item Known Problems =item Errata From Previous Releases =item Acknowledgements =item Reporting Bugs =item SEE ALSO =back =head2 perl5201delta - what is new for perl v5.20.1 =over 4 =item DESCRIPTION =item Incompatible Changes =item Performance Enhancements =item Modules and Pragmata =over 4 =item Updated Modules and Pragmata =back =item Documentation =over 4 =item Changes to Existing Documentation =back =item Diagnostics =over 4 =item Changes to Existing Diagnostics =back =item Configuration and Compilation =item Platform Support =over 4 =item Platform-Specific Notes Android, OpenBSD, Solaris, VMS, Windows =back =item Internal Changes =item Selected Bug Fixes =item Acknowledgements =item Reporting Bugs =item SEE ALSO =back =head2 perl5200delta - what is new for perl v5.20.0 =over 4 =item DESCRIPTION =item Core Enhancements =over 4 =item Experimental Subroutine signatures =item C<sub>s now take a C<prototype> attribute =item More consistent prototype parsing =item C<rand> now uses a consistent 
random number generator =item New slice syntax =item Experimental Postfix Dereferencing =item Unicode 6.3 now supported =item New C<\p{Unicode}> regular expression pattern property =item Better 64-bit support =item C<S<use locale>> now works on UTF-8 locales =item C<S<use locale>> now compiles on systems without locale ability =item More locale initialization fallback options =item C<-DL> runtime option now added for tracing locale setting =item B<-F> now implies B<-a> and B<-a> implies B<-n> =item $a and $b warnings exemption =back =item Security =over 4 =item Avoid possible read of free()d memory during parsing =back =item Incompatible Changes =over 4 =item C<do> can no longer be used to call subroutines =item Quote-like escape changes =item Tainting happens under more circumstances; now conforms to documentation =item C<\p{}>, C<\P{}> matching has changed for non-Unicode code points. =item C<\p{All}> has been expanded to match all possible code points =item Data::Dumper's output may change =item Locale decimal point character no longer leaks outside of S<C<use locale>> scope =item Assignments of Windows sockets error codes to $! 
now prefer F<errno.h> values over WSAGetLastError() values =item Functions C<PerlIO_vsprintf> and C<PerlIO_sprintf> have been removed =back =item Deprecations =over 4 =item The C</\C/> character class =item Literal control characters in variable names =item References to non-integers and non-positive integers in C<$/> =item Character matching routines in POSIX =item Interpreter-based threads are now I<discouraged> =item Module removals L<CGI> and its associated CGI:: packages, L<inc::latest>, L<Package::Constants>, L<Module::Build> and its associated Module::Build:: packages =item Utility removals L<find2perl>, L<s2p>, L<a2p> =back =item Performance Enhancements =item Modules and Pragmata =over 4 =item New Modules and Pragmata =item Updated Modules and Pragmata =back =item Documentation =over 4 =item New Documentation =item Changes to Existing Documentation =back =item Diagnostics =over 4 =item New Diagnostics =item Changes to Existing Diagnostics =back =item Utility Changes =item Configuration and Compilation =item Testing =item Platform Support =over 4 =item New Platforms Android, Bitrig, FreeMiNT, Synology =item Discontinued Platforms C<sfio>, AT&T 3b1, DG/UX, EBCDIC =item Platform-Specific Notes Cygwin, GNU/Hurd, Linux, Mac OS, MidnightBSD, Mixed-endian platforms, VMS, Win32, WinCE =back =item Internal Changes =item Selected Bug Fixes =over 4 =item Regular Expressions =item Perl 5 Debugger and -d =item Lexical Subroutines =item Everything Else =back =item Known Problems =item Obituary =item Acknowledgements =item Reporting Bugs =item SEE ALSO =back =head2 perl5184delta - what is new for perl v5.18.4 =over 4 =item DESCRIPTION =item Modules and Pragmata =over 4 =item Updated Modules and Pragmata =back =item Platform Support =over 4 =item Platform-Specific Notes Win32 =back =item Selected Bug Fixes =item Acknowledgements =item Reporting Bugs =item SEE ALSO =back =head2 perl5182delta - what is new for perl v5.18.2 =over 4 =item DESCRIPTION =item Modules and 
Pragmata =over 4 =item Updated Modules and Pragmata =back =item Documentation =over 4 =item Changes to Existing Documentation =back =item Selected Bug Fixes =item Acknowledgements =item Reporting Bugs =item SEE ALSO =back =head2 perl5181delta - what is new for perl v5.18.1 =over 4 =item DESCRIPTION =item Incompatible Changes =item Modules and Pragmata =over 4 =item Updated Modules and Pragmata =back =item Platform Support =over 4 =item Platform-Specific Notes AIX, MidnightBSD =back =item Selected Bug Fixes =item Acknowledgements =item Reporting Bugs =item SEE ALSO =back =head2 perl5180delta - what is new for perl v5.18.0 =over 4 =item DESCRIPTION =item Core Enhancements =over 4 =item New mechanism for experimental features =item Hash overhaul =item Upgrade to Unicode 6.2 =item Character name aliases may now include non-Latin1-range characters =item New DTrace probes =item C<${^LAST_FH}> =item Regular Expression Set Operations =item Lexical subroutines =item Computed Labels =item More CORE:: subs =item C<kill> with negative signal names =back =item Security =over 4 =item See also: hash overhaul =item C<Storable> security warning in documentation =item C<Locale::Maketext> allowed code injection via a malicious template =item Avoid calling memset with a negative count =back =item Incompatible Changes =over 4 =item See also: hash overhaul =item An unknown character name in C<\N{...}> is now a syntax error =item Formerly deprecated characters in C<\N{}> character name aliases are now errors. 
=item C<\N{BELL}> now refers to U+1F514 instead of U+0007 =item New Restrictions in Multi-Character Case-Insensitive Matching in Regular Expression Bracketed Character Classes =item Explicit rules for variable names and identifiers =item Vertical tabs are now whitespace =item C</(?{})/> and C</(??{})/> have been heavily reworked =item Stricter parsing of substitution replacement =item C<given> now aliases the global C<$_> =item The smartmatch family of features are now experimental =item Lexical C<$_> is now experimental =item readline() with C<$/ = \N> now reads N characters, not N bytes =item Overridden C<glob> is now passed one argument =item Here doc parsing =item Alphanumeric operators must now be separated from the closing delimiter of regular expressions =item qw(...) can no longer be used as parentheses =item Interaction of lexical and default warnings =item C<state sub> and C<our sub> =item Defined values stored in environment are forced to byte strings =item C<require> dies for unreadable files =item C<gv_fetchmeth_*> and SUPER =item C<split>'s first argument is more consistently interpreted =back =item Deprecations =over 4 =item Module removals L<encoding>, L<Archive::Extract>, L<B::Lint>, L<B::Lint::Debug>, L<CPANPLUS> and all included C<CPANPLUS::*> modules, L<Devel::InnerPackage>, L<Log::Message>, L<Log::Message::Config>, L<Log::Message::Handlers>, L<Log::Message::Item>, L<Log::Message::Simple>, L<Module::Pluggable>, L<Module::Pluggable::Object>, L<Object::Accessor>, L<Pod::LaTeX>, L<Term::UI>, L<Term::UI::History> =item Deprecated Utilities L<cpanp>, C<cpanp-run-perl>, L<cpan2dist>, L<pod2latex> =item PL_sv_objcount =item Five additional characters should be escaped in patterns with C</x> =item User-defined charnames with surprising whitespace =item Various XS-callable functions are now deprecated =item Certain rare uses of backslashes within regexes are now deprecated =item Splitting the tokens C<(?> and C<(*> in regular expressions =item Pre-PerlIO 
IO implementations =back =item Future Deprecations DG/UX, NeXT =item Performance Enhancements =item Modules and Pragmata =over 4 =item New Modules and Pragmata =item Updated Modules and Pragmata =item Removed Modules and Pragmata =back =item Documentation =over 4 =item Changes to Existing Documentation =item New Diagnostics =item Changes to Existing Diagnostics =back =item Utility Changes =item Configuration and Compilation =item Testing =item Platform Support =over 4 =item Discontinued Platforms BeOS, UTS Global, VM/ESA, MPE/IX, EPOC, Rhapsody =item Platform-Specific Notes =back =item Internal Changes =item Selected Bug Fixes =item Known Problems =item Obituary =item Acknowledgements =item Reporting Bugs =item SEE ALSO =back =head2 perl5163delta - what is new for perl v5.16.3 =over 4 =item DESCRIPTION =item Core Enhancements =item Security =over 4 =item CVE-2013-1667: memory exhaustion with arbitrary hash keys =item wrap-around with IO on long strings =item memory leak in Encode =back =item Incompatible Changes =item Deprecations =item Modules and Pragmata =over 4 =item Updated Modules and Pragmata =back =item Known Problems =item Acknowledgements =item Reporting Bugs =item SEE ALSO =back =head2 perl5162delta - what is new for perl v5.16.2 =over 4 =item DESCRIPTION =item Incompatible Changes =item Modules and Pragmata =over 4 =item Updated Modules and Pragmata =back =item Configuration and Compilation configuration should no longer be confused by ls colorization =item Platform Support =over 4 =item Platform-Specific Notes AIX =back =item Selected Bug Fixes fix /\h/ equivalence with /[\h]/ =item Known Problems =item Acknowledgements =item Reporting Bugs =item SEE ALSO =back =head2 perl5161delta - what is new for perl v5.16.1 =over 4 =item DESCRIPTION =item Security =over 4 =item an off-by-two error in Scalar-List-Util has been fixed =back =item Incompatible Changes =item Modules and Pragmata =over 4 =item Updated Modules and Pragmata =back =item Configuration and 
Compilation =item Platform Support =over 4 =item Platform-Specific Notes VMS =back =item Selected Bug Fixes =item Known Problems =item Acknowledgements =item Reporting Bugs =item SEE ALSO =back =head2 perl5160delta - what is new for perl v5.16.0 =over 4 =item DESCRIPTION =item Notice =item Core Enhancements =over 4 =item C<use I<VERSION>> =item C<__SUB__> =item New and Improved Built-ins =item Unicode Support =item XS Changes =item Changes to Special Variables =item Debugger Changes =item The C<CORE> Namespace =item Other Changes =back =item Security =over 4 =item Use C<is_utf8_char_buf()> and not C<is_utf8_char()> =item Malformed UTF-8 input could cause attempts to read beyond the end of the buffer =item C<File::Glob::bsd_glob()> memory error with GLOB_ALTDIRFUNC (CVE-2011-2728). =item Privileges are now set correctly when assigning to C<$(> =back =item Deprecations =over 4 =item Don't read the Unicode data base files in F<lib/unicore> =item XS functions C<is_utf8_char()>, C<utf8_to_uvchr()> and C<utf8_to_uvuni()> =back =item Future Deprecations =over 4 =item Core Modules =item Platforms with no supporting programmers =item Other Future Deprecations =back =item Incompatible Changes =over 4 =item Special blocks called in void context =item The C<overloading> pragma and regexp objects =item Two XS typemap Entries removed =item Unicode 6.1 has incompatibilities with Unicode 6.0 =item Borland compiler =item Certain deprecated Unicode properties are no longer supported by default =item Dereferencing IO thingies as typeglobs =item User-defined case-changing operations =item XSUBs are now 'static' =item Weakening read-only references =item Tying scalars that hold typeglobs =item IPC::Open3 no longer provides C<xfork()>, C<xclose_on_exec()> and C<xpipe_anon()> =item C<$$> no longer caches PID =item C<$$> and C<getppid()> no longer emulate POSIX semantics under LinuxThreads =item C<< $< >>, C<< $> >>, C<$(> and C<$)> are no longer cached =item Which Non-ASCII characters 
get quoted by C<quotemeta> and C<\Q> has changed =back =item Performance Enhancements =item Modules and Pragmata =over 4 =item Deprecated Modules L<Version::Requirements> =item New Modules and Pragmata =item Updated Modules and Pragmata =item Removed Modules and Pragmata =back =item Documentation =over 4 =item New Documentation =item Changes to Existing Documentation =item Removed Documentation =back =item Diagnostics =over 4 =item New Diagnostics =item Removed Errors =item Changes to Existing Diagnostics =back =item Utility Changes =item Configuration and Compilation =item Platform Support =over 4 =item Platform-Specific Notes =back =item Internal Changes =item Selected Bug Fixes =over 4 =item Array and hash =item C API fixes =item Compile-time hints =item Copy-on-write scalars =item The debugger =item Dereferencing operators =item Filehandle, last-accessed =item Filetests and C<stat> =item Formats =item C<given> and C<when> =item The C<glob> operator =item Lvalue subroutines =item Overloading =item Prototypes of built-in keywords =item Regular expressions =item Smartmatching =item The C<sort> operator =item The C<substr> operator =item Support for embedded nulls =item Threading bugs =item Tied variables =item Version objects and vstrings =item Warnings, redefinition =item Warnings, "Uninitialized" =item Weak references =item Other notable fixes =back =item Known Problems =item Acknowledgements =item Reporting Bugs =item SEE ALSO =back =head2 perl5144delta - what is new for perl v5.14.4 =over 4 =item DESCRIPTION =item Core Enhancements =item Security =over 4 =item CVE-2013-1667: memory exhaustion with arbitrary hash keys =item memory leak in Encode =item [perl #111594] Socket::unpack_sockaddr_un heap-buffer-overflow =item [perl #111586] SDBM_File: fix off-by-one access to global ".dir" =item off-by-two error in List::Util =item [perl #115994] fix segv in regcomp.c:S_join_exact() =item [perl #115992] PL_eval_start use-after-free =item wrap-around with IO on long 
strings =back =item Incompatible Changes =item Deprecations =item Modules and Pragmata =over 4 =item New Modules and Pragmata =item Updated Modules and Pragmata Socket, SDBM_File, List::Util =item Removed Modules and Pragmata =back =item Documentation =over 4 =item New Documentation =item Changes to Existing Documentation =back =item Diagnostics =item Utility Changes =item Configuration and Compilation =item Platform Support =over 4 =item New Platforms =item Discontinued Platforms =item Platform-Specific Notes VMS =back =item Selected Bug Fixes =item Known Problems =item Acknowledgements =item Reporting Bugs =item SEE ALSO =back =head2 perl5143delta - what is new for perl v5.14.3 =over 4 =item DESCRIPTION =item Core Enhancements =item Security =over 4 =item C<Digest> unsafe use of eval (CVE-2011-3597) =item Heap buffer overrun in 'x' string repeat operator (CVE-2012-5195) =back =item Incompatible Changes =item Deprecations =item Modules and Pragmata =over 4 =item New Modules and Pragmata =item Updated Modules and Pragmata =item Removed Modules and Pragmata =back =item Documentation =over 4 =item New Documentation =item Changes to Existing Documentation =back =item Configuration and Compilation =item Platform Support =over 4 =item New Platforms =item Discontinued Platforms =item Platform-Specific Notes FreeBSD, Solaris and NetBSD, HP-UX, Linux, Mac OS X, GNU/Hurd, NetBSD =back =item Bug Fixes =item Acknowledgements =item Reporting Bugs =item SEE ALSO =back =head2 perl5142delta - what is new for perl v5.14.2 =over 4 =item DESCRIPTION =item Core Enhancements =item Security =over 4 =item C<File::Glob::bsd_glob()> memory error with GLOB_ALTDIRFUNC (CVE-2011-2728). 
=item C<Encode> decode_xs n-byte heap-overflow (CVE-2011-2939) =back =item Incompatible Changes =item Deprecations =item Modules and Pragmata =over 4 =item New Modules and Pragmata =item Updated Modules and Pragmata =item Removed Modules and Pragmata =back =item Platform Support =over 4 =item New Platforms =item Discontinued Platforms =item Platform-Specific Notes HP-UX PA-RISC/64 now supports gcc-4.x, Building on OS X 10.7 Lion and Xcode 4 works again =back =item Bug Fixes =item Known Problems =item Acknowledgements =item Reporting Bugs =item SEE ALSO =back =head2 perl5141delta - what is new for perl v5.14.1 =over 4 =item DESCRIPTION =item Core Enhancements =item Security =item Incompatible Changes =item Deprecations =item Modules and Pragmata =over 4 =item New Modules and Pragmata =item Updated Modules and Pragmata =item Removed Modules and Pragmata =back =item Documentation =over 4 =item New Documentation =item Changes to Existing Documentation =back =item Diagnostics =over 4 =item New Diagnostics =item Changes to Existing Diagnostics =back =item Utility Changes =item Configuration and Compilation =item Testing =item Platform Support =over 4 =item New Platforms =item Discontinued Platforms =item Platform-Specific Notes =back =item Internal Changes =item Bug Fixes =item Acknowledgements =item Reporting Bugs =item SEE ALSO =back =head2 perl5140delta - what is new for perl v5.14.0 =over 4 =item DESCRIPTION =item Notice =item Core Enhancements =over 4 =item Unicode =item Regular Expressions =item Syntactical Enhancements =item Exception Handling =item Other Enhancements C<-d:-foo>, C<-d:-foo=bar> =item New C APIs =back =item Security =over 4 =item User-defined regular expression properties =back =item Incompatible Changes =over 4 =item Regular Expressions and String Escapes =item Stashes and Package Variables =item Changes to Syntax or to Perl Operators =item Threads and Processes =item Configuration =back =item Deprecations =over 4 =item Omitting a space between a 
regular expression and subsequent word =item C<\cI<X>> =item C<"\b{"> and C<"\B{"> =item Perl 4-era .pl libraries =item List assignment to C<$[> =item Use of qw(...) as parentheses =item C<\N{BELL}> =item C<?PATTERN?> =item Tie functions on scalars holding typeglobs =item User-defined case-mapping =item Deprecated modules L<Devel::DProf> =back =item Performance Enhancements =over 4 =item "Safe signals" optimisation =item Optimisation of shift() and pop() calls without arguments =item Optimisation of regexp engine string comparison work =item Regular expression compilation speed-up =item String appending is 100 times faster =item Eliminate C<PL_*> accessor functions under ithreads =item Freeing weak references =item Lexical array and hash assignments =item C<@_> uses less memory =item Size optimisations to SV and HV structures =item Memory consumption improvements to Exporter =item Memory savings for weak references =item C<%+> and C<%-> use less memory =item Multiple small improvements to threads =item Adjacent pairs of nextstate opcodes are now optimized away =back =item Modules and Pragmata =over 4 =item New Modules and Pragmata =item Updated Modules and Pragma much less configuration dialog hassle, support for F<META/MYMETA.json>, support for L<local::lib>, support for L<HTTP::Tiny> to reduce the dependency on FTP sites, automatic mirror selection, iron out all known bugs in configure_requires, support for distributions compressed with L<bzip2(1)>, allow F<Foo/Bar.pm> on the command line to mean C<Foo::Bar>, charinfo(), charscript(), charblock() =item Removed Modules and Pragmata =back =item Documentation =over 4 =item New Documentation =item Changes to Existing Documentation =back =item Diagnostics =over 4 =item New Diagnostics Closure prototype called, Insecure user-defined property %s, panic: gp_free failed to free glob pointer - something is repeatedly re-creating entries, Parsing code internal error (%s), refcnt: fd %d%s, Regexp modifier "/%c" may not 
appear twice, Regexp modifiers "/%c" and "/%c" are mutually exclusive, Using !~ with %s doesn't make sense, "\b{" is deprecated; use "\b\{" instead, "\B{" is deprecated; use "\B\{" instead, Operation "%s" returns its argument for .., Use of qw(...) as parentheses is deprecated =item Changes to Existing Diagnostics =back =item Utility Changes =item Configuration and Compilation =item Platform Support =over 4 =item New Platforms AIX =item Discontinued Platforms Apollo DomainOS, MacOS Classic =item Platform-Specific Notes =back =item Internal Changes =over 4 =item New APIs =item C API Changes =item Deprecated C APIs C<Perl_ptr_table_clear>, C<sv_compile_2op>, C<find_rundefsvoffset>, C<CALL_FPTR> and C<CPERLscope> =item Other Internal Changes =back =item Selected Bug Fixes =over 4 =item I/O =item Regular Expression Bug Fixes =item Syntax/Parsing Bugs =item Stashes, Globs and Method Lookup Aliasing packages by assigning to globs [perl #77358], Deleting packages by deleting their containing stash elements, Undefining the glob containing a package (C<undef *Foo::>), Undefining an ISA glob (C<undef *Foo::ISA>), Deleting an ISA stash element (C<delete $Foo::{ISA}>), Sharing @ISA arrays between classes (via C<*Foo::ISA = \@Bar::ISA> or C<*Foo::ISA = *Bar::ISA>) [perl #77238] =item Unicode =item Ties, Overloading and Other Magic =item The Debugger =item Threads =item Scoping and Subroutines =item Signals =item Miscellaneous Memory Leaks =item Memory Corruption and Crashes =item Fixes to Various Perl Operators =item Bugs Relating to the C API =back =item Known Problems =item Errata =over 4 =item keys(), values(), and each() work on arrays =item split() and C<@_> =back =item Obituary =item Acknowledgements =item Reporting Bugs =item SEE ALSO =back =head2 perl5125delta - what is new for perl v5.12.5 =over 4 =item DESCRIPTION =item Security =over 4 =item C<Encode> decode_xs n-byte heap-overflow (CVE-2011-2939) =item C<File::Glob::bsd_glob()> memory error with GLOB_ALTDIRFUNC 
(CVE-2011-2728). =item Heap buffer overrun in 'x' string repeat operator (CVE-2012-5195) =back =item Incompatible Changes =item Modules and Pragmata =over 4 =item Updated Modules =back =item Changes to Existing Documentation =over 4 =item L<perlebcdic> =item L<perlunicode> =item L<perluniprops> =back =item Installation and Configuration Improvements =over 4 =item Platform Specific Changes Mac OS X, NetBSD =back =item Selected Bug Fixes =item Errata =over 4 =item split() and C<@_> =back =item Acknowledgements =item Reporting Bugs =item SEE ALSO =back =head2 perl5124delta - what is new for perl v5.12.4 =over 4 =item DESCRIPTION =item Incompatible Changes =item Selected Bug Fixes =item Modules and Pragmata =item Testing =item Documentation =item Platform Specific Notes Linux =item Acknowledgements =item Reporting Bugs =item SEE ALSO =back =head2 perl5123delta - what is new for perl v5.12.3 =over 4 =item DESCRIPTION =item Incompatible Changes =item Core Enhancements =over 4 =item C<keys>, C<values> work on arrays =back =item Bug Fixes =item Platform Specific Notes Solaris, VMS, VOS =item Acknowledgements =item Reporting Bugs =item SEE ALSO =back =head2 perl5122delta - what is new for perl v5.12.2 =over 4 =item DESCRIPTION =item Incompatible Changes =item Core Enhancements =item Modules and Pragmata =over 4 =item New Modules and Pragmata =item Pragmata Changes =item Updated Modules C<Carp>, C<CPANPLUS>, C<File::Glob>, C<File::Copy>, C<File::Spec> =back =item Utility Changes =item Changes to Existing Documentation =item Installation and Configuration Improvements =over 4 =item Configuration improvements =item Compilation improvements =back =item Selected Bug Fixes =item Platform Specific Notes =over 4 =item AIX =item Windows =item VMS =back =item Acknowledgements =item Reporting Bugs =item SEE ALSO =back =head2 perl5121delta - what is new for perl v5.12.1 =over 4 =item DESCRIPTION =item Incompatible Changes =item Core Enhancements =item Modules and Pragmata =over 4 =item 
Pragmata Changes =item Updated Modules =back =item Changes to Existing Documentation =item Testing =over 4 =item Testing Improvements =back =item Installation and Configuration Improvements =over 4 =item Configuration improvements =back =item Bug Fixes =item Platform Specific Notes =over 4 =item HP-UX =item AIX =item FreeBSD 7 =item VMS =back =item Known Problems =item Acknowledgements =item Reporting Bugs =item SEE ALSO =back =head2 perl5120delta - what is new for perl v5.12.0 =over 4 =item DESCRIPTION =item Core Enhancements =over 4 =item New C<package NAME VERSION> syntax =item The C<...> operator =item Implicit strictures =item Unicode improvements =item Y2038 compliance =item qr overloading =item Pluggable keywords =item APIs for more internals =item Overridable function lookup =item A proper interface for pluggable Method Resolution Orders =item C<\N> experimental regex escape =item DTrace support =item Support for C<configure_requires> in CPAN module metadata =item C<each>, C<keys>, C<values> are now more flexible =item C<when> as a statement modifier =item C<$,> flexibility =item // in when clauses =item Enabling warnings from your shell environment =item C<delete local> =item New support for Abstract namespace sockets =item 32-bit limit on substr arguments removed =back =item Potentially Incompatible Changes =over 4 =item Deprecations warn by default =item Version number formats =item @INC reorganization =item REGEXPs are now first class =item Switch statement changes flip-flop operators, defined-or operator =item Smart match changes =item Other potentially incompatible changes =back =item Deprecations suidperl, Use of C<:=> to mean an empty attribute list, C<< UNIVERSAL->import() >>, Use of "goto" to jump into a construct, Custom character names in \N{name} that don't look like names, Deprecated Modules, L<Class::ISA>, L<Pod::Plainer>, L<Shell>, L<Switch>, Assignment to $[, Use of the attribute :locked on subroutines, Use of "locked" with the attributes 
pragma, Use of "unique" with the attributes pragma, Perl_pmflag, Numerous Perl 4-era libraries =item Unicode overhaul =item Modules and Pragmata =over 4 =item New Modules and Pragmata C<autodie>, C<Compress::Raw::Bzip2>, C<overloading>, C<parent>, C<Parse::CPAN::Meta>, C<VMS::DCLsym>, C<VMS::Stdio>, C<XS::APItest::KeywordRPN> =item Updated Pragmata C<base>, C<bignum>, C<charnames>, C<constant>, C<diagnostics>, C<feature>, C<less>, C<lib>, C<mro>, C<overload>, C<threads>, C<threads::shared>, C<version>, C<warnings> =item Updated Modules C<Archive::Extract>, C<Archive::Tar>, C<Attribute::Handlers>, C<AutoLoader>, C<B::Concise>, C<B::Debug>, C<B::Deparse>, C<B::Lint>, C<CGI>, C<Class::ISA>, C<Compress::Raw::Zlib>, C<CPAN>, C<CPANPLUS>, C<CPANPLUS::Dist::Build>, C<Data::Dumper>, C<DB_File>, C<Devel::PPPort>, C<Digest>, C<Digest::MD5>, C<Digest::SHA>, C<Encode>, C<Exporter>, C<ExtUtils::CBuilder>, C<ExtUtils::Command>, C<ExtUtils::Constant>, C<ExtUtils::Install>, C<ExtUtils::MakeMaker>, C<ExtUtils::Manifest>, C<ExtUtils::ParseXS>, C<File::Fetch>, C<File::Path>, C<File::Temp>, C<Filter::Simple>, C<Filter::Util::Call>, C<Getopt::Long>, C<IO>, C<IO::Zlib>, C<IPC::Cmd>, C<IPC::SysV>, C<Locale::Maketext>, C<Locale::Maketext::Simple>, C<Log::Message>, C<Log::Message::Simple>, C<Math::BigInt>, C<Math::BigInt::FastCalc>, C<Math::BigRat>, C<Math::Complex>, C<Memoize>, C<MIME::Base64>, C<Module::Build>, C<Module::CoreList>, C<Module::Load>, C<Module::Load::Conditional>, C<Module::Loaded>, C<Module::Pluggable>, C<Net::Ping>, C<NEXT>, C<Object::Accessor>, C<Package::Constants>, C<PerlIO>, C<Pod::Parser>, C<Pod::Perldoc>, C<Pod::Plainer>, C<Pod::Simple>, C<Safe>, C<SelfLoader>, C<Storable>, C<Switch>, C<Sys::Syslog>, C<Term::ANSIColor>, C<Term::UI>, C<Test>, C<Test::Harness>, C<Test::Simple>, C<Text::Balanced>, C<Text::ParseWords>, C<Text::Soundex>, C<Thread::Queue>, C<Thread::Semaphore>, C<Tie::RefHash>, C<Time::HiRes>, C<Time::Local>, C<Time::Piece>, C<Unicode::Collate>, 
C<Unicode::Normalize>, C<Win32>, C<Win32API::File>, C<XSLoader> =item Removed Modules and Pragmata C<attrs>, C<CPAN::API::HOWTO>, C<CPAN::DeferedCode>, C<CPANPLUS::inc>, C<DCLsym>, C<ExtUtils::MakeMaker::bytes>, C<ExtUtils::MakeMaker::vmsish>, C<Stdio>, C<Test::Harness::Assert>, C<Test::Harness::Iterator>, C<Test::Harness::Point>, C<Test::Harness::Results>, C<Test::Harness::Straps>, C<Test::Harness::Util>, C<XSSymSet> =item Deprecated Modules and Pragmata =back =item Documentation =over 4 =item New Documentation =item Changes to Existing Documentation =back =item Selected Performance Enhancements =item Installation and Configuration Improvements =item Internal Changes =item Testing =over 4 =item Testing improvements Parallel tests, Test harness flexibility, Test watchdog =item New Tests =back =item New or Changed Diagnostics =over 4 =item New Diagnostics =item Changed Diagnostics C<Illegal character in prototype for %s : %s>, C<Prototype after '%c' for %s : %s> =back =item Utility Changes =item Selected Bug Fixes =item Platform Specific Changes =over 4 =item New Platforms Haiku, MirOS BSD =item Discontinued Platforms Domain/OS, MiNT, Tenon MachTen =item Updated Platforms AIX, Cygwin, Darwin (Mac OS X), DragonFly BSD, FreeBSD, Irix, NetBSD, OpenVMS, Stratus VOS, Symbian, Windows =back =item Known Problems =item Errata =item Acknowledgements =item Reporting Bugs =item SEE ALSO =back =head2 perl5101delta - what is new for perl v5.10.1 =over 4 =item DESCRIPTION =item Incompatible Changes =over 4 =item Switch statement changes flip-flop operators, defined-or operator =item Smart match changes =item Other incompatible changes =back =item Core Enhancements =over 4 =item Unicode Character Database 5.1.0 =item A proper interface for pluggable Method Resolution Orders =item The C<overloading> pragma =item Parallel tests =item DTrace support =item Support for C<configure_requires> in CPAN module metadata =back =item Modules and Pragmata =over 4 =item New Modules and Pragmata 
C<autodie>, C<Compress::Raw::Bzip2>, C<parent>, C<Parse::CPAN::Meta> =item Pragmata Changes C<attributes>, C<attrs>, C<base>, C<bigint>, C<bignum>, C<bigrat>, C<charnames>, C<constant>, C<feature>, C<fields>, C<lib>, C<open>, C<overload>, C<overloading>, C<version> =item Updated Modules C<Archive::Extract>, C<Archive::Tar>, C<Attribute::Handlers>, C<AutoLoader>, C<AutoSplit>, C<B>, C<B::Debug>, C<B::Deparse>, C<B::Lint>, C<B::Xref>, C<Benchmark>, C<Carp>, C<CGI>, C<Compress::Zlib>, C<CPAN>, C<CPANPLUS>, C<CPANPLUS::Dist::Build>, C<Cwd>, C<Data::Dumper>, C<DB>, C<DB_File>, C<Devel::PPPort>, C<Digest::MD5>, C<Digest::SHA>, C<DirHandle>, C<Dumpvalue>, C<DynaLoader>, C<Encode>, C<Errno>, C<Exporter>, C<ExtUtils::CBuilder>, C<ExtUtils::Command>, C<ExtUtils::Constant>, C<ExtUtils::Embed>, C<ExtUtils::Install>, C<ExtUtils::MakeMaker>, C<ExtUtils::Manifest>, C<ExtUtils::ParseXS>, C<Fatal>, C<File::Basename>, C<File::Compare>, C<File::Copy>, C<File::Fetch>, C<File::Find>, C<File::Path>, C<File::Spec>, C<File::stat>, C<File::Temp>, C<FileCache>, C<FileHandle>, C<Filter::Simple>, C<Filter::Util::Call>, C<FindBin>, C<GDBM_File>, C<Getopt::Long>, C<Hash::Util::FieldHash>, C<I18N::Collate>, C<IO>, C<IO::Compress::*>, C<IO::Dir>, C<IO::Handle>, C<IO::Socket>, C<IO::Zlib>, C<IPC::Cmd>, C<IPC::Open3>, C<IPC::SysV>, C<lib>, C<List::Util>, C<Locale::Maketext>, C<Log::Message>, C<Math::BigFloat>, C<Math::BigInt>, C<Math::BigInt::FastCalc>, C<Math::BigRat>, C<Math::Complex>, C<Math::Trig>, C<Memoize>, C<Module::Build>, C<Module::CoreList>, C<Module::Load>, C<Module::Load::Conditional>, C<Module::Loaded>, C<Module::Pluggable>, C<NDBM_File>, C<Net::Ping>, C<NEXT>, C<Object::Accessor>, C<OS2::REXX>, C<Package::Constants>, C<PerlIO>, C<PerlIO::via>, C<Pod::Man>, C<Pod::Parser>, C<Pod::Simple>, C<Pod::Text>, C<POSIX>, C<Safe>, C<Scalar::Util>, C<SelectSaver>, C<SelfLoader>, C<Socket>, C<Storable>, C<Switch>, C<Symbol>, C<Sys::Syslog>, C<Term::ANSIColor>, C<Term::ReadLine>, C<Term::UI>, 
C<Test::Harness>, C<Test::Simple>, C<Text::ParseWords>, C<Text::Tabs>, C<Text::Wrap>, C<Thread::Queue>, C<Thread::Semaphore>, C<threads>, C<threads::shared>, C<Tie::RefHash>, C<Tie::StdHandle>, C<Time::HiRes>, C<Time::Local>, C<Time::Piece>, C<Unicode::Normalize>, C<Unicode::UCD>, C<UNIVERSAL>, C<Win32>, C<Win32API::File>, C<XSLoader> =back =item Utility Changes F<h2ph>, F<h2xs>, F<perl5db.pl>, F<perlthanks> =item New Documentation L<perlhaiku>, L<perlmroapi>, L<perlperf>, L<perlrepository>, L<perlthanks> =item Changes to Existing Documentation =item Performance Enhancements =item Installation and Configuration Improvements =over 4 =item F<ext/> reorganisation =item Configuration improvements =item Compilation improvements =item Platform Specific Changes AIX, Cygwin, FreeBSD, Irix, Haiku, MirOS BSD, NetBSD, Stratus VOS, Symbian, Win32, VMS =back =item Selected Bug Fixes =item New or Changed Diagnostics C<panic: sv_chop %s>, C<Can't locate package %s for the parents of %s>, C<v-string in use/require is non-portable>, C<Deep recursion on subroutine "%s"> =item Changed Internals C<SVf_UTF8>, C<SVs_TEMP> =item New Tests t/comp/retainedlines.t, t/io/perlio_fail.t, t/io/perlio_leaks.t, t/io/perlio_open.t, t/io/perlio.t, t/io/pvbm.t, t/mro/package_aliases.t, t/op/dbm.t, t/op/index_thr.t, t/op/pat_thr.t, t/op/qr_gc.t, t/op/reg_email_thr.t, t/op/regexp_qr_embed_thr.t, t/op/regexp_unicode_prop.t, t/op/regexp_unicode_prop_thr.t, t/op/reg_nc_tie.t, t/op/reg_posixcc.t, t/op/re.t, t/op/setpgrpstack.t, t/op/substr_thr.t, t/op/upgrade.t, t/uni/lex_utf8.t, t/uni/tie.t =item Known Problems =item Deprecations =item Acknowledgements =item Reporting Bugs =item SEE ALSO =back =head2 perl5100delta - what is new for perl 5.10.0 =over 4 =item DESCRIPTION =item Core Enhancements =over 4 =item The C<feature> pragma =item New B<-E> command-line switch =item Defined-or operator =item Switch and Smart Match operator =item Regular expressions Recursive Patterns, Named Capture Buffers, Possessive 
Quantifiers, Backtracking control verbs, Relative backreferences, C<\K> escape, Vertical and horizontal whitespace, and linebreak, Optional pre-match and post-match captures with the /p flag =item C<say()> =item Lexical C<$_> =item The C<_> prototype =item UNITCHECK blocks =item New Pragma, C<mro> =item readdir() may return a "short filename" on Windows =item readpipe() is now overridable =item Default argument for readline() =item state() variables =item Stacked filetest operators =item UNIVERSAL::DOES() =item Formats =item Byte-order modifiers for pack() and unpack() =item C<no VERSION> =item C<chdir>, C<chmod> and C<chown> on filehandles =item OS groups =item Recursive sort subs =item Exceptions in constant folding =item Source filters in @INC =item New internal variables C<${^RE_DEBUG_FLAGS}>, C<${^CHILD_ERROR_NATIVE}>, C<${^RE_TRIE_MAXBUF}>, C<${^WIN32_SLOPPY_STAT}> =item Miscellaneous =item UCD 5.0.0 =item MAD =item kill() on Windows =back =item Incompatible Changes =over 4 =item Packing and UTF-8 strings =item Byte/character count feature in unpack() =item The C<$*> and C<$#> variables have been removed =item substr() lvalues are no longer fixed-length =item Parsing of C<-f _> =item C<:unique> =item Effect of pragmas in eval =item chdir FOO =item Handling of .pmc files =item $^V is now a C<version> object instead of a v-string =item @- and @+ in patterns =item $AUTOLOAD can now be tainted =item Tainting and printf =item undef and signal handlers =item strictures and dereferencing in defined() =item C<(?p{})> has been removed =item Pseudo-hashes have been removed =item Removal of the bytecode compiler and of perlcc =item Removal of the JPL =item Recursive inheritance detected earlier =item warnings::enabled and warnings::warnif changed to favor users of modules =back =item Modules and Pragmata =over 4 =item Upgrading individual core modules =item Pragmata Changes C<feature>, C<mro>, Scoping of the C<sort> pragma, Scoping of C<bignum>, C<bigint>, C<bigrat>, 
C<base>, C<strict> and C<warnings>, C<version>, C<warnings>, C<less> =item New modules =item Selected Changes to Core Modules C<Attribute::Handlers>, C<B::Lint>, C<B>, C<Thread> =back =item Utility Changes perl -d, ptar, ptardiff, shasum, corelist, h2ph and h2xs, perlivp, find2perl, config_data, cpanp, cpan2dist, pod2html =item New Documentation =item Performance Enhancements =over 4 =item In-place sorting =item Lexical array access =item XS-assisted SWASHGET =item Constant subroutines =item C<PERL_DONT_CREATE_GVSV> =item Weak references are cheaper =item sort() enhancements =item Memory optimisations =item UTF-8 cache optimisation =item Sloppy stat on Windows =item Regular expressions optimisations Engine de-recursivised, Single char char-classes treated as literals, Trie optimisation of literal string alternations, Aho-Corasick start-point optimisation =back =item Installation and Configuration Improvements =over 4 =item Configuration improvements C<-Dusesitecustomize>, Relocatable installations, strlcat() and strlcpy(), C<d_pseudofork> and C<d_printf_format_null>, Configure help =item Compilation improvements Parallel build, Borland's compilers support, Static build on Windows, ppport.h files, C++ compatibility, Support for Microsoft 64-bit compiler, Visual C++, Win32 builds =item Installation improvements Module auxiliary files =item New Or Improved Platforms =back =item Selected Bug Fixes strictures in regexp-eval blocks, Calling CORE::require(), Subscripts of slices, C<no warnings 'category'> works correctly with -w, threads improvements, chr() and negative values, PERL5SHELL and tainting, Using *FILE{IO}, Overloading and reblessing, Overloading and UTF-8, eval memory leaks fixed, Random device on Windows, PERLIO_DEBUG, PerlIO::scalar and read-only scalars, study() and UTF-8, Critical signals, @INC-hook fix, C<-t> switch fix, Duping UTF-8 filehandles, Localisation of hash elements =item New or Changed Diagnostics Use of uninitialized value, Deprecated use of 
my() in false conditional, !=~ should be !~, Newline in left-justified string, Too late for "-T" option, "%s" variable %s masks earlier declaration, readdir()/closedir()/etc. attempted on invalid dirhandle, Opening dirhandle/filehandle %s also as a file/directory, Use of -P is deprecated, v-string in use/require is non-portable, perl -V =item Changed Internals =over 4 =item Reordering of SVt_* constants =item Elimination of SVt_PVBM =item New type SVt_BIND =item Removal of CPP symbols =item Less space is used by ops =item New parser =item Use of C<const> =item Mathoms =item C<AvFLAGS> has been removed =item C<av_*> changes =item $^H and %^H =item B:: modules inheritance changed =item Anonymous hash and array constructors =back =item Known Problems =over 4 =item UTF-8 problems =back =item Platform Specific Problems =item Reporting Bugs =item SEE ALSO =back =head2 perl589delta - what is new for perl v5.8.9 =over 4 =item DESCRIPTION =item Notice =item Incompatible Changes =item Core Enhancements =over 4 =item Unicode Character Database 5.1.0. 
=item stat and -X on directory handles =item Source filters in @INC =item Exceptions in constant folding =item C<no VERSION> =item Improved internal UTF-8 caching code =item Runtime relocatable installations =item New internal variables C<${^CHILD_ERROR_NATIVE}>, C<${^UTF8CACHE}> =item C<readpipe> is now overridable =item simple exception handling macros =item -D option enhancements =item XS-assisted SWASHGET =item Constant subroutines =back =item New Platforms =item Modules and Pragmata =over 4 =item New Modules =item Updated Modules =back =item Utility Changes =over 4 =item debugger upgraded to version 1.31 =item F<perlthanks> =item F<perlbug> =item F<h2xs> =item F<h2ph> =back =item New Documentation =item Changes to Existing Documentation =item Performance Enhancements =item Installation and Configuration Improvements =over 4 =item Relocatable installations =item Configuration improvements =item Compilation improvements =item Installation improvements. =item Platform Specific Changes =back =item Selected Bug Fixes =over 4 =item Unicode =item PerlIO =item Magic =item Reblessing overloaded objects now works =item C<strict> now propagates correctly into string evals =item Other fixes =item Platform Specific Fixes =item Smaller fixes =back =item New or Changed Diagnostics =over 4 =item panic: sv_chop %s =item Maximal count of pending signals (%s) exceeded =item panic: attempt to call %s in %s =item FETCHSIZE returned a negative value =item Can't upgrade %s (%d) to %d =item %s argument is not a HASH or ARRAY element or a subroutine =item Cannot make the non-overridable builtin %s fatal =item Unrecognized character '%s' in column %d =item Offset outside string =item Invalid escape in the specified encoding in regexp; marked by <-- HERE in m/%s/ =item Your machine doesn't support dump/undump. 
=back =item Changed Internals =over 4 =item Macro cleanups =back =item New Tests ext/DynaLoader/t/DynaLoader.t, t/comp/fold.t, t/io/pvbm.t, t/lib/proxy_constant_subs.t, t/op/attrhand.t, t/op/dbm.t, t/op/inccode-tie.t, t/op/incfilter.t, t/op/kill0.t, t/op/qrstack.t, t/op/qr.t, t/op/regexp_qr_embed.t, t/op/regexp_qr.t, t/op/rxcode.t, t/op/studytied.t, t/op/substT.t, t/op/symbolcache.t, t/op/upgrade.t, t/mro/package_aliases.t, t/pod/twice.t, t/run/cloexec.t, t/uni/cache.t, t/uni/chr.t, t/uni/greek.t, t/uni/latin2.t, t/uni/overload.t, t/uni/tie.t =item Known Problems =item Platform Specific Notes =over 4 =item Win32 =item OS/2 =item VMS =back =item Obituary =item Acknowledgements =item Reporting Bugs =item SEE ALSO =back =head2 perl588delta - what is new for perl v5.8.8 =over 4 =item DESCRIPTION =item Incompatible Changes =item Core Enhancements =item Modules and Pragmata =item Utility Changes =over 4 =item C<h2xs> enhancements =item C<perlivp> enhancements =back =item New Documentation =item Performance Enhancements =item Installation and Configuration Improvements =item Selected Bug Fixes =over 4 =item no warnings 'category' works correctly with -w =item Remove over-optimisation =item sprintf() fixes =item Debugger and Unicode slowdown =item Smaller fixes =back =item New or Changed Diagnostics =over 4 =item Attempt to set length of freed array =item Non-string passed as bitmask =item Search pattern not terminated or ternary operator parsed as search pattern =back =item Changed Internals =item Platform Specific Problems =item Reporting Bugs =item SEE ALSO =back =head2 perl587delta - what is new for perl v5.8.7 =over 4 =item DESCRIPTION =item Incompatible Changes =item Core Enhancements =over 4 =item Unicode Character Database 4.1.0 =item suidperl less insecure =item Optional site customization script =item C<Config.pm> is now much smaller. 
=back =item Modules and Pragmata =item Utility Changes =over 4 =item find2perl enhancements =back =item Performance Enhancements =item Installation and Configuration Improvements =item Selected Bug Fixes =item New or Changed Diagnostics =item Changed Internals =item Known Problems =item Platform Specific Problems =item Reporting Bugs =item SEE ALSO =back =head2 perl586delta - what is new for perl v5.8.6 =over 4 =item DESCRIPTION =item Incompatible Changes =item Core Enhancements =item Modules and Pragmata =item Utility Changes =item Performance Enhancements =item Selected Bug Fixes =item New or Changed Diagnostics =item Changed Internals =item New Tests =item Reporting Bugs =item SEE ALSO =back =head2 perl585delta - what is new for perl v5.8.5 =over 4 =item DESCRIPTION =item Incompatible Changes =item Core Enhancements =item Modules and Pragmata =item Utility Changes =over 4 =item Perl's debugger =item h2ph =back =item Installation and Configuration Improvements =item Selected Bug Fixes =item New or Changed Diagnostics =item Changed Internals =item Known Problems =item Platform Specific Problems =item Reporting Bugs =item SEE ALSO =back =head2 perl584delta - what is new for perl v5.8.4 =over 4 =item DESCRIPTION =item Incompatible Changes =item Core Enhancements =over 4 =item Malloc wrapping =item Unicode Character Database 4.0.1 =item suidperl less insecure =item format =back =item Modules and Pragmata =over 4 =item Updated modules Attribute::Handlers, B, Benchmark, CGI, Carp, Cwd, Exporter, File::Find, IO, IPC::Open3, Locale::Maketext, Math::BigFloat, Math::BigInt, Math::BigRat, MIME::Base64, ODBM_File, POSIX, Shell, Socket, Storable, Switch, Sys::Syslog, Term::ANSIColor, Time::HiRes, Unicode::UCD, Win32, base, open, threads, utf8 =back =item Performance Enhancements =item Utility Changes =item Installation and Configuration Improvements =item Selected Bug Fixes =item New or Changed Diagnostics =item Changed Internals =item Future Directions =item Platform Specific 
Problems =item Reporting Bugs =item SEE ALSO =back =head2 perl583delta - what is new for perl v5.8.3 =over 4 =item DESCRIPTION =item Incompatible Changes =item Core Enhancements =item Modules and Pragmata CGI, Cwd, Digest, Digest::MD5, Encode, File::Spec, FindBin, List::Util, Math::BigInt, PodParser, Pod::Perldoc, POSIX, Unicode::Collate, Unicode::Normalize, Test::Harness, threads::shared =item Utility Changes =item New Documentation =item Installation and Configuration Improvements =item Selected Bug Fixes =item New or Changed Diagnostics =item Changed Internals =item Configuration and Building =item Platform Specific Problems =item Known Problems =item Future Directions =item Obituary =item Reporting Bugs =item SEE ALSO =back =head2 perl582delta - what is new for perl v5.8.2 =over 4 =item DESCRIPTION =item Incompatible Changes =item Core Enhancements =over 4 =item Hash Randomisation =item Threading =back =item Modules and Pragmata =over 4 =item Updated Modules And Pragmata Devel::PPPort, Digest::MD5, I18N::LangTags, libnet, MIME::Base64, Pod::Perldoc, strict, Tie::Hash, Time::HiRes, Unicode::Collate, Unicode::Normalize, UNIVERSAL =back =item Selected Bug Fixes =item Changed Internals =item Platform Specific Problems =item Future Directions =item Reporting Bugs =item SEE ALSO =back =head2 perl581delta - what is new for perl v5.8.1 =over 4 =item DESCRIPTION =item Incompatible Changes =over 4 =item Hash Randomisation =item UTF-8 On Filehandles No Longer Activated By Locale =item Single-number v-strings are no longer v-strings before "=>" =item (Win32) The -C Switch Has Been Repurposed =item (Win32) The /d Switch Of cmd.exe =back =item Core Enhancements =over 4 =item UTF-8 no longer default under UTF-8 locales =item Unsafe signals again available =item Tied Arrays with Negative Array Indices =item local ${$x} =item Unicode Character Database 4.0.0 =item Deprecation Warnings =item Miscellaneous Enhancements =back =item Modules and Pragmata =over 4 =item Updated 
Modules And Pragmata base, B::Bytecode, B::Concise, B::Deparse, Benchmark, ByteLoader, bytes, CGI, charnames, CPAN, Data::Dumper, DB_File, Devel::PPPort, Digest::MD5, Encode, fields, libnet, Math::BigInt, MIME::Base64, NEXT, Net::Ping, PerlIO::scalar, podlators, Pod::LaTeX, PodParsers, Pod::Perldoc, Scalar::Util, Storable, strict, Term::ANSIColor, Test::Harness, Test::More, Test::Simple, Text::Balanced, Time::HiRes, threads, threads::shared, Unicode::Collate, Unicode::Normalize, Win32::GetFolderPath, Win32::GetOSVersion =back =item Utility Changes =item New Documentation =item Installation and Configuration Improvements =over 4 =item Platform-specific enhancements =back =item Selected Bug Fixes =over 4 =item Closures, eval and lexicals =item Generic fixes =item Platform-specific fixes =back =item New or Changed Diagnostics =over 4 =item Changed "A thread exited while %d threads were running" =item Removed "Attempt to clear a restricted hash" =item New "Illegal declaration of anonymous subroutine" =item Changed "Invalid range "%s" in transliteration operator" =item New "Missing control char name in \c" =item New "Newline in left-justified string for %s" =item New "Possible precedence problem on bitwise %c operator" =item New "Pseudo-hashes are deprecated" =item New "read() on %s filehandle %s" =item New "5.005 threads are deprecated" =item New "Tied variable freed while still in use" =item New "To%s: illegal mapping '%s'" =item New "Use of freed value in iteration" =back =item Changed Internals =item New Tests =item Known Problems =over 4 =item Tied hashes in scalar context =item Net::Ping 450_service and 510_ping_udp failures =item B::C =back =item Platform Specific Problems =over 4 =item EBCDIC Platforms =item Cygwin 1.5 problems =item HP-UX: HP cc warnings about sendfile and sendpath =item IRIX: t/uni/tr_7jis.t falsely failing =item Mac OS X: no usemymalloc =item Tru64: No threaded builds with GNU cc (gcc) =item Win32: sysopen, sysread, syswrite =back =item 
Future Directions =item Reporting Bugs =item SEE ALSO =back =head2 perl58delta - what is new for perl v5.8.0 =over 4 =item DESCRIPTION =item Highlights In 5.8.0 =item Incompatible Changes =over 4 =item Binary Incompatibility =item 64-bit platforms and malloc =item AIX Dynaloading =item Attributes for C<my> variables now handled at run-time =item Socket Extension Dynamic in VMS =item IEEE-format Floating Point Default on OpenVMS Alpha =item New Unicode Semantics (no more C<use utf8>, almost) =item New Unicode Properties =item REF(...) Instead Of SCALAR(...) =item pack/unpack D/F recycled =item glob() now returns filenames in alphabetical order =item Deprecations =back =item Core Enhancements =over 4 =item Unicode Overhaul =item PerlIO is Now The Default =item ithreads =item Restricted Hashes =item Safe Signals =item Understanding of Numbers =item Arrays now always interpolate into double-quoted strings [561] =item Miscellaneous Changes =back =item Modules and Pragmata =over 4 =item New Modules and Pragmata =item Updated And Improved Modules and Pragmata =back =item Utility Changes =item New Documentation =item Performance Enhancements =item Installation and Configuration Improvements =over 4 =item Generic Improvements =item New Or Improved Platforms =back =item Selected Bug Fixes =over 4 =item Platform Specific Changes and Fixes =back =item New or Changed Diagnostics =item Changed Internals =item Security Vulnerability Closed [561] =item New Tests =item Known Problems =over 4 =item The Compiler Suite Is Still Very Experimental =item Localising Tied Arrays and Hashes Is Broken =item Building Extensions Can Fail Because Of Largefiles =item Modifying $_ Inside for(..) 
=item mod_perl 1.26 Doesn't Build With Threaded Perl =item lib/ftmp-security tests warn 'system possibly insecure' =item libwww-perl (LWP) fails base/date #51 =item PDL failing some tests =item Perl_get_sv =item Self-tying Problems =item ext/threads/t/libc =item Failure of Thread (5.005-style) tests =item Timing problems =item Tied/Magical Array/Hash Elements Do Not Autovivify =item Unicode in package/class and subroutine names does not work =back =item Platform Specific Problems =over 4 =item AIX =item Alpha systems with old gccs fail several tests =item AmigaOS =item BeOS =item Cygwin "unable to remap" =item Cygwin ndbm tests fail on FAT =item DJGPP Failures =item FreeBSD built with ithreads coredumps reading large directories =item FreeBSD Failing locale Test 117 For ISO 8859-15 Locales =item IRIX fails ext/List/Util/t/shuffle.t or Digest::MD5 =item HP-UX lib/posix Subtest 9 Fails When LP64-Configured =item Linux with glibc 2.2.5 fails t/op/int subtest #6 with -Duse64bitint =item Linux With Sfio Fails op/misc Test 48 =item Mac OS X =item Mac OS X dyld undefined symbols =item OS/2 Test Failures =item op/sprintf tests 91, 129, and 130 =item SCO =item Solaris 2.5 =item Solaris x86 Fails Tests With -Duse64bitint =item SUPER-UX (NEC SX) =item Term::ReadKey not working on Win32 =item UNICOS/mk =item UTS =item VOS (Stratus) =item VMS =item Win32 =item XML::Parser not working =item z/OS (OS/390) =item Unicode Support on EBCDIC Still Spotty =item Seen In Perl 5.7 But Gone Now =back =item Reporting Bugs =item SEE ALSO =item HISTORY =back =head2 perl561delta - what's new for perl v5.6.1 =over 4 =item DESCRIPTION =item Summary of changes between 5.6.0 and 5.6.1 =over 4 =item Security Issues =item Core bug fixes C<UNIVERSAL::isa()>, Memory leaks, Numeric conversions, qw(a\\b), caller(), Bugs in regular expressions, "slurp" mode, Autovivification of symbolic references to special variables, Lexical warnings, Spurious warnings and errors, glob(), Tainting, sort(), #line 
directives, Subroutine prototypes, map(), Debugger, PERL5OPT, chop(), Unicode support, 64-bit support, Compiler, Lvalue subroutines, IO::Socket, File::Find, xsubpp, C<no Module;>, Tests =item Core features =item Configuration issues =item Documentation =item Bundled modules B::Concise, File::Temp, Pod::LaTeX, Pod::Text::Overstrike, CGI, CPAN, Class::Struct, DB_File, Devel::Peek, File::Find, Getopt::Long, IO::Poll, IPC::Open3, Math::BigFloat, Math::Complex, Net::Ping, Opcode, Pod::Parser, Pod::Text, SDBM_File, Sys::Syslog, Tie::RefHash, Tie::SubstrHash =item Platform-specific improvements NCR MP-RAS, NonStop-UX =back =item Core Enhancements =over 4 =item Interpreter cloning, threads, and concurrency =item Lexically scoped warning categories =item Unicode and UTF-8 support =item Support for interpolating named characters =item "our" declarations =item Support for strings represented as a vector of ordinals =item Improved Perl version numbering system =item New syntax for declaring subroutine attributes =item File and directory handles can be autovivified =item open() with more than two arguments =item 64-bit support =item Large file support =item Long doubles =item "more bits" =item Enhanced support for sort() subroutines =item C<sort $coderef @foo> allowed =item File globbing implemented internally =item Support for CHECK blocks =item POSIX character class syntax [: :] supported =item Better pseudo-random number generator =item Improved C<qw//> operator =item Better worst-case behavior of hashes =item pack() format 'Z' supported =item pack() format modifier '!' 
supported =item pack() and unpack() support counted strings =item Comments in pack() templates =item Weak references =item Binary numbers supported =item Lvalue subroutines =item Some arrows may be omitted in calls through references =item Boolean assignment operators are legal lvalues =item exists() is supported on subroutine names =item exists() and delete() are supported on array elements =item Pseudo-hashes work better =item Automatic flushing of output buffers =item Better diagnostics on meaningless filehandle operations =item Where possible, buffered data discarded from duped input filehandle =item eof() has the same old magic as <> =item binmode() can be used to set :crlf and :raw modes =item C<-T> filetest recognizes UTF-8 encoded files as "text" =item system(), backticks and pipe open now reflect exec() failure =item Improved diagnostics =item Diagnostics follow STDERR =item More consistent close-on-exec behavior =item syswrite() ease-of-use =item Better syntax checks on parenthesized unary operators =item Bit operators support full native integer width =item Improved security features =item More functional bareword prototype (*) =item C<require> and C<do> may be overridden =item $^X variables may now have names longer than one character =item New variable $^C reflects C<-c> switch =item New variable $^V contains Perl version as a string =item Optional Y2K warnings =item Arrays now always interpolate into double-quoted strings =item @- and @+ provide starting/ending offsets of regex submatches =back =item Modules and Pragmata =over 4 =item Modules attributes, B, Benchmark, ByteLoader, constant, charnames, Data::Dumper, DB, DB_File, Devel::DProf, Devel::Peek, Dumpvalue, DynaLoader, English, Env, Fcntl, File::Compare, File::Find, File::Glob, File::Spec, File::Spec::Functions, Getopt::Long, IO, JPL, lib, Math::BigInt, Math::Complex, Math::Trig, Pod::Parser, Pod::InputObjects, Pod::Checker, podchecker, Pod::ParseUtils, Pod::Find, Pod::Select, podselect, 
Pod::Usage, pod2usage, Pod::Text and Pod::Man, SDBM_File, Sys::Syslog, Sys::Hostname, Term::ANSIColor, Time::Local, Win32, XSLoader, DBM Filters =item Pragmata =back =item Utility Changes =over 4 =item dprofpp =item find2perl =item h2xs =item perlcc =item perldoc =item The Perl Debugger =back =item Improved Documentation perlapi.pod, perlboot.pod, perlcompile.pod, perldbmfilter.pod, perldebug.pod, perldebguts.pod, perlfork.pod, perlfilter.pod, perlhack.pod, perlintern.pod, perllexwarn.pod, perlnumber.pod, perlopentut.pod, perlreftut.pod, perltootc.pod, perltodo.pod, perlunicode.pod =item Performance enhancements =over 4 =item Simple sort() using { $a <=> $b } and the like are optimized =item Optimized assignments to lexical variables =item Faster subroutine calls =item delete(), each(), values() and hash iteration are faster =back =item Installation and Configuration Improvements =over 4 =item -Dusethreads means something different =item New Configure flags =item Threadedness and 64-bitness now more daring =item Long Doubles =item -Dusemorebits =item -Duselargefiles =item installusrbinperl =item SOCKS support =item C<-A> flag =item Enhanced Installation Directories =item gcc automatically tried if 'cc' does not seem to be working =back =item Platform specific changes =over 4 =item Supported platforms =item DOS =item OS390 (OpenEdition MVS) =item VMS =item Win32 =back =item Significant bug fixes =over 4 =item <HANDLE> on empty files =item C<eval '...'> improvements =item All compilation errors are true errors =item Implicitly closed filehandles are safer =item Behavior of list slices is more consistent =item C<(\$)> prototype and C<$foo{a}> =item C<goto &sub> and AUTOLOAD =item C<-bareword> allowed under C<use integer> =item Failures in DESTROY() =item Locale bugs fixed =item Memory leaks =item Spurious subroutine stubs after failed subroutine calls =item Taint failures under C<-U> =item END blocks and the C<-c> switch =item Potential to leak DATA filehandles =back 
=item New or Changed Diagnostics "%s" variable %s masks earlier declaration in same %s, "my sub" not yet implemented, "our" variable %s redeclared, '!' allowed only after types %s, / cannot take a count, / must be followed by a, A or Z, / must be followed by a*, A* or Z*, / must follow a numeric type, /%s/: Unrecognized escape \\%c passed through, /%s/: Unrecognized escape \\%c in character class passed through, /%s/ should probably be written as "%s", %s() called too early to check prototype, %s argument is not a HASH or ARRAY element, %s argument is not a HASH or ARRAY element or slice, %s argument is not a subroutine name, %s package attribute may clash with future reserved word: %s, (in cleanup) %s, <> should be quotes, Attempt to join self, Bad evalled substitution pattern, Bad realloc() ignored, Bareword found in conditional, Binary number > 0b11111111111111111111111111111111 non-portable, Bit vector size > 32 non-portable, Buffer overflow in prime_env_iter: %s, Can't check filesystem of script "%s", Can't declare class for non-scalar %s in "%s", Can't declare %s in "%s", Can't ignore signal CHLD, forcing to default, Can't modify non-lvalue subroutine call, Can't read CRTL environ, Can't remove %s: %s, skipping file, Can't return %s from lvalue subroutine, Can't weaken a nonreference, Character class [:%s:] unknown, Character class syntax [%s] belongs inside character classes, Constant is not %s reference, constant(%s): %s, CORE::%s is not a keyword, defined(@array) is deprecated, defined(%hash) is deprecated, Did not produce a valid header, (Did you mean "local" instead of "our"?), Document contains no data, entering effective %s failed, false [] range "%s" in regexp, Filehandle %s opened only for output, flock() on closed filehandle %s, Global symbol "%s" requires explicit package name, Hexadecimal number > 0xffffffff non-portable, Ill-formed CRTL environ value "%s", Ill-formed message in prime_env_iter: |%s|, Illegal binary digit %s, Illegal binary digit 
%s ignored, Illegal number of bits in vec, Integer overflow in %s number, Invalid %s attribute: %s, Invalid %s attributes: %s, invalid [] range "%s" in regexp, Invalid separator character %s in attribute list, Invalid separator character %s in subroutine attribute list, leaving effective %s failed, Lvalue subs returning %s not implemented yet, Method %s not permitted, Missing %sbrace%s on \N{}, Missing command in piped open, Missing name in "my sub", No %s specified for -%c, No package name allowed for variable %s in "our", No space allowed after -%c, no UTC offset information; assuming local time is UTC, Octal number > 037777777777 non-portable, panic: del_backref, panic: kid popen errno read, panic: magic_killbackrefs, Parentheses missing around "%s" list, Possible unintended interpolation of %s in string, Possible Y2K bug: %s, pragma "attrs" is deprecated, use "sub NAME : ATTRS" instead, Premature end of script headers, Repeat count in pack overflows, Repeat count in unpack overflows, realloc() of freed memory ignored, Reference is already weak, setpgrp can't take arguments, Strange *+?{} on zero-length expression, switching effective %s is not implemented, This Perl can't reset CRTL environ elements (%s), This Perl can't set CRTL environ elements (%s=%s), Too late to run %s block, Unknown open() mode '%s', Unknown process %x sent message to prime_env_iter: %s, Unrecognized escape \\%c passed through, Unterminated attribute parameter in attribute list, Unterminated attribute list, Unterminated attribute parameter in subroutine attribute list, Unterminated subroutine attribute list, Value of CLI symbol "%s" too long, Version number must be a constant number =item New tests =item Incompatible Changes =over 4 =item Perl Source Incompatibilities CHECK is a new keyword, Treatment of list slices of undef has changed, Format of $English::PERL_VERSION is different, Literals of the form C<1.2.3> parse differently, Possibly changed pseudo-random number generator, Hashing 
function for hash keys has changed, C<undef> fails on read only values, Close-on-exec bit may be set on pipe and socket handles, Writing C<"$$1"> to mean C<"${$}1"> is unsupported, delete(), each(), values() and C<\(%h)>, vec(EXPR,OFFSET,BITS) enforces powers-of-two BITS, Text of some diagnostic output has changed, C<%@> has been removed, Parenthesized not() behaves like a list operator, Semantics of bareword prototype C<(*)> have changed, Semantics of bit operators may have changed on 64-bit platforms, More builtins taint their results =item C Source Incompatibilities C<PERL_POLLUTE>, C<PERL_IMPLICIT_CONTEXT>, C<PERL_POLLUTE_MALLOC> =item Compatible C Source API Changes C<PATCHLEVEL> is now C<PERL_VERSION> =item Binary Incompatibilities =back =item Known Problems =over 4 =item Localizing a tied hash element may leak memory =item Known test failures =item EBCDIC platforms not fully supported =item UNICOS/mk CC failures during Configure run =item Arrow operator and arrays =item Experimental features Threads, Unicode, 64-bit support, Lvalue subroutines, Weak references, The pseudo-hash data type, The Compiler suite, Internal implementation of file globbing, The DB module, The regular expression code constructs: =back =item Obsolete Diagnostics Character class syntax [: :] is reserved for future extensions, Ill-formed logical name |%s| in prime_env_iter, In string, @%s now must be written as \@%s, Probable precedence problem on %s, regexp too big, Use of "$$<digit>" to mean "${$}<digit>" is deprecated =item Reporting Bugs =item SEE ALSO =item HISTORY =back =head2 perl56delta - what's new for perl v5.6.0 =over 4 =item DESCRIPTION =item Core Enhancements =over 4 =item Interpreter cloning, threads, and concurrency =item Lexically scoped warning categories =item Unicode and UTF-8 support =item Support for interpolating named characters =item "our" declarations =item Support for strings represented as a vector of ordinals =item Improved Perl version numbering system =item 
New syntax for declaring subroutine attributes =item File and directory handles can be autovivified =item open() with more than two arguments =item 64-bit support =item Large file support =item Long doubles =item "more bits" =item Enhanced support for sort() subroutines =item C<sort $coderef @foo> allowed =item File globbing implemented internally =item Support for CHECK blocks =item POSIX character class syntax [: :] supported =item Better pseudo-random number generator =item Improved C<qw//> operator =item Better worst-case behavior of hashes =item pack() format 'Z' supported =item pack() format modifier '!' supported =item pack() and unpack() support counted strings =item Comments in pack() templates =item Weak references =item Binary numbers supported =item Lvalue subroutines =item Some arrows may be omitted in calls through references =item Boolean assignment operators are legal lvalues =item exists() is supported on subroutine names =item exists() and delete() are supported on array elements =item Pseudo-hashes work better =item Automatic flushing of output buffers =item Better diagnostics on meaningless filehandle operations =item Where possible, buffered data discarded from duped input filehandle =item eof() has the same old magic as <> =item binmode() can be used to set :crlf and :raw modes =item C<-T> filetest recognizes UTF-8 encoded files as "text" =item system(), backticks and pipe open now reflect exec() failure =item Improved diagnostics =item Diagnostics follow STDERR =item More consistent close-on-exec behavior =item syswrite() ease-of-use =item Better syntax checks on parenthesized unary operators =item Bit operators support full native integer width =item Improved security features =item More functional bareword prototype (*) =item C<require> and C<do> may be overridden =item $^X variables may now have names longer than one character =item New variable $^C reflects C<-c> switch =item New variable $^V contains Perl version as a string =item 
Optional Y2K warnings =item Arrays now always interpolate into double-quoted strings =item @- and @+ provide starting/ending offsets of regex matches =back =item Modules and Pragmata =over 4 =item Modules attributes, B, Benchmark, ByteLoader, constant, charnames, Data::Dumper, DB, DB_File, Devel::DProf, Devel::Peek, Dumpvalue, DynaLoader, English, Env, Fcntl, File::Compare, File::Find, File::Glob, File::Spec, File::Spec::Functions, Getopt::Long, IO, JPL, lib, Math::BigInt, Math::Complex, Math::Trig, Pod::Parser, Pod::InputObjects, Pod::Checker, podchecker, Pod::ParseUtils, Pod::Find, Pod::Select, podselect, Pod::Usage, pod2usage, Pod::Text and Pod::Man, SDBM_File, Sys::Syslog, Sys::Hostname, Term::ANSIColor, Time::Local, Win32, XSLoader, DBM Filters =item Pragmata =back =item Utility Changes =over 4 =item dprofpp =item find2perl =item h2xs =item perlcc =item perldoc =item The Perl Debugger =back =item Improved Documentation perlapi.pod, perlboot.pod, perlcompile.pod, perldbmfilter.pod, perldebug.pod, perldebguts.pod, perlfork.pod, perlfilter.pod, perlhack.pod, perlintern.pod, perllexwarn.pod, perlnumber.pod, perlopentut.pod, perlreftut.pod, perltootc.pod, perltodo.pod, perlunicode.pod =item Performance enhancements =over 4 =item Simple sort() using { $a <=> $b } and the like are optimized =item Optimized assignments to lexical variables =item Faster subroutine calls =item delete(), each(), values() and hash iteration are faster =back =item Installation and Configuration Improvements =over 4 =item -Dusethreads means something different =item New Configure flags =item Threadedness and 64-bitness now more daring =item Long Doubles =item -Dusemorebits =item -Duselargefiles =item installusrbinperl =item SOCKS support =item C<-A> flag =item Enhanced Installation Directories =back =item Platform specific changes =over 4 =item Supported platforms =item DOS =item OS390 (OpenEdition MVS) =item VMS =item Win32 =back =item Significant bug fixes =over 4 =item <HANDLE> on empty 
files =item C<eval '...'> improvements =item All compilation errors are true errors =item Implicitly closed filehandles are safer =item Behavior of list slices is more consistent =item C<(\$)> prototype and C<$foo{a}> =item C<goto &sub> and AUTOLOAD =item C<-bareword> allowed under C<use integer> =item Failures in DESTROY() =item Locale bugs fixed =item Memory leaks =item Spurious subroutine stubs after failed subroutine calls =item Taint failures under C<-U> =item END blocks and the C<-c> switch =item Potential to leak DATA filehandles =back =item New or Changed Diagnostics "%s" variable %s masks earlier declaration in same %s, "my sub" not yet implemented, "our" variable %s redeclared, '!' allowed only after types %s, / cannot take a count, / must be followed by a, A or Z, / must be followed by a*, A* or Z*, / must follow a numeric type, /%s/: Unrecognized escape \\%c passed through, /%s/: Unrecognized escape \\%c in character class passed through, /%s/ should probably be written as "%s", %s() called too early to check prototype, %s argument is not a HASH or ARRAY element, %s argument is not a HASH or ARRAY element or slice, %s argument is not a subroutine name, %s package attribute may clash with future reserved word: %s, (in cleanup) %s, <> should be quotes, Attempt to join self, Bad evalled substitution pattern, Bad realloc() ignored, Bareword found in conditional, Binary number > 0b11111111111111111111111111111111 non-portable, Bit vector size > 32 non-portable, Buffer overflow in prime_env_iter: %s, Can't check filesystem of script "%s", Can't declare class for non-scalar %s in "%s", Can't declare %s in "%s", Can't ignore signal CHLD, forcing to default, Can't modify non-lvalue subroutine call, Can't read CRTL environ, Can't remove %s: %s, skipping file, Can't return %s from lvalue subroutine, Can't weaken a nonreference, Character class [:%s:] unknown, Character class syntax [%s] belongs inside character classes, Constant is not %s reference, constant(%s): 
%s, CORE::%s is not a keyword, defined(@array) is deprecated, defined(%hash) is deprecated, Did not produce a valid header, (Did you mean "local" instead of "our"?), Document contains no data, entering effective %s failed, false [] range "%s" in regexp, Filehandle %s opened only for output, flock() on closed filehandle %s, Global symbol "%s" requires explicit package name, Hexadecimal number > 0xffffffff non-portable, Ill-formed CRTL environ value "%s", Ill-formed message in prime_env_iter: |%s|, Illegal binary digit %s, Illegal binary digit %s ignored, Illegal number of bits in vec, Integer overflow in %s number, Invalid %s attribute: %s, Invalid %s attributes: %s, invalid [] range "%s" in regexp, Invalid separator character %s in attribute list, Invalid separator character %s in subroutine attribute list, leaving effective %s failed, Lvalue subs returning %s not implemented yet, Method %s not permitted, Missing %sbrace%s on \N{}, Missing command in piped open, Missing name in "my sub", No %s specified for -%c, No package name allowed for variable %s in "our", No space allowed after -%c, no UTC offset information; assuming local time is UTC, Octal number > 037777777777 non-portable, panic: del_backref, panic: kid popen errno read, panic: magic_killbackrefs, Parentheses missing around "%s" list, Possible unintended interpolation of %s in string, Possible Y2K bug: %s, pragma "attrs" is deprecated, use "sub NAME : ATTRS" instead, Premature end of script headers, Repeat count in pack overflows, Repeat count in unpack overflows, realloc() of freed memory ignored, Reference is already weak, setpgrp can't take arguments, Strange *+?{} on zero-length expression, switching effective %s is not implemented, This Perl can't reset CRTL environ elements (%s), This Perl can't set CRTL environ elements (%s=%s), Too late to run %s block, Unknown open() mode '%s', Unknown process %x sent message to prime_env_iter: %s, Unrecognized escape \\%c passed through, Unterminated attribute 
parameter in attribute list, Unterminated attribute list, Unterminated attribute parameter in subroutine attribute list, Unterminated subroutine attribute list, Value of CLI symbol "%s" too long, Version number must be a constant number =item New tests =item Incompatible Changes =over 4 =item Perl Source Incompatibilities CHECK is a new keyword, Treatment of list slices of undef has changed, Format of $English::PERL_VERSION is different, Literals of the form C<1.2.3> parse differently, Possibly changed pseudo-random number generator, Hashing function for hash keys has changed, C<undef> fails on read only values, Close-on-exec bit may be set on pipe and socket handles, Writing C<"$$1"> to mean C<"${$}1"> is unsupported, delete(), each(), values() and C<\(%h)>, vec(EXPR,OFFSET,BITS) enforces powers-of-two BITS, Text of some diagnostic output has changed, C<%@> has been removed, Parenthesized not() behaves like a list operator, Semantics of bareword prototype C<(*)> have changed, Semantics of bit operators may have changed on 64-bit platforms, More builtins taint their results =item C Source Incompatibilities C<PERL_POLLUTE>, C<PERL_IMPLICIT_CONTEXT>, C<PERL_POLLUTE_MALLOC> =item Compatible C Source API Changes C<PATCHLEVEL> is now C<PERL_VERSION> =item Binary Incompatibilities =back =item Known Problems =over 4 =item Thread test failures =item EBCDIC platforms not supported =item In 64-bit HP-UX the lib/io_multihomed test may hang =item NEXTSTEP 3.3 POSIX test failure =item Tru64 (aka Digital UNIX, aka DEC OSF/1) lib/sdbm test failure with gcc =item UNICOS/mk CC failures during Configure run =item Arrow operator and arrays =item Experimental features Threads, Unicode, 64-bit support, Lvalue subroutines, Weak references, The pseudo-hash data type, The Compiler suite, Internal implementation of file globbing, The DB module, The regular expression code constructs: =back =item Obsolete Diagnostics Character class syntax [: :] is reserved for future extensions, Ill-formed 
logical name |%s| in prime_env_iter, In string, @%s now must be written as \@%s, Probable precedence problem on %s, regexp too big, Use of "$$<digit>" to mean "${$}<digit>" is deprecated =item Reporting Bugs =item SEE ALSO =item HISTORY =back =head2 perl5005delta - what's new for perl5.005 =over 4 =item DESCRIPTION =item About the new versioning system =item Incompatible Changes =over 4 =item WARNING: This version is not binary compatible with Perl 5.004. =item Default installation structure has changed =item Perl Source Compatibility =item C Source Compatibility =item Binary Compatibility =item Security fixes may affect compatibility =item Relaxed new mandatory warnings introduced in 5.004 =item Licensing =back =item Core Changes =over 4 =item Threads =item Compiler =item Regular Expressions Many new and improved optimizations, Many bug fixes, New regular expression constructs, New operator for precompiled regular expressions, Other improvements, Incompatible changes =item Improved malloc() =item Quicksort is internally implemented =item Reliable signals =item Reliable stack pointers =item More generous treatment of carriage returns =item Memory leaks =item Better support for multiple interpreters =item Behavior of local() on array and hash elements is now well-defined =item C<%!> is transparently tied to the L<Errno> module =item Pseudo-hashes are supported =item C<EXPR foreach EXPR> is supported =item Keywords can be globally overridden =item C<$^E> is meaningful on Win32 =item C<foreach (1..1000000)> optimized =item C<Foo::> can be used as implicitly quoted package name =item C<exists $Foo::{Bar::}> tests existence of a package =item Better locale support =item Experimental support for 64-bit platforms =item prototype() returns useful results on builtins =item Extended support for exception handling =item Re-blessing in DESTROY() supported for chaining DESTROY() methods =item All C<printf> format conversions are handled internally =item New C<INIT> keyword 
=item New C<lock> keyword =item New C<qr//> operator =item C<our> is now a reserved word =item Tied arrays are now fully supported =item Tied handles support is better =item 4th argument to substr =item Negative LENGTH argument to splice =item Magic lvalues are now more magical =item <> now reads in records =back =item Supported Platforms =over 4 =item New Platforms =item Changes in existing support =back =item Modules and Pragmata =over 4 =item New Modules B, Data::Dumper, Dumpvalue, Errno, File::Spec, ExtUtils::Installed, ExtUtils::Packlist, Fatal, IPC::SysV, Test, Tie::Array, Tie::Handle, Thread, attrs, fields, re =item Changes in existing modules Benchmark, Carp, CGI, Fcntl, Math::Complex, Math::Trig, POSIX, DB_File, MakeMaker, CPAN, Cwd =back =item Utility Changes =item Documentation Changes =item New Diagnostics Ambiguous call resolved as CORE::%s(), qualify as such or use &, Bad index while coercing array into hash, Bareword "%s" refers to nonexistent package, Can't call method "%s" on an undefined value, Can't check filesystem of script "%s" for nosuid, Can't coerce array into hash, Can't goto subroutine from an eval-string, Can't localize pseudo-hash element, Can't use %%! because Errno.pm is not available, Cannot find an opnumber for "%s", Character class syntax [. .] 
is reserved for future extensions, Character class syntax [: :] is reserved for future extensions, Character class syntax [= =] is reserved for future extensions, %s: Eval-group in insecure regular expression, %s: Eval-group not allowed, use re 'eval', %s: Eval-group not allowed at run time, Explicit blessing to '' (assuming package main), Illegal hex digit ignored, No such array field, No such field "%s" in variable %s of type %s, Out of memory during ridiculously large request, Range iterator outside integer range, Recursive inheritance detected while looking for method '%s' %s, Reference found where even-sized list expected, Undefined value assigned to typeglob, Use of reserved word "%s" is deprecated, perl: warning: Setting locale failed =item Obsolete Diagnostics Can't mktemp(), Can't write to temp file for B<-e>: %s, Cannot open temporary file, regexp too big =item Configuration Changes =item BUGS =item SEE ALSO =item HISTORY =back =head2 perl5004delta - what's new for perl5.004 =over 4 =item DESCRIPTION =item Supported Environments =item Core Changes =over 4 =item List assignment to %ENV works =item Change to "Can't locate Foo.pm in @INC" error =item Compilation option: Binary compatibility with 5.003 =item $PERL5OPT environment variable =item Limitations on B<-M>, B<-m>, and B<-T> options =item More precise warnings =item Deprecated: Inherited C<AUTOLOAD> for non-methods =item Previously deprecated %OVERLOAD is no longer usable =item Subroutine arguments created only when they're modified =item Group vector changeable with C<$)> =item Fixed parsing of $$<digit>, &$<digit>, etc. =item Fixed localization of $<digit>, $&, etc. =item No resetting of $. 
on implicit close =item C<wantarray> may return undef =item C<eval EXPR> determines value of EXPR in scalar context =item Changes to tainting checks No glob() or <*>, No spawning if tainted $CDPATH, $ENV, $BASH_ENV, No spawning if tainted $TERM doesn't look like a terminal name =item New Opcode module and revised Safe module =item Embedding improvements =item Internal change: FileHandle class based on IO::* classes =item Internal change: PerlIO abstraction interface =item New and changed syntax $coderef->(PARAMS) =item New and changed builtin constants __PACKAGE__ =item New and changed builtin variables $^E, $^H, $^M =item New and changed builtin functions delete on slices, flock, printf and sprintf, keys as an lvalue, my() in Control Structures, pack() and unpack(), sysseek(), use VERSION, use Module VERSION LIST, prototype(FUNCTION), srand, $_ as Default, C<m//gc> does not reset search position on failure, C<m//x> ignores whitespace before ?*+{}, nested C<sub{}> closures work now, formats work right on changing lexicals =item New builtin methods isa(CLASS), can(METHOD), VERSION( [NEED] ) =item TIEHANDLE now supported TIEHANDLE classname, LIST, PRINT this, LIST, PRINTF this, LIST, READ this LIST, READLINE this, GETC this, DESTROY this =item Malloc enhancements -DPERL_EMERGENCY_SBRK, -DPACK_MALLOC, -DTWO_POT_OPTIMIZE =item Miscellaneous efficiency enhancements =back =item Support for More Operating Systems =over 4 =item Win32 =item Plan 9 =item QNX =item AmigaOS =back =item Pragmata use autouse MODULE => qw(sub1 sub2 sub3), use blib, use blib 'dir', use constant NAME => VALUE, use locale, use ops, use vmsish =item Modules =over 4 =item Required Updates =item Installation directories =item Module information summary =item Fcntl =item IO =item Math::Complex =item Math::Trig =item DB_File =item Net::Ping =item Object-oriented overrides for builtin operators =back =item Utility Changes =over 4 =item pod2html Sends converted HTML to standard output =item xsubpp C<void> 
XSUBs now default to returning nothing =back =item C Language API Changes C<gv_fetchmethod> and C<perl_call_sv>, C<perl_eval_pv>, Extended API for manipulating hashes =item Documentation Changes L<perldelta>, L<perlfaq>, L<perllocale>, L<perltoot>, L<perlapio>, L<perlmodlib>, L<perldebug>, L<perlsec> =item New Diagnostics "my" variable %s masks earlier declaration in same scope, %s argument is not a HASH element or slice, Allocation too large: %lx, Allocation too large, Applying %s to %s will act on scalar(%s), Attempt to free nonexistent shared string, Attempt to use reference as lvalue in substr, Bareword "%s" refers to nonexistent package, Can't redefine active sort subroutine %s, Can't use bareword ("%s") as %s ref while "strict refs" in use, Cannot resolve method `%s' overloading `%s' in package `%s', Constant subroutine %s redefined, Constant subroutine %s undefined, Copy method did not return a reference, Died, Exiting pseudo-block via %s, Identifier too long, Illegal character %s (carriage return), Illegal switch in PERL5OPT: %s, Integer overflow in hex number, Integer overflow in octal number, internal error: glob failed, Invalid conversion in %s: "%s", Invalid type in pack: '%s', Invalid type in unpack: '%s', Name "%s::%s" used only once: possible typo, Null picture in formline, Offset outside string, Out of memory!, Out of memory during request for %s, panic: frexp, Possible attempt to put comments in qw() list, Possible attempt to separate words with commas, Scalar value @%s{%s} better written as $%s{%s}, Stub found while resolving method `%s' overloading `%s' in %s, Too late for "B<-T>" option, untie attempted while %d inner references still exist, Unrecognized character %s, Unsupported function fork, Use of "$$<digit>" to mean "${$}<digit>" is deprecated, Value of %s can be "0"; test with defined(), Variable "%s" may be unavailable, Variable "%s" will not stay shared, Warning: something's wrong, Ill-formed logical name |%s| in prime_env_iter, Got an 
error from DosAllocMem, Malformed PERLLIB_PREFIX, PERL_SH_DIR too long, Process terminated by SIG%s =item BUGS =item SEE ALSO =item HISTORY =back =head2 perlexperiment - A listing of experimental features in Perl =over 4 =item DESCRIPTION =over 4 =item Current experiments Smart match (C<~~>), Pluggable keywords, Regular Expression Set Operations, Subroutine signatures, Aliasing via reference, The "const" attribute, use re 'strict';, The <:win32> IO pseudolayer, Declaring a reference to a variable, There is an C<installhtml> target in the Makefile, (Limited) Variable-length look-behind =item Accepted features 64-bit support, die accepts a reference, DB module, Weak references, Internal file glob, fork() emulation, -Dusemultiplicity -Duseithreads, Support for long doubles, The C<\N> regex character class, C<(?{code})> and C<(??{ code })>, Linux abstract Unix domain sockets, Lvalue subroutines, Backtracking control verbs, The <:pop> IO pseudolayer, C<\s> in regexp matches vertical tab, Postfix dereference syntax, Lexical subroutines, String- and number-specific bitwise operators, Alphabetic assertions, Script runs =item Removed features 5.005-style threading, perlcc, The pseudo-hash data type, GetOpt::Long Options can now take multiple values at once (experimental), Assertions, Test::Harness::Straps, C<legacy>, Lexical C<$_>, Array and hash container functions accept references, C<our> can have an experimental optional attribute C<unique> =back =item SEE ALSO =item AUTHORS =item COPYRIGHT =item LICENSE =back =head2 perlartistic - the Perl Artistic License =over 4 =item SYNOPSIS =item DESCRIPTION =item The "Artistic License" =over 4 =item Preamble =item Definitions "Package", "Standard Version", "Copyright Holder", "You", "Reasonable copying fee", "Freely Available" =item Conditions a), b), c), d), a), b), c), d) =back =back =head2 perlgpl - the GNU General Public License, version 1 =over 4 =item SYNOPSIS =item DESCRIPTION =back =over 4 =item GNU GENERAL PUBLIC LICENSE 
=back =head2 perlaix - Perl version 5 on IBM AIX (UNIX) systems =over 4 =item DESCRIPTION =over 4 =item Compiling Perl 5 on AIX =item Supported Compilers =item Incompatibility with AIX Toolbox lib gdbm =item Perl 5 was successfully compiled and tested on: =item Building Dynamic Extensions on AIX =item Using Large Files with Perl =item Threaded Perl =item 64-bit Perl =item Long doubles =item Recommended Options AIX 5.1/5.2/5.3/6.1 and 7.1 (threaded/32-bit) =item Recommended Options AIX 5.1/5.2/5.3/6.1 and 7.1 (32-bit) =item Recommended Options AIX 5.1/5.2/5.3/6.1 and 7.1 (threaded/64-bit) =item Recommended Options AIX 5.1/5.2/5.3/6.1 and 7.1 (64-bit) =item Compiling Perl 5 on AIX 7.1.0 =item Compiling Perl 5 on older AIX versions up to 4.3.3 =item OS level =item Building Dynamic Extensions on AIX E<lt> 5L =item The IBM ANSI C Compiler =item The usenm option =item Using GNU's gcc for building Perl =item Using Large Files with Perl E<lt> 5L =item Threaded Perl E<lt> 5L =item 64-bit Perl E<lt> 5L =item AIX 4.2 and extensions using C++ with statics =back =item AUTHORS =back =head2 perlamiga - Perl under AmigaOS 4.1 =over 4 =item NOTE =item SYNOPSIS =back =over 4 =item DESCRIPTION =over 4 =item Prerequisites for running Perl 5.22.1 under AmigaOS 4.1 B<AmigaOS 4.1 update 6 with all updates applied as of 9th October 2013>, B<newlib.library version 53.28 or greater>, B<AmigaOS SDK>, B<abc-shell> =item Starting Perl programs under AmigaOS 4.1 =item Limitations of Perl under AmigaOS 4.1 B<Nested Piped programs can crash when run from older abc-shells>, B<Incorrect or unexpected command line unescaping>, B<Starting subprocesses via open has limitations>, If you find any other limitations or bugs then let me know =back =item INSTALLATION =item Amiga Specific Modules =over 4 =item Amiga::ARexx =item Amiga::Exec =back =item BUILDING =item CHANGES B<August 2015>, Port to Perl 5.22, Add handling of NIL: to afstat(), Fix inheritance of environment variables by subprocesses, Fix 
exec, and exit in "forked" subprocesses, Fix issue with newlib's unlink, which could cause infinite loops, Add flock() emulation using IDOS->LockRecord thanks to Tony Cook for the suggestion, Fix issue where kill was using the wrong kind of process ID, B<27th November 2013>, Create new installation system based on installperl links and Amiga protection bits now set correctly, Pod now defaults to text, File::Spec should now recognise an Amiga style absolute path as well as an Unix style one. Relative paths must always be Unix style, B<20th November 2013>, Configured to use SDK:Local/C/perl to start standard scripts, Added Amiga::Exec module with support for Wait() and AmigaOS signal numbers, B<10th October 13> =item SEE ALSO =back =head2 perlandroid - Perl under Android =over 4 =item SYNOPSIS =item DESCRIPTION =item Cross-compilation =over 4 =item Get the Android Native Development Kit (NDK) =item Determine the architecture you'll be cross-compiling for =item Set up a standalone toolchain =item adb or ssh? =item Configure and beyond =back =item Native Builds =over 4 =item CCTools =item Termux =back =item AUTHOR =back =head2 perlbs2000 - building and installing Perl for BS2000. 
=over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item gzip on BS2000 =item bison on BS2000 =item Unpacking Perl Distribution on BS2000 =item Compiling Perl on BS2000 =item Testing Perl on BS2000 =item Installing Perl on BS2000 =item Using Perl in the Posix-Shell of BS2000 =item Using Perl in "native" BS2000 =item Floating point anomalies on BS2000 =item Using PerlIO and different encodings on ASCII and EBCDIC partitions =back =item AUTHORS =item SEE ALSO =over 4 =item Mailing list =back =item HISTORY =back =head2 perlcygwin - Perl for Cygwin =over 4 =item SYNOPSIS =item PREREQUISITES FOR COMPILING PERL ON CYGWIN =over 4 =item Cygwin = GNU+Cygnus+Windows (Don't leave UNIX without it) =item Cygwin Configuration C<PATH>, I<nroff> =back =item CONFIGURE PERL ON CYGWIN =over 4 =item Stripping Perl Binaries on Cygwin =item Optional Libraries for Perl on Cygwin C<-lcrypt>, C<-lgdbm_compat> (C<use GDBM_File>), C<-ldb> (C<use DB_File>), C<cygserver> (C<use IPC::SysV>), C<-lutil> =item Configure-time Options for Perl on Cygwin C<-Uusedl>, C<-Dusemymalloc>, C<-Uuseperlio>, C<-Dusemultiplicity>, C<-Uuse64bitint>, C<-Duselongdouble>, C<-Uuseithreads>, C<-Duselargefiles>, C<-Dmksymlinks> =item Suspicious Warnings on Cygwin Win9x and C<d_eofnblk>, Compiler/Preprocessor defines =back =item MAKE ON CYGWIN =item TEST ON CYGWIN =over 4 =item File Permissions on Cygwin =item NDBM_File and ODBM_File do not work on FAT filesystems =item C<fork()> failures in io_* tests =back =item Specific features of the Cygwin port =over 4 =item Script Portability on Cygwin Pathnames, Text/Binary, PerlIO, F<.exe>, Cygwin vs. Windows process ids, Cygwin vs. 
Windows errors, rebase errors on fork or system, C<chown()>, Miscellaneous =item Prebuilt methods: C<Cwd::cwd>, C<Cygwin::pid_to_winpid>, C<Cygwin::winpid_to_pid>, C<Cygwin::win_to_posix_path>, C<Cygwin::posix_to_win_path>, C<Cygwin::mount_table()>, C<Cygwin::mount_flags>, C<Cygwin::is_binmount>, C<Cygwin::sync_winenv> =back =item INSTALL PERL ON CYGWIN =item MANIFEST ON CYGWIN Documentation, Build, Configure, Make, Install, Tests, Compiled Perl Source, Compiled Module Source, Perl Modules/Scripts, Perl Module Tests =item BUGS ON CYGWIN =item AUTHORS =item HISTORY =back =head2 perldos - Perl under DOS, W31, W95. =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item Prerequisites for Compiling Perl on DOS DJGPP, Pthreads =item Shortcomings of Perl under DOS =item Building Perl on DOS =item Testing Perl on DOS =item Installation of Perl on DOS =back =item BUILDING AND INSTALLING MODULES ON DOS =over 4 =item Building Prerequisites for Perl on DOS =item Unpacking CPAN Modules on DOS =item Building Non-XS Modules on DOS =item Building XS Modules on DOS =back =item AUTHOR =item SEE ALSO =back =head2 perlfreebsd - Perl version 5 on FreeBSD systems =over 4 =item DESCRIPTION =over 4 =item FreeBSD core dumps from readdir_r with ithreads =item C<$^X> doesn't always contain a full path in FreeBSD =back =item AUTHOR =back =head2 perlhaiku - Perl version 5.10+ on Haiku =over 4 =item DESCRIPTION =item BUILD AND INSTALL =item KNOWN PROBLEMS =item CONTACT =back =head2 perlhpux - Perl version 5 on Hewlett-Packard Unix (HP-UX) systems =over 4 =item DESCRIPTION =over 4 =item Using perl as shipped with HP-UX =item Using perl from HP's porting centre =item Other prebuilt perl binaries =item Compiling Perl 5 on HP-UX =item PA-RISC =item PA-RISC 1.0 =item PA-RISC 1.1 =item PA-RISC 2.0 =item Portability Between PA-RISC Versions =item Itanium Processor Family (IPF) and HP-UX =item Itanium, Itanium 2 & Madison 6 =item HP-UX versions =item Building Dynamic Extensions on HP-UX =item The HP 
ANSI C Compiler =item The GNU C Compiler =item Using Large Files with Perl on HP-UX =item Threaded Perl on HP-UX =item 64-bit Perl on HP-UX =item Oracle on HP-UX =item GDBM and Threads on HP-UX =item NFS filesystems and utime(2) on HP-UX =item HP-UX Kernel Parameters (maxdsiz) for Compiling Perl =back =item nss_delete core dump from op/pwent or op/grent =item error: pasting ")" and "l" does not give a valid preprocessing token =item Redeclaration of "sendpath" with a different storage class specifier =item Miscellaneous =item AUTHOR =back =head2 perlhurd - Perl version 5 on Hurd =over 4 =item DESCRIPTION =over 4 =item Known Problems with Perl on Hurd =back =item AUTHOR =back =head2 perlirix - Perl version 5 on Irix systems =over 4 =item DESCRIPTION =over 4 =item Building 32-bit Perl in Irix =item Building 64-bit Perl in Irix =item About Compiler Versions of Irix =item Linker Problems in Irix =item Malloc in Irix =item Building with threads in Irix =item Irix 5.3 =back =item AUTHOR =back =head2 perllinux - Perl version 5 on Linux systems =over 4 =item DESCRIPTION =over 4 =item Deploying Perl on Linux =item Experimental Support for Sun Studio Compilers for Linux OS =back =item AUTHOR =back =head2 perlmacos - Perl under Mac OS (Classic) =over 4 =item SYNOPSIS =item DESCRIPTION =item AUTHOR =back =head2 perlmacosx - Perl under Mac OS X =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item Installation Prefix =item SDK support =item Universal Binary support =item 64-bit PPC support =item libperl and Prebinding =item Updating Apple's Perl =item Known problems =item Cocoa =back =item Starting From Scratch =item AUTHOR =item DATE =back =head2 perlnetware - Perl for NetWare =over 4 =item DESCRIPTION =item BUILD =over 4 =item Tools & SDK =item Setup SetNWBld.bat, Buildtype.bat =item Make =item Interpreter =item Extensions =back =item INSTALL =item BUILD NEW EXTENSIONS =item ACKNOWLEDGEMENTS =item AUTHORS =item DATE =back =head2 perlopenbsd - Perl version 5 on OpenBSD 
systems =over 4 =item DESCRIPTION =over 4 =item OpenBSD core dumps from getprotobyname_r and getservbyname_r with ithreads =back =item AUTHOR =back =head2 perlos2 - Perl under OS/2, DOS, Win0.3*, Win0.95 and WinNT. =over 4 =item SYNOPSIS =back =over 4 =item DESCRIPTION =over 4 =item Target =item Other OSes =item Prerequisites EMX, RSX, HPFS, pdksh =item Starting Perl programs under OS/2 (and DOS and...) =item Starting OS/2 (and DOS) programs under Perl =back =item Frequently asked questions =over 4 =item "It does not work" =item I cannot run external programs =item I cannot embed perl into my program, or use F<perl.dll> from my program. Is your program EMX-compiled with C<-Zmt -Zcrtdll>?, Did you use L<ExtUtils::Embed>? =item C<``> and pipe-C<open> do not work under DOS. =item Cannot start C<find.exe "pattern" file> =back =item INSTALLATION =over 4 =item Automatic binary installation C<PERL_BADLANG>, C<PERL_BADFREE>, F<Config.pm> =item Manual binary installation Perl VIO and PM executables (dynamically linked), Perl_ VIO executable (statically linked), Executables for Perl utilities, Main Perl library, Additional Perl modules, Tools to compile Perl modules, Manpages for Perl and utilities, Manpages for Perl modules, Source for Perl documentation, Perl manual in F<.INF> format, Pdksh =item B<Warning> =back =item Accessing documentation =over 4 =item OS/2 F<.INF> file =item Plain text =item Manpages =item HTML =item GNU C<info> files =item F<PDF> files =item C<LaTeX> docs =back =item BUILD =over 4 =item The short story =item Prerequisites =item Getting perl source =item Application of the patches =item Hand-editing =item Making =item Testing A lot of C<bad free>, Process terminated by SIGTERM/SIGINT, F<op/fs.t>, Z<>18, Z<>25, F<op/stat.t> =item Installing the built perl =item C<a.out>-style build =back =item Building a binary distribution =item Building custom F<.EXE> files =over 4 =item Making executables with a custom collection of statically loaded extensions 
=item Making executables with a custom search-paths =back =item Build FAQ =over 4 =item Some C</> became C<\> in pdksh. =item C<'errno'> - unresolved external =item Problems with tr or sed =item Some problem (forget which ;-) =item Library ... not found =item Segfault in make =item op/sprintf test failure =back =item Specific (mis)features of OS/2 port =over 4 =item C<setpriority>, C<getpriority> =item C<system()> =item C<extproc> on the first line =item Additional modules: =item Prebuilt methods: C<File::Copy::syscopy>, C<DynaLoader::mod2fname>, C<Cwd::current_drive()>, C<Cwd::sys_chdir(name)>, C<Cwd::change_drive(name)>, C<Cwd::sys_is_absolute(name)>, C<Cwd::sys_is_rooted(name)>, C<Cwd::sys_is_relative(name)>, C<Cwd::sys_cwd(name)>, C<Cwd::sys_abspath(name, dir)>, C<Cwd::extLibpath([type])>, C<Cwd::extLibpath_set( path [, type ] )>, C<OS2::Error(do_harderror,do_exception)>, C<OS2::Errors2Drive(drive)>, OS2::SysInfo(), OS2::BootDrive(), C<OS2::MorphPM(serve)>, C<OS2::UnMorphPM(serve)>, C<OS2::Serve_Messages(force)>, C<OS2::Process_Messages(force [, cnt])>, C<OS2::_control87(new,mask)>, OS2::get_control87(), C<OS2::set_control87_em(new=MCW_EM,mask=MCW_EM)>, C<OS2::DLLname([how [, \&xsub]])> =item Prebuilt variables: $OS2::emx_rev, $OS2::emx_env, $OS2::os_ver, $OS2::is_aout, $OS2::can_fork, $OS2::nsyserror =item Misfeatures =item Modifications C<popen>, C<tmpnam>, C<tmpfile>, C<ctermid>, C<stat>, C<mkdir>, C<rmdir>, C<flock> =item Identifying DLLs =item Centralized management of resources C<HAB>, C<HMQ>, Treating errors reported by OS/2 API, C<CheckOSError(expr)>, C<CheckWinError(expr)>, C<SaveWinError(expr)>, C<SaveCroakWinError(expr,die,name1,name2)>, C<WinError_2_Perl_rc>, C<FillWinError>, C<FillOSError(rc)>, Loading DLLs and ordinals in DLLs =back =item Perl flavors =over 4 =item F<perl.exe> =item F<perl_.exe> =item F<perl__.exe> =item F<perl___.exe> =item Why strange names? =item Why dynamic linking? =item Why chimera build? 
=back =item ENVIRONMENT =over 4 =item C<PERLLIB_PREFIX> =item C<PERL_BADLANG> =item C<PERL_BADFREE> =item C<PERL_SH_DIR> =item C<USE_PERL_FLOCK> =item C<TMP> or C<TEMP> =back =item Evolution =over 4 =item Text-mode filehandles =item Priorities =item DLL name mangling: pre 5.6.2 =item DLL name mangling: 5.6.2 and beyond Global DLLs, specific DLLs, C<BEGINLIBPATH> and C<ENDLIBPATH>, F<.> from C<LIBPATH> =item DLL forwarder generation =item Threading =item Calls to external programs =item Memory allocation =item Threads C<COND_WAIT>, F<os2.c> =back =item BUGS =back =over 4 =item AUTHOR =item SEE ALSO =back =head2 perlos390 - building and installing Perl for OS/390 and z/OS =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item Tools =item Unpacking Perl distribution on OS/390 =item Setup and utilities for Perl on OS/390 =item Configure Perl on OS/390 =item Build, Test, Install Perl on OS/390 =item Build Anomalies with Perl on OS/390 =item Testing Anomalies with Perl on OS/390 =item Installation Anomalies with Perl on OS/390 =item Usage Hints for Perl on OS/390 =item Floating Point Anomalies with Perl on OS/390 =item Modules and Extensions for Perl on OS/390 =back =item AUTHORS =item SEE ALSO =over 4 =item Mailing list for Perl on OS/390 =back =item HISTORY =back =head2 perlos400 - Perl version 5 on OS/400 =over 4 =item DESCRIPTION =over 4 =item Compiling Perl for OS/400 PASE =item Installing Perl in OS/400 PASE =item Using Perl in OS/400 PASE =item Known Problems =item Perl on ILE =back =item AUTHORS =back =head2 perlplan9 - Plan 9-specific documentation for Perl =over 4 =item DESCRIPTION =over 4 =item Invoking Perl =item What's in Plan 9 Perl =item What's not in Plan 9 Perl =item Perl5 Functions not currently supported in Plan 9 Perl =item Signals in Plan 9 Perl =back =item COMPILING AND INSTALLING PERL ON PLAN 9 =over 4 =item Installing Perl Documentation on Plan 9 =back =item BUGS =item Revision date =item AUTHOR =back =head2 perlqnx - Perl version 5 on QNX =over 4 
=item DESCRIPTION =over 4 =item Required Software for Compiling Perl on QNX4 /bin/sh, ar, nm, cpp, make =item Outstanding Issues with Perl on QNX4 =item QNX auxiliary files qnx/ar, qnx/cpp =item Outstanding issues with perl under QNX6 =item Cross-compilation =back =item AUTHOR =back =head2 perlriscos - Perl version 5 for RISC OS =over 4 =item DESCRIPTION =item BUILD =item AUTHOR =back =head2 perlsolaris - Perl version 5 on Solaris systems =over 4 =item DESCRIPTION =over 4 =item Solaris Version Numbers. =back =item RESOURCES Solaris FAQ, Precompiled Binaries, Solaris Documentation =item SETTING UP =over 4 =item File Extraction Problems on Solaris. =item Compiler and Related Tools on Solaris. =item Environment for Compiling perl on Solaris =back =item RUN CONFIGURE. =over 4 =item 64-bit perl on Solaris. =item Threads in perl on Solaris. =item Malloc Issues with perl on Solaris. =back =item MAKE PROBLEMS. Dynamic Loading Problems With GNU as and GNU ld, ld.so.1: ./perl: fatal: relocation error:, dlopen: stub interception failed, #error "No DATAMODEL_NATIVE specified", sh: ar: not found =item MAKE TEST =over 4 =item op/stat.t test 4 in Solaris =item nss_delete core dump from op/pwent or op/grent =back =item CROSS-COMPILATION =item PREBUILT BINARIES OF PERL FOR SOLARIS. =item RUNTIME ISSUES FOR PERL ON SOLARIS. =over 4 =item Limits on Numbers of Open Files on Solaris. =back =item SOLARIS-SPECIFIC MODULES. =item SOLARIS-SPECIFIC PROBLEMS WITH MODULES. 
=over 4 =item Proc::ProcessTable on Solaris =item BSD::Resource on Solaris =item Net::SSLeay on Solaris =back =item SunOS 4.x =item AUTHOR =back =head2 perlsymbian - Perl version 5 on Symbian OS =over 4 =item DESCRIPTION =over 4 =item Compiling Perl on Symbian =item Compilation problems =item PerlApp =item sisify.pl =item Using Perl in Symbian =back =item TO DO =item WARNING =item NOTE =item AUTHOR =item COPYRIGHT =item LICENSE =item HISTORY =back =head2 perlsynology - Perl 5 on Synology DSM systems =over 4 =item DESCRIPTION =over 4 =item Setting up the build environment =item Compiling Perl 5 =item Known problems Error message "No error definitions found", F<ext/DynaLoader/t/DynaLoader.t> =item Smoke testing Perl 5 =item Adding libraries =back =item REVISION =item AUTHOR =back =head2 perltru64 - Perl version 5 on Tru64 (formerly known as Digital UNIX formerly known as DEC OSF/1) systems =over 4 =item DESCRIPTION =over 4 =item Compiling Perl 5 on Tru64 =item Using Large Files with Perl on Tru64 =item Threaded Perl on Tru64 =item Long Doubles on Tru64 =item DB_File tests failing on Tru64 =item 64-bit Perl on Tru64 =item Warnings about floating-point overflow when compiling Perl on Tru64 =back =item Testing Perl on Tru64 =item ext/ODBM_File/odbm Test Failing With Static Builds =item Perl Fails Because Of Unresolved Symbol sockatmark =item read_cur_obj_info: bad file magic number =item AUTHOR =back =head2 perlvms - VMS-specific documentation for Perl =over 4 =item DESCRIPTION =item Installation =item Organization of Perl Images =over 4 =item Core Images =item Perl Extensions =item Installing static extensions =item Installing dynamic extensions =back =item File specifications =over 4 =item Syntax =item Filename Case =item Symbolic Links =item Wildcard expansion =item Pipes =back =item PERL5LIB and PERLLIB =item The Perl Forked Debugger =item PERL_VMS_EXCEPTION_DEBUG =item Command line =over 4 =item I/O redirection and backgrounding =item Command line switches -i, -S, 
-u =back =item Perl functions File tests, backticks, binmode FILEHANDLE, crypt PLAINTEXT, USER, die, dump, exec LIST, fork, getpwent, getpwnam, getpwuid, gmtime, kill, qx//, select (system call), stat EXPR, system LIST, time, times, unlink LIST, utime LIST, waitpid PID,FLAGS =item Perl variables %ENV, CRTL_ENV, CLISYM_[LOCAL], Any other string, $!, $^E, $?, $| =item Standard modules with VMS-specific differences =over 4 =item SDBM_File =back =item Revision date =item AUTHOR =back =head2 perlvos - Perl for Stratus OpenVOS =over 4 =item SYNOPSIS =item BUILDING PERL FOR OPENVOS =item INSTALLING PERL IN OPENVOS =item USING PERL IN OPENVOS =over 4 =item Restrictions of Perl on OpenVOS =back =item TEST STATUS =item SUPPORT STATUS =item AUTHOR =item LAST UPDATE =back =head2 perlwin32 - Perl under Windows =over 4 =item SYNOPSIS =item DESCRIPTION L<http://mingw.org>, L<http://mingw-w64.org> =over 4 =item Setting Up Perl on Windows Make, Command Shell, Microsoft Visual C++, Microsoft Visual C++ 2008-2019 Express/Community Edition, Microsoft Visual C++ 2005 Express Edition, Microsoft Visual C++ Toolkit 2003, Microsoft Platform SDK 64-bit Compiler, GCC, Intel C++ Compiler =item Building =item Testing Perl on Windows =item Installation of Perl on Windows =item Usage Hints for Perl on Windows Environment Variables, File Globbing, Using perl from the command line, Building Extensions, Command-line Wildcard Expansion, Notes on 64-bit Windows =item Running Perl Scripts =item Miscellaneous Things =back =item BUGS AND CAVEATS =item ACKNOWLEDGEMENTS =item AUTHORS Gary Ng E<lt>71564.1743@CompuServe.COME<gt>, Gurusamy Sarathy E<lt>gsar@activestate.comE<gt>, Nick Ing-Simmons E<lt>nick@ing-simmons.netE<gt>, Jan Dubois E<lt>jand@activestate.comE<gt>, Steve Hay E<lt>steve.m.hay@googlemail.comE<gt> =item SEE ALSO =item HISTORY =back =head2 perlboot - Links to information on object-oriented programming in Perl =over 4 =item DESCRIPTION =back =head2 perlbot - Links to information on 
object-oriented programming in Perl =over 4 =item DESCRIPTION =back =head2 perlrepository - Links to current information on the Perl source repository =over 4 =item DESCRIPTION =back =head2 perltodo - Link to the Perl to-do list =over 4 =item DESCRIPTION =back =head2 perltooc - Links to information on object-oriented programming in Perl =over 4 =item DESCRIPTION =back =head2 perltoot - Links to information on object-oriented programming in Perl =over 4 =item DESCRIPTION =back =head1 PRAGMA DOCUMENTATION =head2 attributes - get/set subroutine or variable attributes =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item What C<import> does =item Built-in Attributes lvalue, method, prototype(..), const, shared =item Available Subroutines get, reftype =item Package-specific Attribute Handling FETCH_I<type>_ATTRIBUTES, MODIFY_I<type>_ATTRIBUTES =item Syntax of Attribute Lists =back =item EXPORTS =over 4 =item Default exports =item Available exports =item Export tags defined =back =item EXAMPLES =item MORE EXAMPLES =item SEE ALSO =back =head2 autodie - Replace functions with ones that succeed or die with lexical scope =over 4 =item SYNOPSIS =item DESCRIPTION =item EXCEPTIONS =item CATEGORIES =item FUNCTION SPECIFIC NOTES =over 4 =item print =item flock =item system/exec =back =item GOTCHAS =item DIAGNOSTICS :void cannot be used with lexical scope, No user hints defined for %s =item Tips and Tricks =over 4 =item Importing autodie into another namespace than "caller" =back =item BUGS =over 4 =item autodie and string eval =item REPORTING BUGS =back =item FEEDBACK =item AUTHOR =item LICENSE =item SEE ALSO =item ACKNOWLEDGEMENTS =back =head2 autodie::Scope::Guard - Wrapper class for calling subs at end of scope =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item Methods =back =item AUTHOR =item LICENSE =back =head2 autodie::Scope::GuardStack - Hook stack for managing scopes via %^H =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item Methods =back =item AUTHOR =item 
LICENSE =back =head2 autodie::Util - Internal Utility subroutines for autodie and Fatal =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item Methods =back =item AUTHOR =item LICENSE =back =head2 autodie::exception - Exceptions from autodying functions. =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item Common Methods =back =back =over 4 =item Advanced methods =back =over 4 =item SEE ALSO =item LICENSE =item AUTHOR =back =head2 autodie::exception::system - Exceptions from autodying system(). =over 4 =item SYNOPSIS =item DESCRIPTION =back =over 4 =item stringify =back =over 4 =item LICENSE =item AUTHOR =back =head2 autodie::hints - Provide hints about user subroutines to autodie =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item Introduction =item What are hints? =item Example hints =back =item Manually setting hints from within your program =item Adding hints to your module =item Insisting on hints =back =over 4 =item Diagnostics Attempts to set_hints_for unidentifiable subroutine, fail hints cannot be provided with either scalar or list hints for %s, %s hint missing for %s =item ACKNOWLEDGEMENTS =item AUTHOR =item LICENSE =item SEE ALSO =back =head2 autodie::skip - Skip a package when throwing autodie exceptions =over 4 =item SYNOPSIS =item DESCRIPTION =item AUTHOR =item LICENSE =item SEE ALSO =back =head2 autouse - postpone load of modules until a function is used =over 4 =item SYNOPSIS =item DESCRIPTION =item WARNING =item AUTHOR =item SEE ALSO =back =head2 base - Establish an ISA relationship with base classes at compile time =over 4 =item SYNOPSIS =item DESCRIPTION =item DIAGNOSTICS Base class package "%s" is empty, Class 'Foo' tried to inherit from itself =item HISTORY =item CAVEATS =item SEE ALSO =back =head2 bigint - Transparent BigInteger support for Perl =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item use integer vs.
use bigint =item Options a or accuracy, p or precision, t or trace, hex, oct, l, lib, try or only, v or version =item Math Library =item Internal Format =item Sign =item Method calls =item Methods inf(), NaN(), e, PI, bexp(), bpi(), upgrade(), in_effect() =back =item CAVEATS Operator vs literal overloading, ranges, in_effect(), hex()/oct() =item MODULES USED =item EXAMPLES =item BUGS =item SUPPORT =item LICENSE =item SEE ALSO =item AUTHORS =back =head2 bignum - Transparent BigNumber support for Perl =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item Options a or accuracy, p or precision, t or trace, l or lib, hex, oct, v or version =item Methods =item Caveats inf(), NaN(), e, PI(), bexp(), bpi(), upgrade(), in_effect() =item Math Library =item INTERNAL FORMAT =item SIGN =back =item CAVEATS Operator vs literal overloading, in_effect(), hex()/oct() =item MODULES USED =item EXAMPLES =item BUGS =item SUPPORT RT: CPAN's request tracker, AnnoCPAN: Annotated CPAN documentation, CPAN Ratings, Search CPAN, CPAN Testers Matrix =item LICENSE =item SEE ALSO =item AUTHORS =back =head2 bigrat - Transparent BigNumber/BigRational support for Perl =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item Modules Used =item Math Library =item Sign =item Methods inf(), NaN(), e, PI, bexp(), bpi(), upgrade(), in_effect() =item MATH LIBRARY =item Caveat =item Options a or accuracy, p or precision, t or trace, l or lib, hex, oct, v or version =back =item CAVEATS Operator vs literal overloading, in_effect(), hex()/oct() =item EXAMPLES =item BUGS =item SUPPORT =item LICENSE =item SEE ALSO =item AUTHORS =back =head2 blib - Use MakeMaker's uninstalled version of a package =over 4 =item SYNOPSIS =item DESCRIPTION =item BUGS =item AUTHOR =back =head2 bytes - Perl pragma to expose the individual bytes of characters =over 4 =item NOTICE =item SYNOPSIS =item DESCRIPTION =item LIMITATIONS =item SEE ALSO =back =head2 charnames - access to Unicode character names and named character sequences; 
also define character names =over 4 =item SYNOPSIS =item DESCRIPTION =item LOOSE MATCHES =item ALIASES =item CUSTOM ALIASES =item charnames::string_vianame(I<name>) =item charnames::vianame(I<name>) =item charnames::viacode(I<code>) =item CUSTOM TRANSLATORS =item BUGS =back =head2 constant - Perl pragma to declare constants =over 4 =item SYNOPSIS =item DESCRIPTION =item NOTES =over 4 =item List constants =item Defining multiple constants at once =item Magic constants =back =item TECHNICAL NOTES =item CAVEATS =item SEE ALSO =item BUGS =item AUTHORS =item COPYRIGHT & LICENSE =back =head2 deprecate - Perl pragma for deprecating the inclusion of a module in core =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item Important Caveat =back =item EXPORT =item SEE ALSO =item AUTHOR =item COPYRIGHT AND LICENSE =back =head2 diagnostics, splain - produce verbose warning diagnostics =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item The C<diagnostics> Pragma =item The I<splain> Program =back =item EXAMPLES =item INTERNALS =item BUGS =item AUTHOR =back =head2 encoding - allows you to write your script in non-ASCII and non-UTF-8 =over 4 =item WARNING =item SYNOPSIS =item DESCRIPTION C<use encoding ['I<ENCNAME>'] ;>, C<use encoding I<ENCNAME>, Filter=E<gt>1;>, C<no encoding;> =item OPTIONS =over 4 =item Setting C<STDIN> and/or C<STDOUT> individually =item The C<:locale> sub-pragma =back =item CAVEATS =over 4 =item SIDE EFFECTS =item DO NOT MIX MULTIPLE ENCODINGS =item Prior to Perl v5.22 =item Prior to Encode version 1.87 =item Prior to Perl v5.8.1 "NON-EUC" doublebyte encodings, C<tr///>, Legend of characters above =back =item EXAMPLE - Greekperl =item BUGS Thread safety, Can't be used by more than one module in a single program, Other modules using C<STDIN> and C<STDOUT> get the encoded stream, literals in regex that are longer than 127 bytes, EBCDIC, C<format>, See also L</CAVEATS> =item HISTORY =item SEE ALSO =back =head2 encoding::warnings - Warn on implicit encoding 
conversions =over 4 =item VERSION =item NOTICE =item SYNOPSIS =item DESCRIPTION =over 4 =item Overview of the problem =item Detecting the problem =item Solving the problem Upgrade both sides to unicode-strings, Downgrade both sides to byte-strings, Specify the encoding for implicit byte-string upgrading, PerlIO layers for B<STDIN> and B<STDOUT>, Literal conversions, Implicit upgrading for byte-strings =back =item CAVEATS =back =over 4 =item SEE ALSO =item AUTHORS =item COPYRIGHT =back =head2 experimental - Experimental features made easy =over 4 =item VERSION =item SYNOPSIS =item DESCRIPTION C<array_base> - allow the use of C<$[> to change the starting index of C<@array>, C<autoderef> - allow push, each, keys, and other built-ins on references, C<bitwise> - allow the new stringwise bit operators, C<const_attr> - allow the :const attribute on subs, C<lexical_topic> - allow the use of lexical C<$_> via C<my $_>, C<lexical_subs> - allow the use of lexical subroutines, C<postderef> - allow the use of postfix dereferencing expressions, including in interpolating strings, C<re_strict> - enables strict mode in regular expressions, C<refaliasing> - allow aliasing via C<\$x = \$y>, C<regex_sets> - allow extended bracketed character classes in regexps, C<signatures> - allow subroutine signatures (for named arguments), C<smartmatch> - allow the use of C<~~>, C<switch> - allow the use of C<~~>, given, and when, C<win32_perlio> - allows the use of the :win32 IO layer =over 4 =item Ordering matters =item Disclaimer =back =item SEE ALSO =item AUTHOR =item COPYRIGHT AND LICENSE =back =head2 feature - Perl pragma to enable new features =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item Lexical effect =item C<no feature> =back =item AVAILABLE FEATURES =over 4 =item The 'say' feature =item The 'state' feature =item The 'switch' feature =item The 'unicode_strings' feature =item The 'unicode_eval' and 'evalbytes' features =item The 'current_sub' feature =item The 'array_base' 
feature =item The 'fc' feature =item The 'lexical_subs' feature =item The 'postderef' and 'postderef_qq' features =item The 'signatures' feature =item The 'refaliasing' feature =item The 'bitwise' feature =item The 'declared_refs' feature =item The 'isa' feature =item The 'indirect' feature =back =item FEATURE BUNDLES =item IMPLICIT LOADING =back =head2 fields - compile-time class fields =over 4 =item SYNOPSIS =item DESCRIPTION new, phash =item SEE ALSO =back =head2 filetest - Perl pragma to control the filetest permission operators =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item Consider this carefully =item The "access" sub-pragma =item Limitation with regard to C<_> =back =back =head2 if - C<use> a Perl module if a condition holds =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item C<use if> =item C<no if> =back =item BUGS =item SEE ALSO =item AUTHOR =item COPYRIGHT AND LICENCE =back =head2 integer - Perl pragma to use integer arithmetic instead of floating point =over 4 =item SYNOPSIS =item DESCRIPTION =back =head2 less - perl pragma to request less of something =over 4 =item SYNOPSIS =item DESCRIPTION =item FOR MODULE AUTHORS =over 4 =item C<< BOOLEAN = less->of( FEATURE ) >> =item C<< FEATURES = less->of() >> =back =item CAVEATS This probably does nothing, This works only on 5.10+ =back =head2 lib - manipulate @INC at compile time =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item Adding directories to @INC =item Deleting directories from @INC =item Restoring original @INC =back =item CAVEATS =item NOTES =item SEE ALSO =item AUTHOR =item COPYRIGHT AND LICENSE =back =head2 locale - Perl pragma to use or avoid POSIX locales for built-in operations =over 4 =item WARNING =item SYNOPSIS =item DESCRIPTION =back =head2 mro - Method Resolution Order =over 4 =item SYNOPSIS =item DESCRIPTION =item OVERVIEW =item The C3 MRO =over 4 =item What is C3? 
=item How does C3 work =back =item Functions =over 4 =item mro::get_linear_isa($classname[, $type]) =item mro::set_mro ($classname, $type) =item mro::get_mro($classname) =item mro::get_isarev($classname) =item mro::is_universal($classname) =item mro::invalidate_all_method_caches() =item mro::method_changed_in($classname) =item mro::get_pkg_gen($classname) =item next::method =item next::can =item maybe::next::method =back =item SEE ALSO =over 4 =item The original Dylan paper L<http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.19.3910&rep=rep1&type=pdf> =item Python 2.3 MRO L<https://www.python.org/download/releases/2.3/mro/> =item Class::C3 L<Class::C3> =back =item AUTHOR =back =head2 ok - Alternative to Test::More::use_ok =over 4 =item SYNOPSIS =item DESCRIPTION =item CC0 1.0 Universal =back =head2 open - perl pragma to set default PerlIO layers for input and output =over 4 =item SYNOPSIS =item DESCRIPTION =item IMPLEMENTATION DETAILS =item SEE ALSO =back =head2 ops - Perl pragma to restrict unsafe operations when compiling =over 4 =item SYNOPSIS =item DESCRIPTION =item SEE ALSO =back =head2 overload - Package for overloading Perl operations =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item Fundamentals =item Overloadable Operations C<not>, C<neg>, C<++>, C<-->, I<Assignments>, I<Non-mutators with a mutator variant>, C<int>, I<String, numeric, boolean, and regexp conversions>, I<Iteration>, I<File tests>, I<Matching>, I<Dereferencing>, I<Special> =item Magic Autogeneration =item Special Keys for C<use overload> defined, but FALSE, C<undef>, TRUE =item How Perl Chooses an Operator Implementation =item Losing Overloading =item Inheritance and Overloading Method names in the C<use overload> directive, Overloading of an operation is inherited by derived classes =item Run-time Overloading =item Public Functions overload::StrVal(arg), overload::Overloaded(arg), overload::Method(obj,op) =item Overloading Constants integer, float, binary, q, qr =back =item
IMPLEMENTATION =item COOKBOOK =over 4 =item Two-face Scalars =item Two-face References =item Symbolic Calculator =item I<Really> Symbolic Calculator =back =item AUTHOR =item SEE ALSO =item DIAGNOSTICS Odd number of arguments for overload::constant, '%s' is not an overloadable type, '%s' is not a code reference, overload arg '%s' is invalid =item BUGS AND PITFALLS =back =head2 overloading - perl pragma to lexically control overloading =over 4 =item SYNOPSIS =item DESCRIPTION C<no overloading>, C<no overloading @ops>, C<use overloading>, C<use overloading @ops> =back =head2 parent - Establish an ISA relationship with base classes at compile time =over 4 =item SYNOPSIS =item DESCRIPTION =item HISTORY =item CAVEATS =item SEE ALSO L<base>, L<parent::versioned> =item AUTHORS AND CONTRIBUTORS =item MAINTAINER =item LICENSE =back =head2 re - Perl pragma to alter regular expression behaviour =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item 'taint' mode =item 'eval' mode =item 'strict' mode =item '/flags' mode =item 'debug' mode =item 'Debug' mode Compile related options, COMPILE, PARSE, OPTIMISE, TRIEC, DUMP, FLAGS, TEST, Execute related options, EXECUTE, MATCH, TRIEE, INTUIT, Extra debugging options, EXTRA, BUFFERS, TRIEM, STATE, STACK, GPOS, OPTIMISEM, OFFSETS, OFFSETSDBG, DUMP_PRE_OPTIMIZE, WILDCARD, Other useful flags, ALL, All, MORE, More =item Exportable Functions is_regexp($ref), regexp_pattern($ref), regmust($ref), regname($name,$all), regnames($all), regnames_count() =back =item SEE ALSO =back =head2 sigtrap - Perl pragma to enable simple signal handling =over 4 =item SYNOPSIS =item DESCRIPTION =item OPTIONS =over 4 =item SIGNAL HANDLERS B<stack-trace>, B<die>, B<handler> I<your-handler> =item SIGNAL LISTS B<normal-signals>, B<error-signals>, B<old-interface-signals> =item OTHER B<untrapped>, B<any>, I<signal>, I<number> =back =item EXAMPLES =back =head2 sort - perl pragma to control sort() behaviour =over 4 =item SYNOPSIS =item DESCRIPTION =item CAVEATS 
=back =head2 strict - Perl pragma to restrict unsafe constructs =over 4 =item SYNOPSIS =item DESCRIPTION C<strict refs>, C<strict vars>, C<strict subs> =item HISTORY =back =head2 subs - Perl pragma to predeclare subroutine names =over 4 =item SYNOPSIS =item DESCRIPTION =back =head2 threads - Perl interpreter-based threads =over 4 =item VERSION =item WARNING =item SYNOPSIS =item DESCRIPTION $thr = threads->create(FUNCTION, ARGS), $thr->join(), $thr->detach(), threads->detach(), threads->self(), $thr->tid(), threads->tid(), "$thr", threads->object($tid), threads->yield(), threads->list(), threads->list(threads::all), threads->list(threads::running), threads->list(threads::joinable), $thr1->equal($thr2), async BLOCK;, $thr->error(), $thr->_handle(), threads->_handle() =item EXITING A THREAD threads->exit(), threads->exit(status), die(), exit(status), use threads 'exit' => 'threads_only', threads->create({'exit' => 'thread_only'}, ...), $thr->set_thread_exit_only(boolean), threads->set_thread_exit_only(boolean) =item THREAD STATE $thr->is_running(), $thr->is_joinable(), $thr->is_detached(), threads->is_detached() =item THREAD CONTEXT =over 4 =item Explicit context =item Implicit context =item $thr->wantarray() =item threads->wantarray() =back =item THREAD STACK SIZE threads->get_stack_size();, $size = $thr->get_stack_size();, $old_size = threads->set_stack_size($new_size);, use threads ('stack_size' => VALUE);, $ENV{'PERL5_ITHREADS_STACK_SIZE'}, threads->create({'stack_size' => VALUE}, FUNCTION, ARGS), $thr2 = $thr1->create(FUNCTION, ARGS) =item THREAD SIGNALLING $thr->kill('SIG...'); =item WARNINGS Perl exited with active threads:, Thread creation failed: pthread_create returned #, Thread # terminated abnormally: .., Using minimum thread stack size of #, Thread creation failed: pthread_attr_setstacksize(I<SIZE>) returned 22 =item ERRORS This Perl not built to support threads, Cannot change stack size of an existing thread, Cannot signal threads without safe signals, 
Unrecognized signal name: .. =item BUGS AND LIMITATIONS Thread-safe modules, Using non-thread-safe modules, Memory consumption, Current working directory, Locales, Environment variables, Catching signals, Parent-child threads, Unsafe signals, Perl has been built with C<PERL_OLD_SIGNALS> (see C<perl -V>), The environment variable C<PERL_SIGNALS> is set to C<unsafe> (see L<perlrun/"PERL_SIGNALS">), The module L<Perl::Unsafe::Signals> is used, Identity of objects returned from threads, Returning blessed objects from threads, END blocks in threads, Open directory handles, Detached threads and global destruction, Perl Bugs and the CPAN Version of L<threads> =item REQUIREMENTS =item SEE ALSO =item AUTHOR =item LICENSE =item ACKNOWLEDGEMENTS =back =head2 threads::shared - Perl extension for sharing data structures between threads =over 4 =item VERSION =item SYNOPSIS =item DESCRIPTION =item EXPORT =item FUNCTIONS share VARIABLE, shared_clone REF, is_shared VARIABLE, lock VARIABLE, cond_wait VARIABLE, cond_wait CONDVAR, LOCKVAR, cond_timedwait VARIABLE, ABS_TIMEOUT, cond_timedwait CONDVAR, ABS_TIMEOUT, LOCKVAR, cond_signal VARIABLE, cond_broadcast VARIABLE =item OBJECTS =item NOTES =item WARNINGS cond_broadcast() called on unlocked variable, cond_signal() called on unlocked variable =item BUGS AND LIMITATIONS =item SEE ALSO =item AUTHOR =item LICENSE =back =head2 utf8 - Perl pragma to enable/disable UTF-8 (or UTF-EBCDIC) in source code =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item Utility functions C<$num_octets = utf8::upgrade($string)>, C<$success = utf8::downgrade($string[, $fail_ok])>, C<utf8::encode($string)>, C<$success = utf8::decode($string)>, C<$unicode = utf8::native_to_unicode($code_point)>, C<$native = utf8::unicode_to_native($code_point)>, C<$flag = utf8::is_utf8($string)>, C<$flag = utf8::valid($string)> =back =item BUGS =item SEE ALSO =back =head2 vars - Perl pragma to predeclare global variable names =over 4 =item SYNOPSIS =item DESCRIPTION =back 
=head2 version - Perl extension for Version Objects =over 4 =item SYNOPSIS =item DESCRIPTION =item TYPES OF VERSION OBJECTS Decimal Versions, Dotted Decimal Versions =item DECLARING VERSIONS =over 4 =item How to convert a module from decimal to dotted-decimal =item How to C<declare()> a dotted-decimal version =back =item PARSING AND COMPARING VERSIONS =over 4 =item How to C<parse()> a version =item How to check for a legal version string C<is_lax()>, C<is_strict()> =item How to compare version objects =back =item OBJECT METHODS =over 4 =item is_alpha() =item is_qv() =item normal() =item numify() =item stringify() =back =item EXPORTED FUNCTIONS =over 4 =item qv() =item is_lax() =item is_strict() =back =item AUTHOR =item SEE ALSO =back =head2 version::Internals - Perl extension for Version Objects =over 4 =item DESCRIPTION =item WHAT IS A VERSION? Decimal versions, Dotted-Decimal versions =over 4 =item Decimal Versions =item Dotted-Decimal Versions =item Alpha Versions =item Regular Expressions for Version Parsing C<$version::LAX>, C<$version::STRICT>, v1.234.5 =back =item IMPLEMENTATION DETAILS =over 4 =item Equivalence between Decimal and Dotted-Decimal Versions =item Quoting Rules =item What about v-strings? 
=item Version Object Internals original, qv, alpha, version =item Replacement UNIVERSAL::VERSION =back =item USAGE DETAILS =over 4 =item Using modules that use version.pm Decimal versions always work, Dotted-Decimal versions work sometimes =item Object Methods new(), qv(), Normal Form, Numification, Stringification, Comparison operators, Logical Operators =back =item AUTHOR =item SEE ALSO =back =head2 vmsish - Perl pragma to control VMS-specific language features =over 4 =item SYNOPSIS =item DESCRIPTION C<vmsish status>, C<vmsish exit>, C<vmsish time>, C<vmsish hushed> =back =head2 warnings - Perl pragma to control optional warnings =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item Default Warnings and Optional Warnings =item What's wrong with B<-w> and C<$^W> =item Controlling Warnings from the Command Line B<-w> X<-w>, B<-W> X<-W>, B<-X> X<-X> =item Backward Compatibility =item Category Hierarchy X<warning, categories> =item Fatal Warnings X<warning, fatal> =item Reporting Warnings from a Module X<warning, reporting> X<warning, registering> =back =item FUNCTIONS use warnings::register, warnings::enabled(), warnings::enabled($category), warnings::enabled($object), warnings::enabled_at_level($category, $level), warnings::fatal_enabled(), warnings::fatal_enabled($category), warnings::fatal_enabled($object), warnings::fatal_enabled_at_level($category, $level), warnings::warn($message), warnings::warn($category, $message), warnings::warn($object, $message), warnings::warn_at_level($category, $level, $message), warnings::warnif($message), warnings::warnif($category, $message), warnings::warnif($object, $message), warnings::warnif_at_level($category, $level, $message), warnings::register_categories(@names) =back =head2 warnings::register - warnings import function =over 4 =item SYNOPSIS =item DESCRIPTION =back =head1 MODULE DOCUMENTATION =head2 AnyDBM_File - provide framework for multiple DBMs =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item DBM Comparisons 
[0], [1], [2], [3] =back =item SEE ALSO =back =head2 App::Cpan - easily interact with CPAN from the command line =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item Options -a, -A module [ module ... ], -c module, -C module [ module ... ], -D module [ module ... ], -f, -F, -g module [ module ... ], -G module [ module ... ], -h, -i module [ module ... ], -I, -j Config.pm, -J, -l, -L author [ author ... ], -m, -M mirror1,mirror2,.., -n, -O, -p, -P, -r, -s, -t module [ module ... ], -T, -u, -v, -V, -w, -x module [ module ... ], -I<X> =item Examples =item Environment variables NONINTERACTIVE_TESTING, PERL_MM_USE_DEFAULT, CPAN_OPTS, CPANSCRIPT_LOGLEVEL, GIT_COMMAND =item Methods =back =back run() =over 4 =item EXIT VALUES =item TO DO =item BUGS =item SEE ALSO =item SOURCE AVAILABILITY =item CREDITS =item AUTHOR =item COPYRIGHT =back =head2 App::Prove - Implements the C<prove> command. =over 4 =item VERSION =back =over 4 =item DESCRIPTION =item SYNOPSIS =back =over 4 =item METHODS =over 4 =item Class Methods =back =back =over 4 =item Attributes C<archive>, C<argv>, C<backwards>, C<blib>, C<color>, C<directives>, C<dry>, C<exec>, C<extensions>, C<failures>, C<comments>, C<formatter>, C<harness>, C<ignore_exit>, C<includes>, C<jobs>, C<lib>, C<merge>, C<modules>, C<parse>, C<plugins>, C<quiet>, C<really_quiet>, C<recurse>, C<rules>, C<show_count>, C<show_help>, C<show_man>, C<show_version>, C<shuffle>, C<state>, C<state_class>, C<taint_fail>, C<taint_warn>, C<test_args>, C<timer>, C<verbose>, C<warnings_fail>, C<warnings_warn>, C<tapversion>, C<trap> =back =over 4 =item PLUGINS =over 4 =item Sample Plugin =back =item SEE ALSO =back =head2 App::Prove::State - State storage for the C<prove> command. 
=over 4 =item VERSION =back =over 4 =item DESCRIPTION =item SYNOPSIS =back =over 4 =item METHODS =over 4 =item Class Methods C<store>, C<extensions> (optional), C<result_class> (optional) =back =back =over 4 =item C<result_class> =back =over 4 =item C<extensions> =back =over 4 =item C<results> =back =over 4 =item C<commit> =back =over 4 =item Instance Methods C<last>, C<failed>, C<passed>, C<all>, C<hot>, C<todo>, C<slow>, C<fast>, C<new>, C<old>, C<save> =back =head2 App::Prove::State::Result - Individual test suite results. =over 4 =item VERSION =back =over 4 =item DESCRIPTION =item SYNOPSIS =back =over 4 =item METHODS =over 4 =item Class Methods =back =back =over 4 =item C<state_version> =back =over 4 =item C<test_class> =back =head2 App::Prove::State::Result::Test - Individual test results. =over 4 =item VERSION =back =over 4 =item DESCRIPTION =item SYNOPSIS =back =over 4 =item METHODS =over 4 =item Class Methods =back =back =over 4 =item Instance Methods =back =head2 Archive::Tar - module for manipulations of tar archives =over 4 =item SYNOPSIS =item DESCRIPTION =item Object Methods =over 4 =item Archive::Tar->new( [$file, $compressed] ) =back =back =over 4 =item $tar->read ( $filename|$handle, [$compressed, {opt => 'val'}] ) limit, filter, md5, extract =back =over 4 =item $tar->contains_file( $filename ) =back =over 4 =item $tar->extract( [@filenames] ) =back =over 4 =item $tar->extract_file( $file, [$extract_path] ) =back =over 4 =item $tar->list_files( [\@properties] ) =back =over 4 =item $tar->get_files( [@filenames] ) =back =over 4 =item $tar->get_content( $file ) =back =over 4 =item $tar->replace_content( $file, $content ) =back =over 4 =item $tar->rename( $file, $new_name ) =back =over 4 =item $tar->chmod( $file, $mode ) =back =over 4 =item $tar->chown( $file, $uname [, $gname] ) =back =over 4 =item $tar->remove (@filenamelist) =back =over 4 =item $tar->clear =back =over 4 =item $tar->write ( [$file, $compressed, $prefix] ) =back =over 4 =item 
$tar->add_files( @filenamelist ) =back =over 4 =item $tar->add_data ( $filename, $data, [$opthashref] ) FILE, HARDLINK, SYMLINK, CHARDEV, BLOCKDEV, DIR, FIFO, SOCKET =back =over 4 =item $tar->error( [$BOOL] ) =back =over 4 =item $tar->setcwd( $cwd ); =back =over 4 =item Class Methods =over 4 =item Archive::Tar->create_archive($file, $compressed, @filelist) =back =back =over 4 =item Archive::Tar->iter( $filename, [ $compressed, {opt => $val} ] ) =back =over 4 =item Archive::Tar->list_archive($file, $compressed, [\@properties]) =back =over 4 =item Archive::Tar->extract_archive($file, $compressed) =back =over 4 =item $bool = Archive::Tar->has_io_string =back =over 4 =item $bool = Archive::Tar->has_perlio =back =over 4 =item $bool = Archive::Tar->has_zlib_support =back =over 4 =item $bool = Archive::Tar->has_bzip2_support =back =over 4 =item $bool = Archive::Tar->has_xz_support =back =over 4 =item Archive::Tar->can_handle_compressed_files =back =over 4 =item GLOBAL VARIABLES =over 4 =item $Archive::Tar::FOLLOW_SYMLINK =item $Archive::Tar::CHOWN =item $Archive::Tar::CHMOD =item $Archive::Tar::SAME_PERMISSIONS =item $Archive::Tar::DO_NOT_USE_PREFIX =item $Archive::Tar::DEBUG =item $Archive::Tar::WARN =item $Archive::Tar::error =item $Archive::Tar::INSECURE_EXTRACT_MODE =item $Archive::Tar::HAS_PERLIO =item $Archive::Tar::HAS_IO_STRING =item $Archive::Tar::ZERO_PAD_NUMBERS =item Tuning the way RESOLVE_SYMLINK works =back =back =over 4 =item FAQ What's the minimum perl version required to run Archive::Tar?, Isn't Archive::Tar slow?, Isn't Archive::Tar heavier on memory than /bin/tar?, Can you lazy-load data instead?, How much memory will an X kb tar file need?, What do you do with unsupported filetypes in an archive?, I'm using WinZip, or some other non-POSIX client, and files are not being extracted properly!, How do I extract only files that have property X from an archive?, How do I access .tar.Z files?, How do I handle Unicode strings? 
=item CAVEATS =item TODO Check if passed in handles are open for read/write, Allow archives to be passed in as string, Facilitate processing an opened filehandle of a compressed archive =item SEE ALSO The GNU tar specification, The PAX format specification, A comparison of GNU and POSIX tar standards; C<http://www.delorie.com/gnu/docs/tar/tar_114.html>, GNU tar intends to switch to POSIX compatibility, A Comparison between various tar implementations =item AUTHOR =item ACKNOWLEDGEMENTS =item COPYRIGHT =back =head2 Archive::Tar::File - a subclass for in-memory extracted file from Archive::Tar =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item Accessors name, mode, uid, gid, size, mtime, chksum, type, linkname, magic, version, uname, gname, devmajor, devminor, prefix, raw =back =item Methods =over 4 =item Archive::Tar::File->new( file => $path ) =item Archive::Tar::File->new( data => $path, $data, $opt ) =item Archive::Tar::File->new( chunk => $chunk ) =back =back =over 4 =item $bool = $file->extract( [ $alternative_name ] ) =back =over 4 =item $path = $file->full_path =back =over 4 =item $bool = $file->validate =back =over 4 =item $bool = $file->has_content =back =over 4 =item $content = $file->get_content =back =over 4 =item $cref = $file->get_content_by_ref =back =over 4 =item $bool = $file->replace_content( $content ) =back =over 4 =item $bool = $file->rename( $new_name ) =back =over 4 =item $bool = $file->chmod( $mode ) =back =over 4 =item $bool = $file->chown( $user [, $group]) =back =over 4 =item Convenience methods $file->is_file, $file->is_dir, $file->is_hardlink, $file->is_symlink, $file->is_chardev, $file->is_blockdev, $file->is_fifo, $file->is_socket, $file->is_longlink, $file->is_label, $file->is_unknown =back =head2 Attribute::Handlers - Simpler definition of attribute handlers =over 4 =item VERSION =item SYNOPSIS =item DESCRIPTION [0], [1], [2], [3], [4], [5], [6], [7] =over 4 =item Typed lexicals =item Type-specific attribute handlers =item 
Non-interpretive attribute handlers =item Phase-specific attribute handlers =item Attributes as C<tie> interfaces =back =item EXAMPLES =item UTILITY FUNCTIONS findsym =item DIAGNOSTICS C<Bad attribute type: ATTR(%s)>, C<Attribute handler %s doesn't handle %s attributes>, C<Declaration of %s attribute in package %s may clash with future reserved word>, C<Can't have two ATTR specifiers on one subroutine>, C<Can't autotie a %s>, C<Internal error: %s symbol went missing>, C<Won't be able to apply END handler> =item AUTHOR =item BUGS =item COPYRIGHT AND LICENSE =back =head2 AutoLoader - load subroutines only on demand =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item Subroutine Stubs =item Using B<AutoLoader>'s AUTOLOAD Subroutine =item Overriding B<AutoLoader>'s AUTOLOAD Subroutine =item Package Lexicals =item Not Using AutoLoader =item B<AutoLoader> vs. B<SelfLoader> =item Forcing AutoLoader to Load a Function =back =item CAVEATS =item SEE ALSO =item AUTHOR =item COPYRIGHT AND LICENSE =back =head2 AutoSplit - split a package for autoloading =over 4 =item SYNOPSIS =item DESCRIPTION $keep, $check, $modtime =over 4 =item Multiple packages =back =item DIAGNOSTICS =item AUTHOR =item COPYRIGHT AND LICENSE =back =head2 B - The Perl Compiler Backend =over 4 =item SYNOPSIS =item DESCRIPTION =item OVERVIEW =item Utility Functions =over 4 =item Functions Returning C<B::SV>, C<B::AV>, C<B::HV>, and C<B::CV> objects sv_undef, sv_yes, sv_no, svref_2object(SVREF), amagic_generation, init_av, check_av, unitcheck_av, begin_av, end_av, comppadlist, regex_padav, main_cv =item Functions for Examining the Symbol Table walksymtable(SYMREF, METHOD, RECURSE, PREFIX) =item Functions Returning C<B::OP> objects or for walking op trees main_root, main_start, walkoptree(OP, METHOD), walkoptree_debug(DEBUG) =item Miscellaneous Utility Functions ppname(OPNUM), hash(STR), cast_I32(I), minus_c, cstring(STR), perlstring(STR), safename(STR), class(OBJ), threadsv_names =item Exported utility 
variables @optype, @specialsv_name =back =item OVERVIEW OF CLASSES =over 4 =item SV-RELATED CLASSES =item B::SV Methods REFCNT, FLAGS, object_2svref =item B::IV Methods IV, IVX, UVX, int_value, needs64bits, packiv =item B::NV Methods NV, NVX, COP_SEQ_RANGE_LOW, COP_SEQ_RANGE_HIGH =item B::RV Methods RV =item B::PV Methods PV, RV, PVX, CUR, LEN =item B::PVMG Methods MAGIC, SvSTASH =item B::MAGIC Methods MOREMAGIC, precomp, PRIVATE, TYPE, FLAGS, OBJ, PTR, REGEX =item B::PVLV Methods TARGOFF, TARGLEN, TYPE, TARG =item B::BM Methods USEFUL, PREVIOUS, RARE, TABLE =item B::REGEXP Methods REGEX, precomp, qr_anoncv, compflags =item B::GV Methods is_empty, NAME, SAFENAME, STASH, SV, IO, FORM, AV, HV, EGV, CV, CVGEN, LINE, FILE, FILEGV, GvREFCNT, FLAGS, GPFLAGS =item B::IO Methods LINES, PAGE, PAGE_LEN, LINES_LEFT, TOP_NAME, TOP_GV, FMT_NAME, FMT_GV, BOTTOM_NAME, BOTTOM_GV, SUBPROCESS, IoTYPE, IoFLAGS, IsSTD =item B::AV Methods FILL, MAX, ARRAY, ARRAYelt =item B::CV Methods STASH, START, ROOT, GV, FILE, DEPTH, PADLIST, OUTSIDE, OUTSIDE_SEQ, XSUB, XSUBANY, CvFLAGS, const_sv, NAME_HEK =item B::HV Methods FILL, MAX, KEYS, RITER, NAME, ARRAY =item OP-RELATED CLASSES =item B::OP Methods next, sibling, parent, name, ppaddr, desc, targ, type, opt, flags, private, spare =item B::UNOP Method first =item B::UNOP_AUX Methods (since 5.22) aux_list(cv), string(cv) =item B::BINOP Method last =item B::LOGOP Method other =item B::LISTOP Method children =item B::PMOP Methods pmreplroot, pmreplstart, pmflags, precomp, pmoffset, code_list, pmregexp =item B::SVOP Methods sv, gv =item B::PADOP Method padix =item B::PVOP Method pv =item B::LOOP Methods redoop, nextop, lastop =item B::COP Methods label, stash, stashpv, stashoff (threaded only), file, cop_seq, line, warnings, io, hints, hints_hash =item B::METHOP Methods (Since Perl 5.22) first, meth_sv =item PAD-RELATED CLASSES =item B::PADLIST Methods MAX, ARRAY, ARRAYelt, NAMES, REFCNT, id, outid =item B::PADNAMELIST Methods MAX, ARRAY, 
ARRAYelt, REFCNT =item B::PADNAME Methods PV, PVX, LEN, REFCNT, FLAGS, TYPE, SvSTASH, OURSTASH, PROTOCV, COP_SEQ_RANGE_LOW, COP_SEQ_RANGE_HIGH, PARENT_PAD_INDEX, PARENT_FAKELEX_FLAGS =item $B::overlay =back =item AUTHOR =back =head2 B::Concise - Walk Perl syntax tree, printing concise info about ops =over 4 =item SYNOPSIS =item DESCRIPTION =item EXAMPLE =item OPTIONS =over 4 =item Options for Opcode Ordering B<-basic>, B<-exec>, B<-tree> =item Options for Line-Style B<-concise>, B<-terse>, B<-linenoise>, B<-debug>, B<-env> =item Options for tree-specific formatting B<-compact>, B<-loose>, B<-vt>, B<-ascii> =item Options controlling sequence numbering B<-base>I<n>, B<-bigendian>, B<-littleendian> =item Other options B<-src>, B<-stash="somepackage">, B<-main>, B<-nomain>, B<-nobanner>, B<-banner>, B<-banneris> => subref =item Option Stickiness =back =item ABBREVIATIONS =over 4 =item OP class abbreviations =item OP flags abbreviations =back =item FORMATTING SPECIFICATIONS =over 4 =item Special Patterns B<(x(>I<exec_text>B<;>I<basic_text>B<)x)>, B<(*(>I<text>B<)*)>, B<(*(>I<text1>B<;>I<text2>B<)*)>, B<(?(>I<text1>B<#>I<var>I<Text2>B<)?)>, B<~> =item # Variables B<#>I<var>, B<#>I<var>I<N>, B<#>I<Var>, B<#addr>, B<#arg>, B<#class>, B<#classsym>, B<#coplabel>, B<#exname>, B<#extarg>, B<#firstaddr>, B<#flags>, B<#flagval>, B<#hints>, B<#hintsval>, B<#hyphseq>, B<#label>, B<#lastaddr>, B<#name>, B<#NAME>, B<#next>, B<#nextaddr>, B<#noise>, B<#private>, B<#privval>, B<#seq>, B<#opt>, B<#sibaddr>, B<#svaddr>, B<#svclass>, B<#svval>, B<#targ>, B<#targarg>, B<#targarglife>, B<#typenum> =back =item One-Liner Command tips perl -MO=Concise,bar foo.pl, perl -MDigest::MD5=md5 -MO=Concise,md5 -e1, perl -MPOSIX -MO=Concise,_POSIX_ARG_MAX -e1, perl -MPOSIX -MO=Concise,a -e 'print _POSIX_SAVED_IDS', perl -MPOSIX -MO=Concise,a -e 'sub a{_POSIX_SAVED_IDS}', perl -MB::Concise -e 'B::Concise::compile("-exec","-src", \%B::Concise::)->()' =item Using B::Concise outside of the O framework 
=over 4 =item Example: Altering Concise Renderings =item set_style() =item set_style_standard($name) =item add_style () =item add_callback () =item Running B::Concise::compile() =item B::Concise::reset_sequence() =item Errors =back =item AUTHOR =back =head2 B::Deparse - Perl compiler backend to produce perl code =over 4 =item SYNOPSIS =item DESCRIPTION =item OPTIONS B<-d>, B<-f>I<FILE>, B<-l>, B<-p>, B<-P>, B<-q>, B<-s>I<LETTERS>, B<C>, B<i>I<NUMBER>, B<T>, B<v>I<STRING>B<.>, B<-x>I<LEVEL> =item USING B::Deparse AS A MODULE =over 4 =item Synopsis =item Description =item new =item ambient_pragmas strict, $[, bytes, utf8, integer, re, warnings, hint_bits, warning_bits, %^H =item coderef2text =back =item BUGS =item AUTHOR =back =head2 B::Op_private - OP op_private flag definitions =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item C<%bits> =item C<%defines> =item C<%labels> =item C<%ops_using> =back =back =head2 B::Showlex - Show lexical variables used in functions or files =over 4 =item SYNOPSIS =item DESCRIPTION =item EXAMPLES =over 4 =item OPTIONS =back =item SEE ALSO =item TODO =item AUTHOR =back =head2 B::Terse - Walk Perl syntax tree, printing terse info about ops =over 4 =item SYNOPSIS =item DESCRIPTION =item AUTHOR =back =head2 B::Xref - Generates cross reference reports for Perl programs =over 4 =item SYNOPSIS =item DESCRIPTION i, &, s, r =item OPTIONS C<-oFILENAME>, C<-r>, C<-d>, C<-D[tO]> =item BUGS =item AUTHOR =back =head2 Benchmark - benchmark running times of Perl code =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item Methods new, debug, iters =item Standard Exports timeit(COUNT, CODE), timethis ( COUNT, CODE, [ TITLE, [ STYLE ]] ), timethese ( COUNT, CODEHASHREF, [ STYLE ] ), timediff ( T1, T2 ), timestr ( TIMEDIFF, [ STYLE, [ FORMAT ] ] ) =item Optional Exports clearcache ( COUNT ), clearallcache ( ), cmpthese ( COUNT, CODEHASHREF, [ STYLE ] ), cmpthese ( RESULTSHASHREF, [ STYLE ] ), countit(TIME, CODE), disablecache ( ), enablecache ( ), 
timesum ( T1, T2 ) =item :hireswallclock =back =item Benchmark Object cpu_p, cpu_c, cpu_a, real, iters =item NOTES =item EXAMPLES =item INHERITANCE =item CAVEATS =item SEE ALSO =item AUTHORS =item MODIFICATION HISTORY =back =head2 CORE - Namespace for Perl's core routines =over 4 =item SYNOPSIS =item DESCRIPTION =item OVERRIDING CORE FUNCTIONS =item AUTHOR =item SEE ALSO =back =head2 CPAN - query, download and build perl modules from CPAN sites =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item CPAN::shell([$prompt, $command]) Starting Interactive Mode Searching for authors, bundles, distribution files and modules, C<get>, C<make>, C<test>, C<install>, C<clean> modules or distributions, C<readme>, C<perldoc>, C<look> module or distribution, C<ls> author, C<ls> globbing_expression, C<failed>, Persistence between sessions, The C<force> and the C<fforce> pragma, Lockfile, Signals =item CPAN::Shell =item autobundle =item hosts install_tested, is_tested =item mkmyconfig =item r [Module|/Regexp/]... =item recent ***EXPERIMENTAL COMMAND*** =item recompile =item report Bundle|Distribution|Module =item smoke ***EXPERIMENTAL COMMAND*** =item upgrade [Module|/Regexp/]... 
=item The four C<CPAN::*> Classes: Author, Bundle, Module, Distribution =item Integrating local directories =item Redirection =item Plugin support ***EXPERIMENTAL*** =back =item CONFIGURATION completion support, displaying some help: o conf help, displaying current values: o conf [KEY], changing of scalar values: o conf KEY VALUE, changing of list values: o conf KEY SHIFT|UNSHIFT|PUSH|POP|SPLICE|LIST, reverting to saved: o conf defaults, saving the config: o conf commit =over 4 =item Config Variables C<o conf E<lt>scalar optionE<gt>>, C<o conf E<lt>scalar optionE<gt> E<lt>valueE<gt>>, C<o conf E<lt>list optionE<gt>>, C<o conf E<lt>list optionE<gt> [shift|pop]>, C<o conf E<lt>list optionE<gt> [unshift|push|splice] E<lt>listE<gt>>, interactive editing: o conf init [MATCH|LIST] =item CPAN::anycwd($path): Note on config variable getcwd cwd, getcwd, fastcwd, getdcwd, backtickcwd =item Note on the format of the urllist parameter =item The urllist parameter has CD-ROM support =item Maintaining the urllist parameter =item The C<requires> and C<build_requires> dependency declarations =item Configuration of the allow_installing_* parameters =item Configuration for individual distributions (I<Distroprefs>) =item Filenames =item Fallback Data::Dumper and Storable =item Blueprint =item Language Specs comment [scalar], cpanconfig [hash], depends [hash] *** EXPERIMENTAL FEATURE ***, disabled [boolean], features [array] *** EXPERIMENTAL FEATURE ***, goto [string], install [hash], make [hash], match [hash], patches [array], pl [hash], test [hash] =item Processing Instructions args [array], commandline, eexpect [hash], env [hash], expect [array] =item Schema verification with C<Kwalify> =item Example Distroprefs Files =back =item PROGRAMMER'S INTERFACE expand($type,@things), expandany(@things), Programming Examples =over 4 =item Methods in the other Classes CPAN::Author::as_glimpse(), CPAN::Author::as_string(), CPAN::Author::email(), CPAN::Author::fullname(), CPAN::Author::name(), 
CPAN::Bundle::as_glimpse(), CPAN::Bundle::as_string(), CPAN::Bundle::clean(), CPAN::Bundle::contains(), CPAN::Bundle::force($method,@args), CPAN::Bundle::get(), CPAN::Bundle::inst_file(), CPAN::Bundle::inst_version(), CPAN::Bundle::uptodate(), CPAN::Bundle::install(), CPAN::Bundle::make(), CPAN::Bundle::readme(), CPAN::Bundle::test(), CPAN::Distribution::as_glimpse(), CPAN::Distribution::as_string(), CPAN::Distribution::author, CPAN::Distribution::pretty_id(), CPAN::Distribution::base_id(), CPAN::Distribution::clean(), CPAN::Distribution::containsmods(), CPAN::Distribution::cvs_import(), CPAN::Distribution::dir(), CPAN::Distribution::force($method,@args), CPAN::Distribution::get(), CPAN::Distribution::install(), CPAN::Distribution::isa_perl(), CPAN::Distribution::look(), CPAN::Distribution::make(), CPAN::Distribution::perldoc(), CPAN::Distribution::prefs(), CPAN::Distribution::prereq_pm(), CPAN::Distribution::readme(), CPAN::Distribution::reports(), CPAN::Distribution::read_yaml(), CPAN::Distribution::test(), CPAN::Distribution::uptodate(), CPAN::Index::force_reload(), CPAN::Index::reload(), CPAN::InfoObj::dump(), CPAN::Module::as_glimpse(), CPAN::Module::as_string(), CPAN::Module::clean(), CPAN::Module::cpan_file(), CPAN::Module::cpan_version(), CPAN::Module::cvs_import(), CPAN::Module::description(), CPAN::Module::distribution(), CPAN::Module::dslip_status(), CPAN::Module::force($method,@args), CPAN::Module::get(), CPAN::Module::inst_file(), CPAN::Module::available_file(), CPAN::Module::inst_version(), CPAN::Module::available_version(), CPAN::Module::install(), CPAN::Module::look(), CPAN::Module::make(), CPAN::Module::manpage_headline(), CPAN::Module::perldoc(), CPAN::Module::readme(), CPAN::Module::reports(), CPAN::Module::test(), CPAN::Module::uptodate(), CPAN::Module::userid() =item Cache Manager =item Bundles =back =item PREREQUISITES =item UTILITIES =over 4 =item Finding packages and VERSION =item Debugging o debug package.., o debug -package.., o debug all, 
o debug number =item Floppy, Zip, Offline Mode =item Basic Utilities for Programmers has_inst($module), use_inst($module), has_usable($module), instance($module), frontend(), frontend($new_frontend) =back =item SECURITY =over 4 =item Cryptographically signed modules =back =item EXPORT =item ENVIRONMENT =item POPULATE AN INSTALLATION WITH LOTS OF MODULES =item WORKING WITH CPAN.pm BEHIND FIREWALLS =over 4 =item Three basic types of firewalls http firewall, ftp firewall, One-way visibility, SOCKS, IP Masquerade =item Configuring lynx or ncftp for going through a firewall =back =item FAQ 1), 2), 3), 4), 5), 6), 7), 8), 9), 10), 11), 12), 13), 14), 15), 16), 17), 18), 19) =item COMPATIBILITY =over 4 =item OLD PERL VERSIONS =item CPANPLUS =item CPANMINUS =back =item SECURITY ADVICE =item BUGS =item AUTHOR =item LICENSE =item TRANSLATIONS =item SEE ALSO =back =head2 CPAN::API::HOWTO - a recipe book for programming with CPAN.pm =over 4 =item RECIPES =over 4 =item What distribution contains a particular module? =item What modules does a particular distribution contain? 
=back =item SEE ALSO =item LICENSE =item AUTHOR =back =head2 CPAN::Debug - internal debugging for CPAN.pm =over 4 =item LICENSE =back =head2 CPAN::Distroprefs -- read and match distroprefs =over 4 =item SYNOPSIS =item DESCRIPTION =item INTERFACE a CPAN::Distroprefs::Result object, C<undef>, indicating that no prefs files remain to be found =item RESULTS =over 4 =item Common =item Errors =item Successes =back =item PREFS =item LICENSE =back =head2 CPAN::FirstTime - Utility for CPAN::Config file Initialization =over 4 =item SYNOPSIS =item DESCRIPTION =back allow_installing_module_downgrades, allow_installing_outdated_dists, auto_commit, build_cache, build_dir, build_dir_reuse, build_requires_install_policy, cache_metadata, check_sigs, cleanup_after_install, colorize_output, colorize_print, colorize_warn, colorize_debug, commandnumber_in_prompt, connect_to_internet_ok, ftp_passive, ftpstats_period, ftpstats_size, getcwd, halt_on_failure, histfile, histsize, inactivity_timeout, index_expire, inhibit_startup_message, keep_source_where, load_module_verbosity, makepl_arg, make_arg, make_install_arg, make_install_make_command, mbuildpl_arg, mbuild_arg, mbuild_install_arg, mbuild_install_build_command, pager, prefer_installer, prefs_dir, prerequisites_policy, randomize_urllist, recommends_policy, scan_cache, shell, show_unparsable_versions, show_upload_date, show_zero_versions, suggests_policy, tar_verbosity, term_is_latin, term_ornaments, test_report, perl5lib_verbosity, prefer_external_tar, trust_test_report_history, urllist_ping_external, urllist_ping_verbose, use_prompt_default, use_sqlite, version_timeout, yaml_load_code, yaml_module =over 4 =item LICENSE =back =head2 CPAN::HandleConfig - internal configuration handling for CPAN.pm =over 4 =item C<< CLASS->safe_quote ITEM >> =back =over 4 =item LICENSE =back =head2 CPAN::Kwalify - Interface between CPAN.pm and Kwalify.pm =over 4 =item SYNOPSIS =item DESCRIPTION _validate($schema_name, $data, $file, $doc), 
yaml($schema_name) =item AUTHOR =item LICENSE =back =head2 CPAN::Meta - the distribution metadata for a CPAN dist =over 4 =item VERSION =item SYNOPSIS =item DESCRIPTION =item METHODS =over 4 =item new =item create =item load_file =item load_yaml_string =item load_json_string =item load_string =item save =item meta_spec_version =item effective_prereqs =item should_index_file =item should_index_package =item features =item feature =item as_struct =item as_string =back =item STRING DATA =item LIST DATA =item MAP DATA =item CUSTOM DATA =item BUGS =item SEE ALSO =item SUPPORT =over 4 =item Bugs / Feature Requests =item Source Code =back =item AUTHORS =item CONTRIBUTORS =item COPYRIGHT AND LICENSE =back =head2 CPAN::Meta::Converter - Convert CPAN distribution metadata structures =over 4 =item VERSION =item SYNOPSIS =item DESCRIPTION =item METHODS =over 4 =item new =item convert =item upgrade_fragment =back =item BUGS =item AUTHORS =item COPYRIGHT AND LICENSE =back =head2 CPAN::Meta::Feature - an optional feature provided by a CPAN distribution =over 4 =item VERSION =item DESCRIPTION =item METHODS =over 4 =item new =item identifier =item description =item prereqs =back =item BUGS =item AUTHORS =item COPYRIGHT AND LICENSE =back =head2 CPAN::Meta::History - history of CPAN Meta Spec changes =over 4 =item VERSION =item DESCRIPTION =item HISTORY =over 4 =item Version 2 =item Version 1.4 =item Version 1.3 =item Version 1.2 =item Version 1.1 =item Version 1.0 =back =item AUTHORS =item COPYRIGHT AND LICENSE =back =head2 CPAN::Meta::History::Meta_1_0 - Version 1.0 metadata specification for META.yml =over 4 =item PREFACE =item DESCRIPTION =item Format =item Fields name, version, license, perl, gpl, lgpl, artistic, bsd, open_source, unrestricted, restrictive, distribution_type, requires, recommends, build_requires, conflicts, dynamic_config, generated_by =item Related Projects DOAP =item History =back =head2 CPAN::Meta::History::Meta_1_1 - Version 1.1 metadata specification for 
META.yml =over 4 =item PREFACE =item DESCRIPTION =item Format =item Fields name, version, license, perl, gpl, lgpl, artistic, bsd, open_source, unrestricted, restrictive, license_uri, distribution_type, private, requires, recommends, build_requires, conflicts, dynamic_config, generated_by =over 4 =item Ingy's suggestions short_description, description, maturity, author_id, owner_id, categorization, keyword, chapter_id, URL for further information, namespaces =back =item History =back =head2 CPAN::Meta::History::Meta_1_2 - Version 1.2 metadata specification for META.yml =over 4 =item PREFACE =item SYNOPSIS =item DESCRIPTION =item FORMAT =item TERMINOLOGY distribution, module =item VERSION SPECIFICATIONS =item HEADER =item FIELDS =over 4 =item meta-spec =item name =item version =item abstract =item author =item license perl, gpl, lgpl, artistic, bsd, open_source, unrestricted, restrictive =item distribution_type =item requires =item recommends =item build_requires =item conflicts =item dynamic_config =item private =item provides =item no_index =item keywords =item resources homepage, license, bugtracker =item generated_by =back =item SEE ALSO =item HISTORY March 14, 2003 (Pi day), May 8, 2003, November 13, 2003, November 16, 2003, December 9, 2003, December 15, 2003, July 26, 2005, August 23, 2005 =back =head2 CPAN::Meta::History::Meta_1_3 - Version 1.3 metadata specification for META.yml =over 4 =item PREFACE =item SYNOPSIS =item DESCRIPTION =item FORMAT =item TERMINOLOGY distribution, module =item HEADER =item FIELDS =over 4 =item meta-spec =item name =item version =item abstract =item author =item license apache, artistic, bsd, gpl, lgpl, mit, mozilla, open_source, perl, restrictive, unrestricted =item distribution_type =item requires =item recommends =item build_requires =item conflicts =item dynamic_config =item private =item provides =item no_index =item keywords =item resources homepage, license, bugtracker =item generated_by =back =item VERSION SPECIFICATIONS 
=item SEE ALSO =item HISTORY March 14, 2003 (Pi day), May 8, 2003, November 13, 2003, November 16, 2003, December 9, 2003, December 15, 2003, July 26, 2005, August 23, 2005 =back =head2 CPAN::Meta::History::Meta_1_4 - Version 1.4 metadata specification for META.yml =over 4 =item PREFACE =item SYNOPSIS =item DESCRIPTION =item FORMAT =item TERMINOLOGY distribution, module =item HEADER =item FIELDS =over 4 =item meta-spec =item name =item version =item abstract =item author =item license apache, artistic, bsd, gpl, lgpl, mit, mozilla, open_source, perl, restrictive, unrestricted =item distribution_type =item requires =item recommends =item build_requires =item configure_requires =item conflicts =item dynamic_config =item private =item provides =item no_index =item keywords =item resources homepage, license, bugtracker =item generated_by =back =item VERSION SPECIFICATIONS =item SEE ALSO =item HISTORY March 14, 2003 (Pi day), May 8, 2003, November 13, 2003, November 16, 2003, December 9, 2003, December 15, 2003, July 26, 2005, August 23, 2005, June 12, 2007 =back =head2 CPAN::Meta::Merge - Merging CPAN Meta fragments =over 4 =item VERSION =item SYNOPSIS =item DESCRIPTION =item METHODS =over 4 =item new =item merge(@fragments) =back =item MERGE STRATEGIES identical, set_addition, uniq_map, improvise =item AUTHORS =item COPYRIGHT AND LICENSE =back =head2 CPAN::Meta::Prereqs - a set of distribution prerequisites by phase and type =over 4 =item VERSION =item DESCRIPTION =item METHODS =over 4 =item new =item requirements_for =item phases =item types_in =item with_merged_prereqs =item merged_requirements =item as_string_hash =item is_finalized =item finalize =item clone =back =item BUGS =item AUTHORS =item COPYRIGHT AND LICENSE =back =head2 CPAN::Meta::Requirements - a set of version requirements for a CPAN dist =over 4 =item VERSION =item SYNOPSIS =item DESCRIPTION =item METHODS =over 4 =item new =item add_minimum =item add_maximum =item add_exclusion =item exact_version 
=item add_requirements =item accepts_module =item clear_requirement =item requirements_for_module =item structured_requirements_for_module =item required_modules =item clone =item is_simple =item is_finalized =item finalize =item as_string_hash =item add_string_requirement >= 1.3, <= 1.3, != 1.3, > 1.3, < 1.3, >= 1.3, != 1.5, <= 2.0 =item from_string_hash =back =item SUPPORT =over 4 =item Bugs / Feature Requests =item Source Code =back =item AUTHORS =item CONTRIBUTORS =item COPYRIGHT AND LICENSE =back =head2 CPAN::Meta::Spec - specification for CPAN distribution metadata =over 4 =item VERSION =item SYNOPSIS =item DESCRIPTION =item TERMINOLOGY distribution, module, package, consumer, producer, must, should, may, etc =item DATA TYPES =over 4 =item Boolean =item String =item List =item Map =item License String =item URL =item Version =item Version Range =back =item STRUCTURE =over 4 =item REQUIRED FIELDS version, url, stable, testing, unstable =item OPTIONAL FIELDS file, directory, package, namespace, description, prereqs, file, version, homepage, license, bugtracker, repository =item DEPRECATED FIELDS =back =item VERSION NUMBERS =over 4 =item Version Formats Decimal versions, Dotted-integer versions =item Version Ranges =back =item PREREQUISITES =over 4 =item Prereq Spec configure, build, test, runtime, develop, requires, recommends, suggests, conflicts =item Merging and Resolving Prerequisites =back =item SERIALIZATION =item NOTES FOR IMPLEMENTORS =over 4 =item Extracting Version Numbers from Perl Modules =item Comparing Version Numbers =item Prerequisites for dynamically configured distributions =item Indexing distributions a la PAUSE =back =item SEE ALSO =item HISTORY =item AUTHORS =item COPYRIGHT AND LICENSE =back =head2 CPAN::Meta::Validator - validate CPAN distribution metadata structures =over 4 =item VERSION =item SYNOPSIS =item DESCRIPTION =item METHODS =over 4 =item new =item is_valid =item errors =item Check Methods =item Validator Methods =back =item BUGS 
=item AUTHORS =item COPYRIGHT AND LICENSE =back =head2 CPAN::Meta::YAML - Read and write a subset of YAML for CPAN Meta files =over 4 =item VERSION =item SYNOPSIS =item DESCRIPTION =item SUPPORT =item SEE ALSO =item AUTHORS =item COPYRIGHT AND LICENSE =back =head2 CPAN::Mirrors - Get CPAN mirror information and select a fast one =over 4 =item SYNOPSIS =item DESCRIPTION =back new( LOCAL_FILE_NAME ), continents(), countries( [CONTINENTS] ), mirrors( [COUNTRIES] ), get_mirrors_by_countries( [COUNTRIES] ), get_mirrors_by_continents( [CONTINENTS] ), get_countries_by_continents( [CONTINENTS] ), default_mirror, best_mirrors, get_n_random_mirrors_by_continents( N, [CONTINENTS] ), get_mirrors_timings( MIRROR_LIST, SEEN, CALLBACK, %ARGS ), find_best_continents( HASH_REF ) =over 4 =item AUTHOR =item LICENSE =back =head2 CPAN::Nox - Wrapper around CPAN.pm without using any XS module =over 4 =item SYNOPSIS =item DESCRIPTION =item LICENSE =item SEE ALSO =back =head2 CPAN::Plugin - Base class for CPAN shell extensions =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item Alpha Status =item How Plugins work? 
=back =item METHODS =over 4 =item plugin_requires =item distribution_object =item distribution =item distribution_info =item build_dir =item is_xs =back =item AUTHOR =back =head2 CPAN::Plugin::Specfile - Proof of concept implementation of a trivial CPAN::Plugin =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item OPTIONS =back =item AUTHOR =back =head2 CPAN::Queue - internal queue support for CPAN.pm =over 4 =item LICENSE =back =head2 CPAN::Tarzip - internal handling of tar archives for CPAN.pm =over 4 =item LICENSE =back =head2 CPAN::Version - utility functions to compare CPAN versions =over 4 =item SYNOPSIS =item DESCRIPTION =item LICENSE =back =head2 Carp - alternative warn and die for modules =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item Forcing a Stack Trace =item Stack Trace formatting =back =item GLOBAL VARIABLES =over 4 =item $Carp::MaxEvalLen =item $Carp::MaxArgLen =item $Carp::MaxArgNums =item $Carp::Verbose =item $Carp::RefArgFormatter =item @CARP_NOT =item %Carp::Internal =item %Carp::CarpInternal =item $Carp::CarpLevel =back =item BUGS =item SEE ALSO =item CONTRIBUTING =item AUTHOR =item COPYRIGHT =item LICENSE =back =head2 Class::Struct - declare struct-like datatypes as Perl classes =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item The C<struct()> function =item Class Creation at Compile Time =item Element Types and Accessor Methods Scalar (C<'$'> or C<'*$'>), Array (C<'@'> or C<'*@'>), Hash (C<'%'> or C<'*%'>), Class (C<'Class_Name'> or C<'*Class_Name'>) =item Initializing with C<new> =back =item EXAMPLES Example 1, Example 2, Example 3 =item Author and Modification History =back =head2 Compress::Raw::Bzip2 - Low-Level Interface to bzip2 compression library =over 4 =item SYNOPSIS =item DESCRIPTION =item Compression =over 4 =item ($z, $status) = new Compress::Raw::Bzip2 $appendOutput, $blockSize100k, $workfactor; B<$appendOutput>, B<$blockSize100k>, B<$workfactor> =item $status = $bz->bzdeflate($input, $output); =item $status = 
$bz->bzflush($output); =item $status = $bz->bzclose($output); =item Example =back =item Uncompression =over 4 =item ($z, $status) = new Compress::Raw::Bunzip2 $appendOutput, $consumeInput, $small, $verbosity, $limitOutput; B<$appendOutput>, B<$consumeInput>, B<$small>, B<$limitOutput>, B<$verbosity> =item $status = $z->bzinflate($input, $output); =back =item Misc =over 4 =item my $version = Compress::Raw::Bzip2::bzlibversion(); =back =item Constants =item SUPPORT =item SEE ALSO =item AUTHOR =item MODIFICATION HISTORY =item COPYRIGHT AND LICENSE =back =head2 Compress::Raw::Zlib - Low-Level Interface to zlib compression library =over 4 =item SYNOPSIS =item DESCRIPTION =item Compress::Raw::Zlib::Deflate =over 4 =item B<($d, $status) = new Compress::Raw::Zlib::Deflate( [OPT] ) > B<-Level>, B<-Method>, B<-WindowBits>, B<-MemLevel>, B<-Strategy>, B<-Dictionary>, B<-Bufsize>, B<-AppendOutput>, B<-CRC32>, B<-ADLER32> =item B<$status = $d-E<gt>deflate($input, $output)> =item B<$status = $d-E<gt>flush($output [, $flush_type]) > =item B<$status = $d-E<gt>deflateReset() > =item B<$status = $d-E<gt>deflateParams([OPT])> B<-Level>, B<-Strategy>, B<-BufSize> =item B<$status = $d-E<gt>deflateTune($good_length, $max_lazy, $nice_length, $max_chain)> =item B<$d-E<gt>dict_adler()> =item B<$d-E<gt>crc32()> =item B<$d-E<gt>adler32()> =item B<$d-E<gt>msg()> =item B<$d-E<gt>total_in()> =item B<$d-E<gt>total_out()> =item B<$d-E<gt>get_Strategy()> =item B<$d-E<gt>get_Level()> =item B<$d-E<gt>get_BufSize()> =item Example =back =item Compress::Raw::Zlib::Inflate =over 4 =item B< ($i, $status) = new Compress::Raw::Zlib::Inflate( [OPT] ) > B<-WindowBits>, B<-Bufsize>, B<-Dictionary>, B<-AppendOutput>, B<-CRC32>, B<-ADLER32>, B<-ConsumeInput>, B<-LimitOutput> =item B< $status = $i-E<gt>inflate($input, $output [,$eof]) > =item B<$status = $i-E<gt>inflateSync($input)> =item B<$status = $i-E<gt>inflateReset() > =item B<$i-E<gt>dict_adler()> =item B<$i-E<gt>crc32()> =item B<$i-E<gt>adler32()> =item 
B<$i-E<gt>msg()> =item B<$i-E<gt>total_in()> =item B<$i-E<gt>total_out()> =item B<$d-E<gt>get_BufSize()> =item Examples =back =item CHECKSUM FUNCTIONS =item Misc =over 4 =item my $version = Compress::Raw::Zlib::zlib_version(); =item my $flags = Compress::Raw::Zlib::zlibCompileFlags(); =back =item The LimitOutput option. =item ACCESSING ZIP FILES =item FAQ =over 4 =item Compatibility with Unix compress/uncompress. =item Accessing .tar.Z files =item Zlib Library Version Support =back =item CONSTANTS =item SUPPORT =item SEE ALSO =item AUTHOR =item MODIFICATION HISTORY =item COPYRIGHT AND LICENSE =back =head2 Compress::Zlib - Interface to zlib compression library =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item Notes for users of Compress::Zlib version 1 =back =item GZIP INTERFACE B<$gz = gzopen($filename, $mode)>, B<$gz = gzopen($filehandle, $mode)>, B<$bytesread = $gz-E<gt>gzread($buffer [, $size]) ;>, B<$bytesread = $gz-E<gt>gzreadline($line) ;>, B<$byteswritten = $gz-E<gt>gzwrite($buffer) ;>, B<$status = $gz-E<gt>gzflush($flush_type) ;>, B<$offset = $gz-E<gt>gztell() ;>, B<$status = $gz-E<gt>gzseek($offset, $whence) ;>, B<$gz-E<gt>gzclose>, B<$gz-E<gt>gzsetparams($level, $strategy)>, B<$level>, B<$strategy>, B<$gz-E<gt>gzerror>, B<$gzerrno> =over 4 =item Examples =item Compress::Zlib::memGzip =item Compress::Zlib::memGunzip =back =item COMPRESS/UNCOMPRESS B<$dest = compress($source [, $level] ) ;>, B<$dest = uncompress($source) ;> =item Deflate Interface =over 4 =item B<($d, $status) = deflateInit( [OPT] )> B<-Level>, B<-Method>, B<-WindowBits>, B<-MemLevel>, B<-Strategy>, B<-Dictionary>, B<-Bufsize> =item B<($out, $status) = $d-E<gt>deflate($buffer)> =item B<($out, $status) = $d-E<gt>flush()> =item B<($out, $status) = $d-E<gt>flush($flush_type)> =item B<$status = $d-E<gt>deflateParams([OPT])> B<-Level>, B<-Strategy> =item B<$d-E<gt>dict_adler()> =item B<$d-E<gt>msg()> =item B<$d-E<gt>total_in()> =item B<$d-E<gt>total_out()> =item Example =back =item Inflate 
Interface =over 4 =item B<($i, $status) = inflateInit()> B<-WindowBits>, B<-Bufsize>, B<-Dictionary> =item B<($out, $status) = $i-E<gt>inflate($buffer)> =item B<$status = $i-E<gt>inflateSync($buffer)> =item B<$i-E<gt>dict_adler()> =item B<$i-E<gt>msg()> =item B<$i-E<gt>total_in()> =item B<$i-E<gt>total_out()> =item Example =back =item CHECKSUM FUNCTIONS =item Misc =over 4 =item my $version = Compress::Zlib::zlib_version(); =back =item CONSTANTS =item SUPPORT =item SEE ALSO =item AUTHOR =item MODIFICATION HISTORY =item COPYRIGHT AND LICENSE =back =head2 Config - access Perl configuration information =over 4 =item SYNOPSIS =item DESCRIPTION myconfig(), config_sh(), config_re($regex), config_vars(@names), bincompat_options(), non_bincompat_options(), compile_date(), local_patches(), header_files() =item EXAMPLE =item WARNING =item GLOSSARY =back =over 4 =item _ C<_a>, C<_exe>, C<_o> =item a C<afs>, C<afsroot>, C<alignbytes>, C<aphostname>, C<api_revision>, C<api_subversion>, C<api_version>, C<api_versionstring>, C<ar>, C<archlib>, C<archlibexp>, C<archname>, C<archname64>, C<archobjs>, C<asctime_r_proto>, C<awk> =item b C<baserev>, C<bash>, C<bin>, C<bin_ELF>, C<binexp>, C<bison>, C<byacc>, C<byteorder> =item c C<c>, C<castflags>, C<cat>, C<cc>, C<cccdlflags>, C<ccdlflags>, C<ccflags>, C<ccflags_uselargefiles>, C<ccname>, C<ccsymbols>, C<ccversion>, C<cf_by>, C<cf_email>, C<cf_time>, C<charbits>, C<charsize>, C<chgrp>, C<chmod>, C<chown>, C<clocktype>, C<comm>, C<compress>, C<config_arg0>, C<config_argc>, C<config_args>, C<contains>, C<cp>, C<cpio>, C<cpp>, C<cpp_stuff>, C<cppccsymbols>, C<cppflags>, C<cpplast>, C<cppminus>, C<cpprun>, C<cppstdin>, C<cppsymbols>, C<crypt_r_proto>, C<cryptlib>, C<csh>, C<ctermid_r_proto>, C<ctime_r_proto> =item d C<d__fwalk>, C<d_accept4>, C<d_access>, C<d_accessx>, C<d_acosh>, C<d_aintl>, C<d_alarm>, C<d_archlib>, C<d_asctime64>, C<d_asctime_r>, C<d_asinh>, C<d_atanh>, C<d_atolf>, C<d_atoll>, 
C<d_attribute_deprecated>, C<d_attribute_format>, C<d_attribute_malloc>, C<d_attribute_nonnull>, C<d_attribute_noreturn>, C<d_attribute_pure>, C<d_attribute_unused>, C<d_attribute_warn_unused_result>, C<d_backtrace>, C<d_bsd>, C<d_bsdgetpgrp>, C<d_bsdsetpgrp>, C<d_builtin_add_overflow>, C<d_builtin_choose_expr>, C<d_builtin_expect>, C<d_builtin_mul_overflow>, C<d_builtin_sub_overflow>, C<d_c99_variadic_macros>, C<d_casti32>, C<d_castneg>, C<d_cbrt>, C<d_chown>, C<d_chroot>, C<d_chsize>, C<d_class>, C<d_clearenv>, C<d_closedir>, C<d_cmsghdr_s>, C<d_copysign>, C<d_copysignl>, C<d_cplusplus>, C<d_crypt>, C<d_crypt_r>, C<d_csh>, C<d_ctermid>, C<d_ctermid_r>, C<d_ctime64>, C<d_ctime_r>, C<d_cuserid>, C<d_dbminitproto>, C<d_difftime>, C<d_difftime64>, C<d_dir_dd_fd>, C<d_dirfd>, C<d_dirnamlen>, C<d_dladdr>, C<d_dlerror>, C<d_dlopen>, C<d_dlsymun>, C<d_dosuid>, C<d_double_has_inf>, C<d_double_has_nan>, C<d_double_has_negative_zero>, C<d_double_has_subnormals>, C<d_double_style_cray>, C<d_double_style_ibm>, C<d_double_style_ieee>, C<d_double_style_vax>, C<d_drand48_r>, C<d_drand48proto>, C<d_dup2>, C<d_dup3>, C<d_duplocale>, C<d_eaccess>, C<d_endgrent>, C<d_endgrent_r>, C<d_endhent>, C<d_endhostent_r>, C<d_endnent>, C<d_endnetent_r>, C<d_endpent>, C<d_endprotoent_r>, C<d_endpwent>, C<d_endpwent_r>, C<d_endsent>, C<d_endservent_r>, C<d_eofnblk>, C<d_erf>, C<d_erfc>, C<d_eunice>, C<d_exp2>, C<d_expm1>, C<d_faststdio>, C<d_fchdir>, C<d_fchmod>, C<d_fchmodat>, C<d_fchown>, C<d_fcntl>, C<d_fcntl_can_lock>, C<d_fd_macros>, C<d_fd_set>, C<d_fdclose>, C<d_fdim>, C<d_fds_bits>, C<d_fegetround>, C<d_fgetpos>, C<d_finite>, C<d_finitel>, C<d_flexfnam>, C<d_flock>, C<d_flockproto>, C<d_fma>, C<d_fmax>, C<d_fmin>, C<d_fdopendir>, C<d_fork>, C<d_fp_class>, C<d_fp_classify>, C<d_fp_classl>, C<d_fpathconf>, C<d_fpclass>, C<d_fpclassify>, C<d_fpclassl>, C<d_fpgetround>, C<d_fpos64_t>, C<d_freelocale>, C<d_frexpl>, C<d_fs_data_s>, C<d_fseeko>, C<d_fsetpos>, C<d_fstatfs>, C<d_fstatvfs>, 
C<d_fsync>, C<d_ftello>, C<d_ftime>, C<d_futimes>, C<d_gai_strerror>, C<d_Gconvert>, C<d_gdbm_ndbm_h_uses_prototypes>, C<d_gdbmndbm_h_uses_prototypes>, C<d_getaddrinfo>, C<d_getcwd>, C<d_getespwnam>, C<d_getfsstat>, C<d_getgrent>, C<d_getgrent_r>, C<d_getgrgid_r>, C<d_getgrnam_r>, C<d_getgrps>, C<d_gethbyaddr>, C<d_gethbyname>, C<d_gethent>, C<d_gethname>, C<d_gethostbyaddr_r>, C<d_gethostbyname_r>, C<d_gethostent_r>, C<d_gethostprotos>, C<d_getitimer>, C<d_getlogin>, C<d_getlogin_r>, C<d_getmnt>, C<d_getmntent>, C<d_getnameinfo>, C<d_getnbyaddr>, C<d_getnbyname>, C<d_getnent>, C<d_getnetbyaddr_r>, C<d_getnetbyname_r>, C<d_getnetent_r>, C<d_getnetprotos>, C<d_getpagsz>, C<d_getpbyname>, C<d_getpbynumber>, C<d_getpent>, C<d_getpgid>, C<d_getpgrp>, C<d_getpgrp2>, C<d_getppid>, C<d_getprior>, C<d_getprotobyname_r>, C<d_getprotobynumber_r>, C<d_getprotoent_r>, C<d_getprotoprotos>, C<d_getprpwnam>, C<d_getpwent>, C<d_getpwent_r>, C<d_getpwnam_r>, C<d_getpwuid_r>, C<d_getsbyname>, C<d_getsbyport>, C<d_getsent>, C<d_getservbyname_r>, C<d_getservbyport_r>, C<d_getservent_r>, C<d_getservprotos>, C<d_getspnam>, C<d_getspnam_r>, C<d_gettimeod>, C<d_gmtime64>, C<d_gmtime_r>, C<d_gnulibc>, C<d_grpasswd>, C<d_has_C_UTF8>, C<d_hasmntopt>, C<d_htonl>, C<d_hypot>, C<d_ilogb>, C<d_ilogbl>, C<d_inc_version_list>, C<d_inetaton>, C<d_inetntop>, C<d_inetpton>, C<d_int64_t>, C<d_ip_mreq>, C<d_ip_mreq_source>, C<d_ipv6_mreq>, C<d_ipv6_mreq_source>, C<d_isascii>, C<d_isblank>, C<d_isfinite>, C<d_isfinitel>, C<d_isinf>, C<d_isinfl>, C<d_isless>, C<d_isnan>, C<d_isnanl>, C<d_isnormal>, C<d_j0>, C<d_j0l>, C<d_killpg>, C<d_lc_monetary_2008>, C<d_lchown>, C<d_ldbl_dig>, C<d_ldexpl>, C<d_lgamma>, C<d_lgamma_r>, C<d_libm_lib_version>, C<d_libname_unique>, C<d_link>, C<d_linkat>, C<d_llrint>, C<d_llrintl>, C<d_llround>, C<d_llroundl>, C<d_localeconv_l>, C<d_localtime64>, C<d_localtime_r>, C<d_localtime_r_needs_tzset>, C<d_locconv>, C<d_lockf>, C<d_log1p>, C<d_log2>, C<d_logb>, 
C<d_long_double_style_ieee>, C<d_long_double_style_ieee_doubledouble>, C<d_long_double_style_ieee_extended>, C<d_long_double_style_ieee_std>, C<d_long_double_style_vax>, C<d_longdbl>, C<d_longlong>, C<d_lrint>, C<d_lrintl>, C<d_lround>, C<d_lroundl>, C<d_lseekproto>, C<d_lstat>, C<d_madvise>, C<d_malloc_good_size>, C<d_malloc_size>, C<d_mblen>, C<d_mbrlen>, C<d_mbrtowc>, C<d_mbstowcs>, C<d_mbtowc>, C<d_memmem>, C<d_memrchr>, C<d_mkdir>, C<d_mkdtemp>, C<d_mkfifo>, C<d_mkostemp>, C<d_mkstemp>, C<d_mkstemps>, C<d_mktime>, C<d_mktime64>, C<d_mmap>, C<d_modfl>, C<d_modflproto>, C<d_mprotect>, C<d_msg>, C<d_msg_ctrunc>, C<d_msg_dontroute>, C<d_msg_oob>, C<d_msg_peek>, C<d_msg_proxy>, C<d_msgctl>, C<d_msgget>, C<d_msghdr_s>, C<d_msgrcv>, C<d_msgsnd>, C<d_msync>, C<d_munmap>, C<d_mymalloc>, C<d_nan>, C<d_nanosleep>, C<d_ndbm>, C<d_ndbm_h_uses_prototypes>, C<d_nearbyint>, C<d_newlocale>, C<d_nextafter>, C<d_nexttoward>, C<d_nice>, C<d_nl_langinfo>, C<d_nv_preserves_uv>, C<d_nv_zero_is_allbits_zero>, C<d_off64_t>, C<d_old_pthread_create_joinable>, C<d_oldpthreads>, C<d_oldsock>, C<d_open3>, C<d_openat>, C<d_pathconf>, C<d_pause>, C<d_perl_otherlibdirs>, C<d_phostname>, C<d_pipe>, C<d_pipe2>, C<d_poll>, C<d_portable>, C<d_prctl>, C<d_prctl_set_name>, C<d_PRId64>, C<d_PRIeldbl>, C<d_PRIEUldbl>, C<d_PRIfldbl>, C<d_PRIFUldbl>, C<d_PRIgldbl>, C<d_PRIGUldbl>, C<d_PRIi64>, C<d_printf_format_null>, C<d_PRIo64>, C<d_PRIu64>, C<d_PRIx64>, C<d_PRIXU64>, C<d_procselfexe>, C<d_pseudofork>, C<d_pthread_atfork>, C<d_pthread_attr_setscope>, C<d_pthread_yield>, C<d_ptrdiff_t>, C<d_pwage>, C<d_pwchange>, C<d_pwclass>, C<d_pwcomment>, C<d_pwexpire>, C<d_pwgecos>, C<d_pwpasswd>, C<d_pwquota>, C<d_qgcvt>, C<d_quad>, C<d_querylocale>, C<d_random_r>, C<d_re_comp>, C<d_readdir>, C<d_readdir64_r>, C<d_readdir_r>, C<d_readlink>, C<d_readv>, C<d_recvmsg>, C<d_regcmp>, C<d_regcomp>, C<d_remainder>, C<d_remquo>, C<d_rename>, C<d_renameat>, C<d_rewinddir>, C<d_rint>, C<d_rmdir>, C<d_round>, 
C<d_sbrkproto>, C<d_scalbn>, C<d_scalbnl>, C<d_sched_yield>, C<d_scm_rights>, C<d_SCNfldbl>, C<d_seekdir>, C<d_select>, C<d_sem>, C<d_semctl>, C<d_semctl_semid_ds>, C<d_semctl_semun>, C<d_semget>, C<d_semop>, C<d_sendmsg>, C<d_setegid>, C<d_seteuid>, C<d_setgrent>, C<d_setgrent_r>, C<d_setgrps>, C<d_sethent>, C<d_sethostent_r>, C<d_setitimer>, C<d_setlinebuf>, C<d_setlocale>, C<d_setlocale_accepts_any_locale_name>, C<d_setlocale_r>, C<d_setnent>, C<d_setnetent_r>, C<d_setpent>, C<d_setpgid>, C<d_setpgrp>, C<d_setpgrp2>, C<d_setprior>, C<d_setproctitle>, C<d_setprotoent_r>, C<d_setpwent>, C<d_setpwent_r>, C<d_setregid>, C<d_setresgid>, C<d_setresuid>, C<d_setreuid>, C<d_setrgid>, C<d_setruid>, C<d_setsent>, C<d_setservent_r>, C<d_setsid>, C<d_setvbuf>, C<d_shm>, C<d_shmat>, C<d_shmatprototype>, C<d_shmctl>, C<d_shmdt>, C<d_shmget>, C<d_sigaction>, C<d_siginfo_si_addr>, C<d_siginfo_si_band>, C<d_siginfo_si_errno>, C<d_siginfo_si_fd>, C<d_siginfo_si_pid>, C<d_siginfo_si_status>, C<d_siginfo_si_uid>, C<d_siginfo_si_value>, C<d_signbit>, C<d_sigprocmask>, C<d_sigsetjmp>, C<d_sin6_scope_id>, C<d_sitearch>, C<d_snprintf>, C<d_sockaddr_in6>, C<d_sockaddr_sa_len>, C<d_sockatmark>, C<d_sockatmarkproto>, C<d_socket>, C<d_socklen_t>, C<d_sockpair>, C<d_socks5_init>, C<d_sqrtl>, C<d_srand48_r>, C<d_srandom_r>, C<d_sresgproto>, C<d_sresuproto>, C<d_stat>, C<d_statblks>, C<d_statfs_f_flags>, C<d_statfs_s>, C<d_static_inline>, C<d_statvfs>, C<d_stdio_cnt_lval>, C<d_stdio_ptr_lval>, C<d_stdio_ptr_lval_nochange_cnt>, C<d_stdio_ptr_lval_sets_cnt>, C<d_stdio_stream_array>, C<d_stdiobase>, C<d_stdstdio>, C<d_strcoll>, C<d_strerror_l>, C<d_strerror_r>, C<d_strftime>, C<d_strlcat>, C<d_strlcpy>, C<d_strnlen>, C<d_strtod>, C<d_strtod_l>, C<d_strtol>, C<d_strtold>, C<d_strtold_l>, C<d_strtoll>, C<d_strtoq>, C<d_strtoul>, C<d_strtoull>, C<d_strtouq>, C<d_strxfrm>, C<d_suidsafe>, C<d_symlink>, C<d_syscall>, C<d_syscallproto>, C<d_sysconf>, C<d_sysernlst>, C<d_syserrlst>, C<d_system>, 
C<d_tcgetpgrp>, C<d_tcsetpgrp>, C<d_telldir>, C<d_telldirproto>, C<d_tgamma>, C<d_thread_safe_nl_langinfo_l>, C<d_time>, C<d_timegm>, C<d_times>, C<d_tm_tm_gmtoff>, C<d_tm_tm_zone>, C<d_tmpnam_r>, C<d_towlower>, C<d_towupper>, C<d_trunc>, C<d_truncate>, C<d_truncl>, C<d_ttyname_r>, C<d_tzname>, C<d_u32align>, C<d_ualarm>, C<d_umask>, C<d_uname>, C<d_union_semun>, C<d_unlinkat>, C<d_unordered>, C<d_unsetenv>, C<d_uselocale>, C<d_usleep>, C<d_usleepproto>, C<d_ustat>, C<d_vendorarch>, C<d_vendorbin>, C<d_vendorlib>, C<d_vendorscript>, C<d_vfork>, C<d_void_closedir>, C<d_voidsig>, C<d_voidtty>, C<d_vsnprintf>, C<d_wait4>, C<d_waitpid>, C<d_wcscmp>, C<d_wcstombs>, C<d_wcsxfrm>, C<d_wctomb>, C<d_writev>, C<d_xenix>, C<date>, C<db_hashtype>, C<db_prefixtype>, C<db_version_major>, C<db_version_minor>, C<db_version_patch>, C<default_inc_excludes_dot>, C<direntrytype>, C<dlext>, C<dlsrc>, C<doubleinfbytes>, C<doublekind>, C<doublemantbits>, C<doublenanbytes>, C<doublesize>, C<drand01>, C<drand48_r_proto>, C<dtrace>, C<dtraceobject>, C<dtracexnolibs>, C<dynamic_ext> =item e C<eagain>, C<ebcdic>, C<echo>, C<egrep>, C<emacs>, C<endgrent_r_proto>, C<endhostent_r_proto>, C<endnetent_r_proto>, C<endprotoent_r_proto>, C<endpwent_r_proto>, C<endservent_r_proto>, C<eunicefix>, C<exe_ext>, C<expr>, C<extensions>, C<extern_C>, C<extras> =item f C<fflushall>, C<fflushNULL>, C<find>, C<firstmakefile>, C<flex>, C<fpossize>, C<fpostype>, C<freetype>, C<from>, C<full_ar>, C<full_csh>, C<full_sed> =item g C<gccansipedantic>, C<gccosandvers>, C<gccversion>, C<getgrent_r_proto>, C<getgrgid_r_proto>, C<getgrnam_r_proto>, C<gethostbyaddr_r_proto>, C<gethostbyname_r_proto>, C<gethostent_r_proto>, C<getlogin_r_proto>, C<getnetbyaddr_r_proto>, C<getnetbyname_r_proto>, C<getnetent_r_proto>, C<getprotobyname_r_proto>, C<getprotobynumber_r_proto>, C<getprotoent_r_proto>, C<getpwent_r_proto>, C<getpwnam_r_proto>, C<getpwuid_r_proto>, C<getservbyname_r_proto>, C<getservbyport_r_proto>, 
C<getservent_r_proto>, C<getspnam_r_proto>, C<gidformat>, C<gidsign>, C<gidsize>, C<gidtype>, C<glibpth>, C<gmake>, C<gmtime_r_proto>, C<gnulibc_version>, C<grep>, C<groupcat>, C<groupstype>, C<gzip> =item h C<h_fcntl>, C<h_sysfile>, C<hint>, C<hostcat>, C<hostgenerate>, C<hostosname>, C<hostperl>, C<html1dir>, C<html1direxp>, C<html3dir>, C<html3direxp> =item i C<i16size>, C<i16type>, C<i32size>, C<i32type>, C<i64size>, C<i64type>, C<i8size>, C<i8type>, C<i_arpainet>, C<i_bfd>, C<i_bsdioctl>, C<i_crypt>, C<i_db>, C<i_dbm>, C<i_dirent>, C<i_dlfcn>, C<i_execinfo>, C<i_fcntl>, C<i_fenv>, C<i_fp>, C<i_fp_class>, C<i_gdbm>, C<i_gdbm_ndbm>, C<i_gdbmndbm>, C<i_grp>, C<i_ieeefp>, C<i_inttypes>, C<i_langinfo>, C<i_libutil>, C<i_locale>, C<i_machcthr>, C<i_malloc>, C<i_mallocmalloc>, C<i_mntent>, C<i_ndbm>, C<i_netdb>, C<i_neterrno>, C<i_netinettcp>, C<i_niin>, C<i_poll>, C<i_prot>, C<i_pthread>, C<i_pwd>, C<i_quadmath>, C<i_rpcsvcdbm>, C<i_sgtty>, C<i_shadow>, C<i_socks>, C<i_stdbool>, C<i_stdint>, C<i_stdlib>, C<i_sunmath>, C<i_sysaccess>, C<i_sysdir>, C<i_sysfile>, C<i_sysfilio>, C<i_sysin>, C<i_sysioctl>, C<i_syslog>, C<i_sysmman>, C<i_sysmode>, C<i_sysmount>, C<i_sysndir>, C<i_sysparam>, C<i_syspoll>, C<i_sysresrc>, C<i_syssecrt>, C<i_sysselct>, C<i_syssockio>, C<i_sysstat>, C<i_sysstatfs>, C<i_sysstatvfs>, C<i_systime>, C<i_systimek>, C<i_systimes>, C<i_systypes>, C<i_sysuio>, C<i_sysun>, C<i_sysutsname>, C<i_sysvfs>, C<i_syswait>, C<i_termio>, C<i_termios>, C<i_time>, C<i_unistd>, C<i_ustat>, C<i_utime>, C<i_vfork>, C<i_wchar>, C<i_wctype>, C<i_xlocale>, C<ignore_versioned_solibs>, C<inc_version_list>, C<inc_version_list_init>, C<incpath>, C<incpth>, C<inews>, C<initialinstalllocation>, C<installarchlib>, C<installbin>, C<installhtml1dir>, C<installhtml3dir>, C<installman1dir>, C<installman3dir>, C<installprefix>, C<installprefixexp>, C<installprivlib>, C<installscript>, C<installsitearch>, C<installsitebin>, C<installsitehtml1dir>, C<installsitehtml3dir>, 
C<installsitelib>, C<installsiteman1dir>, C<installsiteman3dir>, C<installsitescript>, C<installstyle>, C<installusrbinperl>, C<installvendorarch>, C<installvendorbin>, C<installvendorhtml1dir>, C<installvendorhtml3dir>, C<installvendorlib>, C<installvendorman1dir>, C<installvendorman3dir>, C<installvendorscript>, C<intsize>, C<issymlink>, C<ivdformat>, C<ivsize>, C<ivtype> =item k C<known_extensions>, C<ksh> =item l C<ld>, C<ld_can_script>, C<lddlflags>, C<ldflags>, C<ldflags_uselargefiles>, C<ldlibpthname>, C<less>, C<lib_ext>, C<libc>, C<libperl>, C<libpth>, C<libs>, C<libsdirs>, C<libsfiles>, C<libsfound>, C<libspath>, C<libswanted>, C<libswanted_uselargefiles>, C<line>, C<lint>, C<lkflags>, C<ln>, C<lns>, C<localtime_r_proto>, C<locincpth>, C<loclibpth>, C<longdblinfbytes>, C<longdblkind>, C<longdblmantbits>, C<longdblnanbytes>, C<longdblsize>, C<longlongsize>, C<longsize>, C<lp>, C<lpr>, C<ls>, C<lseeksize>, C<lseektype> =item m C<mail>, C<mailx>, C<make>, C<make_set_make>, C<mallocobj>, C<mallocsrc>, C<malloctype>, C<man1dir>, C<man1direxp>, C<man1ext>, C<man3dir>, C<man3direxp>, C<man3ext>, C<mips_type>, C<mistrustnm>, C<mkdir>, C<mmaptype>, C<modetype>, C<more>, C<multiarch>, C<mv>, C<myarchname>, C<mydomain>, C<myhostname>, C<myuname> =item n C<n>, C<need_va_copy>, C<netdb_hlen_type>, C<netdb_host_type>, C<netdb_name_type>, C<netdb_net_type>, C<nm>, C<nm_opt>, C<nm_so_opt>, C<nonxs_ext>, C<nroff>, C<nv_overflows_integers_at>, C<nv_preserves_uv_bits>, C<nveformat>, C<nvEUformat>, C<nvfformat>, C<nvFUformat>, C<nvgformat>, C<nvGUformat>, C<nvmantbits>, C<nvsize>, C<nvtype> =item o C<o_nonblock>, C<obj_ext>, C<old_pthread_create_joinable>, C<optimize>, C<orderlib>, C<osname>, C<osvers>, C<otherlibdirs> =item p C<package>, C<pager>, C<passcat>, C<patchlevel>, C<path_sep>, C<perl>, C<perl5> =item P C<PERL_API_REVISION>, C<PERL_API_SUBVERSION>, C<PERL_API_VERSION>, C<PERL_CONFIG_SH>, C<PERL_PATCHLEVEL>, C<perl_patchlevel>, C<PERL_REVISION>, 
C<perl_static_inline>, C<PERL_SUBVERSION>, C<PERL_VERSION>, C<perladmin>, C<perllibs>, C<perlpath>, C<pg>, C<phostname>, C<pidtype>, C<plibpth>, C<pmake>, C<pr>, C<prefix>, C<prefixexp>, C<privlib>, C<privlibexp>, C<procselfexe>, C<ptrsize> =item q C<quadkind>, C<quadtype> =item r C<randbits>, C<randfunc>, C<random_r_proto>, C<randseedtype>, C<ranlib>, C<rd_nodata>, C<readdir64_r_proto>, C<readdir_r_proto>, C<revision>, C<rm>, C<rm_try>, C<rmail>, C<run>, C<runnm> =item s C<sched_yield>, C<scriptdir>, C<scriptdirexp>, C<sed>, C<seedfunc>, C<selectminbits>, C<selecttype>, C<sendmail>, C<setgrent_r_proto>, C<sethostent_r_proto>, C<setlocale_r_proto>, C<setnetent_r_proto>, C<setprotoent_r_proto>, C<setpwent_r_proto>, C<setservent_r_proto>, C<sGMTIME_max>, C<sGMTIME_min>, C<sh>, C<shar>, C<sharpbang>, C<shmattype>, C<shortsize>, C<shrpenv>, C<shsharp>, C<sig_count>, C<sig_name>, C<sig_name_init>, C<sig_num>, C<sig_num_init>, C<sig_size>, C<signal_t>, C<sitearch>, C<sitearchexp>, C<sitebin>, C<sitebinexp>, C<sitehtml1dir>, C<sitehtml1direxp>, C<sitehtml3dir>, C<sitehtml3direxp>, C<sitelib>, C<sitelib_stem>, C<sitelibexp>, C<siteman1dir>, C<siteman1direxp>, C<siteman3dir>, C<siteman3direxp>, C<siteprefix>, C<siteprefixexp>, C<sitescript>, C<sitescriptexp>, C<sizesize>, C<sizetype>, C<sleep>, C<sLOCALTIME_max>, C<sLOCALTIME_min>, C<smail>, C<so>, C<sockethdr>, C<socketlib>, C<socksizetype>, C<sort>, C<spackage>, C<spitshell>, C<sPRId64>, C<sPRIeldbl>, C<sPRIEUldbl>, C<sPRIfldbl>, C<sPRIFUldbl>, C<sPRIgldbl>, C<sPRIGUldbl>, C<sPRIi64>, C<sPRIo64>, C<sPRIu64>, C<sPRIx64>, C<sPRIXU64>, C<srand48_r_proto>, C<srandom_r_proto>, C<src>, C<sSCNfldbl>, C<ssizetype>, C<st_ino_sign>, C<st_ino_size>, C<startperl>, C<startsh>, C<static_ext>, C<stdchar>, C<stdio_base>, C<stdio_bufsiz>, C<stdio_cnt>, C<stdio_filbuf>, C<stdio_ptr>, C<stdio_stream_array>, C<strerror_r_proto>, C<submit>, C<subversion>, C<sysman>, C<sysroot> =item t C<tail>, C<tar>, C<targetarch>, C<targetdir>, 
C<targetenv>, C<targethost>, C<targetmkdir>, C<targetport>, C<targetsh>, C<tbl>, C<tee>, C<test>, C<timeincl>, C<timetype>, C<tmpnam_r_proto>, C<to>, C<touch>, C<tr>, C<trnl>, C<troff>, C<ttyname_r_proto> =item u C<u16size>, C<u16type>, C<u32size>, C<u32type>, C<u64size>, C<u64type>, C<u8size>, C<u8type>, C<uidformat>, C<uidsign>, C<uidsize>, C<uidtype>, C<uname>, C<uniq>, C<uquadtype>, C<use5005threads>, C<use64bitall>, C<use64bitint>, C<usecbacktrace>, C<usecrosscompile>, C<usedevel>, C<usedl>, C<usedtrace>, C<usefaststdio>, C<useithreads>, C<usekernprocpathname>, C<uselanginfo>, C<uselargefiles>, C<uselongdouble>, C<usemallocwrap>, C<usemorebits>, C<usemultiplicity>, C<usemymalloc>, C<usenm>, C<usensgetexecutablepath>, C<useopcode>, C<useperlio>, C<useposix>, C<usequadmath>, C<usereentrant>, C<userelocatableinc>, C<useshrplib>, C<usesitecustomize>, C<usesocks>, C<usethreads>, C<usevendorprefix>, C<useversionedarchname>, C<usevfork>, C<usrinc>, C<uuname>, C<uvoformat>, C<uvsize>, C<uvtype>, C<uvuformat>, C<uvxformat>, C<uvXUformat> =item v C<vendorarch>, C<vendorarchexp>, C<vendorbin>, C<vendorbinexp>, C<vendorhtml1dir>, C<vendorhtml1direxp>, C<vendorhtml3dir>, C<vendorhtml3direxp>, C<vendorlib>, C<vendorlib_stem>, C<vendorlibexp>, C<vendorman1dir>, C<vendorman1direxp>, C<vendorman3dir>, C<vendorman3direxp>, C<vendorprefix>, C<vendorprefixexp>, C<vendorscript>, C<vendorscriptexp>, C<version>, C<version_patchlevel_string>, C<versiononly>, C<vi> =item x C<xlibpth> =item y C<yacc>, C<yaccflags> =item z C<zcat>, C<zip> =back =over 4 =item GIT DATA =item NOTE =back =head2 Config::Extensions - hash lookup of which core extensions were built. 
=over 4 =item SYNOPSIS =item DESCRIPTION dynamic, nonxs, static =item AUTHOR =back =head2 Config::Perl::V - Structured data retrieval of perl -V output =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item $conf = myconfig () =item $conf = plv2hash ($text [, ...]) =item $info = summary ([$conf]) =item $md5 = signature ([$conf]) =item The hash structure build, osname, stamp, options, derived, patches, environment, config, inc =back =item REASONING =item BUGS =item TODO =item AUTHOR =item COPYRIGHT AND LICENSE =back =head2 Cwd - get pathname of current working directory =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item getcwd and friends getcwd, cwd, fastcwd, fastgetcwd, getdcwd =item abs_path and friends abs_path, realpath, fast_abs_path =item $ENV{PWD} =back =item NOTES =item AUTHOR =item COPYRIGHT =item SEE ALSO =back =head2 DB - programmatic interface to the Perl debugging API =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item Global Variables $DB::sub, %DB::sub, $DB::single, $DB::signal, $DB::trace, @DB::args, @DB::dbline, %DB::dbline, $DB::package, $DB::filename, $DB::subname, $DB::lineno =item API Methods CLIENT->register(), CLIENT->evalcode(STRING), CLIENT->skippkg('D::hide'), CLIENT->run(), CLIENT->step(), CLIENT->next(), CLIENT->done() =item Client Callback Methods CLIENT->init(), CLIENT->prestop([STRING]), CLIENT->stop(), CLIENT->idle(), CLIENT->poststop([STRING]), CLIENT->evalcode(STRING), CLIENT->cleanup(), CLIENT->output(LIST) =back =item BUGS =item AUTHOR =back =head2 DBM_Filter -- Filter DBM keys/values =over 4 =item SYNOPSIS =item DESCRIPTION =item What is a DBM Filter? =over 4 =item So what's new? 
=back =item METHODS =over 4 =item $db->Filter_Push() / $db->Filter_Key_Push() / $db->Filter_Value_Push() Filter_Push, Filter_Key_Push, Filter_Value_Push =item $db->Filter_Pop() =item $db->Filtered() =back =item Writing a Filter =over 4 =item Immediate Filters =item Canned Filters "name", params =back =item Filters Included utf8, encode, compress, int32, null =item NOTES =over 4 =item Maintain Round Trip Integrity =item Don't mix filtered & non-filtered data in the same database file. =back =item EXAMPLE =item SEE ALSO =item AUTHOR =back =head2 DBM_Filter::compress - filter for DBM_Filter =over 4 =item SYNOPSIS =item DESCRIPTION =item SEE ALSO =item AUTHOR =back =head2 DBM_Filter::encode - filter for DBM_Filter =over 4 =item SYNOPSIS =item DESCRIPTION =item SEE ALSO =item AUTHOR =back =head2 DBM_Filter::int32 - filter for DBM_Filter =over 4 =item SYNOPSIS =item DESCRIPTION =item SEE ALSO =item AUTHOR =back =head2 DBM_Filter::null - filter for DBM_Filter =over 4 =item SYNOPSIS =item DESCRIPTION =item SEE ALSO =item AUTHOR =back =head2 DBM_Filter::utf8 - filter for DBM_Filter =over 4 =item SYNOPSIS =item DESCRIPTION =item SEE ALSO =item AUTHOR =back =head2 DB_File - Perl5 access to Berkeley DB version 1.x =over 4 =item SYNOPSIS =item DESCRIPTION B<DB_HASH>, B<DB_BTREE>, B<DB_RECNO> =over 4 =item Using DB_File with Berkeley DB version 2 or greater =item Interface to Berkeley DB =item Opening a Berkeley DB Database File =item Default Parameters =item In Memory Databases =back =item DB_HASH =over 4 =item A Simple Example =back =item DB_BTREE =over 4 =item Changing the BTREE sort order =item Handling Duplicate Keys =item The get_dup() Method =item The find_dup() Method =item The del_dup() Method =item Matching Partial Keys =back =item DB_RECNO =over 4 =item The 'bval' Option =item A Simple Example =item Extra RECNO Methods B<$X-E<gt>push(list) ;>, B<$value = $X-E<gt>pop ;>, B<$X-E<gt>shift>, B<$X-E<gt>unshift(list) ;>, B<$X-E<gt>length>, B<$X-E<gt>splice(offset, length, 
elements);> =item Another Example =back =item THE API INTERFACE B<$status = $X-E<gt>get($key, $value [, $flags]) ;>, B<$status = $X-E<gt>put($key, $value [, $flags]) ;>, B<$status = $X-E<gt>del($key [, $flags]) ;>, B<$status = $X-E<gt>fd ;>, B<$status = $X-E<gt>seq($key, $value, $flags) ;>, B<$status = $X-E<gt>sync([$flags]) ;> =item DBM FILTERS =over 4 =item DBM Filter Low-level API B<filter_store_key>, B<filter_store_value>, B<filter_fetch_key>, B<filter_fetch_value> =item The Filter =item An Example -- the NULL termination problem. =item Another Example -- Key is a C int. =back =item HINTS AND TIPS =over 4 =item Locking: The Trouble with fd =item Safe ways to lock a database B<Tie::DB_Lock>, B<Tie::DB_LockFile>, B<DB_File::Lock> =item Sharing Databases With C Applications =item The untie() Gotcha =back =item COMMON QUESTIONS =over 4 =item Why is there Perl source in my database? =item How do I store complex data structures with DB_File? =item What does "wide character in subroutine entry" mean? =item What does "Invalid Argument" mean? =item What does "Bareword 'DB_File' not allowed" mean? =back =item REFERENCES =item HISTORY =item BUGS =item SUPPORT =item AVAILABILITY =item COPYRIGHT =item SEE ALSO =item AUTHOR =back =head2 Data::Dumper - stringified perl data structures, suitable for both printing and C<eval> =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item Methods I<PACKAGE>->new(I<ARRAYREF [>, I<ARRAYREF]>), I<$OBJ>->Dump I<or> I<PACKAGE>->Dump(I<ARRAYREF [>, I<ARRAYREF]>), I<$OBJ>->Seen(I<[HASHREF]>), I<$OBJ>->Values(I<[ARRAYREF]>), I<$OBJ>->Names(I<[ARRAYREF]>), I<$OBJ>->Reset =item Functions Dumper(I<LIST>) =item Configuration Variables or Methods =item Exports Dumper =back =item EXAMPLES =item BUGS =over 4 =item NOTE =back =item AUTHOR =item VERSION =item SEE ALSO =back =head2 Devel::PPPort - Perl/Pollution/Portability =over 4 =item SYNOPSIS =item Start using Devel::PPPort for XS projects =item DESCRIPTION =over 4 =item Why use ppport.h? 
=item How to use ppport.h =item Running ppport.h =back =item FUNCTIONS =over 4 =item WriteFile =item GetFileContents =back =item COMPATIBILITY =over 4 =item Provided Perl compatibility API =item Supported Perl API, sorted by version perl 5.31.5, perl 5.31.4, perl 5.31.3, perl 5.29.10, perl 5.29.9, perl 5.27.9, perl 5.27.8, perl 5.27.7, perl 5.27.6, perl 5.27.4, perl 5.27.3, perl 5.27.2, perl 5.27.1, perl 5.25.10, perl 5.25.9, perl 5.25.8, perl 5.25.7, perl 5.25.6, perl 5.25.5, perl 5.25.3, perl 5.25.1, perl 5.23.8, perl 5.23.2, perl 5.23.0, perl 5.21.10, perl 5.21.9, perl 5.21.8, perl 5.21.7, perl 5.21.6, perl 5.21.5, perl 5.21.4, perl 5.21.2, perl 5.21.1, perl 5.19.10, perl 5.19.9, perl 5.19.7, perl 5.19.5, perl 5.19.4, perl 5.19.3, perl 5.19.2, perl 5.19.1, perl 5.18.0, perl 5.17.11, perl 5.17.8, perl 5.17.7, perl 5.17.6, perl 5.17.5, perl 5.17.4, perl 5.17.2, perl 5.17.1, perl 5.16.0, perl 5.15.8, perl 5.15.6, perl 5.15.4, perl 5.15.3, perl 5.15.2, perl 5.15.1, perl 5.13.10, perl 5.13.9, perl 5.13.8, perl 5.13.7, perl 5.13.6, perl 5.13.5, perl 5.13.4, perl 5.13.3, perl 5.13.2, perl 5.13.1, perl 5.11.5, perl 5.11.4, perl 5.11.2, perl 5.11.1, perl 5.11.0, perl 5.10.1, perl 5.10.0, perl 5.9.5, perl 5.9.4, perl 5.9.3, perl 5.9.2, perl 5.9.1, perl 5.9.0, perl 5.8.9, perl 5.8.8, perl 5.8.3, perl 5.8.1, perl 5.8.0, perl 5.7.3, perl 5.7.2, perl 5.7.1, perl 5.6.1, perl 5.6.0, perl 5.005_03, perl 5.005, perl 5.004_05, perl 5.004, perl 5.003_07 (at least), Backported version unknown =back =item BUGS =item AUTHORS =item COPYRIGHT =item SEE ALSO =back =head2 Devel::Peek - A data debugging tool for the XS programmer =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item Runtime debugging =item Memory footprint debugging =back =item EXAMPLES =over 4 =item A simple scalar string =item A simple scalar number =item A simple scalar with an extra reference =item A reference to a simple scalar =item A reference to an array =item A reference to a hash =item Dumping a large array or 
hash =item A reference to an SV which holds a C pointer =item A reference to a subroutine =back =item EXPORTS =item BUGS =item AUTHOR =item SEE ALSO =back =head2 Devel::SelfStubber - generate stubs for a SelfLoading module =over 4 =item SYNOPSIS =item DESCRIPTION =back =head2 Digest - Modules that calculate message digests =over 4 =item SYNOPSIS =item DESCRIPTION I<binary>, I<hex>, I<base64> =item OO INTERFACE $ctx = Digest->XXX($arg,...), $ctx = Digest->new(XXX => $arg,...), $ctx = Digest::XXX->new($arg,...), $other_ctx = $ctx->clone, $ctx->reset, $ctx->add( $data ), $ctx->add( $chunk1, $chunk2, ... ), $ctx->addfile( $io_handle ), $ctx->add_bits( $data, $nbits ), $ctx->add_bits( $bitstring ), $ctx->digest, $ctx->hexdigest, $ctx->b64digest =item Digest speed =item SEE ALSO =item AUTHOR =back =head2 Digest::MD5 - Perl interface to the MD5 Algorithm =over 4 =item SYNOPSIS =item DESCRIPTION =item FUNCTIONS md5($data,...), md5_hex($data,...), md5_base64($data,...) =item METHODS $md5 = Digest::MD5->new, $md5->reset, $md5->clone, $md5->add($data,...), $md5->addfile($io_handle), $md5->add_bits($data, $nbits), $md5->add_bits($bitstring), $md5->digest, $md5->hexdigest, $md5->b64digest, @ctx = $md5->context, $md5->context(@ctx) =item EXAMPLES =item SEE ALSO =item COPYRIGHT =item AUTHORS =back =head2 Digest::SHA - Perl extension for SHA-1/224/256/384/512 =over 4 =item SYNOPSIS =item SYNOPSIS (HMAC-SHA) =item ABSTRACT =item DESCRIPTION =item UNICODE AND SIDE EFFECTS =item NIST STATEMENT ON SHA-1 =item PADDING OF BASE64 DIGESTS =item EXPORT =item EXPORTABLE FUNCTIONS B<sha1($data, ...)>, B<sha224($data, ...)>, B<sha256($data, ...)>, B<sha384($data, ...)>, B<sha512($data, ...)>, B<sha512224($data, ...)>, B<sha512256($data, ...)>, B<sha1_hex($data, ...)>, B<sha224_hex($data, ...)>, B<sha256_hex($data, ...)>, B<sha384_hex($data, ...)>, B<sha512_hex($data, ...)>, B<sha512224_hex($data, ...)>, B<sha512256_hex($data, ...)>, B<sha1_base64($data, ...)>, B<sha224_base64($data, ...)>, 
B<sha256_base64($data, ...)>, B<sha384_base64($data, ...)>, B<sha512_base64($data, ...)>, B<sha512224_base64($data, ...)>, B<sha512256_base64($data, ...)>, B<new($alg)>, B<reset($alg)>, B<hashsize>, B<algorithm>, B<clone>, B<add($data, ...)>, B<add_bits($data, $nbits)>, B<add_bits($bits)>, B<addfile(*FILE)>, B<addfile($filename [, $mode])>, B<getstate>, B<putstate($str)>, B<dump($filename)>, B<load($filename)>, B<digest>, B<hexdigest>, B<b64digest>, B<hmac_sha1($data, $key)>, B<hmac_sha224($data, $key)>, B<hmac_sha256($data, $key)>, B<hmac_sha384($data, $key)>, B<hmac_sha512($data, $key)>, B<hmac_sha512224($data, $key)>, B<hmac_sha512256($data, $key)>, B<hmac_sha1_hex($data, $key)>, B<hmac_sha224_hex($data, $key)>, B<hmac_sha256_hex($data, $key)>, B<hmac_sha384_hex($data, $key)>, B<hmac_sha512_hex($data, $key)>, B<hmac_sha512224_hex($data, $key)>, B<hmac_sha512256_hex($data, $key)>, B<hmac_sha1_base64($data, $key)>, B<hmac_sha224_base64($data, $key)>, B<hmac_sha256_base64($data, $key)>, B<hmac_sha384_base64($data, $key)>, B<hmac_sha512_base64($data, $key)>, B<hmac_sha512224_base64($data, $key)>, B<hmac_sha512256_base64($data, $key)> =item SEE ALSO =item AUTHOR =item ACKNOWLEDGMENTS =item COPYRIGHT AND LICENSE =back =head2 Digest::base - Digest base class =over 4 =item SYNOPSIS =item DESCRIPTION =item SEE ALSO =back =head2 Digest::file - Calculate digests of files =over 4 =item SYNOPSIS =item DESCRIPTION digest_file( $file, $algorithm, [$arg,...] ), digest_file_hex( $file, $algorithm, [$arg,...] ), digest_file_base64( $file, $algorithm, [$arg,...] ) =item SEE ALSO =back =head2 DirHandle - (obsolete) supply object methods for directory handles =over 4 =item SYNOPSIS =item DESCRIPTION =back =head2 Dumpvalue - provides screen dump of Perl data. 
=over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item Creation C<arrayDepth>, C<hashDepth>, C<compactDump>, C<veryCompact>, C<globPrint>, C<dumpDBFiles>, C<dumpPackages>, C<dumpReused>, C<tick>, C<quoteHighBit>, C<printUndef>, C<usageOnly>, unctrl, subdump, bareStringify, quoteHighBit, stopDbSignal =item Methods dumpValue, dumpValues, stringify, dumpvars, set_quote, set_unctrl, compactDump, veryCompact, set, get =back =back =head2 DynaLoader - Dynamically load C libraries into Perl code =over 4 =item SYNOPSIS =item DESCRIPTION @dl_library_path, @dl_resolve_using, @dl_require_symbols, @dl_librefs, @dl_modules, @dl_shared_objects, dl_error(), $dl_debug, $dl_dlext, dl_findfile(), dl_expandspec(), dl_load_file(), dl_unload_file(), dl_load_flags(), dl_find_symbol(), dl_find_symbol_anywhere(), dl_undef_symbols(), dl_install_xsub(), bootstrap() =item AUTHOR =back =head2 Encode - character encodings in Perl =over 4 =item SYNOPSIS =over 4 =item Table of Contents L<Encode::Alias> - Alias definitions to encodings, L<Encode::Encoding> - Encode Implementation Base Class, L<Encode::Supported> - List of Supported Encodings, L<Encode::CN> - Simplified Chinese Encodings, L<Encode::JP> - Japanese Encodings, L<Encode::KR> - Korean Encodings, L<Encode::TW> - Traditional Chinese Encodings =back =item DESCRIPTION =over 4 =item TERMINOLOGY =back =item THE PERL ENCODING API =over 4 =item Basic methods =item Listing available encodings =item Defining Aliases =item Finding IANA Character Set Registry names =back =item Encoding via PerlIO =item Handling Malformed Data =over 4 =item List of I<CHECK> values perlqq mode (I<CHECK> = Encode::FB_PERLQQ), HTML charref mode (I<CHECK> = Encode::FB_HTMLCREF), XML charref mode (I<CHECK> = Encode::FB_XMLCREF) =item coderef for CHECK =back =item Defining Encodings =item The UTF8 flag Goal #1:, Goal #2:, Goal #3:, Goal #4: =over 4 =item Messing with Perl's Internals =back =item UTF-8 vs. utf8 vs. 
UTF8 =item SEE ALSO =item MAINTAINER =item COPYRIGHT =back =head2 Encode::Alias - alias definitions to encodings =over 4 =item SYNOPSIS =item DESCRIPTION As a simple string, As a qr// compiled regular expression, e.g.:, As a code reference, e.g.: =over 4 =item Alias overloading =back =item SEE ALSO =back =head2 Encode::Byte - Single Byte Encodings =over 4 =item SYNOPSIS =item ABSTRACT =item DESCRIPTION =item SEE ALSO =back =head2 Encode::CJKConstants -- Internally used by Encode::??::ISO_2022_* =head2 Encode::CN - China-based Chinese Encodings =over 4 =item SYNOPSIS =item DESCRIPTION =item NOTES =item BUGS =item SEE ALSO =back =head2 Encode::CN::HZ -- internally used by Encode::CN =head2 Encode::Config -- internally used by Encode =head2 Encode::EBCDIC - EBCDIC Encodings =over 4 =item SYNOPSIS =item ABSTRACT =item DESCRIPTION =item SEE ALSO =back =head2 Encode::Encoder -- Object Oriented Encoder =over 4 =item SYNOPSIS =item ABSTRACT =item Description =over 4 =item Predefined Methods $e = Encode::Encoder-E<gt>new([$data, $encoding]);, encoder(), $e-E<gt>data([$data]), $e-E<gt>encoding([$encoding]), $e-E<gt>bytes([$encoding]) =item Example: base64 transcoder =item Operator Overloading =back =item SEE ALSO =back =head2 Encode::Encoding - Encode Implementation Base Class =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item Methods you should implement -E<gt>encode($string [,$check]), -E<gt>decode($octets [,$check]), -E<gt>cat_decode($destination, $octets, $offset, $terminator [,$check]) =item Other methods defined in Encode::Encodings -E<gt>name, -E<gt>mime_name, -E<gt>renew, -E<gt>renewed, -E<gt>perlio_ok(), -E<gt>needs_lines() =item Example: Encode::ROT13 =back =item Why the heck Encode API is different? 
=over 4 =item Compiled Encodings =back =item SEE ALSO Scheme 1, Scheme 2, Other Schemes =back =head2 Encode::GSM0338 -- ETSI GSM 03.38 Encoding =over 4 =item SYNOPSIS =item DESCRIPTION =item NOTES =item BUGS =item SEE ALSO =back =head2 Encode::Guess -- Guesses encoding from data =over 4 =item SYNOPSIS =item ABSTRACT =item DESCRIPTION Encode::Guess->set_suspects, Encode::Guess->add_suspects, Encode::decode("Guess" ...), Encode::Guess->guess($data), guess_encoding($data, [, I<list of suspects>]) =item CAVEATS =item TO DO =item SEE ALSO =back =head2 Encode::JP - Japanese Encodings =over 4 =item SYNOPSIS =item ABSTRACT =item DESCRIPTION =item Note on ISO-2022-JP(-1)? =item BUGS =item SEE ALSO =back =head2 Encode::JP::H2Z -- internally used by Encode::JP::2022_JP* =head2 Encode::JP::JIS7 -- internally used by Encode::JP =head2 Encode::KR - Korean Encodings =over 4 =item SYNOPSIS =item DESCRIPTION =item BUGS =item SEE ALSO =back =head2 Encode::KR::2022_KR -- internally used by Encode::KR =head2 Encode::MIME::Header -- MIME encoding for an unstructured email header =over 4 =item SYNOPSIS =item ABSTRACT =item DESCRIPTION =item BUGS =item AUTHORS =item SEE ALSO =back =head2 Encode::MIME::Name, Encode::MIME::NAME -- internally used by Encode =over 4 =item SEE ALSO =back =head2 Encode::PerlIO -- a detailed document on Encode and PerlIO =over 4 =item Overview =item How does it work? =item Line Buffering =over 4 =item How can I tell whether my encoding fully supports PerlIO ?
=back =item SEE ALSO =back =head2 Encode::Supported -- Encodings supported by Encode =over 4 =item DESCRIPTION =over 4 =item Encoding Names =back =item Supported Encodings =over 4 =item Built-in Encodings =item Encode::Unicode -- other Unicode encodings =item Encode::Byte -- Extended ASCII ISO-8859 and corresponding vendor mappings, KOI8 - De Facto Standard for the Cyrillic world =item gsm0338 - Hentai Latin 1 gsm0338 support before 2.19 =item CJK: Chinese, Japanese, Korean (Multibyte) Encode::CN -- Continental China, Encode::JP -- Japan, Encode::KR -- Korea, Encode::TW -- Taiwan, Encode::HanExtra -- More Chinese via CPAN, Encode::JIS2K -- JIS X 0213 encodings via CPAN =item Miscellaneous encodings Encode::EBCDIC, Encode::Symbols, Encode::MIME::Header, Encode::Guess =back =item Unsupported encodings ISO-2022-JP-2 [RFC1554], ISO-2022-CN [RFC1922], Various HP-UX encodings, Cyrillic encoding ISO-IR-111, ISO-8859-8-1 [Hebrew], ISIRI 3342, Iran System, ISIRI 2900 [Farsi], Thai encoding TCVN, Vietnamese encodings VPS, Various Mac encodings, (Mac) Indic encodings =item Encoding vs. 
Charset -- terminology =item Encoding Classification (by Anton Tagunov and Dan Kogai) =over 4 =item Microsoft-related naming mess KS_C_5601-1987, GB2312, Big5, Shift_JIS =back =item Glossary character repertoire, coded character set (CCS), character encoding scheme (CES), charset (in MIME context), EUC, ISO-2022, UCS, UCS-2, Unicode, UTF, UTF-16 =item See Also =item References ECMA, ECMA-035 (eq C<ISO-2022>), IANA, Assigned Charset Names by IANA, ISO, RFC, UC, Unicode Glossary =over 4 =item Other Notable Sites czyborra.com, CJK.inf, Jungshik Shin's Hangul FAQ, debian.org: "Introduction to i18n" =item Offline sources C<CJKV Information Processing> by Ken Lunde =back =back =head2 Encode::Symbol - Symbol Encodings =over 4 =item SYNOPSIS =item ABSTRACT =item DESCRIPTION =item SEE ALSO =back =head2 Encode::TW - Taiwan-based Chinese Encodings =over 4 =item SYNOPSIS =item DESCRIPTION =item NOTES =item BUGS =item SEE ALSO =back =head2 Encode::Unicode -- Various Unicode Transformation Formats =over 4 =item SYNOPSIS =item ABSTRACT L<http://www.unicode.org/glossary/> says:, Quick Reference =item Size, Endianness, and BOM =over 4 =item by size =item by endianness BOM as integer when fetched in network byte order =back =item Surrogate Pairs =item Error Checking =item SEE ALSO =back =head2 Encode::Unicode::UTF7 -- UTF-7 encoding =over 4 =item SYNOPSIS =item ABSTRACT =item In Practice =item SEE ALSO =back =head2 English - use nice English (or awk) names for ugly punctuation variables =over 4 =item SYNOPSIS =item DESCRIPTION =item PERFORMANCE =back =head2 Env - perl module that imports environment variables as scalars or arrays =over 4 =item SYNOPSIS =item DESCRIPTION =item LIMITATIONS =item AUTHOR =back =head2 Errno - System errno constants =over 4 =item SYNOPSIS =item DESCRIPTION =item CAVEATS =item AUTHOR =item COPYRIGHT =back =head2 Exporter - Implements default import method for modules =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item How to Export =item Selecting What 
to Export =item How to Import C<use YourModule;>, C<use YourModule ();>, C<use YourModule qw(...);> =back =item Advanced Features =over 4 =item Specialised Import Lists =item Exporting Without Using Exporter's import Method =item Exporting Without Inheriting from Exporter =item Module Version Checking =item Managing Unknown Symbols =item Tag Handling Utility Functions =item Generating Combined Tags =item C<AUTOLOAD>ed Constants =back =item Good Practices =over 4 =item Declaring C<@EXPORT_OK> and Friends =item Playing Safe =item What Not to Export =back =item SEE ALSO =item LICENSE =back =head2 Exporter::Heavy - Exporter guts =over 4 =item SYNOPSIS =item DESCRIPTION =back =head2 ExtUtils::CBuilder - Compile and link C code for Perl modules =over 4 =item SYNOPSIS =item DESCRIPTION =item METHODS new, have_compiler, have_cplusplus, compile, C<object_file>, C<include_dirs>, C<extra_compiler_flags>, C<C++>, link, lib_file, module_name, extra_linker_flags, link_executable, exe_file, object_file, lib_file, exe_file, prelink, need_prelink, extra_link_args_after_prelink =item TO DO =item HISTORY =item SUPPORT =item AUTHOR =item COPYRIGHT =item SEE ALSO =back =head2 ExtUtils::CBuilder::Platform::Windows - Builder class for Windows platforms =over 4 =item DESCRIPTION =item AUTHOR =item SEE ALSO =back =head2 ExtUtils::Command - utilities to replace common UNIX commands in Makefiles etc. 
=over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item FUNCTIONS =back =back cat eqtime rm_rf rm_f touch mv cp chmod mkpath test_f test_d dos2unix =over 4 =item SEE ALSO =item AUTHOR =back =head2 ExtUtils::Command::MM - Commands for the MM's to use in Makefiles =over 4 =item SYNOPSIS =item DESCRIPTION B<test_harness> =back B<pod2man> B<warn_if_old_packlist> B<perllocal_install> B<uninstall> B<test_s> B<cp_nonempty> =head2 ExtUtils::Constant - generate XS code to import C header constants =over 4 =item SYNOPSIS =item DESCRIPTION =item USAGE IV, UV, NV, PV, PVN, SV, YES, NO, UNDEF =item FUNCTIONS =back constant_types XS_constant PACKAGE, TYPES, XS_SUBNAME, C_SUBNAME autoload PACKAGE, VERSION, AUTOLOADER WriteMakefileSnippet WriteConstants ATTRIBUTE =E<gt> VALUE [, ...], NAME, DEFAULT_TYPE, BREAKOUT_AT, NAMES, PROXYSUBS, C_FH, C_FILE, XS_FH, XS_FILE, XS_SUBNAME, C_SUBNAME =over 4 =item AUTHOR =back =head2 ExtUtils::Constant::Base - base class for ExtUtils::Constant objects =over 4 =item SYNOPSIS =item DESCRIPTION =item USAGE =back header memEQ_clause args_hashref dump_names arg_hashref, ITEM.. assign arg_hashref, VALUE.. return_clause arg_hashref, ITEM switch_clause arg_hashref, NAMELEN, ITEMHASH, ITEM.. params WHAT dogfood arg_hashref, ITEM.. normalise_items args, default_type, seen_types, seen_items, ITEM.. C_constant arg_hashref, ITEM.., name, type, value, macro, default, pre, post, def_pre, def_post, utf8, weight =over 4 =item BUGS =item AUTHOR =back =head2 ExtUtils::Constant::Utils - helper functions for ExtUtils::Constant =over 4 =item SYNOPSIS =item DESCRIPTION =item USAGE C_stringify NAME =back perl_stringify NAME =over 4 =item AUTHOR =back =head2 ExtUtils::Constant::XS - generate C code for XS modules' constants. 
=over 4 =item SYNOPSIS =item DESCRIPTION =item BUGS =item AUTHOR =back =head2 ExtUtils::Embed - Utilities for embedding Perl in C/C++ applications =over 4 =item SYNOPSIS =item DESCRIPTION =item @EXPORT =item FUNCTIONS xsinit(), Examples, ldopts(), Examples, perl_inc(), ccflags(), ccdlflags(), ccopts(), xsi_header(), xsi_protos(@modules), xsi_body(@modules) =item EXAMPLES =item SEE ALSO =item AUTHOR =back =head2 ExtUtils::Install - install files from here to there =over 4 =item SYNOPSIS =item VERSION =back =over 4 =item DESCRIPTION _chmod($$;$), _warnonce(@), _choke(@) =back _move_file_at_boot( $file, $target, $moan ) _unlink_or_rename( $file, $tryhard, $installing ) =over 4 =item Functions _get_install_skip =back _have_write_access _can_write_dir(C<$dir>) _mkpath($dir,$show,$mode,$verbose,$dry_run) _copy($from,$to,$verbose,$dry_run) _chdir($from) B<install> _do_cleanup install_rooted_file( $file ), install_rooted_dir( $dir ) forceunlink( $file, $tryhard ) directory_not_empty( $dir ) B<install_default> I<DISCOURAGED> B<uninstall> inc_uninstall($filepath,$libdir,$verbose,$dry_run,$ignore,$results) run_filter($cmd,$src,$dest) B<pm_to_blib> _autosplit _invokant =over 4 =item ENVIRONMENT B<PERL_INSTALL_ROOT>, B<EU_INSTALL_IGNORE_SKIP>, B<EU_INSTALL_SITE_SKIPFILE>, B<EU_INSTALL_ALWAYS_COPY> =item AUTHOR =item LICENSE =back =head2 ExtUtils::Installed - Inventory management of installed modules =over 4 =item SYNOPSIS =item DESCRIPTION =item USAGE =item METHODS new(), modules(), files(), directories(), directory_tree(), validate(), packlist(), version() =item EXAMPLE =item AUTHOR =back =head2 ExtUtils::Liblist - determine libraries to use and how to use them =over 4 =item SYNOPSIS =item DESCRIPTION For static extensions, For dynamic extensions at build/link time, For dynamic extensions at load time =over 4 =item EXTRALIBS =item LDLOADLIBS and LD_RUN_PATH =item BSLOADLIBS =back =item PORTABILITY =over 4 =item VMS implementation =item Win32 implementation =back =item SEE ALSO 
=back =head2 ExtUtils::MM - OS adjusted ExtUtils::MakeMaker subclass =over 4 =item SYNOPSIS =item DESCRIPTION =back =head2 ExtUtils::MM::Utils - ExtUtils::MM methods without dependency on ExtUtils::MakeMaker =over 4 =item SYNOPSIS =item DESCRIPTION =item METHODS maybe_command =back =over 4 =item BUGS =item SEE ALSO =back =head2 ExtUtils::MM_AIX - AIX specific subclass of ExtUtils::MM_Unix =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item Overridden methods =back =back =over 4 =item AUTHOR =item SEE ALSO =back =head2 ExtUtils::MM_Any - Platform-agnostic MM methods =over 4 =item SYNOPSIS =item DESCRIPTION =item METHODS =over 4 =item Cross-platform helper methods =back =back =over 4 =item Targets =back =over 4 =item Init methods =back =over 4 =item Tools =back =over 4 =item File::Spec wrappers =back =over 4 =item Misc =back =over 4 =item AUTHOR =back =head2 ExtUtils::MM_BeOS - methods to override UN*X behaviour in ExtUtils::MakeMaker =over 4 =item SYNOPSIS =item DESCRIPTION =back os_flavor init_linker =head2 ExtUtils::MM_Cygwin - methods to override UN*X behaviour in ExtUtils::MakeMaker =over 4 =item SYNOPSIS =item DESCRIPTION os_flavor =back cflags replace_manpage_separator init_linker maybe_command dynamic_lib install =head2 ExtUtils::MM_DOS - DOS specific subclass of ExtUtils::MM_Unix =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item Overridden methods os_flavor =back =back B<replace_manpage_separator> xs_static_lib_is_xs =over 4 =item AUTHOR =item SEE ALSO =back =head2 ExtUtils::MM_Darwin - special behaviors for OS X =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item Overridden Methods =back =back =head2 ExtUtils::MM_MacOS - once produced Makefiles for MacOS Classic =over 4 =item SYNOPSIS =item DESCRIPTION =back =head2 ExtUtils::MM_NW5 - methods to override UN*X behaviour in ExtUtils::MakeMaker =over 4 =item SYNOPSIS =item DESCRIPTION =back os_flavor init_platform, platform_constants static_lib_pure_cmd xs_static_lib_is_xs dynamic_lib =head2 
ExtUtils::MM_OS2 - methods to override UN*X behaviour in ExtUtils::MakeMaker =over 4 =item SYNOPSIS =item DESCRIPTION =item METHODS init_dist =back init_linker os_flavor xs_static_lib_is_xs =head2 ExtUtils::MM_QNX - QNX specific subclass of ExtUtils::MM_Unix =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item Overridden methods =back =back =over 4 =item AUTHOR =item SEE ALSO =back =head2 ExtUtils::MM_UWIN - U/WIN specific subclass of ExtUtils::MM_Unix =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item Overridden methods os_flavor =back =back B<replace_manpage_separator> =over 4 =item AUTHOR =item SEE ALSO =back =head2 ExtUtils::MM_Unix - methods used by ExtUtils::MakeMaker =over 4 =item SYNOPSIS =item DESCRIPTION =item METHODS =back =over 4 =item Methods os_flavor =back c_o (o) xs_obj_opt dbgoutflag cflags (o) const_cccmd (o) const_config (o) const_loadlibs (o) constants (o) depend (o) init_DEST init_dist dist (o) dist_basics (o) dist_ci (o) dist_core (o) B<dist_target> B<tardist_target> B<zipdist_target> B<tarfile_target> zipfile_target uutardist_target shdist_target dlsyms (o) dynamic_bs (o) dynamic_lib (o) xs_dynamic_lib_macros xs_make_dynamic_lib exescan extliblist find_perl fixin force (o) guess_name has_link_code init_dirscan init_MANPODS init_MAN1PODS init_MAN3PODS init_PM init_DIRFILESEP init_main init_tools init_linker init_lib2arch init_PERL init_platform, platform_constants init_PERM init_xs install (o) installbin (o) linkext (o) lsdir macro (o) makeaperl (o) xs_static_lib_is_xs (o) makefile (o) maybe_command needs_linking (o) parse_abstract parse_version pasthru (o) perl_script perldepend (o) pm_to_blib ppd prefixify processPL (o) specify_shell quote_paren replace_manpage_separator cd oneliner quote_literal escape_newlines max_exec_len static (o) xs_make_static_lib static_lib_closures static_lib_fixtures static_lib_pure_cmd staticmake (o) subdir_x (o) subdirs (o) test (o) test_via_harness (override) test_via_script (override) tool_xsubpp (o) 
all_target top_targets (o) writedoc xs_c (o) xs_cpp (o) xs_o (o) =over 4 =item SEE ALSO =back =head2 ExtUtils::MM_VMS - methods to override UN*X behaviour in ExtUtils::MakeMaker =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item Methods always loaded wraplist =back =back =over 4 =item Methods guess_name (override) =back find_perl (override) _fixin_replace_shebang (override) maybe_command (override) pasthru (override) pm_to_blib (override) perl_script (override) replace_manpage_separator init_DEST init_DIRFILESEP init_main (override) init_tools (override) init_platform (override) platform_constants init_VERSION (override) constants (override) special_targets cflags (override) const_cccmd (override) tools_other (override) init_dist (override) c_o (override) xs_c (override) xs_o (override) _xsbuild_replace_macro (override) _xsbuild_value (override) dlsyms (override) xs_obj_opt dynamic_lib (override) xs_make_static_lib (override) static_lib_pure_cmd (override) xs_static_lib_is_xs extra_clean_files zipfile_target, tarfile_target, shdist_target install (override) perldepend (override) makeaperl (override) maketext_filter (override) prefixify (override) cd oneliner B<echo> quote_literal escape_dollarsigns escape_all_dollarsigns escape_newlines max_exec_len init_linker catdir (override), catfile (override) eliminate_macros fixpath os_flavor is_make_type (override) make_type (override) =over 4 =item AUTHOR =back =head2 ExtUtils::MM_VOS - VOS specific subclass of ExtUtils::MM_Unix =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item Overridden methods =back =back =over 4 =item AUTHOR =item SEE ALSO =back =head2 ExtUtils::MM_Win32 - methods to override UN*X behaviour in ExtUtils::MakeMaker =over 4 =item SYNOPSIS =item DESCRIPTION =back =over 4 =item Overridden methods B<dlsyms> =back xs_dlsyms_ext replace_manpage_separator B<maybe_command> B<init_DIRFILESEP> init_tools init_others init_platform, platform_constants specify_shell constants special_targets 
static_lib_pure_cmd dynamic_lib extra_clean_files init_linker perl_script quote_dep xs_obj_opt pasthru arch_check (override) oneliner cd max_exec_len os_flavor dbgoutflag cflags make_type =head2 ExtUtils::MM_Win95 - method to customize MakeMaker for Win9X =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item Overridden methods max_exec_len =back =back os_flavor =over 4 =item AUTHOR =back =head2 ExtUtils::MY - ExtUtils::MakeMaker subclass for customization =over 4 =item SYNOPSIS =item DESCRIPTION =back =head2 ExtUtils::MakeMaker - Create a module Makefile =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item How To Write A Makefile.PL =item Default Makefile Behaviour =item make test =item make testdb =item make install =item INSTALL_BASE =item PREFIX and LIB attribute =item AFS users =item Static Linking of a new Perl Binary =item Determination of Perl Library and Installation Locations =item Which architecture dependent directory? =item Using Attributes and Parameters ABSTRACT, ABSTRACT_FROM, AUTHOR, BINARY_LOCATION, BUILD_REQUIRES, C, CCFLAGS, CONFIG, CONFIGURE, CONFIGURE_REQUIRES, DEFINE, DESTDIR, DIR, DISTNAME, DISTVNAME, DLEXT, DL_FUNCS, DL_VARS, EXCLUDE_EXT, EXE_FILES, FIRST_MAKEFILE, FULLPERL, FULLPERLRUN, FULLPERLRUNINST, FUNCLIST, H, IMPORTS, INC, INCLUDE_EXT, INSTALLARCHLIB, INSTALLBIN, INSTALLDIRS, INSTALLMAN1DIR, INSTALLMAN3DIR, INSTALLPRIVLIB, INSTALLSCRIPT, INSTALLSITEARCH, INSTALLSITEBIN, INSTALLSITELIB, INSTALLSITEMAN1DIR, INSTALLSITEMAN3DIR, INSTALLSITESCRIPT, INSTALLVENDORARCH, INSTALLVENDORBIN, INSTALLVENDORLIB, INSTALLVENDORMAN1DIR, INSTALLVENDORMAN3DIR, INSTALLVENDORSCRIPT, INST_ARCHLIB, INST_BIN, INST_LIB, INST_MAN1DIR, INST_MAN3DIR, INST_SCRIPT, LD, LDDLFLAGS, LDFROM, LIB, LIBPERL_A, LIBS, LICENSE, LINKTYPE, MAGICXS, MAKE, MAKEAPERL, MAKEFILE_OLD, MAN1PODS, MAN3PODS, MAP_TARGET, META_ADD, META_MERGE, MIN_PERL_VERSION, MYEXTLIB, NAME, NEEDS_LINKING, NOECHO, NORECURS, NO_META, NO_MYMETA, NO_PACKLIST, NO_PERLLOCAL, NO_VC, OBJECT, OPTIMIZE, 
PERL, PERL_CORE, PERLMAINCC, PERL_ARCHLIB, PERL_LIB, PERL_MALLOC_OK, PERLPREFIX, PERLRUN, PERLRUNINST, PERL_SRC, PERM_DIR, PERM_RW, PERM_RWX, PL_FILES, PM, PMLIBDIRS, PM_FILTER, POLLUTE, PPM_INSTALL_EXEC, PPM_INSTALL_SCRIPT, PPM_UNINSTALL_EXEC, PPM_UNINSTALL_SCRIPT, PREFIX, PREREQ_FATAL, PREREQ_PM, PREREQ_PRINT, PRINT_PREREQ, SITEPREFIX, SIGN, SKIP, TEST_REQUIRES, TYPEMAPS, USE_MM_LD_RUN_PATH, VENDORPREFIX, VERBINST, VERSION, VERSION_FROM, VERSION_SYM, XS, XSBUILD, XSMULTI, XSOPT, XSPROTOARG, XS_VERSION =item Additional lowercase attributes clean, depend, dist, dynamic_lib, linkext, macro, postamble, realclean, test, tool_autosplit =item Overriding MakeMaker Methods =item The End Of Cargo Cult Programming C<< MAN3PODS => ' ' >> =item Hintsfile support =item Distribution Support make distcheck, make skipcheck, make distclean, make veryclean, make manifest, make distdir, make disttest, make tardist, make dist, make uutardist, make shdist, make zipdist, make ci =item Module Meta-Data (META and MYMETA) =item Disabling an extension =item Other Handy Functions prompt, os_unsupported =item Supported versions of Perl =back =item ENVIRONMENT PERL_MM_OPT, PERL_MM_USE_DEFAULT, PERL_CORE =item SEE ALSO =item AUTHORS =item LICENSE =back =head2 ExtUtils::MakeMaker::Config - Wrapper around Config.pm =over 4 =item SYNOPSIS =item DESCRIPTION =back =head2 ExtUtils::MakeMaker::FAQ - Frequently Asked Questions About MakeMaker =over 4 =item DESCRIPTION =over 4 =item Module Installation How do I install a module into my home directory?, How do I get MakeMaker and Module::Build to install to the same place?, How do I keep from installing man pages?, How do I use a module without installing it?, How can I organize tests into subdirectories and have them run?, PREFIX vs INSTALL_BASE from Module::Build::Cookbook, Generating *.pm files with substitutions eg of $VERSION =item Common errors and problems "No rule to make target `/usr/lib/perl5/CORE/config.h', needed by `Makefile'" =item 
Philosophy and History Why not just use <insert other build config tool here>?, What is Module::Build and how does it relate to MakeMaker?, pure perl. no make, no shell commands, easier to customize, cleaner internals, less cruft =item Module Writing How do I keep my $VERSION up to date without resetting it manually?, What's this F<META.yml> thing and how did it get in my F<MANIFEST>?!, How do I delete everything not in my F<MANIFEST>?, Which tar should I use on Windows?, Which zip should I use on Windows for '[ndg]make zipdist'? =item XS How do I prevent "object version X.XX does not match bootstrap parameter Y.YY" errors?, How do I make two or more XS files coexist in the same directory?, XSMULTI, Separate directories, Bootstrapping =back =item DESIGN =over 4 =item MakeMaker object hierarchy (simplified) =item MakeMaker object hierarchy (real) =item The MM_* hierarchy =back =item PATCHING make a pull request on the MakeMaker GitHub repository, raise an issue on the MakeMaker GitHub repository, file an RT ticket, email makemaker@perl.org =item AUTHOR =item SEE ALSO =back =head2 ExtUtils::MakeMaker::Locale - bundled Encode::Locale =over 4 =item SYNOPSIS =item DESCRIPTION decode_argv( ), decode_argv( Encode::FB_CROAK ), env( $uni_key ), env( $uni_key => $uni_value ), reinit( ), reinit( $encoding ), $ENCODING_LOCALE, $ENCODING_LOCALE_FS, $ENCODING_CONSOLE_IN, $ENCODING_CONSOLE_OUT =item NOTES =over 4 =item Windows =item Mac OS X =item POSIX (Linux and other Unixes) =back =item SEE ALSO =item AUTHOR =back =head2 ExtUtils::MakeMaker::Tutorial - Writing a module with MakeMaker =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item The Mantra =item The Layout Makefile.PL, MANIFEST, lib/, t/, Changes, README, INSTALL, MANIFEST.SKIP, bin/ =back =item SEE ALSO =back =head2 ExtUtils::Manifest - Utilities to write and check a MANIFEST file =over 4 =item VERSION =item SYNOPSIS =item DESCRIPTION =item FUNCTIONS =over 4 =item mkmanifest =back =back =over 4 =item manifind =back
=over 4 =item manicheck =back =over 4 =item filecheck =back =over 4 =item fullcheck =back =over 4 =item skipcheck =back =over 4 =item maniread =back =over 4 =item maniskip =back =over 4 =item manicopy =back =over 4 =item maniadd =back =over 4 =item MANIFEST =item MANIFEST.SKIP #!include_default, #!include /Path/to/another/manifest.skip =item EXPORT_OK =item GLOBAL VARIABLES =back =over 4 =item DIAGNOSTICS C<Not in MANIFEST:> I<file>, C<Skipping> I<file>, C<No such file:> I<file>, C<MANIFEST:> I<$!>, C<Added to MANIFEST:> I<file> =item ENVIRONMENT B<PERL_MM_MANIFEST_DEBUG> =item SEE ALSO =item AUTHOR =item COPYRIGHT AND LICENSE =back =head2 ExtUtils::Miniperl - write the C code for miniperlmain.c and perlmain.c =over 4 =item SYNOPSIS =item DESCRIPTION =item SEE ALSO =back =head2 ExtUtils::Mkbootstrap - make a bootstrap file for use by DynaLoader =over 4 =item SYNOPSIS =item DESCRIPTION =back =head2 ExtUtils::Mksymlists - write linker options files for dynamic extension =over 4 =item SYNOPSIS =item DESCRIPTION DLBASE, DL_FUNCS, DL_VARS, FILE, FUNCLIST, IMPORTS, NAME =item AUTHOR =item REVISION mkfh() =back __find_relocations =head2 ExtUtils::Packlist - manage .packlist files =over 4 =item SYNOPSIS =item DESCRIPTION =item USAGE =item FUNCTIONS new(), read(), write(), validate(), packlist_file() =item EXAMPLE =item AUTHOR =back =head2 ExtUtils::ParseXS - converts Perl XS code into C code =over 4 =item SYNOPSIS =item DESCRIPTION =item EXPORT =item METHODS $pxs->new(), $pxs->process_file(), B<C++>, B<hiertype>, B<except>, B<typemap>, B<prototypes>, B<versioncheck>, B<linenumbers>, B<optimize>, B<inout>, B<argtypes>, B<s>, $pxs->report_error_count() =item AUTHOR =item COPYRIGHT =item SEE ALSO =back =head2 ExtUtils::ParseXS::Constants - Initialization values for some globals =over 4 =item SYNOPSIS =item DESCRIPTION =back =head2 ExtUtils::ParseXS::Eval - Clean package to evaluate code in =over 4 =item SYNOPSIS =item SUBROUTINES =over 4 =item 
$pxs->eval_output_typemap_code($typemapcode, $other_hashref) =back =back =over 4 =item $pxs->eval_input_typemap_code($typemapcode, $other_hashref) =back =over 4 =item TODO =back =head2 ExtUtils::ParseXS::Utilities - Subroutines used with ExtUtils::ParseXS =over 4 =item SYNOPSIS =item SUBROUTINES =over 4 =item C<standard_typemap_locations()> Purpose, Arguments, Return Value =back =back =over 4 =item C<trim_whitespace()> Purpose, Argument, Return Value =back =over 4 =item C<C_string()> Purpose, Arguments, Return Value =back =over 4 =item C<valid_proto_string()> Purpose, Arguments, Return Value =back =over 4 =item C<process_typemaps()> Purpose, Arguments, Return Value =back =over 4 =item C<map_type()> Purpose, Arguments, Return Value =back =over 4 =item C<standard_XS_defs()> Purpose, Arguments, Return Value =back =over 4 =item C<assign_func_args()> Purpose, Arguments, Return Value =back =over 4 =item C<analyze_preprocessor_statements()> Purpose, Arguments, Return Value =back =over 4 =item C<set_cond()> Purpose, Arguments, Return Value =back =over 4 =item C<current_line_number()> Purpose, Arguments, Return Value =back =over 4 =item C<Warn()> Purpose, Arguments, Return Value =back =over 4 =item C<blurt()> Purpose, Arguments, Return Value =back =over 4 =item C<death()> Purpose, Arguments, Return Value =back =over 4 =item C<check_conditional_preprocessor_statements()> Purpose, Arguments, Return Value =back =over 4 =item C<escape_file_for_line_directive()> Purpose, Arguments, Return Value =back =over 4 =item C<report_typemap_failure> Purpose, Arguments, Return Value =back =head2 ExtUtils::Typemaps - Read/Write/Modify Perl/XS typemap files =over 4 =item SYNOPSIS =item DESCRIPTION =item METHODS =back =over 4 =item new =back =over 4 =item file =back =over 4 =item add_typemap =back =over 4 =item add_inputmap =back =over 4 =item add_outputmap =back =over 4 =item add_string =back =over 4 =item remove_typemap =back =over 4 =item remove_inputmap =back =over 4 =item remove_outputmap 
=back =over 4 =item get_typemap =back =over 4 =item get_inputmap =back =over 4 =item get_outputmap =back =over 4 =item write =back =over 4 =item as_string =back =over 4 =item as_embedded_typemap =back =over 4 =item merge =back =over 4 =item is_empty =back =over 4 =item list_mapped_ctypes =back =over 4 =item _get_typemap_hash =back =over 4 =item _get_inputmap_hash =back =over 4 =item _get_outputmap_hash =back =over 4 =item _get_prototype_hash =back =over 4 =item clone =back =over 4 =item tidy_type =back =over 4 =item CAVEATS =item SEE ALSO =item AUTHOR =item COPYRIGHT & LICENSE =back =head2 ExtUtils::Typemaps::Cmd - Quick commands for handling typemaps =over 4 =item SYNOPSIS =item DESCRIPTION =item EXPORTED FUNCTIONS =over 4 =item embeddable_typemap =back =item SEE ALSO =item AUTHOR =item COPYRIGHT & LICENSE =back =head2 ExtUtils::Typemaps::InputMap - Entry in the INPUT section of a typemap =over 4 =item SYNOPSIS =item DESCRIPTION =item METHODS =back =over 4 =item new =back =over 4 =item code =back =over 4 =item xstype =back =over 4 =item cleaned_code =back =over 4 =item SEE ALSO =item AUTHOR =item COPYRIGHT & LICENSE =back =head2 ExtUtils::Typemaps::OutputMap - Entry in the OUTPUT section of a typemap =over 4 =item SYNOPSIS =item DESCRIPTION =item METHODS =back =over 4 =item new =back =over 4 =item code =back =over 4 =item xstype =back =over 4 =item cleaned_code =back =over 4 =item targetable =back =over 4 =item SEE ALSO =item AUTHOR =item COPYRIGHT & LICENSE =back =head2 ExtUtils::Typemaps::Type - Entry in the TYPEMAP section of a typemap =over 4 =item SYNOPSIS =item DESCRIPTION =item METHODS =back =over 4 =item new =back =over 4 =item proto =back =over 4 =item xstype =back =over 4 =item ctype =back =over 4 =item tidy_ctype =back =over 4 =item SEE ALSO =item AUTHOR =item COPYRIGHT & LICENSE =back =head2 ExtUtils::testlib - add blib/* directories to @INC =over 4 =item SYNOPSIS =item DESCRIPTION =back =head2 Fatal - Replace functions with equivalents which succeed 
or die =over 4 =item SYNOPSIS =item BEST PRACTICE =item DESCRIPTION =item DIAGNOSTICS Bad subroutine name for Fatal: %s, %s is not a Perl subroutine, %s is neither a builtin, nor a Perl subroutine, Cannot make the non-overridable %s fatal, Internal error: %s =item BUGS =item AUTHOR =item LICENSE =item SEE ALSO =back =head2 Fcntl - load the C Fcntl.h defines =over 4 =item SYNOPSIS =item DESCRIPTION =item NOTE =item EXPORTED SYMBOLS =back =head2 File::Basename - Parse file paths into directory, filename and suffix. =over 4 =item SYNOPSIS =item DESCRIPTION =back C<fileparse> X<fileparse> C<basename> X<basename> X<filename> C<dirname> X<dirname> C<fileparse_set_fstype> X<filesystem> =over 4 =item SEE ALSO =back =head2 File::Compare - Compare files or filehandles =over 4 =item SYNOPSIS =item DESCRIPTION =item RETURN =item AUTHOR =back =head2 File::Copy - Copy files or filehandles =over 4 =item SYNOPSIS =item DESCRIPTION copy X<copy> X<cp>, move X<move> X<mv> X<rename>, syscopy X<syscopy>, rmscopy($from,$to[,$date_flag]) X<rmscopy> =item RETURN =item NOTES =item AUTHOR =back =head2 File::DosGlob - DOS like globbing and then some =over 4 =item SYNOPSIS =item DESCRIPTION =item EXPORTS (by request only) =item BUGS =item AUTHOR =item HISTORY =item SEE ALSO =back =head2 File::Fetch - A generic file fetching mechanism =over 4 =item SYNOPSIS =item DESCRIPTION =item ACCESSORS $ff->uri, $ff->scheme, $ff->host, $ff->vol, $ff->share, $ff->path, $ff->file, $ff->file_default =back $ff->output_file =over 4 =item METHODS =over 4 =item $ff = File::Fetch->new( uri => 'http://some.where.com/dir/file.txt' ); =back =back =over 4 =item $where = $ff->fetch( [to => /my/output/dir/ | \$scalar] ) =back =over 4 =item $ff->error([BOOL]) =back =over 4 =item HOW IT WORKS =item GLOBAL VARIABLES =over 4 =item $File::Fetch::FROM_EMAIL =item $File::Fetch::USER_AGENT =item $File::Fetch::FTP_PASSIVE =item $File::Fetch::TIMEOUT =item $File::Fetch::WARN =item $File::Fetch::DEBUG =item 
$File::Fetch::BLACKLIST =item $File::Fetch::METHOD_FAIL =back =item MAPPING =item FREQUENTLY ASKED QUESTIONS =over 4 =item So how do I use a proxy with File::Fetch? =item I used 'lynx' to fetch a file, but its contents is all wrong! =item Files I'm trying to fetch have reserved characters or non-ASCII characters in them. What do I do? =back =item TODO Implement $PREFER_BIN =item BUG REPORTS =item AUTHOR =item COPYRIGHT =back =head2 File::Find - Traverse a directory tree. =over 4 =item SYNOPSIS =item DESCRIPTION B<find>, B<finddepth> =over 4 =item %options C<wanted>, C<bydepth>, C<preprocess>, C<postprocess>, C<follow>, C<follow_fast>, C<follow_skip>, C<dangling_symlinks>, C<no_chdir>, C<untaint>, C<untaint_pattern>, C<untaint_skip> =item The wanted function C<$File::Find::dir> is the current directory name, C<$_> is the current filename within that directory, C<$File::Find::name> is the complete pathname to the file =back =item WARNINGS =item CAVEAT $dont_use_nlink, symlinks =item BUGS AND CAVEATS =item HISTORY =item SEE ALSO =back =head2 File::Glob - Perl extension for BSD glob routine =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item META CHARACTERS =item EXPORTS =item POSIX FLAGS C<GLOB_ERR>, C<GLOB_LIMIT>, C<GLOB_MARK>, C<GLOB_NOCASE>, C<GLOB_NOCHECK>, C<GLOB_NOSORT>, C<GLOB_BRACE>, C<GLOB_NOMAGIC>, C<GLOB_QUOTE>, C<GLOB_TILDE>, C<GLOB_CSH>, C<GLOB_ALPHASORT> =back =item DIAGNOSTICS C<GLOB_NOSPACE>, C<GLOB_ABEND> =item NOTES =item SEE ALSO =item AUTHOR =back =head2 File::GlobMapper - Extend File Glob to Allow Input and Output Files =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item Behind The Scenes =item Limitations =item Input File Glob B<~>, B<~user>, B<.>, B<*>, B<?>, B<\>, B<[]>, B<{,}>, B<()> =item Output File Glob "*", #1 =item Returned Data =back =item EXAMPLES =over 4 =item A Rename script =item A few example globmaps =back =item SEE ALSO =item AUTHOR =item COPYRIGHT AND LICENSE =back =head2 File::Path - Create or remove directory trees =over 
4 =item VERSION =item SYNOPSIS =item DESCRIPTION make_path( $dir1, $dir2, .... ), make_path( $dir1, $dir2, ...., \%opts ), mode => $num, chmod => $num, verbose => $bool, error => \$err, owner => $owner, user => $owner, uid => $owner, group => $group, mkpath( $dir ), mkpath( $dir, $verbose, $mode ), mkpath( [$dir1, $dir2,...], $verbose, $mode ), mkpath( $dir1, $dir2,..., \%opt ), remove_tree( $dir1, $dir2, .... ), remove_tree( $dir1, $dir2, ...., \%opts ), verbose => $bool, safe => $bool, keep_root => $bool, result => \$res, error => \$err, rmtree( $dir ), rmtree( $dir, $verbose, $safe ), rmtree( [$dir1, $dir2,...], $verbose, $safe ), rmtree( $dir1, $dir2,..., \%opt ) =over 4 =item ERROR HANDLING B<NOTE:> =item NOTES L<http://cve.circl.lu/cve/CVE-2004-0452>, L<http://cve.circl.lu/cve/CVE-2005-0448> =back =item DIAGNOSTICS mkdir [path]: [errmsg] (SEVERE), No root path(s) specified, No such file or directory, cannot fetch initial working directory: [errmsg], cannot stat initial working directory: [errmsg], cannot chdir to [dir]: [errmsg], directory [dir] changed before chdir, expected dev=[n] ino=[n], actual dev=[n] ino=[n], aborting. (FATAL), cannot make directory [dir] read+writeable: [errmsg], cannot read [dir]: [errmsg], cannot reset chmod [dir]: [errmsg], cannot remove [dir] when cwd is [dir], cannot chdir to [parent-dir] from [child-dir]: [errmsg], aborting. (FATAL), cannot stat prior working directory [dir]: [errmsg], aborting. (FATAL), previous directory [parent-dir] changed before entering [child-dir], expected dev=[n] ino=[n], actual dev=[n] ino=[n], aborting. 
(FATAL), cannot make directory [dir] writeable: [errmsg], cannot remove directory [dir]: [errmsg], cannot restore permissions of [dir] to [0nnn]: [errmsg], cannot make file [file] writeable: [errmsg], cannot unlink file [file]: [errmsg], cannot restore permissions of [file] to [0nnn]: [errmsg], unable to map [owner] to a uid, ownership not changed, unable to map [group] to a gid, group ownership not changed =item SEE ALSO =item BUGS AND LIMITATIONS =over 4 =item MULTITHREADED APPLICATIONS =item NFS Mount Points =item REPORTING BUGS =back =item ACKNOWLEDGEMENTS =item AUTHORS =item CONTRIBUTORS <F<bulkdd@cpan.org>>, Charlie Gonzalez <F<itcharlie@cpan.org>>, Craig A. Berry <F<craigberry@mac.com>>, James E Keenan <F<jkeenan@cpan.org>>, John Lightsey <F<john@perlsec.org>>, Nigel Horne <F<njh@bandsman.co.uk>>, Richard Elberger <F<riche@cpan.org>>, Ryan Yee <F<ryee@cpan.org>>, Skye Shaw <F<shaw@cpan.org>>, Tom Lutz <F<tommylutz@gmail.com>>, Will Sheppard <F<willsheppard@github>> =item COPYRIGHT =item LICENSE =back =head2 File::Spec - portably perform operations on file names =over 4 =item SYNOPSIS =item DESCRIPTION =item METHODS canonpath X<canonpath>, catdir X<catdir>, catfile X<catfile>, curdir X<curdir>, devnull X<devnull>, rootdir X<rootdir>, tmpdir X<tmpdir>, updir X<updir>, no_upwards, case_tolerant, file_name_is_absolute, path X<path>, join X<join, path>, splitpath X<splitpath> X<split, path>, splitdir X<splitdir> X<split, dir>, catpath(), abs2rel X<abs2rel> X<absolute, path> X<relative, path>, rel2abs() X<rel2abs> X<absolute, path> X<relative, path> =item SEE ALSO =item AUTHOR =item COPYRIGHT =back =head2 File::Spec::AmigaOS - File::Spec for AmigaOS =over 4 =item SYNOPSIS =item DESCRIPTION =item METHODS tmpdir =back file_name_is_absolute =head2 File::Spec::Cygwin - methods for Cygwin file specs =over 4 =item SYNOPSIS =item DESCRIPTION =back canonpath file_name_is_absolute tmpdir (override) case_tolerant =over 4 =item COPYRIGHT =back =head2 File::Spec::Epoc - 
methods for Epoc file specs =over 4 =item SYNOPSIS =item DESCRIPTION =back canonpath() =over 4 =item AUTHOR =item COPYRIGHT =item SEE ALSO =back =head2 File::Spec::Functions - portably perform operations on file names =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item Exports =back =item COPYRIGHT =item SEE ALSO =back =head2 File::Spec::Mac - File::Spec for Mac OS (Classic) =over 4 =item SYNOPSIS =item DESCRIPTION =item METHODS canonpath =back catdir() catfile curdir devnull rootdir tmpdir updir file_name_is_absolute path splitpath splitdir catpath abs2rel rel2abs =over 4 =item AUTHORS =item COPYRIGHT =item SEE ALSO =back =head2 File::Spec::OS2 - methods for OS/2 file specs =over 4 =item SYNOPSIS =item DESCRIPTION tmpdir, splitpath =item COPYRIGHT =back =head2 File::Spec::Unix - File::Spec for Unix, base for other File::Spec modules =over 4 =item SYNOPSIS =item DESCRIPTION =item METHODS canonpath() =back catdir() catfile curdir devnull rootdir tmpdir updir no_upwards case_tolerant file_name_is_absolute path join splitpath splitdir catpath() abs2rel rel2abs() =over 4 =item COPYRIGHT =item SEE ALSO =back =head2 File::Spec::VMS - methods for VMS file specs =over 4 =item SYNOPSIS =item DESCRIPTION =back canonpath (override) catdir (override) catfile (override) curdir (override) devnull (override) rootdir (override) tmpdir (override) updir (override) case_tolerant (override) path (override) file_name_is_absolute (override) splitpath (override) splitdir (override) catpath (override) abs2rel (override) rel2abs (override) =over 4 =item COPYRIGHT =item SEE ALSO =back =head2 File::Spec::Win32 - methods for Win32 file specs =over 4 =item SYNOPSIS =item DESCRIPTION devnull =back tmpdir case_tolerant file_name_is_absolute catfile canonpath splitpath splitdir catpath =over 4 =item Note For File::Spec::Win32 Maintainers =back =over 4 =item COPYRIGHT =item SEE ALSO =back =head2 File::Temp - return name and handle of a temporary file safely =over 4 =item VERSION =item SYNOPSIS 
=item DESCRIPTION =item PORTABILITY =item OBJECT-ORIENTED INTERFACE B<new>, B<newdir>, B<filename>, B<dirname>, B<unlink_on_destroy>, B<DESTROY> =item FUNCTIONS B<tempfile>, B<tempdir> =item MKTEMP FUNCTIONS B<mkstemp>, B<mkstemps>, B<mkdtemp>, B<mktemp> =item POSIX FUNCTIONS B<tmpnam>, B<tmpfile> =item ADDITIONAL FUNCTIONS B<tempnam> =item UTILITY FUNCTIONS B<unlink0>, B<cmpstat>, B<unlink1>, B<cleanup> =item PACKAGE VARIABLES B<safe_level>, STANDARD, MEDIUM, HIGH, TopSystemUID, B<$KEEP_ALL>, B<$DEBUG> =item WARNING =over 4 =item Temporary files and NFS =item Forking =item Directory removal =item Taint mode =item BINMODE =back =item HISTORY =item SEE ALSO =item SUPPORT =item AUTHOR =item CONTRIBUTORS =item COPYRIGHT AND LICENSE =back =head2 File::stat - by-name interface to Perl's built-in stat() functions =over 4 =item SYNOPSIS =item DESCRIPTION =item BUGS =item ERRORS -%s is not implemented on a File::stat object =item WARNINGS File::stat ignores use filetest 'access', File::stat ignores VMS ACLs =item NOTE =item AUTHOR =back =head2 FileCache - keep more files open than the system permits =over 4 =item SYNOPSIS =item DESCRIPTION cacheout EXPR, cacheout MODE, EXPR =item CAVEATS =item BUGS =back =head2 FileHandle - supply object methods for filehandles =over 4 =item SYNOPSIS =item DESCRIPTION $fh->print, $fh->printf, $fh->getline, $fh->getlines =item SEE ALSO =back =head2 Filter::Simple - Simplified source filtering =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item The Problem =item A Solution =item Disabling or changing <no> behaviour =item All-in-one interface =item Filtering only specific components of source code C<"code">, C<"code_no_comments">, C<"executable">, C<"executable_no_comments">, C<"quotelike">, C<"string">, C<"regex">, C<"all"> =item Filtering only the code parts of source code =item Using Filter::Simple with an explicit C<import> subroutine =item Using Filter::Simple and Exporter together =item How it works =back =item AUTHOR =item CONTACT 
=item COPYRIGHT AND LICENSE =back =head2 Filter::Util::Call - Perl Source Filter Utility Module =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item B<use Filter::Util::Call> =item B<import()> =item B<filter_add()> =item B<filter() and anonymous sub> B<$_>, B<$status>, B<filter_read> and B<filter_read_exact>, B<filter_del>, I<real_import>, I<unimport()> =back =item LIMITATIONS __DATA__ is ignored, Max. codesize limited to 32-bit =item EXAMPLES =over 4 =item Example 1: A simple filter. =item Example 2: Using the context =item Example 3: Using the context within the filter =item Example 4: Using filter_del =back =item Filter::Simple =item AUTHOR =item DATE =item LICENSE =back =head2 FindBin - Locate directory of original perl script =over 4 =item SYNOPSIS =item DESCRIPTION =item EXPORTABLE VARIABLES =item KNOWN ISSUES =item AUTHORS =item COPYRIGHT =back =head2 GDBM_File - Perl5 access to the gdbm library. =over 4 =item SYNOPSIS =item DESCRIPTION =item AVAILABILITY =item SECURITY AND PORTABILITY =item BUGS =item SEE ALSO =back =head2 Getopt::Long - Extended processing of command line options =over 4 =item SYNOPSIS =item DESCRIPTION =item Command Line Options, an Introduction =item Getting Started with Getopt::Long =over 4 =item Simple options =item A little bit less simple options =item Mixing command line option with other arguments =item Options with values =item Options with multiple values =item Options with hash values =item User-defined subroutines to handle options =item Options with multiple names =item Case and abbreviations =item Summary of Option Specifications !, +, s, i, o, f, : I<type> [ I<desttype> ], : I<number> [ I<desttype> ], : + [ I<desttype> ] =back =item Advanced Possibilities =over 4 =item Object oriented interface =item Thread Safety =item Documentation and help texts =item Parsing options from an arbitrary array =item Parsing options from an arbitrary string =item Storing options values in a hash =item Bundling =item The lonesome dash =item 
Argument callback =back =item Configuring Getopt::Long default, posix_default, auto_abbrev, getopt_compat, gnu_compat, gnu_getopt, require_order, permute, bundling (default: disabled), bundling_override (default: disabled), ignore_case (default: enabled), ignore_case_always (default: disabled), auto_version (default: disabled), auto_help (default: disabled), pass_through (default: disabled), prefix, prefix_pattern, long_prefix_pattern, debug (default: disabled) =item Exportable Methods VersionMessage, C<-message>, C<-msg>, C<-exitval>, C<-output>, HelpMessage =item Return values and Errors =item Legacy =over 4 =item Default destinations =item Alternative option starters =item Configuration variables =back =item Tips and Techniques =over 4 =item Pushing multiple values in a hash option =back =item Troubleshooting =over 4 =item GetOptions does not return a false result when an option is not supplied =item GetOptions does not split the command line correctly =item Undefined subroutine &main::GetOptions called =item How do I put a "-?" option into a Getopt::Long? 
=back =item AUTHOR =item COPYRIGHT AND DISCLAIMER =back =head2 Getopt::Std - Process single-character switches with switch clustering =over 4 =item SYNOPSIS =item DESCRIPTION =item C<--help> and C<--version> =back =head2 HTTP::Tiny - A small, simple, correct HTTP/1.1 client =over 4 =item VERSION =item SYNOPSIS =item DESCRIPTION =item METHODS =over 4 =item new =item get|head|put|post|delete =item post_form =item mirror =item request =item www_form_urlencode =item can_ssl =item connected =back =item SSL SUPPORT =item PROXY SUPPORT =item LIMITATIONS =item SEE ALSO =item SUPPORT =over 4 =item Bugs / Feature Requests =item Source Code =back =item AUTHORS =item CONTRIBUTORS =item COPYRIGHT AND LICENSE =back =head2 Hash::Util - A selection of general-utility hash subroutines =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item Restricted hashes B<lock_keys>, B<unlock_keys> =back =back B<lock_keys_plus> B<lock_value>, B<unlock_value> B<lock_hash>, B<unlock_hash> B<lock_hash_recurse>, B<unlock_hash_recurse> B<hashref_locked>, B<hash_locked> B<hashref_unlocked>, B<hash_unlocked> B<legal_keys>, B<hidden_keys>, B<all_keys>, B<hash_seed>, B<hash_value>, B<bucket_info>, B<bucket_stats>, B<bucket_array> B<bucket_stats_formatted> B<hv_store>, B<hash_traversal_mask>, B<bucket_ratio>, B<used_buckets>, B<num_buckets> =over 4 =item Operating on references to hashes. 
lock_ref_keys, unlock_ref_keys, lock_ref_keys_plus, lock_ref_value, unlock_ref_value, lock_hashref, unlock_hashref, lock_hashref_recurse, unlock_hashref_recurse, hash_ref_unlocked, legal_ref_keys, hidden_ref_keys =back =over 4 =item CAVEATS =item BUGS =item AUTHOR =item SEE ALSO =back =head2 Hash::Util::FieldHash - Support for Inside-Out Classes =over 4 =item SYNOPSIS =item FUNCTIONS id, id_2obj, register, idhash, idhashes, fieldhash, fieldhashes =item DESCRIPTION =over 4 =item The Inside-out Technique =item Problems of Inside-out =item Solutions =item More Problems =item The Generic Object =item How to use Field Hashes =item Garbage-Collected Hashes =back =item EXAMPLES C<init()>, C<first()>, C<last()>, C<name()>, C<Name_hash>, C<Name_id>, C<Name_idhash>, C<Name_id_reg>, C<Name_idhash_reg>, C<Name_fieldhash> =over 4 =item Example 1 =item Example 2 =back =item GUTS =over 4 =item The C<PERL_MAGIC_uvar> interface for hashes =item Weakrefs call uvar magic =item How field hashes work =item Internal function Hash::Util::FieldHash::_fieldhash =back =item AUTHOR =item COPYRIGHT AND LICENSE =back =head2 I18N::Collate - compare 8-bit scalar data according to the current locale =over 4 =item SYNOPSIS =item DESCRIPTION =back =head2 I18N::LangTags - functions for dealing with RFC3066-style language tags =over 4 =item SYNOPSIS =item DESCRIPTION =back the function is_language_tag($lang1) the function extract_language_tags($whatever) the function same_language_tag($lang1, $lang2) the function similarity_language_tag($lang1, $lang2) the function is_dialect_of($lang1, $lang2) the function super_languages($lang1) the function locale2language_tag($locale_identifier) the function encode_language_tag($lang1) the function alternate_language_tags($lang1) the function @langs = panic_languages(@accept_languages) the function implicate_supers( ...languages... ), the function implicate_supers_strictly( ...languages... 
) =over 4 =item ABOUT LOWERCASING =item ABOUT UNICODE PLAINTEXT LANGUAGE TAGS =item SEE ALSO =item COPYRIGHT =item AUTHOR =back =head2 I18N::LangTags::Detect - detect the user's language preferences =over 4 =item SYNOPSIS =item DESCRIPTION =item FUNCTIONS =item ENVIRONMENT =item SEE ALSO =item COPYRIGHT =item AUTHOR =back =head2 I18N::LangTags::List -- tags and names for human languages =over 4 =item SYNOPSIS =item DESCRIPTION =item ABOUT LANGUAGE TAGS =item LIST OF LANGUAGES {ab} : Abkhazian, {ace} : Achinese, {ach} : Acoli, {ada} : Adangme, {ady} : Adyghe, {aa} : Afar, {afh} : Afrihili, {af} : Afrikaans, [{afa} : Afro-Asiatic (Other)], {ak} : Akan, {akk} : Akkadian, {sq} : Albanian, {ale} : Aleut, [{alg} : Algonquian languages], [{tut} : Altaic (Other)], {am} : Amharic, {i-ami} : Ami, [{apa} : Apache languages], {ar} : Arabic, {arc} : Aramaic, {arp} : Arapaho, {arn} : Araucanian, {arw} : Arawak, {hy} : Armenian, {an} : Aragonese, [{art} : Artificial (Other)], {ast} : Asturian, {as} : Assamese, [{ath} : Athapascan languages], [{aus} : Australian languages], [{map} : Austronesian (Other)], {av} : Avaric, {ae} : Avestan, {awa} : Awadhi, {ay} : Aymara, {az} : Azerbaijani, {ban} : Balinese, [{bat} : Baltic (Other)], {bal} : Baluchi, {bm} : Bambara, [{bai} : Bamileke languages], {bad} : Banda, [{bnt} : Bantu (Other)], {bas} : Basa, {ba} : Bashkir, {eu} : Basque, {btk} : Batak (Indonesia), {bej} : Beja, {be} : Belarusian, {bem} : Bemba, {bn} : Bengali, [{ber} : Berber (Other)], {bho} : Bhojpuri, {bh} : Bihari, {bik} : Bikol, {bin} : Bini, {bi} : Bislama, {bs} : Bosnian, {bra} : Braj, {br} : Breton, {bug} : Buginese, {bg} : Bulgarian, {i-bnn} : Bunun, {bua} : Buriat, {my} : Burmese, {cad} : Caddo, {car} : Carib, {ca} : Catalan, [{cau} : Caucasian (Other)], {ceb} : Cebuano, [{cel} : Celtic (Other)], [{cai} : Central American Indian (Other)], {chg} : Chagatai, [{cmc} : Chamic languages], {ch} : Chamorro, {ce} : Chechen, {chr} : Cherokee, {chy} : Cheyenne, {chb} : Chibcha, 
{ny} : Chichewa, {zh} : Chinese, {chn} : Chinook Jargon, {chp} : Chipewyan, {cho} : Choctaw, {cu} : Church Slavic, {chk} : Chuukese, {cv} : Chuvash, {cop} : Coptic, {kw} : Cornish, {co} : Corsican, {cr} : Cree, {mus} : Creek, [{cpe} : English-based Creoles and pidgins (Other)], [{cpf} : French-based Creoles and pidgins (Other)], [{cpp} : Portuguese-based Creoles and pidgins (Other)], [{crp} : Creoles and pidgins (Other)], {hr} : Croatian, [{cus} : Cushitic (Other)], {cs} : Czech, {dak} : Dakota, {da} : Danish, {dar} : Dargwa, {day} : Dayak, {i-default} : Default (Fallthru) Language, {del} : Delaware, {din} : Dinka, {dv} : Divehi, {doi} : Dogri, {dgr} : Dogrib, [{dra} : Dravidian (Other)], {dua} : Duala, {nl} : Dutch, {dum} : Middle Dutch (ca.1050-1350), {dyu} : Dyula, {dz} : Dzongkha, {efi} : Efik, {egy} : Ancient Egyptian, {eka} : Ekajuk, {elx} : Elamite, {en} : English, {enm} : Old English (1100-1500), {ang} : Old English (ca.450-1100), {i-enochian} : Enochian (Artificial), {myv} : Erzya, {eo} : Esperanto, {et} : Estonian, {ee} : Ewe, {ewo} : Ewondo, {fan} : Fang, {fat} : Fanti, {fo} : Faroese, {fj} : Fijian, {fi} : Finnish, [{fiu} : Finno-Ugrian (Other)], {fon} : Fon, {fr} : French, {frm} : Middle French (ca.1400-1600), {fro} : Old French (842-ca.1400), {fy} : Frisian, {fur} : Friulian, {ff} : Fulah, {gaa} : Ga, {gd} : Scots Gaelic, {gl} : Gallegan, {lg} : Ganda, {gay} : Gayo, {gba} : Gbaya, {gez} : Geez, {ka} : Georgian, {de} : German, {gmh} : Middle High German (ca.1050-1500), {goh} : Old High German (ca.750-1050), [{gem} : Germanic (Other)], {gil} : Gilbertese, {gon} : Gondi, {gor} : Gorontalo, {got} : Gothic, {grb} : Grebo, {grc} : Ancient Greek, {el} : Modern Greek, {gn} : Guarani, {gu} : Gujarati, {gwi} : Gwich'in, {hai} : Haida, {ht} : Haitian, {ha} : Hausa, {haw} : Hawaiian, {he} : Hebrew, {hz} : Herero, {hil} : Hiligaynon, {him} : Himachali, {hi} : Hindi, {ho} : Hiri Motu, {hit} : Hittite, {hmn} : Hmong, {hu} : Hungarian, {hup} : Hupa, {iba} : Iban, 
{is} : Icelandic, {io} : Ido, {ig} : Igbo, {ijo} : Ijo, {ilo} : Iloko, [{inc} : Indic (Other)], [{ine} : Indo-European (Other)], {id} : Indonesian, {inh} : Ingush, {ia} : Interlingua (International Auxiliary Language Association), {ie} : Interlingue, {iu} : Inuktitut, {ik} : Inupiaq, [{ira} : Iranian (Other)], {ga} : Irish, {mga} : Middle Irish (900-1200), {sga} : Old Irish (to 900), [{iro} : Iroquoian languages], {it} : Italian, {ja} : Japanese, {jv} : Javanese, {jrb} : Judeo-Arabic, {jpr} : Judeo-Persian, {kbd} : Kabardian, {kab} : Kabyle, {kac} : Kachin, {kl} : Kalaallisut, {xal} : Kalmyk, {kam} : Kamba, {kn} : Kannada, {kr} : Kanuri, {krc} : Karachay-Balkar, {kaa} : Kara-Kalpak, {kar} : Karen, {ks} : Kashmiri, {csb} : Kashubian, {kaw} : Kawi, {kk} : Kazakh, {kha} : Khasi, {km} : Khmer, [{khi} : Khoisan (Other)], {kho} : Khotanese, {ki} : Kikuyu, {kmb} : Kimbundu, {rw} : Kinyarwanda, {ky} : Kirghiz, {i-klingon} : Klingon, {kv} : Komi, {kg} : Kongo, {kok} : Konkani, {ko} : Korean, {kos} : Kosraean, {kpe} : Kpelle, {kro} : Kru, {kj} : Kuanyama, {kum} : Kumyk, {ku} : Kurdish, {kru} : Kurukh, {kut} : Kutenai, {lad} : Ladino, {lah} : Lahnda, {lam} : Lamba, {lo} : Lao, {la} : Latin, {lv} : Latvian, {lb} : Letzeburgesch, {lez} : Lezghian, {li} : Limburgish, {ln} : Lingala, {lt} : Lithuanian, {nds} : Low German, {art-lojban} : Lojban (Artificial), {loz} : Lozi, {lu} : Luba-Katanga, {lua} : Luba-Lulua, {lui} : Luiseno, {lun} : Lunda, {luo} : Luo (Kenya and Tanzania), {lus} : Lushai, {mk} : Macedonian, {mad} : Madurese, {mag} : Magahi, {mai} : Maithili, {mak} : Makasar, {mg} : Malagasy, {ms} : Malay, {ml} : Malayalam, {mt} : Maltese, {mnc} : Manchu, {mdr} : Mandar, {man} : Mandingo, {mni} : Manipuri, [{mno} : Manobo languages], {gv} : Manx, {mi} : Maori, {mr} : Marathi, {chm} : Mari, {mh} : Marshall, {mwr} : Marwari, {mas} : Masai, [{myn} : Mayan languages], {men} : Mende, {mic} : Micmac, {min} : Minangkabau, {i-mingo} : Mingo, [{mis} : Miscellaneous languages], {moh} : 
Mohawk, {mdf} : Moksha, {mo} : Moldavian, [{mkh} : Mon-Khmer (Other)], {lol} : Mongo, {mn} : Mongolian, {mos} : Mossi, [{mul} : Multiple languages], [{mun} : Munda languages], {nah} : Nahuatl, {nap} : Neapolitan, {na} : Nauru, {nv} : Navajo, {nd} : North Ndebele, {nr} : South Ndebele, {ng} : Ndonga, {ne} : Nepali, {new} : Newari, {nia} : Nias, [{nic} : Niger-Kordofanian (Other)], [{ssa} : Nilo-Saharan (Other)], {niu} : Niuean, {nog} : Nogai, {non} : Old Norse, [{nai} : North American Indian], {no} : Norwegian, {nb} : Norwegian Bokmal, {nn} : Norwegian Nynorsk, [{nub} : Nubian languages], {nym} : Nyamwezi, {nyn} : Nyankole, {nyo} : Nyoro, {nzi} : Nzima, {oc} : Occitan (post 1500), {oj} : Ojibwa, {or} : Oriya, {om} : Oromo, {osa} : Osage, {os} : Ossetian; Ossetic, [{oto} : Otomian languages], {pal} : Pahlavi, {i-pwn} : Paiwan, {pau} : Palauan, {pi} : Pali, {pam} : Pampanga, {pag} : Pangasinan, {pa} : Panjabi, {pap} : Papiamento, [{paa} : Papuan (Other)], {fa} : Persian, {peo} : Old Persian (ca.600-400 B.C.), [{phi} : Philippine (Other)], {phn} : Phoenician, {pon} : Pohnpeian, {pl} : Polish, {pt} : Portuguese, [{pra} : Prakrit languages], {pro} : Old Provencal (to 1500), {ps} : Pushto, {qu} : Quechua, {rm} : Raeto-Romance, {raj} : Rajasthani, {rap} : Rapanui, {rar} : Rarotongan, [{qaa - qtz} : Reserved for local use.], [{roa} : Romance (Other)], {ro} : Romanian, {rom} : Romany, {rn} : Rundi, {ru} : Russian, [{sal} : Salishan languages], {sam} : Samaritan Aramaic, {se} : Northern Sami, {sma} : Southern Sami, {smn} : Inari Sami, {smj} : Lule Sami, {sms} : Skolt Sami, [{smi} : Sami languages (Other)], {sm} : Samoan, {sad} : Sandawe, {sg} : Sango, {sa} : Sanskrit, {sat} : Santali, {sc} : Sardinian, {sas} : Sasak, {sco} : Scots, {sel} : Selkup, [{sem} : Semitic (Other)], {sr} : Serbian, {srr} : Serer, {shn} : Shan, {sn} : Shona, {sid} : Sidamo, {sgn-...} : Sign Languages, {bla} : Siksika, {sd} : Sindhi, {si} : Sinhalese, [{sit} : Sino-Tibetan (Other)], [{sio} : Siouan 
languages], {den} : Slave (Athapascan), [{sla} : Slavic (Other)], {sk} : Slovak, {sl} : Slovenian, {sog} : Sogdian, {so} : Somali, {son} : Songhai, {snk} : Soninke, {wen} : Sorbian languages, {nso} : Northern Sotho, {st} : Southern Sotho, [{sai} : South American Indian (Other)], {es} : Spanish, {suk} : Sukuma, {sux} : Sumerian, {su} : Sundanese, {sus} : Susu, {sw} : Swahili, {ss} : Swati, {sv} : Swedish, {syr} : Syriac, {tl} : Tagalog, {ty} : Tahitian, [{tai} : Tai (Other)], {tg} : Tajik, {tmh} : Tamashek, {ta} : Tamil, {i-tao} : Tao, {tt} : Tatar, {i-tay} : Tayal, {te} : Telugu, {ter} : Tereno, {tet} : Tetum, {th} : Thai, {bo} : Tibetan, {tig} : Tigre, {ti} : Tigrinya, {tem} : Timne, {tiv} : Tiv, {tli} : Tlingit, {tpi} : Tok Pisin, {tkl} : Tokelau, {tog} : Tonga (Nyasa), {to} : Tonga (Tonga Islands), {tsi} : Tsimshian, {ts} : Tsonga, {i-tsu} : Tsou, {tn} : Tswana, {tum} : Tumbuka, [{tup} : Tupi languages], {tr} : Turkish, {ota} : Ottoman Turkish (1500-1928), {crh} : Crimean Turkish, {tk} : Turkmen, {tvl} : Tuvalu, {tyv} : Tuvinian, {tw} : Twi, {udm} : Udmurt, {uga} : Ugaritic, {ug} : Uighur, {uk} : Ukrainian, {umb} : Umbundu, {und} : Undetermined, {ur} : Urdu, {uz} : Uzbek, {vai} : Vai, {ve} : Venda, {vi} : Vietnamese, {vo} : Volapuk, {vot} : Votic, [{wak} : Wakashan languages], {wa} : Walloon, {wal} : Walamo, {war} : Waray, {was} : Washo, {cy} : Welsh, {wo} : Wolof, {x-...} : Unregistered (Semi-Private Use), {xh} : Xhosa, {sah} : Yakut, {yao} : Yao, {yap} : Yapese, {ii} : Sichuan Yi, {yi} : Yiddish, {yo} : Yoruba, [{ypk} : Yupik languages], {znd} : Zande, [{zap} : Zapotec], {zen} : Zenaga, {za} : Zhuang, {zu} : Zulu, {zun} : Zuni =item SEE ALSO =item COPYRIGHT AND DISCLAIMER =item AUTHOR =back =head2 I18N::Langinfo - query locale information =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item For systems without C<nl_langinfo> C<ERA>, C<CODESET>, C<YESEXPR>, C<YESSTR>, C<NOEXPR>, C<NOSTR>, C<D_FMT>, C<T_FMT>, C<D_T_FMT>, C<CRNCYSTR>, C<ALT_DIGITS>, 
C<ERA_D_FMT>, C<ERA_T_FMT>, C<ERA_D_T_FMT>, C<T_FMT_AMPM> =item EXPORT =back =item BUGS =item SEE ALSO =item AUTHOR =item COPYRIGHT AND LICENSE =back =head2 IO - load various IO modules =over 4 =item SYNOPSIS =item DESCRIPTION =item DEPRECATED =back =head2 IO::Compress::Base - Base Class for IO::Compress modules =over 4 =item SYNOPSIS =item DESCRIPTION =item SUPPORT =item SEE ALSO =item AUTHOR =item MODIFICATION HISTORY =item COPYRIGHT AND LICENSE =back =head2 IO::Compress::Bzip2 - Write bzip2 files/buffers =over 4 =item SYNOPSIS =item DESCRIPTION =item Functional Interface =over 4 =item bzip2 $input_filename_or_reference => $output_filename_or_reference [, OPTS] A filename, A filehandle, A scalar reference, An array reference, An Input FileGlob string, A filename, A filehandle, A scalar reference, An Array Reference, An Output FileGlob =item Notes =item Optional Parameters C<< AutoClose => 0|1 >>, C<< BinModeIn => 0|1 >>, C<< Append => 0|1 >>, A Buffer, A Filename, A Filehandle =item Examples =back =item OO Interface =over 4 =item Constructor A filename, A filehandle, A scalar reference =item Constructor Options C<< AutoClose => 0|1 >>, C<< Append => 0|1 >>, A Buffer, A Filename, A Filehandle, C<< BlockSize100K => number >>, C<< WorkFactor => number >>, C<< Strict => 0|1 >> =item Examples =back =item Methods =over 4 =item print =item printf =item syswrite =item write =item flush =item tell =item eof =item seek =item binmode =item opened =item autoflush =item input_line_number =item fileno =item close =item newStream([OPTS]) =back =item Importing :all =item EXAMPLES =over 4 =item Apache::GZip Revisited =item Working with Net::FTP =back =item SUPPORT =item SEE ALSO =item AUTHOR =item MODIFICATION HISTORY =item COPYRIGHT AND LICENSE =back =head2 IO::Compress::Deflate - Write RFC 1950 files/buffers =over 4 =item SYNOPSIS =item DESCRIPTION =item Functional Interface =over 4 =item deflate $input_filename_or_reference => $output_filename_or_reference [, OPTS] A filename, 
A filehandle, A scalar reference, An array reference, An Input FileGlob string, A filename, A filehandle, A scalar reference, An Array Reference, An Output FileGlob =item Notes =item Optional Parameters C<< AutoClose => 0|1 >>, C<< BinModeIn => 0|1 >>, C<< Append => 0|1 >>, A Buffer, A Filename, A Filehandle =item Examples =back =item OO Interface =over 4 =item Constructor A filename, A filehandle, A scalar reference =item Constructor Options C<< AutoClose => 0|1 >>, C<< Append => 0|1 >>, A Buffer, A Filename, A Filehandle, C<< Merge => 0|1 >>, -Level, -Strategy, C<< Strict => 0|1 >> =item Examples =back =item Methods =over 4 =item print =item printf =item syswrite =item write =item flush =item tell =item eof =item seek =item binmode =item opened =item autoflush =item input_line_number =item fileno =item close =item newStream([OPTS]) =item deflateParams =back =item Importing :all, :constants, :flush, :level, :strategy =item EXAMPLES =over 4 =item Apache::GZip Revisited =item Working with Net::FTP =back =item SUPPORT =item SEE ALSO =item AUTHOR =item MODIFICATION HISTORY =item COPYRIGHT AND LICENSE =back =head2 IO::Compress::FAQ -- Frequently Asked Questions about IO::Compress =over 4 =item DESCRIPTION =item GENERAL =over 4 =item Compatibility with Unix compress/uncompress. =item Accessing .tar.Z files =item How do I recompress using a different compression? =back =item ZIP =over 4 =item What Compression Types do IO::Compress::Zip & IO::Uncompress::Unzip support? Store (method 0), Deflate (method 8), Bzip2 (method 12), Lzma (method 14) =item Can I Read/Write Zip files larger than 4 Gig? =item Can I write more than 64K entries in a Zip file?
=item Zip Resources =back =item GZIP =over 4 =item Gzip Resources =item Dealing with concatenated gzip files =item Reading bgzip files with IO::Uncompress::Gunzip =back =item ZLIB =over 4 =item Zlib Resources =back =item Bzip2 =over 4 =item Bzip2 Resources =item Dealing with Concatenated bzip2 files =item Interoperating with Pbzip2 =back =item HTTP & NETWORK =over 4 =item Apache::GZip Revisited =item Compressed files and Net::FTP =back =item MISC =over 4 =item Using C<InputLength> to uncompress data embedded in a larger file/buffer. =back =item SUPPORT =item SEE ALSO =item AUTHOR =item MODIFICATION HISTORY =item COPYRIGHT AND LICENSE =back =head2 IO::Compress::Gzip - Write RFC 1952 files/buffers =over 4 =item SYNOPSIS =item DESCRIPTION =item Functional Interface =over 4 =item gzip $input_filename_or_reference => $output_filename_or_reference [, OPTS] A filename, A filehandle, A scalar reference, An array reference, An Input FileGlob string, A filename, A filehandle, A scalar reference, An Array Reference, An Output FileGlob =item Notes =item Optional Parameters C<< AutoClose => 0|1 >>, C<< BinModeIn => 0|1 >>, C<< Append => 0|1 >>, A Buffer, A Filename, A Filehandle =item Examples =back =item OO Interface =over 4 =item Constructor A filename, A filehandle, A scalar reference =item Constructor Options C<< AutoClose => 0|1 >>, C<< Append => 0|1 >>, A Buffer, A Filename, A Filehandle, C<< Merge => 0|1 >>, -Level, -Strategy, C<< Minimal => 0|1 >>, C<< Comment => $comment >>, C<< Name => $string >>, C<< Time => $number >>, C<< TextFlag => 0|1 >>, C<< HeaderCRC => 0|1 >>, C<< OS_Code => $value >>, C<< ExtraField => $data >>, C<< ExtraFlags => $value >>, C<< Strict => 0|1 >> =item Examples =back =item Methods =over 4 =item print =item printf =item syswrite =item write =item flush =item tell =item eof =item seek =item binmode =item opened =item autoflush =item input_line_number =item fileno =item close =item newStream([OPTS]) =item deflateParams =back =item Importing :all, 
:constants, :flush, :level, :strategy =item EXAMPLES =over 4 =item Apache::GZip Revisited =item Working with Net::FTP =back =item SUPPORT =item SEE ALSO =item AUTHOR =item MODIFICATION HISTORY =item COPYRIGHT AND LICENSE =back =head2 IO::Compress::RawDeflate - Write RFC 1951 files/buffers =over 4 =item SYNOPSIS =item DESCRIPTION =item Functional Interface =over 4 =item rawdeflate $input_filename_or_reference => $output_filename_or_reference [, OPTS] A filename, A filehandle, A scalar reference, An array reference, An Input FileGlob string, A filename, A filehandle, A scalar reference, An Array Reference, An Output FileGlob =item Notes =item Optional Parameters C<< AutoClose => 0|1 >>, C<< BinModeIn => 0|1 >>, C<< Append => 0|1 >>, A Buffer, A Filename, A Filehandle =item Examples =back =item OO Interface =over 4 =item Constructor A filename, A filehandle, A scalar reference =item Constructor Options C<< AutoClose => 0|1 >>, C<< Append => 0|1 >>, A Buffer, A Filename, A Filehandle, C<< Merge => 0|1 >>, -Level, -Strategy, C<< Strict => 0|1 >> =item Examples =back =item Methods =over 4 =item print =item printf =item syswrite =item write =item flush =item tell =item eof =item seek =item binmode =item opened =item autoflush =item input_line_number =item fileno =item close =item newStream([OPTS]) =item deflateParams =back =item Importing :all, :constants, :flush, :level, :strategy =item EXAMPLES =over 4 =item Apache::GZip Revisited =item Working with Net::FTP =back =item SUPPORT =item SEE ALSO =item AUTHOR =item MODIFICATION HISTORY =item COPYRIGHT AND LICENSE =back =head2 IO::Compress::Zip - Write zip files/buffers =over 4 =item SYNOPSIS =item DESCRIPTION To use Bzip2 compression, the module C<IO::Compress::Bzip2> must be installed, To use LZMA compression, the module C<IO::Compress::Lzma> must be installed =item Functional Interface =over 4 =item zip $input_filename_or_reference => $output_filename_or_reference [, OPTS] A filename, A filehandle, A scalar reference, An 
array reference, An Input FileGlob string, A filename, A filehandle, A scalar reference, An Array Reference, An Output FileGlob =item Notes =item Optional Parameters C<< AutoClose => 0|1 >>, C<< BinModeIn => 0|1 >>, C<< Append => 0|1 >>, A Buffer, A Filename, A Filehandle =item Examples =back =item OO Interface =over 4 =item Constructor A filename, A filehandle, A scalar reference =item Constructor Options C<< AutoClose => 0|1 >>, C<< Append => 0|1 >>, A Buffer, A Filename, A Filehandle, C<< Name => $string >>, If the C<$input> parameter is not a filename, the I<archive member name> will be an empty string, C<< CanonicalName => 0|1 >>, C<< FilterName => sub { ... } >>, C<< Efs => 0|1 >>, C<< Minimal => 1|0 >>, C<< Stream => 0|1 >>, C<< Zip64 => 0|1 >>, -Level, -Strategy, C<< BlockSize100K => number >>, C<< WorkFactor => number >>, C<< Preset => number >>, C<< Extreme => 0|1 >>, C<< Time => $number >>, C<< ExtAttr => $attr >>, C<< exTime => [$atime, $mtime, $ctime] >>, C<< exUnix2 => [$uid, $gid] >>, C<< exUnixN => [$uid, $gid] >>, C<< Comment => $comment >>, C<< ZipComment => $comment >>, C<< Method => $method >>, C<< TextFlag => 0|1 >>, C<< ExtraFieldLocal => $data >>, C<< ExtraFieldCentral => $data >>, C<< Strict => 0|1 >> =item Examples =back =item Methods =over 4 =item print =item printf =item syswrite =item write =item flush =item tell =item eof =item seek =item binmode =item opened =item autoflush =item input_line_number =item fileno =item close =item newStream([OPTS]) =item deflateParams =back =item Importing :all, :constants, :flush, :level, :strategy, :zip_method =item EXAMPLES =over 4 =item Apache::GZip Revisited =item Working with Net::FTP =back =item SUPPORT =item SEE ALSO =item AUTHOR =item MODIFICATION HISTORY =item COPYRIGHT AND LICENSE =back =head2 IO::Dir - supply object methods for directory handles =over 4 =item SYNOPSIS =item DESCRIPTION new ( [ DIRNAME ] ), open ( DIRNAME ), read (), seek ( POS ), tell (), rewind (), close (), tie %hash, 
'IO::Dir', DIRNAME [, OPTIONS ] =item SEE ALSO =item AUTHOR =item COPYRIGHT =back =head2 IO::File - supply object methods for filehandles =over 4 =item SYNOPSIS =item DESCRIPTION =item CONSTRUCTOR new ( FILENAME [,MODE [,PERMS]] ), new_tmpfile =item METHODS open( FILENAME [,MODE [,PERMS]] ), open( FILENAME, IOLAYERS ), binmode( [LAYER] ) =item NOTE =item SEE ALSO =item HISTORY =back =head2 IO::Handle - supply object methods for I/O handles =over 4 =item SYNOPSIS =item DESCRIPTION =item CONSTRUCTOR new (), new_from_fd ( FD, MODE ) =item METHODS $io->fdopen ( FD, MODE ), $io->opened, $io->getline, $io->getlines, $io->ungetc ( ORD ), $io->write ( BUF, LEN [, OFFSET ] ), $io->error, $io->clearerr, $io->sync, $io->flush, $io->printflush ( ARGS ), $io->blocking ( [ BOOL ] ), $io->untaint =item NOTE =item SEE ALSO =item BUGS =item HISTORY =back =head2 IO::Pipe - supply object methods for pipes =over 4 =item SYNOPSIS =item DESCRIPTION =item CONSTRUCTOR new ( [READER, WRITER] ) =item METHODS reader ([ARGS]), writer ([ARGS]), handles () =item SEE ALSO =item AUTHOR =item COPYRIGHT =back =head2 IO::Poll - Object interface to system poll call =over 4 =item SYNOPSIS =item DESCRIPTION =item METHODS mask ( IO [, EVENT_MASK ] ), poll ( [ TIMEOUT ] ), events ( IO ), remove ( IO ), handles( [ EVENT_MASK ] ) =item SEE ALSO =item AUTHOR =item COPYRIGHT =back =head2 IO::Seekable - supply seek based methods for I/O objects =over 4 =item SYNOPSIS =item DESCRIPTION $io->getpos, $io->setpos, $io->seek ( POS, WHENCE ), WHENCE=0 (SEEK_SET), WHENCE=1 (SEEK_CUR), WHENCE=2 (SEEK_END), $io->sysseek( POS, WHENCE ), $io->tell =item SEE ALSO =item HISTORY =back =head2 IO::Select - OO interface to the select system call =over 4 =item SYNOPSIS =item DESCRIPTION =item CONSTRUCTOR new ( [ HANDLES ] ) =item METHODS add ( HANDLES ), remove ( HANDLES ), exists ( HANDLE ), handles, can_read ( [ TIMEOUT ] ), can_write ( [ TIMEOUT ] ), has_exception ( [ TIMEOUT ] ), count (), bits(), select ( READ, WRITE, 
EXCEPTION [, TIMEOUT ] ) =item EXAMPLE =item AUTHOR =item COPYRIGHT =back =head2 IO::Socket - Object interface to socket communications =over 4 =item SYNOPSIS =item DESCRIPTION =item CONSTRUCTOR ARGUMENTS =over 4 =item Blocking =item Domain =item Listen =item Timeout =item Type =back =item CONSTRUCTORS =over 4 =item new =back =item METHODS =over 4 =item accept =item atmark =item autoflush =item bind =item connected =item getsockopt =item listen =item peername =item protocol =item recv =item send =item setsockopt =item shutdown =item sockdomain =item socket =item socketpair =item sockname =item sockopt =item socktype =item timeout =back =item EXAMPLES =item LIMITATIONS =item SEE ALSO =item AUTHOR =item COPYRIGHT =back =head2 IO::Socket::INET - Object interface for AF_INET domain sockets =over 4 =item SYNOPSIS =item DESCRIPTION =item CONSTRUCTOR new ( [ARGS] ) =over 4 =item METHODS sockaddr (), sockport (), sockhost (), peeraddr (), peerport (), peerhost () =back =item SEE ALSO =item AUTHOR =item COPYRIGHT =back =head2 IO::Socket::IP, C<IO::Socket::IP> - Family-neutral IP socket supporting both IPv4 and IPv6 =over 4 =item SYNOPSIS =item DESCRIPTION =item REPLACING C<IO::Socket> DEFAULT BEHAVIOUR =back =over 4 =item CONSTRUCTORS =back =over 4 =item $sock = IO::Socket::IP->new( %args ) PeerHost => STRING, PeerService => STRING, PeerAddr => STRING, PeerPort => STRING, PeerAddrInfo => ARRAY, LocalHost => STRING, LocalService => STRING, LocalAddr => STRING, LocalPort => STRING, LocalAddrInfo => ARRAY, Family => INT, Type => INT, Proto => STRING or INT, GetAddrInfoFlags => INT, Listen => INT, ReuseAddr => BOOL, ReusePort => BOOL, Broadcast => BOOL, Sockopts => ARRAY, V6Only => BOOL, MultiHomed, Blocking => BOOL, Timeout => NUM =item $sock = IO::Socket::IP->new( $peeraddr ) =back =over 4 =item METHODS =back =over 4 =item ( $host, $service ) = $sock->sockhost_service( $numeric ) =back =over 4 =item $addr = $sock->sockhost =item $port = $sock->sockport =item $host = 
$sock->sockhostname =item $service = $sock->sockservice =back =over 4 =item $addr = $sock->sockaddr =back =over 4 =item ( $host, $service ) = $sock->peerhost_service( $numeric ) =back =over 4 =item $addr = $sock->peerhost =item $port = $sock->peerport =item $host = $sock->peerhostname =item $service = $sock->peerservice =back =over 4 =item $addr = $peer->peeraddr =back =over 4 =item $inet = $sock->as_inet =back =over 4 =item NON-BLOCKING =back =over 4 =item C<PeerHost> AND C<LocalHost> PARSING =over 4 =item ( $host, $port ) = IO::Socket::IP->split_addr( $addr ) =back =back =over 4 =item $addr = IO::Socket::IP->join_addr( $host, $port ) =back =over 4 =item C<IO::Socket::INET> INCOMPATIBILITIES =back =over 4 =item TODO =item AUTHOR =back =head2 IO::Socket::UNIX - Object interface for AF_UNIX domain sockets =over 4 =item SYNOPSIS =item DESCRIPTION =item CONSTRUCTOR new ( [ARGS] ) =item METHODS hostpath(), peerpath() =item SEE ALSO =item AUTHOR =item COPYRIGHT =back =head2 IO::Uncompress::AnyInflate - Uncompress zlib-based (zip, gzip) file/buffer =over 4 =item SYNOPSIS =item DESCRIPTION RFC 1950, RFC 1951 (optionally), gzip (RFC 1952), zip =item Functional Interface =over 4 =item anyinflate $input_filename_or_reference => $output_filename_or_reference [, OPTS] A filename, A filehandle, A scalar reference, An array reference, An Input FileGlob string, A filename, A filehandle, A scalar reference, An Array Reference, An Output FileGlob =item Notes =item Optional Parameters C<< AutoClose => 0|1 >>, C<< BinModeOut => 0|1 >>, C<< Append => 0|1 >>, A Buffer, A Filename, A Filehandle, C<< MultiStream => 0|1 >>, C<< TrailingData => $scalar >> =item Examples =back =item OO Interface =over 4 =item Constructor A filename, A filehandle, A scalar reference =item Constructor Options C<< AutoClose => 0|1 >>, C<< MultiStream => 0|1 >>, C<< Prime => $string >>, C<< Transparent => 0|1 >>, C<< BlockSize => $num >>, C<< InputLength => $size >>, C<< Append => 0|1 >>, C<< Strict => 0|1 >>, 
C<< RawInflate => 0|1 >>, C<< ParseExtra => 0|1 >> If the gzip FEXTRA header field is present and this option is set, it will force the module to check that it conforms to the sub-field structure as defined in RFC 1952 =item Examples =back =item Methods =over 4 =item read =item read =item getline =item getc =item ungetc =item inflateSync =item getHeaderInfo =item tell =item eof =item seek =item binmode =item opened =item autoflush =item input_line_number =item fileno =item close =item nextStream =item trailingData =back =item Importing :all =item EXAMPLES =over 4 =item Working with Net::FTP =back =item SUPPORT =item SEE ALSO =item AUTHOR =item MODIFICATION HISTORY =item COPYRIGHT AND LICENSE =back =head2 IO::Uncompress::AnyUncompress - Uncompress gzip, zip, bzip2, xz, lzma, lzip, lzf or lzop file/buffer =over 4 =item SYNOPSIS =item DESCRIPTION RFC 1950, RFC 1951 (optionally), gzip (RFC 1952), zip, bzip2, lzop, lzf, lzma, lzip, xz =item Functional Interface =over 4 =item anyuncompress $input_filename_or_reference => $output_filename_or_reference [, OPTS] A filename, A filehandle, A scalar reference, An array reference, An Input FileGlob string, A filename, A filehandle, A scalar reference, An Array Reference, An Output FileGlob =item Notes =item Optional Parameters C<< AutoClose => 0|1 >>, C<< BinModeOut => 0|1 >>, C<< Append => 0|1 >>, A Buffer, A Filename, A Filehandle, C<< MultiStream => 0|1 >>, C<< TrailingData => $scalar >> =item Examples =back =item OO Interface =over 4 =item Constructor A filename, A filehandle, A scalar reference =item Constructor Options C<< AutoClose => 0|1 >>, C<< MultiStream => 0|1 >>, C<< Prime => $string >>, C<< Transparent => 0|1 >>, C<< BlockSize => $num >>, C<< InputLength => $size >>, C<< Append => 0|1 >>, C<< Strict => 0|1 >>, C<< RawInflate => 0|1 >>, C<< UnLzma => 0|1 >> =item Examples =back =item Methods =over 4 =item read =item read =item getline =item getc =item ungetc =item getHeaderInfo =item tell =item eof =item seek =item 
binmode =item opened =item autoflush =item input_line_number =item fileno =item close =item nextStream =item trailingData =back =item Importing :all =item EXAMPLES =item SUPPORT =item SEE ALSO =item AUTHOR =item MODIFICATION HISTORY =item COPYRIGHT AND LICENSE =back =head2 IO::Uncompress::Base - Base Class for IO::Uncompress modules =over 4 =item SYNOPSIS =item DESCRIPTION =item SUPPORT =item SEE ALSO =item AUTHOR =item MODIFICATION HISTORY =item COPYRIGHT AND LICENSE =back =head2 IO::Uncompress::Bunzip2 - Read bzip2 files/buffers =over 4 =item SYNOPSIS =item DESCRIPTION =item Functional Interface =over 4 =item bunzip2 $input_filename_or_reference => $output_filename_or_reference [, OPTS] A filename, A filehandle, A scalar reference, An array reference, An Input FileGlob string, A filename, A filehandle, A scalar reference, An Array Reference, An Output FileGlob =item Notes =item Optional Parameters C<< AutoClose => 0|1 >>, C<< BinModeOut => 0|1 >>, C<< Append => 0|1 >>, A Buffer, A Filename, A Filehandle, C<< MultiStream => 0|1 >>, C<< TrailingData => $scalar >> =item Examples =back =item OO Interface =over 4 =item Constructor A filename, A filehandle, A scalar reference =item Constructor Options C<< AutoClose => 0|1 >>, C<< MultiStream => 0|1 >>, C<< Prime => $string >>, C<< Transparent => 0|1 >>, C<< BlockSize => $num >>, C<< InputLength => $size >>, C<< Append => 0|1 >>, C<< Strict => 0|1 >>, C<< Small => 0|1 >> =item Examples =back =item Methods =over 4 =item read =item read =item getline =item getc =item ungetc =item getHeaderInfo =item tell =item eof =item seek =item binmode =item opened =item autoflush =item input_line_number =item fileno =item close =item nextStream =item trailingData =back =item Importing :all =item EXAMPLES =over 4 =item Working with Net::FTP =back =item SUPPORT =item SEE ALSO =item AUTHOR =item MODIFICATION HISTORY =item COPYRIGHT AND LICENSE =back =head2 IO::Uncompress::Gunzip - Read RFC 1952 files/buffers =over 4 =item SYNOPSIS =item 
DESCRIPTION =item Functional Interface =over 4 =item gunzip $input_filename_or_reference => $output_filename_or_reference [, OPTS] A filename, A filehandle, A scalar reference, An array reference, An Input FileGlob string, A filename, A filehandle, A scalar reference, An Array Reference, An Output FileGlob =item Notes =item Optional Parameters C<< AutoClose => 0|1 >>, C<< BinModeOut => 0|1 >>, C<< Append => 0|1 >>, A Buffer, A Filename, A Filehandle, C<< MultiStream => 0|1 >>, C<< TrailingData => $scalar >> =item Examples =back =item OO Interface =over 4 =item Constructor A filename, A filehandle, A scalar reference =item Constructor Options C<< AutoClose => 0|1 >>, C<< MultiStream => 0|1 >>, C<< Prime => $string >>, C<< Transparent => 0|1 >>, C<< BlockSize => $num >>, C<< InputLength => $size >>, C<< Append => 0|1 >>, C<< Strict => 0|1 >>, C<< ParseExtra => 0|1 >> If the gzip FEXTRA header field is present and this option is set, it will force the module to check that it conforms to the sub-field structure as defined in RFC 1952 =item Examples =back =item Methods =over 4 =item read =item read =item getline =item getc =item ungetc =item inflateSync =item getHeaderInfo Name, Comment =item tell =item eof =item seek =item binmode =item opened =item autoflush =item input_line_number =item fileno =item close =item nextStream =item trailingData =back =item Importing :all =item EXAMPLES =over 4 =item Working with Net::FTP =back =item SUPPORT =item SEE ALSO =item AUTHOR =item MODIFICATION HISTORY =item COPYRIGHT AND LICENSE =back =head2 IO::Uncompress::Inflate - Read RFC 1950 files/buffers =over 4 =item SYNOPSIS =item DESCRIPTION =item Functional Interface =over 4 =item inflate $input_filename_or_reference => $output_filename_or_reference [, OPTS] A filename, A filehandle, A scalar reference, An array reference, An Input FileGlob string, A filename, A filehandle, A scalar reference, An Array Reference, An Output FileGlob =item Notes =item Optional Parameters C<< AutoClose 
=> 0|1 >>, C<< BinModeOut => 0|1 >>, C<< Append => 0|1 >>, A Buffer, A Filename, A Filehandle, C<< MultiStream => 0|1 >>, C<< TrailingData => $scalar >> =item Examples =back =item OO Interface =over 4 =item Constructor A filename, A filehandle, A scalar reference =item Constructor Options C<< AutoClose => 0|1 >>, C<< MultiStream => 0|1 >>, C<< Prime => $string >>, C<< Transparent => 0|1 >>, C<< BlockSize => $num >>, C<< InputLength => $size >>, C<< Append => 0|1 >>, C<< Strict => 0|1 >> =item Examples =back =item Methods =over 4 =item read =item read =item getline =item getc =item ungetc =item inflateSync =item getHeaderInfo =item tell =item eof =item seek =item binmode =item opened =item autoflush =item input_line_number =item fileno =item close =item nextStream =item trailingData =back =item Importing :all =item EXAMPLES =over 4 =item Working with Net::FTP =back =item SUPPORT =item SEE ALSO =item AUTHOR =item MODIFICATION HISTORY =item COPYRIGHT AND LICENSE =back =head2 IO::Uncompress::RawInflate - Read RFC 1951 files/buffers =over 4 =item SYNOPSIS =item DESCRIPTION =item Functional Interface =over 4 =item rawinflate $input_filename_or_reference => $output_filename_or_reference [, OPTS] A filename, A filehandle, A scalar reference, An array reference, An Input FileGlob string, A filename, A filehandle, A scalar reference, An Array Reference, An Output FileGlob =item Notes =item Optional Parameters C<< AutoClose => 0|1 >>, C<< BinModeOut => 0|1 >>, C<< Append => 0|1 >>, A Buffer, A Filename, A Filehandle, C<< MultiStream => 0|1 >>, C<< TrailingData => $scalar >> =item Examples =back =item OO Interface =over 4 =item Constructor A filename, A filehandle, A scalar reference =item Constructor Options C<< AutoClose => 0|1 >>, C<< MultiStream => 0|1 >>, C<< Prime => $string >>, C<< Transparent => 0|1 >>, C<< BlockSize => $num >>, C<< InputLength => $size >>, C<< Append => 0|1 >>, C<< Strict => 0|1 >> =item Examples =back =item Methods =over 4 =item read =item read =item 
getline =item getc =item ungetc =item inflateSync =item getHeaderInfo =item tell =item eof =item seek =item binmode =item opened =item autoflush =item input_line_number =item fileno =item close =item nextStream =item trailingData =back =item Importing :all =item EXAMPLES =over 4 =item Working with Net::FTP =back =item SUPPORT =item SEE ALSO =item AUTHOR =item MODIFICATION HISTORY =item COPYRIGHT AND LICENSE =back =head2 IO::Uncompress::Unzip - Read zip files/buffers =over 4 =item SYNOPSIS =item DESCRIPTION =item Functional Interface =over 4 =item unzip $input_filename_or_reference => $output_filename_or_reference [, OPTS] A filename, A filehandle, A scalar reference, An array reference, An Input FileGlob string, A filename, A filehandle, A scalar reference, An Array Reference, An Output FileGlob =item Notes =item Optional Parameters C<< AutoClose => 0|1 >>, C<< BinModeOut => 0|1 >>, C<< Append => 0|1 >>, A Buffer, A Filename, A Filehandle, C<< MultiStream => 0|1 >>, C<< TrailingData => $scalar >> =item Examples =back =item OO Interface =over 4 =item Constructor A filename, A filehandle, A scalar reference =item Constructor Options C<< Name => "membername" >>, C<< Efs => 0| 1 >>, C<< AutoClose => 0|1 >>, C<< MultiStream => 0|1 >>, C<< Prime => $string >>, C<< Transparent => 0|1 >>, C<< BlockSize => $num >>, C<< InputLength => $size >>, C<< Append => 0|1 >>, C<< Strict => 0|1 >> =item Examples =back =item Methods =over 4 =item read =item read =item getline =item getc =item ungetc =item inflateSync =item getHeaderInfo =item tell =item eof =item seek =item binmode =item opened =item autoflush =item input_line_number =item fileno =item close =item nextStream =item trailingData =back =item Importing :all =item EXAMPLES =over 4 =item Working with Net::FTP =item Walking through a zip file =item Unzipping a complete zip file to disk =back =item SUPPORT =item SEE ALSO =item AUTHOR =item MODIFICATION HISTORY =item COPYRIGHT AND LICENSE =back =head2 IO::Zlib - IO:: style 
interface to L<Compress::Zlib> =over 4 =item SYNOPSIS =item DESCRIPTION =item CONSTRUCTOR new ( [ARGS] ) =item OBJECT METHODS open ( FILENAME, MODE ), opened, close, getc, getline, getlines, print ( ARGS... ), read ( BUF, NBYTES, [OFFSET] ), eof, seek ( OFFSET, WHENCE ), tell, setpos ( POS ), getpos ( POS ) =item USING THE EXTERNAL GZIP =item CLASS METHODS has_Compress_Zlib, gzip_external, gzip_used, gzip_read_open, gzip_write_open =item DIAGNOSTICS IO::Zlib::getlines: must be called in list context, IO::Zlib::gzopen_external: mode '...' is illegal, IO::Zlib::import: '...' is illegal, IO::Zlib::import: ':gzip_external' requires an argument, IO::Zlib::import: 'gzip_read_open' requires an argument, IO::Zlib::import: 'gzip_read' '...' is illegal, IO::Zlib::import: 'gzip_write_open' requires an argument, IO::Zlib::import: 'gzip_write_open' '...' is illegal, IO::Zlib::import: no Compress::Zlib and no external gzip, IO::Zlib::open: needs a filename, IO::Zlib::READ: NBYTES must be specified, IO::Zlib::WRITE: too long LENGTH =item SEE ALSO =item HISTORY =item COPYRIGHT =back =head2 IPC::Cmd - finding and running system commands made easy =over 4 =item SYNOPSIS =item DESCRIPTION =item CLASS METHODS =over 4 =item $ipc_run_version = IPC::Cmd->can_use_ipc_run( [VERBOSE] ) =back =back =over 4 =item $ipc_open3_version = IPC::Cmd->can_use_ipc_open3( [VERBOSE] ) =back =over 4 =item $bool = IPC::Cmd->can_capture_buffer =back =over 4 =item $bool = IPC::Cmd->can_use_run_forked =back =over 4 =item FUNCTIONS =over 4 =item $path = can_run( PROGRAM ); =back =back =over 4 =item $ok | ($ok, $err, $full_buf, $stdout_buff, $stderr_buff) = run( command => COMMAND, [verbose => BOOL, buffer => \$SCALAR, timeout => DIGIT] ); command, verbose, buffer, timeout, success, error message, full_buffer, out_buffer, error_buffer =back =over 4 =item $hashref = run_forked( COMMAND, { child_stdin => SCALAR, timeout => DIGIT, stdout_handler => CODEREF, stderr_handler => CODEREF} ); C<timeout>, 
C<child_stdin>, C<stdout_handler>, C<stderr_handler>, C<wait_loop_callback>, C<discard_output>, C<terminate_on_parent_sudden_death>, C<exit_code>, C<timeout>, C<stdout>, C<stderr>, C<merged>, C<err_msg> =back =over 4 =item $q = QUOTE =back =over 4 =item HOW IT WORKS =item Global Variables =over 4 =item $IPC::Cmd::VERBOSE =item $IPC::Cmd::USE_IPC_RUN =item $IPC::Cmd::USE_IPC_OPEN3 =item $IPC::Cmd::WARN =item $IPC::Cmd::INSTANCES =item $IPC::Cmd::ALLOW_NULL_ARGS =back =item Caveats Whitespace and IPC::Open3 / system(), Whitespace and IPC::Run, IO Redirect, Interleaving STDOUT/STDERR =item See Also =item ACKNOWLEDGEMENTS =item BUG REPORTS =item AUTHOR =item COPYRIGHT =back =head2 IPC::Msg - SysV Msg IPC object class =over 4 =item SYNOPSIS =item DESCRIPTION =item METHODS new ( KEY , FLAGS ), id, rcv ( BUF, LEN [, TYPE [, FLAGS ]] ), remove, set ( STAT ), set ( NAME => VALUE [, NAME => VALUE ...] ), snd ( TYPE, MSG [, FLAGS ] ), stat =item SEE ALSO =item AUTHORS =item COPYRIGHT =back =head2 IPC::Open2 - open a process for both reading and writing using open2() =over 4 =item SYNOPSIS =item DESCRIPTION =item WARNING =item SEE ALSO =back =head2 IPC::Open3 - open a process for reading, writing, and error handling using open3() =over 4 =item SYNOPSIS =item DESCRIPTION =item See Also L<IPC::Open2>, L<IPC::Run> =item WARNING =back =head2 IPC::Semaphore - SysV Semaphore IPC object class =over 4 =item SYNOPSIS =item DESCRIPTION =item METHODS new ( KEY , NSEMS , FLAGS ), getall, getncnt ( SEM ), getpid ( SEM ), getval ( SEM ), getzcnt ( SEM ), id, op ( OPLIST ), remove, set ( STAT ), set ( NAME => VALUE [, NAME => VALUE ...] 
), setall ( VALUES ), setval ( N , VALUE ), stat =item SEE ALSO =item AUTHORS =item COPYRIGHT =back =head2 IPC::SharedMem - SysV Shared Memory IPC object class =over 4 =item SYNOPSIS =item DESCRIPTION =item METHODS new ( KEY , SIZE , FLAGS ), id, read ( POS, SIZE ), write ( STRING, POS, SIZE ), remove, is_removed, stat, attach ( [FLAG] ), detach, addr =item SEE ALSO =item AUTHORS =item COPYRIGHT =back =head2 IPC::SysV - System V IPC constants and system calls =over 4 =item SYNOPSIS =item DESCRIPTION ftok( PATH ), ftok( PATH, ID ), shmat( ID, ADDR, FLAG ), shmdt( ADDR ), memread( ADDR, VAR, POS, SIZE ), memwrite( ADDR, STRING, POS, SIZE ) =item SEE ALSO =item AUTHORS =item COPYRIGHT =back =head2 Internals - Reserved special namespace for internals related functions =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item FUNCTIONS SvREFCNT(THING [, $value]), SvREADONLY(THING [, $value]), hv_clear_placeholders(%hash) =back =item AUTHOR =item SEE ALSO =back =head2 JSON::PP - JSON::XS compatible pure-Perl module. 
=over 4 =item SYNOPSIS =item VERSION =item DESCRIPTION =item FUNCTIONAL INTERFACE =over 4 =item encode_json =item decode_json =item JSON::PP::is_bool =back =item OBJECT-ORIENTED INTERFACE =over 4 =item new =item ascii =item latin1 =item utf8 =item pretty =item indent =item space_before =item space_after =item relaxed list items can have an end-comma, shell-style '#'-comments, C-style multiple-line '/* */'-comments (JSON::PP only), C++-style one-line '//'-comments (JSON::PP only), literal ASCII TAB characters in strings =item canonical =item allow_nonref =item allow_unknown =item allow_blessed =item convert_blessed =item allow_tags =item boolean_values =item filter_json_object =item filter_json_single_key_object =item shrink =item max_depth =item max_size =item encode =item decode =item decode_prefix =back =item FLAGS FOR JSON::PP ONLY =over 4 =item allow_singlequote =item allow_barekey =item allow_bignum =item loose =item escape_slash =item indent_length =item sort_by =back =item INCREMENTAL PARSING =over 4 =item incr_parse =item incr_text =item incr_skip =item incr_reset =back =item MAPPING =over 4 =item JSON -> PERL object, array, string, number, true, false, null, shell-style comments (C<< # I<text> >>), tagged values (C<< (I<tag>)I<value> >>) =item PERL -> JSON hash references, array references, other references, JSON::PP::true, JSON::PP::false, JSON::PP::null, blessed objects, simple scalars =item OBJECT SERIALISATION 1. C<allow_tags> is enabled and the object has a C<FREEZE> method, 2. C<convert_blessed> is enabled and the object has a C<TO_JSON> method, 3. C<allow_bignum> is enabled and the object is a C<Math::BigInt> or C<Math::BigFloat>, 4. C<allow_blessed> is enabled, 5. 
none of the above =back =item ENCODING/CODESET FLAG NOTES C<utf8> flag disabled, C<utf8> flag enabled, C<latin1> or C<ascii> flags enabled =item BUGS =item SEE ALSO =item AUTHOR =item CURRENT MAINTAINER =item COPYRIGHT AND LICENSE =back =head2 JSON::PP::Boolean - dummy module providing JSON::PP::Boolean =over 4 =item SYNOPSIS =item DESCRIPTION =item AUTHOR =item LICENSE =back =head2 List::Util - A selection of general-utility list subroutines =over 4 =item SYNOPSIS =item DESCRIPTION =back =over 4 =item LIST-REDUCTION FUNCTIONS =back =over 4 =item reduce =item reductions =item any =item all =item none =item notall =item first =item max =item maxstr =item min =item minstr =item product =item sum =item sum0 =back =over 4 =item KEY/VALUE PAIR LIST FUNCTIONS =back =over 4 =item pairs =item unpairs =item pairkeys =item pairvalues =item pairgrep =item pairfirst =item pairmap =back =over 4 =item OTHER FUNCTIONS =back =over 4 =item shuffle =back =over 4 =item sample =item uniq =item uniqint =item uniqnum =item uniqstr =back =over 4 =item head =item tail =back =over 4 =item CONFIGURATION VARIABLES =over 4 =item $RAND =back =item KNOWN BUGS =over 4 =item RT #95409 =item uniqnum() on oversized bignums =back =item SUGGESTED ADDITIONS =item SEE ALSO =item COPYRIGHT =back =head2 List::Util::XS - Indicate if List::Util was compiled with a C compiler =over 4 =item SYNOPSIS =item DESCRIPTION =item SEE ALSO =item COPYRIGHT =back =head2 Locale::Maketext - framework for localization =over 4 =item SYNOPSIS =item DESCRIPTION =item QUICK OVERVIEW =item METHODS =over 4 =item Construction Methods =item The "maketext" Method $lh->fail_with I<or> $lh->fail_with(I<PARAM>), $lh->failure_handler_auto, $lh->blacklist(@list), $lh->whitelist(@list) =item Utility Methods $language->quant($number, $singular), $language->quant($number, $singular, $plural), $language->quant($number, $singular, $plural, $negative), $language->numf($number), $language->numerate($number, $singular, $plural, $negative), 
$language->sprintf($format, @items), $language->language_tag(), $language->encoding() =item Language Handle Attributes and Internals =back =item LANGUAGE CLASS HIERARCHIES =item ENTRIES IN EACH LEXICON =item BRACKET NOTATION =item BRACKET NOTATION SECURITY =item AUTO LEXICONS =item READONLY LEXICONS =item CONTROLLING LOOKUP FAILURE =item HOW TO USE MAKETEXT =item SEE ALSO =item COPYRIGHT AND DISCLAIMER =item AUTHOR =back =head2 Locale::Maketext::Cookbook - recipes for using Locale::Maketext =over 4 =item INTRODUCTION =item ONESIDED LEXICONS =item DECIMAL PLACES IN NUMBER FORMATTING =back =head2 Locale::Maketext::Guts - Deprecated module to load Locale::Maketext utf8 code =over 4 =item SYNOPSIS =item DESCRIPTION =back =head2 Locale::Maketext::GutsLoader - Deprecated module to load Locale::Maketext utf8 code =over 4 =item SYNOPSIS =item DESCRIPTION =back =head2 Locale::Maketext::Simple - Simple interface to Locale::Maketext::Lexicon =over 4 =item VERSION =item SYNOPSIS =item DESCRIPTION =item OPTIONS =over 4 =item Class =item Path =item Style =item Export =item Subclass =item Decode =item Encoding =back =back =over 4 =item ACKNOWLEDGMENTS =item SEE ALSO =item AUTHORS =item COPYRIGHT =over 4 =item The "MIT" License =back =back =head2 Locale::Maketext::TPJ13 -- article about software localization =over 4 =item SYNOPSIS =item DESCRIPTION =item Localization and Perl: gettext breaks, Maketext fixes =over 4 =item A Localization Horror Story: It Could Happen To You =item The Linguistic View =item Breaking gettext =item Replacing gettext =item Buzzwords: Abstraction and Encapsulation =item Buzzword: Isomorphism =item Buzzword: Inheritance =item Buzzword: Concision =item The Devil in the Details =item The Proof in the Pudding: Localizing Web Sites =item References =back =back =head2 MIME::Base64 - Encoding and decoding of base64 strings =over 4 =item SYNOPSIS =item DESCRIPTION encode_base64( $bytes ), encode_base64( $bytes, $eol ), decode_base64( $str ), encode_base64url(
$bytes ), decode_base64url( $str ), encoded_base64_length( $bytes ), encoded_base64_length( $bytes, $eol ), decoded_base64_length( $str ) =item EXAMPLES =item COPYRIGHT =item SEE ALSO =back =head2 MIME::QuotedPrint - Encoding and decoding of quoted-printable strings =over 4 =item SYNOPSIS =item DESCRIPTION encode_qp( $str), encode_qp( $str, $eol), encode_qp( $str, $eol, $binmode ), decode_qp( $str ) =item COPYRIGHT =item SEE ALSO =back =head2 Math::BigFloat - Arbitrary size floating point math package =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item Input =item Output =back =item METHODS =over 4 =item Configuration methods accuracy(), precision() =item Constructor methods from_hex(), from_oct(), from_bin(), from_ieee754(), bpi() =item Arithmetic methods bmuladd(), bdiv(), bmod(), bexp(), bnok(), bsin(), bcos(), batan(), batan2(), as_float(), to_ieee754() =item ACCURACY AND PRECISION =item Rounding bfround ( +$scale ), bfround ( -$scale ), bfround ( 0 ), bround ( +$scale ), bround ( -$scale ) and bround ( 0 ) =back =item Autocreating constants =over 4 =item Math library =item Using Math::BigInt::Lite =back =item EXPORTS =item CAVEATS stringify, bstr(), brsft(), Modifying and =, precision() vs. 
accuracy() =item BUGS =item SUPPORT RT: CPAN's request tracker, AnnoCPAN: Annotated CPAN documentation, CPAN Ratings, MetaCPAN, CPAN Testers Matrix, The Bignum mailing list, Post to mailing list, View mailing list, Subscribe/Unsubscribe =item LICENSE =item SEE ALSO =item AUTHORS =back =head2 Math::BigInt - Arbitrary size integer/float math package =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item Input =item Output =back =item METHODS =over 4 =item Configuration methods accuracy(), precision(), div_scale(), round_mode(), upgrade(), downgrade(), modify(), config() =item Constructor methods new(), from_hex(), from_oct(), from_bin(), from_bytes(), from_base(), bzero(), bone(), binf(), bnan(), bpi(), copy(), as_int(), as_number() =item Boolean methods is_zero(), is_one( [ SIGN ]), is_finite(), is_inf( [ SIGN ] ), is_nan(), is_positive(), is_pos(), is_negative(), is_neg(), is_non_positive(), is_non_negative(), is_odd(), is_even(), is_int() =item Comparison methods bcmp(), bacmp(), beq(), bne(), blt(), ble(), bgt(), bge() =item Arithmetic methods bneg(), babs(), bsgn(), bnorm(), binc(), bdec(), badd(), bsub(), bmul(), bmuladd(), bdiv(), btdiv(), bmod(), btmod(), bmodinv(), bmodpow(), bpow(), blog(), bexp(), bnok(), buparrow(), uparrow(), backermann(), ackermann(), bsin(), bcos(), batan(), batan2(), bsqrt(), broot(), bfac(), bdfac(), bfib(), blucas(), brsft(), blsft() =item Bitwise methods band(), bior(), bxor(), bnot() =item Rounding methods round(), bround(), bfround(), bfloor(), bceil(), bint() =item Other mathematical methods bgcd(), blcm() =item Object property methods sign(), digit(), digitsum(), bdigitsum(), length(), mantissa(), exponent(), parts(), sparts(), nparts(), eparts(), dparts() =item String conversion methods bstr(), bsstr(), bnstr(), bestr(), bdstr(), to_hex(), to_bin(), to_oct(), to_bytes(), to_base(), as_hex(), as_bin(), as_oct(), as_bytes() =item Other conversion methods numify() =back =item ACCURACY and PRECISION =over 4 =item Precision P =item 
Accuracy A =item Fallback F =item Rounding mode R 'trunc', 'even', 'odd', '+inf', '-inf', 'zero', 'common', Precision, Accuracy (significant digits), Setting/Accessing, Creating numbers, Usage, Precedence, Overriding globals, Local settings, Rounding, Default values, Remarks =back =item Infinity and Not a Number oct()/hex() =item INTERNALS =over 4 =item MATH LIBRARY =item SIGN =back =item EXAMPLES =item Autocreating constants =item PERFORMANCE =over 4 =item Alternative math libraries =back =item SUBCLASSING =over 4 =item Subclassing Math::BigInt =back =item UPGRADING =over 4 =item Auto-upgrade =back =item EXPORTS =item CAVEATS Comparing numbers as strings, int(), Modifying and =, Overloading -$x, Mixing different object types =item BUGS =item SUPPORT RT: CPAN's request tracker, AnnoCPAN: Annotated CPAN documentation, CPAN Ratings, MetaCPAN, CPAN Testers Matrix, The Bignum mailing list, Post to mailing list, View mailing list, Subscribe/Unsubscribe =item LICENSE =item SEE ALSO =item AUTHORS =back =head2 Math::BigInt::Calc - Pure Perl module to support Math::BigInt =over 4 =item SYNOPSIS =item DESCRIPTION =item SEE ALSO =back =head2 Math::BigInt::FastCalc - Math::BigInt::Calc with some XS for more speed =over 4 =item SYNOPSIS =item DESCRIPTION =item STORAGE =item METHODS =item BUGS =item SUPPORT RT: CPAN's request tracker, AnnoCPAN: Annotated CPAN documentation, CPAN Ratings, Search CPAN, CPAN Testers Matrix, The Bignum mailing list, Post to mailing list, View mailing list, Subscribe/Unsubscribe =item LICENSE =item AUTHORS =item SEE ALSO =back =head2 Math::BigInt::Lib - virtual parent class for Math::BigInt libraries =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item General Notes CLASS-E<gt>api_version(), CLASS-E<gt>_new(STR), CLASS-E<gt>_zero(), CLASS-E<gt>_one(), CLASS-E<gt>_two(), CLASS-E<gt>_ten(), CLASS-E<gt>_from_bin(STR), CLASS-E<gt>_from_oct(STR), CLASS-E<gt>_from_hex(STR), CLASS-E<gt>_from_bytes(STR), CLASS-E<gt>_from_base(STR, BASE, COLLSEQ), 
CLASS-E<gt>_add(OBJ1, OBJ2), CLASS-E<gt>_mul(OBJ1, OBJ2), CLASS-E<gt>_div(OBJ1, OBJ2), CLASS-E<gt>_sub(OBJ1, OBJ2, FLAG), CLASS-E<gt>_sub(OBJ1, OBJ2), CLASS-E<gt>_dec(OBJ), CLASS-E<gt>_inc(OBJ), CLASS-E<gt>_mod(OBJ1, OBJ2), CLASS-E<gt>_sqrt(OBJ), CLASS-E<gt>_root(OBJ, N), CLASS-E<gt>_fac(OBJ), CLASS-E<gt>_dfac(OBJ), CLASS-E<gt>_pow(OBJ1, OBJ2), CLASS-E<gt>_modinv(OBJ1, OBJ2), CLASS-E<gt>_modpow(OBJ1, OBJ2, OBJ3), CLASS-E<gt>_rsft(OBJ, N, B), CLASS-E<gt>_lsft(OBJ, N, B), CLASS-E<gt>_log_int(OBJ, B), CLASS-E<gt>_gcd(OBJ1, OBJ2), CLASS-E<gt>_lcm(OBJ1, OBJ2), CLASS-E<gt>_fib(OBJ), CLASS-E<gt>_lucas(OBJ), CLASS-E<gt>_and(OBJ1, OBJ2), CLASS-E<gt>_or(OBJ1, OBJ2), CLASS-E<gt>_xor(OBJ1, OBJ2), CLASS-E<gt>_sand(OBJ1, OBJ2, SIGN1, SIGN2), CLASS-E<gt>_sor(OBJ1, OBJ2, SIGN1, SIGN2), CLASS-E<gt>_sxor(OBJ1, OBJ2, SIGN1, SIGN2), CLASS-E<gt>_is_zero(OBJ), CLASS-E<gt>_is_one(OBJ), CLASS-E<gt>_is_two(OBJ), CLASS-E<gt>_is_ten(OBJ), CLASS-E<gt>_is_even(OBJ), CLASS-E<gt>_is_odd(OBJ), CLASS-E<gt>_acmp(OBJ1, OBJ2), CLASS-E<gt>_str(OBJ), CLASS-E<gt>_to_bin(OBJ), CLASS-E<gt>_to_oct(OBJ), CLASS-E<gt>_to_hex(OBJ), CLASS-E<gt>_to_bytes(OBJ), CLASS-E<gt>_to_base(OBJ, BASE, COLLSEQ), CLASS-E<gt>_as_bin(OBJ), CLASS-E<gt>_as_oct(OBJ), CLASS-E<gt>_as_hex(OBJ), CLASS-E<gt>_as_bytes(OBJ), CLASS-E<gt>_num(OBJ), CLASS-E<gt>_copy(OBJ), CLASS-E<gt>_len(OBJ), CLASS-E<gt>_zeros(OBJ), CLASS-E<gt>_digit(OBJ, N), CLASS-E<gt>_digitsum(OBJ), CLASS-E<gt>_check(OBJ), CLASS-E<gt>_set(OBJ) =item API version 2 CLASS-E<gt>_1ex(N), CLASS-E<gt>_nok(OBJ1, OBJ2), CLASS-E<gt>_alen(OBJ) =back =item WRAP YOUR OWN =item BUGS =item SUPPORT RT: CPAN's request tracker, AnnoCPAN: Annotated CPAN documentation, CPAN Ratings, MetaCPAN, CPAN Testers Matrix, The Bignum mailing list, Post to mailing list, View mailing list, Subscribe/Unsubscribe =item LICENSE =item AUTHOR =item SEE ALSO =back =head2 Math::BigRat - Arbitrary big rational numbers =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item MATH LIBRARY =back =item METHODS 
new(), numerator(), denominator(), parts(), numify(), as_int(), as_number(), as_float(), as_hex(), as_bin(), as_oct(), from_hex(), from_oct(), from_bin(), bnan(), bzero(), binf(), bone(), length(), digit(), bnorm(), bfac(), bround()/round()/bfround(), bmod(), bmodinv(), bmodpow(), bneg(), is_one(), is_zero(), is_pos()/is_positive(), is_neg()/is_negative(), is_int(), is_odd(), is_even(), bceil(), bfloor(), bint(), bsqrt(), broot(), badd(), bmul(), bsub(), bdiv(), bdec(), binc(), copy(), bstr()/bsstr(), bcmp(), bacmp(), beq(), bne(), blt(), ble(), bgt(), bge(), blsft()/brsft(), band(), bior(), bxor(), bnot(), bpow(), blog(), bexp(), bnok(), config() =item BUGS =item SUPPORT RT: CPAN's request tracker, AnnoCPAN: Annotated CPAN documentation, CPAN Ratings, Search CPAN, CPAN Testers Matrix, The Bignum mailing list, Post to mailing list, View mailing list, Subscribe/Unsubscribe =item LICENSE =item SEE ALSO =item AUTHORS =back =head2 Math::Complex - complex numbers and associated mathematical functions =over 4 =item SYNOPSIS =item DESCRIPTION =item OPERATIONS =item CREATION =item DISPLAYING =over 4 =item CHANGED IN PERL 5.6 =back =item USAGE =item CONSTANTS =over 4 =item PI =item Inf =back =item ERRORS DUE TO DIVISION BY ZERO OR LOGARITHM OF ZERO =item ERRORS DUE TO INDIGESTIBLE ARGUMENTS =item BUGS =item SEE ALSO =item AUTHORS =item LICENSE =back =head2 Math::Trig - trigonometric functions =over 4 =item SYNOPSIS =item DESCRIPTION =item TRIGONOMETRIC FUNCTIONS B<tan> =over 4 =item ERRORS DUE TO DIVISION BY ZERO =item SIMPLE (REAL) ARGUMENTS, COMPLEX RESULTS =back =item PLANE ANGLE CONVERSIONS deg2rad, grad2rad, rad2deg, grad2deg, deg2grad, rad2grad, rad2rad, deg2deg, grad2grad =item RADIAL COORDINATE CONVERSIONS =over 4 =item COORDINATE SYSTEMS =item 3-D ANGLE CONVERSIONS cartesian_to_cylindrical, cartesian_to_spherical, cylindrical_to_cartesian, cylindrical_to_spherical, spherical_to_cartesian, spherical_to_cylindrical =back =item GREAT CIRCLE DISTANCES AND DIRECTIONS 
=over 4 =item great_circle_distance =item great_circle_direction =item great_circle_bearing =item great_circle_destination =item great_circle_midpoint =item great_circle_waypoint =back =item EXAMPLES =over 4 =item CAVEAT FOR GREAT CIRCLE FORMULAS =item Real-valued asin and acos asin_real, acos_real =back =item BUGS =item AUTHORS =item LICENSE =back =head2 Memoize - Make functions faster by trading space for time =over 4 =item SYNOPSIS =item DESCRIPTION =item DETAILS =item OPTIONS =over 4 =item INSTALL =item NORMALIZER =item C<SCALAR_CACHE>, C<LIST_CACHE> C<MEMORY>, C<HASH>, C<TIE>, C<FAULT>, C<MERGE> =back =item OTHER FACILITIES =over 4 =item C<unmemoize> =item C<flush_cache> =back =item CAVEATS =item PERSISTENT CACHE SUPPORT =item EXPIRATION SUPPORT =item BUGS =item MAILING LIST =item AUTHOR =item COPYRIGHT AND LICENSE =item THANK YOU =back =head2 Memoize::AnyDBM_File - glue to provide EXISTS for AnyDBM_File for Storable use =over 4 =item DESCRIPTION =back =head2 Memoize::Expire - Plug-in module for automatic expiration of memoized values =over 4 =item SYNOPSIS =item DESCRIPTION =item INTERFACE TIEHASH, EXISTS, STORE =item ALTERNATIVES =item CAVEATS =item AUTHOR =item SEE ALSO =back =head2 Memoize::ExpireFile - test for Memoize expiration semantics =over 4 =item DESCRIPTION =back =head2 Memoize::ExpireTest - test for Memoize expiration semantics =over 4 =item DESCRIPTION =back =head2 Memoize::NDBM_File - glue to provide EXISTS for NDBM_File for Storable use =over 4 =item DESCRIPTION =back =head2 Memoize::SDBM_File - glue to provide EXISTS for SDBM_File for Storable use =over 4 =item DESCRIPTION =back =head2 Memoize::Storable - store Memoized data in Storable database =over 4 =item DESCRIPTION =back =head2 Module::CoreList - what modules shipped with versions of perl =over 4 =item SYNOPSIS =item DESCRIPTION =item FUNCTIONS API C<first_release( MODULE )>, C<first_release_by_date( MODULE )>, C<find_modules( REGEX, [ LIST OF PERLS ] )>, C<find_version( PERL_VERSION 
)>, C<is_core( MODULE, [ MODULE_VERSION, [ PERL_VERSION ] ] )>, C<is_deprecated( MODULE, PERL_VERSION )>, C<deprecated_in( MODULE )>, C<removed_from( MODULE )>, C<removed_from_by_date( MODULE )>, C<changes_between( PERL_VERSION, PERL_VERSION )> =item DATA STRUCTURES C<%Module::CoreList::version>, C<%Module::CoreList::delta>, C<%Module::CoreList::released>, C<%Module::CoreList::families>, C<%Module::CoreList::deprecated>, C<%Module::CoreList::upstream>, C<%Module::CoreList::bug_tracker> =item CAVEATS =item HISTORY =item AUTHOR =item LICENSE =item SEE ALSO =back =head2 Module::CoreList::Utils - what utilities shipped with versions of perl =over 4 =item SYNOPSIS =item DESCRIPTION =item FUNCTIONS API C<utilities>, C<first_release( UTILITY )>, C<first_release_by_date( UTILITY )>, C<removed_from( UTILITY )>, C<removed_from_by_date( UTILITY )> =item DATA STRUCTURES C<%Module::CoreList::Utils::utilities> =item AUTHOR =item LICENSE =item SEE ALSO =back =head2 Module::Load - runtime require of both modules and files =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item Difference between C<load> and C<autoload> =back =item FUNCTIONS load, autoload, load_remote, autoload_remote =item Rules =item IMPORTS THE FUNCTIONS "load","autoload","load_remote","autoload_remote", 'all', '','none',undef =item Caveats =item SEE ALSO =item ACKNOWLEDGEMENTS =item BUG REPORTS =item AUTHOR =item COPYRIGHT =back =head2 Module::Load::Conditional - Looking up module information / loading at runtime =over 4 =item SYNOPSIS =item DESCRIPTION =item Methods =over 4 =item $href = check_install( module => NAME [, version => VERSION, verbose => BOOL ] ); module, version, verbose, file, dir, version, uptodate =back =back =over 4 =item $bool = can_load( modules => { NAME => VERSION [,NAME => VERSION] }, [verbose => BOOL, nocache => BOOL, autoload => BOOL] ) modules, verbose, nocache, autoload =back =over 4 =item @list = requires( MODULE ); =back =over 4 =item Global Variables =over 4 =item 
$Module::Load::Conditional::VERBOSE =item $Module::Load::Conditional::FIND_VERSION =item $Module::Load::Conditional::CHECK_INC_HASH =item $Module::Load::Conditional::FORCE_SAFE_INC =item $Module::Load::Conditional::CACHE =item $Module::Load::Conditional::ERROR =item $Module::Load::Conditional::DEPRECATED =back =item See Also =item BUG REPORTS =item AUTHOR =item COPYRIGHT =back =head2 Module::Loaded - mark modules as loaded or unloaded =over 4 =item SYNOPSIS =item DESCRIPTION =item FUNCTIONS =over 4 =item $bool = mark_as_loaded( PACKAGE ); =back =back =over 4 =item $bool = mark_as_unloaded( PACKAGE ); =back =over 4 =item $loc = is_loaded( PACKAGE ); =back =over 4 =item BUG REPORTS =item AUTHOR =item COPYRIGHT =back =head2 Module::Metadata - Gather package and POD information from perl module files =over 4 =item VERSION =item SYNOPSIS =item DESCRIPTION =item CLASS METHODS =over 4 =item C<< new_from_file($filename, collect_pod => 1, decode_pod => 1) >> =item C<< new_from_handle($handle, $filename, collect_pod => 1, decode_pod => 1) >> =item C<< new_from_module($module, collect_pod => 1, inc => \@dirs, decode_pod => 1) >> =item C<< find_module_by_name($module, \@dirs) >> =item C<< find_module_dir_by_name($module, \@dirs) >> =item C<< provides( %options ) >> version B<(required)>, dir, files, prefix =item C<< package_versions_from_directory($dir, \@files?) >> =item C<< log_info (internal) >> =back =item OBJECT METHODS =over 4 =item C<< name() >> =item C<< version($package) >> =item C<< filename() >> =item C<< packages_inside() >> =item C<< pod_inside() >> =item C<< contains_pod() >> =item C<< pod($section) >> =item C<< is_indexable($package) >> or C<< is_indexable() >> =back =item SUPPORT =item AUTHOR =item CONTRIBUTORS =item COPYRIGHT & LICENSE =back =head2 NDBM_File - Tied access to ndbm files =over 4 =item SYNOPSIS =item DESCRIPTION C<O_RDONLY>, C<O_WRONLY>, C<O_RDWR> =item DIAGNOSTICS =over 4 =item C<ndbm store returned -1, errno 22, key "..." 
at ...> =back =item SECURITY AND PORTABILITY =item BUGS AND WARNINGS =back =head2 NEXT - Provide a pseudo-class NEXT (et al) that allows method redispatch =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item Enforcing redispatch =item Avoiding repetitions =item Invoking all versions of a method with a single call =item Using C<EVERY> methods =back =item SEE ALSO =item AUTHOR =item BUGS AND IRRITATIONS =item COPYRIGHT =back =head2 Net::Cmd - Network Command class (as used by FTP, SMTP etc) =over 4 =item SYNOPSIS =item DESCRIPTION =item USER METHODS debug ( VALUE ), message (), code (), ok (), status (), datasend ( DATA ), dataend () =item CLASS METHODS debug_print ( DIR, TEXT ), debug_text ( DIR, TEXT ), command ( CMD [, ARGS, ... ]), unsupported (), response (), parse_response ( TEXT ), getline (), ungetline ( TEXT ), rawdatasend ( DATA ), read_until_dot (), tied_fh () =item PSEUDO RESPONSES Initial value, Connection closed, Timeout =item EXPORTS =item AUTHOR =item COPYRIGHT =item LICENCE =back =head2 Net::Config - Local configuration data for libnet =over 4 =item SYNOPSIS =item DESCRIPTION =item METHODS requires_firewall ( HOST ) =item NetConfig VALUES nntp_hosts, snpp_hosts, pop3_hosts, smtp_hosts, ph_hosts, daytime_hosts, time_hosts, inet_domain, ftp_firewall, ftp_firewall_type, 0Z<>, 1Z<>, 2Z<>, 3Z<>, 4Z<>, 5Z<>, 6Z<>, 7Z<>, ftp_ext_passive, ftp_int_passive, local_netmask, test_hosts, test_exists =item AUTHOR =item COPYRIGHT =item LICENCE =back =head2 Net::Domain - Attempt to evaluate the current host's internet name and domain =over 4 =item SYNOPSIS =item DESCRIPTION hostfqdn (), domainname (), hostname (), hostdomain () =item AUTHOR =item COPYRIGHT =item LICENCE =back =head2 Net::FTP - FTP Client class =over 4 =item SYNOPSIS =item DESCRIPTION =item OVERVIEW =item CONSTRUCTOR new ([ HOST ] [, OPTIONS ]) =item METHODS login ([LOGIN [,PASSWORD [, ACCOUNT] ] ]), starttls (), stoptls (), prot ( LEVEL ), host (), account( ACCT ), authorize ( [AUTH [, RESP]]), 
site (ARGS), ascii (), binary (), type ( [ TYPE ] ), rename ( OLDNAME, NEWNAME ), delete ( FILENAME ), cwd ( [ DIR ] ), cdup (), passive ( [ PASSIVE ] ), pwd (), restart ( WHERE ), rmdir ( DIR [, RECURSE ]), mkdir ( DIR [, RECURSE ]), alloc ( SIZE [, RECORD_SIZE] ), ls ( [ DIR ] ), dir ( [ DIR ] ), get ( REMOTE_FILE [, LOCAL_FILE [, WHERE]] ), put ( LOCAL_FILE [, REMOTE_FILE ] ), put_unique ( LOCAL_FILE [, REMOTE_FILE ] ), append ( LOCAL_FILE [, REMOTE_FILE ] ), unique_name (), mdtm ( FILE ), size ( FILE ), supported ( CMD ), hash ( [FILEHANDLE_GLOB_REF],[ BYTES_PER_HASH_MARK] ), feature ( NAME ), nlst ( [ DIR ] ), list ( [ DIR ] ), retr ( FILE ), stor ( FILE ), stou ( FILE ), appe ( FILE ), port ( [ PORT ] ), eprt ( [ PORT ] ), pasv (), epsv (), pasv_xfer ( SRC_FILE, DEST_SERVER [, DEST_FILE ] ), pasv_xfer_unique ( SRC_FILE, DEST_SERVER [, DEST_FILE ] ), pasv_wait ( NON_PASV_SERVER ), abort (), quit () =over 4 =item Methods for the adventurous quot (CMD [,ARGS]), can_inet6 (), can_ssl () =back =item THE dataconn CLASS =item UNIMPLEMENTED B<SMNT>, B<HELP>, B<MODE>, B<SYST>, B<STAT>, B<STRU>, B<REIN> =item REPORTING BUGS =item AUTHOR =item SEE ALSO =item USE EXAMPLES http://www.csh.rit.edu/~adam/Progs/ =item CREDITS =item COPYRIGHT =item LICENCE =back =head2 Net::NNTP - NNTP Client class =over 4 =item SYNOPSIS =item DESCRIPTION =item CONSTRUCTOR new ( [ HOST ] [, OPTIONS ]) =item METHODS host (), starttls (), article ( [ MSGID|MSGNUM ], [FH] ), body ( [ MSGID|MSGNUM ], [FH] ), head ( [ MSGID|MSGNUM ], [FH] ), articlefh ( [ MSGID|MSGNUM ] ), bodyfh ( [ MSGID|MSGNUM ] ), headfh ( [ MSGID|MSGNUM ] ), nntpstat ( [ MSGID|MSGNUM ] ), group ( [ GROUP ] ), help ( ), ihave ( MSGID [, MESSAGE ]), last (), date (), postok (), authinfo ( USER, PASS ), authinfo_simple ( USER, PASS ), list (), newgroups ( SINCE [, DISTRIBUTIONS ]), newnews ( SINCE [, GROUPS [, DISTRIBUTIONS ]]), next (), post ( [ MESSAGE ] ), postfh (), slave (), quit (), can_inet6 (), can_ssl () =over 4 =item 
Extension methods newsgroups ( [ PATTERN ] ), distributions (), distribution_patterns (), subscriptions (), overview_fmt (), active_times (), active ( [ PATTERN ] ), xgtitle ( PATTERN ), xhdr ( HEADER, MESSAGE-SPEC ), xover ( MESSAGE-SPEC ), xpath ( MESSAGE-ID ), xpat ( HEADER, PATTERN, MESSAGE-SPEC), xrover (), listgroup ( [ GROUP ] ), reader () =back =item UNSUPPORTED =item DEFINITIONS MESSAGE-SPEC, PATTERN, Examples, C<[^]-]>, C<*bdc>, C<[0-9a-zA-Z]>, C<a??d> =item SEE ALSO =item AUTHOR =item COPYRIGHT =item LICENCE =back =head2 Net::Netrc - OO interface to users netrc file =over 4 =item SYNOPSIS =item DESCRIPTION =item THE .netrc FILE machine name, default, login name, password string, account string, macdef name =item CONSTRUCTOR lookup ( MACHINE [, LOGIN ]) =item METHODS login (), password (), account (), lpa () =item AUTHOR =item SEE ALSO =item COPYRIGHT =item LICENCE =back =head2 Net::POP3 - Post Office Protocol 3 Client class (RFC1939) =over 4 =item SYNOPSIS =item DESCRIPTION =item CONSTRUCTOR new ( [ HOST ] [, OPTIONS ] ) =item METHODS host (), auth ( USERNAME, PASSWORD ), user ( USER ), pass ( PASS ), login ( [ USER [, PASS ]] ), starttls ( SSLARGS ), apop ( [ USER [, PASS ]] ), banner (), capa (), capabilities (), top ( MSGNUM [, NUMLINES ] ), list ( [ MSGNUM ] ), get ( MSGNUM [, FH ] ), getfh ( MSGNUM ), last (), popstat (), ping ( USER ), uidl ( [ MSGNUM ] ), delete ( MSGNUM ), reset (), quit (), can_inet6 (), can_ssl () =item NOTES =item SEE ALSO =item AUTHOR =item COPYRIGHT =item LICENCE =back =head2 Net::Ping - check a remote host for reachability =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item Functions Net::Ping->new([proto, timeout, bytes, device, tos, ttl, family, host, port, bind, gateway, retrans, pingstring, source_verify, econnrefused, dontfrag, IPV6_USE_MIN_MTU, IPV6_RECVPATHMTU]) X<new>, $p->ping($host [, $timeout [, $family]]); X<ping>, $p->source_verify( { 0 | 1 } ); X<source_verify>, $p->service_check( { 0 | 1 } ); X<service_check>,
$p->tcp_service_check( { 0 | 1 } ); X<tcp_service_check>, $p->hires( { 0 | 1 } ); X<hires>, $p->time X<time>, $p->socket_blocking_mode( $fh, $mode ); X<socket_blocking_mode>, $p->IPV6_USE_MIN_MTU X<IPV6_USE_MIN_MTU>, $p->IPV6_RECVPATHMTU X<IPV6_RECVPATHMTU>, $p->IPV6_HOPLIMIT X<IPV6_HOPLIMIT>, $p->IPV6_REACHCONF I<NYI> X<IPV6_REACHCONF>, $p->bind($local_addr); X<bind>, $p->message_type([$ping_type]); X<message_type>, $p->open($host); X<open>, $p->ack( [ $host ] ); X<ack>, $p->nack( $failed_ack_host ); X<nack>, $p->ack_unfork($host) X<ack_unfork>, $p->ping_icmp([$host, $timeout, $family]) X<ping_icmp>, $p->ping_icmpv6([$host, $timeout, $family]) I<NYI> X<ping_icmpv6>, $p->ping_stream([$host, $timeout, $family]) X<ping_stream>, $p->ping_syn([$host, $ip, $start_time, $stop_time]) X<ping_syn>, $p->ping_syn_fork([$host, $timeout, $family]) X<ping_syn_fork>, $p->ping_tcp([$host, $timeout, $family]) X<ping_tcp>, $p->ping_udp([$host, $timeout, $family]) X<ping_udp>, $p->ping_external([$host, $timeout, $family]) X<ping_external>, $p->tcp_connect([$ip, $timeout]) X<tcp_connect>, $p->tcp_echo([$ip, $timeout, $pingstring]) X<tcp_echo>, $p->close(); X<close>, $p->port_number([$port_number]) X<port_number>, $p->mselect X<mselect>, $p->ntop X<ntop>, $p->checksum($msg) X<checksum>, $p->icmp_result X<icmp_result>, pingecho($host [, $timeout]); X<pingecho>, wakeonlan($mac, [$host, [$port]]) X<wakeonlan> =back =item NOTES =item INSTALL =item BUGS =item AUTHORS =item COPYRIGHT =back =head2 Net::SMTP - Simple Mail Transfer Protocol Client =over 4 =item SYNOPSIS =item DESCRIPTION =item EXAMPLES =item CONSTRUCTOR new ( [ HOST ] [, OPTIONS ] ) =item METHODS banner (), domain (), hello ( DOMAIN ), host (), etrn ( DOMAIN ), starttls ( SSLARGS ), auth ( USERNAME, PASSWORD ), auth ( SASL ), mail ( ADDRESS [, OPTIONS] ), send ( ADDRESS ), send_or_mail ( ADDRESS ), send_and_mail ( ADDRESS ), reset (), recipient ( ADDRESS [, ADDRESS, [...]] [, OPTIONS ] ), to ( ADDRESS [, ADDRESS [...]] ), cc ( 
ADDRESS [, ADDRESS [...]] ), bcc ( ADDRESS [, ADDRESS [...]] ), data ( [ DATA ] ), bdat ( DATA ), bdatlast ( DATA ), expand ( ADDRESS ), verify ( ADDRESS ), help ( [ $subject ] ), quit (), can_inet6 (), can_ssl () =item ADDRESSES =item SEE ALSO =item AUTHOR =item COPYRIGHT =item LICENCE =back =head2 Net::Time - time and daytime network client interface =over 4 =item SYNOPSIS =item DESCRIPTION inet_time ( [HOST [, PROTOCOL [, TIMEOUT]]]), inet_daytime ( [HOST [, PROTOCOL [, TIMEOUT]]]) =item AUTHOR =item COPYRIGHT =item LICENCE =back =head2 Net::hostent - by-name interface to Perl's built-in gethost*() functions =over 4 =item SYNOPSIS =item DESCRIPTION =item EXAMPLES =item NOTE =item AUTHOR =back =head2 Net::libnetFAQ, libnetFAQ - libnet Frequently Asked Questions =over 4 =item DESCRIPTION =over 4 =item Where to get this document =item How to contribute to this document =back =item Author and Copyright Information =over 4 =item Disclaimer =back =item Obtaining and installing libnet =over 4 =item What is libnet ? =item Which version of perl do I need ? =item What other modules do I need ? =item What machines support libnet ? =item Where can I get the latest libnet release =back =item Using Net::FTP =over 4 =item How do I download files from an FTP server ? =item How do I transfer files in binary mode ? =item How can I get the size of a file on a remote FTP server ? =item How can I get the modification time of a file on a remote FTP server ? =item How can I change the permissions of a file on a remote server ? =item Can I do a reget operation like the ftp command ? =item How do I get a directory listing from an FTP server ? =item Changing directory to "" does not fail ? =item I am behind a SOCKS firewall, but the Firewall option does not work ? =item I am behind an FTP proxy firewall, but cannot access machines outside ? =item My ftp proxy firewall does not listen on port 21 =item Is it possible to change the file permissions of a file on an FTP server ? 
=item I have seen scripts call a method message, but cannot find it documented ? =item Why does Net::FTP not implement mput and mget methods =back =item Using Net::SMTP =over 4 =item Why can't the part of an Email address after the @ be used as the hostname ? =item Why does Net::SMTP not do DNS MX lookups ? =item The verify method always returns true ? =back =item Debugging scripts =over 4 =item How can I debug my scripts that use Net::* modules ? =back =item AUTHOR AND COPYRIGHT =back =head2 Net::netent - by-name interface to Perl's built-in getnet*() functions =over 4 =item SYNOPSIS =item DESCRIPTION =item EXAMPLES =item NOTE =item AUTHOR =back =head2 Net::protoent - by-name interface to Perl's built-in getproto*() functions =over 4 =item SYNOPSIS =item DESCRIPTION =item NOTE =item AUTHOR =back =head2 Net::servent - by-name interface to Perl's built-in getserv*() functions =over 4 =item SYNOPSIS =item DESCRIPTION =item EXAMPLES =item NOTE =item AUTHOR =back =head2 O - Generic interface to Perl Compiler backends =over 4 =item SYNOPSIS =item DESCRIPTION =item CONVENTIONS =item IMPLEMENTATION =item BUGS =item AUTHOR =back =head2 ODBM_File - Tied access to odbm files =over 4 =item SYNOPSIS =item DESCRIPTION C<O_RDONLY>, C<O_WRONLY>, C<O_RDWR> =item DIAGNOSTICS =over 4 =item C<odbm store returned -1, errno 22, key "..." 
at ...> =back =item SECURITY AND PORTABILITY =item BUGS AND WARNINGS =back =head2 Opcode - Disable named opcodes when compiling perl code =over 4 =item SYNOPSIS =item DESCRIPTION =item NOTE =item WARNING =item Operator Names and Operator Lists an operator name (opname), an operator tag name (optag), a negated opname or optag, an operator set (opset) =item Opcode Functions opcodes, opset (OP, ...), opset_to_ops (OPSET), opset_to_hex (OPSET), full_opset, empty_opset, invert_opset (OPSET), verify_opset (OPSET, ...), define_optag (OPTAG, OPSET), opmask_add (OPSET), opmask, opdesc (OP, ...), opdump (PAT) =item Manipulating Opsets =item TO DO (maybe) =back =over 4 =item Predefined Opcode Tags :base_core, :base_mem, :base_loop, :base_io, :base_orig, :base_math, :base_thread, :default, :filesys_read, :sys_db, :browse, :filesys_open, :filesys_write, :subprocess, :ownprocess, :others, :load, :still_to_be_decided, :dangerous =item SEE ALSO =item AUTHORS =back =head2 POSIX - Perl interface to IEEE Std 1003.1 =over 4 =item SYNOPSIS =item DESCRIPTION =item CAVEATS =item FUNCTIONS C<_exit>, C<abort>, C<abs>, C<access>, C<acos>, C<acosh>, C<alarm>, C<asctime>, C<asin>, C<asinh>, C<assert>, C<atan>, C<atanh>, C<atan2>, C<atexit>, C<atof>, C<atoi>, C<atol>, C<bsearch>, C<calloc>, C<cbrt>, C<ceil>, C<chdir>, C<chmod>, C<chown>, C<clearerr>, C<clock>, C<close>, C<closedir>, C<cos>, C<cosh>, C<copysign>, C<creat>, C<ctermid>, C<ctime>, C<cuserid> [POSIX.1-1988], C<difftime>, C<div>, C<dup>, C<dup2>, C<erf>, C<erfc>, C<errno>, C<execl>, C<execle>, C<execlp>, C<execv>, C<execve>, C<execvp>, C<exit>, C<exp>, C<expm1>, C<fabs>, C<fclose>, C<fcntl>, C<fdopen>, C<feof>, C<ferror>, C<fflush>, C<fgetc>, C<fgetpos>, C<fgets>, C<fileno>, C<floor>, C<fdim>, C<fegetround>, C<fesetround>, C<fma>, C<fmax>, C<fmin>, C<fmod>, C<fopen>, C<fork>, C<fpathconf>, C<fpclassify>, C<fprintf>, C<fputc>, C<fputs>, C<fread>, C<free>, C<freopen>, C<frexp>, C<fscanf>, C<fseek>, C<fsetpos>, C<fstat>, C<fsync>, 
C<ftell>, C<fwrite>, C<getc>, C<getchar>, C<getcwd>, C<getegid>, C<getenv>, C<geteuid>, C<getgid>, C<getgrgid>, C<getgrnam>, C<getgroups>, C<getlogin>, C<getpayload>, C<getpgrp>, C<getpid>, C<getppid>, C<getpwnam>, C<getpwuid>, C<gets>, C<getuid>, C<gmtime>, C<hypot>, C<ilogb>, C<Inf>, C<isalnum>, C<isalpha>, C<isatty>, C<iscntrl>, C<isdigit>, C<isfinite>, C<isgraph>, C<isgreater>, C<isinf>, C<islower>, C<isnan>, C<isnormal>, C<isprint>, C<ispunct>, C<issignaling>, C<isspace>, C<isupper>, C<isxdigit>, C<j0>, C<j1>, C<jn>, C<y0>, C<y1>, C<yn>, C<kill>, C<labs>, C<lchown>, C<ldexp>, C<ldiv>, C<lgamma>, C<log1p>, C<log2>, C<logb>, C<link>, C<localeconv>, C<localtime>, C<log>, C<log10>, C<longjmp>, C<lseek>, C<lrint>, C<lround>, C<malloc>, C<mblen>, C<mbtowc>, C<memchr>, C<memcmp>, C<memcpy>, C<memmove>, C<memset>, C<mkdir>, C<mkfifo>, C<mktime>, C<modf>, C<NaN>, C<nan>, C<nearbyint>, C<nextafter>, C<nexttoward>, C<nice>, C<offsetof>, C<open>, C<opendir>, C<pathconf>, C<pause>, C<perror>, C<pipe>, C<pow>, C<printf>, C<putc>, C<putchar>, C<puts>, C<qsort>, C<raise>, C<rand>, C<read>, C<readdir>, C<realloc>, C<remainder>, C<remove>, C<remquo>, C<rename>, C<rewind>, C<rewinddir>, C<rint>, C<rmdir>, C<round>, C<scalbn>, C<scanf>, C<setgid>, C<setjmp>, C<setlocale>, C<setpayload>, C<setpayloadsig>, C<setpgid>, C<setsid>, C<setuid>, C<sigaction>, C<siglongjmp>, C<signbit>, C<sigpending>, C<sigprocmask>, C<sigsetjmp>, C<sigsuspend>, C<sin>, C<sinh>, C<sleep>, C<sprintf>, C<sqrt>, C<srand>, C<sscanf>, C<stat>, C<strcat>, C<strchr>, C<strcmp>, C<strcoll>, C<strcpy>, C<strcspn>, C<strerror>, C<strftime>, C<strlen>, C<strncat>, C<strncmp>, C<strncpy>, C<strpbrk>, C<strrchr>, C<strspn>, C<strstr>, C<strtod>, C<strtok>, C<strtol>, C<strtold>, C<strtoul>, C<strxfrm>, C<sysconf>, C<system>, C<tan>, C<tanh>, C<tcdrain>, C<tcflow>, C<tcflush>, C<tcgetpgrp>, C<tcsendbreak>, C<tcsetpgrp>, C<tgamma>, C<time>, C<times>, C<tmpfile>, C<tmpnam>, C<tolower>, C<toupper>, C<trunc>, C<ttyname>, 
C<tzname>, C<tzset>, C<umask>, C<uname>, C<ungetc>, C<unlink>, C<utime>, C<vfprintf>, C<vprintf>, C<vsprintf>, C<wait>, C<waitpid>, C<wctomb>, C<write> =item CLASSES =over 4 =item C<POSIX::SigAction> C<new>, C<handler>, C<mask>, C<flags>, C<safe> =item C<POSIX::SigRt> C<%SIGRT>, C<SIGRTMIN>, C<SIGRTMAX> =item C<POSIX::SigSet> C<new>, C<addset>, C<delset>, C<emptyset>, C<fillset>, C<ismember> =item C<POSIX::Termios> C<new>, C<getattr>, C<getcc>, C<getcflag>, C<getiflag>, C<getispeed>, C<getlflag>, C<getoflag>, C<getospeed>, C<setattr>, C<setcc>, C<setcflag>, C<setiflag>, C<setispeed>, C<setlflag>, C<setoflag>, C<setospeed>, Baud rate values, Terminal interface values, C<c_cc> field values, C<c_cflag> field values, C<c_iflag> field values, C<c_lflag> field values, C<c_oflag> field values =back =item PATHNAME CONSTANTS Constants =item POSIX CONSTANTS Constants =item RESOURCE CONSTANTS Constants =item SYSTEM CONFIGURATION Constants =item ERRNO Constants =item FCNTL Constants =item FLOAT Constants =item FLOATING-POINT ENVIRONMENT Constants =item LIMITS Constants =item LOCALE Constants =item MATH Constants =item SIGNAL Constants =item STAT Constants, Macros =item STDLIB Constants =item STDIO Constants =item TIME Constants =item UNISTD Constants =item WAIT Constants, C<WNOHANG>, C<WUNTRACED>, Macros, C<WIFEXITED>, C<WEXITSTATUS>, C<WIFSIGNALED>, C<WTERMSIG>, C<WIFSTOPPED>, C<WSTOPSIG> =item WINSOCK Constants =back =head2 Params::Check - A generic input parsing/checking mechanism. 
=over 4 =item SYNOPSIS =item DESCRIPTION =item Template default, required, strict_type, defined, no_override, store, allow =item Functions =over 4 =item check( \%tmpl, \%args, [$verbose] ); Template, Arguments, Verbose =back =back =over 4 =item allow( $test_me, \@criteria ); string, regexp, subroutine, array ref =back =over 4 =item last_error() =back =over 4 =item Global Variables =over 4 =item $Params::Check::VERBOSE =item $Params::Check::STRICT_TYPE =item $Params::Check::ALLOW_UNKNOWN =item $Params::Check::STRIP_LEADING_DASHES =item $Params::Check::NO_DUPLICATES =item $Params::Check::PRESERVE_CASE =item $Params::Check::ONLY_ALLOW_DEFINED =item $Params::Check::SANITY_CHECK_TEMPLATE =item $Params::Check::WARNINGS_FATAL =item $Params::Check::CALLER_DEPTH =back =item Acknowledgements =item BUG REPORTS =item AUTHOR =item COPYRIGHT =back =head2 Parse::CPAN::Meta - Parse META.yml and META.json CPAN metadata files =over 4 =item VERSION =item SYNOPSIS =item DESCRIPTION =item METHODS =over 4 =item load_file =item load_yaml_string =item load_json_string =item load_string =item yaml_backend =item json_backend =item json_decoder =back =item FUNCTIONS =over 4 =item Load =item LoadFile =back =item ENVIRONMENT =over 4 =item CPAN_META_JSON_DECODER =item CPAN_META_JSON_BACKEND =item PERL_JSON_BACKEND =item PERL_YAML_BACKEND =back =item AUTHORS =item COPYRIGHT AND LICENSE =back =head2 Perl::OSType - Map Perl operating system names to generic types =over 4 =item VERSION =item SYNOPSIS =item DESCRIPTION =item USAGE =over 4 =item os_type() =item is_os_type() =back =item SEE ALSO =item SUPPORT =over 4 =item Bugs / Feature Requests =item Source Code =back =item AUTHOR =item CONTRIBUTORS =item COPYRIGHT AND LICENSE =back =head2 PerlIO - On demand loader for PerlIO layers and root of PerlIO::* name space =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item Layers :unix, :stdio, :perlio, :crlf, :utf8, :bytes, :raw, :pop, :win32 =item Custom Layers :encoding, :mmap, :via, :scalar =item 
Alternatives to raw =item Defaults and how to override them =item Querying the layers of filehandles =back =item AUTHOR =item SEE ALSO =back =head2 PerlIO::encoding - encoding layer =over 4 =item SYNOPSIS =item DESCRIPTION =item SEE ALSO =back =head2 PerlIO::mmap - Memory mapped IO =over 4 =item SYNOPSIS =item DESCRIPTION =item IMPLEMENTATION NOTE =back =head2 PerlIO::scalar - in-memory IO, scalar IO =over 4 =item SYNOPSIS =item DESCRIPTION =item IMPLEMENTATION NOTE =back =head2 PerlIO::via - Helper class for PerlIO layers implemented in perl =over 4 =item SYNOPSIS =item DESCRIPTION =item EXPECTED METHODS $class->PUSHED([$mode,[$fh]]), $obj->POPPED([$fh]), $obj->UTF8($belowFlag,[$fh]), $obj->OPEN($path,$mode,[$fh]), $obj->BINMODE([$fh]), $obj->FDOPEN($fd,[$fh]), $obj->SYSOPEN($path,$imode,$perm,[$fh]), $obj->FILENO($fh), $obj->READ($buffer,$len,$fh), $obj->WRITE($buffer,$fh), $obj->FILL($fh), $obj->CLOSE($fh), $obj->SEEK($posn,$whence,$fh), $obj->TELL($fh), $obj->UNREAD($buffer,$fh), $obj->FLUSH($fh), $obj->SETLINEBUF($fh), $obj->CLEARERR($fh), $obj->ERROR($fh), $obj->EOF($fh) =item EXAMPLES =over 4 =item Example - a Hexadecimal Handle =back =back =head2 PerlIO::via::QuotedPrint - PerlIO layer for quoted-printable strings =over 4 =item SYNOPSIS =item VERSION =item DESCRIPTION =item REQUIRED MODULES =item SEE ALSO =item ACKNOWLEDGEMENTS =item COPYRIGHT =back =head2 Pod::Checker - check pod documents for syntax errors =over 4 =item SYNOPSIS =item OPTIONS/ARGUMENTS =over 4 =item podchecker() B<-warnings> =E<gt> I<val>, B<-quiet> =E<gt> I<val> =back =item DESCRIPTION =item DIAGNOSTICS =over 4 =item Errors empty =headn, =over on line I<N> without closing =back, You forgot a '=back' before '=headI<N>', =over is the last thing in the document?!, '=item' outside of any '=over', =back without =over, Can't have a 0 in =over I<N>, =over should be: '=over' or '=over positive_number', =begin I<TARGET> without matching =end I<TARGET>, =begin without a target?, =end I<TARGET> 
without matching =begin, '=end' without a target?, '=end I<TARGET>' is invalid, =end I<CONTENT> doesn't match =begin I<TARGET>, =for without a target?, unresolved internal link I<NAME>, Unknown directive: I<CMD>, Deleting unknown formatting code I<SEQ>, Unterminated I<SEQ>E<lt>E<gt> sequence, An EE<lt>...E<gt> surrounding strange content, An empty EE<lt>E<gt>, An empty C<< LE<lt>E<gt> >>, An empty XE<lt>E<gt>, A non-empty ZE<lt>E<gt>, Spurious text after =pod / =cut, =back doesn't take any parameters, but you said =back I<ARGUMENT>, =pod directives shouldn't be over one line long! Ignoring all I<N> lines of content, =cut found outside a pod block, Invalid =encoding syntax: I<CONTENT> =item Warnings nested commands I<CMD>E<lt>...I<CMD>E<lt>...E<gt>...E<gt>, multiple occurrences (I<N>) of link target I<name>, line containing nothing but whitespace in paragraph, =item has no contents, You can't have =items (as at line I<N>) unless the first thing after the =over is an =item, Expected '=item I<EXPECTED VALUE>', Expected '=item *', Possible =item type mismatch: 'I<x>' found leading a supposed definition =item, You have '=item x' instead of the expected '=item I<N>', Unknown E content in EE<lt>I<CONTENT>E<gt>, empty =over/=back block, empty section in previous paragraph, Verbatim paragraph in NAME section, =headI<n> without preceding higher level =item Hyperlinks ignoring leading/trailing whitespace in link, alternative text/node '%s' contains non-escaped | or / =back =item RETURN VALUE =item EXAMPLES =item SCRIPTS =item INTERFACE =back C<Pod::Checker-E<gt>new( %options )> C<$checker-E<gt>poderror( @args )>, C<$checker-E<gt>poderror( {%opts}, @args )> C<$checker-E<gt>num_errors()> C<$checker-E<gt>num_warnings()> C<$checker-E<gt>name()> C<$checker-E<gt>node()> C<$checker-E<gt>idx()> C<$checker-E<gt>hyperlinks()> line() type() page() node() =over 4 =item AUTHOR =back =head2 Pod::Escapes - for resolving Pod EE<lt>...E<gt> sequences =over 4 =item SYNOPSIS =item DESCRIPTION 
=item GOODIES e2char($e_content), e2charnum($e_content), $Name2character{I<name>}, $Name2character_number{I<name>}, $Latin1Code_to_fallback{I<integer>}, $Latin1Char_to_fallback{I<character>}, $Code2USASCII{I<integer>} =item CAVEATS =item SEE ALSO =item REPOSITORY =item COPYRIGHT AND DISCLAIMERS =item AUTHOR =back =head2 Pod::Html - module to convert pod files to HTML =over 4 =item SYNOPSIS =item DESCRIPTION =item FUNCTIONS =over 4 =item pod2html backlink, cachedir, css, flush, header, help, htmldir, htmlroot, index, infile, outfile, poderrors, podpath, podroot, quiet, recurse, title, verbose =item htmlify =item anchorify =back =item ENVIRONMENT =item AUTHOR =item SEE ALSO =item COPYRIGHT =back =head2 Pod::Man - Convert POD data to formatted *roff input =over 4 =item SYNOPSIS =item DESCRIPTION center, date, errors, fixed, fixedbold, fixeditalic, fixedbolditalic, lquote, rquote, name, nourls, quotes, release, section, stderr, utf8 =item DIAGNOSTICS roff font should be 1 or 2 chars, not "%s", Invalid errors setting "%s", Invalid quote specification "%s", POD document had syntax errors =item ENVIRONMENT PERL_CORE, POD_MAN_DATE, SOURCE_DATE_EPOCH =item BUGS =item CAVEATS =item AUTHOR =item COPYRIGHT AND LICENSE =item SEE ALSO =back =head2 Pod::ParseLink - Parse an LE<lt>E<gt> formatting code in POD text =over 4 =item SYNOPSIS =item DESCRIPTION =item AUTHOR =item COPYRIGHT AND LICENSE =item SEE ALSO =back =head2 Pod::Perldoc - Look up Perl documentation in Pod format. 
=over 4 =item SYNOPSIS =item DESCRIPTION =item SEE ALSO =item COPYRIGHT AND DISCLAIMERS =item AUTHOR =back =head2 Pod::Perldoc::BaseTo - Base for Pod::Perldoc formatters =over 4 =item SYNOPSIS =item DESCRIPTION =item SEE ALSO =item COPYRIGHT AND DISCLAIMERS =item AUTHOR =back =head2 Pod::Perldoc::GetOptsOO - Customized option parser for Pod::Perldoc =over 4 =item SYNOPSIS =item DESCRIPTION Call Pod::Perldoc::GetOptsOO::getopts($object, \@ARGV, $truth), Given -n, if there's an opt_n_with, it'll call $object->opt_n_with( ARGUMENT ) (e.g., "-n foo" => $object->opt_n_with('foo'). Ditto "-nfoo"), Otherwise (given -n) if there's an opt_n, we'll call it $object->opt_n($truth) (Truth defaults to 1), Otherwise we try calling $object->handle_unknown_option('n') (and we increment the error count by its return value), If there's no handle_unknown_option, then we just warn, and then increment the error counter =item SEE ALSO =item COPYRIGHT AND DISCLAIMERS =item AUTHOR =back =head2 Pod::Perldoc::ToANSI - render Pod with ANSI color escapes =over 4 =item SYNOPSIS =item DESCRIPTION =item CAVEAT =item SEE ALSO =item COPYRIGHT AND DISCLAIMERS =item AUTHOR =back =head2 Pod::Perldoc::ToChecker - let Perldoc check Pod for errors =over 4 =item SYNOPSIS =item DESCRIPTION =item SEE ALSO =item COPYRIGHT AND DISCLAIMERS =item AUTHOR =back =head2 Pod::Perldoc::ToMan - let Perldoc render Pod as man pages =over 4 =item SYNOPSIS =item DESCRIPTION =item CAVEAT =item SEE ALSO =item COPYRIGHT AND DISCLAIMERS =item AUTHOR =back =head2 Pod::Perldoc::ToNroff - let Perldoc convert Pod to nroff =over 4 =item SYNOPSIS =item DESCRIPTION =item CAVEAT =item SEE ALSO =item COPYRIGHT AND DISCLAIMERS =item AUTHOR =back =head2 Pod::Perldoc::ToPod - let Perldoc render Pod as ... Pod!
=over 4 =item SYNOPSIS =item DESCRIPTION =item SEE ALSO =item COPYRIGHT AND DISCLAIMERS =item AUTHOR =back =head2 Pod::Perldoc::ToRtf - let Perldoc render Pod as RTF =over 4 =item SYNOPSIS =item DESCRIPTION =item SEE ALSO =item COPYRIGHT AND DISCLAIMERS =item AUTHOR =back =head2 Pod::Perldoc::ToTerm - render Pod with terminal escapes =over 4 =item SYNOPSIS =item DESCRIPTION =item PAGER FORMATTING =item CAVEAT =item SEE ALSO =item COPYRIGHT AND DISCLAIMERS =item AUTHOR =back =head2 Pod::Perldoc::ToText - let Perldoc render Pod as plaintext =over 4 =item SYNOPSIS =item DESCRIPTION =item CAVEAT =item SEE ALSO =item COPYRIGHT AND DISCLAIMERS =item AUTHOR =back =head2 Pod::Perldoc::ToTk - let Perldoc use Tk::Pod to render Pod =over 4 =item SYNOPSIS =item DESCRIPTION =item SEE ALSO =item AUTHOR =back =head2 Pod::Perldoc::ToXml - let Perldoc render Pod as XML =over 4 =item SYNOPSIS =item DESCRIPTION =item SEE ALSO =item COPYRIGHT AND DISCLAIMERS =item AUTHOR =back =head2 Pod::Simple - framework for parsing Pod =over 4 =item SYNOPSIS =item DESCRIPTION =item MAIN METHODS C<< $parser = I<SomeClass>->new(); >>, C<< $parser->output_fh( *OUT ); >>, C<< $parser->output_string( \$somestring ); >>, C<< $parser->parse_file( I<$some_filename> ); >>, C<< $parser->parse_file( *INPUT_FH ); >>, C<< $parser->parse_string_document( I<$all_content> ); >>, C<< $parser->parse_lines( I<...@lines...>, undef ); >>, C<< $parser->content_seen >>, C<< I<SomeClass>->filter( I<$filename> ); >>, C<< I<SomeClass>->filter( I<*INPUT_FH> ); >>, C<< I<SomeClass>->filter( I<\$document_content> ); >> =item SECONDARY METHODS C<< $parser->parse_characters( I<SOMEVALUE> ) >>, C<< $parser->no_whining( I<SOMEVALUE> ) >>, C<< $parser->no_errata_section( I<SOMEVALUE> ) >>, C<< $parser->complain_stderr( I<SOMEVALUE> ) >>, C<< $parser->source_filename >>, C<< $parser->doc_has_started >>, C<< $parser->source_dead >>, C<< $parser->strip_verbatim_indent( I<SOMEVALUE> ) >>, C<< $parser->expand_verbatim_tabs( I<n> ) >> 
=item TERTIARY METHODS C<< $parser->abandon_output_fh() >>X<abandon_output_fh>, C<< $parser->abandon_output_string() >>X<abandon_output_string>, C<< $parser->accept_code( @codes ) >>X<accept_code>, C<< $parser->accept_codes( @codes ) >>X<accept_codes>, C<< $parser->accept_directive_as_data( @directives ) >>X<accept_directive_as_data>, C<< $parser->accept_directive_as_processed( @directives ) >>X<accept_directive_as_processed>, C<< $parser->accept_directive_as_verbatim( @directives ) >>X<accept_directive_as_verbatim>, C<< $parser->accept_target( @targets ) >>X<accept_target>, C<< $parser->accept_target_as_text( @targets ) >>X<accept_target_as_text>, C<< $parser->accept_targets( @targets ) >>X<accept_targets>, C<< $parser->accept_targets_as_text( @targets ) >>X<accept_targets_as_text>, C<< $parser->any_errata_seen() >>X<any_errata_seen>, C<< $parser->errata_seen() >>X<errata_seen>, C<< $parser->detected_encoding() >>X<detected_encoding>, C<< $parser->encoding() >>X<encoding>, C<< $parser->parse_from_file( $source, $to ) >>X<parse_from_file>, C<< $parser->scream( @error_messages ) >>X<scream>, C<< $parser->unaccept_code( @codes ) >>X<unaccept_code>, C<< $parser->unaccept_codes( @codes ) >>X<unaccept_codes>, C<< $parser->unaccept_directive( @directives ) >>X<unaccept_directive>, C<< $parser->unaccept_directives( @directives ) >>X<unaccept_directives>, C<< $parser->unaccept_target( @targets ) >>X<unaccept_target>, C<< $parser->unaccept_targets( @targets ) >>X<unaccept_targets>, C<< $parser->version_report() >>X<version_report>, C<< $parser->whine( @error_messages ) >>X<whine> =item ENCODING =item SEE ALSO =item SUPPORT =item COPYRIGHT AND DISCLAIMERS =item AUTHOR Allison Randal C<allison@perl.org>, Hans Dieter Pearcey C<hdp@cpan.org>, David E. 
Wheeler C<dwheeler@cpan.org>, Karl Williamson C<khw@cpan.org>, Gabor Szabo C<szabgab@gmail.com>, Shawn H Corey C<SHCOREY at cpan.org> =back =head2 Pod::Simple::Checker -- check the Pod syntax of a document =over 4 =item SYNOPSIS =item DESCRIPTION =item SEE ALSO =item SUPPORT =item COPYRIGHT AND DISCLAIMERS =item AUTHOR Allison Randal C<allison@perl.org>, Hans Dieter Pearcey C<hdp@cpan.org>, David E. Wheeler C<dwheeler@cpan.org> =back =head2 Pod::Simple::Debug -- put Pod::Simple into trace/debug mode =over 4 =item SYNOPSIS =item DESCRIPTION =item CAVEATS =item GUTS =item SEE ALSO =item SUPPORT =item COPYRIGHT AND DISCLAIMERS =item AUTHOR Allison Randal C<allison@perl.org>, Hans Dieter Pearcey C<hdp@cpan.org>, David E. Wheeler C<dwheeler@cpan.org> =back =head2 Pod::Simple::DumpAsText -- dump Pod-parsing events as text =over 4 =item SYNOPSIS =item DESCRIPTION =item SEE ALSO =item SUPPORT =item COPYRIGHT AND DISCLAIMERS =item AUTHOR Allison Randal C<allison@perl.org>, Hans Dieter Pearcey C<hdp@cpan.org>, David E. Wheeler C<dwheeler@cpan.org> =back =head2 Pod::Simple::DumpAsXML -- turn Pod into XML =over 4 =item SYNOPSIS =item DESCRIPTION =item SEE ALSO =item SUPPORT =item COPYRIGHT AND DISCLAIMERS =item AUTHOR Allison Randal C<allison@perl.org>, Hans Dieter Pearcey C<hdp@cpan.org>, David E. Wheeler C<dwheeler@cpan.org> =back =head2 Pod::Simple::HTML - convert Pod to HTML =over 4 =item SYNOPSIS =item DESCRIPTION =item CALLING FROM THE COMMAND LINE =item CALLING FROM PERL =over 4 =item Minimal code =item More detailed example =back =item METHODS =over 4 =item html_css =item html_javascript =item title_prefix =item title_postfix =item html_header_before_title =item top_anchor =item html_h_level =item index =item html_header_after_title =item html_footer =back =item SUBCLASSING =item SEE ALSO =item SUPPORT =item COPYRIGHT AND DISCLAIMERS =item ACKNOWLEDGEMENTS =item AUTHOR Allison Randal C<allison@perl.org>, Hans Dieter Pearcey C<hdp@cpan.org>, David E. 
Wheeler C<dwheeler@cpan.org> =back =head2 Pod::Simple::HTMLBatch - convert several Pod files to several HTML files =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item FROM THE COMMAND LINE =back =item MAIN METHODS $batchconv = Pod::Simple::HTMLBatch->new;, $batchconv->batch_convert( I<indirs>, I<outdir> );, $batchconv->batch_convert( undef , ...);, $batchconv->batch_convert( q{@INC}, ...);, $batchconv->batch_convert( \@dirs , ...);, $batchconv->batch_convert( "somedir" , ...);, $batchconv->batch_convert( 'somedir:someother:also' , ...);, $batchconv->batch_convert( ... , undef );, $batchconv->batch_convert( ... , 'somedir' ); =over 4 =item ACCESSOR METHODS $batchconv->verbose( I<nonnegative_integer> );, $batchconv->index( I<true-or-false> );, $batchconv->contents_file( I<filename> );, $batchconv->contents_page_start( I<HTML_string> );, $batchconv->contents_page_end( I<HTML_string> );, $batchconv->add_css( $url );, $batchconv->add_javascript( $url );, $batchconv->css_flurry( I<true-or-false> );, $batchconv->javascript_flurry( I<true-or-false> );, $batchconv->no_contents_links( I<true-or-false> );, $batchconv->html_render_class( I<classname> );, $batchconv->search_class( I<classname> ); =back =item NOTES ON CUSTOMIZATION =item SEE ALSO =item SUPPORT =item COPYRIGHT AND DISCLAIMERS =item AUTHOR Allison Randal C<allison@perl.org>, Hans Dieter Pearcey C<hdp@cpan.org>, David E. Wheeler C<dwheeler@cpan.org> =back =head2 Pod::Simple::JustPod -- just the Pod, the whole Pod, and nothing but the Pod =over 4 =item SYNOPSIS =item DESCRIPTION =item SEE ALSO =item SUPPORT =item COPYRIGHT AND DISCLAIMERS =item AUTHOR Allison Randal C<allison@perl.org>, Hans Dieter Pearcey C<hdp@cpan.org>, David E. 
Wheeler C<dwheeler@cpan.org> =back =head2 Pod::Simple::LinkSection -- represent "section" attributes of L codes =over 4 =item SYNOPSIS =item DESCRIPTION =item SEE ALSO =item SUPPORT =item COPYRIGHT AND DISCLAIMERS =item AUTHOR Allison Randal C<allison@perl.org>, Hans Dieter Pearcey C<hdp@cpan.org>, David E. Wheeler C<dwheeler@cpan.org> =back =head2 Pod::Simple::Methody -- turn Pod::Simple events into method calls =over 4 =item SYNOPSIS =item DESCRIPTION =item METHOD CALLING =item SEE ALSO =item SUPPORT =item COPYRIGHT AND DISCLAIMERS =item AUTHOR Allison Randal C<allison@perl.org>, Hans Dieter Pearcey C<hdp@cpan.org>, David E. Wheeler C<dwheeler@cpan.org> =back =head2 Pod::Simple::PullParser -- a pull-parser interface to parsing Pod =over 4 =item SYNOPSIS =item DESCRIPTION =item METHODS my $token = $parser->get_token, $parser->unget_token( $token ), $parser->unget_token( $token1, $token2, ... ), $parser->set_source( $filename ), $parser->set_source( $filehandle_object ), $parser->set_source( \$document_source ), $parser->set_source( \@document_lines ), $parser->parse_file(...), $parser->parse_string_document(...), $parser->filter(...), $parser->parse_from_file(...), my $title_string = $parser->get_title, my $title_string = $parser->get_short_title, $author_name = $parser->get_author, $description_name = $parser->get_description, $version_block = $parser->get_version =item NOTE =item SEE ALSO =item SUPPORT =item COPYRIGHT AND DISCLAIMERS =item AUTHOR Allison Randal C<allison@perl.org>, Hans Dieter Pearcey C<hdp@cpan.org>, David E. 
Wheeler C<dwheeler@cpan.org> =back =head2 Pod::Simple::PullParserEndToken -- end-tokens from Pod::Simple::PullParser =over 4 =item SYNOPSIS =item DESCRIPTION $token->tagname, $token->tagname(I<somestring>), $token->tag(...), $token->is_tag(I<somestring>) or $token->is_tagname(I<somestring>) =item SEE ALSO =item SUPPORT =item COPYRIGHT AND DISCLAIMERS =item AUTHOR Allison Randal C<allison@perl.org>, Hans Dieter Pearcey C<hdp@cpan.org>, David E. Wheeler C<dwheeler@cpan.org> =back =head2 Pod::Simple::PullParserStartToken -- start-tokens from Pod::Simple::PullParser =over 4 =item SYNOPSIS =item DESCRIPTION $token->tagname, $token->tagname(I<somestring>), $token->tag(...), $token->is_tag(I<somestring>) or $token->is_tagname(I<somestring>), $token->attr(I<attrname>), $token->attr(I<attrname>, I<newvalue>), $token->attr_hash =item SEE ALSO =item SUPPORT =item COPYRIGHT AND DISCLAIMERS =item AUTHOR Allison Randal C<allison@perl.org>, Hans Dieter Pearcey C<hdp@cpan.org>, David E. Wheeler C<dwheeler@cpan.org> =back =head2 Pod::Simple::PullParserTextToken -- text-tokens from Pod::Simple::PullParser =over 4 =item SYNOPSIS =item DESCRIPTION $token->text, $token->text(I<somestring>), $token->text_r() =item SEE ALSO =item SUPPORT =item COPYRIGHT AND DISCLAIMERS =item AUTHOR Allison Randal C<allison@perl.org>, Hans Dieter Pearcey C<hdp@cpan.org>, David E. Wheeler C<dwheeler@cpan.org> =back =head2 Pod::Simple::PullParserToken -- tokens from Pod::Simple::PullParser =over 4 =item SYNOPSIS =item DESCRIPTION $token->type, $token->is_start, $token->is_text, $token->is_end, $token->dump =item SEE ALSO =item SUPPORT =item COPYRIGHT AND DISCLAIMERS =item AUTHOR Allison Randal C<allison@perl.org>, Hans Dieter Pearcey C<hdp@cpan.org>, David E.
Wheeler C<dwheeler@cpan.org> =back =head2 Pod::Simple::RTF -- format Pod as RTF =over 4 =item SYNOPSIS =item DESCRIPTION =item FORMAT CONTROL ATTRIBUTES $parser->head1_halfpoint_size( I<halfpoint_integer> );, $parser->head2_halfpoint_size( I<halfpoint_integer> );, $parser->head3_halfpoint_size( I<halfpoint_integer> );, $parser->head4_halfpoint_size( I<halfpoint_integer> );, $parser->codeblock_halfpoint_size( I<halfpoint_integer> );, $parser->header_halfpoint_size( I<halfpoint_integer> );, $parser->normal_halfpoint_size( I<halfpoint_integer> );, $parser->no_proofing_exemptions( I<true_or_false> );, $parser->doc_lang( I<microsoft_decimal_language_code> ) =item SEE ALSO =item SUPPORT =item COPYRIGHT AND DISCLAIMERS =item AUTHOR Allison Randal C<allison@perl.org>, Hans Dieter Pearcey C<hdp@cpan.org>, David E. Wheeler C<dwheeler@cpan.org> =back =head2 Pod::Simple::Search - find POD documents in directory trees =over 4 =item SYNOPSIS =item DESCRIPTION =item CONSTRUCTOR =item ACCESSORS $search->inc( I<true-or-false> );, $search->verbose( I<nonnegative-number> );, $search->limit_glob( I<some-glob-string> );, $search->callback( I<\&some_routine> );, $search->laborious( I<true-or-false> );, $search->recurse( I<true-or-false> );, $search->shadows( I<true-or-false> );, $search->is_case_insensitive( I<true-or-false> );, $search->limit_re( I<some-regxp> );, $search->dir_prefix( I<some-string-value> );, $search->progress( I<some-progress-object> );, $name2path = $self->name2path;, $path2name = $self->path2name; =item MAIN SEARCH METHODS =over 4 =item C<< $search->survey( @directories ) >> C<name2path>, C<path2name> =item C<< $search->simplify_name( $str ) >> =item C<< $search->find( $pod ) >> =item C<< $search->find( $pod, @search_dirs ) >> =item C<< $self->contains_pod( $file ) >> =back =item SUPPORT =item COPYRIGHT AND DISCLAIMERS =item AUTHOR Allison Randal C<allison@perl.org>, Hans Dieter Pearcey C<hdp@cpan.org>, David E. 
Wheeler C<dwheeler@cpan.org> =back =head2 Pod::Simple::SimpleTree -- parse Pod into a simple parse tree =over 4 =item SYNOPSIS =item DESCRIPTION =item METHODS =item Tree Contents =item SEE ALSO =item SUPPORT =item COPYRIGHT AND DISCLAIMERS =item AUTHOR Allison Randal C<allison@perl.org>, Hans Dieter Pearcey C<hdp@cpan.org>, David E. Wheeler C<dwheeler@cpan.org> =back =head2 Pod::Simple::Subclassing -- write a formatter as a Pod::Simple subclass =over 4 =item SYNOPSIS =item DESCRIPTION Pod::Simple, Pod::Simple::Methody, Pod::Simple::PullParser, Pod::Simple::SimpleTree =item Events C<< $parser->_handle_element_start( I<element_name>, I<attr_hashref> ) >>, C<< $parser->_handle_element_end( I<element_name> ) >>, C<< $parser->_handle_text( I<text_string> ) >>, events with an element_name of Document, events with an element_name of Para, events with an element_name of B, C, F, or I, events with an element_name of S, events with an element_name of X, events with an element_name of L, events with an element_name of E or Z, events with an element_name of Verbatim, events with an element_name of head1 .. head4, events with an element_name of encoding, events with an element_name of over-bullet, events with an element_name of over-number, events with an element_name of over-text, events with an element_name of over-block, events with an element_name of over-empty, events with an element_name of item-bullet, events with an element_name of item-number, events with an element_name of item-text, events with an element_name of for, events with an element_name of Data =item More Pod::Simple Methods C<< $parser->accept_targets( I<SOMEVALUE> ) >>, C<< $parser->accept_targets_as_text( I<SOMEVALUE> ) >>, C<< $parser->accept_codes( I<Codename>, I<Codename>... 
) >>, C<< $parser->accept_directive_as_data( I<directive_name> ) >>, C<< $parser->accept_directive_as_verbatim( I<directive_name> ) >>, C<< $parser->accept_directive_as_processed( I<directive_name> ) >>, C<< $parser->nbsp_for_S( I<BOOLEAN> ); >>, C<< $parser->version_report() >>, C<< $parser->pod_para_count() >>, C<< $parser->line_count() >>, C<< $parser->nix_X_codes( I<SOMEVALUE> ) >>, C<< $parser->keep_encoding_directive( I<SOMEVALUE> ) >>, C<< $parser->merge_text( I<SOMEVALUE> ) >>, C<< $parser->code_handler( I<CODE_REF> ) >>, C<< $parser->cut_handler( I<CODE_REF> ) >>, C<< $parser->pod_handler( I<CODE_REF> ) >>, C<< $parser->whiteline_handler( I<CODE_REF> ) >>, C<< $parser->whine( I<linenumber>, I<complaint string> ) >>, C<< $parser->scream( I<linenumber>, I<complaint string> ) >>, C<< $parser->source_dead(1) >>, C<< $parser->hide_line_numbers( I<SOMEVALUE> ) >>, C<< $parser->no_whining( I<SOMEVALUE> ) >>, C<< $parser->no_errata_section( I<SOMEVALUE> ) >>, C<< $parser->complain_stderr( I<SOMEVALUE> ) >>, C<< $parser->bare_output( I<SOMEVALUE> ) >>, C<< $parser->preserve_whitespace( I<SOMEVALUE> ) >>, C<< $parser->parse_empty_lists( I<SOMEVALUE> ) >> =item SEE ALSO =item SUPPORT =item COPYRIGHT AND DISCLAIMERS =item AUTHOR Allison Randal C<allison@perl.org>, Hans Dieter Pearcey C<hdp@cpan.org>, David E. Wheeler C<dwheeler@cpan.org> =back =head2 Pod::Simple::Text -- format Pod as plaintext =over 4 =item SYNOPSIS =item DESCRIPTION =item SEE ALSO =item SUPPORT =item COPYRIGHT AND DISCLAIMERS =item AUTHOR Allison Randal C<allison@perl.org>, Hans Dieter Pearcey C<hdp@cpan.org>, David E. Wheeler C<dwheeler@cpan.org> =back =head2 Pod::Simple::TextContent -- get the text content of Pod =over 4 =item SYNOPSIS =item DESCRIPTION =item SEE ALSO =item SUPPORT =item COPYRIGHT AND DISCLAIMERS =item AUTHOR Allison Randal C<allison@perl.org>, Hans Dieter Pearcey C<hdp@cpan.org>, David E. 
Wheeler C<dwheeler@cpan.org> =back =head2 Pod::Simple::XHTML -- format Pod as validating XHTML =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item Minimal code =back =back =over 4 =item METHODS =over 4 =item perldoc_url_prefix =item perldoc_url_postfix =item man_url_prefix =item man_url_postfix =item title_prefix, title_postfix =item html_css =item html_javascript =item html_doctype =item html_charset =item html_header_tags =item html_h_level =item default_title =item force_title =item html_header, html_footer =item index =item anchor_items =item backlink =back =back =over 4 =item SUBCLASSING =back =over 4 =item handle_text =item handle_code =item accept_targets_as_html =back =over 4 =item resolve_pod_page_link =back =over 4 =item resolve_man_page_link =back =over 4 =item idify =back =over 4 =item batch_mode_page_object_init =back =over 4 =item SEE ALSO =item SUPPORT =item COPYRIGHT AND DISCLAIMERS =item ACKNOWLEDGEMENTS =item AUTHOR Allison Randal C<allison@perl.org>, Hans Dieter Pearcey C<hdp@cpan.org>, David E. Wheeler C<dwheeler@cpan.org> =back =head2 Pod::Simple::XMLOutStream -- turn Pod into XML =over 4 =item SYNOPSIS =item DESCRIPTION =item SEE ALSO =item ABOUT EXTENDING POD =item SEE ALSO =item SUPPORT =item COPYRIGHT AND DISCLAIMERS =item AUTHOR Allison Randal C<allison@perl.org>, Hans Dieter Pearcey C<hdp@cpan.org>, David E. 
Wheeler C<dwheeler@cpan.org> =back =head2 Pod::Text - Convert POD data to formatted text =over 4 =item SYNOPSIS =item DESCRIPTION alt, code, errors, indent, loose, margin, nourls, quotes, sentence, stderr, utf8, width =item DIAGNOSTICS Bizarre space in item, Item called without tag, Can't open %s for reading: %s, Invalid errors setting "%s", Invalid quote specification "%s", POD document had syntax errors =item BUGS =item CAVEATS =item NOTES =item AUTHOR =item COPYRIGHT AND LICENSE =item SEE ALSO =back =head2 Pod::Text::Color - Convert POD data to formatted color ASCII text =over 4 =item SYNOPSIS =item DESCRIPTION =item BUGS =item AUTHOR =item COPYRIGHT AND LICENSE =item SEE ALSO =back =head2 Pod::Text::Overstrike - Convert POD data to formatted overstrike text =over 4 =item SYNOPSIS =item DESCRIPTION =item BUGS =item AUTHOR =item COPYRIGHT AND LICENSE =item SEE ALSO =back =head2 Pod::Text::Termcap - Convert POD data to ASCII text with format escapes =over 4 =item SYNOPSIS =item DESCRIPTION =item AUTHOR =item COPYRIGHT AND LICENSE =item SEE ALSO =back =head2 Pod::Usage - print a usage message from embedded pod documentation =over 4 =item SYNOPSIS =item ARGUMENTS C<-message> I<string>, C<-msg> I<string>, C<-exitval> I<value>, C<-verbose> I<value>, C<-sections> I<spec>, C<-output> I<handle>, C<-input> I<handle>, C<-pathlist> I<string>, C<-noperldoc>, C<-perlcmd>, C<-perldoc> I<path-to-perldoc>, C<-perldocopt> I<string> =over 4 =item Formatting base class =item Pass-through options =back =item DESCRIPTION =over 4 =item Scripts =back =item EXAMPLES =over 4 =item Recommended Use =back =item CAVEATS =item AUTHOR =item ACKNOWLEDGMENTS =item SEE ALSO =back =head2 SDBM_File - Tied access to sdbm files =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item Tie =back =item EXPORTS =item DIAGNOSTICS =over 4 =item C<sdbm store returned -1, errno 22, key "..." 
at ...> =back =item SECURITY WARNING =item BUGS AND WARNINGS =back =head2 Safe - Compile and execute code in restricted compartments =over 4 =item SYNOPSIS =item DESCRIPTION a new namespace, an operator mask =item WARNING =item METHODS =over 4 =item permit (OP, ...) =item permit_only (OP, ...) =item deny (OP, ...) =item deny_only (OP, ...) =item trap (OP, ...), untrap (OP, ...) =item share (NAME, ...) =item share_from (PACKAGE, ARRAYREF) =item varglob (VARNAME) =item reval (STRING, STRICT) =item rdo (FILENAME) =item root (NAMESPACE) =item mask (MASK) =item wrap_code_ref (CODEREF) =item wrap_code_refs_within (...) =back =item RISKS Memory, CPU, Snooping, Signals, State Changes =item AUTHOR =back =head2 Scalar::Util - A selection of general-utility scalar subroutines =over 4 =item SYNOPSIS =item DESCRIPTION =back =over 4 =item FUNCTIONS FOR REFERENCES =over 4 =item blessed =item refaddr =item reftype =item weaken =item unweaken =item isweak =back =item OTHER FUNCTIONS =over 4 =item dualvar =item isdual =item isvstring =item looks_like_number =item openhandle =item readonly =item set_prototype =item tainted =back =item DIAGNOSTICS Weak references are not implemented in the version of perl, Vstrings are not implemented in the version of perl =item KNOWN BUGS =item SEE ALSO =item COPYRIGHT =back =head2 Search::Dict - look - search for key in dictionary file =over 4 =item SYNOPSIS =item DESCRIPTION =back =head2 SelectSaver - save and restore selected file handle =over 4 =item SYNOPSIS =item DESCRIPTION =back =head2 SelfLoader - load functions only on demand =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item The __DATA__ token =item SelfLoader autoloading =item Autoloading and package lexicals =item SelfLoader and AutoLoader =item __DATA__, __END__, and the FOOBAR::DATA filehandle. =item Classes and inherited methods. 
=back =item Multiple packages and fully qualified subroutine names =item AUTHOR =item COPYRIGHT AND LICENSE a), b) =back =head2 Socket, C<Socket> - networking constants and support functions =over 4 =item SYNOPSIS =item DESCRIPTION =back =over 4 =item CONSTANTS =back =over 4 =item PF_INET, PF_INET6, PF_UNIX, ... =item AF_INET, AF_INET6, AF_UNIX, ... =item SOCK_STREAM, SOCK_DGRAM, SOCK_RAW, ... =item SOCK_NONBLOCK, SOCK_CLOEXEC =item SOL_SOCKET =item SO_ACCEPTCONN, SO_BROADCAST, SO_ERROR, ... =item IP_OPTIONS, IP_TOS, IP_TTL, ... =item IP_PMTUDISC_WANT, IP_PMTUDISC_DONT, ... =item IPTOS_LOWDELAY, IPTOS_THROUGHPUT, IPTOS_RELIABILITY, ... =item MSG_BCAST, MSG_OOB, MSG_TRUNC, ... =item SHUT_RD, SHUT_RDWR, SHUT_WR =item INADDR_ANY, INADDR_BROADCAST, INADDR_LOOPBACK, INADDR_NONE =item IPPROTO_IP, IPPROTO_IPV6, IPPROTO_TCP, ... =item TCP_CORK, TCP_KEEPALIVE, TCP_NODELAY, ... =item IN6ADDR_ANY, IN6ADDR_LOOPBACK =item IPV6_ADD_MEMBERSHIP, IPV6_MTU, IPV6_V6ONLY, ... =back =over 4 =item STRUCTURE MANIPULATORS =back =over 4 =item $family = sockaddr_family $sockaddr =item $sockaddr = pack_sockaddr_in $port, $ip_address =item ($port, $ip_address) = unpack_sockaddr_in $sockaddr =item $sockaddr = sockaddr_in $port, $ip_address =item ($port, $ip_address) = sockaddr_in $sockaddr =item $sockaddr = pack_sockaddr_in6 $port, $ip6_address, [$scope_id, [$flowinfo]] =item ($port, $ip6_address, $scope_id, $flowinfo) = unpack_sockaddr_in6 $sockaddr =item $sockaddr = sockaddr_in6 $port, $ip6_address, [$scope_id, [$flowinfo]] =item ($port, $ip6_address, $scope_id, $flowinfo) = sockaddr_in6 $sockaddr =item $sockaddr = pack_sockaddr_un $path =item ($path) = unpack_sockaddr_un $sockaddr =item $sockaddr = sockaddr_un $path =item ($path) = sockaddr_un $sockaddr =item $ip_mreq = pack_ip_mreq $multiaddr, $interface =item ($multiaddr, $interface) = unpack_ip_mreq $ip_mreq =item $ip_mreq_source = pack_ip_mreq_source $multiaddr, $source, $interface =item ($multiaddr, $source, $interface) = 
unpack_ip_mreq_source $ip_mreq =item $ipv6_mreq = pack_ipv6_mreq $multiaddr6, $ifindex =item ($multiaddr6, $ifindex) = unpack_ipv6_mreq $ipv6_mreq =back =over 4 =item FUNCTIONS =back =over 4 =item $ip_address = inet_aton $string =item $string = inet_ntoa $ip_address =item $address = inet_pton $family, $string =item $string = inet_ntop $family, $address =item ($err, @result) = getaddrinfo $host, $service, [$hints] flags => INT, family => INT, socktype => INT, protocol => INT, family => INT, socktype => INT, protocol => INT, addr => STRING, canonname => STRING, AI_PASSIVE, AI_CANONNAME, AI_NUMERICHOST =item ($err, $hostname, $servicename) = getnameinfo $sockaddr, [$flags, [$xflags]] NI_NUMERICHOST, NI_NUMERICSERV, NI_NAMEREQD, NI_DGRAM, NIx_NOHOST, NIx_NOSERV =back =over 4 =item getaddrinfo() / getnameinfo() ERROR CONSTANTS EAI_AGAIN, EAI_BADFLAGS, EAI_FAMILY, EAI_NODATA, EAI_NONAME, EAI_SERVICE =back =over 4 =item EXAMPLES =over 4 =item Lookup for connect() =item Making a human-readable string out of an address =item Resolving hostnames into IP addresses =item Accessing socket options =back =back =over 4 =item AUTHOR =back =head2 Storable - persistence for Perl data structures =over 4 =item SYNOPSIS =item DESCRIPTION =item MEMORY STORE =item ADVISORY LOCKING =item SPEED =item CANONICAL REPRESENTATION =item CODE REFERENCES =item FORWARD COMPATIBILITY utf8 data, restricted hashes, huge objects, files from future versions of Storable =item ERROR REPORTING =item WIZARDS ONLY =over 4 =item Hooks C<STORABLE_freeze> I<obj>, I<cloning>, C<STORABLE_thaw> I<obj>, I<cloning>, I<serialized>, .., C<STORABLE_attach> I<class>, I<cloning>, I<serialized> =item Predicates C<Storable::last_op_in_netorder>, C<Storable::is_storing>, C<Storable::is_retrieving> =item Recursion =item Deep Cloning =back =item Storable magic $info = Storable::file_magic( $filename ), C<version>, C<version_nv>, C<major>, C<minor>, C<hdrsize>, C<netorder>, C<byteorder>, C<intsize>, C<longsize>, C<ptrsize>, 
C<nvsize>, C<file>, $info = Storable::read_magic( $buffer ), $info = Storable::read_magic( $buffer, $must_be_file ) =item EXAMPLES =item SECURITY WARNING =item WARNING =item REGULAR EXPRESSIONS =item BUGS =over 4 =item 64 bit data in perl 5.6.0 and 5.6.1 =back =item CREDITS =item AUTHOR =item SEE ALSO =back =head2 Sub::Util - A selection of utility subroutines for subs and CODE references =over 4 =item SYNOPSIS =item DESCRIPTION =back =over 4 =item FUNCTIONS =back =over 4 =item prototype =back =over 4 =item set_prototype =back =over 4 =item subname =back =over 4 =item set_subname =back =over 4 =item AUTHOR =back =head2 Symbol - manipulate Perl symbols and their names =over 4 =item SYNOPSIS =item DESCRIPTION =item BUGS =back =head2 Sys::Hostname - Try every conceivable way to get hostname =over 4 =item SYNOPSIS =item DESCRIPTION =item AUTHOR =back =head2 Sys::Syslog - Perl interface to the UNIX syslog(3) calls =over 4 =item VERSION =item SYNOPSIS =item DESCRIPTION =item EXPORTS =item FUNCTIONS B<openlog($ident, $logopt, $facility)>, B<syslog($priority, $message)>, B<syslog($priority, $format, @args)>, B<Note>, B<setlogmask($mask_priority)>, B<setlogsock()>, B<Note>, B<closelog()> =item THE RULES OF SYS::SYSLOG =item EXAMPLES =item CONSTANTS =over 4 =item Facilities =item Levels =back =item DIAGNOSTICS C<Invalid argument passed to setlogsock>, C<eventlog passed to setlogsock, but no Win32 API available>, C<no connection to syslog available>, C<stream passed to setlogsock, but %s is not writable>, C<stream passed to setlogsock, but could not find any device>, C<tcp passed to setlogsock, but tcp service unavailable>, C<syslog: expecting argument %s>, C<syslog: invalid level/facility: %s>, C<syslog: too many levels given: %s>, C<syslog: too many facilities given: %s>, C<syslog: level must be given>, C<udp passed to setlogsock, but udp service unavailable>, C<unix passed to setlogsock, but path not available> =item HISTORY =item SEE ALSO =over 4 =item Other modules =item 
Manual Pages =item RFCs =item Articles =item Event Log =back =item AUTHORS & ACKNOWLEDGEMENTS =item BUGS =item SUPPORT Perl Documentation, MetaCPAN, Search CPAN, AnnoCPAN: Annotated CPAN documentation, CPAN Ratings, RT: CPAN's request tracker =item COPYRIGHT =item LICENSE =back =head2 TAP::Base - Base class that provides common functionality to L<TAP::Parser> and L<TAP::Harness> =over 4 =item VERSION =back =over 4 =item SYNOPSIS =item DESCRIPTION =item METHODS =over 4 =item Class Methods =back =back =head2 TAP::Formatter::Base - Base class for harness output delegates =over 4 =item VERSION =back =over 4 =item DESCRIPTION =item SYNOPSIS =back =over 4 =item METHODS =over 4 =item Class Methods C<verbosity>, C<verbose>, C<timer>, C<failures>, C<comments>, C<quiet>, C<really_quiet>, C<silent>, C<errors>, C<directives>, C<stdout>, C<color>, C<jobs>, C<show_count> =back =back =head2 TAP::Formatter::Color - Run Perl test scripts with color =over 4 =item VERSION =back =over 4 =item DESCRIPTION =item SYNOPSIS =item METHODS =over 4 =item Class Methods =back =back =head2 TAP::Formatter::Console - Harness output delegate for default console output =over 4 =item VERSION =back =over 4 =item DESCRIPTION =item SYNOPSIS =over 4 =item C<< open_test >> =back =back =head2 TAP::Formatter::Console::ParallelSession - Harness output delegate for parallel console output =over 4 =item VERSION =back =over 4 =item DESCRIPTION =item SYNOPSIS =back =over 4 =item METHODS =over 4 =item Class Methods =back =back =head2 TAP::Formatter::Console::Session - Harness output delegate for default console output =over 4 =item VERSION =back =over 4 =item DESCRIPTION =back =over 4 =item C<< clear_for_close >> =item C<< close_test >> =item C<< header >> =item C<< result >> =back =head2 TAP::Formatter::File - Harness output delegate for file output =over 4 =item VERSION =back =over 4 =item DESCRIPTION =item SYNOPSIS =over 4 =item C<< open_test >> =back =back =head2 TAP::Formatter::File::Session - Harness output 
delegate for file output =over 4 =item VERSION =back =over 4 =item DESCRIPTION =back =over 4 =item METHODS =over 4 =item result =back =back =over 4 =item close_test =back =head2 TAP::Formatter::Session - Abstract base class for harness output delegate =over 4 =item VERSION =back =over 4 =item METHODS =over 4 =item Class Methods C<formatter>, C<parser>, C<name>, C<show_count> =back =back =head2 TAP::Harness - Run test scripts with statistics =over 4 =item VERSION =back =over 4 =item DESCRIPTION =item SYNOPSIS =back =over 4 =item METHODS =over 4 =item Class Methods C<verbosity>, C<timer>, C<failures>, C<comments>, C<show_count>, C<normalize>, C<lib>, C<switches>, C<test_args>, C<color>, C<exec>, C<merge>, C<sources>, C<aggregator_class>, C<version>, C<formatter_class>, C<multiplexer_class>, C<parser_class>, C<scheduler_class>, C<formatter>, C<errors>, C<directives>, C<ignore_exit>, C<jobs>, C<rules>, C<rulesfiles>, C<stdout>, C<trap> =back =back =over 4 =item Instance Methods =back the source name of a test to run, a reference to a [ source name, display name ] array =over 4 =item CONFIGURING =over 4 =item Plugins =item C<Module::Build> =item C<ExtUtils::MakeMaker> =item C<prove> =back =item WRITING PLUGINS Customize how TAP gets into the parser, Customize how TAP results are output from the parser =item SUBCLASSING =over 4 =item Methods L</new>, L</runtests>, L</summary> =back =back =over 4 =item REPLACING =item SEE ALSO =back =head2 TAP::Harness::Beyond, Test::Harness::Beyond - Beyond make test =over 4 =item Beyond make test =over 4 =item Saved State =item Parallel Testing =item Non-Perl Tests =item Mixing it up =item Rolling My Own =item Deeper Customisation =item Callbacks =item Parsing TAP =item Getting Support =back =back =head2 TAP::Harness::Env - Parsing harness related environmental variables where appropriate =over 4 =item VERSION =item SYNOPSIS =item DESCRIPTION =item METHODS create( \%args ) =item ENVIRONMENTAL VARIABLES C<HARNESS_PERL_SWITCHES>, 
C<HARNESS_VERBOSE>, C<HARNESS_SUBCLASS>, C<HARNESS_OPTIONS>, C<< j<n> >>, C<< c >>, C<< a<file.tgz> >>, C<< fPackage-With-Dashes >>, C<HARNESS_TIMER>, C<HARNESS_COLOR>, C<HARNESS_IGNORE_EXIT> =back =head2 TAP::Object - Base class that provides common functionality to all C<TAP::*> modules =over 4 =item VERSION =back =over 4 =item SYNOPSIS =item DESCRIPTION =item METHODS =over 4 =item Class Methods =back =back =over 4 =item Instance Methods =back =head2 TAP::Parser - Parse L<TAP|Test::Harness::TAP> output =over 4 =item VERSION =back =over 4 =item SYNOPSIS =item DESCRIPTION =item METHODS =over 4 =item Class Methods C<source>, C<tap>, C<exec>, C<sources>, C<callback>, C<switches>, C<test_args>, C<spool>, C<merge>, C<grammar_class>, C<result_factory_class>, C<iterator_factory_class> =back =back =over 4 =item Instance Methods =back =over 4 =item INDIVIDUAL RESULTS =over 4 =item Result types Version, Plan, Pragma, Test, Comment, Bailout, Unknown =item Common type methods =item C<plan> methods =item C<pragma> methods =item C<comment> methods =item C<bailout> methods =item C<unknown> methods =item C<test> methods =back =item TOTAL RESULTS =over 4 =item Individual Results =back =back =over 4 =item Pragmas =back =over 4 =item Summary Results =back =over 4 =item C<ignore_exit> =back Misplaced plan, No plan, More than one plan, Test numbers out of sequence =over 4 =item CALLBACKS C<test>, C<version>, C<plan>, C<comment>, C<bailout>, C<yaml>, C<unknown>, C<ELSE>, C<ALL>, C<EOF> =item TAP GRAMMAR =item BACKWARDS COMPATIBILITY =over 4 =item Differences TODO plans, 'Missing' tests =back =item SUBCLASSING =over 4 =item Parser Components option 1, option 2 =back =item ACKNOWLEDGMENTS Michael Schwern, Andy Lester, chromatic, GEOFFR, Shlomi Fish, Torsten Schoenfeld, Jerry Gay, Aristotle, Adam Kennedy, Yves Orton, Adrian Howard, Sean & Lil, Andreas J. 
Koenig, Florian Ragwitz, Corion, Mark Stosberg, Matt Kraai, David Wheeler, Alex Vandiver, Cosimo Streppone, Ville Skyttä =item AUTHORS =item BUGS =item COPYRIGHT & LICENSE =back =head2 TAP::Parser::Aggregator - Aggregate TAP::Parser results =over 4 =item VERSION =back =over 4 =item SYNOPSIS =item DESCRIPTION =item METHODS =over 4 =item Class Methods =back =back =over 4 =item Instance Methods =back =over 4 =item Summary methods failed, parse_errors, passed, planned, skipped, todo, todo_passed, wait, exit =back Failed tests, Parse errors, Bad exit or wait status =over 4 =item See Also =back =head2 TAP::Parser::Grammar - A grammar for the Test Anything Protocol. =over 4 =item VERSION =back =over 4 =item SYNOPSIS =item DESCRIPTION =item METHODS =over 4 =item Class Methods =back =back =over 4 =item Instance Methods =back =over 4 =item TAP GRAMMAR =item SUBCLASSING =item SEE ALSO =back =head2 TAP::Parser::Iterator - Base class for TAP source iterators =over 4 =item VERSION =back =over 4 =item SYNOPSIS =item DESCRIPTION =item METHODS =over 4 =item Class Methods =item Instance Methods =back =back =over 4 =item SUBCLASSING =over 4 =item Example =back =item SEE ALSO =back =head2 TAP::Parser::Iterator::Array - Iterator for array-based TAP sources =over 4 =item VERSION =back =over 4 =item SYNOPSIS =item DESCRIPTION =item METHODS =over 4 =item Class Methods =item Instance Methods =back =back =over 4 =item ATTRIBUTION =item SEE ALSO =back =head2 TAP::Parser::Iterator::Process - Iterator for process-based TAP sources =over 4 =item VERSION =back =over 4 =item SYNOPSIS =item DESCRIPTION =item METHODS =over 4 =item Class Methods =item Instance Methods =back =back =over 4 =item ATTRIBUTION =item SEE ALSO =back =head2 TAP::Parser::Iterator::Stream - Iterator for filehandle-based TAP sources =over 4 =item VERSION =back =over 4 =item SYNOPSIS =item DESCRIPTION =item METHODS =over 4 =item Class Methods =back =back =over 4 =item Instance Methods =back =over 4 =item ATTRIBUTION =item SEE 
ALSO =back =head2 TAP::Parser::IteratorFactory - Figures out which SourceHandler objects to use for a given Source =over 4 =item VERSION =back =over 4 =item SYNOPSIS =item DESCRIPTION =item METHODS =over 4 =item Class Methods =back =back =over 4 =item Instance Methods =back =over 4 =item SUBCLASSING =over 4 =item Example =back =item AUTHORS =item ATTRIBUTION =item SEE ALSO =back =head2 TAP::Parser::Multiplexer - Multiplex multiple TAP::Parsers =over 4 =item VERSION =back =over 4 =item SYNOPSIS =item DESCRIPTION =item METHODS =over 4 =item Class Methods =back =back =over 4 =item Instance Methods =back =over 4 =item See Also =back =head2 TAP::Parser::Result - Base class for TAP::Parser output objects =over 4 =item VERSION =back =over 4 =item SYNOPSIS =over 4 =item DESCRIPTION =item METHODS =back =back =over 4 =item Boolean methods C<is_plan>, C<is_pragma>, C<is_test>, C<is_comment>, C<is_bailout>, C<is_version>, C<is_unknown>, C<is_yaml> =back =over 4 =item SUBCLASSING =over 4 =item Example =back =item SEE ALSO =back =head2 TAP::Parser::Result::Bailout - Bailout result token. =over 4 =item VERSION =back =over 4 =item DESCRIPTION =item OVERRIDDEN METHODS C<as_string> =back =over 4 =item Instance Methods =back =head2 TAP::Parser::Result::Comment - Comment result token. =over 4 =item VERSION =back =over 4 =item DESCRIPTION =item OVERRIDDEN METHODS C<as_string> =back =over 4 =item Instance Methods =back =head2 TAP::Parser::Result::Plan - Plan result token. =over 4 =item VERSION =back =over 4 =item DESCRIPTION =item OVERRIDDEN METHODS C<as_string>, C<raw> =back =over 4 =item Instance Methods =back =head2 TAP::Parser::Result::Pragma - TAP pragma token. =over 4 =item VERSION =back =over 4 =item DESCRIPTION =item OVERRIDDEN METHODS C<as_string>, C<raw> =back =over 4 =item Instance Methods =back =head2 TAP::Parser::Result::Test - Test result token. 
=over 4 =item VERSION =back =over 4 =item DESCRIPTION =item OVERRIDDEN METHODS =over 4 =item Instance Methods =back =back =head2 TAP::Parser::Result::Unknown - Unknown result token. =over 4 =item VERSION =back =over 4 =item DESCRIPTION =item OVERRIDDEN METHODS C<as_string>, C<raw> =back =head2 TAP::Parser::Result::Version - TAP syntax version token. =over 4 =item VERSION =back =over 4 =item DESCRIPTION =item OVERRIDDEN METHODS C<as_string>, C<raw> =back =over 4 =item Instance Methods =back =head2 TAP::Parser::Result::YAML - YAML result token. =over 4 =item VERSION =back =over 4 =item DESCRIPTION =item OVERRIDDEN METHODS C<as_string>, C<raw> =back =over 4 =item Instance Methods =back =head2 TAP::Parser::ResultFactory - Factory for creating TAP::Parser output objects =over 4 =item SYNOPSIS =item VERSION =back =over 4 =item DESCRIPTION =item METHODS =item Class Methods =back =over 4 =item SUBCLASSING =over 4 =item Example =back =item SEE ALSO =back =head2 TAP::Parser::Scheduler - Schedule tests during parallel testing =over 4 =item VERSION =back =over 4 =item SYNOPSIS =item DESCRIPTION =item METHODS =over 4 =item Class Methods =item Rules data structure By default, all tests are eligible to be run in parallel. Specifying any of your own rules removes this one, "First match wins". The first rule that matches a test will be the one that applies, Any test which does not match a rule will be run in sequence at the end of the run, The existence of a rule does not imply selecting a test. You must still specify the tests to run, Specifying a rule to allow tests to run in parallel does not make them run in parallel. You still need to specify the number of parallel C<jobs> in your Harness object =back =back =over 4 =item Instance Methods =back =head2 TAP::Parser::Scheduler::Job - A single testing job. 
=over 4 =item VERSION =back =over 4 =item SYNOPSIS =item DESCRIPTION =item METHODS =over 4 =item Class Methods =back =back =over 4 =item Instance Methods =back =over 4 =item Attributes =back =head2 TAP::Parser::Scheduler::Spinner - A no-op job. =over 4 =item VERSION =back =over 4 =item SYNOPSIS =item DESCRIPTION =item METHODS =over 4 =item Class Methods =back =back =over 4 =item Instance Methods =back =over 4 =item SEE ALSO =back =head2 TAP::Parser::Source - a TAP source & meta data about it =over 4 =item VERSION =back =over 4 =item SYNOPSIS =item DESCRIPTION =item METHODS =over 4 =item Class Methods =back =back =over 4 =item Instance Methods =back =over 4 =item AUTHORS =item SEE ALSO =back =head2 TAP::Parser::SourceHandler - Base class for different TAP source handlers =over 4 =item VERSION =back =over 4 =item SYNOPSIS =item DESCRIPTION =item METHODS =over 4 =item Class Methods =back =back =over 4 =item SUBCLASSING =over 4 =item Example =back =item AUTHORS =item SEE ALSO =back =head2 TAP::Parser::SourceHandler::Executable - Stream output from an executable TAP source =over 4 =item VERSION =back =over 4 =item SYNOPSIS =item DESCRIPTION =item METHODS =over 4 =item Class Methods =back =back =over 4 =item SUBCLASSING =over 4 =item Example =back =item SEE ALSO =back =head2 TAP::Parser::SourceHandler::File - Stream TAP from a text file. =over 4 =item VERSION =back =over 4 =item SYNOPSIS =item DESCRIPTION =item METHODS =over 4 =item Class Methods =back =back =over 4 =item CONFIGURATION =item SUBCLASSING =item SEE ALSO =back =head2 TAP::Parser::SourceHandler::Handle - Stream TAP from an IO::Handle or a GLOB. 
=over 4 =item VERSION =back =over 4 =item SYNOPSIS =item DESCRIPTION =item METHODS =over 4 =item Class Methods =back =back =over 4 =item SUBCLASSING =item SEE ALSO =back =head2 TAP::Parser::SourceHandler::Perl - Stream TAP from a Perl executable =over 4 =item VERSION =back =over 4 =item SYNOPSIS =item DESCRIPTION =item METHODS =over 4 =item Class Methods =back =back =over 4 =item SUBCLASSING =over 4 =item Example =back =item SEE ALSO =back =head2 TAP::Parser::SourceHandler::RawTAP - Stream output from raw TAP in a scalar/array ref. =over 4 =item VERSION =back =over 4 =item SYNOPSIS =item DESCRIPTION =item METHODS =over 4 =item Class Methods =back =back =over 4 =item SUBCLASSING =item SEE ALSO =back =head2 TAP::Parser::YAMLish::Reader - Read YAMLish data from iterator =over 4 =item VERSION =item SYNOPSIS =item DESCRIPTION =item METHODS =over 4 =item Class Methods =item Instance Methods =back =item AUTHOR =item SEE ALSO =item COPYRIGHT =back =head2 TAP::Parser::YAMLish::Writer - Write YAMLish data =over 4 =item VERSION =item SYNOPSIS =item DESCRIPTION =item METHODS =over 4 =item Class Methods =item Instance Methods a reference to a scalar to append YAML to, the handle of an open file, a reference to an array into which YAML will be pushed, a code reference =back =item AUTHOR =item SEE ALSO =item COPYRIGHT =back =head2 Term::ANSIColor - Color screen output using ANSI escape sequences =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item Supported Colors =item Function Interface color(ATTR[, ATTR ...]), colored(STRING, ATTR[, ATTR ...]), colored(ATTR-REF, STRING[, STRING...]), uncolor(ESCAPE), colorstrip(STRING[, STRING ...]), colorvalid(ATTR[, ATTR ...]), coloralias(ALIAS[, ATTR ...]) =item Constant Interface =item The Color Stack =item Supporting CLICOLOR =back =item DIAGNOSTICS Bad color mapping %s, Bad escape sequence %s, Bareword "%s" not allowed while "strict subs" in use, Cannot alias standard color %s, Cannot alias standard color %s in %s, Invalid alias name 
%s, Invalid alias name %s in %s, Invalid attribute name %s, Invalid attribute name %s in %s, Name "%s" used only once: possible typo, No comma allowed after filehandle, No name for escape sequence %s =item ENVIRONMENT ANSI_COLORS_ALIASES, ANSI_COLORS_DISABLED, NO_COLOR =item COMPATIBILITY =item RESTRICTIONS =item NOTES =item AUTHORS =item COPYRIGHT AND LICENSE =item SEE ALSO =back =head2 Term::Cap - Perl termcap interface =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item METHODS =back =back B<Tgetent>, OSPEED, TERM B<Tpad>, B<$string>, B<$cnt>, B<$FH> B<Tputs>, B<$cap>, B<$cnt>, B<$FH> B<Tgoto>, B<$cap>, B<$col>, B<$row>, B<$FH> B<Trequire> =over 4 =item EXAMPLES =item COPYRIGHT AND LICENSE =item AUTHOR =item SEE ALSO =back =head2 Term::Complete - Perl word completion module =over 4 =item SYNOPSIS =item DESCRIPTION E<lt>tabE<gt>, ^D, ^U, E<lt>delE<gt>, E<lt>bsE<gt> =item DIAGNOSTICS =item BUGS =item AUTHOR =back =head2 Term::ReadLine - Perl interface to various C<readline> packages. If no real package is found, substitutes stubs instead of basic functions. =over 4 =item SYNOPSIS =item DESCRIPTION =item Minimal set of supported functions C<ReadLine>, C<new>, C<readline>, C<addhistory>, C<IN>, C<OUT>, C<MinLine>, C<findConsole>, Attribs, C<Features> =item Additional supported functions C<tkRunning>, C<event_loop>, C<ornaments>, C<newTTY> =item EXPORTS =item ENVIRONMENT =back =head2 Test - provides a simple framework for writing test scripts =over 4 =item SYNOPSIS =item DESCRIPTION =item QUICK START GUIDE =over 4 =item Functions C<plan(...)>, C<tests =E<gt> I<number>>, C<todo =E<gt> [I<1,5,14>]>, C<onfail =E<gt> sub { ... }>, C<onfail =E<gt> \&some_sub> =back =back B<_to_value> C<ok(...)> C<skip(I<skip_if_true>, I<args...>)> =over 4 =item TEST TYPES NORMAL TESTS, SKIPPED TESTS, TODO TESTS =item ONFAIL =item BUGS and CAVEATS =item ENVIRONMENT =item NOTE =item SEE ALSO =item AUTHOR =back =head2 Test2 - Framework for writing test tools that all work together. 
=over 4 =item DESCRIPTION =over 4 =item WHAT IS NEW? Easier to test new testing tools, Better diagnostics capabilities, Event driven, More complete API, Support for output other than TAP, Subtest implementation is more sane, Support for threading/forking =back =item GETTING STARTED =back =head2 Test2, This describes the namespace layout for the Test2 ecosystem. Not all the namespaces listed here are part of the Test2 distribution, some are implemented in L<Test2::Suite>. =over 4 =item Test2::Tools:: =item Test2::Plugin:: =item Test2::Bundle:: =item Test2::Require:: =item Test2::Formatter:: =item Test2::Event:: =item Test2::Hub:: =item Test2::IPC:: =item Test2::Util:: =item Test2::API:: =item Test2:: =back =over 4 =item SEE ALSO =item CONTACTING US =item SOURCE =item MAINTAINERS Chad Granum E<lt>exodist@cpan.orgE<gt> =item AUTHORS Chad Granum E<lt>exodist@cpan.orgE<gt> =item COPYRIGHT =back =head2 Test2::API - Primary interface for writing Test2 based testing tools. =over 4 =item ***INTERNALS NOTE*** =item DESCRIPTION =item SYNOPSIS =over 4 =item WRITING A TOOL =item TESTING YOUR TOOLS The event from C<ok(1, "pass")>, The plan event for the subtest, The subtest event itself, with the first 2 events nested inside it as children =item OTHER API FUNCTIONS =back =item MAIN API EXPORTS =over 4 =item context(...) $ctx = context(), $ctx = context(%params), level => $int, wrapped => $int, stack => $stack, hub => $hub, on_init => sub { ... }, on_release => sub { ... } =item release($;$) release $ctx;, release $ctx, ...; =item context_do(&;@) =item no_context(&;$) no_context { ... };, no_context { ... } $hid; =item intercept(&) =item run_subtest(...) 
$NAME, \&CODE, $BUFFERED or \%PARAMS, 'buffered' => $bool, 'inherit_trace' => $bool, 'no_fork' => $bool, @ARGS, Things not affected by this flag, Things that are affected by this flag, Things that are formatter dependent =back =item OTHER API EXPORTS =over 4 =item STATUS AND INITIALIZATION STATE $bool = test2_init_done(), $bool = test2_load_done(), test2_set_is_end(), test2_set_is_end($bool), $bool = test2_get_is_end(), $stack = test2_stack(), $bool = test2_is_testing_done(), test2_ipc_disable, $bool = test2_ipc_disabled, test2_ipc_wait_enable(), test2_ipc_wait_disable(), $bool = test2_ipc_wait_enabled(), $bool = test2_no_wait(), test2_no_wait($bool), $fh = test2_stdout(), $fh = test2_stderr(), test2_reset_io() =item BEHAVIOR HOOKS test2_add_callback_exit(sub { ... }), test2_add_callback_post_load(sub { ... }), test2_add_callback_testing_done(sub { ... }), test2_add_callback_context_acquire(sub { ... }), test2_add_callback_context_init(sub { ... }), test2_add_callback_context_release(sub { ... }), test2_add_callback_pre_subtest(sub { ... }), @list = test2_list_context_acquire_callbacks(), @list = test2_list_context_init_callbacks(), @list = test2_list_context_release_callbacks(), @list = test2_list_exit_callbacks(), @list = test2_list_post_load_callbacks(), @list = test2_list_pre_subtest_callbacks(), test2_add_uuid_via(sub { ... 
}), $sub = test2_add_uuid_via() =item IPC AND CONCURRENCY $bool = test2_has_ipc(), $ipc = test2_ipc(), test2_ipc_add_driver($DRIVER), @drivers = test2_ipc_drivers(), $bool = test2_ipc_polling(), test2_ipc_enable_polling(), test2_ipc_disable_polling(), test2_ipc_enable_shm(), test2_ipc_set_pending($uniq_val), $pending = test2_ipc_get_pending(), $timeout = test2_ipc_get_timeout(), test2_ipc_set_timeout($timeout) =item MANAGING FORMATTERS $formatter = test2_formatter, test2_formatter_set($class_or_instance), @formatters = test2_formatters(), test2_formatter_add($class_or_instance) =back =item OTHER EXAMPLES =item SEE ALSO =item MAGIC =item SOURCE =item MAINTAINERS Chad Granum E<lt>exodist@cpan.orgE<gt> =item AUTHORS Chad Granum E<lt>exodist@cpan.orgE<gt> =item COPYRIGHT =back =head2 Test2::API::Breakage - What breaks at what version =over 4 =item DESCRIPTION =item FUNCTIONS %mod_ver = upgrade_suggested(), %mod_ver = Test2::API::Breakage->upgrade_suggested(), %mod_ver = upgrade_required(), %mod_ver = Test2::API::Breakage->upgrade_required(), %mod_ver = known_broken(), %mod_ver = Test2::API::Breakage->known_broken() =item SOURCE =item MAINTAINERS Chad Granum E<lt>exodist@cpan.orgE<gt> =item AUTHORS Chad Granum E<lt>exodist@cpan.orgE<gt> =item COPYRIGHT =back =head2 Test2::API::Context - Object to represent a testing context. =over 4 =item DESCRIPTION =item SYNOPSIS =item CRITICAL DETAILS you MUST always use the context() sub from Test2::API, You MUST always release the context when done with it, You MUST NOT pass context objects around, You MUST NOT store or cache a context for later, You SHOULD obtain your context as soon as possible in a given tool =item METHODS $ctx->done_testing;, $clone = $ctx->snapshot(), $ctx->release(), $ctx->throw($message), $ctx->alert($message), $stack = $ctx->stack(), $hub = $ctx->hub(), $dbg = $ctx->trace(), $ctx->do_in_context(\&code, @args);, $ctx->restore_error_vars(), $! = $ctx->errno(), $? 
= $ctx->child_error(), $@ = $ctx->eval_error() =over 4 =item EVENT PRODUCTION METHODS $event = $ctx->pass(), $event = $ctx->pass($name), $true = $ctx->pass_and_release(), $true = $ctx->pass_and_release($name), my $event = $ctx->fail(), my $event = $ctx->fail($name), my $event = $ctx->fail($name, @diagnostics), my $false = $ctx->fail_and_release(), my $false = $ctx->fail_and_release($name), my $false = $ctx->fail_and_release($name, @diagnostics), $event = $ctx->ok($bool, $name), $event = $ctx->ok($bool, $name, \@on_fail), $event = $ctx->note($message), $event = $ctx->diag($message), $event = $ctx->plan($max), $event = $ctx->plan(0, 'SKIP', $reason), $event = $ctx->skip($name, $reason);, $event = $ctx->bail($reason), $event = $ctx->send_ev2(%facets), $event = $ctx->build_ev2(%facets), $event = $ctx->send_ev2_and_release($Type, %parameters), $event = $ctx->send_event($Type, %parameters), $event = $ctx->build_event($Type, %parameters), $event = $ctx->send_event_and_release($Type, %parameters) =back =item HOOKS =over 4 =item INIT HOOKS =item RELEASE HOOKS =back =item THIRD PARTY META-DATA =item SOURCE =item MAINTAINERS Chad Granum E<lt>exodist@cpan.orgE<gt> =item AUTHORS Chad Granum E<lt>exodist@cpan.orgE<gt>, Kent Fredric E<lt>kentnl@cpan.orgE<gt> =item COPYRIGHT =back =head2 Test2::API::Instance - Object used by Test2::API under the hood =over 4 =item DESCRIPTION =item SYNOPSIS $pid = $obj->pid, $obj->tid, $obj->reset(), $obj->load(), $bool = $obj->loaded, $arrayref = $obj->post_load_callbacks, $obj->add_post_load_callback(sub { ... }), $hashref = $obj->contexts(), $arrayref = $obj->context_acquire_callbacks, $arrayref = $obj->context_init_callbacks, $arrayref = $obj->context_release_callbacks, $arrayref = $obj->pre_subtest_callbacks, $obj->add_context_init_callback(sub { ... }), $obj->add_context_release_callback(sub { ... }), $obj->add_pre_subtest_callback(sub { ... 
}), $obj->set_exit(), $obj->set_ipc_pending($val), $pending = $obj->get_ipc_pending(), $timeout = $obj->ipc_timeout;, $obj->set_ipc_timeout($timeout);, $drivers = $obj->ipc_drivers, $obj->add_ipc_driver($DRIVER_CLASS), $bool = $obj->ipc_polling, $obj->enable_ipc_polling, $obj->disable_ipc_polling, $bool = $obj->no_wait, $bool = $obj->set_no_wait($bool), $arrayref = $obj->exit_callbacks, $obj->add_exit_callback(sub { ... }), $bool = $obj->finalized, $ipc = $obj->ipc, $obj->ipc_disable, $bool = $obj->ipc_disabled, $stack = $obj->stack, $formatter = $obj->formatter, $bool = $obj->formatter_set(), $obj->add_formatter($class), $obj->add_formatter($obj), $obj->set_add_uuid_via(sub { ... }), $sub = $obj->add_uuid_via() =item SOURCE =item MAINTAINERS Chad Granum E<lt>exodist@cpan.orgE<gt> =item AUTHORS Chad Granum E<lt>exodist@cpan.orgE<gt> =item COPYRIGHT =back =head2 Test2::API::Stack - Object to manage a stack of L<Test2::Hub> instances. =over 4 =item ***INTERNALS NOTE*** =item DESCRIPTION =item SYNOPSIS =item METHODS $stack = Test2::API::Stack->new(), $hub = $stack->new_hub(), $hub = $stack->new_hub(%params), $hub = $stack->new_hub(%params, class => $class), $hub = $stack->top(), $hub = $stack->peek(), $stack->cull, @hubs = $stack->all, $stack->clear, $stack->push($hub), $stack->pop($hub) =item SOURCE =item MAINTAINERS Chad Granum E<lt>exodist@cpan.orgE<gt> =item AUTHORS Chad Granum E<lt>exodist@cpan.orgE<gt> =item COPYRIGHT =back =head2 Test2::Event - Base class for events =over 4 =item DESCRIPTION =item SYNOPSIS =item METHODS =over 4 =item GENERAL $trace = $e->trace, $bool_or_undef = $e->related($e2), $e->add_amnesty({tag => $TAG, details => $DETAILS});, $uuid = $e->uuid, $class = $e->load_facet($name), @classes = $e->FACET_TYPES(), @classes = Test2::Event->FACET_TYPES() =item NEW API $hashref = $e->common_facet_data();, $hashref = $e->facet_data(), $hashref = $e->facets(), @errors = $e->validate_facet_data();, @errors = $e->validate_facet_data(%params);, @errors = 
$e->validate_facet_data(\%facets, %params), @errors = Test2::Event->validate_facet_data(%params), @errors = Test2::Event->validate_facet_data(\%facets, %params), require_facet_class => $BOOL, about => {...}, assert => {...}, control => {...}, meta => {...}, parent => {...}, plan => {...}, trace => {...}, amnesty => [{...}, ...], errors => [{...}, ...], info => [{...}, ...] =item LEGACY API $bool = $e->causes_fail, $bool = $e->increments_count, $e->callback($hub), $num = $e->nested, $bool = $e->global, $code = $e->terminate, $msg = $e->summary, ($count, $directive, $reason) = $e->sets_plan(), $bool = $e->diagnostics, $bool = $e->no_display, $id = $e->in_subtest, $id = $e->subtest_id =back =item THIRD PARTY META-DATA =item SOURCE =item MAINTAINERS Chad Granum E<lt>exodist@cpan.orgE<gt> =item AUTHORS Chad Granum E<lt>exodist@cpan.orgE<gt> =item COPYRIGHT =back =head2 Test2::Event::Bail - Bailout! =over 4 =item DESCRIPTION =item SYNOPSIS =item METHODS $reason = $e->reason =item SOURCE =item MAINTAINERS Chad Granum E<lt>exodist@cpan.orgE<gt> =item AUTHORS Chad Granum E<lt>exodist@cpan.orgE<gt> =item COPYRIGHT =back =head2 Test2::Event::Diag - Diag event type =over 4 =item DESCRIPTION =item SYNOPSIS =item ACCESSORS $diag->message =item SOURCE =item MAINTAINERS Chad Granum E<lt>exodist@cpan.orgE<gt> =item AUTHORS Chad Granum E<lt>exodist@cpan.orgE<gt> =item COPYRIGHT =back =head2 Test2::Event::Encoding - Set the encoding for the output stream =over 4 =item DESCRIPTION =item SYNOPSIS =item METHODS $encoding = $e->encoding =item SOURCE =item MAINTAINERS Chad Granum E<lt>exodist@cpan.orgE<gt> =item AUTHORS Chad Granum E<lt>exodist@cpan.orgE<gt> =item COPYRIGHT =back =head2 Test2::Event::Exception - Exception event =over 4 =item DESCRIPTION =item SYNOPSIS =item METHODS $reason = $e->error =item CAVEATS =item SOURCE =item MAINTAINERS Chad Granum E<lt>exodist@cpan.orgE<gt> =item AUTHORS Chad Granum E<lt>exodist@cpan.orgE<gt> =item COPYRIGHT =back =head2 Test2::Event::Fail -
Event for a simple failed assertion =over 4 =item DESCRIPTION =item SYNOPSIS =item SOURCE =item MAINTAINERS Chad Granum E<lt>exodist@cpan.orgE<gt> =item AUTHORS Chad Granum E<lt>exodist@cpan.orgE<gt> =item COPYRIGHT =back =head2 Test2::Event::Generic - Generic event type. =over 4 =item DESCRIPTION =item SYNOPSIS =item METHODS $e->facet_data($data), $data = $e->facet_data, $e->callback($hub), $e->set_callback(sub { ... }), $bool = $e->causes_fail, $e->set_causes_fail($bool), $bool = $e->diagnostics, $e->set_diagnostics($bool), $bool_or_undef = $e->global, @bool_or_empty = $e->global, $e->set_global($bool_or_undef), $bool = $e->increments_count, $e->set_increments_count($bool), $bool = $e->no_display, $e->set_no_display($bool), @plan = $e->sets_plan, $e->set_sets_plan(\@plan), $summary = $e->summary, $e->set_summary($summary_or_undef), $int_or_undef = $e->terminate, @int_or_empty = $e->terminate, $e->set_terminate($int_or_undef) =item SOURCE =item MAINTAINERS Chad Granum E<lt>exodist@cpan.orgE<gt> =item AUTHORS Chad Granum E<lt>exodist@cpan.orgE<gt> =item COPYRIGHT =back =head2 Test2::Event::Note - Note event type =over 4 =item DESCRIPTION =item SYNOPSIS =item ACCESSORS $note->message =item SOURCE =item MAINTAINERS Chad Granum E<lt>exodist@cpan.orgE<gt> =item AUTHORS Chad Granum E<lt>exodist@cpan.orgE<gt> =item COPYRIGHT =back =head2 Test2::Event::Ok - Ok event type =over 4 =item DESCRIPTION =item SYNOPSIS =item ACCESSORS $rb = $e->pass, $name = $e->name, $b = $e->effective_pass =item SOURCE =item MAINTAINERS Chad Granum E<lt>exodist@cpan.orgE<gt> =item AUTHORS Chad Granum E<lt>exodist@cpan.orgE<gt> =item COPYRIGHT =back =head2 Test2::Event::Pass - Event for a simple passing assertion =over 4 =item DESCRIPTION =item SYNOPSIS =item SOURCE =item MAINTAINERS Chad Granum E<lt>exodist@cpan.orgE<gt> =item AUTHORS Chad Granum E<lt>exodist@cpan.orgE<gt> =item COPYRIGHT =back =head2 Test2::Event::Plan - The event of a plan =over 4 =item DESCRIPTION =item SYNOPSIS =item 
ACCESSORS $num = $plan->max, $dir = $plan->directive, $reason = $plan->reason =item SOURCE =item MAINTAINERS Chad Granum E<lt>exodist@cpan.orgE<gt> =item AUTHORS Chad Granum E<lt>exodist@cpan.orgE<gt> =item COPYRIGHT =back =head2 Test2::Event::Skip - Skip event type =over 4 =item DESCRIPTION =item SYNOPSIS =item ACCESSORS $reason = $e->reason =item SOURCE =item MAINTAINERS Chad Granum E<lt>exodist@cpan.orgE<gt> =item AUTHORS Chad Granum E<lt>exodist@cpan.orgE<gt> =item COPYRIGHT =back =head2 Test2::Event::Subtest - Event for subtest types =over 4 =item DESCRIPTION =item ACCESSORS $arrayref = $e->subevents, $bool = $e->buffered =item SOURCE =item MAINTAINERS Chad Granum E<lt>exodist@cpan.orgE<gt> =item AUTHORS Chad Granum E<lt>exodist@cpan.orgE<gt> =item COPYRIGHT =back =head2 Test2::Event::TAP::Version - Event for TAP version. =over 4 =item DESCRIPTION =item SYNOPSIS =item METHODS $version = $e->version =item SOURCE =item MAINTAINERS Chad Granum E<lt>exodist@cpan.orgE<gt> =item AUTHORS Chad Granum E<lt>exodist@cpan.orgE<gt> =item COPYRIGHT =back =head2 Test2::Event::V2 - Second generation event. 
=over 4 =item DESCRIPTION =item SYNOPSIS =over 4 =item USING A CONTEXT =item USING THE CONSTRUCTOR =back =item METHODS $fd = $e->facet_data(), $about = $e->about(), $trace = $e->trace() =over 4 =item MUTATION $e->add_amnesty({...}), $e->add_hub({...}), $e->set_uuid($UUID), $e->set_trace($trace) =item LEGACY SUPPORT METHODS causes_fail, diagnostics, global, increments_count, no_display, sets_plan, subtest_id, summary, terminate =back =item THIRD PARTY META-DATA =item SOURCE =item MAINTAINERS Chad Granum E<lt>exodist@cpan.orgE<gt> =item AUTHORS Chad Granum E<lt>exodist@cpan.orgE<gt> =item COPYRIGHT =back =head2 Test2::Event::Waiting - Tell all procs/threads it is time to be done =over 4 =item DESCRIPTION =item SOURCE =item MAINTAINERS Chad Granum E<lt>exodist@cpan.orgE<gt> =item AUTHORS Chad Granum E<lt>exodist@cpan.orgE<gt> =item COPYRIGHT =back =head2 Test2::EventFacet - Base class for all event facets. =over 4 =item DESCRIPTION =item METHODS $key = $facet_class->facet_key(), $bool = $facet_class->is_list(), $clone = $facet->clone(), $clone = $facet->clone(%replace) =item SOURCE =item MAINTAINERS Chad Granum E<lt>exodist@cpan.orgE<gt> =item AUTHORS Chad Granum E<lt>exodist@cpan.orgE<gt> =item COPYRIGHT =back =head2 Test2::EventFacet::About - Facet with event details. =over 4 =item DESCRIPTION =item FIELDS $string = $about->{details}, $string = $about->details(), $package = $about->{package}, $package = $about->package(), $bool = $about->{no_display}, $bool = $about->no_display(), $uuid = $about->{uuid}, $uuid = $about->uuid(), $uuid = $about->{eid}, $uuid = $about->eid() =item SOURCE =item MAINTAINERS Chad Granum E<lt>exodist@cpan.orgE<gt> =item AUTHORS Chad Granum E<lt>exodist@cpan.orgE<gt> =item COPYRIGHT =back =head2 Test2::EventFacet::Amnesty - Facet for assertion amnesty. 
=over 4 =item DESCRIPTION =item NOTES =item FIELDS $string = $amnesty->{details}, $string = $amnesty->details(), $short_string = $amnesty->{tag}, $short_string = $amnesty->tag(), $bool = $amnesty->{inherited}, $bool = $amnesty->inherited() =item SOURCE =item MAINTAINERS Chad Granum E<lt>exodist@cpan.orgE<gt> =item AUTHORS Chad Granum E<lt>exodist@cpan.orgE<gt> =item COPYRIGHT =back =head2 Test2::EventFacet::Assert - Facet representing an assertion. =over 4 =item DESCRIPTION =item FIELDS $string = $assert->{details}, $string = $assert->details(), $bool = $assert->{pass}, $bool = $assert->pass(), $bool = $assert->{no_debug}, $bool = $assert->no_debug(), $int = $assert->{number}, $int = $assert->number() =item SOURCE =item MAINTAINERS Chad Granum E<lt>exodist@cpan.orgE<gt> =item AUTHORS Chad Granum E<lt>exodist@cpan.orgE<gt> =item COPYRIGHT =back =head2 Test2::EventFacet::Control - Facet for hub actions and behaviors. =over 4 =item DESCRIPTION =item FIELDS $string = $control->{details}, $string = $control->details(), $bool = $control->{global}, $bool = $control->global(), $exit = $control->{terminate}, $exit = $control->terminate(), $bool = $control->{halt}, $bool = $control->halt(), $bool = $control->{has_callback}, $bool = $control->has_callback(), $encoding = $control->{encoding}, $encoding = $control->encoding(), $phase = $control->{phase}, $phase = $control->phase() =item SOURCE =item MAINTAINERS Chad Granum E<lt>exodist@cpan.orgE<gt> =item AUTHORS Chad Granum E<lt>exodist@cpan.orgE<gt> =item COPYRIGHT =back =head2 Test2::EventFacet::Error - Facet for errors that need to be shown. 
=over 4 =item DESCRIPTION =item NOTES =item FIELDS $string = $error->{details}, $string = $error->details(), $short_string = $error->{tag}, $short_string = $error->tag(), $bool = $error->{fail}, $bool = $error->fail() =item SOURCE =item MAINTAINERS Chad Granum E<lt>exodist@cpan.orgE<gt> =item AUTHORS Chad Granum E<lt>exodist@cpan.orgE<gt> =item COPYRIGHT =back =head2 Test2::EventFacet::Hub - Facet for the hubs an event passes through. =over 4 =item DESCRIPTION =item FACET FIELDS $string = $trace->{details}, $string = $trace->details(), $int = $trace->{pid}, $int = $trace->pid(), $int = $trace->{tid}, $int = $trace->tid(), $hid = $trace->{hid}, $hid = $trace->hid(), $huuid = $trace->{huuid}, $huuid = $trace->huuid(), $int = $trace->{nested}, $int = $trace->nested(), $bool = $trace->{buffered}, $bool = $trace->buffered() =item SOURCE =item MAINTAINERS Chad Granum E<lt>exodist@cpan.orgE<gt> =item AUTHORS Chad Granum E<lt>exodist@cpan.orgE<gt> =item COPYRIGHT =back =head2 Test2::EventFacet::Info - Facet for information a developer might care about. =over 4 =item DESCRIPTION =item NOTES =item FIELDS $string_or_structure = $info->{details}, $string_or_structure = $info->details(), $structure = $info->{table}, $structure = $info->table(), $short_string = $info->{tag}, $short_string = $info->tag(), $bool = $info->{debug}, $bool = $info->debug(), $bool = $info->{important}, $bool = $info->important =item SOURCE =item MAINTAINERS Chad Granum E<lt>exodist@cpan.orgE<gt> =item AUTHORS Chad Granum E<lt>exodist@cpan.orgE<gt> =item COPYRIGHT =back =head2 Test2::EventFacet::Info::Table - Intermediary representation of a table. 
=over 4 =item DESCRIPTION =item SYNOPSIS =item ATTRIBUTES $header_aref = $t->header(), $rows_aref = $t->rows(), $bool = $t->collapse(), $aref = $t->no_collapse(), $str = $t->as_string(), $href = $t->as_hash(), %args = $t->info_args() =item SOURCE =item MAINTAINERS Chad Granum E<lt>exodist@cpan.orgE<gt> =item AUTHORS Chad Granum E<lt>exodist@cpan.orgE<gt> =item COPYRIGHT =back =head2 Test2::EventFacet::Meta - Facet for meta-data =over 4 =item DESCRIPTION =item METHODS AND FIELDS $anything = $meta->{anything}, $anything = $meta->anything() =item SOURCE =item MAINTAINERS Chad Granum E<lt>exodist@cpan.orgE<gt> =item AUTHORS Chad Granum E<lt>exodist@cpan.orgE<gt> =item COPYRIGHT =back =head2 Test2::EventFacet::Parent - Facet for events that contain other events =over 4 =item DESCRIPTION =item FIELDS $string = $parent->{details}, $string = $parent->details(), $hid = $parent->{hid}, $hid = $parent->hid(), $arrayref = $parent->{children}, $arrayref = $parent->children(), $bool = $parent->{buffered}, $bool = $parent->buffered() =item SOURCE =item MAINTAINERS Chad Granum E<lt>exodist@cpan.orgE<gt> =item AUTHORS Chad Granum E<lt>exodist@cpan.orgE<gt> =item COPYRIGHT =back =head2 Test2::EventFacet::Plan - Facet for setting the plan =over 4 =item DESCRIPTION =item FIELDS $string = $plan->{details}, $string = $plan->details(), $positive_int = $plan->{count}, $positive_int = $plan->count(), $bool = $plan->{skip}, $bool = $plan->skip(), $bool = $plan->{none}, $bool = $plan->none() =item SOURCE =item MAINTAINERS Chad Granum E<lt>exodist@cpan.orgE<gt> =item AUTHORS Chad Granum E<lt>exodist@cpan.orgE<gt> =item COPYRIGHT =back =head2 Test2::EventFacet::Render - Facet that dictates how to render an event.
=over 4 =item DESCRIPTION =item FIELDS $string = $render->[#]->{details}, $string = $render->[#]->details(), $string = $render->[#]->{tag}, $string = $render->[#]->tag(), $string = $render->[#]->{facet}, $string = $render->[#]->facet(), $mode = $render->[#]->{mode}, $mode = $render->[#]->mode(), calculated, replace =item SOURCE =item MAINTAINERS Chad Granum E<lt>exodist@cpan.orgE<gt> =item AUTHORS Chad Granum E<lt>exodist@cpan.orgE<gt> =item COPYRIGHT =back =head2 Test2::EventFacet::Trace - Debug information for events =over 4 =item DESCRIPTION =item SYNOPSIS =item FACET FIELDS $string = $trace->{details}, $string = $trace->details(), $frame = $trace->{frame}, $frame = $trace->frame(), $int = $trace->{pid}, $int = $trace->pid(), $int = $trace->{tid}, $int = $trace->tid(), $id = $trace->{cid}, $id = $trace->cid(), $uuid = $trace->{uuid}, $uuid = $trace->uuid() =over 4 =item DISCOURAGED HUB RELATED FIELDS $hid = $trace->{hid}, $hid = $trace->hid(), $huuid = $trace->{huuid}, $huuid = $trace->huuid(), $int = $trace->{nested}, $int = $trace->nested(), $bool = $trace->{buffered}, $bool = $trace->buffered() =back =item METHODS $trace->set_detail($msg), $msg = $trace->detail, $str = $trace->debug, $trace->alert($MESSAGE), $trace->throw($MESSAGE), ($package, $file, $line, $subname) = $trace->call(), $pkg = $trace->package, $file = $trace->file, $line = $trace->line, $subname = $trace->subname, $sig = $trace->signature =item SOURCE =item MAINTAINERS Chad Granum E<lt>exodist@cpan.orgE<gt> =item AUTHORS Chad Granum E<lt>exodist@cpan.orgE<gt> =item COPYRIGHT =back =head2 Test2::Formatter - Namespace for formatters.
=over 4 =item DESCRIPTION =item CREATING FORMATTERS The number of tests that were planned, The number of tests actually seen, The number of tests which failed, A boolean indicating whether or not the test suite passed, A boolean indicating whether or not this call is for a subtest =item SOURCE =item MAINTAINERS Chad Granum E<lt>exodist@cpan.orgE<gt> =item AUTHORS Chad Granum E<lt>exodist@cpan.orgE<gt> =item COPYRIGHT =back =head2 Test2::Formatter::TAP - Standard TAP formatter =over 4 =item DESCRIPTION =item SYNOPSIS =item METHODS $bool = $tap->no_numbers, $tap->set_no_numbers($bool), $arrayref = $tap->handles, $tap->set_handles(\@handles), $encoding = $tap->encoding, $tap->encoding($encoding), $tap->write($e, $num) =item SOURCE =item MAINTAINERS Chad Granum E<lt>exodist@cpan.orgE<gt> =item AUTHORS Chad Granum E<lt>exodist@cpan.orgE<gt>, Kent Fredric E<lt>kentnl@cpan.orgE<gt> =item COPYRIGHT =back =head2 Test2::Hub - The conduit through which all events flow. =over 4 =item SYNOPSIS =item DESCRIPTION =item COMMON TASKS =over 4 =item SENDING EVENTS =item ALTERING OR REMOVING EVENTS =item LISTENING FOR EVENTS =item POST-TEST BEHAVIORS =item SETTING THE FORMATTER =back =item METHODS $hub->send($event), $hub->process($event), $old = $hub->format($formatter), $sub = $hub->listen(sub { ... }, %optional_params), $hub->unlisten($sub), $sub = $hub->filter(sub { ... }, %optional_params), $sub = $hub->pre_filter(sub { ... }, %optional_params), $hub->unfilter($sub), $hub->pre_unfilter($sub), $hub->follow_op(sub { ... }), $sub = $hub->add_context_acquire(sub { ... }), $hub->remove_context_acquire($sub), $sub = $hub->add_context_init(sub { ... }), $hub->remove_context_init($sub), $sub = $hub->add_context_release(sub { ...
}), $hub->remove_context_release($sub), $hub->cull(), $pid = $hub->pid(), $tid = $hub->tid(), $hid = $hub->hid(), $uuid = $hub->uuid(), $ipc = $hub->ipc(), $hub->set_no_ending($bool), $bool = $hub->no_ending, $bool = $hub->active, $hub->set_active($bool) =over 4 =item STATE METHODS $hub->reset_state(), $num = $hub->count, $num = $hub->failed, $bool = $hub->ended, $bool = $hub->is_passing, $hub->is_passing($bool), $hub->plan($plan), $plan = $hub->plan, $bool = $hub->check_plan =back =item THIRD PARTY META-DATA =item SOURCE =item MAINTAINERS Chad Granum E<lt>exodist@cpan.orgE<gt> =item AUTHORS Chad Granum E<lt>exodist@cpan.orgE<gt> =item COPYRIGHT =back =head2 Test2::Hub::Interceptor - Hub used by interceptor to grab results. =over 4 =item SOURCE =item MAINTAINERS Chad Granum E<lt>exodist@cpan.orgE<gt> =item AUTHORS Chad Granum E<lt>exodist@cpan.orgE<gt> =item COPYRIGHT =back =head2 Test2::Hub::Interceptor::Terminator - Exception class used by Test2::Hub::Interceptor =over 4 =item SOURCE =item MAINTAINERS Chad Granum E<lt>exodist@cpan.orgE<gt> =item AUTHORS Chad Granum E<lt>exodist@cpan.orgE<gt> =item COPYRIGHT =back =head2 Test2::Hub::Subtest - Hub used by subtests =over 4 =item DESCRIPTION =item TOGGLES $bool = $hub->manual_skip_all, $hub->set_manual_skip_all($bool) =item SOURCE =item MAINTAINERS Chad Granum E<lt>exodist@cpan.orgE<gt> =item AUTHORS Chad Granum E<lt>exodist@cpan.orgE<gt> =item COPYRIGHT =back =head2 Test2::IPC - Turn on IPC for threading or forking support. =over 4 =item SYNOPSIS =over 4 =item DISABLING IT =back =item EXPORTS cull() =item SOURCE =item MAINTAINERS Chad Granum E<lt>exodist@cpan.orgE<gt> =item AUTHORS Chad Granum E<lt>exodist@cpan.orgE<gt> =item COPYRIGHT =back =head2 Test2::IPC::Driver - Base class for Test2 IPC drivers.
=over 4 =item SYNOPSIS =item METHODS $self->abort($msg), $self->abort_trace($msg) =item LOADING DRIVERS =item WRITING DRIVERS =over 4 =item METHODS SUBCLASSES MUST IMPLEMENT $ipc->is_viable, $ipc->add_hub($hid), $ipc->drop_hub($hid), $ipc->send($hid, $event), $ipc->send($hid, $event, $global), @events = $ipc->cull($hid), $ipc->waiting() =item METHODS SUBCLASSES MAY IMPLEMENT OR OVERRIDE $ipc->driver_abort($msg) =back =item SOURCE =item MAINTAINERS Chad Granum E<lt>exodist@cpan.orgE<gt> =item AUTHORS Chad Granum E<lt>exodist@cpan.orgE<gt> =item COPYRIGHT =back =head2 Test2::IPC::Driver::Files - Temp dir + Files concurrency model. =over 4 =item DESCRIPTION =item SYNOPSIS =item ENVIRONMENT VARIABLES T2_KEEP_TEMPDIR=0, T2_TEMPDIR_TEMPLATE='test2-XXXXXX' =item SEE ALSO =item SOURCE =item MAINTAINERS Chad Granum E<lt>exodist@cpan.orgE<gt> =item AUTHORS Chad Granum E<lt>exodist@cpan.orgE<gt> =item COPYRIGHT =back =head2 Test2::Tools::Tiny - Tiny set of tools for unfortunate souls who cannot use L<Test2::Suite>. =over 4 =item DESCRIPTION =item USE Test2::Suite INSTEAD =item EXPORTS ok($bool, $name), ok($bool, $name, @diag), is($got, $want, $name), is($got, $want, $name, @diag), isnt($got, $do_not_want, $name), isnt($got, $do_not_want, $name, @diag), like($got, $regex, $name), like($got, $regex, $name, @diag), unlike($got, $regex, $name), unlike($got, $regex, $name, @diag), is_deeply($got, $want, $name), is_deeply($got, $want, $name, @diag), diag($msg), note($msg), skip_all($reason), todo $reason => sub { ... }, plan($count), done_testing(), $warnings = warnings { ... }, $exception = exception { ... }, tests $name => sub { ... }, $output = capture { ...
} =item SOURCE =item MAINTAINERS Chad Granum E<lt>exodist@cpan.orgE<gt> =item AUTHORS Chad Granum E<lt>exodist@cpan.orgE<gt> =item COPYRIGHT =back =head2 Test2::Transition - Transition notes when upgrading to Test2 =over 4 =item DESCRIPTION =item THINGS THAT BREAK =over 4 =item Test::Builder1.5/2 conditionals =item Replacing the Test::Builder singleton =item Directly Accessing Hash Elements =item Subtest indentation =back =item DISTRIBUTIONS THAT BREAK OR NEED TO BE UPGRADED =over 4 =item WORKS BUT TESTS WILL FAIL Test::DBIx::Class::Schema, Device::Chip =item UPGRADE SUGGESTED Test::Exception, Data::Peek, circular::require, Test::Module::Used, Test::Moose::More, Test::FITesque, Test::Kit, autouse =item NEED TO UPGRADE Test::SharedFork, Test::Builder::Clutch, Test::Dist::VersionSync, Test::Modern, Test::UseAllModules, Test::More::Prefix =item STILL BROKEN Test::Aggregate, Test::Wrapper, Test::ParallelSubtest, Test::Pretty, Net::BitTorrent, Test::Group, Test::Flatten, Log::Dispatch::Config::TestLog, Test::Able =back =item MAKE ASSERTIONS -> SEND EVENTS =over 4 =item LEGACY =item TEST2 ok($bool, $name), diag(@messages), note(@messages), subtest($name, $code) =back =item WRAP EXISTING TOOLS =over 4 =item LEGACY =item TEST2 =back =item USING UTF8 =over 4 =item LEGACY =item TEST2 =back =item AUTHORS, CONTRIBUTORS AND REVIEWERS Chad Granum (EXODIST) E<lt>exodist@cpan.orgE<gt> =item SOURCE =item MAINTAINER Chad Granum E<lt>exodist@cpan.orgE<gt> =item COPYRIGHT =back =head2 Test2::Util - Tools used by Test2 and friends. =over 4 =item DESCRIPTION =item EXPORTS ($success, $error) = try { ... }, protect { ... }, CAN_FORK, CAN_REALLY_FORK, CAN_THREAD, USE_THREADS, get_tid, my $file = pkg_to_file($package), $string = ipc_separator(), $string = gen_uid(), ($ok, $err) = do_rename($old_name, $new_name), ($ok, $err) = do_unlink($filename), ($ok, $err) = try_sig_mask { ... 
}, SIGINT, SIGALRM, SIGHUP, SIGTERM, SIGUSR1, SIGUSR2 =item NOTES && CAVEATS Devel::Cover =item SOURCE =item MAINTAINERS Chad Granum E<lt>exodist@cpan.orgE<gt> =item AUTHORS Chad Granum E<lt>exodist@cpan.orgE<gt>, Kent Fredric E<lt>kentnl@cpan.orgE<gt> =item COPYRIGHT =back =head2 Test2::Util::ExternalMeta - Allow third party tools to safely attach meta-data to your instances. =over 4 =item DESCRIPTION =item SYNOPSIS =item WHERE IS THE DATA STORED? =item EXPORTS $val = $obj->meta($key), $val = $obj->meta($key, $default), $val = $obj->get_meta($key), $val = $obj->delete_meta($key), $obj->set_meta($key, $val) =item META-KEY RESTRICTIONS =item SOURCE =item MAINTAINERS Chad Granum E<lt>exodist@cpan.orgE<gt> =item AUTHORS Chad Granum E<lt>exodist@cpan.orgE<gt> =item COPYRIGHT =back =head2 Test2::Util::Facets2Legacy - Convert facet data to the legacy event API. =over 4 =item DESCRIPTION =item SYNOPSIS =over 4 =item AS METHODS =item AS FUNCTIONS =back =item NOTE ON CYCLES =item EXPORTS $bool = $e->causes_fail(), $bool = causes_fail($f), $bool = $e->diagnostics(), $bool = diagnostics($f), $bool = $e->global(), $bool = global($f), $bool = $e->increments_count(), $bool = increments_count($f), $bool = $e->no_display(), $bool = no_display($f), ($max, $directive, $reason) = $e->sets_plan(), ($max, $directive, $reason) = sets_plan($f), $id = $e->subtest_id(), $id = subtest_id($f), $string = $e->summary(), $string = summary($f), $undef_or_int = $e->terminate(), $undef_or_int = terminate($f), $uuid = $e->uuid(), $uuid = uuid($f) =item SOURCE =item MAINTAINERS Chad Granum E<lt>exodist@cpan.orgE<gt> =item AUTHORS Chad Granum E<lt>exodist@cpan.orgE<gt> =item COPYRIGHT =back =head2 Test2::Util::HashBase - Build hash based classes. 
=over 4 =item SYNOPSIS =item DESCRIPTION =item THIS IS A BUNDLED COPY OF HASHBASE =item METHODS =over 4 =item PROVIDED BY HASH BASE $it = $class->new(%PAIRS), $it = $class->new(\%PAIRS), $it = $class->new(\@ORDERED_VALUES) =item HOOKS $self->init() =back =item ACCESSORS =over 4 =item READ/WRITE foo(), set_foo(), FOO() =item READ ONLY set_foo() =item DEPRECATED SETTER set_foo() =item NO SETTER =item NO READER =item CONSTANT ONLY =back =item SUBCLASSING =item GETTING A LIST OF ATTRIBUTES FOR A CLASS @list = Test2::Util::HashBase::attr_list($class), @list = $class->Test2::Util::HashBase::attr_list() =item SOURCE =item MAINTAINERS Chad Granum E<lt>exodist@cpan.orgE<gt> =item AUTHORS Chad Granum E<lt>exodist@cpan.orgE<gt> =item COPYRIGHT =back =head2 Test2::Util::Trace - Legacy wrapper for L<Test2::EventFacet::Trace>. =over 4 =item DESCRIPTION =item SOURCE =item MAINTAINERS Chad Granum E<lt>exodist@cpan.orgE<gt> =item AUTHORS Chad Granum E<lt>exodist@cpan.orgE<gt> =item COPYRIGHT =back =head2 Test::Builder - Backend for building test libraries =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item Construction B<new>, B<create>, B<subtest>, B<name>, B<reset> =item Setting up tests B<plan>, B<expected_tests>, B<no_plan>, B<done_testing>, B<has_plan>, B<skip_all>, B<exported_to> =item Running tests B<ok>, B<is_eq>, B<is_num>, B<isnt_eq>, B<isnt_num>, B<like>, B<unlike>, B<cmp_ok> =item Other Testing Methods B<BAIL_OUT>, B<skip>, B<todo_skip>, B<skip_rest> =item Test building utility methods B<maybe_regex>, B<is_fh> =back =back =over 4 =item Test style B<level>, B<use_numbers>, B<no_diag>, B<no_ending>, B<no_header> =item Output B<diag>, B<note>, B<explain>, B<output>, B<failure_output>, B<todo_output>, reset_outputs, carp, croak =item Test Status and Info B<no_log_results>, B<current_test>, B<is_passing>, B<summary>, B<details>, B<todo>, B<find_TODO>, B<in_todo>, B<todo_start>, C<todo_end>, B<caller> =back =over 4 =item EXIT CODES =item THREADS =item MEMORY =item EXAMPLES
=item SEE ALSO =over 4 =item INTERNALS =item LEGACY =item EXTERNAL =back =item AUTHORS =item MAINTAINERS Chad Granum E<lt>exodist@cpan.orgE<gt> =item COPYRIGHT =back =head2 Test::Builder::Formatter - Test::Builder subclass of Test2::Formatter::TAP =over 4 =item DESCRIPTION =item SYNOPSIS =item SOURCE =item MAINTAINERS Chad Granum E<lt>exodist@cpan.orgE<gt> =item AUTHORS Chad Granum E<lt>exodist@cpan.orgE<gt> =item COPYRIGHT =back =head2 Test::Builder::IO::Scalar - A copy of IO::Scalar for Test::Builder =over 4 =item DESCRIPTION =item COPYRIGHT and LICENSE =back =over 4 =item Construction =back new [ARGS...] open [SCALARREF] opened close =over 4 =item Input and output =back flush getc getline getlines print ARGS.. read BUF, NBYTES, [OFFSET] write BUF, NBYTES, [OFFSET] sysread BUF, LEN, [OFFSET] syswrite BUF, NBYTES, [OFFSET] =over 4 =item Seeking/telling and other attributes =back autoflush binmode clearerr eof seek OFFSET, WHENCE sysseek OFFSET, WHENCE tell use_RS [YESNO] setpos POS getpos sref =over 4 =item WARNINGS =item VERSION =item AUTHORS =over 4 =item Primary Maintainer =item Principal author =item Other contributors =back =item SEE ALSO =back =head2 Test::Builder::Module - Base class for test modules =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item Importing =back =back =over 4 =item Builder =back =over 4 =item SEE ALSO =back =head2 Test::Builder::Tester - test testsuites that have been built with Test::Builder =over 4 =item SYNOPSIS =item DESCRIPTION =back =over 4 =item Functions test_out, test_err =back test_fail test_diag test_test, title (synonym 'name', 'label'), skip_out, skip_err line_num color =over 4 =item BUGS =item AUTHOR =item MAINTAINERS Chad Granum E<lt>exodist@cpan.orgE<gt> =item NOTES =item SEE ALSO =back =head2 Test::Builder::Tester::Color - turn on colour in Test::Builder::Tester =over 4 =item SYNOPSIS =item DESCRIPTION =back =over 4 =item AUTHOR =item BUGS =item SEE ALSO =back =head2 Test::Builder::TodoDiag - Test::Builder subclass 
of Test2::Event::Diag =over 4 =item DESCRIPTION =item SYNOPSIS =item SOURCE =item MAINTAINERS Chad Granum E<lt>exodist@cpan.orgE<gt> =item AUTHORS Chad Granum E<lt>exodist@cpan.orgE<gt> =item COPYRIGHT =back =head2 Test::Harness - Run Perl standard test scripts with statistics =over 4 =item VERSION =back =over 4 =item SYNOPSIS =item DESCRIPTION =item FUNCTIONS =over 4 =item runtests( @test_files ) =back =back =over 4 =item execute_tests( tests => \@test_files, out => \*FH ) =back =over 4 =item EXPORT =item ENVIRONMENT VARIABLES THAT TAP::HARNESS::COMPATIBLE SETS C<HARNESS_ACTIVE>, C<HARNESS_VERSION> =item ENVIRONMENT VARIABLES THAT AFFECT TEST::HARNESS C<HARNESS_PERL_SWITCHES>, C<HARNESS_TIMER>, C<HARNESS_VERBOSE>, C<HARNESS_OPTIONS>, C<< j<n> >>, C<< c >>, C<< a<file.tgz> >>, C<< fPackage-With-Dashes >>, C<HARNESS_SUBCLASS>, C<HARNESS_SUMMARY_COLOR_SUCCESS>, C<HARNESS_SUMMARY_COLOR_FAIL> =item Taint Mode =item SEE ALSO =item BUGS =item AUTHORS =item LICENCE AND COPYRIGHT =back =head2 Test::More - yet another framework for writing test scripts =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item I love it when a plan comes together =back =back B<done_testing> =over 4 =item Test names =item I'm ok, you're not ok. B<ok> =back B<is>, B<isnt> B<like> B<unlike> B<cmp_ok> B<can_ok> B<isa_ok> B<new_ok> B<subtest> B<pass>, B<fail> =over 4 =item Module tests B<require_ok> =back B<use_ok> =over 4 =item Complex data structures B<is_deeply> =back =over 4 =item Diagnostics B<diag>, B<note> =back B<explain> =over 4 =item Conditional tests B<SKIP: BLOCK> =back B<TODO: BLOCK>, B<todo_skip> When do I use SKIP vs. TODO? 
=over 4 =item Test control B<BAIL_OUT> =back =over 4 =item Discouraged comparison functions B<eq_array> =back B<eq_hash> B<eq_set> =over 4 =item Extending and Embedding Test::More B<builder> =back =over 4 =item EXIT CODES =item COMPATIBILITY subtests, C<done_testing()>, C<cmp_ok()>, C<new_ok()> C<note()> and C<explain()> =item CAVEATS and NOTES utf8 / "Wide character in print", Overloaded objects, Threads =item HISTORY =item SEE ALSO =over 4 =item ALTERNATIVES =item ADDITIONAL LIBRARIES =item OTHER COMPONENTS =item BUNDLES =back =item AUTHORS =item MAINTAINERS Chad Granum E<lt>exodist@cpan.orgE<gt> =item BUGS =item SOURCE =item COPYRIGHT =back =head2 Test::Simple - Basic utilities for writing tests. =over 4 =item SYNOPSIS =item DESCRIPTION B<ok> =back =over 4 =item EXAMPLE =item CAVEATS =item NOTES =item HISTORY =item SEE ALSO L<Test::More> =item AUTHORS =item MAINTAINERS Chad Granum E<lt>exodist@cpan.orgE<gt> =item COPYRIGHT =back =head2 Test::Tester - Ease testing test modules built with Test::Builder =over 4 =item SYNOPSIS =item DESCRIPTION =item HOW TO USE (THE EASY WAY) =item HOW TO USE (THE HARD WAY) =item TEST RESULTS ok, actual_ok, name, type, reason, diag, depth =item SPACES AND TABS =item COLOUR =item EXPORTED FUNCTIONS =item HOW IT WORKS =item CAVEATS =item SEE ALSO =item AUTHOR =item LICENSE =back =head2 Test::Tester::Capture - Help testing test modules built with Test::Builder =over 4 =item DESCRIPTION =item AUTHOR =item LICENSE =back =head2 Test::Tester::CaptureRunner - Help testing test modules built with Test::Builder =over 4 =item DESCRIPTION =item AUTHOR =item LICENSE =back =head2 Test::Tutorial - A tutorial about writing really basic tests =over 4 =item DESCRIPTION =over 4 =item Nuts and bolts of testing. =item Where to start? =item Names =item Test the manual =item Sometimes the tests are wrong =item Testing lots of values =item Informative names =item Skipping tests =item Todo tests =item Testing with taint mode. 
=back =item FOOTNOTES =item AUTHORS =item MAINTAINERS Chad Granum E<lt>exodist@cpan.orgE<gt> =item COPYRIGHT =back =head2 Test::use::ok - Alternative to Test::More::use_ok =over 4 =item SYNOPSIS =item DESCRIPTION =item SEE ALSO =item MAINTAINER Chad Granum E<lt>exodist@cpan.orgE<gt> =item CC0 1.0 Universal =back =head2 Text::Abbrev - abbrev - create an abbreviation table from a list =over 4 =item SYNOPSIS =item DESCRIPTION =item EXAMPLE =back =head2 Text::Balanced - Extract delimited text sequences from strings. =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item General behaviour in list contexts [0], [1], [2] =item General behaviour in scalar and void contexts =item A note about prefixes =item C<extract_delimited> =item C<extract_bracketed> =item C<extract_variable> [0], [1], [2] =item C<extract_tagged> C<reject =E<gt> $listref>, C<ignore =E<gt> $listref>, C<fail =E<gt> $str>, [0], [1], [2], [3], [4], [5] =item C<gen_extract_tagged> =item C<extract_quotelike> [0], [1], [2], [3], [4], [5], [6], [7], [8], [9], [10] =item C<extract_quotelike> and "here documents" [0], [1], [2], [3], [4], [5], [6], [7..10] =item C<extract_codeblock> =item C<extract_multiple> =item C<gen_delimited_pat> =item C<delimited_pat> =back =item DIAGNOSTICS C<Did not find a suitable bracket: "%s">, C<Did not find prefix: /%s/>, C<Did not find opening bracket after prefix: "%s">, C<No quotelike operator found after prefix: "%s">, C<Unmatched closing bracket: "%c">, C<Unmatched opening bracket(s): "%s">, C<Unmatched embedded quote (%s)>, C<Did not find closing delimiter to match '%s'>, C<Mismatched closing bracket: expected "%c" but found "%s">, C<No block delimiter found after quotelike "%s">, C<Did not find leading dereferencer>, C<Bad identifier after dereferencer>, C<Did not find expected opening bracket at %s>, C<Improperly nested codeblock at %s>, C<Missing second block for quotelike "%s">, C<No match found for opening bracket>, C<Did not find opening tag: /%s/>, C<Unable to construct 
closing tag to match: /%s/>, C<Found invalid nested tag: %s>, C<Found unbalanced nested tag: %s>, C<Did not find closing tag> =item AUTHOR =item BUGS AND IRRITATIONS =item COPYRIGHT =back =head2 Text::ParseWords - parse text into an array of tokens or array of arrays =over 4 =item SYNOPSIS =item DESCRIPTION =item EXAMPLES 0Z<>, 1Z<>, 2Z<>, 3Z<>, 4Z<>, 5Z<> =item SEE ALSO =item AUTHORS =item COPYRIGHT AND LICENSE =back =head2 Text::Tabs - expand and unexpand tabs like unix expand(1) and unexpand(1) =over 4 =item SYNOPSIS =item DESCRIPTION =item EXPORTS expand, unexpand, $tabstop =item EXAMPLE =item SUBVERSION =item BUGS =item LICENSE =back =head2 Text::Wrap - line wrapping to form simple paragraphs =over 4 =item SYNOPSIS =item DESCRIPTION =item OVERRIDES =item EXAMPLES =item SUBVERSION =item SEE ALSO =item AUTHOR =item LICENSE =back =head2 Thread - Manipulate threads in Perl (for old code only) =over 4 =item DEPRECATED =item HISTORY =item SYNOPSIS =item DESCRIPTION =item FUNCTIONS $thread = Thread->new(\&start_sub), $thread = Thread->new(\&start_sub, LIST), lock VARIABLE, async BLOCK;, Thread->self, Thread->list, cond_wait VARIABLE, cond_signal VARIABLE, cond_broadcast VARIABLE, yield =item METHODS join, detach, equal, tid, done =item DEFUNCT lock(\&sub), eval, flags =item SEE ALSO =back =head2 Thread::Queue - Thread-safe queues =over 4 =item VERSION =item SYNOPSIS =item DESCRIPTION Ordinary scalars, Array refs, Hash refs, Scalar refs, Objects based on the above =item QUEUE CREATION ->new(), ->new(LIST) =item BASIC METHODS ->enqueue(LIST), ->dequeue(), ->dequeue(COUNT), ->dequeue_nb(), ->dequeue_nb(COUNT), ->dequeue_timed(TIMEOUT), ->dequeue_timed(TIMEOUT, COUNT), ->pending(), ->limit, ->end() =item ADVANCED METHODS ->peek(), ->peek(INDEX), ->insert(INDEX, LIST), ->extract(), ->extract(INDEX), ->extract(INDEX, COUNT) =item NOTES =item LIMITATIONS =item SEE ALSO =item MAINTAINER =item LICENSE =back =head2 Thread::Semaphore - Thread-safe semaphores =over 4 =item 
VERSION =item SYNOPSIS =item DESCRIPTION =item METHODS ->new(), ->new(NUMBER), ->down(), ->down(NUMBER), ->down_nb(), ->down_nb(NUMBER), ->down_force(), ->down_force(NUMBER), ->down_timed(TIMEOUT), ->down_timed(TIMEOUT, NUMBER), ->up(), ->up(NUMBER) =item NOTES =item SEE ALSO =item MAINTAINER =item LICENSE =back =head2 Tie::Array - base class for tied arrays =over 4 =item SYNOPSIS =item DESCRIPTION TIEARRAY classname, LIST, STORE this, index, value, FETCH this, index, FETCHSIZE this, STORESIZE this, count, EXTEND this, count, EXISTS this, key, DELETE this, key, CLEAR this, DESTROY this, PUSH this, LIST, POP this, SHIFT this, UNSHIFT this, LIST, SPLICE this, offset, length, LIST =item CAVEATS =item AUTHOR =back =head2 Tie::File - Access the lines of a disk file via a Perl array =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item C<recsep> =item C<autochomp> =item C<mode> =item C<memory> =item C<dw_size> =item Option Format =back =item Public Methods =over 4 =item C<flock> =item C<autochomp> =item C<defer>, C<flush>, C<discard>, and C<autodefer> =item C<offset> =back =item Tying to an already-opened filehandle =item Deferred Writing =over 4 =item Autodeferring =back =item CONCURRENT ACCESS TO FILES =item CAVEATS =item SUBCLASSING =item WHAT ABOUT C<DB_File>? 
=item AUTHOR =item LICENSE =item WARRANTY =item THANKS =item TODO =back =head2 Tie::Handle - base class definitions for tied handles =over 4 =item SYNOPSIS =item DESCRIPTION TIEHANDLE classname, LIST, WRITE this, scalar, length, offset, PRINT this, LIST, PRINTF this, format, LIST, READ this, scalar, length, offset, READLINE this, GETC this, CLOSE this, OPEN this, filename, BINMODE this, EOF this, TELL this, SEEK this, offset, whence, DESTROY this =item MORE INFORMATION =item COMPATIBILITY =back =head2 Tie::Hash, Tie::StdHash, Tie::ExtraHash - base class definitions for tied hashes =over 4 =item SYNOPSIS =item DESCRIPTION TIEHASH classname, LIST, STORE this, key, value, FETCH this, key, FIRSTKEY this, NEXTKEY this, lastkey, EXISTS this, key, DELETE this, key, CLEAR this, SCALAR this =item Inheriting from B<Tie::StdHash> =item Inheriting from B<Tie::ExtraHash> =item C<SCALAR>, C<UNTIE> and C<DESTROY> =item MORE INFORMATION =back =head2 Tie::Hash::NamedCapture - Named regexp capture buffers =over 4 =item SYNOPSIS =item DESCRIPTION =item SEE ALSO =back =head2 Tie::Memoize - add data to hash when needed =over 4 =item SYNOPSIS =item DESCRIPTION =item Inheriting from B<Tie::Memoize> =item EXAMPLE =item BUGS =item AUTHOR =back =head2 Tie::RefHash - use references as hash keys =over 4 =item SYNOPSIS =item DESCRIPTION =item EXAMPLE =item THREAD SUPPORT =item STORABLE SUPPORT =item RELIC SUPPORT =item LICENSE =item MAINTAINER =item AUTHOR =item SEE ALSO =back =head2 Tie::Scalar, Tie::StdScalar - base class definitions for tied scalars =over 4 =item SYNOPSIS =item DESCRIPTION TIESCALAR classname, LIST, FETCH this, STORE this, value, DESTROY this =over 4 =item Tie::Scalar vs Tie::StdScalar =back =item MORE INFORMATION =back =head2 Tie::StdHandle - base class definitions for tied handles =over 4 =item SYNOPSIS =item DESCRIPTION =back =head2 Tie::SubstrHash - Fixed-table-size, fixed-key-length hashing =over 4 =item SYNOPSIS =item DESCRIPTION =item CAVEATS =back =head2 Time::HiRes 
- High resolution alarm, sleep, gettimeofday, interval timers =over 4 =item SYNOPSIS =item DESCRIPTION gettimeofday (), usleep ( $useconds ), nanosleep ( $nanoseconds ), ualarm ( $useconds [, $interval_useconds ] ), tv_interval, time (), sleep ( $floating_seconds ), alarm ( $floating_seconds [, $interval_floating_seconds ] ), setitimer ( $which, $floating_seconds [, $interval_floating_seconds ] ), getitimer ( $which ), clock_gettime ( $which ), clock_getres ( $which ), clock_nanosleep ( $which, $nanoseconds, $flags = 0), clock(), stat, stat FH, stat EXPR, lstat, lstat FH, lstat EXPR, utime LIST =item EXAMPLES =item C API =item DIAGNOSTICS =over 4 =item useconds or interval more than ... =item negative time not invented yet =item internal error: useconds < 0 (unsigned ... signed ...) =item useconds or uinterval equal to or more than 1000000 =item unimplemented in this platform =back =item CAVEATS =item SEE ALSO =item AUTHORS =item COPYRIGHT AND LICENSE =back =head2 Time::Local - Efficiently compute time from local and GMT time =over 4 =item VERSION =item SYNOPSIS =item DESCRIPTION =item FUNCTIONS =over 4 =item C<timelocal_modern()> and C<timegm_modern()> =item C<timelocal()> and C<timegm()> =item C<timelocal_nocheck()> and C<timegm_nocheck()> =item Year Value Interpretation =item Limits of time_t =item Ambiguous Local Times (DST) =item Non-Existent Local Times (DST) =item Negative Epoch Values =back =item IMPLEMENTATION =item AUTHORS EMERITUS =item BUGS =item SOURCE =item AUTHOR =item CONTRIBUTORS =item COPYRIGHT AND LICENSE =back =head2 Time::Piece - Object Oriented time objects =over 4 =item SYNOPSIS =item DESCRIPTION =item USAGE =over 4 =item Local Locales =item Date Calculations =item Truncation =item Date Comparisons =item Date Parsing =item YYYY-MM-DDThh:mm:ss =item Week Number =item Global Overriding =back =item CAVEATS =over 4 =item Setting $ENV{TZ} in Threads on Win32 =item Use of epoch seconds =back =item AUTHOR =item COPYRIGHT AND LICENSE =item SEE ALSO 
=item BUGS =back =head2 Time::Seconds - a simple API to convert seconds to other date values =over 4 =item SYNOPSIS =item DESCRIPTION =item METHODS =item AUTHOR =item COPYRIGHT AND LICENSE =item Bugs =back =head2 Time::gmtime - by-name interface to Perl's built-in gmtime() function =over 4 =item SYNOPSIS =item DESCRIPTION =item NOTE =item AUTHOR =back =head2 Time::localtime - by-name interface to Perl's built-in localtime() function =over 4 =item SYNOPSIS =item DESCRIPTION =item NOTE =item AUTHOR =back =head2 Time::tm - internal object used by Time::gmtime and Time::localtime =over 4 =item SYNOPSIS =item DESCRIPTION =item AUTHOR =back =head2 UNIVERSAL - base class for ALL classes (blessed references) =over 4 =item SYNOPSIS =item DESCRIPTION C<< $obj->isa( TYPE ) >>, C<< CLASS->isa( TYPE ) >>, C<< eval { VAL->isa( TYPE ) } >>, C<TYPE>, C<$obj>, C<CLASS>, C<VAL>, C<< $obj->DOES( ROLE ) >>, C<< CLASS->DOES( ROLE ) >>, C<< $obj->can( METHOD ) >>, C<< CLASS->can( METHOD ) >>, C<< eval { VAL->can( METHOD ) } >>, C<VERSION ( [ REQUIRE ] )> =item WARNINGS =item EXPORTS =back =head2 Unicode::Collate - Unicode Collation Algorithm =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item Constructor and Tailoring UCA_Version, alternate, backwards, entry, hangul_terminator, highestFFFF, identical, ignoreChar, ignoreName, ignore_level2, katakana_before_hiragana, level, long_contraction, minimalFFFE, normalization, overrideCJK, overrideHangul, overrideOut, preprocess, rearrange, rewrite, suppress, table, undefChar, undefName, upper_before_lower, variable =item Methods for Collation C<@sorted = $Collator-E<gt>sort(@not_sorted)>, C<$result = $Collator-E<gt>cmp($a, $b)>, C<$result = $Collator-E<gt>eq($a, $b)>, C<$result = $Collator-E<gt>ne($a, $b)>, C<$result = $Collator-E<gt>lt($a, $b)>, C<$result = $Collator-E<gt>le($a, $b)>, C<$result = $Collator-E<gt>gt($a, $b)>, C<$result = $Collator-E<gt>ge($a, $b)>, C<$sortKey = $Collator-E<gt>getSortKey($string)>, C<$sortKeyForm = 
$Collator-E<gt>viewSortKey($string)> =item Methods for Searching C<$position = $Collator-E<gt>index($string, $substring[, $position])>, C<($position, $length) = $Collator-E<gt>index($string, $substring[, $position])>, C<$match_ref = $Collator-E<gt>match($string, $substring)>, C<($match) = $Collator-E<gt>match($string, $substring)>, C<@match = $Collator-E<gt>gmatch($string, $substring)>, C<$count = $Collator-E<gt>subst($string, $substring, $replacement)>, C<$count = $Collator-E<gt>gsubst($string, $substring, $replacement)> =item Other Methods C<%old_tailoring = $Collator-E<gt>change(%new_tailoring)>, C<$modified_collator = $Collator-E<gt>change(%new_tailoring)>, C<$version = $Collator-E<gt>version()>, C<UCA_Version()>, C<Base_Unicode_Version()> =back =item EXPORT =item INSTALL =item CAVEATS Normalization, Conformance Test =item AUTHOR, COPYRIGHT AND LICENSE =item SEE ALSO Unicode Collation Algorithm - UTS #10, The Default Unicode Collation Element Table (DUCET), The conformance test for the UCA, Hangul Syllable Type, Unicode Normalization Forms - UAX #15, Unicode Locale Data Markup Language (LDML) - UTS #35 =back =head2 Unicode::Collate::CJK::Big5 - weighting CJK Unified Ideographs for Unicode::Collate =over 4 =item SYNOPSIS =item DESCRIPTION =item SEE ALSO CLDR - Unicode Common Locale Data Repository, Unicode Locale Data Markup Language (LDML) - UTS #35, L<Unicode::Collate>, L<Unicode::Collate::Locale> =back =head2 Unicode::Collate::CJK::GB2312 - weighting CJK Unified Ideographs for Unicode::Collate =over 4 =item SYNOPSIS =item DESCRIPTION =item CAVEAT =item SEE ALSO CLDR - Unicode Common Locale Data Repository, Unicode Locale Data Markup Language (LDML) - UTS #35, L<Unicode::Collate>, L<Unicode::Collate::Locale> =back =head2 Unicode::Collate::CJK::JISX0208 - weighting JIS KANJI for Unicode::Collate =over 4 =item SYNOPSIS =item DESCRIPTION =item SEE ALSO L<Unicode::Collate>, L<Unicode::Collate::Locale> =back =head2 Unicode::Collate::CJK::Korean - weighting CJK 
Unified Ideographs for Unicode::Collate =over 4 =item SYNOPSIS =item DESCRIPTION =item SEE ALSO CLDR - Unicode Common Locale Data Repository, Unicode Locale Data Markup Language (LDML) - UTS #35, L<Unicode::Collate>, L<Unicode::Collate::Locale> =back =head2 Unicode::Collate::CJK::Pinyin - weighting CJK Unified Ideographs for Unicode::Collate =over 4 =item SYNOPSIS =item DESCRIPTION =item CAVEAT =item SEE ALSO CLDR - Unicode Common Locale Data Repository, Unicode Locale Data Markup Language (LDML) - UTS #35, L<Unicode::Collate>, L<Unicode::Collate::Locale> =back =head2 Unicode::Collate::CJK::Stroke - weighting CJK Unified Ideographs for Unicode::Collate =over 4 =item SYNOPSIS =item DESCRIPTION =item CAVEAT =item SEE ALSO CLDR - Unicode Common Locale Data Repository, Unicode Locale Data Markup Language (LDML) - UTS #35, L<Unicode::Collate>, L<Unicode::Collate::Locale> =back =head2 Unicode::Collate::CJK::Zhuyin - weighting CJK Unified Ideographs for Unicode::Collate =over 4 =item SYNOPSIS =item DESCRIPTION =item CAVEAT =item SEE ALSO CLDR - Unicode Common Locale Data Repository, Unicode Locale Data Markup Language (LDML) - UTS #35, L<Unicode::Collate>, L<Unicode::Collate::Locale> =back =head2 Unicode::Collate::Locale - Linguistic tailoring for DUCET via Unicode::Collate =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item Constructor =item Methods C<$Collator-E<gt>getlocale>, C<$Collator-E<gt>locale_version> =item A list of tailorable locales =item A list of variant codes and their aliases =back =item INSTALL =item CAVEAT Tailoring is not maximum, Collation reordering is not supported =over 4 =item Reference =back =item AUTHOR =item SEE ALSO Unicode Collation Algorithm - UTS #10, The Default Unicode Collation Element Table (DUCET), Unicode Locale Data Markup Language (LDML) - UTS #35, CLDR - Unicode Common Locale Data Repository, L<Unicode::Collate>, L<Unicode::Normalize> =back =head2 Unicode::Normalize - Unicode Normalization Forms =over 4 =item SYNOPSIS =item 
DESCRIPTION =over 4 =item Normalization Forms C<$NFD_string = NFD($string)>, C<$NFC_string = NFC($string)>, C<$NFKD_string = NFKD($string)>, C<$NFKC_string = NFKC($string)>, C<$FCD_string = FCD($string)>, C<$FCC_string = FCC($string)>, C<$normalized_string = normalize($form_name, $string)> =item Decomposition and Composition C<$decomposed_string = decompose($string [, $useCompatMapping])>, C<$reordered_string = reorder($string)>, C<$composed_string = compose($string)>, C<($processed, $unprocessed) = splitOnLastStarter($normalized)>, C<$processed = normalize_partial($form, $unprocessed)>, C<$processed = NFD_partial($unprocessed)>, C<$processed = NFC_partial($unprocessed)>, C<$processed = NFKD_partial($unprocessed)>, C<$processed = NFKC_partial($unprocessed)> =item Quick Check C<$result = checkNFD($string)>, C<$result = checkNFC($string)>, C<$result = checkNFKD($string)>, C<$result = checkNFKC($string)>, C<$result = checkFCD($string)>, C<$result = checkFCC($string)>, C<$result = check($form_name, $string)> =item Character Data C<$canonical_decomposition = getCanon($code_point)>, C<$compatibility_decomposition = getCompat($code_point)>, C<$code_point_composite = getComposite($code_point_here, $code_point_next)>, C<$combining_class = getCombinClass($code_point)>, C<$may_be_composed_with_prev_char = isComp2nd($code_point)>, C<$is_exclusion = isExclusion($code_point)>, C<$is_singleton = isSingleton($code_point)>, C<$is_non_starter_decomposition = isNonStDecomp($code_point)>, C<$is_Full_Composition_Exclusion = isComp_Ex($code_point)>, C<$NFD_is_NO = isNFD_NO($code_point)>, C<$NFC_is_NO = isNFC_NO($code_point)>, C<$NFC_is_MAYBE = isNFC_MAYBE($code_point)>, C<$NFKD_is_NO = isNFKD_NO($code_point)>, C<$NFKC_is_NO = isNFKC_NO($code_point)>, C<$NFKC_is_MAYBE = isNFKC_MAYBE($code_point)> =back =item EXPORT =item CAVEATS Perl's version vs. 
Unicode version, Correction of decomposition mapping, Revised definition of canonical composition =item AUTHOR =item LICENSE =item SEE ALSO L<http://www.unicode.org/reports/tr15/>, L<http://www.unicode.org/Public/UNIDATA/CompositionExclusions.txt>, L<http://www.unicode.org/Public/UNIDATA/DerivedNormalizationProps.txt>, L<http://www.unicode.org/Public/UNIDATA/NormalizationCorrections.txt>, L<http://www.unicode.org/review/pr-29.html>, L<http://www.unicode.org/notes/tn5/> =back =head2 Unicode::UCD - Unicode character database =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item code point argument =back =back =over 4 =item B<charinfo()> B<code>, B<name>, B<category>, B<combining>, B<bidi>, B<decomposition>, B<decimal>, B<digit>, B<numeric>, B<mirrored>, B<unicode10>, B<comment>, B<upper>, B<lower>, B<title>, B<block>, B<script> =back =over 4 =item B<charprop()> Block, Decomposition_Mapping, Name_Alias, Numeric_Value, Script_Extensions =back =over 4 =item B<charprops_all()> =back =over 4 =item B<charblock()> =back =over 4 =item B<charscript()> =back =over 4 =item B<charblocks()> =back =over 4 =item B<charscripts()> =back =over 4 =item B<charinrange()> =back =over 4 =item B<general_categories()> =back =over 4 =item B<bidi_types()> =back =over 4 =item B<compexcl()> =back =over 4 =item B<casefold()> B<code>, B<full>, B<simple>, B<mapping>, B<status>, Z<>B<*> If you use this C<I> mapping, Z<>B<*> If you exclude this C<I> mapping, B<turkic> =back =over 4 =item B<all_casefolds()> =back =over 4 =item B<casespec()> B<code>, B<lower>, B<title>, B<upper>, B<condition> =back =over 4 =item B<namedseq()> =back =over 4 =item B<num()> =back =over 4 =item B<prop_aliases()> =back =over 4 =item B<prop_values()> =back =over 4 =item B<prop_value_aliases()> =back =over 4 =item B<prop_invlist()> =back =over 4 =item B<prop_invmap()> B<C<s>>, B<C<sl>>, C<correction>, C<control>, C<alternate>, C<figment>, C<abbreviation>, B<C<a>>, B<C<al>>, B<C<ae>>, B<C<ale>>, B<C<ar>>, B<C<n>>, B<C<ad>> 
=back =over 4 =item B<search_invlist()> =back =over 4 =item Unicode::UCD::UnicodeVersion =back =over 4 =item B<Blocks versus Scripts> =item B<Matching Scripts and Blocks> =item Old-style versus new-style block names =item Use with older Unicode versions =back =over 4 =item AUTHOR =back =head2 User::grent - by-name interface to Perl's built-in getgr*() functions =over 4 =item SYNOPSIS =item DESCRIPTION =item NOTE =item AUTHOR =back =head2 User::pwent - by-name interface to Perl's built-in getpw*() functions =over 4 =item SYNOPSIS =item DESCRIPTION =over 4 =item System Specifics =back =item NOTE =item AUTHOR =item HISTORY March 18th, 2000 =back =head2 XSLoader - Dynamically load C libraries into Perl code =over 4 =item VERSION =item SYNOPSIS =item DESCRIPTION =over 4 =item Migration from C<DynaLoader> =item Backward compatible boilerplate =back =item Order of initialization: early load() =over 4 =item The most hairy case =back =item DIAGNOSTICS C<Can't find '%s' symbol in %s>, C<Can't load '%s' for module %s: %s>, C<Undefined symbols present after loading %s: %s> =item LIMITATIONS =item KNOWN BUGS =item BUGS =item SEE ALSO =item AUTHORS =item COPYRIGHT & LICENSE =back =head1 AUXILIARY DOCUMENTATION Here should be listed all the extra programs' documentation, but they don't all have manual pages yet: =over 4 =item h2ph =item h2xs =item perlbug =item pl2pm =item pod2html =item pod2man =item splain =item xsubpp =back =head1 AUTHOR Larry Wall <F<larry@wall.org>>, with the help of oodles of other folks.