CPAN stands for Comprehensive Perl Archive Network. It is a large archive of software written in Perl, together with its documentation. It has a presence on the World Wide Web at www.cpan.org and is mirrored worldwide. The name also denotes the script that acts as a package manager.
Modules
The CPAN's main purpose is to help programmers easily locate modules and scripts not included in the Perl standard distribution. It is also used to distribute new versions of Perl, as well as related projects, such as Parrot.
The CPAN is an important resource for the professional Perl programmer. With over 10,000 modules (containing 20,000,000 lines of code) the CPAN can save programmers weeks of time, and large Perl programs often make use of dozens of modules. Some of them, such as the DBI family of modules used for interfacing with SQL databases, are nearly irreplaceable in their area of functionality; others, such as the List::Util module, are simply handy resources containing a few common functions.
The CPAN's role
Files on the CPAN are referred to as distributions. A distribution may consist of one or more modules, documentation files, or scripts packaged in a common archiving format, such as a gzipped tar archive or a PKWARE ZIP file. Distributions will often contain installation scripts (usually called Makefile.PL or Build.PL) and test scripts which can be run to verify the contents of the distribution are functioning properly.
In 2003 distributions started to include metadata files, called META.yml, indicating the distribution's name, version, dependencies, and other useful information; however, not all distributions contain metadata. When metadata is not present in a distribution, the PAUSE's software will usually try to analyze the code in the distribution to look for the same information; this is not necessarily very reliable. (See the Uploading Distributions with PAUSE section for more.)
With thousands of distributions, CPAN needs to be structured to be useful. Distributions on the CPAN are divided into 24 broad chapters based on their purpose, such as Internationalization and Locale; Archiving, Compression, And Conversion; and Mail and Usenet News. Distributions can also be browsed by author. Finally, the natural hierarchy of Perl module names (such as Apache::DBI or Lingua::EN::Inflect) can sometimes be used to browse modules in the CPAN.
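The browsing by module-name hierarchy mentioned above can be pictured as a nested tree keyed on the parts between each `::`. The following is a minimal sketch in Python, not the code used by any CPAN site; the function name `namespace_tree` is an assumption made for illustration, and the module names are taken from this article.

```python
def namespace_tree(module_names):
    """Group Perl module names into a nested dict keyed by namespace part."""
    tree = {}
    for name in module_names:
        node = tree
        # "Lingua::EN::Inflect" becomes the path Lingua -> EN -> Inflect.
        for part in name.split("::"):
            node = node.setdefault(part, {})
    return tree


tree = namespace_tree(["Apache::DBI", "Lingua::EN::Inflect", "List::Util"])
print(sorted(tree))            # ['Apache', 'Lingua', 'List']
print(sorted(tree["Lingua"]))  # ['EN']
```

A browser built on such a tree would list the top-level names first and descend one level per click.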
CPAN module distributions usually have names in the form of CGI-Application-3.1 (where the :: used in the module's name has been replaced with a dash, and the version number has been appended to the name), but this is only a convention; many prominent distributions break the convention, especially those that contain multiple modules. Security restrictions prevent a distribution from ever being replaced, so virtually all distribution names do include a version number.
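The naming convention described above is mechanical enough to sketch in a few lines of Python. The function name `distribution_name` is hypothetical, and because the convention is not enforced, real distributions may not follow it.

```python
def distribution_name(module_name: str, version: str) -> str:
    """Build a conventional distribution name from a module name and version."""
    # Replace the '::' namespace separator with a dash, then append the version.
    return module_name.replace("::", "-") + "-" + version


print(distribution_name("CGI::Application", "3.1"))  # CGI-Application-3.1
```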
CPAN structure
Components of CPAN
The heart of the CPAN is its worldwide network of mirrors. The CPAN master site, ftp.funet.fi, has over 280 public mirrors in 60 countries. Each mirror holds a copy of the more than 3.1 gigabytes of data in the CPAN.
Most mirrors update themselves daily from the CPAN master site. Some update two times a day or even hourly, and a few update from other mirrors. Some sites are major FTP servers which mirror lots of other software, but others are simply servers owned by companies that use Perl heavily. There are at least two mirrors on every continent except Antarctica.
For more information on CPAN mirrors, see mirrors.cpan.org.
The CPAN mirrors
Several search engines have been written to help Perl programmers sort through the CPAN. The most popular is search.cpan.org, which includes textual search, a browsable index of modules, and extracted copies of all distributions currently on the CPAN. Another popular search engine is cpan.uwinnipeg.ca.
Other supporting websites
There is also a Perl core module named CPAN; it's usually differentiated from the repository itself by calling it CPAN.pm. CPAN.pm is mainly an interactive shell which can be used to search for, download, and install distributions. A launch script called cpan is also provided in the Perl core, and is the usual way of running CPAN.pm. After a short configuration process and mirror selection, it uses tools available on the user's computer to automatically download, unpack, compile, test, and install modules. It is also capable of updating itself.
Recently, an effort to replace CPAN.pm with something cleaner and more modern has resulted in the CPANPLUS or CPAN++ set of modules. CPANPLUS more cleanly separates the back-end work of downloading, compiling, and installing modules from the interactive shell used to issue commands. It also supports several advanced features, such as cryptographic signature checking and test result reporting. Finally, CPANPLUS can uninstall a distribution. CPANPLUS is expected to replace CPAN.pm in the core distribution in Perl 5.10.
Both modules can check a distribution's dependencies and are capable of automatically (or with the user's approval) recursively installing any prerequisites. Both support FTP and HTTP and can work through firewalls and proxies.
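Recursive installation of prerequisites amounts to a depth-first walk of the dependency graph in which each distribution is installed only after everything it depends on. The sketch below illustrates that idea in Python; it is not how CPAN.pm or CPANPLUS is actually implemented, and the function name and the distribution names (`App-Example`, `Lib-A`, and so on) are invented for illustration.

```python
def install_order(target, prerequisites):
    """Return distributions in an order where each follows its prerequisites."""
    order = []
    seen = set()

    def visit(name):
        if name in seen:
            return  # already scheduled; avoids repeats and endless loops
        seen.add(name)
        for dep in prerequisites.get(name, []):
            visit(dep)  # schedule prerequisites first
        order.append(name)

    visit(target)
    return order


# Hypothetical dependency data, invented for illustration only.
prerequisites = {
    "App-Example": ["Lib-A", "Lib-B"],
    "Lib-A": ["Lib-Core"],
    "Lib-B": ["Lib-Core"],
}
print(install_order("App-Example", prerequisites))
# ['Lib-Core', 'Lib-A', 'Lib-B', 'App-Example']
```

This sketch does not report circular dependencies; it simply avoids visiting the same name twice.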
CPAN.pm and CPANPLUS
Authors can upload new distributions to the CPAN through the Perl Authors Upload Server (PAUSE). To do so, they must register for a PAUSE account. PAUSE accounts have a 3-9 character username consisting of uppercase letters only--no numbers, no lowercase, no punctuation. They also give their full name in their native language, an e-mail address, an optional web address, and a "short description of what [they]'re planning to contribute" to the CPAN.
Registration is not immediate, and typically takes a week.
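The username rule stated above (3 to 9 characters, uppercase letters only) can be expressed as a regular expression. This Python sketch encodes only the rule as this article describes it, and PAUSE's own validation may differ. The names `PAUSE_ID_PATTERN` and `looks_like_pause_id` are assumptions, and `EXAMPLE` is a made-up ID.

```python
import re

# 3 to 9 characters, uppercase ASCII letters only, per the rule described above.
PAUSE_ID_PATTERN = re.compile(r"[A-Z]{3,9}")


def looks_like_pause_id(candidate: str) -> bool:
    """Check a candidate username against the rule described in the text."""
    return PAUSE_ID_PATTERN.fullmatch(candidate) is not None


print(looks_like_pause_id("EXAMPLE"))  # True
print(looks_like_pause_id("ab1"))      # False
```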
Once registered, the new PAUSE account has a directory in the CPAN under authors/id/(first letter)/(first two letters)/(author ID). Authors may use a Web interface to upload files to their directory and to delete them. The PAUSE will warn an administrator if a user uploads a module that already exists, unless that user is listed as a co-maintainer of the module. Co-maintainers can be designated through PAUSE's web interface.
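The author directory layout described above follows directly from the author ID. Here is a minimal Python sketch, assuming the ID has at least two characters; the function name `author_directory` and the ID `EXAMPLE` are hypothetical.

```python
def author_directory(author_id: str) -> str:
    """Build the CPAN directory path for a PAUSE author ID."""
    # Layout: authors/id/(first letter)/(first two letters)/(author ID)
    return "authors/id/{}/{}/{}".format(author_id[0], author_id[:2], author_id)


print(author_directory("EXAMPLE"))  # authors/id/E/EX/EXAMPLE
```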
Uploading distributions with PAUSE
Experienced Perl programmers often comment that half of Perl's power is in the CPAN. Though the TeX typesetting language has an equivalent, the CTAN (and in fact the CPAN's name is based on the CTAN), few languages have an exhaustive central repository for libraries. The PHP language has PECL (PHP Extension Community Library) and PEAR (PHP Extension and Application Repository), and Python has PyPI (the Python Package Index), but none of these is as large or as active as the CPAN. Other major languages, such as Java and C++, do not have anything similar to the CPAN (though for Java there is a central Maven repository, which in some ways resembles the CPAN).
The CPAN has grown so large and comprehensive over the years that many people learning Perl seem to elevate it to a sort of mythical status, and express surprise when they begin to encounter topics for which a CPAN module doesn't exist already.
The CPAN's influence on Perl's eclectic culture should not be underestimated either. As a hive of activity in the Perl world, the CPAN both shapes and is shaped by Perl culture. Its "self-appointed master librarian", Jarkko Hietaniemi, often takes part in the April Fools' Day jokes so popular on the Internet; on 1 April 2002 the site was temporarily renamed CJAN, where the "J" stood for "Java". In 2003, the www.cpan.org domain name was redirected to Matt's Script Archive, a site infamous in the Perl community for its badly written code.
Beyond April Fools', however, some of the distributions on the CPAN are jokes in themselves. The Acme:: hierarchy is reserved for joke modules; for instance, Acme::Don't adds a don't function that doesn't run the code given to it (to complement the do built-in, which does). Even outside the Acme:: hierarchy, some modules are still written largely for amusement; one example is Lingua::Romana::Perligata, which can be used to write Perl programs in a subset of Latin.
The CPAN's Influence
Over the years, the CPAN has had a range of unusual, yet legitimate, non-Perl things uploaded to it.
The following are just a few examples.
DBD::SQLite - The complete C code for the SQLite database.
PITA::Test::Image::Qemu - A fully working (if small) Linux distribution.
Religion::Islam::Quran - The entire Muslim holy book, the Quran, in 5 different languages.
Derivative Works
CRAN
CTAN
JSAN
CJAN
RubyGems (the Ruby equivalent)