NYTProf Performance Profile
For ddd2.pl
  Run on Tue May 25 16:52:24 2010
Reported on Tue May 25 16:57:06 2010

File /project/perl/lib/LWP.pm
Statements Executed 4
Statement Execution Time 30µs
Subroutines, ordered by exclusive time
Calls  P  F  Exclusive Time  Inclusive Time  Subroutine
0      0  0  0s              0s              LWP::Version
Line  Statements  Time on line  Calls  Time in subs  Code
1                                                    #
2                                                    # $Id: LWP.pm,v 1.149 2005/12/08 12:06:22 gisle Exp $
3
4                                                    package LWP;
5
6     1           6µs                                $VERSION = "5.805";
7                                                    sub Version { $VERSION; }
8
9     1           4µs                                require 5.005;
10    1           6µs                                require LWP::UserAgent; # this should load everything you need
11
12    1           14µs                               1;
13
14                                                   __END__
15
16=head1 NAME
17
18LWP - The World-Wide Web library for Perl
19
20=head1 SYNOPSIS
21
22 use LWP;
23 print "This is libwww-perl-$LWP::VERSION\n";
24
25
26=head1 DESCRIPTION
27
28The libwww-perl collection is a set of Perl modules which provides a
29simple and consistent application programming interface (API) to the
30World-Wide Web. The main focus of the library is to provide classes
31and functions that allow you to write WWW clients. The library also
32contains modules that are of more general use and even classes that
33help you implement simple HTTP servers.
34
35Most modules in this library provide an object oriented API. The user
36agent, requests sent and responses received from the WWW server are
37all represented by objects. This makes a simple and powerful
38interface to these services. The interface is easy to extend
39and customize for your own needs.
40
41The main features of the library are:
42
43=over 3
44
45=item *
46
47Contains various reusable components (modules) that can be
48used separately or together.
49
50=item *
51
52Provides an object oriented model of HTTP-style communication. Within
53this framework we currently support access to http, https, gopher, ftp, news,
54file, and mailto resources.
55
56=item *
57
58Provides a full object oriented interface or
59a very simple procedural interface.
60
61=item *
62
63Supports the basic and digest authorization schemes.
64
65=item *
66
67Supports transparent redirect handling.
68
69=item *
70
71Supports access through proxy servers.
72
73=item *
74
75Provides a parser for F<robots.txt> files and a framework for constructing robots.
76
77=item *
78
79Supports parsing of HTML forms.
80
81=item *
82
83Implements an HTTP content negotiation algorithm that can
84be used both in protocol modules and in server scripts (like CGI
85scripts).
86
87=item *
88
89Supports HTTP cookies.
90
91=item *
92
93Includes some simple command line clients, for instance C<lwp-request> and C<lwp-download>.
94
95=back
96
97
98=head1 HTTP STYLE COMMUNICATION
99
100
101The libwww-perl library is based on HTTP style communication. This
102section tries to describe what that means.
103
104Let us start with this quote from the HTTP specification document
105<URL:http://www.w3.org/pub/WWW/Protocols/>:
106
107=over 3
108
109=item
110
111The HTTP protocol is based on a request/response paradigm. A client
112establishes a connection with a server and sends a request to the
113server in the form of a request method, URI, and protocol version,
114followed by a MIME-like message containing request modifiers, client
115information, and possible body content. The server responds with a
116status line, including the message's protocol version and a success or
117error code, followed by a MIME-like message containing server
118information, entity meta-information, and possible body content.
119
120=back
121
122What this means to libwww-perl is that communication always takes place
123through these steps: First a I<request> object is created and
124configured. This object is then passed to a server and we get a
125I<response> object in return that we can examine. A request is always
126independent of any previous requests, i.e. the service is stateless.
127The same simple model is used for any kind of service we want to
128access.
129
130For example, if we want to fetch a document from a remote file server,
131then we send it a request that contains a name for that document and
132the response will contain the document itself. If we access a search
133engine, then the content of the request will contain the query
134parameters and the response will contain the query result. If we want
135to send a mail message to somebody then we send a request object which
136contains our message to the mail server and the response object will
137contain an acknowledgment that tells us that the message has been
138accepted and will be forwarded to the recipient(s).
139
140It is as simple as that!
141
142
143=head2 The Request Object
144
145The libwww-perl request object has the class name C<HTTP::Request>.
146The fact that the class name uses C<HTTP::> as a
147prefix only implies that we use the HTTP model of communication. It
148does not limit the kind of services we can try to pass this I<request>
149to. For instance, we will send C<HTTP::Request>s both to ftp and
150gopher servers, as well as to the local file system.
151
152The main attributes of the request objects are:
153
154=over 3
155
156=item *
157
158The B<method> is a short string that tells what kind of
159request this is. The most common methods are B<GET>, B<PUT>,
160B<POST> and B<HEAD>.
161
162=item *
163
164The B<uri> is a string denoting the protocol, server and
165the name of the "document" we want to access. The B<uri> might
166also encode various other parameters.
167
168=item *
169
170The B<headers> contain additional information about the
171request and can also be used to describe the content. The headers
172are a set of keyword/value pairs.
173
174=item *
175
176The B<content> is an arbitrary amount of data.
177
178=back
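
The four request attributes listed above can be set and read back without sending anything over the network. The following is a minimal sketch; the URL and the header value are placeholders chosen for illustration only.

```perl
use strict;
use warnings;
use HTTP::Request;

# Build a request by hand. The URL is a placeholder; nothing is sent.
my $req = HTTP::Request->new(GET => 'http://www.example.com/index.html');
$req->header(Accept => 'text/html');   # one keyword/value header pair
$req->content('');                     # a GET normally carries no content

# Read the four main attributes back.
my $method  = $req->method;            # the request method string
my $uri     = $req->uri;               # a URI object that stringifies to the URL
my $accept  = $req->header('Accept');  # value of a single header
my $content = $req->content;           # the (empty) content

print "$method $uri\n";
print $req->headers->as_string;
```

Calling C<< $req->as_string >> prints the whole request in one go, which is handy when debugging.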
179
180=head2 The Response Object
181
182The libwww-perl response object has the class name C<HTTP::Response>.
183The main attributes of objects of this class are:
184
185=over 3
186
187=item *
188
189The B<code> is a numerical value that indicates the overall
190outcome of the request.
191
192=item *
193
194The B<message> is a short, human readable string that
195corresponds to the I<code>.
196
197=item *
198
199The B<headers> contain additional information about the
200response and describe the content.
201
202=item *
203
204The B<content> is an arbitrary amount of data.
205
206=back
207
208Since we don't want to handle all possible I<code> values directly in
209our programs, a libwww-perl response object has methods that can be
210used to query what kind of response this is. The most commonly used
211response classification methods are:
212
213=over 3
214
215=item is_success()
216
217The request was successfully received, understood or accepted.
218
219=item is_error()
220
221The request failed. The server or the resource might not be
222available, access to the resource might be denied or other things might
223have failed for some reason.
224
225=back
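
As a rough illustration of the classification methods, the sketch below builds two response objects by hand. In a real program the responses would come back from a user agent; constructing them directly is only an assumption made here to keep the example self-contained.

```perl
use strict;
use warnings;
use HTTP::Response;

# Hand-made responses, purely to demonstrate the classification methods.
my $ok      = HTTP::Response->new(200, 'OK');
my $missing = HTTP::Response->new(404, 'Not Found');

for my $res ($ok, $missing) {
    if ($res->is_success) {
        print "success: ", $res->status_line, "\n";
    }
    elsif ($res->is_error) {
        print "error: ", $res->status_line, "\n";
    }
}
```

Testing C<is_success> first and falling back to C<status_line> for reporting avoids comparing against individual I<code> values.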
226
227=head2 The User Agent
228
229Let us assume that we have created a I<request> object. What do we
230actually do with it in order to receive a I<response>?
231
232The answer is that you pass it to a I<user agent> object and this
233object takes care of all the things that need to be done
234(like low-level communication and error handling) and returns
235a I<response> object. The user agent represents your
236application on the network and provides you with an interface that
237can accept I<requests> and return I<responses>.
238
239The user agent is an interface layer between
240your application code and the network. Through this interface you are
241able to access the various servers on the network.
242
243The class name for the user agent is C<LWP::UserAgent>. Every
244libwww-perl application that wants to communicate should create at
245least one object of this class. The main method provided by this
246object is request(). This method takes an C<HTTP::Request> object as
247argument and (eventually) returns an C<HTTP::Response> object.
248
249The user agent has many other attributes that let you
250configure how it will interact with the network and with your
251application.
252
253=over 3
254
255=item *
256
257The B<timeout> specifies how much time we give remote servers to
258respond before the library disconnects and creates an
259internal I<timeout> response.
260
261=item *
262
263The B<agent> specifies the name that your application should use when it
264presents itself on the network.
265
266=item *
267
268The B<from> attribute can be set to the e-mail address of the person
269responsible for running the application. If this is set, then the
270address will be sent to the servers with every request.
271
272=item *
273
274The B<parse_head> specifies whether we should initialize response
275headers from the E<lt>head> section of HTML documents.
276
277=item *
278
279The B<proxy> and B<no_proxy> attributes specify if and when to go through
280a proxy server. <URL:http://www.w3.org/pub/WWW/Proxies/>
281
282=item *
283
284The B<credentials> provide a way to set up user names and
285passwords needed to access certain services.
286
287=back
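
The attributes above are set through methods of the same name on the user agent object. The sketch below shows one plausible configuration; the e-mail address, proxy URL, host names, realm and password are all made-up values, not defaults of the library.

```perl
use strict;
use warnings;
use LWP::UserAgent;

my $ua = LWP::UserAgent->new;
$ua->timeout(30);                     # seconds to wait for a server
$ua->agent('MyApp/0.1');              # name presented to servers
$ua->from('webmaster@example.com');   # hypothetical contact address
$ua->parse_head(0);                   # do not take headers from <head>

# Hypothetical proxy for http URLs, with one host reached directly.
$ua->proxy(http => 'http://proxy.example.com:8080/');
$ua->no_proxy('localhost');

# Hypothetical user name and password for one server and realm.
$ua->credentials('www.example.com:80', 'Members', 'user', 'secret');

print "agent: ", $ua->agent, ", timeout: ", $ua->timeout, "\n";
```

Each of these methods also returns the current value when called without arguments, so the same calls can be used to inspect a user agent.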
288
289Many applications want even more control over how they interact
290with the network and they get this by sub-classing
291C<LWP::UserAgent>. The library includes a
292sub-class, C<LWP::RobotUA>, for robot applications.
293
294=head2 An Example
295
296This example shows how the user agent, a request and a response are
297represented in actual perl code:
298
299 # Create a user agent object
300 use LWP::UserAgent;
301 $ua = LWP::UserAgent->new;
302 $ua->agent("MyApp/0.1 ");
303
304 # Create a request
305 my $req = HTTP::Request->new(POST => 'http://search.cpan.org/search');
306 $req->content_type('application/x-www-form-urlencoded');
307 $req->content('query=libwww-perl&mode=dist');
308
309 # Pass request to the user agent and get a response back
310 my $res = $ua->request($req);
311
312 # Check the outcome of the response
313 if ($res->is_success) {
314     print $res->content;
315 }
316 else {
317     print $res->status_line, "\n";
318 }
319
320The $ua is created once when the application starts up. New request
321objects should normally be created for each request sent.
322
323
324=head1 NETWORK SUPPORT
325
326This section discusses the various protocol schemes and
327the HTTP style methods and headers that may be used for each.
328
329For all requests, a "User-Agent" header is added and initialized from
330the $ua->agent attribute before the request is handed to the network
331layer. In the same way, a "From" header is initialized from the
332$ua->from attribute.
333
334For all responses, the library adds a header called "Client-Date".
335This header holds the time when the response was received by
336your application. The format and semantics of the header are the
337same as the server created "Date" header. You may also encounter other
338"Client-XXX" headers. They are all generated by the library
339internally and are not received from the servers.
340
341=head2 HTTP Requests
342
343HTTP requests are just handed off to an HTTP server and it
344decides what happens. Few servers implement methods besides the usual
345"GET", "HEAD", "POST" and "PUT", but CGI-scripts may implement
346any method they like.
347
348If the server is not available then the library will generate an
349internal error response.
350
351The library automatically adds a "Host" and a "Content-Length" header
352to the HTTP request before it is sent over the network.
353
354For a GET request you might want to add an "If-Modified-Since" or
355"If-None-Match" header to make the request conditional.
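
A conditional GET can be sketched as follows. The URL and the assumption that a cached copy was fetched one day ago are illustrative only; C<time2str> from C<HTTP::Date> is used to produce the date in HTTP format.

```perl
use strict;
use warnings;
use HTTP::Request;
use HTTP::Date qw(time2str);

# Hypothetical: pretend our cached copy was fetched one day ago.
my $cached_at = time - 24 * 60 * 60;

my $req = HTTP::Request->new(GET => 'http://www.example.com/');
$req->header('If-Modified-Since' => time2str($cached_at));

# Show the request as it would be handed to the user agent.
print $req->as_string;
```

A server that honours the header would typically answer with a 304 "Not Modified" response when the document has not changed since that time.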
356
357For a POST request you should add the "Content-Type" header. When you
358try to emulate HTML E<lt>FORM> handling you should usually let the value
359of the "Content-Type" header be "application/x-www-form-urlencoded".
360See L<lwpcook> for examples of this.
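
One way to build such a form request, assuming the C<HTTP::Request::Common> module is available, is to let its C<POST> function encode the fields and set the "Content-Type" header. The URL and field names below are the ones used in the example earlier in this document.

```perl
use strict;
use warnings;
use HTTP::Request::Common qw(POST);

# POST with an array reference of fields builds a form-encoded request.
my $req = POST 'http://search.cpan.org/search',
    [ query => 'libwww-perl', mode => 'dist' ];

# Nothing is sent here; the request is only printed for inspection.
print $req->as_string;
```

The resulting object is an ordinary C<HTTP::Request>, so it can be passed to C<< $ua->request >> like any other.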
361
362The libwww-perl HTTP implementation currently supports the HTTP/1.1
363and HTTP/1.0 protocols.
364
365The library allows you to access a proxy server through HTTP. This
366means that you can set up the library to forward all types of requests
367through the HTTP protocol module. See L<LWP::UserAgent> for
368documentation of this.
369
370
371=head2 HTTPS Requests
372
373HTTPS requests are HTTP requests over an encrypted network connection
374using the SSL protocol developed by Netscape. Everything about HTTP
375requests above also applies to HTTPS requests. In addition the library
376will add the headers "Client-SSL-Cipher", "Client-SSL-Cert-Subject" and
377"Client-SSL-Cert-Issuer" to the response. These headers denote the
378encryption method used and the name of the server owner.
379
380The request can contain the header "If-SSL-Cert-Subject" in order to
381make the request conditional on the content of the server certificate.
382If the certificate subject does not match, no request is sent to the
383server and an internally generated error response is returned. The
384value of the "If-SSL-Cert-Subject" header is interpreted as a Perl
385regular expression.
386
387
388=head2 FTP Requests
389
390The library currently supports GET, HEAD and PUT requests. GET
391retrieves a file or a directory listing from an FTP server. PUT
392stores a file on a ftp server.
393
394You can specify a ftp account for servers that want this in addition
395to user name and password. This is specified by including an "Account"
396header in the request.
397
398User name/password can be specified using basic authorization or be
399encoded in the URL. Failed logins return an UNAUTHORIZED response with
400"WWW-Authenticate: Basic" and can be treated like basic authorization
401for HTTP.
402
403The library supports ftp ASCII transfer mode by specifying the "type=a"
404parameter in the URL. It also supports transfer of ranges for FTP transfers
405using the "Range" header.
406
407Directory listings are by default returned unprocessed (as returned
408from the ftp server) with the content media type reported to be
409"text/ftp-dir-listing". The C<File::Listing> module provides methods
410for parsing these directory listings.
411
412The ftp module is also able to convert directory listings to HTML and
413this can be requested via the standard HTTP content negotiation
414mechanisms (add an "Accept: text/html" header in the request if you
415want this).
416
417For normal file retrievals, the "Content-Type" is guessed based on the
418file name suffix. See L<LWP::MediaTypes>.
419
420The "If-Modified-Since" request header works for servers that implement
421the MDTM command. It will probably not work for directory listings though.
422
423Example:
424
425 $req = HTTP::Request->new(GET => 'ftp://me:passwd@ftp.some.where.com/');
426 $req->header(Accept => "text/html, */*;q=0.1");
427
428=head2 News Requests
429
430Access to the USENET News system is implemented through the NNTP
431protocol. The name of the news server is obtained from the
432NNTP_SERVER environment variable and defaults to "news". It is not
433possible to specify the hostname of the NNTP server in news: URLs.
434
435The library supports GET and HEAD to retrieve news articles through the
436NNTP protocol. You can also post articles to newsgroups by using
437(surprise!) the POST method.
438
439GET on newsgroups is not implemented yet.
440
441Examples:
442
443 $req = HTTP::Request->new(GET => 'news:abc1234@a.sn.no');
444
445 $req = HTTP::Request->new(POST => 'news:comp.lang.perl.test');
446 $req->header(Subject => 'This is a test',
447 From => 'me@some.where.org');
448 $req->content(<<EOT);
449 This is the content of the message that we are sending to
450 the world.
451 EOT
452
453
454=head2 Gopher Request
455
456The library supports the GET and HEAD methods for gopher requests. All
457request header values are ignored. HEAD cheats and returns a
458response without even talking to the server.
459
460Gopher menus are always converted to HTML.
461
462The response "Content-Type" is generated from the document type
463encoded (as the first letter) in the request URL path itself.
464
465Example:
466
467 $req = HTTP::Request->new(GET => 'gopher://gopher.sn.no/');
468
469
470
471=head2 File Request
472
473The library supports GET and HEAD methods for file requests. The
474"If-Modified-Since" header is supported. All other headers are
475ignored. The I<host> component of the file URL must be empty or set
476to "localhost". Any other I<host> value will be treated as an error.
477
478Directories are always converted to an HTML document. For normal
479files, the "Content-Type" and "Content-Encoding" in the response are
480guessed based on the file suffix.
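
Because file requests need no network, they make a convenient end-to-end test of the request, user agent and response cycle. The sketch below writes a temporary file and fetches it back; the file name suffix and its text are made up for the example, and C<URI::file> is assumed to be installed alongside the library.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use HTTP::Request;
use LWP::UserAgent;
use URI::file;

# Write a small temporary file to fetch; the suffix and text are made up.
my ($fh, $path) = tempfile(SUFFIX => '.txt', UNLINK => 1);
print {$fh} "hello from a local file\n";
close $fh or die "close failed: $!";

# Turn the path into an absolute file: URL and request it.
my $ua  = LWP::UserAgent->new;
my $req = HTTP::Request->new(GET => URI::file->new_abs($path));
my $res = $ua->request($req);

if ($res->is_success) {
    print $res->content_type, "\n";   # guessed from the file suffix
    print $res->content;
}
else {
    print $res->status_line, "\n";
}
```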
481
482Example:
483
484 $req = HTTP::Request->new(GET => 'file:/etc/passwd');
485
486
487=head2 Mailto Request
488
489You can send (aka "POST") mail messages using the library. All
490headers specified for the request are passed on to the mail system.
491The "To" header is initialized from the mail address in the URL.
492
493Example:
494
495 $req = HTTP::Request->new(POST => 'mailto:libwww@perl.org');
496 $req->header(Subject => "subscribe");
497 $req->content("Please subscribe me to the libwww-perl mailing list!\n");
498
499=head2 CPAN Requests
500
501URLs with scheme C<cpan:> are redirected to a suitable CPAN
502mirror. If you have your own local mirror of CPAN you might tell LWP
503to use it for C<cpan:> URLs by an assignment like this:
504
505 $LWP::Protocol::cpan::CPAN = "file:/local/CPAN/";
506
507Suitable CPAN mirrors are also picked up from the configuration for
508the CPAN.pm, so if you have used that module a suitable mirror should
509be picked automatically. If neither of these applies, then a redirect
510to the generic CPAN http location is issued.
511
512Example request to download the newest perl:
513
514 $req = HTTP::Request->new(GET => "cpan:src/latest.tar.gz");
515
516
517=head1 OVERVIEW OF CLASSES AND PACKAGES
518
519This table should give you a quick overview of the classes provided by the
520library. Indentation shows class inheritance.
521
522 LWP::MemberMixin   -- Access to member variables of Perl5 classes
523   LWP::UserAgent   -- WWW user agent class
524     LWP::RobotUA   -- When developing robot applications
525   LWP::Protocol          -- Interface to various protocol schemes
526     LWP::Protocol::http  -- http:// access
527     LWP::Protocol::file  -- file:// access
528     LWP::Protocol::ftp   -- ftp:// access
529     ...
530
531 LWP::Authen::Basic -- Handle 401 and 407 responses
532 LWP::Authen::Digest
533
534 HTTP::Headers      -- MIME/RFC822 style header (used by HTTP::Message)
535 HTTP::Message      -- HTTP style message
536   HTTP::Request    -- HTTP request
537   HTTP::Response   -- HTTP response
538 HTTP::Daemon       -- A HTTP server class
539
540 WWW::RobotRules    -- Parse robots.txt files
541   WWW::RobotRules::AnyDBM_File -- Persistent RobotRules
542
543 Net::HTTP          -- Low level HTTP client
544
545The following modules provide various functions and definitions.
546
547 LWP -- This file. Library version number and documentation.
548 LWP::MediaTypes -- MIME types configuration (text/html etc.)
549 LWP::Debug -- Debug logging module
550 LWP::Simple -- Simplified procedural interface for common functions
551 HTTP::Status -- HTTP status code (200 OK etc)
552 HTTP::Date -- Date parsing module for HTTP date formats
553 HTTP::Negotiate -- HTTP content negotiation calculation
554 File::Listing -- Parse directory listings
555 HTML::Form -- Processing for <form>s in HTML documents
556
557
558=head1 MORE DOCUMENTATION
559
560All modules contain detailed information on the interfaces they
561provide. The I<lwpcook> manpage is the libwww-perl cookbook that contains
562examples of typical usage of the library. You might want to take a
563look at how the scripts C<lwp-request>, C<lwp-rget> and C<lwp-mirror>
564are implemented.
565
566=head1 ENVIRONMENT
567
568The following environment variables are used by LWP:
569
570=over
571
572=item HOME
573
574The C<LWP::MediaTypes> functions will look for the F<.media.types> and
575F<.mime.types> files relative to your home directory.
576
577=item http_proxy
578
579=item ftp_proxy
580
581=item xxx_proxy
582
583=item no_proxy
584
585These environment variables can be set to enable communication through
586a proxy server. See the description of the C<env_proxy> method in
587L<LWP::UserAgent>.
588
589=item PERL_LWP_USE_HTTP_10
590
591Enable the old HTTP/1.0 protocol driver instead of the new HTTP/1.1
592driver. You might want to set this to a TRUE value if you discover
593that your old LWP applications fail after you installed LWP-5.60 or
594better.
595
596=item PERL_HTTP_URI_CLASS
597
598Used to decide what URI objects to instantiate. The default is C<URI>.
599You might want to set it to C<URI::URL> for compatibility with old times.
600
601=back
602
603=head1 AUTHORS
604
605LWP was made possible by contributions from Adam Newby, Albert
606Dvornik, Alexandre Duret-Lutz, Andreas Gustafsson, Andreas König,
607Andrew Pimlott, Andy Lester, Ben Coleman, Benjamin Low, Ben Low, Ben
608Tilly, Blair Zajac, Bob Dalgleish, BooK, Brad Hughes, Brian
609J. Murrell, Brian McCauley, Charles C. Fu, Charles Lane, Chris Nandor,
610Christian Gilmore, Chris W. Unger, Craig Macdonald, Dale Couch, Dan
611Kubb, Dave Dunkin, Dave W. Smith, David Coppit, David Dick, David
612D. Kilzer, Doug MacEachern, Edward Avis, erik, Gary Shea, Gisle Aas,
613Graham Barr, Gurusamy Sarathy, Hans de Graaff, Harald Joerg, Harry
614Bochner, Hugo, Ilya Zakharevich, INOUE Yoshinari, Ivan Panchenko, Jack
615Shirazi, James Tillman, Jan Dubois, Jared Rhine, Jim Stern, Joao
616Lopes, John Klar, Johnny Lee, Josh Kronengold, Josh Rai, Joshua
617Chamas, Joshua Hoblitt, Kartik Subbarao, Keiichiro Nagano, Ken
618Williams, KONISHI Katsuhiro, Lee T Lindley, Liam Quinn, Marc Hedlund,
619Marc Langheinrich, Mark D. Anderson, Marko Asplund, Mark Stosberg,
620Markus B Krüger, Markus Laker, Martijn Koster, Martin Thurn, Matthew
621Eldridge, Matthew.van.Eerde, Matt Sergeant, Michael A. Chase, Michael
622Quaranta, Michael Thompson, Mike Schilli, Moshe Kaminsky, Nathan
623Torkington, Nicolai Langfeldt, Norton Allen, Olly Betts, Paul
624J. Schinder, peterm, Philip Guenther, Daniel Buenzli, Pon Hwa Lin,
625Radoslaw Zielinski, Radu Greab, Randal L. Schwartz, Richard Chen,
626Robin Barker, Roy Fielding, Sander van Zoest, Sean M. Burke,
627shildreth, Slaven Rezic, Steve A Fink, Steve Hay, Steven Butler,
628Steve_Kilbane, Takanori Ugai, Thomas Lotterer, Tim Bunce, Tom Hughes,
629Tony Finch, Ville Skyttä, Ward Vandewege, William York, Yale Huang,
630and Yitzchak Scott-Thoennes.
631
632LWP owes a lot in motivation, design, and code, to the libwww-perl
633library for Perl4 by Roy Fielding, which included work from Alberto
634Accomazzi, James Casey, Brooks Cutter, Martijn Koster, Oscar
635Nierstrasz, Mel Melchner, Gertjan van Oosten, Jared Rhine, Jack
636Shirazi, Gene Spafford, Marc VanHeyningen, Steven E. Brenner, Marion
637Hakanson, Waldemar Kebsch, Tony Sanders, and Larry Wall; see the
638libwww-perl-0.40 library for details.
639
640=head1 COPYRIGHT
641
642 Copyright 1995-2005, Gisle Aas
643 Copyright 1995, Martijn Koster
644
645This library is free software; you can redistribute it and/or
646modify it under the same terms as Perl itself.
647
648=head1 AVAILABILITY
649
650The latest version of this library is likely to be available from CPAN
651as well as:
652
653 http://www.linpro.no/lwp/
654
655The best place to discuss this code is on the <libwww@perl.org>
656mailing list.
657
658=cut