1 | | | | | # |
2 | | | | | # $Id: LWP.pm,v 1.149 2005/12/08 12:06:22 gisle Exp $ |
3 | | | | | |
4 | | | | | package LWP; |
5 | | | | | |
6 | 1 | 6µs | | | $VERSION = "5.805"; |
7 | | | | | sub Version { $VERSION; } |
8 | | | | | |
9 | 1 | 4µs | | | require 5.005; |
10 | 1 | 6µs | | | require LWP::UserAgent; # this should load everything you need |
11 | | | | | |
12 | 1 | 14µs | | | 1; |
13 | | | | | |
14 | | | | | __END__ |
15 | | | | | |
16 | | | | | =head1 NAME |
17 | | | | | |
18 | | | | | LWP - The World-Wide Web library for Perl |
19 | | | | | |
20 | | | | | =head1 SYNOPSIS |
21 | | | | | |
22 | | | | | use LWP; |
23 | | | | | print "This is libwww-perl-$LWP::VERSION\n"; |
24 | | | | | |
25 | | | | | |
26 | | | | | =head1 DESCRIPTION |
27 | | | | | |
28 | | | | | The libwww-perl collection is a set of Perl modules which provides a |
29 | | | | | simple and consistent application programming interface (API) to the |
30 | | | | | World-Wide Web. The main focus of the library is to provide classes |
31 | | | | | and functions that allow you to write WWW clients. The library also |
32 | | | | | contains modules that are of more general use and even classes that |
33 | | | | | help you implement simple HTTP servers. |
34 | | | | | |
35 | | | | | Most modules in this library provide an object oriented API. The user |
36 | | | | | agent, requests sent and responses received from the WWW server are |
37 | | | | | all represented by objects. This makes a simple and powerful |
38 | | | | | interface to these services. The interface is easy to extend |
39 | | | | | and customize for your own needs. |
40 | | | | | |
41 | | | | | The main features of the library are: |
42 | | | | | |
43 | | | | | =over 3 |
44 | | | | | |
45 | | | | | =item * |
46 | | | | | |
47 | | | | | Contains various reusable components (modules) that can be |
48 | | | | | used separately or together. |
49 | | | | | |
50 | | | | | =item * |
51 | | | | | |
52 | | | | | Provides an object oriented model of HTTP-style communication. Within |
53 | | | | | this framework we currently support access to http, https, gopher, ftp, news, |
54 | | | | | file, and mailto resources. |
55 | | | | | |
56 | | | | | =item * |
57 | | | | | |
58 | | | | | Provides a full object oriented interface or |
59 | | | | | a very simple procedural interface. |
60 | | | | | |
61 | | | | | =item * |
62 | | | | | |
63 | | | | | Supports the basic and digest authorization schemes. |
64 | | | | | |
65 | | | | | =item * |
66 | | | | | |
67 | | | | | Supports transparent redirect handling. |
68 | | | | | |
69 | | | | | =item * |
70 | | | | | |
71 | | | | | Supports access through proxy servers. |
72 | | | | | |
73 | | | | | =item * |
74 | | | | | |
75 | | | | | Provides a parser for F<robots.txt> files and a framework for constructing robots. |
76 | | | | | |
77 | | | | | =item * |
78 | | | | | |
79 | | | | | Supports parsing of HTML forms. |
80 | | | | | |
81 | | | | | =item * |
82 | | | | | |
83 | | | | | Implements an HTTP content negotiation algorithm that can |
84 | | | | | be used both in protocol modules and in server scripts (like CGI |
85 | | | | | scripts). |
86 | | | | | |
87 | | | | | =item * |
88 | | | | | |
89 | | | | | Supports HTTP cookies. |
90 | | | | | |
91 | | | | | =item * |
92 | | | | | |
93 | | | | | Includes some simple command line clients, for instance C<lwp-request> and C<lwp-download>. |
94 | | | | | |
95 | | | | | =back |
96 | | | | | |
97 | | | | | |
98 | | | | | =head1 HTTP STYLE COMMUNICATION |
99 | | | | | |
100 | | | | | |
101 | | | | | The libwww-perl library is based on HTTP style communication. This |
102 | | | | | section tries to describe what that means. |
103 | | | | | |
104 | | | | | Let us start with this quote from the HTTP specification document |
105 | | | | | <URL:http://www.w3.org/pub/WWW/Protocols/>: |
106 | | | | | |
107 | | | | | =over 3 |
108 | | | | | |
109 | | | | | =item |
110 | | | | | |
111 | | | | | The HTTP protocol is based on a request/response paradigm. A client |
112 | | | | | establishes a connection with a server and sends a request to the |
113 | | | | | server in the form of a request method, URI, and protocol version, |
114 | | | | | followed by a MIME-like message containing request modifiers, client |
115 | | | | | information, and possible body content. The server responds with a |
116 | | | | | status line, including the message's protocol version and a success or |
117 | | | | | error code, followed by a MIME-like message containing server |
118 | | | | | information, entity meta-information, and possible body content. |
119 | | | | | |
120 | | | | | =back |
121 | | | | | |
122 | | | | | What this means to libwww-perl is that communication always takes place |
123 | | | | | through these steps: First a I<request> object is created and |
124 | | | | | configured. This object is then passed to a server and we get a |
125 | | | | | I<response> object in return that we can examine. A request is always |
126 | | | | | independent of any previous requests, i.e. the service is stateless. |
127 | | | | | The same simple model is used for any kind of service we want to |
128 | | | | | access. |
129 | | | | | |
130 | | | | | For example, if we want to fetch a document from a remote file server, |
131 | | | | | then we send it a request that contains a name for that document and |
132 | | | | | the response will contain the document itself. If we access a search |
133 | | | | | engine, then the content of the request will contain the query |
134 | | | | | parameters and the response will contain the query result. If we want |
135 | | | | | to send a mail message to somebody then we send a request object which |
136 | | | | | contains our message to the mail server and the response object will |
137 | | | | | contain an acknowledgment that tells us that the message has been |
138 | | | | | accepted and will be forwarded to the recipient(s). |
139 | | | | | |
140 | | | | | It is as simple as that! |
141 | | | | | |
142 | | | | | |
143 | | | | | =head2 The Request Object |
144 | | | | | |
145 | | | | | The libwww-perl request object has the class name C<HTTP::Request>. |
146 | | | | | The fact that the class name uses C<HTTP::> as a |
147 | | | | | prefix only implies that we use the HTTP model of communication. It |
148 | | | | | does not limit the kind of services we can try to pass this I<request> |
149 | | | | | to. For instance, we will send C<HTTP::Request>s both to ftp and |
150 | | | | | gopher servers, as well as to the local file system. |
151 | | | | | |
152 | | | | | The main attributes of the request objects are: |
153 | | | | | |
154 | | | | | =over 3 |
155 | | | | | |
156 | | | | | =item * |
157 | | | | | |
158 | | | | | The B<method> is a short string that tells what kind of |
159 | | | | | request this is. The most common methods are B<GET>, B<PUT>, |
160 | | | | | B<POST> and B<HEAD>. |
161 | | | | | |
162 | | | | | =item * |
163 | | | | | |
164 | | | | | The B<uri> is a string denoting the protocol, server and |
165 | | | | | the name of the "document" we want to access. The B<uri> might |
166 | | | | | also encode various other parameters. |
167 | | | | | |
168 | | | | | =item * |
169 | | | | | |
170 | | | | | The B<headers> contain additional information about the |
171 | | | | | request and can also be used to describe the content. The headers |
172 | | | | | are a set of keyword/value pairs. |
173 | | | | | |
174 | | | | | =item * |
175 | | | | | |
176 | | | | | The B<content> is an arbitrary amount of data. |
177 | | | | | |
178 | | | | | =back |
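As a minimal sketch of the four attributes listed above, the following builds a request and reads each attribute back. The URL and form data are placeholders chosen for illustration; constructing a request generates no network traffic.

```perl
use strict;
use warnings;
use HTTP::Request;

# Build a request: method and uri are given to the constructor
my $req = HTTP::Request->new(POST => 'http://www.example.com/search');

# Headers are keyword/value pairs; this one describes the content
$req->header('Content-Type' => 'application/x-www-form-urlencoded');

# The content is an arbitrary amount of data
$req->content('query=libwww-perl');

# Read the four main attributes back
my $method  = $req->method;
my $uri     = $req->uri;                      # a URI object that stringifies
my $ctype   = $req->header('Content-Type');
my $content = $req->content;

print "$method $uri\n$ctype\n$content\n";
```

Calling C<< $req->as_string >> is a convenient way to see the whole request as it would appear on the wire.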
179 | | | | | |
180 | | | | | =head2 The Response Object |
181 | | | | | |
182 | | | | | The libwww-perl response object has the class name C<HTTP::Response>. |
183 | | | | | The main attributes of objects of this class are: |
184 | | | | | |
185 | | | | | =over 3 |
186 | | | | | |
187 | | | | | =item * |
188 | | | | | |
189 | | | | | The B<code> is a numerical value that indicates the overall |
190 | | | | | outcome of the request. |
191 | | | | | |
192 | | | | | =item * |
193 | | | | | |
194 | | | | | The B<message> is a short, human readable string that |
195 | | | | | corresponds to the I<code>. |
196 | | | | | |
197 | | | | | =item * |
198 | | | | | |
199 | | | | | The B<headers> contain additional information about the |
200 | | | | | response and describe the content. |
201 | | | | | |
202 | | | | | =item * |
203 | | | | | |
204 | | | | | The B<content> is an arbitrary amount of data. |
205 | | | | | |
206 | | | | | =back |
207 | | | | | |
208 | | | | | Since we don't want to handle all possible I<code> values directly in |
209 | | | | | our programs, a libwww-perl response object has methods that can be |
210 | | | | | used to query what kind of response this is. The most commonly used |
211 | | | | | response classification methods are: |
212 | | | | | |
213 | | | | | =over 3 |
214 | | | | | |
215 | | | | | =item is_success() |
216 | | | | | |
217 | | | | | The request was successfully received, understood or accepted. |
218 | | | | | |
219 | | | | | =item is_error() |
220 | | | | | |
221 | | | | | The request failed. The server or the resource might not be |
222 | | | | | available, access to the resource might be denied or other things might |
223 | | | | | have failed for some reason. |
224 | | | | | |
225 | | | | | =back |
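The classification methods can be tried without a server. This sketch builds two response objects by hand (normally they are returned by a user agent) and classifies each one; the status codes are chosen purely for illustration.

```perl
use strict;
use warnings;
use HTTP::Response;

# Hand-built responses; real ones come back from a user agent
my $ok      = HTTP::Response->new(200, 'OK');
my $missing = HTTP::Response->new(404, 'Not Found');

for my $res ($ok, $missing) {
    # Classify the response instead of comparing code values directly
    my $kind = $res->is_success ? 'success'
             : $res->is_error   ? 'error'
             :                    'other';
    print $res->status_line, " => $kind\n";
}
```

Responses that are neither successes nor errors (redirects, for example) fall into the C<other> branch here.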
226 | | | | | |
227 | | | | | =head2 The User Agent |
228 | | | | | |
229 | | | | | Let us assume that we have created a I<request> object. What do we |
230 | | | | | actually do with it in order to receive a I<response>? |
231 | | | | | |
232 | | | | | The answer is that you pass it to a I<user agent> object and this |
233 | | | | | object takes care of all the things that need to be done |
234 | | | | | (like low-level communication and error handling) and returns |
235 | | | | | a I<response> object. The user agent represents your |
236 | | | | | application on the network and provides you with an interface that |
237 | | | | | can accept I<requests> and return I<responses>. |
238 | | | | | |
239 | | | | | The user agent is an interface layer between |
240 | | | | | your application code and the network. Through this interface you are |
241 | | | | | able to access the various servers on the network. |
242 | | | | | |
243 | | | | | The class name for the user agent is C<LWP::UserAgent>. Every |
244 | | | | | libwww-perl application that wants to communicate should create at |
245 | | | | | least one object of this class. The main method provided by this |
246 | | | | | object is request(). This method takes an C<HTTP::Request> object as |
247 | | | | | argument and (eventually) returns a C<HTTP::Response> object. |
248 | | | | | |
249 | | | | | The user agent has many other attributes that let you |
250 | | | | | configure how it will interact with the network and with your |
251 | | | | | application. |
252 | | | | | |
253 | | | | | =over 3 |
254 | | | | | |
255 | | | | | =item * |
256 | | | | | |
257 | | | | | The B<timeout> specifies how much time we give remote servers to |
258 | | | | | respond before the library disconnects and creates an |
259 | | | | | internal I<timeout> response. |
260 | | | | | |
261 | | | | | =item * |
262 | | | | | |
263 | | | | | The B<agent> specifies the name that your application should use when it |
264 | | | | | presents itself on the network. |
265 | | | | | |
266 | | | | | =item * |
267 | | | | | |
268 | | | | | The B<from> attribute can be set to the e-mail address of the person |
269 | | | | | responsible for running the application. If this is set, then the |
270 | | | | | address will be sent to the servers with every request. |
271 | | | | | |
272 | | | | | =item * |
273 | | | | | |
274 | | | | | The B<parse_head> specifies whether we should initialize response |
275 | | | | | headers from the E<lt>head> section of HTML documents. |
276 | | | | | |
277 | | | | | =item * |
278 | | | | | |
279 | | | | | The B<proxy> and B<no_proxy> attributes specify if and when to go through |
280 | | | | | a proxy server. <URL:http://www.w3.org/pub/WWW/Proxies/> |
281 | | | | | |
282 | | | | | =item * |
283 | | | | | |
284 | | | | | The B<credentials> provide a way to set up user names and |
285 | | | | | passwords needed to access certain services. |
286 | | | | | |
287 | | | | | =back |
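The attributes above are set through accessor methods of the same names. The following is a sketch only: the host names, proxy URL, realm, user name and password are all placeholders, and nothing is sent over the network until a request is made.

```perl
use strict;
use warnings;
use LWP::UserAgent;

my $ua = LWP::UserAgent->new;

$ua->timeout(10);                       # seconds to wait for a server
$ua->agent('MyApp/0.1');                # name presented on the network
$ua->from('webmaster@example.com');     # sent with every request
$ua->parse_head(0);                     # do not read headers from <head>

# Send http requests through a (hypothetical) proxy, except for localhost
$ua->proxy('http', 'http://proxy.example.com:8080/');
$ua->no_proxy('localhost');

# User name and password for one realm on one server (placeholders)
$ua->credentials('www.example.com:80', 'Private', 'user', 'secret');

printf "%s, timeout %d, from %s\n", $ua->agent, $ua->timeout, $ua->from;
```

Each accessor called without arguments returns the current value, as the final line shows.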
288 | | | | | |
289 | | | | | Many applications want even more control over how they interact |
290 | | | | | with the network and they get this by sub-classing |
291 | | | | | C<LWP::UserAgent>. The library includes a |
292 | | | | | sub-class, C<LWP::RobotUA>, for robot applications. |
293 | | | | | |
294 | | | | | =head2 An Example |
295 | | | | | |
296 | | | | | This example shows how the user agent, a request and a response are |
297 | | | | | represented in actual perl code: |
298 | | | | | |
299 | | | | | # Create a user agent object |
300 | | | | | use LWP::UserAgent; |
301 | | | | | my $ua = LWP::UserAgent->new; |
302 | | | | | $ua->agent("MyApp/0.1 "); |
303 | | | | | |
304 | | | | | # Create a request |
305 | | | | | my $req = HTTP::Request->new(POST => 'http://search.cpan.org/search'); |
306 | | | | | $req->content_type('application/x-www-form-urlencoded'); |
307 | | | | | $req->content('query=libwww-perl&mode=dist'); |
308 | | | | | |
309 | | | | | # Pass request to the user agent and get a response back |
310 | | | | | my $res = $ua->request($req); |
311 | | | | | |
312 | | | | | # Check the outcome of the response |
313 | | | | | if ($res->is_success) { |
314 | | | | | print $res->content; |
315 | | | | | } |
316 | | | | | else { |
317 | | | | | print $res->status_line, "\n"; |
318 | | | | | } |
319 | | | | | |
320 | | | | | The $ua is created once when the application starts up. New request |
321 | | | | | objects should normally be created for each request sent. |
322 | | | | | |
323 | | | | | |
324 | | | | | =head1 NETWORK SUPPORT |
325 | | | | | |
326 | | | | | This section discusses the various protocol schemes and |
327 | | | | | the HTTP style methods and headers that may be used for each. |
328 | | | | | |
329 | | | | | For all requests, a "User-Agent" header is added and initialized from |
330 | | | | | the $ua->agent attribute before the request is handed to the network |
331 | | | | | layer. In the same way, a "From" header is initialized from the |
332 | | | | | $ua->from attribute. |
333 | | | | | |
334 | | | | | For all responses, the library adds a header called "Client-Date". |
335 | | | | | This header holds the time when the response was received by |
336 | | | | | your application. The format and semantics of the header are the |
337 | | | | | same as the server created "Date" header. You may also encounter other |
338 | | | | | "Client-XXX" headers. They are all generated by the library |
339 | | | | | internally and are not received from the servers. |
340 | | | | | |
341 | | | | | =head2 HTTP Requests |
342 | | | | | |
343 | | | | | HTTP requests are just handed off to an HTTP server and it |
344 | | | | | decides what happens. Few servers implement methods besides the usual |
345 | | | | | "GET", "HEAD", "POST" and "PUT", but CGI-scripts may implement |
346 | | | | | any method they like. |
347 | | | | | |
348 | | | | | If the server is not available then the library will generate an |
349 | | | | | internal error response. |
350 | | | | | |
351 | | | | | The library automatically adds a "Host" and a "Content-Length" header |
352 | | | | | to the HTTP request before it is sent over the network. |
353 | | | | | |
354 | | | | | For a GET request you might want to add a "If-Modified-Since" or |
355 | | | | | "If-None-Match" header to make the request conditional. |
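A conditional GET might be set up as in the sketch below. The URL is a placeholder, and the epoch time 0 is used only to produce a predictable date; a real application would normally use the modification time of its cached copy. C<time2str> from C<HTTP::Date> formats a time value as an HTTP date.

```perl
use strict;
use warnings;
use HTTP::Request;
use HTTP::Date qw(time2str);

my $req = HTTP::Request->new(GET => 'http://www.example.com/');

# Only transfer the document if it changed after this time
my $since = time2str(0);
$req->header('If-Modified-Since' => $since);

print $req->as_string;
```

A server that honours the header is expected to answer with a 304 (Not Modified) status when the document has not changed.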
356 | | | | | |
357 | | | | | For a POST request you should add the "Content-Type" header. When you |
358 | | | | | try to emulate HTML E<lt>FORM> handling you should usually let the value |
359 | | | | | of the "Content-Type" header be "application/x-www-form-urlencoded". |
360 | | | | | See L<lwpcook> for examples of this. |
361 | | | | | |
362 | | | | | The libwww-perl HTTP implementation currently supports the HTTP/1.1 |
363 | | | | | and HTTP/1.0 protocols. |
364 | | | | | |
365 | | | | | The library allows you to access proxy servers through HTTP. This |
366 | | | | | means that you can set up the library to forward all types of requests |
367 | | | | | through the HTTP protocol module. See L<LWP::UserAgent> for |
368 | | | | | documentation of this. |
369 | | | | | |
370 | | | | | |
371 | | | | | =head2 HTTPS Requests |
372 | | | | | |
373 | | | | | HTTPS requests are HTTP requests over an encrypted network connection |
374 | | | | | using the SSL protocol developed by Netscape. Everything about HTTP |
375 | | | | | requests above also applies to HTTPS requests. In addition, the library |
376 | | | | | will add the headers "Client-SSL-Cipher", "Client-SSL-Cert-Subject" and |
377 | | | | | "Client-SSL-Cert-Issuer" to the response. These headers denote the |
378 | | | | | encryption method used and the name of the server owner. |
379 | | | | | |
380 | | | | | The request can contain the header "If-SSL-Cert-Subject" in order to |
381 | | | | | make the request conditional on the content of the server certificate. |
382 | | | | | If the certificate subject does not match, no request is sent to the |
383 | | | | | server and an internally generated error response is returned. The |
384 | | | | | value of the "If-SSL-Cert-Subject" header is interpreted as a Perl |
385 | | | | | regular expression. |
386 | | | | | |
387 | | | | | |
388 | | | | | =head2 FTP Requests |
389 | | | | | |
390 | | | | | The library currently supports GET, HEAD and PUT requests. GET |
391 | | | | | retrieves a file or a directory listing from an FTP server. PUT |
392 | | | | | stores a file on an FTP server. |
393 | | | | | |
394 | | | | | You can specify an FTP account for servers that want this in addition |
395 | | | | | to user name and password. This is specified by including an "Account" |
396 | | | | | header in the request. |
397 | | | | | |
398 | | | | | User name/password can be specified using basic authorization or be |
399 | | | | | encoded in the URL. Failed logins return an UNAUTHORIZED response with |
400 | | | | | "WWW-Authenticate: Basic" and can be treated like basic authorization |
401 | | | | | for HTTP. |
402 | | | | | |
403 | | | | | The library supports ftp ASCII transfer mode by specifying the "type=a" |
404 | | | | | parameter in the URL. It also supports transfer of ranges for FTP transfers |
405 | | | | | using the "Range" header. |
406 | | | | | |
407 | | | | | Directory listings are by default returned unprocessed (as returned |
408 | | | | | from the ftp server) with the content media type reported to be |
409 | | | | | "text/ftp-dir-listing". The C<File::Listing> module provides methods |
410 | | | | | for parsing these directory listings. |
411 | | | | | |
412 | | | | | The ftp module is also able to convert directory listings to HTML and |
413 | | | | | this can be requested via the standard HTTP content negotiation |
414 | | | | | mechanisms (add an "Accept: text/html" header in the request if you |
415 | | | | | want this). |
416 | | | | | |
417 | | | | | For normal file retrievals, the "Content-Type" is guessed based on the |
418 | | | | | file name suffix. See L<LWP::MediaTypes>. |
419 | | | | | |
420 | | | | | The "If-Modified-Since" request header works for servers that implement |
421 | | | | | the MDTM command. It will probably not work for directory listings though. |
422 | | | | | |
423 | | | | | Example: |
424 | | | | | |
425 | | | | | $req = HTTP::Request->new(GET => 'ftp://me:passwd@ftp.some.where.com/'); |
426 | | | | | $req->header(Accept => "text/html, */*;q=0.1"); |
427 | | | | | |
428 | | | | | =head2 News Requests |
429 | | | | | |
430 | | | | | Access to the USENET News system is implemented through the NNTP |
431 | | | | | protocol. The name of the news server is obtained from the |
432 | | | | | NNTP_SERVER environment variable and defaults to "news". It is not |
433 | | | | | possible to specify the hostname of the NNTP server in news: URLs. |
434 | | | | | |
435 | | | | | The library supports GET and HEAD to retrieve news articles through the |
436 | | | | | NNTP protocol. You can also post articles to newsgroups by using |
437 | | | | | (surprise!) the POST method. |
438 | | | | | |
439 | | | | | GET on newsgroups is not implemented yet. |
440 | | | | | |
441 | | | | | Examples: |
442 | | | | | |
443 | | | | | $req = HTTP::Request->new(GET => 'news:abc1234@a.sn.no'); |
444 | | | | | |
445 | | | | | $req = HTTP::Request->new(POST => 'news:comp.lang.perl.test'); |
446 | | | | | $req->header(Subject => 'This is a test', |
447 | | | | | From => 'me@some.where.org'); |
448 | | | | | $req->content(<<EOT); |
449 | | | | | This is the content of the message that we are sending to |
450 | | | | | the world. |
451 | | | | | EOT |
452 | | | | | |
453 | | | | | |
454 | | | | | =head2 Gopher Request |
455 | | | | | |
456 | | | | | The library supports the GET and HEAD methods for gopher requests. All |
457 | | | | | request header values are ignored. HEAD cheats and returns a |
458 | | | | | response without even talking to the server. |
459 | | | | | |
460 | | | | | Gopher menus are always converted to HTML. |
461 | | | | | |
462 | | | | | The response "Content-Type" is generated from the document type |
463 | | | | | encoded (as the first letter) in the request URL path itself. |
464 | | | | | |
465 | | | | | Example: |
466 | | | | | |
467 | | | | | $req = HTTP::Request->new(GET => 'gopher://gopher.sn.no/'); |
468 | | | | | |
469 | | | | | |
470 | | | | | |
471 | | | | | =head2 File Request |
472 | | | | | |
473 | | | | | The library supports GET and HEAD methods for file requests. The |
474 | | | | | "If-Modified-Since" header is supported. All other headers are |
475 | | | | | ignored. The I<host> component of the file URL must be empty or set |
476 | | | | | to "localhost". Any other I<host> value will be treated as an error. |
477 | | | | | |
478 | | | | | Directories are always converted to an HTML document. For normal |
479 | | | | | files, the "Content-Type" and "Content-Encoding" in the response are |
480 | | | | | guessed based on the file suffix. |
481 | | | | | |
482 | | | | | Example: |
483 | | | | | |
484 | | | | | $req = HTTP::Request->new(GET => 'file:/etc/passwd'); |
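Because file requests need no network, they are a convenient way to try the whole request/response cycle. The sketch below is an illustration under stated assumptions: it writes a temporary file (the C<.txt> suffix and the file content are arbitrary choices), builds a C<file:> URL for it with C<URI::file>, and fetches it through a user agent.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);
use HTTP::Request;
use LWP::UserAgent;
use URI::file;

# Create a small local file to fetch
my ($fh, $path) = tempfile(SUFFIX => '.txt', UNLINK => 1);
print $fh "hello from a local file\n";
close $fh;

# Turn the path into an absolute file: URL with an empty host
my $uri = URI::file->new_abs($path);

my $ua  = LWP::UserAgent->new;
my $res = $ua->request(HTTP::Request->new(GET => $uri));

if ($res->is_success) {
    # The Content-Type is guessed from the file suffix
    print "Type: ", $res->content_type, "\n";
    print $res->content;
}
else {
    print $res->status_line, "\n";
}
```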
485 | | | | | |
486 | | | | | |
487 | | | | | =head2 Mailto Request |
488 | | | | | |
489 | | | | | You can send (aka "POST") mail messages using the library. All |
490 | | | | | headers specified for the request are passed on to the mail system. |
491 | | | | | The "To" header is initialized from the mail address in the URL. |
492 | | | | | |
493 | | | | | Example: |
494 | | | | | |
495 | | | | | $req = HTTP::Request->new(POST => 'mailto:libwww@perl.org'); |
496 | | | | | $req->header(Subject => "subscribe"); |
497 | | | | | $req->content("Please subscribe me to the libwww-perl mailing list!\n"); |
498 | | | | | |
499 | | | | | =head2 CPAN Requests |
500 | | | | | |
501 | | | | | URLs with scheme C<cpan:> are redirected to a suitable CPAN |
502 | | | | | mirror. If you have your own local mirror of CPAN you might tell LWP |
503 | | | | | to use it for C<cpan:> URLs by an assignment like this: |
504 | | | | | |
505 | | | | | $LWP::Protocol::cpan::CPAN = "file:/local/CPAN/"; |
506 | | | | | |
507 | | | | | Suitable CPAN mirrors are also picked up from the configuration for |
508 | | | | | the CPAN.pm, so if you have used that module a suitable mirror should |
509 | | | | | be picked automatically. If neither of these apply, then a redirect |
510 | | | | | to the generic CPAN http location is issued. |
511 | | | | | |
512 | | | | | Example request to download the newest perl: |
513 | | | | | |
514 | | | | | $req = HTTP::Request->new(GET => "cpan:src/latest.tar.gz"); |
515 | | | | | |
516 | | | | | |
517 | | | | | =head1 OVERVIEW OF CLASSES AND PACKAGES |
518 | | | | | |
519 | | | | | This table should give you a quick overview of the classes provided by the |
520 | | | | | library. Indentation shows class inheritance. |
521 | | | | | |
522 | | | | | LWP::MemberMixin -- Access to member variables of Perl5 classes |
523 | | | | | LWP::UserAgent -- WWW user agent class |
524 | | | | | LWP::RobotUA -- For developing robot applications |
525 | | | | | LWP::Protocol -- Interface to various protocol schemes |
526 | | | | | LWP::Protocol::http -- http:// access |
527 | | | | | LWP::Protocol::file -- file:// access |
528 | | | | | LWP::Protocol::ftp -- ftp:// access |
529 | | | | | ... |
530 | | | | | |
531 | | | | | LWP::Authen::Basic -- Handle 401 and 407 responses |
532 | | | | | LWP::Authen::Digest |
533 | | | | | |
534 | | | | | HTTP::Headers -- MIME/RFC822 style header (used by HTTP::Message) |
535 | | | | | HTTP::Message -- HTTP style message |
536 | | | | | HTTP::Request -- HTTP request |
537 | | | | | HTTP::Response -- HTTP response |
538 | | | | | HTTP::Daemon -- A HTTP server class |
539 | | | | | |
540 | | | | | WWW::RobotRules -- Parse robots.txt files |
541 | | | | | WWW::RobotRules::AnyDBM_File -- Persistent RobotRules |
542 | | | | | |
543 | | | | | Net::HTTP -- Low level HTTP client |
544 | | | | | |
545 | | | | | The following modules provide various functions and definitions. |
546 | | | | | |
547 | | | | | LWP -- This file. Library version number and documentation. |
548 | | | | | LWP::MediaTypes -- MIME types configuration (text/html etc.) |
549 | | | | | LWP::Debug -- Debug logging module |
550 | | | | | LWP::Simple -- Simplified procedural interface for common functions |
551 | | | | | HTTP::Status -- HTTP status code (200 OK etc) |
552 | | | | | HTTP::Date -- Date parsing module for HTTP date formats |
553 | | | | | HTTP::Negotiate -- HTTP content negotiation calculation |
554 | | | | | File::Listing -- Parse directory listings |
555 | | | | | HTML::Form -- Processing for <form>s in HTML documents |
556 | | | | | |
557 | | | | | |
558 | | | | | =head1 MORE DOCUMENTATION |
559 | | | | | |
560 | | | | | All modules contain detailed information on the interfaces they |
561 | | | | | provide. The I<lwpcook> manpage is the libwww-perl cookbook that contains |
562 | | | | | examples of typical usage of the library. You might want to take a |
563 | | | | | look at how the scripts C<lwp-request>, C<lwp-rget> and C<lwp-mirror> |
564 | | | | | are implemented. |
565 | | | | | |
566 | | | | | =head1 ENVIRONMENT |
567 | | | | | |
568 | | | | | The following environment variables are used by LWP: |
569 | | | | | |
570 | | | | | =over |
571 | | | | | |
572 | | | | | =item HOME |
573 | | | | | |
574 | | | | | The C<LWP::MediaTypes> functions will look for the F<.media.types> and |
575 | | | | | F<.mime.types> files relative to your home directory. |
576 | | | | | |
577 | | | | | =item http_proxy |
578 | | | | | |
579 | | | | | =item ftp_proxy |
580 | | | | | |
581 | | | | | =item xxx_proxy |
582 | | | | | |
583 | | | | | =item no_proxy |
584 | | | | | |
585 | | | | | These environment variables can be set to enable communication through |
586 | | | | | a proxy server. See the description of the C<env_proxy> method in |
587 | | | | | L<LWP::UserAgent>. |
588 | | | | | |
589 | | | | | =item PERL_LWP_USE_HTTP_10 |
590 | | | | | |
591 | | | | | Enable the old HTTP/1.0 protocol driver instead of the new HTTP/1.1 |
592 | | | | | driver. You might want to set this to a TRUE value if you discover |
593 | | | | | that your old LWP applications fail after you installed LWP-5.60 or |
594 | | | | | better. |
595 | | | | | |
596 | | | | | =item PERL_HTTP_URI_CLASS |
597 | | | | | |
598 | | | | | Used to decide what URI objects to instantiate. The default is C<URI>. |
599 | | | | | You might want to set it to C<URI::URL> for compatibility with old times. |
600 | | | | | |
601 | | | | | =back |
602 | | | | | |
603 | | | | | =head1 AUTHORS |
604 | | | | | |
605 | | | | | LWP was made possible by contributions from Adam Newby, Albert |
606 | | | | | Dvornik, Alexandre Duret-Lutz, Andreas Gustafsson, Andreas König, |
607 | | | | | Andrew Pimlott, Andy Lester, Ben Coleman, Benjamin Low, Ben Low, Ben |
608 | | | | | Tilly, Blair Zajac, Bob Dalgleish, BooK, Brad Hughes, Brian |
609 | | | | | J. Murrell, Brian McCauley, Charles C. Fu, Charles Lane, Chris Nandor, |
610 | | | | | Christian Gilmore, Chris W. Unger, Craig Macdonald, Dale Couch, Dan |
611 | | | | | Kubb, Dave Dunkin, Dave W. Smith, David Coppit, David Dick, David |
612 | | | | | D. Kilzer, Doug MacEachern, Edward Avis, erik, Gary Shea, Gisle Aas, |
613 | | | | | Graham Barr, Gurusamy Sarathy, Hans de Graaff, Harald Joerg, Harry |
614 | | | | | Bochner, Hugo, Ilya Zakharevich, INOUE Yoshinari, Ivan Panchenko, Jack |
615 | | | | | Shirazi, James Tillman, Jan Dubois, Jared Rhine, Jim Stern, Joao |
616 | | | | | Lopes, John Klar, Johnny Lee, Josh Kronengold, Josh Rai, Joshua |
617 | | | | | Chamas, Joshua Hoblitt, Kartik Subbarao, Keiichiro Nagano, Ken |
618 | | | | | Williams, KONISHI Katsuhiro, Lee T Lindley, Liam Quinn, Marc Hedlund, |
619 | | | | | Marc Langheinrich, Mark D. Anderson, Marko Asplund, Mark Stosberg, |
620 | | | | | Markus B Krüger, Markus Laker, Martijn Koster, Martin Thurn, Matthew |
621 | | | | | Eldridge, Matthew.van.Eerde, Matt Sergeant, Michael A. Chase, Michael |
622 | | | | | Quaranta, Michael Thompson, Mike Schilli, Moshe Kaminsky, Nathan |
623 | | | | | Torkington, Nicolai Langfeldt, Norton Allen, Olly Betts, Paul |
624 | | | | | J. Schinder, peterm, Philip Guenther, Daniel Buenzli, Pon Hwa Lin, |
625 | | | | | Radoslaw Zielinski, Radu Greab, Randal L. Schwartz, Richard Chen, |
626 | | | | | Robin Barker, Roy Fielding, Sander van Zoest, Sean M. Burke, |
627 | | | | | shildreth, Slaven Rezic, Steve A Fink, Steve Hay, Steven Butler, |
628 | | | | | Steve_Kilbane, Takanori Ugai, Thomas Lotterer, Tim Bunce, Tom Hughes, |
629 | | | | | Tony Finch, Ville Skyttä, Ward Vandewege, William York, Yale Huang, |
630 | | | | | and Yitzchak Scott-Thoennes. |
631 | | | | | |
632 | | | | | LWP owes a lot in motivation, design, and code, to the libwww-perl |
633 | | | | | library for Perl4 by Roy Fielding, which included work from Alberto |
634 | | | | | Accomazzi, James Casey, Brooks Cutter, Martijn Koster, Oscar |
635 | | | | | Nierstrasz, Mel Melchner, Gertjan van Oosten, Jared Rhine, Jack |
636 | | | | | Shirazi, Gene Spafford, Marc VanHeyningen, Steven E. Brenner, Marion |
637 | | | | | Hakanson, Waldemar Kebsch, Tony Sanders, and Larry Wall; see the |
638 | | | | | libwww-perl-0.40 library for details. |
639 | | | | | |
640 | | | | | =head1 COPYRIGHT |
641 | | | | | |
642 | | | | | Copyright 1995-2005, Gisle Aas |
643 | | | | | Copyright 1995, Martijn Koster |
644 | | | | | |
645 | | | | | This library is free software; you can redistribute it and/or |
646 | | | | | modify it under the same terms as Perl itself. |
647 | | | | | |
648 | | | | | =head1 AVAILABILITY |
649 | | | | | |
650 | | | | | The latest version of this library is likely to be available from CPAN |
651 | | | | | as well as: |
652 | | | | | |
653 | | | | | http://www.linpro.no/lwp/ |
654 | | | | | |
655 | | | | | The best place to discuss this code is on the <libwww@perl.org> |
656 | | | | | mailing list. |
657 | | | | | |
658 | | | | | =cut |