# LWP (Library for WWW in Perl)
APLawrence.com -  Resources for Unix and Linux Systems, Bloggers and the self-employed

Some material is very old and may be incorrect today

© May 2005 Tony Lawrence

If you want to automatically process web pages to extract data, you have a number of tools available. You can bring a web page down to your computer using "curl" or "wget":


curl http://aplawrence.com > mysite
 

If you don't really want the HTML, use "lynx --dump http://whatever.com > /yourstorage/whatever.txt" to get a text representation of the page. Check the man page for options you might want, such as "--nolist".

You can also easily be selective and pull only the data you want from a page with simple Perl scripts.

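#!/usr/bin/perl
use strict;
use warnings;
use LWP::Simple;                  # exports get()
my $url = 'http://aplawrence.com';
my $content = get($url);          # get() returns undef on failure
die "Couldn't fetch $url\n" unless defined $content;
print $content;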
 

And then of course you'd process the $content as desired. It's only a little more complex if you are dealing with forms.
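As a rough idea of what "process the $content" might look like, here is a minimal sketch that pulls the page title and the link targets out of a chunk of HTML using nothing but core Perl. The helper names page_title and page_links are made up for this example, and the sample HTML stands in for whatever get() returned. Regular expressions are fragile against real-world HTML, so treat this as a quick-and-dirty approach; a proper parser module is the safer choice for anything serious.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: return the text of the first <title> element, or undef.
sub page_title {
    my ($html) = @_;
    return $html =~ m{<title[^>]*>\s*(.*?)\s*</title>}is ? $1 : undef;
}

# Hypothetical helper: return every quoted href value found in <a> tags.
sub page_links {
    my ($html) = @_;
    my @links;
    while ($html =~ m{<a\b[^>]*?\bhref\s*=\s*(["'])(.*?)\1}gis) {
        push @links, $2;
    }
    return @links;
}

# Stand-in for the $content returned by get(); this sample HTML is invented.
my $content = <<'HTML';
<html><head><title>Sample Page</title></head>
<body>
<a href="/Books/webc.html">Book review</a>
<a class="ext" href='http://example.com/'>Example</a>
</body></html>
HTML

print "Title: ", page_title($content), "\n";
print "Link:  $_\n" for page_links($content);
```

In a real script you would drop the here-document and feed these helpers the $content fetched with get().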

A book that covers LWP is reviewed at /Books/webc.html.
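For the forms case mentioned above, a sketch along these lines is one way to go, assuming the LWP::UserAgent and HTTP::Request::Common modules that ship with or alongside LWP. The URL and the field names (q, max) are placeholders invented for this example; you would take the real ones from the form's action attribute and its input names. The script only prints the request it built unless you pass --send (an option made up for this sketch), so you can look at the encoded form data before anything goes to a server.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;
use HTTP::Request::Common qw(POST);

# Placeholder form: this URL and these field names are invented for the
# example; use the action URL and input names from the real form.
my $url    = 'http://example.com/search';
my @fields = (q => 'perl', max => 10);

# POST() encodes the fields as application/x-www-form-urlencoded,
# which is what a browser sends for an ordinary HTML form.
my $request = POST($url, \@fields);
print $request->as_string, "\n";

# Pass --send on the command line to actually submit the request.
if (@ARGV and $ARGV[0] eq '--send') {
    my $ua       = LWP::UserAgent->new(timeout => 10);
    my $response = $ua->request($request);
    die "POST failed: ", $response->status_line, "\n"
        unless $response->is_success;
    print $response->decoded_content;
}
```

The body of a successful response can then be processed just like the $content from a plain get().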

