scrape2rss.pl - extract information as an RSS (well, Atom) feed
This is a simple program to extract data from HTML by specifying CSS3 or XPath selectors.
scrape2rss.pl URL OPTIONS

  scrape2rss.pl http://conferences.yapceurope.org/gpw2011/news \
      --feed-title "GPW 2011 Atom Feed" \
      --title "h3 a" \
      --summary "h3+p+p" \
      --permalink "h3 a@href" \
      --date "h3+p em" \
      --date-fmt "%d/%m/%y %H:%M" \
      -o gpw2011.de.atom
This program fetches an HTML page and creates an RSS feed from it. The elements that are turned into the RSS feed are specified as CSS or XPath selectors.
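The example invocation passes "h3 a@href" as the permalink selector, which suggests that a trailing @name picks an attribute of the matched element rather than its text. The following is a minimal Python sketch of how such a specification could be split; it is not the program's own (Perl) code, the function name split_selector is hypothetical, and XPath expressions that already contain @ are deliberately left untouched.

```python
import re

# Assumption: an attribute suffix is "@" followed by a plain attribute name
# at the very end of the specification, as in "h3 a@href".
_ATTR_SUFFIX = re.compile(r"^(?P<selector>.+?)@(?P<attr>[A-Za-z_][\w:.-]*)$")


def split_selector(spec):
    """Split 'h3 a@href' into ('h3 a', 'href').

    Returns (spec, None) when there is no attribute suffix, or when the '@'
    looks like part of an XPath step or predicate (for example '//a/@href').
    """
    match = _ATTR_SUFFIX.match(spec)
    if match is None or match.group("selector").endswith(("/", "[")):
        return spec, None
    return match.group("selector").rstrip(), match.group("attr")
```

With this sketch, the title selector "h3 a" yields the element text, while "h3 a@href" yields the link target of the same element.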
If the URL is -, input will be read from STDIN.
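The title, summary, permalink and date picked out by the selectors end up as fields of Atom entries. As a rough illustration of that mapping, here is a Python sketch using only the standard library. It is an assumption about the general shape of the output, not the program's actual implementation; build_atom_feed and the dictionary keys are hypothetical names, and required Atom elements such as id and author are omitted for brevity.

```python
import xml.etree.ElementTree as ET

ATOM_NS = "http://www.w3.org/2005/Atom"


def build_atom_feed(feed_title, entries):
    """Serialize scraped items as a (simplified) Atom feed string.

    `entries` is a list of dicts with the hypothetical keys
    'title', 'summary', 'permalink' and 'date' (an RFC 3339 timestamp).
    """
    # Emit Atom as the default namespace instead of an "ns0:" prefix.
    ET.register_namespace("", ATOM_NS)
    feed = ET.Element(f"{{{ATOM_NS}}}feed")
    ET.SubElement(feed, f"{{{ATOM_NS}}}title").text = feed_title
    for item in entries:
        entry = ET.SubElement(feed, f"{{{ATOM_NS}}}entry")
        ET.SubElement(entry, f"{{{ATOM_NS}}}title").text = item["title"]
        ET.SubElement(entry, f"{{{ATOM_NS}}}summary").text = item["summary"]
        ET.SubElement(entry, f"{{{ATOM_NS}}}link", href=item["permalink"])
        ET.SubElement(entry, f"{{{ATOM_NS}}}updated").text = item["date"]
    return ET.tostring(feed, encoding="unicode")
```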
--title: Selector for the entry title
--summary: Selector for the entry summary
--permalink: Selector for the entry permalink
Selector for the pagination links to follow
--date: Selector for the entry publication date
--date-fmt: Format that the scraped publication date is in, given as strptime-style conversion specifiers (for example "%d/%m/%y %H:%M"), so that it can be converted into a proper Atom timestamp
-o: Name of the output file. The default is STDOUT.
Output information in clear text
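The date format option exists because Atom expects RFC 3339 timestamps, while scraped pages usually carry dates in a local notation. The Python sketch below shows the conversion for the format used in the example invocation ("%d/%m/%y %H:%M"). It is only an illustration: to_atom_timestamp is a hypothetical helper, and treating the scraped time as UTC is an assumption, since the page text carries no time zone.

```python
from datetime import datetime, timezone


def to_atom_timestamp(text, fmt, tz=timezone.utc):
    """Parse a scraped date string with `fmt` and return an RFC 3339 timestamp.

    Assumption: the scraped date has no time zone of its own, so `tz`
    (UTC by default) is attached to the parsed value.
    """
    parsed = datetime.strptime(text.strip(), fmt)
    return parsed.replace(tzinfo=tz).isoformat()


print(to_atom_timestamp("24/11/10 18:30", "%d/%m/%y %H:%M"))
# prints 2010-11-24T18:30:00+00:00
```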
The public repository of this module is http://github.com/Corion/App-scrape.
The public support forum of this program is http://perlmonks.org/.
Max Maischein corion@cpan.org
Copyright 2011-2011 by Max Maischein corion@cpan.org.
This module is released under the same terms as Perl itself.
To install App::scrape, copy and paste the appropriate command into your terminal.
cpanm:
  cpanm App::scrape
CPAN shell:
  perl -MCPAN -e shell
  install App::scrape