HTML::SummaryBasic - Basic summary info from HTML.
    use HTML::SummaryBasic;

    my $p = new HTML::SummaryBasic {
        PATH => "input.html",
        # or HTML => '<html>...</html>',
        NOT_AVAILABLE => undef,
    };

    foreach (keys %{$p->{SUMMARY}}) {
        warn "$_ ... $p->{SUMMARY}->{$_}\n";
    }
Dependencies:

    use HTML::TokeParser;
    use HTML::HeadParser;
From a file or string of HTML, creates a hash of useful summary information from meta and body elements of an HTML document.
$NOT_AVAILABLE: the value used for empty fields. The default is [Not Available]. It may be overridden directly by supplying the constructor with a field of the same name. See "THE SUMMARY STRUCTURE".
The constructor accepts a hash-like structure with the following fields:
HTML or PATH: A reference to a scalar of HTML, or a plain string that is the path to an HTML file to process.
SUMMARY: Filled after get_summary is called (see "METHOD get_summary" and "THE SUMMARY STRUCTURE").
FIELDS: An array of meta tag names whose content values should be placed into the respective slots of the SUMMARY field after get_summary has been called.
A field of the object which is a hash, with key/values as follows:
Content of the meta tag named X-META-AUTHOR.
Text of the element of the same name.
Content of the meta tag named X-META-DESCRIPTION.
Time since the modification of the file, according to any meta tag of the same name with an X-META- prefix; failing that, according to the file system.
As above, but relating to the creation date of the file.
The first HTML p element of the document.
The first h1 tag; failing that, the first h2; failing that, the value of $NOT_AVAILABLE.
Any meta-fields specified in the FIELDS field.
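The structure described above can be approximated with the two modules this package depends on. What follows is a minimal sketch, not the module's own code. The helper name `sketch_summary`, its `fields` and `not_available` options, and the hash keys `AUTHOR`, `TITLE`, `DESCRIPTION`, `FIRST_PARA` and `HEADLINE` are all hypothetical names chosen for illustration. The sketch relies on HTML::HeadParser exposing `<meta name="author">` as the header `X-Meta-Author` (which is where the X-META- names come from) and on HTML::TokeParser to find the first `p`, `h1` or `h2` element. HTML::HeadParser also needs HTTP::Headers to be installed.

```perl
use strict;
use warnings;
use HTML::HeadParser;
use HTML::TokeParser;

# Hypothetical helper for illustration; not part of HTML::SummaryBasic.
sub sketch_summary {
    my ($html, %opt) = @_;
    my $na = exists $opt{not_available} ? $opt{not_available} : '[Not Available]';
    my @fields = @{ $opt{fields} || [] };

    # HTML::HeadParser stores <title> as the "Title" header and
    # <meta name="foo" content="..."> as the "X-Meta-Foo" header.
    my $head = HTML::HeadParser->new;
    $head->parse($html);

    # scalar() matters: in list context a missing header yields an empty list.
    my %summary = (
        AUTHOR      => scalar $head->header('X-Meta-Author'),
        TITLE       => scalar $head->header('Title'),
        DESCRIPTION => scalar $head->header('X-Meta-Description'),
    );

    # Text of the first element with the given tag name, or undef if absent.
    my $first = sub {
        my ($tag) = @_;
        my $parser = HTML::TokeParser->new(\$html) or return undef;
        $parser->get_tag($tag) or return undef;
        my $text = $parser->get_trimmed_text("/$tag");
        return length $text ? $text : undef;
    };

    $summary{FIRST_PARA} = $first->('p');

    # First h1; failing that, first h2; the default is filled in below.
    $summary{HEADLINE} = $first->('h1') // $first->('h2');

    # Extra meta fields requested by the caller, keyed by tag name.
    for my $name (@fields) {
        $summary{$name} = scalar $head->header("X-Meta-$name");
    }

    # Replace every empty slot with the "not available" value.
    for my $key (keys %summary) {
        $summary{$key} = $na unless defined $summary{$key};
    }
    return \%summary;
}

my $html = <<'HTML';
<html><head><title>Demo page</title>
<meta name="author" content="A. Writer">
</head><body>
<h2>Second-level heading</h2>
<p>First paragraph.</p>
<p>Second paragraph.</p>
</body></html>
HTML

my $s = sketch_summary($html, fields => ['keywords']);
print "$_ ... $s->{$_}\n" for sort keys %$s;
```

The sample document has no `h1` and no description or keywords meta tag, so it exercises both the `h2` fallback and the default value for empty fields. The file-based timestamps are left out of the sketch because it works on a string rather than a path.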
To do: maybe work on URIs as well as file paths.
See also: HTML::TokeParser, HTML::HeadParser.
Lee Goddard (LGoddard@CPAN.org)
Copyright 2000-2001 Lee Goddard.
This library is free software; you may use and redistribute it or modify it under the same terms as Perl itself.