HTML::StripScripts::Parser - XSS filter using HTML::Parser
    use HTML::StripScripts::Parser();

    my $hss = HTML::StripScripts::Parser->new(

        {
            Context => 'Document',       ## HTML::StripScripts configuration
            Rules   => { ... },
        },

        strict_comment => 1,             ## HTML::Parser options
        strict_names   => 1,

    );

    $hss->parse_file("foo.html");

    print $hss->filtered_document;

    ## OR

    print $hss->filter_html($html);
This class provides an easy interface to HTML::StripScripts, using HTML::Parser to parse the HTML.
See HTML::Parser for details of how to customise how the raw HTML is parsed into tags, and HTML::StripScripts for details of how to customise the way those tags are filtered.
new ( CONFIG, [PARSER_OPTIONS] ) creates a new HTML::StripScripts::Parser object.
The CONFIG parameter has the same semantics as the CONFIG parameter to the HTML::StripScripts constructor.
Any PARSER_OPTIONS supplied will be passed on to the HTML::Parser init method, allowing you to influence the way the input is parsed.
You cannot use PARSER_OPTIONS to set the HTML::Parser event handlers (see "Events" in HTML::Parser) since HTML::StripScripts::Parser uses all of the event hooks itself. However, you can use Rules (see "Rules" in HTML::StripScripts) to customise the handling of all tags and attributes.
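As a minimal sketch of how the two argument groups fit together, the following passes a CONFIG hash with one rule, followed by one HTML::Parser option. The rule `br => 0` (disallow the tag) and the sample input are illustrative assumptions, not recommended settings; see "Rules" in HTML::StripScripts for the full rule syntax.

```perl
use strict;
use warnings;
use HTML::StripScripts::Parser ();

my $hss = HTML::StripScripts::Parser->new(
    {
        # CONFIG: handed to HTML::StripScripts
        Context => 'Flow',
        Rules   => {
            br => 0,    # illustrative assumption: disallow <br> entirely
        },
    },

    # PARSER_OPTIONS: handed to HTML::Parser
    strict_comment => 1,
);

# Sample input, made up for this example
my $html     = '<p>one<br>two</p><script>alert(1)</script>';
my $filtered = $hss->filter_html($html);

print "$filtered\n";
```

Event handlers are deliberately absent from the option list here, since the class installs its own.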
See HTML::Parser for input methods, HTML::StripScripts for output methods.
filter_html()
filter_html() is a convenience method for filtering HTML already loaded into a scalar variable. It combines calls to HTML::Parser::parse(), HTML::Parser::eof() and HTML::StripScripts::filtered_document().
    $filtered_html = $hss->filter_html($html);
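To illustrate what the convenience method saves, this sketch filters the same string twice: once with filter_html() and once with the three underlying calls made explicitly on a second filter object. The sample input and the variable names are assumptions made for the example.

```perl
use strict;
use warnings;
use HTML::StripScripts::Parser ();

# Sample input, made up for this example
my $html = '<p>Hello <b>world</b></p><script>alert("x")</script>';

# One call: parse the string, signal end of input, return the result.
my $short = HTML::StripScripts::Parser->new( { Context => 'Flow' } )
    ->filter_html($html);

# The same three steps spelled out on a separate filter object.
my $hss = HTML::StripScripts::Parser->new( { Context => 'Flow' } );
$hss->parse($html);
$hss->eof;
my $long = $hss->filtered_document;

print $short eq $long ? "identical\n" : "different\n";
```

The explicit form is mainly useful when the input arrives in chunks, since parse() can be called repeatedly before eof().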
The HTML::StripScripts::Parser class is subclassable. Filter objects are plain hashes. The hss_init() method takes the same arguments as new(), and calls the initialization methods of both HTML::StripScripts and HTML::Parser.
See "SUBCLASSING" in HTML::StripScripts and "SUBCLASSING" in HTML::Parser.
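One possible way to subclass is sketched below: a hypothetical class, My::InlineFilter, overrides hss_init() to supply a default Context before delegating to the parent. The class name and the choice of 'Inline' as the default are assumptions made for illustration; the module does not prescribe this pattern.

```perl
use strict;
use warnings;

{
    # Hypothetical subclass name, used only in this example
    package My::InlineFilter;
    use parent 'HTML::StripScripts::Parser';

    # hss_init() receives the same arguments as new(): CONFIG, then
    # any PARSER_OPTIONS. Supply a default Context, let the caller's
    # CONFIG override it, then hand everything to the parent class.
    sub hss_init {
        my ( $self, $cfg, @parser_options ) = @_;
        my %config = ( Context => 'Inline', %{ $cfg || {} } );
        return $self->SUPER::hss_init( \%config, @parser_options );
    }
}

my $filter   = My::InlineFilter->new( {} );
my $filtered = $filter->filter_html(
    '<p><b>hi</b></p><script>alert(1)</script>'
);

print "$filtered\n";
```

Because the constructor routes through hss_init(), the override applies to every object the subclass creates, and a caller can still pass an explicit Context to replace the default.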
See also HTML::StripScripts, HTML::Parser and HTML::StripScripts::LibXML.
No bugs have been reported.
Please report any bugs or feature requests to bug-html-stripscripts-parser@rt.cpan.org, or through the web interface at http://rt.cpan.org.
Original author Nick Cleaton <nick@cleaton.net>
New code added and module maintained by Clinton Gormley <clint@traveljury.com>
Copyright (C) 2003 Nick Cleaton. All Rights Reserved.
Copyright (C) 2007 Clinton Gormley. All Rights Reserved.
This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.