xpathify - output HTML document as a flat XPath/content list
version 0.019
xpathify [options] (HTML file | URL | -)
Represents a typical HTML document in a very verbose two-column format. The first column is an XPath that locates each element inside the HTML tree. The second column is the corresponding content (if any).
/html/head/title/text() test 1
/html/body/h1/text() test 2
/html/body/p[1]/text() Lorem ipsum dolor sit amet, consectetur adipiscing elit.
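The two-column idea can be illustrated with a short Python sketch. This is not xpathify's implementation (which is written in Perl); it is a minimal approximation built on the standard html.parser module. The names Node, TreeBuilder, flatten and html_to_rows are hypothetical, and the rule "add a positional index only when a tag has same-named siblings" is an assumption inferred from the sample output above.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag in HTML.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}


class Node:
    """One element of a minimal tree: a tag plus its ordered contents."""

    def __init__(self, tag, parent=None):
        self.tag = tag
        self.parent = parent
        self.items = []  # child Node objects and non-blank text strings


class TreeBuilder(HTMLParser):
    """Builds a small tree of Node objects from HTML markup."""

    def __init__(self):
        super().__init__()
        self.root = Node("")  # synthetic document root
        self.current = self.root

    def handle_starttag(self, tag, attrs):
        node = Node(tag, self.current)
        self.current.items.append(node)
        if tag not in VOID:
            self.current = node

    def handle_endtag(self, tag):
        # Close the nearest open element with this tag; ignore stray closers.
        node = self.current
        while node is not self.root and node.tag != tag:
            node = node.parent
        if node is not self.root:
            self.current = node.parent

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.current.items.append(text)


def flatten(node, path=""):
    """Yield (xpath, text) rows for every text chunk below `node`."""
    totals = {}
    for item in node.items:
        if isinstance(item, Node):
            totals[item.tag] = totals.get(item.tag, 0) + 1
    seen = {}
    for item in node.items:
        if isinstance(item, Node):
            seen[item.tag] = seen.get(item.tag, 0) + 1
            step = item.tag
            if totals[item.tag] > 1:  # assumption: index only when ambiguous
                step += "[%d]" % seen[item.tag]
            yield from flatten(item, path + "/" + step)
        else:
            yield path + "/text()", item


def html_to_rows(html):
    """Parse `html` and return the flat list of (xpath, text) rows."""
    builder = TreeBuilder()
    builder.feed(html)
    builder.close()
    return list(flatten(builder.root))
```

Calling html_to_rows() on a small document returns one (xpath, text) pair per text chunk, which can then be printed as two columns.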
This.
Specify the HTML document encoding (latin1, utf8). UTF-8 is assumed by default.
Enable syntax highlighting for XPath. By default, it is enabled automatically on interactive terminals.
Use the 16 system colors. By default, xpathify tries to use the 256-color ANSI palette.
Disables the --color option and highlights using HTML/CSS.
Shrink the XPath to the minimal unique identifier. For example:
/html/body[@id='cpansearch']/form[@class='searchbox']/input[@name='query']
can be shortened to:
//input[@name='query']
The shrinking is enabled by default.
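One way to picture the shrinking step is as a search for the shortest trailing part of the path that still matches exactly one element. The Python sketch below illustrates that idea only; it is not xpathify's algorithm. It assumes well-formed (XML-parsable) markup so that the standard xml.etree.ElementTree module can evaluate the expressions, and the function name shrink is hypothetical.

```python
import xml.etree.ElementTree as ET


def shrink(root, steps):
    """Return the shortest '//'-anchored suffix of `steps` matching one element.

    `root` is the parsed document element and `steps` is the full path split
    into location steps, root step first.
    """
    # Try ever longer suffixes, always leaving out the root step itself.
    for k in range(1, len(steps)):
        suffix = "/".join(steps[-k:])
        if len(root.findall(".//" + suffix)) == 1:
            return "//" + suffix
    # Nothing shorter is unique: keep the absolute path.
    return "/" + "/".join(steps)


doc = (
    '<html><body id="cpansearch"><form class="searchbox">'
    '<input name="query"/><input name="mode"/>'
    "</form></body></html>"
)
steps = [
    "html",
    "body[@id='cpansearch']",
    "form[@class='searchbox']",
    "input[@name='query']",
]
print(shrink(ET.fromstring(doc), steps))  # //input[@name='query']
```

If a second form also contained an input named query, the single-step suffix would no longer be unique and the sketch would keep one more step, the form.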
Strict mode disables grouping by id, class or name attributes. The grouping is enabled by default.
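The difference between grouped and strict output can be sketched as a choice of predicate for each location step. The Python fragment below is an illustration under stated assumptions, not xpathify's code: the function make_step is hypothetical, the precedence id, then class, then name simply follows the order in which the attributes are listed above, and falling back to a positional index in strict mode is a guess at the behaviour.

```python
def make_step(tag, attrs, position, strict=False):
    """Build one XPath location step for an element.

    `attrs` maps attribute names to values and `position` is the element's
    1-based index among same-named siblings.
    """
    if not strict:
        # Assumed precedence: id, then class, then name.
        for key in ("id", "class", "name"):
            value = attrs.get(key)
            if value:
                # Note: values containing a single quote are not escaped here.
                return "%s[@%s='%s']" % (tag, key, value)
    # Strict mode, or no usable attribute: identify the element by position.
    return "%s[%d]" % (tag, position)


print(make_step("input", {"name": "query"}, 1))               # input[@name='query']
print(make_step("input", {"name": "query"}, 1, strict=True))  # input[1]
```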
Print the XPath weight in a second column.
xpathify http://metacpan.org
curl http://www.msn.com | xpathify -c --strict -
xpathify --nocolor --noshrink t/test.html
Stanislaw Pusep <stas@sysd.org>
This software is copyright (c) 2014 by Stanislaw Pusep.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.