Biblio::Document::Parser::Utils - utility module for handling international characters and document conversion
Biblio::Document::Parser::Utils provides some utility functions for handling international characters and for conversion of documents to plaintext.
    use Biblio::Document::Parser::Utils qw( normalise_multichars );
    print normalise_multichars( $str );
Converts multi-character representations of international characters into single UTF-8 characters, e.g. ¨o => ö. These sequences appear in pdftotext output from PDFs generated by pdflatex.
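The conversion described above can be pictured as a table lookup. The following is only a minimal sketch under stated assumptions: the sub name sketch_normalise and the three-entry table are hypothetical and are not taken from the module, whose real mapping is not shown in this document.

```perl
use strict;
use warnings;

# Hypothetical mapping table (an assumption, not the module's own table):
# a spacing diaeresis (U+00A8) followed by a base letter maps to the
# precomposed character.
my %MULTICHAR = (
    "\x{A8}o" => "\x{F6}",    # ¨o => ö
    "\x{A8}a" => "\x{E4}",    # ¨a => ä
    "\x{A8}u" => "\x{FC}",    # ¨u => ü
);

# Build one alternation, longest sequences first, so that a longer
# sequence is never shadowed by a shorter one.
my $PATTERN = join '|',
    map  { quotemeta($_) }
    sort { length $b <=> length $a } keys %MULTICHAR;

# Replace every known multi-character sequence with its single character.
sub sketch_normalise {
    my ($str) = @_;
    $str =~ s/($PATTERN)/$MULTICHAR{$1}/g;
    return $str;
}
```

Text without any of the listed sequences passes through unchanged, so a sub of this shape is safe to apply to every extracted line.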
This function takes either a filename or a URL as a parameter, and aims to return a string containing the lines in the file. A hash of converters is provided in ParaTools/Utils.pm, which should be customised for your system.
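To illustrate what a customisable hash of converters might look like, here is a hedged sketch. The table layout, the sub name command_for and the choice of commands are assumptions made for illustration; the actual hash in ParaTools/Utils.pm may be structured differently.

```perl
use strict;
use warnings;

# Hypothetical converter table: lower-case file extension => code that
# builds the external command for one input and one output file.
# Adjust the commands to whatever is installed on your system.
my %CONVERTERS = (
    pdf => sub { my ( $in, $out ) = @_; return ( 'pdftotext', $in, $out ) },
    txt => sub { my ( $in, $out ) = @_; return ( 'cp',        $in, $out ) },
);

# Return the command as a list, or an empty list if no converter is
# registered for the file's extension.
sub command_for {
    my ( $in, $out ) = @_;
    my ($ext) = $in =~ /\.([^.\/]+)$/;
    return unless defined $ext;
    my $builder = $CONVERTERS{ lc $ext } or return;
    return $builder->( $in, $out );
}
```

A caller would pass the returned list to system() and then read the output file; returning a list rather than a single string avoids shell quoting problems with unusual filenames.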
For URLs, the file is first downloaded to a temporary directory, then converted, whereas local files are copied straight into the temporary directory. For this reason, some care should be taken when handling very large files.
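The staging step described above (download for URLs, copy for local files) could be sketched as follows. This is an illustration only: stage_location is a hypothetical name, and the use of HTTP::Tiny, File::Copy and File::Temp is an assumption about how one might do this with core modules, not a description of the module's own code. Fetching https URLs with HTTP::Tiny additionally requires IO::Socket::SSL.

```perl
use strict;
use warnings;
use File::Basename qw( basename );
use File::Copy     qw( copy );
use File::Temp     qw( tempdir );
use HTTP::Tiny;

# Place a copy of $location in a fresh temporary directory and return
# the path of that copy. The directory is removed when the program exits.
sub stage_location {
    my ($location) = @_;
    my $dir = tempdir( CLEANUP => 1 );

    if ( $location =~ m{^https?://}i ) {
        # Use the last path segment as the local name, if there is one.
        my ($name) = $location =~ m{([^/?#]+)(?:[?#].*)?$};
        $name = 'download' unless defined $name;
        my $target = "$dir/$name";
        my $res = HTTP::Tiny->new->mirror( $location, $target );
        die "Download failed: $res->{status} $res->{reason}\n"
            unless $res->{success};
        return $target;
    }

    # Local file: the whole file is copied, which is why very large
    # inputs cost both time and temporary disk space.
    my $target = "$dir/" . basename($location);
    copy( $location, $target ) or die "Copy failed: $!\n";
    return $target;
}
```

Because both branches write a full copy to disk before any conversion starts, the temporary directory needs at least as much free space as the largest input.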
Simple function to convert a string into an encoded URL (e.g. spaces become %20). Takes the unencoded URL as a parameter and returns the encoded version.
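As a rough illustration of this kind of encoding, the sketch below percent-encodes every byte that is not an RFC 3986 reserved or unreserved character. The name sketch_url_escape is hypothetical, and the choice to leave existing % signs alone (so already-encoded input is not encoded twice) is an assumption; the module's own function may treat some characters differently.

```perl
use strict;
use warnings;

# Percent-encode a whole URL given as a character string.
# URL delimiters such as ":" "/" "?" and "#" are kept as they are.
sub sketch_url_escape {
    my ($url) = @_;

    # Work on UTF-8 bytes so non-ASCII characters become %XX sequences.
    utf8::encode($url);

    $url =~ s{([^A-Za-z0-9\-._~:/?\#\[\]\@!\$&'()*+,;=%])}
             { sprintf '%%%02X', ord $1 }ge;
    return $url;
}

print sketch_url_escape('http://example.org/my paper.pdf'), "\n";
# prints http://example.org/my%20paper.pdf
```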
Tim Brody <tdb01r@ecs.soton.ac.uk>
Mike Jewell <moj@ecs.soton.ac.uk> (packaging)