RTF::HTMLConverter - Converter from RTF format to HTML.
    # Convert a file to a file with the default DOM implementation
    use XML::GDOME;
    use RTF::HTMLConverter;

    my $parser = RTF::HTMLConverter->new(in => 'test.rtf', out => 'test.html');
    $parser->parse();

    # Read from an open file handle and use XML::DOM instead
    use XML::DOM;
    use RTF::HTMLConverter;

    open my $in, '<', 'test.rtf' or die "Cannot open test.rtf: $!";
    my $parser = RTF::HTMLConverter->new(
        in                => $in,
        out               => 'test.html',
        DOMImplementation => 'XML::DOM',
        image_uri         => "http://somewhere.net/images",
        codepage          => 'iso-8859-1',
    );
    $parser->parse();

    # Collect the HTML in a scalar and skip all images
    use XML::GDOME;
    use RTF::HTMLConverter;

    my $html = '';
    my $parser = RTF::HTMLConverter->new(
        in             => 'test.rtf',
        out            => \$html,
        discard_images => 1,
    );
    $parser->parse();
RTF::HTMLConverter is a high-level RTF to HTML format converter. It is based on the low-level RTF parser module RTF::Lexer. It also requires a W3C DOM implementation and is known to work with either XML::DOM or XML::GDOME.
new() is the constructor. The following parameters are recognized:
in: Input file handle or file name. The default value is \*STDIN. See RTF::Lexer for more information.
out: Output file handle, file name, or scalar reference. If this parameter is a string, it is treated as a file name and the constructor tries to open that file. If the file already exists, it is truncated. If the file cannot be opened, an exception is thrown. If this parameter is a scalar reference, the resulting HTML is stored in that scalar.
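The three kinds of output argument described above can be told apart with Perl's ref function. The following is a minimal sketch of that idea, not the module's actual code; open_output is a hypothetical helper introduced only for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of RTF::HTMLConverter): turn an
# output argument into a writable file handle.
sub open_output {
    my ($out) = @_;
    my $kind = ref $out;

    # Plain string: treat it as a file name. Mode '>' truncates an
    # existing file; failure to open raises an exception.
    if ($kind eq '') {
        open my $fh, '>', $out or die "Cannot open '$out' for writing: $!";
        return $fh;
    }

    # Scalar reference: open an in-memory handle that writes into the scalar.
    if ($kind eq 'SCALAR') {
        open my $fh, '>', $out or die "Cannot open in-memory handle: $!";
        return $fh;
    }

    # Anything else (glob reference, IO::Handle object) is assumed to be
    # an already opened handle.
    return $out;
}

my $html = '';
my $fh = open_output(\$html);
print {$fh} "<p>converted text</p>\n";
close $fh;
# $html now holds the printed markup.
```

Passing a file name instead of \$html would write the same markup to that file, truncating it first.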
DOMImplementation: The DOM implementation module name. Supported values are XML::DOM and XML::GDOME. The default value is XML::GDOME.
codepage: The charset of the resulting HTML document. The default is utf8. This parameter is recognized only if DOMImplementation is XML::GDOME.
The formatting of the resulting HTML document. This parameter is recognized only if DOMImplementation is XML::GDOME. Possible values are GDOME_SAVE_STANDARD and GDOME_SAVE_LIBXML_INDENT. The default value is GDOME_SAVE_LIBXML_INDENT. See XML::GDOME::Document for more information.
A reference to an array ($name, $publicId, $systemId) if DOMImplementation is XML::GDOME, or ($name, $systemId, $publicId) if DOMImplementation is XML::DOM. Default values are:
    $name      HTML
    $publicId  -//W3C//DTD HTML 4.01 Transitional//EN
    $systemId  http://www.w3.org/TR/html4/loose.dtd
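Because the element order differs between the two DOM implementations, it can help to keep the three values in a hash and build the array reference on demand. This is a hedged sketch; doctype_args and the hash keys are illustrative names, not part of the module.

```perl
use strict;
use warnings;

# The default values listed above, keyed by role.
my %doctype = (
    name     => 'HTML',
    publicId => '-//W3C//DTD HTML 4.01 Transitional//EN',
    systemId => 'http://www.w3.org/TR/html4/loose.dtd',
);

# Hypothetical helper: return the array reference in the order the
# chosen DOM implementation expects.
sub doctype_args {
    my ($impl) = @_;
    return $impl eq 'XML::DOM'
        ? [ @doctype{qw(name systemId publicId)} ]    # XML::DOM order
        : [ @doctype{qw(name publicId systemId)} ];   # XML::GDOME order
}

print join("\n", @{ doctype_args('XML::DOM') }), "\n";
```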
discard_images: If set, this parameter disables all image processing. By default it is unset.
image_uri: The string that, concatenated with an image name, gives that image's URL. The default value is an empty string.
The name of the directory where the images are generated. The default value is an empty string, which means the current directory.
The pattern for generating image names from their numbers. The default value is img%d.
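The way the URL prefix, the image directory, and the name pattern could combine can be sketched with sprintf. This is an illustration only: the variable names, the file extension, and the trailing slash on the prefix are assumptions, and the documentation does not say whether the module inserts a separator itself.

```perl
use strict;
use warnings;

# Illustrative settings; these variable names are hypothetical.
my $uri_prefix = 'http://somewhere.net/images/';   # URL prefix
my $image_dir  = 'images';                         # '' would mean the current directory
my $pattern    = 'img%d';                          # the documented default pattern

# Build the on-disk path and the URL for image number $n.
# The extension is an assumption; it is passed in by the caller.
sub image_locations {
    my ($n, $ext) = @_;
    my $name = sprintf($pattern, $n) . ".$ext";
    my $path = $image_dir eq '' ? $name : "$image_dir/$name";
    return ($path, $uri_prefix . $name);
}

my ($path, $url) = image_locations(3, 'png');
print "$path\n$url\n";
```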
The path to ImageMagick's convert utility. The default value is simply convert, assuming it is in one of the $ENV{PATH} directories.
The path to ImageMagick's mogrify utility. If the value is undef or the specified file does not exist, the images extracted from the RTF will not be scaled. The default value is mogrify.
The path to libwmf's wmf2eps utility. If the value is undef or the specified file does not exist, WMF images will not be extracted from the RTF. The default value is wmf2eps.
The display resolution in dpi. Default value is 100.
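RTF stores picture dimensions in twips (1/1440 of an inch), so a display resolution is what turns them into pixel sizes. The sketch below shows that arithmetic. How the module itself applies the resolution is not documented here, and twips_to_pixels is a hypothetical helper.

```perl
use strict;
use warnings;

use constant TWIPS_PER_INCH => 1440;

# Hypothetical helper: convert a length in twips to whole pixels
# at the given resolution (defaults to the documented 100 dpi).
sub twips_to_pixels {
    my ($twips, $dpi) = @_;
    $dpi = 100 unless defined $dpi;
    return int($twips * $dpi / TWIPS_PER_INCH + 0.5);   # round to nearest
}

# A 3 x 2 inch picture at the default resolution.
printf "%d x %d px\n", twips_to_pixels(4320), twips_to_pixels(2880);
```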
parse() parses the input RTF stream until the end of the file.
RTF::Lexer, Rich Text Format (RTF) Specification (version 1.7), The_RTF_Cookbook, RTF::Parser, RTF::Tokenizer.
Symbols that are absent from the Unicode character set will be displayed incorrectly.
Images that are stored in the RTF file in WMF format may be scaled incorrectly.
Text in WMF images in a non-ASCII charset may be displayed incorrectly.
And there should be lots of unknown bugs;)
Vadim O. Ustiansky <ustiansky@cpan.org>