Encode::Detect::CJK - A charset detector optimized for East Asian charsets and website content
    use Encode::Detect::CJK;               # just load the module
    use Encode::Detect::CJK qw(detect);    # load the module and export the detect function

    # simple usage
    my $charset = CharsetDetector::detect($octets);

    # usage with advanced options
    my $charset = CharsetDetector::detect($octets, $max_len, $is_consider_html_head_charset);

    # Returns the charset of the binary string $octets.
    #
    # $max_len: detection is slow when $octets is large, so you sometimes
    #   need to specify $max_len to limit it. undef selects the default
    #   (unlimited).
    #
    # $is_consider_html_head_charset: by default, the detector treats the
    #   HTML header
    #   (e.g. <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />)
    #   as one factor when detecting the charset. If you do not want the
    #   detector to consider the HTML header, set this argument to "" or 0.
    $charset = CharsetDetector::detect($octets, $max_len, $is_consider_html_head_charset);
    $charset = CharsetDetector::detect($octets, $max_len);   # same as CharsetDetector::detect($octets, $max_len, 1)
    $charset = CharsetDetector::detect($octets);             # same as CharsetDetector::detect($octets, undef)
$octets: the input binary string.
$max_len: detection is slow when $octets is large, so you sometimes need to specify $max_len to limit it. Passing undef selects the default, which is unlimited.
$is_consider_html_head_charset: by default, the detector treats the HTML header (e.g. <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />) as one factor when detecting the charset. If you do not want the detector to consider the HTML header, set this argument to "" or 0.
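As a rough illustration of the kind of hint an HTML header provides, the sketch below extracts the declared charset from a `<meta>` tag with a regular expression. The helper name `meta_charset_hint` is hypothetical and is not part of Encode::Detect::CJK. How the detector actually weighs this hint against the byte content is not described here, so this shows only the input side of that factor.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Encode::Detect::CJK): return the charset
# declared in an HTML <meta> tag of a byte string, or nothing if none is found.
sub meta_charset_hint {
    my ($octets) = @_;
    return unless defined $octets;

    # Matches both <meta charset="..."> and the http-equiv Content-Type form.
    if ($octets =~ /<meta\b[^>]*?\bcharset\s*=\s*["']?([A-Za-z0-9._:-]+)/i) {
        return lc $1;
    }
    return;
}

my $html = '<meta http-equiv="Content-Type" content="text/html; charset=utf-8" />';
my $hint = meta_charset_hint($html);
print "declared charset: $hint\n";    # declared charset: utf-8
```

A declared charset can be wrong or missing on real pages, which is presumably why it is treated as one factor and can be switched off.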
If $octets is undefined, returns ''.
If $octets is the empty string (''), returns 'iso-8859-1'.
Otherwise, returns the charset name.
    return value : aliases
    ascii        : ascii
    iso-8859-1   : iso-8859-1
    utf8         : utf8, utf-8-strict
    utf16        : utf16
    cp936        : euc-cn (gb2312), cp936 (gbk), gb18030
    big5-eten    : big5-eten
    euc-jp       : euc-jp
    shiftjis     : shiftjis
    iso-2022-jp  : iso-2022-jp
    euc-kr       : euc-kr
    iso-2022-kr  : iso-2022-kr
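The return values listed above, such as cp936 and euc-jp, are also encoding names recognized by Perl's core Encode module, so a detected name can typically be passed straight to `Encode::decode`. The sketch below assumes exactly that. The wrapper `decode_detected` and the stub detector are hypothetical and exist only so the example runs without the module installed. In real use the code reference would be something like `\&CharsetDetector::detect`, as named in the synopsis.

```perl
use strict;
use warnings;
use Encode qw(decode find_encoding);

# Hypothetical wrapper: $detector is any code reference that follows the
# documented return-value rules ('' for undefined input, else a charset name).
sub decode_detected {
    my ($octets, $detector) = @_;
    my $charset = $detector->($octets);

    # '' means the input was undefined, so there is nothing to decode.
    return unless defined $charset && length $charset;

    # Make sure Encode knows the reported name before decoding.
    find_encoding($charset) or die "Unknown charset: $charset";
    return decode($charset, $octets);
}

# Stub standing in for the real detector, so the sketch is self-contained.
my $stub = sub { defined $_[0] ? 'cp936' : '' };

my $octets = "\xD6\xD0\xCE\xC4";    # U+4E2D U+6587 encoded as GBK
my $text   = decode_detected($octets, $stub);
printf "%d characters, first is U+%04X\n", length($text), ord($text);
# prints: 2 characters, first is U+4E2D
```

Checking the name with `find_encoding` first gives a clear error if a detector ever reports a name that the local Encode installation does not support.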
The CharsetDetector module is Copyright (c) 2003-2008 QIAN YU. All rights reserved.
You may distribute under the terms of either the GNU General Public License or the Artistic License, as specified in the Perl README file.
To install Encode::Detect::CJK, copy and paste the appropriate command into your terminal.

cpanm:

    cpanm Encode::Detect::CJK

CPAN shell:

    perl -MCPAN -e shell
    install Encode::Detect::CJK