Biblio::Isis - Read CDS/ISIS, WinISIS and IsisMarc database
  use Biblio::Isis;

  my $isis = new Biblio::Isis(
      isisdb => './cds/cds',
  );

  for (my $mfn = 1; $mfn <= $isis->count; $mfn++) {
      print $isis->to_ascii($mfn), "\n";
  }
This module will read ISIS databases created by DOS CDS/ISIS, WinIsis or IsisMarc. It can be used as a perl-only alternative to the OpenIsis module, which seems to have deprecated its old XS bindings for perl.
It can create hash values from data in an ISIS database (using to_hash), an ASCII dump (using to_ascii) or just a hash with field names and packed values like ^asomething^belse (using fetch).
A unique feature of this module is the ability to include deleted records (see the include_deleted option). It will also skip zero-sized fields (OpenIsis has a bug in its XS bindings, so fields which are zero sized will be filled with random junk from memory).
It also supports identifiers (only if the ISIS database was created by IsisMarc), see to_hash.
This module will always be slower than the OpenIsis module, which uses a C library. However, since it's written in perl, it's platform independent (so you don't need a C compiler) and can be easily modified. I hope that it creates data structures which are easier to use than the ones created by OpenIsis, so reduced time in other parts of the code should compensate for the slower performance of this module (speed of reading an ISIS database is rarely an issue).
Open ISIS database
  my $isis = new Biblio::Isis(
      isisdb => './cds/cds',
      read_fdt => 1,
      include_deleted => 1,
      hash_filter => sub {
          my ($v, $field_number) = @_;
          $v =~ s#foo#bar#g;
          return $v;    # hand the filtered value back
      },
      debug => 1,
      join_subfields_with => ' ; ',
  );
Options are described below:
isisdb - This is the full or relative path to the ISIS database files, given as the common prefix of the .MST and .XRF files and, optionally, the .FDT file (if using the read_fdt option).
In this example it uses ./cds/cds.MST and related files.
read_fdt - Boolean flag to specify whether the field definition table should be read. It's off by default.
include_deleted - Don't skip logically deleted records in ISIS.
hash_filter - Filter code ref which will be used before data is converted to a hash. It will receive two arguments: the whole line from the current field (in $_[0]) and the field number (in $_[1]).
debug - Dump a lot of debugging output, even at level 1. For even more, increase the level.
join_subfields_with - Define the delimiter which will be used to join repeatable subfields. This option is included to support legacy applications written against versions of this module older than 0.21. By default, it is disabled. See "to_hash".
ignore_empty_subfields - Remove all empty subfields while reading from the ISIS file.
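As an illustration of the hash_filter option, here is a minimal sketch of a filter that only touches one field. The field number 210 and the whitespace clean-up are made up for the example, and the sketch assumes that the value returned by the filter is the one that ends up in the hash.

```perl
use strict;
use warnings;

# Hypothetical filter: receives the field value and the field number, as
# described for hash_filter. Assumption: the returned value replaces the
# original one.
my $filter = sub {
    my ($v, $field_number) = @_;
    # Collapse runs of whitespace, but only in field 210.
    $v =~ s/\s+/ /g if $field_number == 210;
    return $v;
};

print $filter->('^aNew   York', 210), "\n";
```

Under that assumption, a code ref like this could be passed as hash_filter => $filter when calling new.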
Return number of records in database
print $isis->count;
Read record with selected MFN
my $rec = $isis->fetch(55);
Returns a hash whose keys are field names and whose values are the unpacked values for that field, like this:
  $rec = {
      '210' => [ '^aNew York^cNew York University press^dcop. 1988' ],
      '990' => [ '2140', '88', 'HAY' ],
  };
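Values returned by fetch keep their subfields packed behind ^ markers. The following sketch shows one way to split such a value by hand; split_subfields is a hypothetical helper written for this example, not a method of this module, and it assumes one-character subfield codes.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Biblio::Isis): split a packed field value
# such as '^aNew York^cNew York University press' into [code, value] pairs.
sub split_subfields {
    my ($packed) = @_;
    my @pairs;
    # Each subfield starts with '^' followed by a one-character code.
    while ($packed =~ /\^(.)([^\^]*)/g) {
        push @pairs, [ $1, $2 ];
    }
    return @pairs;
}

my $rec = {
    '210' => [ '^aNew York^cNew York University press^dcop. 1988' ],
    '990' => [ '2140', '88', 'HAY' ],
};

for my $pair (split_subfields($rec->{'210'}[0])) {
    printf "%s => %s\n", @$pair;
}
```

Values without any ^ marker (like those in field 990) yield an empty list here.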
Returns current MFN position
my $mfn = $isis->mfn;
Returns ASCII output of record with specified MFN
print $isis->to_ascii(42);
This outputs something like this:
  210 ^aNew York^cNew York University press^dcop. 1988
  990 2140
  990 88
  990 HAY
If read_fdt is specified when calling new, it will display field names from the .FDT file instead of numeric tags.
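To make the relation between the fetch structure and this dump concrete, here is a small sketch that prints one line per field occurrence. record_to_lines is a hypothetical helper, not part of this module; the numeric ordering of tags and the single-space separator are assumptions made for the example.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Biblio::Isis): render a fetch-style record
# as one "tag value" line per field occurrence, ordered by numeric tag.
sub record_to_lines {
    my ($rec) = @_;
    my @lines;
    for my $tag (sort { $a <=> $b } keys %$rec) {
        push @lines, "$tag $_" for @{ $rec->{$tag} };
    }
    return join("\n", @lines);
}

my $dump_rec = {
    '210' => [ '^aNew York^cNew York University press^dcop. 1988' ],
    '990' => [ '2140', '88', 'HAY' ],
};

print record_to_lines($dump_rec), "\n";
```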
Read record with specified MFN and convert it to hash
my $hash = $isis->to_hash($mfn);
It can convert characters from the ISIS database (using hash_filter) before creating structures, which enables character re-mapping or a quick fix-up of data.
This function returns a hash like this:
  $hash = {
      '210' => [
          {
              'c' => 'New York University press',
              'a' => 'New York',
              'd' => 'cop. 1988'
          }
      ],
      '990' => [ '2140', '88', 'HAY' ],
  };
You can later use that hash to produce any output from ISIS data.
If the database was created using IsisMarc, it will also have two special fields which are used for identifiers, i1 and i2, like this:
  '200' => [
      {
          'i1' => '1',
          'i2' => ' ',
          'a' => 'Goa',
          'f' => 'Valdo D\'Arienzo',
          'e' => 'tipografie e tipografi nel XVI secolo',
      }
  ],
If there are repeatable subfields in a record, this will create the following structure:
  '900' => [
      {
          'a' => [ 'foo', 'bar', 'baz' ],
      }
  ]
Or in more complex example of
902 ^aa1^aa2^aa3^bb1^aa4^bb2^cc1^aa5
it will create
  902 => [
      { a => ["a1", "a2", "a3", "a4", "a5"], b => ["b1", "b2"], c => "c1" },
  ],
This behaviour can be changed using the join_subfields_with option to "new", in which case to_hash will always create a single value for each subfield by joining repeated values with the given delimiter.
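The difference between the two shapes can be sketched in plain perl. group_subfields below is a hypothetical helper, not this module's implementation; the joined output it produces is an assumption based on the description of join_subfields_with, using the ' ; ' delimiter from the "new" example.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Biblio::Isis): group the subfields of one
# packed field value by code. Repeated codes become array refs unless $join
# is given, in which case repeated values are joined into a single string.
sub group_subfields {
    my ($packed, $join) = @_;
    my %seen;
    while ($packed =~ /\^(.)([^\^]*)/g) {
        push @{ $seen{$1} }, $2;
    }
    my %out;
    for my $code (keys %seen) {
        my @values = @{ $seen{$code} };
        $out{$code} = defined $join ? join($join, @values)
                    : @values > 1   ? \@values
                    :                 $values[0];
    }
    return \%out;
}

my $packed = '^aa1^aa2^aa3^bb1^aa4^bb2^cc1^aa5';
my $nested = group_subfields($packed);           # a and b are array refs
my $joined = group_subfields($packed, ' ; ');    # every value is a string

print "a: $joined->{a}\n";
print "b: $joined->{b}\n";
print "c: $joined->{c}\n";
```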
This method will also create an additional field 000 containing the MFN.
There is also a more elaborate way to call to_hash, like this:
  my $hash = $isis->to_hash({
      mfn => 42,
      include_subfields => 1,
  });
Each option controls how the hash is created:
mfn - Specify the MFN of the record.
include_subfields - This option will create an additional key in the hash called subfields, which holds the original subfield order of the record and the index of each subfield, like this:
  902 => [
      {
          a => ["a1", "a2", "a3", "a4", "a5"],
          b => ["b1", "b2"],
          c => "c1",
          subfields => ["a", 0, "a", 1, "a", 2, "b", 0, "a", 3, "b", 1, "c", 0, "a", 4],
      }
  ],
join_subfields_with - Define the delimiter which will be used to join repeatable subfields. You can specify this option here instead of in "new" if you want per-record control.
hash_filter - You can override the hash_filter defined in "new" using this option.
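The subfields key produced by include_subfields is enough to restore the original subfield order. A minimal sketch, where repack_field is a hypothetical helper (not part of this module) that reads subfields as a flat list of (code, index) pairs:

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Biblio::Isis): rebuild the packed value of
# one field from a to_hash-style structure that carries the 'subfields' key.
sub repack_field {
    my ($field) = @_;
    my @order  = @{ $field->{subfields} };
    my $packed = '';
    # 'subfields' is a flat list of (code, index) pairs in original order.
    while (my ($code, $idx) = splice(@order, 0, 2)) {
        my $value = $field->{$code};
        # Repeatable subfields are array refs, single ones are plain scalars.
        $value = $value->[$idx] if ref $value eq 'ARRAY';
        $packed .= "^$code$value";
    }
    return $packed;
}

my $field = {
    a => ["a1", "a2", "a3", "a4", "a5"],
    b => ["b1", "b2"],
    c => "c1",
    subfields => ["a", 0, "a", 1, "a", 2, "b", 0, "a", 3, "b", 1, "c", 0, "a", 4],
};

print repack_field($field), "\n";
```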
Return name of selected tag
print $isis->tag_name('200');
Read the content of the .CNT file and return a hash containing it.
print Dumper($isis->read_cnt);
This function is not used by module (.CNT files are not required for this module to work), but it can be useful to examine your index (while debugging for example).
Unpack one of the two 26-byte fixed-length records in the .CNT file.
Here is definition of record:
  off key       description                               size
   0: IDTYPE    BTree type                                s
   2: ORDN      Nodes Order                               s
   4: ORDF      Leafs Order                               s
   6: N         Number of Memory buffers for nodes        s
   8: K         Number of buffers for first level index   s
  10: LIV       Current number of Index Levels            s
  12: POSRX     Pointer to Root Record in N0x             l
  16: NMAXPOS   Next Available position in N0x            l
  20: FMAXPOS   Next available position in L0x            l
  24: ABNORMAL  Formal BTree normality indicator          s
  length: 26 bytes
This will fill the $self object, under the key cnt, with a hash. It's used by read_cnt.
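The record layout above maps naturally onto perl's unpack. The following is only a sketch of that idea, not the code this module uses: unpack_cnt_record is a hypothetical name, the little-endian byte order is an assumption, and the sample values are made up to exercise the template.

```perl
use strict;
use warnings;

# Field names in the order given by the record definition.
my @CNT_FIELDS = qw(IDTYPE ORDN ORDF N K LIV POSRX NMAXPOS FMAXPOS ABNORMAL);

# Assumption: little-endian layout. 's<' is a signed 16-bit integer and
# 'l<' a signed 32-bit integer, which adds up to 6*2 + 3*4 + 2 = 26 bytes.
my $CNT_TEMPLATE = 's<6 l<3 s<';

# Hypothetical helper (not part of Biblio::Isis): unpack one 26-byte record.
sub unpack_cnt_record {
    my ($bytes) = @_;
    die "expected 26 bytes\n" unless length($bytes) == 26;
    my %cnt;
    @cnt{@CNT_FIELDS} = unpack($CNT_TEMPLATE, $bytes);
    return \%cnt;
}

# Round trip with made-up values, just to exercise the template.
my $sample = pack($CNT_TEMPLATE, 1, 5, 5, 15, 5, 1, 1, 2, 3, 0);
my $cnt    = unpack_cnt_record($sample);
print "$_ = $cnt->{$_}\n" for @CNT_FIELDS;
```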
Some parts of the CDS/ISIS documentation are not detailed enough to explain some variations in the input databases which have been tested with this module. When I was in doubt, I assumed that OpenIsis's implementation was right (except for obvious bugs).
However, every effort has been made to test this module with as many databases (and programs that create them) as possible.
I would be very grateful for success or failure reports about usage of this module with databases from programs other than WinIsis and IsisMarc. I have tested this against the output of one isis.dll-based application, but I don't know any details about its version.
As this is a young module, new features are added in subsequent versions. It's a good idea to specify the version when using this module, like this:
  use Biblio::Isis 0.23;
Below is a list of changes in specific versions of the module (so you can target older versions if you really have to):
Added ignore_empty_subfields
Added hash_filter to "to_hash"
Fixed bug with documented join_subfields_with in "new" which wasn't implemented
Added field number when calling hash_filter
Added join_subfields_with to "new" and "to_hash".
Added include_subfields to "to_hash".
Added $isis->mfn, support for repeatable subfields and $isis->to_hash({ mfn => 42, ... }) calling convention
Dobrica Pavlinusic CPAN ID: DPAVLIN dpavlin@rot13.org http://www.rot13.org/~dpavlin/
This module is based heavily on code from LIBISIS.PHP library to read ISIS files V0.1.1 written in php and (c) 2000 Franck Martin <franck@sopac.org> and released under LGPL.
This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
The full text of the license can be found in the LICENSE file included with this module.
Biblio::Isis::Manual for CDS/ISIS manual appendix F, G and H which describe file format
OpenIsis web site http://www.openisis.org
perl4lib site http://perl4lib.perl.org