MCDB_File - Perl extension for access to mcdb constant databases
    use MCDB_File ();
    tie %mcdb, 'MCDB_File', 'file.mcdb' or die "tie failed: $!\n";
    $value = $mcdb{$key};
    $num_records = scalar $mcdb;
    untie %mcdb;

    use MCDB_File ();
    eval {
        my $mcdb_make = new MCDB_File::Make('t.mcdb')
          or die "create t.mcdb failed: $!\n";
        $mcdb_make->insert('key1', 'value1');
        $mcdb_make->insert('key2' => 'value2', 'key3' => 'value3');
        $mcdb_make->insert(%t);
        $mcdb_make->finish;
    } or ($@ ne "" and warn "$@");

    use MCDB_File ();
    eval {
        MCDB_File::Make::create $file, %t;
    } or ($@ ne "" and warn "$@");
MCDB_File is a module which provides a Perl interface to mcdb. mcdb is originally based on Dan Bernstein's cdb package.
mcdb - fast, reliable, simple code to create, read constant databases
After the tie shown above, accesses to %mcdb refer to the mcdb file file.mcdb, as described in "tie" in perlfunc.
keys, values, and each can be used to iterate through records. Only one iteration can be in progress at a time: nested loops over the same tied hash share a single iterator rather than having independent ones, and should therefore be avoided. It is safe to call the find('key') method while iterating. See the PERFORMANCE section below for sample usage.
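The following is a minimal sketch of calling find while an each loop is in progress. The file name, the sample records, and the variable names are made up for illustration; every value in the sample data is also a key, so each find call looks up an existing record.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use MCDB_File ();

# Throwaway database in a temporary directory (illustrative data:
# each person maps to their manager, and alice manages herself).
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/managers.mcdb";
MCDB_File::Make::create($file, alice => 'alice', bob => 'alice', carol => 'bob');

my %h;
my $mcdb = tie %h, 'MCDB_File', $file or die "tie failed: $!\n";

my @lines;
while (my ($person, $manager) = each %h) {
    # Look up a second record without starting another iteration.
    my $boss_of_manager = $mcdb->find($manager);
    push @lines, "$person -> $manager -> $boss_of_manager";
}
print "$_\n" for sort @lines;

# Release the extra reference before untie.
undef $mcdb;
untie %h;
```

The lines are sorted before printing because the sketch makes no assumption about the order in which records are returned.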
An mcdb file is created in three steps. First, call new MCDB_File::Make($fname), where $fname is the name of the database file to be created. Second, call the insert method once for each (key, value) pair. Finally, call the finish method to complete the creation. A temporary file is used during mcdb creation and is atomically renamed to $fname when the finish method succeeds.
Alternatively, call the insert() method with multiple key/value pairs. This can be significantly faster because there are fewer crossings of the bridge from Perl to C code. One simple way to do this is to pass in an entire hash, as in $mcdb_make->insert(%hash).
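When records arrive from a stream rather than from an existing hash, one possible approach is to buffer pairs in an array and pass them to insert in batches. This is only a sketch: the file name, the generated records, and the batch size of 1000 pairs are arbitrary choices for illustration, not recommendations from the module.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use MCDB_File ();

my $dir        = tempdir(CLEANUP => 1);
my $file       = "$dir/batch.mcdb";   # illustrative file name
my $batch_size = 1000;                # pairs per insert() call; arbitrary

eval {
    my $make = MCDB_File::Make->new($file)
      or die "create $file failed: $!\n";
    my @batch;
    for my $i (1 .. 2500) {           # stand-in for a real record source
        push @batch, "key$i", "value$i";
        if (@batch >= 2 * $batch_size) {   # two list elements per pair
            $make->insert(@batch);
            @batch = ();
        }
    }
    $make->insert(@batch) if @batch;  # flush the final partial batch
    $make->finish;
    1;
} or die "building $file failed: $@";

# Read one record back to confirm the final batch was written.
my %h;
tie %h, 'MCDB_File', $file or die "tie failed: $!\n";
my $last = $h{key2500};
untie %h;
print "key2500 => $last\n";
```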
A simpler interface to mcdb file creation is provided by MCDB_File::Make::create $fname, %t. This creates an mcdb file named $fname containing the contents of %t.
These are all complete programs.
1. Use $mcdb->find('key') method to look up a 'key' in an mcdb.
    use MCDB_File ();
    $mcdb = tie %h, 'MCDB_File', "$file.mcdb" or die ...;
    $value = $mcdb->find('key'); # slightly faster than $value = $h{key};
    undef $mcdb;
    untie %h;
2. Convert a Berkeley DB (B-tree) database to mcdb format.
    use MCDB_File ();
    use DB_File;
    tie %h, 'DB_File', $ARGV[0], O_RDONLY, undef, $DB_BTREE
      or die "$0: can't tie to $ARGV[0]: $!\n";
    MCDB_File::Make::create $ARGV[1], %h; # croak()s if error
3. Convert a flat file to mcdb format. In this example, the flat file consists of one key per line, separated by a colon from the value. Blank lines and lines beginning with # are skipped.
    use MCDB_File;
    eval {
        my $mcdb = new MCDB_File::Make("data.mcdb")
          or die "$0: new MCDB_File::Make failed: $!\n";
        while (<>) {
            next if /^$/ or /^#/;
            chomp;
            ($k, $v) = split /:/, $_, 2;
            if (defined $v) {
                $mcdb->insert($k, $v);
            } else {
                warn "bogus line: $_\n";
            }
        }
        $mcdb->finish;
    } or ($@ ne "" and die "$@");
4. Perl version of mcdbctl dump.
    use MCDB_File ();
    tie %data, 'MCDB_File', $ARGV[0]
      or die "$0: can't tie to $ARGV[0]: $!\n";
    while (($k, $v) = each %data) {
        print '+', length $k, ',', length $v, ":$k->$v\n";
    }
    print "\n";
5. Although an mcdb file is constant, you can simulate updating it in Perl. This is an expensive operation, as you have to create a new database and copy into it everything that is unchanged from the old database. (As compensation, the update does not affect database readers. The old database remains available to them until the moment the new one is finished.)
    use MCDB_File ();
    $file = 'data.cdb';
    tie %old, 'MCDB_File', $file
      or die "$0: can't tie to $file: $!\n";
    $new = new MCDB_File::Make($file)
      or die "$0: new MCDB_File::Make failed: $!\n";
    eval {
        # Add the new values; remember which keys we've seen.
        while (<>) {
            chomp;
            ($k, $v) = split;
            $new->insert($k, $v);
            $seen{$k} = 1;
        }
        # Add any old values that haven't been replaced.
        while (($k, $v) = each %old) {
            $new->insert($k, $v) unless $seen{$k};
        }
        $new->finish;
    } or ($@ ne "" and die "$@");
Most users can ignore this section.
An mcdb file can contain repeated keys. If the insert method is called more than once with the same key during the creation of an mcdb file, that key will be repeated.
Here's an example.
    $mcdb = new MCDB_File::Make("$file.mcdb") or die ...;
    $mcdb->insert('cat', 'gato');
    $mcdb->insert('cat', 'chat');
    $mcdb->finish;
Normally, any attempt to access a key retrieves the first value stored under that key. This code snippet always prints gato.
    $catref = tie %catalogue, 'MCDB_File', "$file.mcdb" or die ...;
    print "$catalogue{cat}";
However, all the usual ways of iterating over a hash (keys, values, and each) do the Right Thing, even in the presence of repeated keys. This code snippet prints cat cat gato chat.
print join(' ', keys %catalogue, values %catalogue);
And these two both print cat:gato cat:chat, although the second is more efficient.
    foreach $key (keys %catalogue) {
        print "$key:$catalogue{$key} ";
    }

    while (($key, $val) = each %catalogue) {
        print "$key:$val ";
    }
The multi_get method retrieves all the values associated with a key. It returns a reference to an array containing all the values. This code prints gato chat.
print "@{$catref->multi_get('cat')}";
multi_get always returns an array reference. If the key was not found in the database, it will be a reference to an empty array. To test whether the key was found, you must test the array, and not the reference.
    $x = $catref->multi_get($key);
    warn "$key not found\n" unless $x; # WRONG; message never printed
    warn "$key not found\n" unless @$x; # Correct
Any extra references to the MCDB_File object (like $catref in the examples above) must be released with undef, or must have gone out of scope, before calling untie on the hash. This ensures that the object's DESTROY method is called. Note that perl -w will check this for you; see perltie for further details.
    use MCDB_File ();
    $catref = tie %catalogue, 'MCDB_File', "$file.mcdb" or die ...;
    print "@{$catref->multi_get('cat')}";
    undef $catref;
    untie %catalogue;
The routines tie and new return undef if the attempted operation failed; $! contains the reason for failure. insert and finish call croak if the attempted operation fails.
The following fatal errors may occur. (See "eval" in perlfunc if you want to trap them.)
You attempted to modify a hash tied to an MCDB_File.
An OS level problem occurred, such as permission denied writing to filesystem, or you have run out of disk space.
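The two failure styles described above can be sketched as follows: a failed tie is detected by checking its return value and reading $!, while a fatal error such as an attempted modification is trapped with eval. The file names and variable names are illustrative only, and the exact wording of the error messages is not assumed here.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use MCDB_File ();

my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/readonly.mcdb";      # illustrative file name
MCDB_File::Make::create($file, key1 => 'value1');

# tie returns undef on failure and leaves the reason in $!.
my %missing;
my $failed = tie %missing, 'MCDB_File', "$dir/no-such-file.mcdb";
print "tie failed: $!\n" unless $failed;

# Writing through a hash tied to an mcdb is a fatal error; eval traps it.
my %h;
tie %h, 'MCDB_File', $file or die "tie failed: $!\n";
my $trapped = '';
eval { $h{key2} = 'value2'; 1 } or $trapped = $@;
print "trapped: $trapped" if $trapped ne '';
untie %h;
```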
Sometimes you need to get the most performance possible out of a library. Rumour has it that perl's tie() interface is slow. In order to get around that you can use MCDB_File in an object oriented fashion, rather than via tie().
    my $mcdb = MCDB_File->TIEHASH('/path/to/mcdbfile.mcdb');

    if ($mcdb->EXISTS('key')) {
        print "Key: 'key'; Value: ", $mcdb->FETCH('key'), "\n";
    }
    undef $mcdb;
For more information on the methods available on tied hashes see perltie.
Because Perl internally reuses the FETCH method to support each() and values() as well as ordinary queries, it is slightly more efficient to call the $mcdb->find('key') method than to call $mcdb->FETCH('key').
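A minimal sketch combining the two points above: the database is opened in object-oriented fashion and values are retrieved with find rather than FETCH. The file name and sample records are made up for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use MCDB_File ();

my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/lookup.mcdb";        # illustrative file name
MCDB_File::Make::create($file, key1 => 'value1', key2 => 'value2');

# Object-oriented use: no tied hash is involved in the lookups.
my $mcdb = MCDB_File->TIEHASH($file) or die "open $file failed: $!\n";

my @found;
for my $key (qw(key1 key2)) {
    next unless $mcdb->EXISTS($key);
    push @found, $key . '=' . $mcdb->find($key);   # find() instead of FETCH()
}
print "@found\n";

undef $mcdb;   # release the object when done
```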
mcdb is based on cdb, created by Dan Bernstein <djb@koobera.math.uic.edu>. MCDB_File is based on CDB_File, created by Tim Goodwin, <tjg@star.le.ac.uk> and currently maintained by Todd Rinaldo https://github.com/toddr/CDB_File/
gstrauss <code () gluelogic.com>