
NAME

SPOPS::Manual::ImportExport - Moving data (and more) with SPOPS

SYNOPSIS

This part of the SPOPS manual describes how to get data in and out of SPOPS datasources, and also how to create portable database structures.

DESCRIPTION

Once you can retrofit object-oriented access and behaviors on top of existing datasources, the world is your oyster, right?

Except...

Getting data in and out of these datasources can still be an issue. One of the benefits of SPOPS is that databases become more transparent -- you don't really care where the data are stored, just that you can access them.

Of course, one of the downsides of this transparency is that you might actually take advantage of it! Being able to move from MySQL to PostgreSQL may become much more important, particularly if you're consolidating disparate datasources down to one or two.

To help with this, SPOPS (as of 0.55) comes with importing and exporting capabilities. And, of course, it's fairly simple for you to extend these capabilities and create your own.

This manual section describes the basics of getting data in and out (with examples) and along the way points out ways you can extend it.

IMPORTING

There are two types of importing we'll talk about: data and structures. Currently the only structural support is for DBI database tables, but the framework exists for other datastores.

Importing Data

There's no better way to introduce a topic than to show working code, so here's a simple example of a script to import data:

  #!/usr/bin/perl

  use strict;
  use SPOPS::Import;
  use SPOPS::Initialize;

  {
      # Read the SPOPS class configuration so My::Doodad is available
      SPOPS::Initialize->process({ filename => 'spops.conf' });

      # Create an 'object' importer and hand it the metadata and
      # records from the __DATA__ section below
      my $import = SPOPS::Import->new( 'object' )
                                ->data_from_fh( \*DATA );

      # Nothing is written to the datasource until run() is called;
      # run() returns one status entry per record
      my $status = $import->run;
      foreach my $item ( @{ $status } ) {
          print "Status for $item->[1][0]: ";
          if ( $item->[0] ) { print "OK\n" }
          else              { print "FAILED ($item->[2])\n" }
      }
  }

  __DATA__
  $item = [
    { spops_class => 'My::Doodad',
      field_order => [ qw/ name description unit_cost factory created_by / ] },
    [ q{Amazing Melonhead}, q{Spits seeds}, q{100.75}, q{Saskatoon, Saskatchewan, Canada}, q{2} ],
    [ q{Dipsy Doodler}, q{Who knows?}, q{5.00}, q{Chicago, Illinois, USA}, q{2} ],
    [ q{Greg Kihn Band}, q{I lost on Jeopardy}, q{11.99}, q{Topeka, Kansas, USA}, q{2} ]
  ];

As seen here, there are three main steps to importing:

  1. Create an importer object. When you instantiate the importer you must specify at least the type of import you will be performing.

    In the example, we set the type of import we're doing with the parameter 'object' passed into the new() method.

  2. Set properties for the importer object. The importer object needs to know some basic information about what you're importing. At a minimum, it needs the SPOPS object class and the data you're importing. Different types of imports may need additional information as well.

    In the example, we read the properties for the object from a filehandle using the data_from_fh() method.

  3. Run the import. Every import subclass includes the run() method, which actually performs the import. Until you execute run(), no data are written to the datasource.

No matter the import format, you will always need to execute these three steps. The first two can be combined, either explicitly as in the example above or by passing the properties directly to the new() method.
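
Here's a rough sketch of the combined form, using the same data as the example above. The property names 'object_class', 'fields' and 'data' are assumptions -- check SPOPS::Import::Object for the authoritative list:

  use SPOPS::Import;

  # Steps 1 and 2 combined: hand the importer its properties at
  # construction time instead of reading them from a filehandle.
  # (Property names are assumed; the My::Doodad class must already
  # have been set up via SPOPS::Initialize.)
  my $import = SPOPS::Import->new( 'object', {
      object_class => 'My::Doodad',
      fields       => [ qw/ name description unit_cost factory created_by / ],
      data         => [
          [ 'Amazing Melonhead', 'Spits seeds', '100.75',
            'Saskatoon, Saskatchewan, Canada', '2' ],
      ],
  });

  # Step 3: nothing touches the datasource until run() is called
  my $status = $import->run;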

For the data import types currently available, see SPOPS::Import.

Importing DBI Tables

Another goal of SPOPS importing and exporting is to make the structures the datasources use portable as well. SPOPS has support for simple, generic DBI table importing: it finds keys embedded in a CREATE TABLE SQL statement and replaces them with database-specific constructs.

One of the most common uses of this is to create a table that supports an 'auto-incrementing' field on different databases. Since databases use very different schemes for generating this value, it's abstracted into a key that's replaced on import.

For instance, take this basic table:

  CREATE TABLE user (
    user_id      %%INCREMENT%%,
    login_name   varchar(50) not null,
    email        varchar(75) not null,
    primary key( user_id )
  )

When we run the import, the %%INCREMENT%% key gets translated to a database-specific expression. In MySQL it would be 'INT NOT NULL AUTO_INCREMENT' and in PostgreSQL it would be 'SERIAL'.
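
A minimal sketch of such an import run looks like this; it reads the table above from a file and just prints the translated SQL. The filename 'user_table.sql' is made up for the example, and the methods are the same ones used in the larger script below:

  use SPOPS::Import;

  my $table_import = SPOPS::Import->new( 'table' );
  $table_import->database_type( 'mysql' );   # controls how %%INCREMENT%% is translated
  $table_import->print_only( 1 );            # print the resulting SQL instead of executing it
  $table_import->read_table_from_file( 'user_table.sql' );
  $table_import->run;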

You can also add your own translation behaviors. For instance, you can create a central datastore for datatypes in your application:

  address     varchar(100)
  email       varchar(75)
  phone       varchar(25)
  city        varchar(40)
  postal      varchar(12)

And then you'd create a custom behavior and add it to the import routine as in this example:

  #!/usr/bin/perl

  use strict;
  use SPOPS::Import;

  my %DATATYPES = initialize_datatypes();

  {
      my $table_import = SPOPS::Import->new( 'table' );
      $table_import->database_type( 'mysql' );
      $table_import->print_only( 1 );
      $table_import->transforms([ \&my_transform ]);
      $table_import->read_table_from_file( 'ie_import_table_custom_before' );
      $table_import->run;
  }

  # Replace each %%KEY%% in the SQL (passed in as a scalar reference)
  # with the datatype read from the central datatype file
  sub my_transform {
      my ( $self, $sql, $importer ) = @_;
      foreach my $datatype ( keys %DATATYPES ) {
          $$sql =~ s/%%$datatype%%/$DATATYPES{ $datatype }/g;
      }
  }

  # Read the 'name  datatype' pairs into a hash, keyed by uppercased name
  sub initialize_datatypes {
      my %h = ();
      open( my $dt, '<', 'ie_import_table_datatypes' )
          || die "Cannot import datatypes: $!";
      while ( <$dt> ) {
          chomp;
          s/^\s+//;
          s/\s+$//;
          my ( $datatype, $sql ) = split /\s+/, $_, 2;
          $h{ uc $datatype } = $sql;
      }
      close( $dt );
      return %h;
  }

With this transform in place, a table definition like:

  CREATE TABLE address (
    address_id   %%INCREMENT%%,
    company      varchar(50) null,
    address1     %%ADDRESS%% null,
    address2     %%ADDRESS%% null,
    city         %%CITY%% null,
    state        char(2) null,
    postal       %%POSTAL%% null,
    email        %%EMAIL%% null,
    main_phone   %%PHONE%% not null,
    main_fax     %%PHONE%% null,
    mobile_phone %%PHONE%% null,
    primary key ( address_id )
  )

would become:

  CREATE TABLE address (
    address_id   INT NOT NULL AUTO_INCREMENT,
    company      varchar(50) null,
    address1     varchar(100) null,
    address2     varchar(100) null,
    city         varchar(40) null,
    state        char(2) null,
    postal       varchar(12) null,
    email        varchar(75) null,
    main_phone   varchar(25) not null,
    main_fax     varchar(25) null,
    mobile_phone varchar(25) null,
    primary key ( address_id )
  )

It's not as powerful as custom datatypes -- with inheritance and dynamic schema updating capability -- but it's still very useful for enforcing data standards within and across applications. (Besides, how often do you modify the schema once something is created?)

See SPOPS::Import::DBI::Table for the currently supported keys.

EXPORTING

Exporting data and importing data are slightly different. With importing, you do everything required in the run() method. With exporting, the parent class defines the run() method but triggers callbacks in the process. Each callback returns content which gets concatenated together to form the set of exported objects.
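
A basic export run looks something like this. This is only a sketch -- the 'perl' export type and the 'object_class' property are assumptions here; see SPOPS::Export for the registered types and their properties:

  use SPOPS::Export;
  use SPOPS::Initialize;

  SPOPS::Initialize->process({ filename => 'spops.conf' });

  # Ask for an exporter of a particular type and point it at an SPOPS class;
  # run() fires the callbacks described below and returns the concatenated content
  my $exporter = SPOPS::Export->new( 'perl', { object_class => 'My::Doodad' } );
  print $exporter->run;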

These callbacks are:

create_header

Only called once before any records have been processed.

create_record

Called for each record.

create_footer

Only called once after all the records have been processed.

The export implementation is free to use these however it sees fit. For instance, SPOPS::Export::Perl just keeps a copy of every object it sees and then dumps them all with a single call to Data::Dumper in the create_footer callback. (Due to memory issues, you'd probably want to modify this if you were exporting hundreds of thousands of records. But it's just an example.)
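
To get a feel for the callbacks, here is a sketch of a custom exporter that emits one comma-separated line per object. The callback signatures shown are assumptions -- consult SPOPS::Export before relying on them:

  package My::Export::CSV;

  use strict;
  use base qw( SPOPS::Export );

  # Called once before any records: emit a header row of field names
  # (assumes the callback receives an arrayref of field names)
  sub create_header {
      my ( $self, $fields ) = @_;
      return join( ',', @{ $fields } ) . "\n";
  }

  # Called for each object: emit its field values on one line
  sub create_record {
      my ( $self, $object, $fields ) = @_;
      return join( ',', map { $object->{ $_ } } @{ $fields } ) . "\n";
  }

  # Called once after all records: nothing to add for CSV
  sub create_footer { return '' }

  1;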

COPYRIGHT

Copyright (c) 2001-2004 Chris Winters. All rights reserved.

See SPOPS::Manual for license.

AUTHORS

Chris Winters <chris@cwinters.com>