NAME

DBIx::DWIW - Robust and simple DBI wrapper to Do What I Want (DWIW)

SYNOPSIS

When used directly:

  use DBIx::DWIW;

  my $db = DBIx::DWIW->Connect(DB   => $database,
                               User => $user,
                               Pass => $password,
                               Host => $host);

  my @records = $db->Array("select * from foo");

When sub-classed for full functionality:

  use MyDBI;  # class inherits from DBIx::DWIW

  my $db = MyDBI->Connect('somedb') or die;

  my @records = $db->Hashes("SELECT * FROM foo ORDER BY bar");

DESCRIPTION

NOTE: This module is currently specific to MySQL, but needn't be. We just haven't had a need to talk to any other database server.

DBIx::DWIW was developed (over the course of roughly 1.5 years) in Yahoo! Finance (http://finance.yahoo.com/) to suit our needs. Parts of the API may not make sense and the documentation may be lacking in some areas. We've been using it for so long (in one form or another) that these may not be readily obvious to us, so feel free to point that out. There's a reason the version number is currently < 1.0.

This module was recently extracted from Yahoo-specific code, so things may be a little strange yet while we smooth out any bumps and blemishes left over from that.

DBIx::DWIW is intended to be sub-classed. Doing so gives you all the benefits it can provide and the ability to easily customize some of its features. You can, of course, use it directly if it meets your needs as-is. But you'll be accepting its default behavior in some cases where it may not be wise to do so.

The DBIx::DWIW distribution comes with a sample sub-class in the file examples/MyDBI.pm which illustrates some of what you might want to do in your own class(es).

This module provides three main benefits:

Centralized Configuration

Rather than store the various connection parameters (username, password, hostname, port number, database name) in each and every script or application which needs them, you can easily put them in one place--or even generate them on the fly by writing a bit of custom code.

If this is all you need, consider looking at Brian Aker's fine DBIx::Password module on the CPAN. It may be sufficient.

API Simplicity

Taking a lesson from Python (gasp!), this module promotes one obvious way to do most things. If you want to run a query and get the results back as a list of hashrefs, there's one way to do that. The API may sacrifice speed in some cases, but new users can easily learn the simple and descriptive method calls. (Nobody is forcing you to use it.)

Fault Tolerance

Databases sometimes go down. Networks flake out. Bad stuff happens. Rather than have your application die, DBIx::DWIW provides a way to handle outages. You can build custom wait/retry/fail logic which does anything you might want (such as ringing your pager or sending e-mail).

Transaction Handling

As of version 0.25, three transaction-related methods were added to DWIW. These methods were designed to make transaction programming easier in a couple of ways.

Consider a code snippet like this:

  sub do_stuff_with_thing
  {
      $db->Begin();
      $db->Execute("some sql here");
      $db->Execute("another query here");
      $db->Commit();
  }

That's all well and good. You have a function that you can call and it will perform 2 discrete actions as part of a transaction. However, what if you need to call that in the context of a larger transaction from time to time? What you'd like to do is this:

  $db->Begin();
  for my $thing (@thing_list)
  {
      do_stuff_with_thing($thing);
  }
  $db->Commit();

and have it all wrapped up in one nice juicy transaction.

With DBIx::DWIW, you can. That is, in fact, the default behavior. You can call Begin() as many times as you want, but it'll only ever let you start a single transaction until you call the corresponding commit. It does this by tracking the number of times you call Begin() and Commit(). A counter is incremented each time you call Begin() and decremented each time you call Commit(). When the count reaches zero, the original transaction is actually committed.
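The counting behavior described above can be pictured with a small stand-alone model. This is only a sketch of the idea, not DBIx::DWIW's actual implementation: the ToyTxn package and its depth and log fields are invented for illustration, return values are not modelled, and no database is involved.

```perl
use strict;
use warnings;

package ToyTxn;   # hypothetical class, for illustration only

sub new {
    my ($class) = @_;
    return bless { depth => 0, log => [] }, $class;
}

# Only the outermost Begin() would really start a transaction.
sub Begin {
    my ($self) = @_;
    push @{ $self->{log} }, 'BEGIN' if $self->{depth}++ == 0;
    return;
}

# Only the Commit() that brings the counter back to zero really commits.
sub Commit {
    my ($self) = @_;
    return if $self->{depth} == 0;        # nothing open, nothing to do
    push @{ $self->{log} }, 'COMMIT' if --$self->{depth} == 0;
    return;
}

package main;

my $txn = ToyTxn->new;

sub do_stuff_with_thing {
    $txn->Begin();
    # ... queries would go here ...
    $txn->Commit();
}

$txn->Begin();
do_stuff_with_thing($_) for 1 .. 3;
$txn->Commit();

print join(' ', @{ $txn->{log} }), "\n";   # prints: BEGIN COMMIT
```

Three nested Begin()/Commit() pairs inside the outer pair still produce a single BEGIN and a single COMMIT in the log.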

Of course, there are problems with that method, so DBIx::DWIW provides an alternative. You can use named transactions. Using named transactions instead, the code above would look like this:

  sub do_stuff_with_thing
  {
      $db->Begin('do_stuff transaction');
      $db->Execute("some sql here");
      $db->Execute("another query here");
      $db->Commit('do_stuff transaction');
  }

and:

  $db->Begin('Big Transaction');
  for my $thing (@thing_list)
  {
      do_stuff_with_thing($thing);
  }
  $db->Commit('Big Transaction');

In that way, you can avoid problems that might be caused by not calling Begin() and Commit() the same number of times. Once a named transaction is begun, the module simply ignores any Begin() or Commit() calls that don't have a name or whose name doesn't match that assigned to the currently open transaction.

The only exception to this rule is Rollback(). Because a transaction rollback usually signifies a big problem, calling Rollback() always ends the currently running transaction.

Return values for these functions are a bit different, too. Begin() and Commit() can return undef, 0, or 1. undef means there was an error. 0 means that nothing was done (but there was no error either), and 1 means that work was done.

The methods are:

Begin

Start a new transaction if one is not already running.

Commit

Commit the current transaction, if one is running.

Rollback

Rollback the current transaction, if one is running.

See the detailed method descriptions below for all the gory details.

Note that Begin(), Commit(), and Rollback() are not protected by DBIx::DWIW's normal wait/retry logic if a network connection fails. This is because I'm not sure that it makes sense. If your connection drops and the other end notices, it'll probably roll back for you anyway.

DBIx::DWIW CLASS METHODS

The following methods are available from DBIx::DWIW objects. Any function or method not documented should be considered private. If you call it, your code may break someday and it will be your fault.

The methods follow the Perl tradition of returning false values when an error occurs (and usually setting $@ with a descriptive error message).

Any method which takes an SQL query string can also be passed bind values for any placeholders in the query string:

  $db->Hashes("SELECT * FROM foo WHERE id = ?", $id);

Any method which takes an SQL query string can also be passed a prepared DWIW statement handle:

  $db->Hashes($sth, $id);

Connect()

The Connect() constructor creates and returns a database connection object through which all database actions are conducted. On error, it calls die(), so you may want to eval {...} the call. The NoAbort option (described below) controls that behavior.

Connect() accepts "hash-style" key/value pairs as arguments. The arguments which it recognizes are:

Host

The name of the host to connect to. Use undef to force a socket connection on the local machine.

User

The database user to authenticate as.

Pass

The password to authenticate with.

DB

The name of the database to use.

Socket

NOT IMPLEMENTED.

The path to the Unix socket to use.

Port

The port number to connect to.

Proxy

Set to true to connect to a DBI::ProxyServer proxy. You'll also need to set ProxyHost and ProxyPort. You may also want to set ProxyKey and ProxyCipher.

ProxyHost

The hostname of the proxy server.

ProxyPort

The port number on which the proxy is listening. This is probably different than the port number on which the database server is listening.

ProxyKey

If the proxy server you're using requires encryption, supply the encryption key (as a hex string).

ProxyCipher

If the proxy server requires encryption, supply the name of the package which provides encryption. Typically this is something like Crypt::DES or Crypt::Blowfish.

Unique

A boolean which controls connection reuse.

If false (the default), multiple Connects with the same connection parameters (User, Pass, DB, Host) return the same open connection. If Unique is true, it returns a connection distinct from all other connections.

If you have a process with an active connection that fork()s, be aware that you CANNOT share the connection between the parent and child. Well, you can if you're REALLY CAREFUL and know what you're doing. But don't do it.

Instead, acquire a new connection in the child. Be sure to set this flag when you do, or you'll end up with the same connection and spend a lot of time pulling your hair out over why the code does mysterious things.

As of version 0.27, DWIW also checks the class name of the caller and guarantees unique connections across different classes. So if you call Connect() from SubClass1 and SubClass2, each class gets its own connection.

Verbose

Turns verbose reporting on. See Verbose().

Quiet

Turns off warning messages. See Quiet().

NoRetry

If true, Connect() fails immediately if it can't connect to the database. Normally, it retries based on calls to RetryWait(). NoRetry affects only Connect(), and has no effect on the fault-tolerance of the package once connected.

NoAbort

If there is an error in the arguments, or in the end the database can't be connected to, Connect() normally prints an error message and dies. If NoAbort is true, it puts the error string into $@ and returns false.

Timeout

The amount of time (in seconds) after which Connect() should give up and return. You may use fractional seconds. A Timeout of zero is the same as not having one at all.

If you set the timeout, you probably also want to set NoRetry to a true value. Otherwise you'll be surprised when a server is down and your retry logic is running.

QueryTimeout

The amount of time (in seconds) after which query operations should give up and return. You may use fractional seconds. A QueryTimeout of zero is the same as not having one at all.

There are a minimum of four components to any database connection: DB, User, Pass, and Host. If any are not provided, there may be defaults that kick in. A local configuration package, such as the MyDBI example class that comes with DBIx::DWIW, may provide appropriate default connection values for several databases. In such a case, a client may be able to simply use:

    my $db = MyDBI->Connect(DB => 'Finances');

to connect to the Finances database.

As a convenience, you can just give the database name:

    my $db = MyDBI->Connect('Finances');

See the local configuration package appropriate to your installation for more information about what is and isn't preconfigured.
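As a sketch of how the NoAbort, NoRetry, and Timeout options fit together, the helper below wraps Connect() so that a failed connection is reported instead of killing the program. The helper name connect_or_warn and the five-second timeout are made up for this example; only the option names come from the list above.

```perl
use strict;
use warnings;

# Hypothetical helper: $class is DBIx::DWIW or a sub-class such as MyDBI.
sub connect_or_warn {
    my ($class, %args) = @_;

    my $db = $class->Connect(
        Timeout => 5,    # give up after five seconds (arbitrary choice)
        %args,           # caller-supplied values override the timeout
        NoRetry => 1,    # don't loop in RetryWait() while the server is down
        NoAbort => 1,    # return false and set $@ instead of dying
    );

    warn "database connection failed: $@\n" unless $db;
    return $db;
}
```

A caller might then write something like connect_or_warn('MyDBI', DB => 'Finances') and decide for itself what to do with a false return value.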

Dump()

Dump the internal configuration to stdout. This is mainly useful for debugging DBIx::DWIW. You probably don't need to call it unless you know what you're doing. :-)

Timeout()

Like the QueryTimeout argument to Connect(), sets (or resets) the amount of time (in seconds) after which queries should give up and return. You may use fractional seconds. A timeout of zero is the same as not having one at all.

Timeout() called with any (or no) arguments returns the current query timeout value.

Disconnect()

Closes the connection. Upon program exit, this is called automatically on all open connections. Returns true if the open connection was closed, false if there was no connection or there was some other error (with the error being returned in $@).

Quote(@values)

Calls the DBI quote() function on each value, returning a list of properly quoted values. As per quote(), NULL is returned for items that are not defined.

InList($field => @values)

Given a field and a value or values, returns SQL appropriate for a WHERE clause in the form

    field = 'value'

or

    field IN ('value1', 'value2', ...)

depending on the number of values. Each value is passed through Quote while building the SQL.

If no values are provided, nothing is returned.

This function is useful because MySQL apparently does not optimize

    field IN ('val')

as well as it optimizes

    field = 'val'

InListUnquoted($field => @values)

Just like InList, but the values are not passed through Quote.
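To make the output shape of InList() concrete, here is a stand-alone imitation of the behavior described above. It is not the module's code: toy_quote and toy_in_list are invented names, and toy_quote is only a rough stand-in for DBI's quote() (the real driver decides how values are escaped).

```perl
use strict;
use warnings;

# Stand-in for Quote(): undef becomes NULL, everything else is wrapped in
# single quotes with embedded single quotes doubled.
sub toy_quote {
    my ($value) = @_;
    return 'NULL' unless defined $value;
    (my $copy = $value) =~ s/'/''/g;
    return "'$copy'";
}

# Imitates InList(): "=" for one value, "IN (...)" for several,
# and nothing at all when no values are given.
sub toy_in_list {
    my ($field, @values) = @_;
    return unless @values;
    my @quoted = map { toy_quote($_) } @values;
    return "$field = $quoted[0]" if @quoted == 1;
    return "$field IN (" . join(', ', @quoted) . ")";
}

print toy_in_list('id', 7), "\n";                     # id = '7'
print toy_in_list('name', 'Tyler', "O'Brien"), "\n";  # name IN ('Tyler', 'O''Brien')
```

An InListUnquoted() imitation would be the same function with the map over toy_quote removed.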

ExecuteReturnCode()

Returns the return code from the most recently Execute()d query. This is what Execute() returns, so there's little reason to call it directly. But it didn't use to be that way, so old code may be relying on this.

Execute($sql)

Executes the given SQL, returning true if successful, false if not (with the error in $@).

Do() is a synonym for Execute().

Prepare($sql)

Prepares the given sql statement, but does not execute it (just like DBI). Instead, it returns a statement handle $sth that you can later execute by calling its Execute() method:

  my $sth = $db->Prepare("INSERT INTO foo VALUES (?, ?)");

  $sth->Execute($a, $b);

The statement handle returned is not a native DBI statement handle. It's a DBIx::DWIW::Statement handle.

When called from Execute(), Scalar(), Hashes(), etc. AND there are values to substitute, the statement handle is cached. This benefits the typical case where ?-substitutions are done lazily in an Execute() call inside a loop. Meanwhile, interpolated SQL queries, non-? queries, and manually Prepare'd statements are unaffected. These typically do not benefit from caching the prepare.

Note: prepare-caching is of no benefit until MySQL 4.1.
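A prepared statement is most useful when the same SQL runs many times. The sketch below prepares once and executes per row. The helper name insert_pairs is hypothetical, and the code assumes that Prepare() and the statement's Execute() follow the convention stated earlier of returning false and setting $@ on error.

```perl
use strict;
use warnings;

# Hypothetical helper: insert many rows through one prepared statement.
# $db is a connected DBIx::DWIW (or sub-class) object; each row is an
# array ref holding the two values for the placeholders.
sub insert_pairs {
    my ($db, @rows) = @_;

    my $sth = $db->Prepare("INSERT INTO foo VALUES (?, ?)")
        or die "prepare failed: $@\n";

    my $inserted = 0;
    for my $row (@rows) {
        if ($sth->Execute(@$row)) {
            $inserted++;
        }
        else {
            warn "insert failed: $@\n";
        }
    }
    return $inserted;   # number of rows that executed successfully
}
```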

RecentSth()

Returns the DBI statement handle ($sth) of the most-recently successfully executed statement.

RecentPreparedSth()

Returns the DBI statement handle ($sth) of the most-recently prepared DBI statement handle (which may or may not have already been executed).

InsertedId()

Returns the mysql_insertid associated with the most recently executed statement. Returns nothing if there is none.

Synonyms: InsertID(), LastInsertID(), and LastInsertId()

RowsAffected()

Returns the number of rows affected for the most recently executed statement. This is valid only if it was for a non-SELECT. (For SELECTs, count the return values). As per the DBI, -1 is returned if there was an error.

RecentSql()

Returns the SQL of the most recently executed statement.

PreparedSql()

Returns the SQL of the most recently prepared statement. (Useful for showing SQL that doesn't parse.)

Hash($sql)

A generic query routine. Pass an SQL statement that returns a single record, and it returns a hashref with all the key/value pairs of the record.

The example at the bottom of page 50 of DuBois's MySQL book would return a value similar to:

  my $hashref = {
     last_name  => 'McKinley',
     first_name => 'William',
  };

On error, $@ has the error text, and false is returned. If the query doesn't return a record, false is returned, but $@ is also false.

Use this routine only if the query will return a single record. Use Hashes() for queries that might return multiple records.

Because calling Hashes() on a larger recordset can use a lot of memory, you may wish to call Hash() once with a valid query and call it repeatedly with no SQL to retrieve records one at a time. It'll take more CPU to do this, but it is more memory efficient:

  my $record = $db->Hash("SELECT * FROM big_table");
  do {
      # ... do something with $record
  }  while (defined($record = $db->Hash()));

Note that a call to any other DWIW query resets the iterator, so only do so when you are finished with the current query.
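The do/while form above runs its body once even when the query matches nothing. A while-loop variant avoids that. The helper below is only a sketch: each_record is a made-up name, and it relies on the behavior described above that a false return with $@ set means an error while a false return with $@ empty means no more records. The callback must not run other queries on the same object, for the reason just given.

```perl
use strict;
use warnings;

# Hypothetical helper: run $sql and hand each record to $callback,
# fetching one record at a time via Hash().
sub each_record {
    my ($db, $sql, $callback) = @_;

    my $count  = 0;
    my $record = $db->Hash($sql);      # first call carries the SQL
    while ($record) {
        $callback->($record);
        $count++;
        $record = $db->Hash();         # later calls continue the same query
    }
    die "query failed: $@\n" if $@;    # false result with $@ set means error
    return $count;
}
```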

This seems like it breaks the principle of having only one obvious way to do things with this package. But it's really not all that obvious, now is it? :-)

Hashes($sql)

A generic query routine. Given an SQL statement, returns a list of hashrefs, one per returned record, containing the key/value pairs of each record.

The example in the middle of page 50 of DuBois's MySQL would return a value similar to:

 my @hashrefs = (
  { last_name => 'Tyler',    first_name => 'John',    birth => '1790-03-29' },
  { last_name => 'Buchanan', first_name => 'James',   birth => '1791-04-23' },
  { last_name => 'Polk',     first_name => 'James K', birth => '1795-11-02' },
  { last_name => 'Fillmore', first_name => 'Millard', birth => '1800-01-07' },
  { last_name => 'Pierce',   first_name => 'Franklin',birth => '1804-11-23' },
 );

On error, $@ has the error text, and false is returned. If the query doesn't return a record, false is returned, but $@ is also false.

Array($sql)

Similar to Hash(), but returns a list of values from the matched record. On error, the empty list is returned and the error can be found in $@. If the query matches no records, an empty list is returned but $@ is false.

The example at the bottom of page 50 of DuBois's MySQL would return a value similar to:

  my @array = ( 'McKinley', 'William' );

Use this routine only if the query will return a single record. Use Arrays() or FlatArray() for queries that might return multiple records.

Arrays($sql)

A generic query routine. Given an SQL statement, returns a list of array refs, one per returned record, containing the values of each record.

The example in the middle of page 50 of DuBois's MySQL would return a value similar to:

 my @arrayrefs = (
  [ 'Tyler',     'John',     '1790-03-29' ],
  [ 'Buchanan',  'James',    '1791-04-23' ],
  [ 'Polk',      'James K',  '1795-11-02' ],
  [ 'Fillmore',  'Millard',  '1800-01-07' ],
  [ 'Pierce',    'Franklin', '1804-11-23' ],
 );

On error, $@ has the error text, and false is returned. If the query doesn't return a record, false is returned, but $@ is also false.

FlatArray($sql)

A generic query routine. Pass an SQL string, and all matching fields of all matching records are returned in one big list.

If the query matches a single record, FlatArray() ends up being the same as Array(). But if there are multiple records matched, the return list will contain a set of fields from each record.

The example in the middle of page 50 of DuBois's MySQL would return a value similar to:

     my @items = (
         'Tyler', 'John', '1790-03-29', 'Buchanan', 'James', '1791-04-23',
         'Polk', 'James K', '1795-11-02', 'Fillmore', 'Millard',
         '1800-01-07', 'Pierce', 'Franklin', '1804-11-23'
     );

FlatArray() tends to be most useful when the query returns one column per record, as with

    my @names = $db->FlatArray('select distinct name from mydb');

or two columns with a key/value relationship:

    my %IdToName = $db->FlatArray('select id, name from mydb');

But you never know.

FlatArrayRef($sql)

Works just like FlatArray() but returns a ref to the array instead of copying it. This is a big win if you have very large arrays.

Scalar($sql)

A generic query routine. Pass an SQL string, and a scalar is returned.

If the query matches a single row/column pair, this is what you want. Scalar() is useful for computational queries, count(*), max(xxx), etc.

  my $max = $dbh->Scalar('select max(id) from personnel');

If the result set contains more than one value, the first value is returned and a warning is issued.

CSV($sql)

A generic query routine. Pass an SQL string, and a CSV scalar is returned.

  my $csv = $dbh->CSV('select * from personnel');

The example in the middle of page 50 of DuBois's MySQL would return a value similar to:

     my $item = <<END_OF_CSV;
     "Tyler","John","1790-03-29"
     "Buchanan","James","1791-04-23"
     "Polk","James K","1795-11-02"
     "Fillmore","Millard","1800-01-07"
     "Pierce","Franklin","1804-11-23"
     END_OF_CSV

Verbose([boolean])

Returns the value of the verbose flag associated with the connection. If a value is provided, it is taken as the new value to install. Verbose is OFF by default. If you pass a true value, you'll get some verbose output each time a query executes.

Returns the current value.

Quiet()

When errors occur, a message will be sent to STDOUT unless Quiet is true (it is false by default). Pass a true value to suppress these messages.

Returns the current value.

Safe()

Enable or disable "safe" mode (on by default). In "safe" mode, you must prefix a native DBI method call with "dbi_" in order to call it. If safe mode is off, you can call native DBI methods using their real names.

For example, in safe mode, you'd write something like this:

  $db->dbi_commit;

but in unsafe mode you could use:

  $db->commit;

The rationale behind having a safe mode is that you probably don't want to mix DBIx::DWIW and DBI method calls on an object unless you know what you're doing. You need to opt in.

Safe() returns the current value.

dbh()

Returns the real DBI database handle for the connection.

RetryWait($error)

This method is called each time there is an error (usually caused by a network outage or a server going down) which a sub-class may want to examine and decide how to continue.

If RetryWait() returns 1, the operation which was being attempted when the failure occurred is retried. If RetryWait() returns 0, the action fails.

The default implementation causes your application to make up to three immediate reconnect attempts, and if all fail, emit a message to STDERR (via a warn() call) and then sleep for 30 seconds. After 30 seconds, the warning and sleep repeat until successful.

You probably want to override this method so that it will eventually give up. Otherwise your application may hang forever. The default method does maintain a count of how many times the retry has been attempted in $self->{RetryCount}.

Note that RetryWait() is not called in the middle of a transaction. In that case, we assume that the transaction will have been rolled back by the server and you'll get an error.
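Here is one way an override that eventually gives up might look. Treat it as a sketch under stated assumptions: the class name MyRetryDBI, the two package variables, and the MyTries hash key are all invented, and the counter is only reset when the limit is hit (a real sub-class would probably also want to reset it after a success).

```perl
use strict;
use warnings;

package MyRetryDBI;            # hypothetical sub-class name

# In a real module, load the parent first with "use DBIx::DWIW;".
our @ISA = ('DBIx::DWIW');

our $MaxTries  = 5;            # hypothetical limit on retries
our $DelaySecs = 10;           # hypothetical pause between retries

sub RetryWait {
    my ($self, $error) = @_;

    # Private counter; the key name is made up for this sketch.
    my $tries = ++$self->{MyTries};

    if ($tries > $MaxTries) {
        warn "giving up after $MaxTries retries: $error\n";
        $self->{MyTries} = 0;
        return 0;              # tell DBIx::DWIW to fail the operation
    }

    warn "database error ($error), retry $tries of $MaxTries\n";
    sleep $DelaySecs if $DelaySecs > 0;
    return 1;                  # tell DBIx::DWIW to try again
}

1;
```

With the values shown, an outage is retried five times, roughly ten seconds apart, and the sixth call reports failure.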

Local Configuration

There are two ways to configure DBIx::DWIW for your local databases. The simplest (but least flexible) way is to create a package like:

    package MyDBI;
    use strict;
    use DBIx::DWIW;               # load the parent class
    our @ISA = ('DBIx::DWIW');

    sub DefaultDB   { "MyDatabase"         }
    sub DefaultUser { "defaultuser"        }
    sub DefaultPass { "paSSw0rd"           }
    sub DefaultHost { "mysql.somehost.com" }
    sub DefaultPort { 3306                 }

    1;

These routines override those in DBIx::DWIW, and explicitly provide exactly what's needed to contact the given database.

The user can then use

    use MyDBI;
    my $db = MyDBI->Connect();

and not have to worry about the details.

A more flexible approach appropriate for multiple-database or multiple-user installations is to create a more complex package, such as the MyDBI.pm which was included in the examples sub-directory of the DBIx::DWIW distribution.

In that setup, you have quite a bit of control over what connection parameters are used. And, since it's Just Perl Code, you can do anything you need in there.

The following methods are provided to support this in sub-classes:

LocalConfig($name)

Passed a configuration name, LocalConfig() should return a list of connection parameters suitable for passing to Connect().

By default, LocalConfig() simply returns an empty list.

DefaultDB($config_name)

Returns the default database name for the given configuration. Calls LocalConfig() to get it.

DefaultUser($config_name)

Returns the default username for the given configuration. Calls LocalConfig() to get it.

DefaultPass($config_name)

Returns the default password for the given configuration. Calls LocalConfig() to get it.

DefaultHost($config_name)

Returns the default hostname for the given configuration. Calls LocalConfig() to get it.

DefaultPort($config_name)

Returns the default port number for the given configuration. Calls LocalConfig() to get it.

Transaction Methods

Begin([name])

Begin a new transaction, optionally naming it.

Commit([name])

Commit the current transaction (or named transaction).

Rollback()

Rollback the current transaction.
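Putting the three methods together, a wrapper along the following lines keeps the Begin/Commit/Rollback bookkeeping in one place. It is a sketch, not part of the module: in_transaction is a hypothetical name, the callback is expected to die when one of its queries fails, and the error messages assume that Begin() and Commit() set $@ when they return undef.

```perl
use strict;
use warnings;

# Hypothetical helper: run $work inside a named transaction.
sub in_transaction {
    my ($db, $name, $work) = @_;

    # undef means error; 0 (nothing done) and 1 (work done) are both fine.
    defined $db->Begin($name)
        or die "could not begin '$name': $@\n";

    my $ok = eval { $work->($db); 1 };
    unless ($ok) {
        my $error = $@ || 'unknown error';
        $db->Rollback();          # always ends the running transaction
        die "transaction '$name' rolled back: $error";
    }

    defined $db->Commit($name)
        or die "could not commit '$name': $@\n";
    return 1;
}
```

Because Begin() and Commit() with a non-matching name are ignored while a named transaction is open, calling this helper from inside another named transaction leaves the outer one in charge.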

The DBIx::DWIW::Statement CLASS

Calling Prepare() on a database handle returns a DBIx::DWIW::Statement object which acts like a limited DBI statement handle.

Methods

The following methods can be called on a statement object.

Execute([@values])

Executes the statement. If values are provided, they'll be substituted for the appropriate placeholders in the SQL.

AUTHORS

DBIx::DWIW evolved out of some Perl modules that we developed and used in Yahoo! Finance (http://finance.yahoo.com). The following people contributed to its development:

  Jeffrey Friedl (jfriedl@yahoo.com)
  rayg (rayg@bitbaron.com)
  John Hagelgans
  Jeremy Zawodny (Jeremy@Zawodny.com)

CREDITS

The following folks have provided feedback, patches, and other help along the way:

  Eric E. Bowles (bowles@ambisys.com)
  David Yan (davidyan@yahoo-inc.com)
  DH (crazyinsomniac@yahoo.com)
  Toby Elliott (telliott@yahoo-inc.com)
  Keith C. Ivey (keith@smokefreedc.org)
  Brian Webb (brianw@yahoo-inc.com)
  Steve Friedl (steve@unixwiz.net)

Please direct comments, questions, etc to Jeremy for the time being. Thanks.

COPYRIGHT

DBIx::DWIW is Copyright (c) 2001, Yahoo! Inc. All rights reserved.

You may distribute under the same terms of the Artistic License, as specified in the Perl README file.

SEE ALSO

DBI, perl

Jeremy's presentation at the 2001 Open Source Database Summit, which introduced DBIx::DWIW, is available from:

  http://jeremy.zawodny.com/mysql/
