The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.

NAME

NoSQL::PL2SQL::Simple - Implementation of NoSQL::PL2SQL

SYNOPSIS

  BEGIN {
        package MyArbitraryClass ;
        use base qw( NoSQL::PL2SQL::Simple ) ;
        }

  use CGI ;

  my $collection = new MyArbitraryClass ;

  ## Writing to the database
  $collection->record( CGI->new->Vars ) ;       ## Save user input
  $collection->save( CGI->new->Vars ) ; ## save() is an alias

  ## Accessing the database
  @allemails = values %{ { $collection->contactemail } } ;

  $user = $collection->contactemail('jim@tqis.com')->record ;
  @neighbors = $collection->contactaddress('Ann Arbor')->records ;
  @classmates = $collection->query(
                contactcity => 'Ann Arbor',
                graduationyear => '1989',
                )->records ;

DESCRIPTION

NoSQL::PL2SQL is a low level abstract interface designed to be swapped into existing applications. It's powerful, flexible, and not quite ready to roll out of the box.

NoSQL::PL2SQL::Simple is an intermediate solution more appropriate for building out new applications. It retains many of PL2SQL's features, particularly universal data schemas that can accomodate heterogeneous, indeterminate data elements. Additionally, NoSQL::PL2SQL::Simple allows data definition capabilities and provides a complete solution for data access.

NoSQL::PL2SQL::Simple's API consists of the following functions:

Constructor

new() generally takes no arguments and returns an instance of the data definition class.

Classes that implement NoSQL::PL2SQL::Simple are usually referred to as data definition classes. Except this document recommends creating additional abstract subclasses called data source classes. Naturally, the properties of the instance returned by new() reflect the data definition.

Perl OO, in its annoying flexibility, allows class overloading. So as a convenient mapping of definition and data, objects stored in the database have the same class name. An instance that contains a data definition is distinguished from a data containing object.

It's appropriate for implementations to create methods that will only be applied to objects. If so, they should probably override the new() constructor (shown in the example under Developer's Notes) to minimize potential confusion.

Data Access

Most of NoSQL::PL2SQL::Simple's methods are heavily overloaded. This approach makes the API easier to remember and minimizes namespace collisions.

record(), for example, has five variants:

    If its argument is an unblessed object, record() is nearly identical to NoSQL::PL2SQL::SQLObject() in that it simultanously blesses the object and writes it to the database. Unlike NoSQL::PL2SQL::SQLObject(), record() returns a cloned object that must be explicitly re-saved.

    If its argument is a blessed object, record() saves the object into the database.

    If its argument is an integer, record() returns an object that was previously written into the database.

    If its argument is a blessed array, record() is an alias for records(), which returns a set of objects from the database.

    If there are two arguments, the first a blessed object and the second unblessed, record() replaces the first object with the second.

NoSQL::PL2SQL::Simple can be used to maintain the profile records of an access control list:

  ## A new user registers by submitting a website form.  $userid contains
  ## an automatic assignment, which should be returned as part of the
  ## confirmation.

  $userid = $collection->record( CGI->new->Vars )->SQLObjectID ;

  ## If the user logs back in using the $userid, his/her record is
  ## recovered as follows:

  $userrecord = $collection->record( $userid ) ;

  ## Suppose the user changes his/her password.  The change is saved as
  ## follows:

  $userrecord->{password} = $encrypted ;
  $collection->record( $userrecord ) ;

  ## If the website contains a profile editting form, then replace the 
  ## entire record with the form data:

  $collection->record( $userrecord, CGI->new->Vars ) ;

recordID() is a used to convert query results to record id's (otherwise known as object id's).

SQLObjectID() returns the object id automatically generated by NoSQL::PL2SQL.

keyValues() establishes many-to-one and many-to-many relationships among objects, and has three variants:

    To add a relationships, pass one or more values as arguments to the keyValues() method.

    To list relationships, pass no arguments to the keyValues() method.

    To clear all relationships, chain the c<keyValues()> method as follows: $o->keyValues()->clear().

See the MANY-TO-ONE RELATIONSHIPS section below.

save() is an alias for record(). Each of the pair of statements below are equivalent:

  $collection->record( CGI->new->Vars ) ;
  $collection->save( CGI->new->Vars ) ;

  $collection->record( $userrecord ) ;
  $userrecord->save() ;

  $collection->record( $userrecord, CGI->new->Vars ) ;
  $userrecord->save( CGI->new->Vars ) ;

Data Queries

query() is used to query the database. Its arguments are name-value-pairs (NVP's). Multiple NVP's are combined using SQL AND logic. query() normally returns an array of record id's. In scalar context, its output can be chained to other methods as illustrated:

  ## In the above example, the user recovers his/her userid by querying
  ## an email address:

  my $cgivars = CGI->new->Vars

  my @userids = $collection->query( 
                        contactemail => $cgivars->{contactemail},
                        ) ;
  error( "Unknown email address" ) if @userids == 0 ;
  error( "Duplicate email address" ) if @userids > 1 ;
  my $userid = $userids[0] ;

  ## Chain recordID to return a scalar instead of an array
  $userid = $collection->query( 
                        contactemail => $cgivars->{contactemail} 
                        )->recordID ;

  ## Allow the user to login using an email address
  @userids = $collection->query( 
                        contactemail => $cgivars->{contactemail},
                        password => $encrypted,
                        ) ;
  error( "Login failed" ) unless @userids == 1 ;
  my $user = $collection->record( $userids[0] ) ;
  my $greeting = sprintf( "Hello %s", $user->{contactnamefirst} ) ;

NoSQL::PL2SQL::Simple uses Perl's magical AUTLOAD() to make simple queries more concise. The premise of object oriented programming is that complexity is encapsulated inside a black box; and simplicity is exposed at the interface. The premise of NoSQL::PL2SQL::Simple is that the code below is easy to read and maintain:

  ## The output of query can be chained to return a set of objects
  my @users = $collection->query( 
                contactemail => $cgivars->{contactemail},
                )->records ;

  ## This statement can be rewritten using an AUTOLOAD alias
  @users = $collection->contactemail( $cgivars->{contactemail} )->records ;
  error( "Unknown email address" ) if @users == 0 ;
  error( "Duplicate email address" ) if @users > 1 ;
  my $response = sprintf "Hello %s,\nYour userid is: %d",
                $users[0]->{contactnamefirst}, $users[0]->SQLObjectID ;

  ## Without error checking, this code can be further simplified
  ## In this case, record() aliases records() for readability.
  $user = $collection->contactemail( $cgivars->{contactemail} )->record ;
  $response = sprintf "Hello %s,\nYour userid is: %d",
                $user->{contactnamefirst}, $user->SQLObjectID ;

In this example, the contactemail() method is not explicitly defined. The magical AUTOLOAD() essentially calls query() using the method name as the first argument.

AUTOLOAD() aliases and query() are slightly different. The AUTOLOAD() methods are overloaded to take zero or one argument(s). query() accepts a hash, or even numbered array, as arguments.

If called without arguments, the AUTOLOAD() methods return a hash of NVP's that correspond to the matching objects, keyed on object id. Each NVP value represents the object element's value.

  ## Allow users to log in using their name, selected from a combo box
  ## 'username' must be part of the data definition
  ## Note:  From a security standpoint, this is a poor practice

  %users = $collection->username ;
  
  ## Build a combo box using the %users NVP
  ## <select name="userid">
  ##  <option value="$userskey">$usersvalue</option>
  ## </select>
  

If called without arguments, query() returns all the stored objects matching the data definition class. This invocation is used with the reindex() method as shown below.

Class Definition

A set of NoSQL::PL2SQL::Simple's methods are specifically used as part of the class definition, explained below.

    addTextIndex()

    addNumberIndex()

    addDateIndex()

reindex() updates the database to reflect data definition changes. It requires a data element argument. See the INDEXES section below.

DEFINING CLASSES

Object oriented programming techniques improve code reuse and manage complexity. For example, an object's behavior is determined by its class definition and its individual properties. OO designers usually consider tradeoffs between class attributes and object properties. Subclassing is a technique of assigning attributes to a group of objects so that identical properties don't need to be repetitively defined at instantiation. Instead, properties are defined as attributes of a subclass instantiated by objects who share those properties.

NoSQL::PL2SQL::Simple uses this approach to separate data definitions from its internal complexity. Subclassing also ensures that every object is associated with a data definition, which can be accessed as a class instantiation.

Data Definition Subclass

NoSQL::PL2SQL::Simple is simple even for users who have never defined a class in Perl. Only the following code is required to define a data definition subclass. The BEGIN clause is good practice, but not necessary.

  ## This code defines a data definition class named 
  ## 'TQIS::HighSchoolFriends'
  ## The BEGIN clause is not strictly necessary

  BEGIN {
        package TQIS::HighSchoolFriends ;
        use base qw( NoSQL::PL2SQL::Simple ) ;
        }

The actual data definition is built using the data definition methods described above. This approach should be easier for users who have never written a class definition. Using a Perl terminal, these statements are simpler than using command line SQL. Otherwise, the code below could be written into a one-time throw-away script. Or, using an appropriate Perl interpreter, the data definitions can be easily managed with a web-based tool.

  ## Create another Data Definition Class for this example
  BEGIN {
        package TQIS::HighSchools ;
        use base qw( NoSQL::PL2SQL::Simple )
        }

  ## TQIS::HighSchools contains only a list of schools.  This example
  ## illustrates how to normalize data using NoSQL::PL2SQL::Simple
  TQIS::HighSchools->new->addTextIndex( qw( highschool ) ) ;

  ## Create an instance of the TQIS::HighSchoolFriends data definition
  my $friends = new TQIS::HighSchoolFriends ;

  ## Use the data definition methods to perform the actual data 
  ## definition
  $friends->addTextIndex( qw( yeargraduated lastname ) ) ;

  ## A strictly normalized RDB would use a separate table to reflect
  ## this relationship.  I find this simpler approach of using
  ## internal references more durable: This spouse value will be the
  ## record id of another HighSchoolFriend object.
  $friends->addNumberIndex( qw( spouse ) ) ;
  
  ## Normalizing means these highschool data values will be record
  ## ids from the other data definition class.
  $friends->addNumberIndex( highschool ) ;
  
  ## The data definition is reflected in the properties of the
  ## class instantiation
  print join "\n", %$friends ;

  ## __END__
  ## Throwaway setup script ends here

  ## This remaining code is considered runtime.  For example, this
  ## code would be run to create a web form to add new (or modify)
  ## HighSchoolFriends objects
  %schools = TQIS::HighSchools->new->highschool ;

  ## Populate a web form dropdown with the %schools set, eg:
  ## <select name="highschool">
  ##   <option value="210">Pioneer High School</option>
  ## </select>

  ## Runtime code showing queries using this data definition

  ## This statement chains the constructor with an implicit query
  ## and converts the result to a scalar.
  my $phs = TQIS::HighSchools->new->highschool(
                'Pioneer High School')->recordID ;

  ## A more robust solution is to select the school from a list
  ## Naturally, this works better using a mechanism such as 
  ## a web form combo box
  my %schools = TQIS::HighSchools->new->highschool ;
  my %inverse = reverse %schools ;
  $phs = $inverse{'Pioneer High School'} ;

  ## Incidentally, all these statements will blow up.  Don't try them.
  # $phs = TQIS::HighSchools->new->highschool->{'Pioneer High School'} ;
  # $phs = TQIS::HighSchools->new->{highschool}->{'Pioneer High School'} ;
  # $phs = TQIS::HighSchools->new->highschool('Pioneer High School') ;

  ## The actual query statements
  my @pioneerfriends = $friends->highschool( $phs )->records ;

  or

  my @classmates = $friends->query( 
                highschool => $phs,
                yeargraduated => '1989',
                )->records ;

The data definition only needs to define object properties that will be queried. See the INDEXES section below.

The definition may also include an archive element. (Object elements named archive will be ignored.) The archive data definition prevents object data from being deleted. Instead, when a record is replaced, the original record is assigned a new object id; and the replacement is inserted into the database with the value of the existing object id.

  ## Create a new data definition subclass
  BEGIN {
        package TQIS::GPRC::Members ;
        use base ( NoSQL::PL2SQL::Simple ) ;
        }

  ## Create an instantiation
  my $members = new TQIS::GPRC::Members ;

  ## Create a new member from webform data 
  my $memberid = $members->record( CGI->new->Vars )->SQLObjectID ;

  ## An application may email the $memberid as a confirmation.
  ## Later on, the $memberid may be used to replace the member
  ## data with something else
  $members->record( $memberid )->save( CGI->new->Vars ) ;

  ## The new data is returned by default
  my $member = $members->record( $memberid ) ;

  ## Code to return the original data
  my @archive = $members->archive( $memberid )->records ;
  $member = $archive[-1] ;      ## returned set is chronological

Data Source Subclass

Unfortunately, none of the examples presented so far will actually run. Like any database interface, NoSQL::PL2SQL::Simple needs an actual database. Advanced users should create an abstract Data Source subclass, described below. Other users will need to define data sources and pass them to the constructors of each data definition.

NoSQL::PL2SQL::Simple uses its own version of DBI, NoSQL::PL2SQL::DBI. NoSQL::PL2SQL::DBI contains the methods to generate the specific SQL. NoSQL::PL2SQL::DBI subclasses are used to handle discrepencies in the SQL syntax.

NoSQL::PL2SQL is based on a single table which can be shared across multiple implementations. As in the NoSQL::PL2SQL documentation examples, it's commonly called $dsn.

NoSQL::PL2SQL includes implementations for SQLite and MySQL. These examples use SQLite, which comes preinstalled.

  ## define a data source
  use NoSQL::PL2SQL::DBI::SQLite ;
  my $dsn = new NoSQL::PL2SQL::DBI::SQLite $tablename = 'objectdata' ;
  $dsn->connect( 'dbi:SQLite:dbname=:memory:', '', '') ;

  ## NoSQL::PL2SQL::Simple queries actually require a second DSN
  my $index = $dsn->table('querydata') ;

  ## run once and only once
  NoSQL::PL2SQL::Simple->loadschema( $dsn, $index ) ;

  ## The DSN objects are passed to the constructor
  my $members = new TQIS::GPRC::Members $dsn, $index ;

  ## The DSN objects can also be shared among all the implementations
  ## we've used in our examples

  my $collection = new MyArbitraryClass $dsn, $index ;
  my $friends = new TQIS::HighSchoolFriends $dsn, $index ;

In order to use the last two statements, we would need to redefine the DSN objects every time we instantiate another data definition subclass. The resulting code is much less concise than the examples.

The solution is to define an abstract data source subclass. The code is simple enough to cut and paste from the example below. But in order to work, this class must be an appropriately named file that can be found using @INC. For newer users, this might be a good opportunity to learn more about Perl OO programming.

  package TQIS::PL2SQL::DSN
  use base qw( NoSQL::PL2SQL::Simple ) ;        ## Do not change this line

  use NoSQL::PL2SQL::DBI::SQLite ;

  my @dsn = () ;                                ## Do not change this line

  ## data source subclasses override this dsn() method
  sub dsn {
        return @dsn if @dsn ;                   ## Do not change this line

        push @dsn, new NoSQL::PL2SQL::DBI::SQLite 'objectdata' ;
        $dsn[0]->connect( 'dbi:SQLite:dbname=:memory:', '', '') ;

        push @dsn, $dsn[0]->table('querydata') ;
        return @dsn ;                           ## Do not change this line
        }

  1 ;

Once complete, the data definition subclasses are very easy to declare:

  BEGIN {
        package MyArbitraryClass ;
        use base qw( TQIS::PL2SQL::DSN ) ;

        package TQIS::HighSchools ;
        use base qw( TQIS::PL2SQL::DSN ) ;

        package TQIS::HighSchoolFriends ;
        use base qw( TQIS::PL2SQL::DSN ) ;

        package TQIS::GPRC::Members ;
        use base qw( TQIS::PL2SQL::DSN ) ;
        }

  ## Run once and only once
  TQIS::PL2SQL::DSN->loadschema() ;

  ## Each of these statements should now succeed
  my $o = new MyArbitraryClass ;
  my $schools = new TQIS::HighSchools ;
  my $friends = new TQIS::HighSchoolFriends ;
  my $members = new TQIS::GPRC::Members ;

INDEXES

Should this section be named INDICES?

Typically, RDB data is indexed directly on the data set. NoSQL::PL2SQL::Simple indexes a second table that contains references to the first table. The biggest drawback to this approach is that records must be explicitly added and deleted from this second table. Happily, this additional complexity is automatically handled whenever records are added or modified. Provided the class definition is completed before object data is inserted, this approach should not affect anyone using NoSQL::PL2SQL::Simple.

However, a little housekeeping is required to modify an existing data set:

  my $friends = new TQIS::HighSchoolFriends ;
  $friends->addTextIndex('contactemail') ;

  ## This solution is inefficient, but comprehensive
  map { $friends->record( $_ )->save } $friends->query ;

  ## Alternatively, use this sequence, which must be 
  ## repeated for each new element definition
  map { $friends->record( $_ )->reindex('contactemail') } 
                        $friends->query ;

In order to implement NoSQL::PL2SQL's support for indeterminate objects, the class data definition may be completely independent of the object data. This approach requires explicit mapping, but can be useful:

  my $members = new TQIS::GPRC::Members ;
  $members->addTextIndex('byemail') ;
  
  ## add a new member
  $members->record( CGI->new->Vars, 
                workemail => 'byemail', homeemail => 'byemail' ) ;

  ## This approach creates two records in the index table (three if 
  ## there's an explicit byemail form field), and improves the
  ## data access capabilities as follows:

  ## Both these statements work
  my $jim = $members->byemail('jim@work.tqis.com')->record ;
  $jim = $members->byemail('jim@home.tqis.com')->record ;

  ## this approach must be implemented consistently
  $jim->save( CGI->new->Vars, workemail => 'byemail', homeemail => 'byemail' ) ;

  ## reindexing also allows mapping:
  map { $members->record( $_ )->reindex( workemail => 'byemail' ) }
                $members->query ;
  map { $members->record( $_ )->reindex( homeemail => 'byemail' ) }
                $members->query ;

MANY-TO-ONE RELATIONSHIPS

Most of the time, database records can be accessed using a one-to-many relationship. For example, a calendar application would primarily consist of event objects whose elements include a single date, say February 14. Many events can share that date, so when February 14 is queried, all the matching events can be displayed.

NoSQL::PL2SQL::Simple handles all of these relationships automatically. If the object's date is changed to February 17, the correspondence from February 14 is automatically broken and all future queries will return expected results.

This example, which assumes that events only correspond to a single date, represents a one-to-many relationship. Eg, one date corresponds to many events. In a more complicated data model, events can correspond to many dates, if the event occurs repeatedly or spans more than one day. This more complicated model is called many-to-many: multiple objects per key and multiple keys per object.

NoSQL::PL2SQL::Simple automates the multiple objects per key functionality; but handling multiple keys per object requires explicit management using the keyValues() method. In a web-based calendar, the defined properties might be the record owner and location (zipcode). If event entries always correspond to a single date, then the date will also be a defined property. NoSQL::PL2SQL::Simple automatically manages queries for defined properties.

If the event can correspond to many dates, the web interface becomes more complicated. This codes demonstrates several different scenarios:

  my $events = new mycalendar::event ;          ## $events is an instance
  print $events->{eventdate}, "\n" ;            ## prints: I<datekey>
  my $event = $events->record( CGI->new->Vars ) ;       ## $event is an object

  ## One scenario is that the interface returns a set of dates
  ## such as a string containing many separated dates.
  my @dates = split /,/, $event->{datelist} ;
  $event->keyValues( eventdate => @dates ) ;    ## add the set of dates

  ## In this scenario, when the record is edited, simply replace the
  ## set of dates:
  $event->keyValues('eventdate')->clear() ;     ## clear the existing set
  $event->keyValues( eventdate => split /,/, CGI->new->Vars->{datelist} ) ;

  ## Another scenario is an interface worksheet that adds dates one at a time
  $event->keyValues( eventdate => CGI->new->Vars->{nextdate} ) ;

  ## The named property I<nextdate> must be distinct from the named
  ## property I<eventdate> to prevent automatic key management

  ## Repeat this process everytime a new date is added.  The interface should
  ## prevent a duplicate date selection
  $event->keyValues( eventdate => CGI->new->Vars->{nextdate} ) ;

  ## The interface should allow users to delete date selections
  my $delete = CGI->new->Vars->{deletedate} ;
  my @replace = grep $_ ne $delete, $event->keyValues('eventdate') ;
  $event->keyValues('eventdate')->clear() ;
  $event->keyValues( eventdate => @replace ) ;

DEVELOPERS NOTES

If you're planning to write an implementation of NoSQL::PL2SQL, take a look at samples/pl2sql/comments.pl in the distribution package. NoSQL::PL2SQL::Simple is intended to be simple only from the point of view of its interface.

If you're interested in writing OO modules, this module is unusually deliberate and worth noting.

First, the module contains several private methods: update(), recno(), filter(), index(), matching(), and indexmap(). These methods may disappear in future versions, or require changes where they're called. So these methods are out of bounds. Also, since NoSQL::PL2SQL::Simple is intended for subclassing, the number of base methods should be deliberately small to minimize namespace conflicts. The source code illustrates how to defined these private methods.

Second, I normally would not write a new() constructor- most implementations will want to override the constructor. But in the interest of a simple API, I prefer constructors with no arguments. So the new() method is used to generate an instance, which in turn uses record() to create objects. This approach may be insufficiently robust.

Perl OO generally consists of blessed data structures. Once blessed, a data structure is associated with specific class methods that can make assumptions about that data. NoSQL::PL2SQL::Simple is a bit more complicated. But advanced users may want their data definition classes to include this functionality and define additional class methods. They should also rename the constructor.

These users should be familiar enough with Perl OO to implement an abstract data source subclass. One is defined above and used in this example:

  package TQIS::MyAdvancedClass ;
  use base qw( TQIS::PL2SQL::DSN ) ;

  ## Rename the constructor to dataDefinition()
  sub dataDefinition {
        my $package = shift ;
        return bless NoSQL::PL2SQL::Simple::new( $package, @_ ),
                        $package ;
        }

  ## Create a default constructor that returns an object instead 
  ## of an instance.  Replace the generic code with something useful.
  sub new {
        my $package = shift ;
        return bless {}, $package ;
        }

  ## Create methods intended for objects, not instances.  This one
  ## prints out a mailing label from a contact record.
  sub mailingLabel {
        my $self = shift ;
        return join '', $self->{name}, "\n",
                        $self->{address}, "\n",
                        $self->{city}, ' ', 
                        $self->{state}, '  ',
                        $self->{zipcode}, "\n" ;
        }

  ## __END__

  ## Here's how the class might be used:

  use TQIS::MyAdvancedClass ;

  my $contacts = TQIS::MyAdvancedClass->dataDefinition ;
  print $contacts->email('jim@tqis.com')->record->mailingLabel ;

Third, NoSQL::PL2SQL::Simple uses private properties to avoid potentially overwriting user data.

Using Data::Dumper on both an unblessed structure and its corresponding blessed object might indicate that the two are equivalent. In fact, the blessed object has additional private properties. For example, the object id property can only be accessed using the SQLObjectID() method. Private properties in Perl OO can share a name assigned to a public property without conflict.

EXPORT

None by default.

SEE ALSO

NoSQL::PL2SQL
NoSQL::PL2SQL::DBI
http://pl2sql.tqis.com/

AUTHOR

Jim Schueler, <jim@tqis.com>

COPYRIGHT AND LICENSE

Copyright (C) 2012 Jim Schueler

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.9 or, at your option, any later version of Perl 5 you may have available.