The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.

NAME

DBIx::DataModel::Doc::Cookbook - Helpful recipes

DOCUMENTATION CONTEXT

This chapter is part of the DBIx::DataModel manual.

DESCRIPTION

This chapter provides some recipes for common ORM tasks.

SCHEMA DECLARATION

Automatically generate a schema

A schema skeleton can be produced automatically from the following external sources : a DBI connection, a SQL::Translator parser, or a DBIx::Class schema. See DBIx::DataModel::Schema::Generator. That schema skeleton contains enough information to be immediately usable with minimal functionalities; but it is usually a good idea to enrich the schema with additional specifications, like for example types and column definitions.

Add custom methods into a generated table class

Defining methods in any Perl class does not require to have a file corresponding to that class; it suffices to define the method within the appropriate package. So the easiest way to add methods into tables is to first let DBIx::DataModel create the schema and table classes, and then switch to those packages, all in the same file :

  # define schema, tables, associations (current package doesn't matter)
  DBIx::DataModel->Schema('Some::Schema')
    ->Table(qw/Foo foo foo_id/)
    ->Table(...)
    ->Association(...)
    ->...;

  # add a method into table 'Foo'
  package Some::Schema::Foo;
  sub my_added_method {
    my $self = shift;
    ...
  }

  # go back to main package
  package main;
  ...
   

Another way to achieve the same result is to use DBIx::DataModel's internal utility method for injecting methods into classes :

   use DBIx::DataModel::Meta::Utils qw/define_method/;
   define_method(
      class => 'Some::Schema::Foo',
      name  => 'my_added_method',
      body  => sub {my $self = shift; ...},
    );
  

Views within the ORM

define_table() declarations usually map directly to database tables or database views; but it is also possible to map to an SQL query, possibly with a predefined where clause :

  $schema->metadm->define_table(
    class       => 'View_example',
    db_name     => 'Foo INNER JOIN Bar ON Foo.fk=Bar.pk',
    where       => {col => $special_filter},
    primary_key => [qw/some_foo_col some_bar_col/],
    parents     => [map {$schema->metadm->table($_)} qw/Foo Bar/],
  );

The same can be declared through the front-end View() method :

  $schema->View('View_example', '*',
                'Foo INNER JOIN Bar ON Foo.fk=Bar.pk',
                {col => $special_filter}, [qw/Foo Bar/],
                {primary_key => [qw/some_foo_col some_bar_col/],
                 parents     => [map {$schema->metadm->table($_)} qw/Foo Bar/]},
                );

This is exactly the same idea as a database view, except that it is implemented within the ORM, not within the database. Such views can join several tables, or can specify WHERE clauses to filter the data. ORM views are useful to implement application-specific or short-lived requests, that would not be worth registering persistently within the database model. They can also be useful if you have no administration rights in the database.

Object inflation/deflation

The term "object inflation" means that a scalar value read from a column in the database is transformed into an Perl object in memory, and is transformed back into a scalar value when writing into the database. The standard example for such situations is the handling of dates, because Perl programs often need to perform operations on dates that are not possible with a plain scalar format.

Here is an example of automatic inflation/deflation of date columns to Perl objects of class Date::Simple :

  # declare column type
  use Date::Simple;
  $schema->Type(Date_simple => 
    from_DB => sub {Date::Simple->new($_[0]) if $_[0] },
    to_DB   => sub {$_[0] = $_[0]->as_str    if $_[0] },
  );
  
  # apply column type to columns
  My::Table1->metadm->define_column_type(Date_simple => qw/d_start d_end/);
  My::Table2->metadm->define_column_type(Date_simple => qw/d_birth/);

With this automatic conversion, all functionalities of Date::Simple can be applied to date columns within rows of Table1 and Table2 : comparisons, date arithmetics, etc.

Caveat: the from_DB / to_DB functions do not apply automatically within -where conditions. So the following would not work :

  use Date::Simple qw/today/;
  my $rows = $schema->table($name)->select(
    -where => {d_end => {'<' => today()}},  # BOGUS
  );

because today() returns a Date::Simple object that will not be understood by SQL::Abstract when generating the SQL query. DBIx::DataModel is not clever enough to inspect the -where conditions and decide which column types to apply, so you have to do it yourself :

  my $today = today()->as_str;
  my $rows = $schema->table($name)->select(
    -where => {d_end => {'<' => $today}},
  );

SQL Types

At places where a plain value is expected, you can put an arrayref of 2 elements, where the first element is a type specification, and the second element is the value. This is convenient when the DBD driver needs additional information about the values used in the statement. See "BIND VALUES WITH TYPES" in SQL::Abstract::More for explanations.

  my $rows = $source->select(
    -where => {col => [{sql_type => 'some_type'}, $val]}
  );
  $source->insert(
    {key => $pk, some_col => [{sql_type => 'some_type'}, $val]}
  );
  $record->update(
    {some_col => [{sql_type => 'some_type'}, $val]}
  );

This can also be automated within a to_DB handler :

  # adding type information for the DBD handler to inform Oracle about XML data
  $schema->Type(XML => 
     to_DB  => sub {$_[0] = [{dbd_attrs => {ora_type => ORA_XMLTYPE}}, $_[0]]
                        if $_[0]},
    );

Quoting table and column names

By default, table or column names are inserted "as is" in the generated SQL; but sometimes this could cause conflicts with SQL reserved words. The solution is to quote table and column names, by activating the quote_char option of SQL::Abstract, inherited through SQL::Abstract::More. Here is an example :

  # define the schema
  DBIx::DataModel->Schema('SCH', {sql_abstract_args => [quote_char => "`"]});

  # define a table
  SCH->Table(qw/Config CONFIG KEY/);

  # produce SQL with quoted table and column names
  my ($sql, @bind) = SCH::Config->select(
    -columns   => [qw/KEY VALUE/],
    -where     => {KEY => 123},
    -result_as => 'sql',
   );

  print $sql; # SELECT `KEY`, `VALUE` FROM `CONFIG` WHERE ( `KEY` = ? )

Self-referential associations

Associations can be self-referential, i.e. describing tree structures :

  $schema->Association([qw/OrganisationalUnit parent   1 ou_id       /],
                       [qw/OrganisationalUnit children * parent_ou_id/],

However, when there are several self-referential associations, we might get into problems : consider

  $schema->Association([qw/Person mother   1 pers_id  /],
                       [qw/Person children * mother_id/])
         ->Association([qw/Person father   1 pers_id  /],
                       [qw/Person children * father_id/]); # BUG: children

This does not work because there are two definitions of the "children" role name in the same class "Person". One solution is to distinguish these roles, and then write by hand a general "children" role :

  $schema->Association([qw/Person mother          1 pers_id  /],
                       [qw/Person mother_children * mother_id/])
         ->Association([qw/Person father          1 pers_id  /],
                       [qw/Person father_children * father_id/]);
  
  package MySchema::Person;
  sub children {
    my $self = shift;
    my $id = $self->{pers_id};
    my $sql = "SELECT * FROM Person WHERE mother_id = $id OR father_id = $id";
    my $children = $self->dbh->selectall_arrayref($sql, {Slice => {}});
    MySchema::Person->bless_from_DB($_) foreach @$children;
    return $children;
  }

Alternatively, since rolenames mother_children and father_children are most probably useless, we might just specify unidirectional associations :

  $schema->Association([qw/Person mother  1 pers_id  /],
                       [qw/Person ---     * mother_id/])
         ->Association([qw/Person father  1 pers_id  /],
                       [qw/Person ---     * father_id/]);

And here is a more sophisticated way to define the "children" method, that will accept additional "where" criteria, like every regular method.

  package MySchema::Person;
  sub children {
    my $self      = shift; # remaining args in @_ will be passed to select()
    my $class     = ref $self;
    my $id        = $self->{pers_id};
    my $statement = $self->schema->table($class)->select(
      -where => [mother_id => $id, 
                 father_id => $id],
      -result_as => 'statement'
    );
    return $statement->select(@_);
  }

This definition forces the join on mother_id or father_id, while leaving open the possibility for the caller to specify additional criteria. For example, all female children of a person (either father or mother) can now be retrieved through

  $person->children(-where => {gender => 'F'})

Observe that mother_id and father_id are inside an arrayref instead of a hashref, so that SQL::Abstract will generate an SQL 'OR'.

Schema versioning

Currently DBIx::DataModel has no specific support for schema versioning. See CPAN module DBIx::VersionedSchema, or switch to the DBIx::Class ORM, that has good support for schema versioning.

DATA RETRIEVAL

Database functions

Use normal SQL syntax for database functions, and give them column aliases (with a vertical bar |) in order to retrieve the results.

  my $row = $source->select(-columns   => [qw/MAX(col1)|max_col1
                                              AVG(col2)|avg_col2
                                              COUNT(DISTINCT(col3))|n_col3/],
                            -where     => ...,
                            -result_as => 'firstrow');
  print "max is : $row->{max_col1}, average is $row->{avg_col2}";

Or you can dispense with column aliases, and retrieve the results directly into an arrayref, using -result_as => 'flat_arrayref' :

  my $array_ref = $source->select(-columns   => [qw/MAX(col1)
                                                   AVG(col2)
                                                   COUNT(DISTINCT(col3))/],
                                  -where     => ...,
                                  -result_as => 'flat_arrayref');
  my ($max_col1, $avg_col2, $count_col3) = @$array_ref;

Caveat: from_DB handlers do not apply to database functions. So if the result needs any transformation, you have to specify a column type for it at the statement level :

  my $row = $source->select(
    -columns      => [qw/MAX(d_begin)|max_d_begin MIN(d_end)|min_d_end .../],
    -where        => ...,
    -column_types => {Date_simple => [qw/max_d_begin min_d_end/],
    -result_as    => 'firstrow'
  );

Conditions on functions with special syntax

Some database systems have SQL functions with special syntax. For example a fulltext search in Oracle is expressed as

  ... WHERE CONTAINS(fulltext_field, 'word') > 0

This does not fit well in a hashref to be passed as a -where condition for SQL::Abstract::More, because the name of the field and the bind value are lost within the SQL syntax. To make it easier, we define a special operator for SQL::Abstract::More :

  # define the schema
  DBIx::DataModel->Schema('SCH',
                          {sql_abstract_args => [sql_dialect => "Oracle12c",
                                                 special_ops => [{regex   => qr/^contains(:?_all|_any)?$/i,
                                                                  handler => \&_fulltext_contains_for_Oracle}]]});
  
  sub _fulltext_contains_for_Oracle {
    my ($self, $field, $op, $arg) = @_;

    my $sql = "CONTAINS($field, ?) > 0";
    my @bind;

    # Oracle connector for words : default '&', but '|' if op is -contains_any
    my $connector = ($op =~ /any$/) ? ' | ' : ' & ';

    # words to be passed to the CONTAINS function
    my @words = ref $arg ? @$arg : ($arg);
    @words = map { split /\s+/ } grep {$_} @words;

    @bind = (join $connector, @words);
    return ($sql, @bind);
  }

Now fulltext queries can be expressed easily as

  my $results = SCH->table('Table1')->select(
    -where => {fulltext_field1 => {-contains_all => ['ab', 'cd']},
               fulltext_field2 => {-contains_any => ['ef', 'gh', 'ij']},
              },
   );

Nested queries

For inserting a nested query within a basic query, we need to pass the SQL and bind values of the nested query to SQL::Abstract; the syntax for this is a reference to an arrayref (in other words a double reference), as explained in "Literal SQL with placeholders and bind values (subqueries)" in SQL::Abstract.

DBIx::DataModel has a feature to produce exactly this datastructure :

  my $subquery = $source1->select(..., -result_as => 'subquery');

Then it is easy to insert the subquery within another query.

  my $rows = $source2->select(
      -columns => ...,
      -where   => {foo => 123, bar => {-not_in => $subquery}},
   );

"Hashref inflation"

Unlike other ORMs, there is no need here to transform results into hashrefs, because rows returned by a select() can be used directly as hashrefs. For example here is a loop that prints a hash slice from each row :

  my $rows       = $schema->table($name)->select(...);
  my @print_cols = qw/col3 col6 col7/;
  foreach my $row (@$rows) {
    print @{$row}{@print_cols};
  }

The only differences between row objects and plain Perl hashrefs are that :

  • they are blessed into a source class

  • they may contain an additional key $row->{__schema} if DBIx::DataModel is used in multi-schema mode.

Those differences can often be ignored; but nevertheless they can be a problem with some external modules like JSON that croak when encoding a blessed reference. In that case you can use the unbless() function which removes both the blessing and the __schema key. Unblessing is recursively applied to nested datastructures :

  $schema->unbless($rows);
  my $json = JSON->new->encode($rows);

Common table expressions (WITH RECURSIVE)

The SQL syntax for common table expressions (CTEs), introduced in SQL 1999, defines a temporary name corresponding to a simple query, so that this name can be used in a more general SQL statement :

  WITH [RECURSIVE] <tmp_table_name> (<col1>, ...) AS (<simple_query>)
  SELECT <main_query>

This is useful in two situations :

  • when the tmp_table_name is needed at several places within the main query

  • for expressing queries that willl recursively traverse a graph of related nodes. See SQLite examples at https://sqlite.org/lang_with.html; but many other database management systems also support CTEs. possibly with some slight variations.

For using CTEs within DBIx::DataModel, the first step is encapsulate the WITH query as a new instance of SQL::Abstract::More, through the "with_recursive" in SQL::Abstract::More method. Then that instance can be passed to DBIx::DataModel statements through the -with argument. Here is an example borrowed from https://sqlite.org/lang_with.html :

  • suppose an initial table like this :

      CREATE TABLE family(name, mom, dad, bord, died)

    The declaration within DBIx::DataModel looks like this :

      my $schema = DBIx::DataModel->Schema('CTE_example');
      $schema->Table(qw/Family family name/); 
  • Encapsulate a descendant_of common table expression as a new instance of SQL::Abstract::More :

      sub sqla_with_CTE_descendant_of {
        my ($schema, $ancestor) = @_;
    
        return $schema->sql_abstract->with_recursive(
          [ -table     => 'parent_of',
            -columns   => [qw/name parent/],
            -as_select => {-columns => [qw/name mom/],
                           -from    => 'family',
                           -union   => [-columns => [qw/name dad/]]},
           ],
          [ -table     => 'descendant_of',
            -columns   => [qw/name/],
            -as_select => {-columns   => [qw/name/],
                           -from      => 'parent_of',
                           -where     => {parent => $ancestor},
                           -union_all => [-columns => [qw/parent_of.name/],
                                          -from    => [qw/-join parent_of {parent=name} descendant_of/]],
                       },
           ],
          );
      }

    Note: this is defined at the level of SQL::Abstract::More, not DBIx::DataModel, so the syntax for the join is -from => [qw/-join parent_of {parent=name} descendant_of/], following the specification in "join" in SQL::Abstract::More.

  • The CTE table descendant_of will recursively find all descendants of any given ancestor. This can be used as a subquery for selecting family members who are descendants :

      my $subquery    = \ ["SELECT name FROM descendant_of"];
      my $descendants = $schema->table('Family')->select(
        -with     => sqla_with_CTE_descendant_of($schema, $ancestor),
        -columns  => [qw/name born died/],
        -where    => {name => {-in => $subquery }},
        -order_by => 'born',
      );
      

Another approach would be to declare ancestor_of as a new table, and add a new association with the family table. This approach is displayed below; but it is not recommanded because it creates permanent metada within the schema, while CTEs are meant to be used as temporary constructs for building complex queries. Here is the example :

  $schema->Table(qw/Descendant_of descendant_of name/)
         ->Association([qw/Descendant_of descendants *  name/],
                       [qw/Family        family      1  name/]);
  
  my $descendants = $schema->join(qw/Descendant_of family/)->select(
    -with     => sqla_with_CTE_descendant_of($schema, $ancestor),
    -columns  => [qw/family.name born died/],
    -order_by => 'born',
  );

DATA UPDATE

Transaction

  # anonymous sub containing the work to do
  my $to_do = sub {
    $table1->insert(...);
    $table2->delete(...);
  };
  # so far nothing has happened in the database
  
  # now do the transaction
  $schema->do_transaction($to_do);

Nested transaction

  $schema->do_transaction(sub {
    do_something();
    $schema->do_transaction(sub { some_nested_code();       });
    $schema->do_transaction(sub { some_other_nested_code(); });
  });

Nested transaction involving another database

  $schema->dbh($initial_dbh);
  $schema->do_transaction(sub {

    # start working in $initial_dbh
    do_something();

    # now some work in $other_dbh
    $schema->do_transaction(sub { some_nested_code();       }, $other_dbh);

    # here, implicitly we are back in $initial_dbh
    $schema->do_transaction(sub { some_other_nested_code(); });
  });
  # commits in both $initial_dbh and $other_dbh are performed here

Generating primary keys

Most database systems have mechanisms to generate primary keys automatically, generally as a sequence of natural numbers; however, there may be situations where one would like primary keys to be generated under other algorithms, like for example taking a random number, or taking the next "free slot" in a sparse sequence of numbers. Algorithmic generation of keys can be implemented in the ORM layer by overriding the _singleInsert() method. Here is an example :

  sub insert_with_random_key {
    my ($self) = @_;
    my $class = ref $self;
    my ($key_column) = $class->primary_key;
  
    for (1..$MAX_ATTEMPTS) {
      my $random_key = int(rand($MAX_RANDOM));
  
        $self->{$key_column} = $random_key;
        eval {$self->_rawInsert; 1} 
          and return $random_key;   # SUCCESS

        # if duplication error, try again; otherwise die
        last unless $DBI::errstr =~ $DUPLICATE_ERROR;
     }
     croak "cannot generate a random key for $class: $@";
  }
  
  foreach my $class (@tables_with_random_keys) {
    define_method(
      class          => $schema->metadm->table($class)->class,
      name           => '_singleInsert',
      body           => \&insert_with_random_key,
    );
  }

Cascaded operations

Some database systems support cascaded operations : for example a constraint definition with a clause like ON DELETE CASCADE will automatically delete child rows (rows containing foreign keys) when the parent row (the row containing the primary key) is deleted.

DBIx::DataModel does not know about such cascaded operations in the database; but it can perform some cascaded operations at the ORM level, when tables are associated through a composition. In that case, the insert() method can accept a data tree as argument, and will automatically perform recursive inserts in the children tables; an example is given in the quickstart tutorial. Cascaded deletes are also supported :

  my $bach = HR->table('Employee')->fetch($bach_id); 
  $bach->expand('activities');
  $bach->delete; # deletes the Employee together with its Activities

The expand operations retrieve related records and add them into a tree in memory. Then delete removes from the database all records found in the tree.

Observe that this is not a "true" cascaded delete, because the client code is responsible for fetching the related records first.

Timestamp validation

Suppose we want to sure that the record was not touched between the time it was presented to the user in a display form and the time the user wants to update or delete that record.

In order to do this, we will suppose that every record in every table has a timestamp field TS_MODIF, updated automatically by a trigger within the database. When defining the schema, we register an auto_update callback on that column; such callbacks are called automatically both on update() and insert() calls :

  DBIx::DataModel->define_schema(
   class               => 'My::Schema',
   auto_update_columns => {TS_MODIF => \&_check_time_stamp},
  );

The body of the callback looks like this :

  sub _check_time_stamp {
    my ($record, $table, $where) = @_;
    if ($where) { # this is an update, not an insert

      my $displayed_timestamp = delete $record->{TS_MODIF};
      my $db_record  = $record->schema->table($table)->select(
        -columns   => 'TS_MODIF',
        -where     => $where,
        -for       => 'update', # optional, depends on your RDBMS
        -result_as => 'firstrow',
      )
        or croak "fetch timestamp: could not find record "
               . join(" / ", %$where);
     my $db_timestamp = $db_record->{TS_MODIF};
     $db_timestamp == $displayed_timestamp
       or croak "record in $table was modified by somebody else; please "
              . "refresh your screen and try again";
     }
  }

DATA CONVERSION

JSON

  use JSON;
  my $json_converter = JSON->new->convert_blessed(1);
  my $json_text      = $json_converter->encode($data_row);

By default, the JSON module refuses to convert any object into JSON; however, the "convert_blessed" in JSON option will accept to convert objects provided they possess a TO_JSON method. Such a method is implemented in the "DBIx::DataModel::Source" in DBIx::DataModel::Source class, so any data row can be converted into JSON.