NAME

HTML::Tabulate - HTML table rendering class

SYNOPSIS

    use HTML::Tabulate qw(render);

    # Setup a simple table definition hashref
    $table_defn = { 
        table => { border => 0, cellpadding => 0, cellspacing => 3 },
        th => { class => 'foobar' },
        null => ' ',
        labels => 1,
        stripe => '#cccccc',
    };

    # Render a dataset using this table definition (procedural version)
    print render($dataset, $table_defn);

    # Object-oriented version
    $t = HTML::Tabulate->new($table_defn);
    print $t->render($dataset);

    # Setup some dataset specific settings
    $table_defn2 = {
        fields => [ qw(emp_id name title edit new_flag) ],
        field_attr => {
            # format employee ids, add a link to employee page
            emp_id => {
                format => '%-05d',
                link => "emp.html?id=%s",
                link_target => '_blank',
                align => 'right',
            },
            # uppercase all names
            qr/name$/ => { format => sub { uc(shift) } },
            # highlight new employees
            new_flag => {
                class => sub { 
                    my ($data, $row, $field) = @_;
                    $data =~ m/^y$/i ? 'new' : 'old';
                },
            },
        },
    };

    # Render the table using the original and additional settings
    print $t->render($dataset, $table_defn2);

DESCRIPTION

HTML::Tabulate is used to render/display a given set of data in an HTML table. It takes a data set and a presentation definition and applies the presentation to the data set to produce the HTML table output. The presentation definition accepts arguments corresponding to HTML table tags ('table', 'tr', 'th', 'td' etc.), to define attributes for those tags, plus additional arguments for other aspects of the presentation. HTML::Tabulate supports advanced features like automatic striping, arbitrary cell formatting, link creation, etc.

Presentation definitions can be defined in multiple passes, which are progressively merged, allowing general defaults to be defined in common and then overridden by more specific requirements. Presentation definitions are stored in the current object, except for those defined for a specific 'render', which are temporary.

Supported data sets include arrayrefs of arrayrefs (DBI selectall_arrayref, for example), arrayrefs of hashrefs, a simple hashref (producing single row tables), iterator objects that support first() and next() methods (like DBIx::Recordset objects or Class::DBI/DBIx::Class iterators), and (as of version 0.31) iterator subroutines returning successive rows in the dataset.

By default arrayref-based datasets are interpreted as containing successive table rows; a column-based interpretation can be forced using style => 'across'.

The primary interface is object-oriented, but a procedural interface is also available where the extra flexibility of the OO interface is not required.

PRESENTATION DEFINITION ARGUMENTS

table

Hashref. Elements become attributes on the <table> tag. e.g.

  table => { border => 0, cellpadding => 3, align => 'center' }

tr

Hashref. Elements become attributes on <tr> tags. Element values may be either scalars, which are used as literals, or subroutine references whose result value is used as the value of the tr attribute.

Note that 'tr' element subs are called differently depending on the 'style' of the table. For 'down' style tables, they are called with a single argument:

  $sub->( $data_row )

which is the reference to the current data row. For 'across' style tables, they are called with two arguments:

  $sub->( $across_row, $data )

where the first is an arrayref of the values in the data slice (column) in your dataset that are being rendered as the current row (including labels, if used), and the second is the full dataset as an arrayref of your data rows.

For instance:

  style => 'across',
  labels => 1, 
  tr => {
    class => sub { 
      my $r = shift; my $name = $r->[0]; $name =~ s/\s+/_/; lc $name
    },
  },

will set the 'class' attribute on the 'tr' to be a lowercased underscored version of the row label.

thead

Scalar/hashref. If defined and true, the first line of the table (whether labels or data) will be wrapped in <thead> ... </thead> tags. Any entries in the hashref will be used as attributes for the thead tag. Note that theads require a tbody, so tbody (following) will be set to 1 if undefined.

tbody

Scalar/hashref. If defined and true, the default treatment is to wrap the table body (the non-labels portion of the table) in a single set of <tbody> .. </tbody> tags. Any entries in the hashref (except for '-field' and '-rows', used below) will be used as attributes for the tbody tag.

Two additional tbody styles are supported. If a '"-field" => "FIELDNAME"' element exists in the tbody hashref, then the table body will be broken into tbody sections whenever the value of the given field changes (does not necessarily need to be a displayed field, of course) e.g.

  tbody => { '-field' => 'emp_gender' }

If a '"-rows" => NUMBER' element exists in the tbody hashref, the table body will be broken into tbody sections every NUMBER rows. e.g.

  tbody => { '-rows' => 25 }

thtr

Hashref. Elements become attributes on the <tr> tag of the label/heading row. (For 'across' style tables, where labels are displayed down the page, rather than in a row, thtr elements become attributes of the individual <th> tags.) Element values must be scalars.

th

Hashref. Elements become attributes on the <th> tags used for labels/headings. Element values may be either scalars, which are used as literals, or subroutine references, which are called with the following arguments:

  $sub->( $data, $row, $field )

and the result used as the attribute value. The arguments are: $data is the (label) value; $row is a reference to the entire row; and $field is the name of the field (so subreferences can be potentially used for more than one field).

For example, given the following set of labels on a table:

  'Emp ID', 'Emp Name', 'Emp Title', 'Emp Birth Dt'

you could define a class attribute to the <th> tag by doing:

  th => {
    class => sub {
      my ($d, $r, $f) = @_;
      $d =~ s/^Emp //;
      $d =~ s/\s+/_/g;
      lc $d
    },
  }

which would give a th line like (line breaks added for clarity):

  <tr>
  <th class="id">Emp ID</th>
  <th class="name">Emp Name</th>
  <th class="title">Emp Title</th>
  <th class="birth_dt">Emp Birth Dt</th>
  </tr>

td

Hashref. Elements become attributes on <td> tags. Hash values may be either scalars, which are used directly, or subroutine references, which are called with the following arguments:

  $sub->( $data, $row, $field )

and the result used as the attribute value. See the preceding th item for further explanation and discussion.
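
Because th and td attribute subs take the same three arguments, a sub can be tried out on its own before it goes into a definition. The following is a minimal sketch in plain Perl; label_class is a hypothetical name, not part of HTML::Tabulate, and the calls only imitate how such a sub would be invoked:

```perl
use strict;
use warnings;

# Hypothetical helper (not part of HTML::Tabulate): derive a CSS class
# from a label by dropping the 'Emp ' prefix, underscoring whitespace,
# and lowercasing the result.
sub label_class {
    my ($data, $row, $field) = @_;
    $data =~ s/^Emp //;
    $data =~ s/\s+/_/g;
    return lc $data;
}

my @labels  = ('Emp ID', 'Emp Name', 'Emp Title', 'Emp Birth Dt');
my @classes = map { label_class($_, \@labels, undef) } @labels;
print "@classes\n";    # id name title birth_dt
```

A sub like this could then be supplied as th => { class => \&label_class }.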

fields

Arrayref. Defines the order in which fields are to be output for this table, using the field names from the dataset. e.g.

  fields => [ qw(emp_id emp_name emp_title emp_birth_dt) ]

If 'fields' is not defined at render time and the dataset is not array-based, HTML::Tabulate will attempt to derive a useful default set from your data, and croaks if it is not successful.

fields_add

Hashref. Used to define additional fields to be included in the output to supplement a default field list, or fields derived from a data object itself. The keys of the fields_add hashref are existing field names; the values are scalar values or arrayref lists of values to be inserted into the field list after the key field. e.g.

  fields_add => {
    emp_name => [ 'emp_givenname', 'emp_surname' ],
    emp_birth_dt => 'edit',
  }

applied to a fields list qw(emp_id emp_name emp_title emp_birth_dt) produces a composite field list containing:

  qw(emp_id emp_name emp_givenname emp_surname emp_title 
     emp_birth_dt edit)

fields_omit

Arrayref. Used to omit fields from the base field list. e.g.

  fields_omit => [ qw(emp_modify_ts emp_create_ts) ]

in_fields

Arrayref. Defines the order in which fields are defined in the dataset, if different to the output order defined in 'fields' above. e.g.

  in_fields => [ qw(emp_id emp_title emp_birth_dt emp_name) ]

Using in_fields only makes sense if the dataset rows are arrayrefs.
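
The way 'fields', 'fields_add' and 'fields_omit' combine can be pictured with a small plain-Perl sketch. compose_fields is a hypothetical helper, not part of HTML::Tabulate, and the order in which it applies omissions and additions is an assumption made for illustration only:

```perl
use strict;
use warnings;

# Hypothetical helper (not part of HTML::Tabulate): build an output
# field list from a base list plus fields_add / fields_omit settings.
sub compose_fields {
    my ($fields, $fields_add, $fields_omit) = @_;
    my %omit = map { $_ => 1 } @{ $fields_omit || [] };
    my @out;
    for my $f (@$fields) {
        push @out, $f unless $omit{$f};
        next unless $fields_add && exists $fields_add->{$f};
        my $extra = $fields_add->{$f};
        # Values may be a single scalar or an arrayref of field names
        push @out, ref $extra eq 'ARRAY' ? @$extra : $extra;
    }
    return @out;
}

my @fields = compose_fields(
    [ qw(emp_id emp_name emp_title emp_birth_dt emp_modify_ts) ],
    { emp_name => [ 'emp_givenname', 'emp_surname' ], emp_birth_dt => 'edit' },
    [ qw(emp_modify_ts) ],
);
print "@fields\n";
# emp_id emp_name emp_givenname emp_surname emp_title emp_birth_dt edit
```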

derived

Arrayref. Defines fields that are not present in the underlying data, to avoid unnecessary lookups. (You are presumably deriving these values from other data in the row via a 'value' sub or something.)

Can also be set as a derived flag in per-field field_attr sections, if you prefer.

style

Scalar, either 'down' (the default), or 'across', to render data 'rows' as table 'columns'.
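
As a rough picture of what 'across' means for the data, the sketch below transposes row-oriented data into per-field slices, each of which would become one table row. across_rows is a hypothetical helper, not part of HTML::Tabulate, and using the raw field names as labels is an assumption:

```perl
use strict;
use warnings;

# Hypothetical helper (not part of HTML::Tabulate): turn row-oriented
# data into the per-field slices that an 'across' table shows as rows.
sub across_rows {
    my ($fields, $data, $labels) = @_;
    my @rows;
    for my $i (0 .. $#$fields) {
        my @slice = map { $_->[$i] } @$data;
        # With labels on, the label leads each rendered row
        unshift @slice, $fields->[$i] if $labels;
        push @rows, \@slice;
    }
    return \@rows;
}

my $data = [ [ 1, 'Ann' ], [ 2, 'Bob' ] ];
my $rows = across_rows([ qw(emp_id emp_name) ], $data, 1);
print join(' | ', @$_), "\n" for @$rows;
# emp_id | 1 | 2
# emp_name | Ann | Bob
```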

xhtml

Scalar (boolean). Turns on 'xhtml' mode if true. xhtml mode closes empty elements with a trailing slash (e.g. <br />), and renders minimised attributes in HTML (e.g. nowrap, disabled, selected, etc.) in non-minimised (nowrap="nowrap") format. Default: 0.

labels

Scalar (boolean), or hashref (mapping field keys to label/heading values). Labels can also be defined using the 'label' attribute argument in per-field attribute definitions (see 'label' below). e.g.

  # Turn labels on, derived from field names, or defined per-field
  labels => 1

label_links

Hashref, mapping field keys to URLs (full URLs or absolute or relative paths) to be used as the targets when making the label for that field into an HTML link. e.g.

  labels => { emp_id => 'Emp ID' }, 
  label_links => { emp_id => "me.html?order=%s" }

will create a label for the emp_id field of:

  <a href="me.html?order=emp_id">Emp ID</a>

stripe

Scalar, arrayref, or hashref. A scalar or an arrayref of scalars should be HTML color values. Single scalars are rendered as HTML 'bgcolor' values on the <tr> tags of alternate rows (i.e. alternating with no bgcolor tag rows), beginning with the label/header row, if one exists. Multiple scalars in an arrayref are rendered as HTML 'bgcolor' values on the <tr> tags of successive rows, cycling through the whole array before starting at the beginning again. e.g.

  # alternate grey and default bgcolor bands
  stripe => '#999999'             

  # successive red, green, and blue stripes
  stripe => [ '#cc0000', '#00cc00', '#0000cc' ]

Stripes that are hashrefs or an arrayref of hashrefs are rendered as attributes to the <tr> tags on the rows to which they apply. Similarly to scalars, single hashrefs are applied to every second <tr> tag, beginning with the label/header row, while multiple hashrefs in an arrayref are applied to successive rows, cycling through the array before beginning again. e.g.

  # alternate stripe and default rows
  stripe => { class => 'stripe' }

  # alternating between two stripe classes
  stripe => [ { class => 'stripe1' }, { class => 'stripe2' } ]

null

Scalar, defining a string to use in place of any empty data value (undef or eq ''). e.g.

  # Replace all empty fields with non-breaking spaces
  null => '&nbsp;'

trim

Scalar (boolean). If true, leading and trailing whitespace is removed from data values.

field_attr

Hashref, defining per-field attribute definitions. Three kinds of keys are supported:

-defaults

The special literal '-defaults' is used to define defaults for all fields (but can be overridden by more specific definitions).

qr() regular expressions

qr-quoted regular expressions are used as defaults for fields where the regex matches the field name.

field names

Simple field names define attributes just for that field.

These are always merged in the order above, allowing defaults to be defined for all fields, overridden for fields matching particular regexes, and then overridden further per-field. e.g.

  # Align all fields left except timestamps (*_ts)
  field_attr => {
    -defaults => { align => 'left' },
    qr/_ts$/ => { align => 'center' },
    emp_create_ts => { label => 'Created' },
  },

Field attribute arguments are discussed in the following section.
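
The merge order can be illustrated with a short plain-Perl sketch. attrs_for is a hypothetical helper, not HTML::Tabulate's own code; in particular, recognising regex keys by their stringified "(?" prefix is an assumption made here because Perl stringifies qr// objects when they are used as hash keys:

```perl
use strict;
use warnings;

# Hypothetical helper (not part of HTML::Tabulate): resolve the
# attributes for one field, applying -defaults, then matching regex
# keys, then the exact field name. Regexes used as hash keys are
# stringified by Perl (e.g. "(?^:_ts$)"), so this sketch treats any
# key starting with "(?" as a regex; that test is an assumption.
sub attrs_for {
    my ($field_attr, $field) = @_;
    my %attr = %{ $field_attr->{-defaults} || {} };
    for my $key (sort grep { /^\(\?/ } keys %$field_attr) {
        next unless $field =~ /$key/;
        %attr = (%attr, %{ $field_attr->{$key} });
    }
    %attr = (%attr, %{ $field_attr->{$field} || {} });
    return \%attr;
}

my $field_attr = {
    -defaults     => { align => 'left' },
    qr/_ts$/      => { align => 'center' },
    emp_create_ts => { label => 'Created' },
};

my $ts   = attrs_for($field_attr, 'emp_create_ts');
my $name = attrs_for($field_attr, 'emp_name');
print "$ts->{align} $ts->{label} $name->{align}\n";    # center Created left
```

Later hashes win in each merge, which is what lets a per-field entry override a regex default, and a regex default override -defaults.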

title

Scalar, hashref, or subroutine reference, defining a title rendered above the table. A scalar title is interpreted as the title string, and rendered as a vanilla <h2> title (use hashref or subref variants for more control). A hashref title can contain 'value' and 'format' elements: 'value' is a scalar containing the title string, and 'format' is either a scalar sprintf pattern (default: '<h2>%s</h2>') used to render the title value, or a subref called with the following arguments:

    $format->($value, $dataset, $type)

(where $type is 'title') and should return the formatted title string to be used.

Subref titles are similar, except there is no separate title string involved; they are called with the following arguments:

    $title->($dataset, $type);

(where $type is 'title') and should return the formatted title string to be used.

Examples:

    # rendered: <h2>Employee Data</h2>
    title => 'Employee Data'
    # rendered: <h3 class="red_white_blue">Employee Data</h3>
    title => {
        value => 'Employee Data',
        format => '<h3 class="red_white_blue">%s</h3>',
    }
    # rendered (e.g.): <h2>Employee Data (3 records)</h2>
    title => sub {
        my ($set, $type) = @_;
        my $title = 'Employee Data';
        $title .= ' (' . scalar(@$set) . ' records)'
            if ref $set eq 'ARRAY';
        sprintf '<h2>%s</h2>', $title;
    }

text

Scalar, hashref, or subroutine reference, defining text to be included immediately before the table (but after a 'title', if any). Treated exactly like 'title' above, except that the $type argument passed to subrefs is 'text', and the default format defined is '<p>%s</p>'.

caption

Scalar, hashref, or subroutine reference, defining text to be included as a caption to the table. Two types of output are supported: the 'text' type is treated just like 'title' and 'text' above, except that the text is included immediately after the table, the $type argument passed to subrefs is 'caption', and the default format defined is '<p>%s</p>'.

From version 0.26, a new 'caption_caption' type is supported, which is rendered as a <caption> element within the table (with presentation typically controlled via CSS). To force this type, you should use a hashref caption argument, with an explicit type of 'caption_caption'. See below for examples.

For backward compatibility, the default is old-style type => 'caption'. That will change in a future release.

For example:

  # Old style text caption, rendered below table
  # rendered <p>Employee Data</p> (below table)
  caption => 'Employee Data'
  # rendered <div class="emp_data">Employee Data</div> (below table)
  caption => { 
    value => 'Employee Data', 
    format => '<div class="emp_data">%s</div>',
  }
  # rendered (e.g.): <p>Employee Data (3 records)</p> (below table)
  caption => sub {
      my ($set, $type) = @_;
      my $caption = 'Employee Data';
      $caption .= ' (' . scalar(@$set) . ' records)'
          if ref $set eq 'ARRAY';
      sprintf '<p>%s</p>', $caption;
  }

  # New-style <caption> caption, rendered within table
  # rendered <caption>Employee Data</caption> (within table)
  caption => { 
    type => 'caption_caption',
    value => 'Employee Data', 
  }
  # rendered (e.g.): <caption>Employee Data (3 records)</caption> (within table)
  caption => {
    type => 'caption_caption',
    value => 'Employee Data',
    format => sub {
      my ($caption, $set, $type) = @_;
      $caption .= ' (' . scalar(@$set) . ' records)'
          if ref $set eq 'ARRAY';
      $caption
    }
  }

colgroups and cols

Array reference containing an ordered set of hashrefs to be rendered as individual colgroup entries. Hash keys and values are mapped to attributes and values on the colgroup entries e.g.

  colgroups => [
    { align => 'center' },
    { align => 'left', span => 2 },
    { align => 'right' },
  ],

would be rendered as:

  <colgroup align="center">
  <colgroup align="left" span="2">
  <colgroup align="right">

A colgroup can also contain the special attribute cols, which defines a similar array reference containing a set of hashrefs, which are rendered as <col> items nested within the colgroup (as an alternative to using 'span'). For example, this:

  colgroups => [
    { align => 'center' },
    { align => 'left', cols => [
      { class => 'col1', span => '2' },
      { class => 'col2', width => 20 },
    ] },
  ],

would be rendered as:

    <colgroup align="center">
    <colgroup align="left">
    <col class="col1" span="2">
    <col class="col2" width="20">
    </colgroup>

data_prepend

Array reference containing supplementary data rows to be prepended to the table before the main dataset. data_prepend rows are otherwise treated exactly the same as main data rows.

Note that data_prepend is currently only supported for style => 'down'.

data_append

Array reference containing supplementary data rows to be appended to the table after the main dataset. data_append rows are otherwise treated exactly the same as main data rows.

Note that data_append is currently only supported for style => 'down'.

FIELD ATTRIBUTE ARGUMENTS

HTML attributes

Any field attribute that does not have a special meaning to HTML::Tabulate (see the remaining items in this section) is considered an HTML attribute and is used with the <td> tag for table cells for this field. e.g.

  field_attr => {
    emp_id => {
      align => 'center',
      valign => 'top',
      class => sub { my ($d, $r, $f) = @_; $f =~ s/^emp_//; $f },
    }
  }

will cause emp_id table cells to be displayed as:

  <td align="center" class="id" valign="top">

Attribute values may be either scalar, which are used directly, or subroutine references, which are called with the following arguments:

  $sub->( $data, $row, $field )

and the result used as the attribute value. The arguments are: the (unformatted) data value; a reference to the entire data row; and the field name (so subreferences can be potentially used for more than one field).

One HTML attribute that is handled specially is colspan. If you set colspan to a number greater than one, the cell will be rendered with <td colspan="$colspan" ...> (as normal), and the next $colspan-1 fields will be skipped entirely. For instance, if you have a three element table and define:

  field_attr => {
    name => {
      colspan => sub {
        my $data = shift;
        return $data =~ m/^Group/ ? 3 : undef;
      },
    },
  }

then any rows with names beginning with 'Group' will be rendered:

  <tr><td colspan="3">Group A</td></tr>

Note that 'colspan' is NOT supported with 'across' style tables, however.

value

Scalar or subroutine reference. Used to override or modify the current data value. If a scalar, it is taken as a literal. If a subroutine reference, it is called with the following arguments:

  $sub->( $data, $row, $field )

and the result used as the data value. The arguments are: the original data value itself; a reference to the entire data row; and the field name (so subrefs can potentially be used for more than one field).

This allows the value to be modified or set according to the current value, or based on any other value in the row (or anything else, for that matter) e.g.

  # Derive emp_fname from first word of emp_name
  field_attr => {
    emp_fname => { 
      value => sub { 
        my ($data, $row, $field) = @_; 
        if ($row->{emp_name} =~ m/^\s*(\w+)/) { return $1; }
        return '';
      },
    },
    edit => { value => 'edit' },
  }

format

Scalar or subroutine reference. Used to format the current data value. If scalar, is taken as a sprintf pattern, with the current data value as the single argument. If a subroutine reference, is called in the same way as the value subref above i.e. $format->($data_item, $row, $field)

link

Scalar or subroutine reference. Used as the link target to make an HTML link using the current data value. If scalar, the target is taken as a sprintf pattern, with the current data value as the single argument. If a subroutine reference, is called in the same way as the value subref described above i.e. $link->($data, $row, $field) e.g.

  field_attr => {
    emp_id => {
      link => 'emp.html?id=%s',
      format => '%05d',
    },
  }

creates a link in the table cell like:

  <a href="emp.html?id=1">00001</a>

Note that links are not created for labels/headings - to do so use the separate label_link argument below.
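
To make the interplay of scalar 'link' and 'format' patterns concrete, here is a minimal plain-Perl sketch that reproduces the example output above with sprintf. linked_cell is a hypothetical helper, not part of HTML::Tabulate, and it leaves out escaping and link attributes:

```perl
use strict;
use warnings;

# Hypothetical helper (not part of HTML::Tabulate): apply scalar
# 'link' and 'format' sprintf patterns to one raw data value.
sub linked_cell {
    my ($value, $link, $format) = @_;
    my $href = sprintf $link,   $value;    # link uses the raw value
    my $text = sprintf $format, $value;    # format shapes the display
    return qq{<a href="$href">$text</a>};
}

print linked_cell(1, 'emp.html?id=%s', '%05d'), "\n";
# <a href="emp.html?id=1">00001</a>
```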

link_*

Scalar or subroutine reference. Any attribute beginning with 'link_' is used as an attribute for the HTML link created for this field (with the 'link_' prefix removed, of course). Scalar values are used as literals; subroutine references are called in the same way as the value subref above i.e. $attr->($data_item, $row, $field) e.g.

  field_attr => {
    emp_id => {
      link => 'emp.html?id=%s',
      link_class => sub { my ($d, $r, $f) = @_; "class_$f" },
      link_target => '_blank',
      link_title => 'Employee details',
    },
  }

creates a link in the table cell like:

  <a class="class_emp_id" href="emp.html?id=123" target="_blank" title="Employee details">123</a>

label

Scalar or subroutine reference. Defines the label or heading to be used for this field. If scalar, the value is taken as a literal (cf. 'value' above). If a subroutine reference, it is called with the field name as the only argument (typically only useful for -defaults or regex-based labels). Entries in the top-level 'labels' hashref are mapped into per-field label entries.

label_link

Scalar or subroutine reference. Equivalent to the general 'link' argument above, but used to create link targets only for label/heading rows. Scalar values are taken as sprintf patterns using the label as argument; subroutine references are called in the same way as the value subref above i.e. $link->($data_item, $row, $field)

label_link_*

Scalar or subroutine reference. Like 'link_*' attributes above, used as attributes on the HTML link created for the label for this field. Scalar values are used as literals; subroutine references are called in the same way as the value subref above i.e. $attr->($data_item, $row, $field) e.g.

  field_attr => {
    emp_id => {
      label => 'Emp ID',
      label_link => sub { my ($d, $r, $f) = @_; "?order=$f" },
      label_link_target => '_blank',
      label_link_title => sub { my ($d, $r, $f) = @_; "Order by $d" },
    },
  }

creates a link for the label like:

  <a href="?order=emp_id" target="_blank" title="Order by Emp ID">Emp ID</a>

escape

Boolean (default true). HTML-escapes '<' and '>' characters in data values.

derived

Boolean (default false). Flag indicating that this is a derived field i.e. not present in the underlying data, allowing HTML::Tabulate to avoid unnecessary lookups. (You are presumably deriving these values from other data in the row via a 'value' sub or something.)

Can also be set in a top-level 'derived' arrayref, rather than per-field, if you prefer.

composite

Arrayref. New as of version 0.30, composite fields define an ordered list of other fields that you want to appear in a single cell. For instance, given individual name fields in your data you might want to define a composite name field to use in your table instead e.g.

  field_attr => {
    fullname => {
      composite => [ qw(given_name middle_initial surname) ],
    },
    surname => {
      format => sub { uc $_[0] },
    },
    # ...
  },

Typically, the base fields appear in your data (e.g. given_name, middle_initial, and surname) but not in your table, and your composite field appears in the table but not in your data (but other patterns do make sense too sometimes).

composite_join

Scalar or subroutine reference. If a scalar, functions as the string used to join the rendered composite fields together. If a subroutine reference, is called with the following arguments:

  $composite_join->(\@composite_fields, $row, $field_name)

and is expected to join the composite fields itself and return the joined string.

Default: ' '.
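
The two forms of composite_join can be compared in a short plain-Perl sketch. composite_cell is a hypothetical helper, not part of HTML::Tabulate; it assumes that the array handed to a join subref holds the already rendered component values, which the description above does not spell out:

```perl
use strict;
use warnings;

# Hypothetical helper (not part of HTML::Tabulate): render a composite
# cell by formatting each component field and joining the results.
sub composite_cell {
    my ($row, $components, $formats, $join) = @_;
    my @parts = map {
        my $fmt = $formats->{$_};
        $fmt ? $fmt->($row->{$_}, $row, $_) : $row->{$_};
    } @$components;
    # A coderef join receives the parts, the row, and the field name
    return ref $join eq 'CODE'
        ? $join->(\@parts, $row, 'fullname')
        : join($join, @parts);
}

my $row = { given_name => 'Jane', middle_initial => 'Q', surname => 'Doe' };
my @components = qw(given_name middle_initial surname);
my %formats = (surname => sub { uc $_[0] });

print composite_cell($row, \@components, \%formats, ' '), "\n";
# Jane Q DOE
print composite_cell($row, \@components, \%formats,
    sub { my ($parts) = @_; "$parts->[2], $parts->[0] $parts->[1]." }), "\n";
# DOE, Jane Q.
```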

METHODS

HTML::Tabulate has three main public methods:

new($table_defn)

Takes an optional presentation definition hashref for a table, sanity checks it (and croaks on failure), stores the definition, and returns a blessed HTML::Tabulate object.

merge($table_defn)

Checks the given presentation definition (croaking on failure), and then merges it with its internal definition, storing the result. This allows presentation definitions to be created in multiple passes, with general defaults overridden by more specific requirements.

render($dataset, $table_defn)

Takes a dataset and an optional presentation definition, creates a merged presentation definition from any prior definitions and the render one, and uses that merged definition to render the given dataset, returning the HTML table produced. The merged definition is discarded after the render; only definitions stored by the new() and merge() methods are persistent across renders.

render() can also be used procedurally if explicitly imported:

  use HTML::Tabulate qw(render);
  print render($dataset, $table_defn);

DATASETS

HTML::Tabulate supports the following dataset types:

Simple hashrefs

A simple hashref will generate a one-row table (or one column table if style is 'across'). Labels are derived from key names if not supplied.

Arrayrefs of arrayrefs

An arrayref of arrayrefs will generate a table with one row for each contained arrayref (or one column per arrayref if style is 'across'). Labels cannot be derived from arrayrefs, so they must be supplied if required.

Arrayrefs of hashrefs

An arrayref of hashrefs will generate a table with one row for each hashref (or one column per hashref if style is 'across'). Labels are derived from the key names of the first hashref if not supplied.

Arrayrefs of objects

An arrayref containing hash-based objects (i.e. blessed hashrefs) is treated just like an arrayref of unblessed hashrefs, generating a table with one row per object. Labels are derived from the key names of the first object if not supplied.

Iterators

Some kinds of iterators (utility objects used to access the members of a set) are also supported. If the iterator supports methods called First() and Next() or first() and next() then HTML::Tabulate will use those methods to walk the dataset. DBIx::Recordset objects and Class::DBI and DBIx::Class iterators definitely work; beyond those your mileage may vary - please let me know your successes and failures.

As of version 0.31, HTML::Tabulate also supports generic coderef iterators i.e. subroutines that return successive data rows on subsequent calls to the subroutine (and undef at end of data) e.g.

    # Toy example: given an array of rows in @data
    $t = HTML::Tabulate->new;
    print $t->render( sub { shift @data } );
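
Note that the toy example empties @data as it goes. A closure-based iterator avoids that; the sketch below is plain Perl, with make_iterator a hypothetical helper (not part of HTML::Tabulate) and the while loop standing in for whatever consumes the iterator:

```perl
use strict;
use warnings;

# Hypothetical helper (not part of HTML::Tabulate): wrap an array of
# rows in a closure that returns one row per call, then undef.
sub make_iterator {
    my @rows = @_;    # private copy, so the caller's data survives
    my $i = 0;
    return sub { $i < @rows ? $rows[$i++] : undef };
}

my @data = ([ 1, 'Ann' ], [ 2, 'Bob' ], [ 3, 'Cy' ]);
my $next = make_iterator(@data);

# Drain the iterator the way a consumer of this protocol would
my $count = 0;
while (defined(my $row = $next->())) {
    $count++;
}
print "$count rows read, ", scalar(@data), " still in \@data\n";
# 3 rows read, 3 still in @data
```

A closure built this way should be usable wherever a coderef iterator is accepted, since it follows the same contract of returning successive rows and then undef.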

SUBCLASSING

HTML::Tabulate is intended to be easy to subclass, to allow you to setup sensible defaults for site-wide use, for instance. Something like this seems to work well:

    package My::Tabulate;
    use base qw(HTML::Tabulate);

    sub new {
        my $class = shift;
        my $defn = shift || {};
        my %defaults = (
            # define table defaults here e.g.
            table => { border => 1 },
            labels => { foo => 'FOO', bar => 'BAR' },
        );
        my $self = $class->SUPER::new(\%defaults);
        $self->merge($defn);
        return $self;
    }

    1;

BUGS AND CAVEATS

Probably. Please let me know if you find something going awry.

Is now much bigger and more complicated than was originally envisaged. Needs to be completely refactored. Sometime.

AUTHOR

Gavin Carr <gavin@openfusion.com.au>

Contributors:

David Giller <dave@pdx.net> reported a bug in the generic subref iterator handling, and provided a fix (version 0.32).

Harry Danilevsky <harry@deerfieldcapital.com> - patch adding generic subref iterator support (version 0.31).

COPYRIGHT

Copyright 2003-2011, Gavin Carr.

This program is free software. You may copy or redistribute it under the same terms as perl itself.