The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.

NAME

Spreadsheet::Engine::Sheet - Spreadsheet basics

SYNOPSIS

  parse_sheet_save(\@lines, \my %sheetdata);
  my $outstr = create_sheet_save(\%sheetdata);
  
  add_to_editlog(\%headerdata, $str);

  parse_header_save(\@lines, my \%headerdata);
  my $outstr = create_header_save(\%headerdata);

  execute_sheet_command($sheetdata, $command);

  recalc_sheet(\%sheetdata);

DESCRIPTION

This is a motley bunch of functions for dealing with a spreadsheet file and/or data structure. If you plan to use any of these directly, be aware that they may move, vanish, or have significant interface changes in future releases.

EXPORTS

parse_sheet_save

  parse_sheet_save(\@lines, \my %sheetdata);

Sheet input routine. Fills %sheetdata given lines of text @lines.

Currently always returns nothing.

Sheet save format:

  linetype:param1:param2:...

Linetypes are:

  version:versionname - version of this format. Currently 1.3.

  cell:coord:type:value...:type:value... - Types are as follows:

       v:value - straight numeric value
       t:value - straight text/wiki-text in cell, encoded to handle \, :, newlines
       vt:fulltype:value - value with value type/subtype
       vtf:fulltype:value:formulatext - formula resulting in value with value type/subtype, value and text encoded
       vtc:fulltype:value:valuetext - formatted text constant resulting in value with value type/subtype, value and text encoded
       vf:fvalue:formulatext - formula resulting in value, value and text encoded (obsolete: only pre format version 1.1)
          fvalue - first char is "N" for numeric value, "T" for text value, "H" for HTML value, rest is the value
       e:errortext - Error text. Non-blank means formula parsing/calculation results in error.
       b:topborder#:rightborder#:bottomborder#:leftborder# - border# in sheet border list or blank if none
       l:layout# - number in cell layout list
       f:font# - number in sheet fonts list
       c:color# - sheet color list index for text
       bg:color# - sheet color list index for background color
       cf:format# - sheet cell format number for explicit format (align:left, etc.)
       cvf:valueformat# - sheet cell value format number (obsolete: only pre format v1.2)
       tvf:valueformat# - sheet cell text value format number
       ntvf:valueformat# - sheet cell non-text value format number
       colspan:numcols - number of columns spanned in merged cell
       rowspan:numrows - number of rows spanned in merged cell
       cssc:classname - name of CSS class to be used for cell when published instead of one calculated here
       csss:styletext - explicit CSS style information, encoded to handle :, etc.
       mod:allow - if "y" allow modification of cell for live "view" recalc

    col:
       w:widthval - number, "auto" (no width in <col> tag), number%, or blank (use default)
       hide: - yes/no, no is assumed if missing
    row:
       hide - yes/no, no is assumed if missing

    sheet:
       c:lastcol - number
       r:lastrow - number
       w:defaultcolwidth - number, "auto", number%, or blank (default->80)
       h:defaultrowheight - not used
       tf:format# - cell format number for sheet default for text values
       ntf:format# - cell format number for sheet default for non-text values (i.e., numbers)
       layout:layout# - default cell layout number in cell layout list
       font:font# - default font number in sheet font list
       vf:valueformat# - default number value format number in sheet valueformat list (obsolete: only pre format version 1.2)
       ntvf:valueformat# - default non-text (number) value format number in sheet valueformat list
       tvf:valueformat# - default text value format number in sheet valueformat list
       color:color# - default number for text color in sheet color list
       bgcolor:color# - default number for background color in sheet color list
       circularreferencecell:coord - cell coord with a circular reference
       recalc:value - on/off (on is default). If "on", appropriate changes to the sheet cause a recalc
       needsrecalc:value - yes/no (no is default). If "yes", formula values are not up to date

    name:name:description:value - name definition, name in uppercase, with value being "B5", "A1:B7", or "=formula"
    font:fontnum:value - text of font definition (style weight size family) for font fontnum
                         "*" for "style weight", size, or family, means use default (first look to sheet, then builtin)
    color:colornum:rgbvalue - text of color definition (e.g., rgb(255,255,255)) for color colornum
    border:bordernum:value - text of border definition (thickness style color) for border bordernum
    layout:layoutnum:value - text of vertical alignment and padding style for cell layout layoutnum:
                             vertical-alignment:vavalue;padding topval rightval bottomval leftval;
    cellformat:cformatnum:value - text of cell alignment (left/center/right) for cellformat cformatnum
    valueformat:vformatnum:value - text of number format (see format_value_for_display) for valueformat vformatnum (changed in v1.2)
    clipboardrange:upperleftcoord:bottomrightcoord - origin of clipboard data. Not present if clipboard empty.
       There must be a clipboardrange before any clipboard lines
    clipboard:coord:type:value:... - clipboard data, in same format as cell data

The resulting $sheetdata data structure is as follows:

   $sheetdata{version} - version of save file read in
   $sheetdata{datatypes}->{$coord} - Origin of {datavalues} value:
                                        v - typed in numeric value of some sort, constant, no formula
                                        t - typed in text, constant, no formula
                                        f - result of formula calculation ({formulas} has formula to calculate)
                                        c - constant of some sort with typed in text in {formulas} and value in {datavalues}
   $sheetdata{formulas}->{$coord} - Text of formula if {datatypes} is "f", no leading "=", or text of constant if "c"
   $sheetdata{datavalues}->{$coord} - a text or numeric value ready to be formatted for display or used in calculation
   $sheetdata{valuetypes}->{$coord} - the value type of the datavalue as 1 or more characters
                                      First char is "n" for numeric or "t" for text
                                      Second chars, if present, are sub-type, like "l" for logical (0=false, 1=true)
   $sheetdata{cellerrors}->{$coord} - If non-blank, error text for error in formula calculation
   $sheetdata{cellattribs}->{$coord}->
      {coord} - coord of cell - existence means non-blank cell
      {bt}, {br}, {bb}, {bl} - border number or null if no border
      {layout} - cell layout number or blank for default
      {font} - font number or blank for default
      {color} - color number for text or blank for default
      {bgcolor} - color number for the cell background or blank for default
      {cellformat} - cell format number if not default - controls horizontal alignment
      {textvalueformat} - value format number if not default - controls how the cell's text values are formatted into text for display
      {nontextvalueformat} - value format number if not default - controls how the cell's non-text values are turned into text for display
      {colspan}, {rowspan} - column span and row span for merged cells or blank for 1
      {cssc}, {csss} - explicit CSS class and CSS style for cell
      {mod} - if "y" allow modification in live view
   $sheetdata{colattribs}->{$colcoord}->
      {width} - column width if not default
      {hide} - hide column if yes
   $sheetdata{rowattribs}->{$rowcoord}->
      {height} - ignored
      {hide} - hide row if yes
   $sheetdata{sheetattribs}->{$attrib}->
      {lastcol} - number of columns in sheet
      {lastrow} - number of rows in sheet (more may be displayed when editing)
      {defaultcolwidth} - number, "auto", number%, or blank (default->80)
      {defaultrowheight} - not used
      {defaulttextformat} - cell format number for sheet default for text values
      {defaultnontextformat} - cell format number for sheet default for non-text values (i.e., numbers)
      {defaultlayout} - default cell layout number in sheet cell layout list
      {defaultfont} - default font number in sheet font list
      {defaulttextvalueformat} - default text value format number in sheet valueformat list
      {defaultnontextvalueformat} - default number value format number in sheet valueformat list
      {defaultcolor} - default number for text color in sheet color list
      {defaultbgcolor} - default number for background color in sheet color list
      {circularreferencecell} - cell coord with a circular reference
      {recalc} - on/off (on is default). If "on", appropriate changes to the sheet cause a recalc
      {needsrecalc} - yes/no (no is default). If "yes", formula values are not up to date
   $sheetdata{names}->{$name}-> - name is uppercase
      {desc} - description (optional)
      {definiton} - in the form of B5, A1:B7, or =formula
   $sheetdata{fonts}->[$index] - font specifications addressable by array position
   $sheetdata{fonthash}->{$value} - hash with font specification as keys and {fonts}->[] index position as values
   $sheetdata{colors}->[$index] - color specifications addressable by array position
   $sheetdata{colorhash}->{$value} - hash with color specification as keys and {colors}->[] index position as values
   $sheetdata{borderstyles}->[$index] - border style specifications addressable by array position
   $sheetdata{borderstylehash}->{$value} - hash with border style specification as keys and {borderstyles}->[] index position as values
   $sheetdata{layoutstyles}->[$index] - cell layout specifications addressable by array position
   $sheetdata{layoutstylehash}->{$value} - hash with cell layout specification as keys and {layoutstyle}->[] index position as values
   $sheetdata{cellformats}->[$index] - cell format specifications addressable by array position
   $sheetdata{cellformathash}->{$value} - hash with cell format specification as keys and {cellformats}->[] index position as values
   $sheetdata{valueformats}->[$index] - value format specifications addressable by array position
   $sheetdata{valueformathash}->{$value} - hash with value format specification as keys and {valueformats}->[] index position as values
   $sheetdata{clipboard}-> - the sheet's clipboard
      {range} - coord:coord range of where the clipboard contents came from or null if empty
      {datavalues} - like $sheetdata{datavalues} but for clipboard copy of cells
      {datatypes} - like $sheetdata{datatypes} but for clipboard copy of cells
      {valuetypes} - like $sheetdata{valuetypes} but for clipboard copy of cells
      {formulas} - like $sheetdata{formulas} but for clipboard copy of cells
      {cellerrors} - like $sheetdata{cellerrors} but for clipboard copy of cells
      {cellattribs} - like $sheetdata{cellattribs} but for clipboard copy of cells
   $sheetdata{loaderror} - if non-blank, there was an error loading this sheet and this is the text of that error

create_sheet_save

  my $outstr = create_sheet_save(\%sheetdata)

Sheet output routine. Returns a string ready to be saved in a file.

execute_sheet_command

  $ok = execute_sheet_command($sheetdata, $command);

Executes commands that modify the sheet data. Sets sheet "needsrecalc" as needed.

The commands are in the forms:

    set sheet attributename value (plus lastcol and lastrow)
    set 22 attributename value
    set B attributename value
    set A1 attributename value1 value2... (see each attribute below for details)
    set A1:B5 attributename value1 value2...
    erase/copy/cut/paste/fillright/filldown A1:B5 all/formulas/format
    clearclipboard
    merge C3:F3
    unmerge C3
    insertcol/insertrow C5
    deletecol/deleterow C5:E7
    name define NAME definition
    name desc NAME description
    name delete NAME

recalc_sheet

  recalc_sheet(\%sheetdata);

Recalculates the entire spreadsheet

parse_header_save

   parse_header_save(\@lines, my \%headerdata);

Returns "" if OK, otherwise error string.

Fills in %headerdata:

  $headerdata{version} - version number, currently 1.1
  $headerdata{fullname} - title of page
  $headerdata{templatetext} - template HTML
  $headerdata{templatefile} - where to get template (location:name), see get_template
  $headerdata{lastmodified} - date/time last modified
  $headerdata{lastauthor} - author when last modified
  $headerdata{basefiledt} - date/time of backup file before this set of edits or blank if new file first edits (survives rename)
  $headerdata{backupfiledt} - date/time of backup file holding this data (blank during edits, yyyy-mm-... in published/backup/archive)
  $headerdata{reverted} - if non-blank, name of backup file this came from (only during initial editing)
  $headerdata{editcomments} - comment text about this series of edits, used when listing backups and RSS
  $headerdata{publishhtml} - publish the HTML for this page - sometimes you only want access-controlled live view (yes/no - default yes)
  $headerdata{publishsource} - put a copy of the published .txt file along with HTML and allow live view of source (yes/no - default no)
  $headerdata{publishjs} - put an embeddable copy of the published HTML as a .js file along with HTML (yes/no - default no)
  $headerdata{publishlive} - (ignored and removed after 0.91) make the HTML be a redirect to the recalc code (yes/no - default no)
  $headerdata{viewwithoutlogin} - allow live view without being logged in (ignore login for this page)
  $headerdata{editlog} - array of entries about edits made since editing started (cleared on new open for edit)
     [0] - log entry: command string to execute_sheet_command or comment (starts with "# ")

create_header_save

  my $outstr = create_header_save(\%headerdata);

Header output routine

add_to_editlog

  add_to_editlog(\%headerdata, $str);

Adds $str to the header editlog. This should be either a string acceptable to execute_sheet_command or start with "# "

OTHER EXPORTS

These are currently exported, as they are used from multiple places. You shouldn't rely on this, however, as they will likely move somewhere else RSN.

convert_date_gregorian_to_julian

  $juliandate = convert_date_gregorian_to_julian($year, $month, $day);

From: http://aa.usno.navy.mil/faq/docs/JD_Formula.html

Uses: Fliegel, H. F. and van Flandern, T. C. (1968). Communications of the ACM, Vol. 11, No. 10 (October, 1968). Translated from the FORTRAN.

convert_date_julian_to_gregorian

  ($year, $month, $day) = convert_date_julian_to_gregorian($juliandate)

From: http://aa.usno.navy.mil/faq/docs/JD_Formula.html

Uses: Fliegel, H. F. and van Flandern, T. C. (1968). Communications of the ACM, Vol. 11, No. 10 (October, 1968). Translated from the FORTRAN.

determine_value_type

  $value = determine_value_type($rawvalue, \$type)

Takes a value and looks for special formatting like $, %, numbers, etc. Returns the value as a number or string and the type. Tries to follow the spec for spreadsheet function VALUE(v).

test_criteria

  test_criteria($value, $type, $criteria);

Determines whether a value/type meets the criteria. A criteria can be a numeric value, text beginning with <, <=, =, >=, >, <>, text by itself is start of text to match.

Returns 1 or 0 for true or false.

lookup_result_type

  $resulttype = lookup_result_type($type1, $type2, \%typelookup);

%typelookup has values of the following form:

    $typelookup{"typespec1"} = "|typespec2A:resultA|typespec2B:resultB|..."

First $type1 is looked up. If no match, then the first letter (major type) of $type1 plus "*" is looked up. $resulttype is $type1 if result is "1", $type2 if result is "2", otherwise the value of result.

copy_function_args

  copy_function_args(\@operand, \@foperand)

Pops operands from @operand and pushes on @foperand up to function start reversing order in the process.

function_args_error

  function_args_error($fname, \@operand, $$errortext)

Pushes appropriate error on operand stack and sets errortext, including $fname

function_specific_error

  function_specific_error($fname, \@operand, $errortext, $errortype, $text)

Pushes specified error and text on operand stack

top_of_stack_value_and_type

  ($value, $type) = top_of_stack_value_and_type(\%sheetdata, \@operand, \$errortext,)

Returns top of stack value and type and then pops the stack

operand_as_number

  $value = operand_as_number(\%sheetdata, \@operand, \$errortext, \$tostype)

Uses operand_value_and_type to get top of stack and pops it. Returns numeric value and type. Text values are treated as 0 if they can't be converted somehow.

operand_as_text

  $value = operand_as_text(\%sheetdata, \@operand, \$errortext, \$tostype)

Uses operand_value_and_type to get top of stack and pops it. Returns text value, preserving sub-type.

operand_value_and_type

  $value = operand_value_and_type(\%sheetdata, \@operand, \$errortext, \$operandtype)

Pops the top of stack and returns it, following a coord reference if necessary. Ranges are returned as if they were pushed onto the stack first coord first. Also sets $operandtype with "t", "n", "th", etc., as appropriate. Errortext is set if there is a reference to a cell with error.

decode_range_parts

  ($sheetdata, $col1num, $ncols, $row1num, $nrows) = decode_range_parts(\@sheetdata, $rangevalue, $rangetype)

Returns \@sheetdata for the sheet where the range is, as well as the number of the first column in the range, the number of columns, and equivalent row information.

If any errors, $sheetdata is returned as null.

coord_to_cr

  ($col, $row) = coord_to_cr($coord)

Turns B3 into (2, 3). The default for both is 1. If range, only do this to first coord.

cr_to_coord

  $coord = cr_to_coord($col, $row)

Turns (2, 3) into B3. The default for both is 1.

encode_for_save

  my $estring = encode_for_save($string)

Returns $estring where :, \n, and \ are escaped

decode_from_save

  my $estring = decode_from_save($string)

Returns $estring with \c, \n, \b and \\ un-escaped

html_escape / special_chars

  my $estring = html_escape($string)

Returns $estring where &, <, >, " are HTML escaped.

This used to be known as special_chars() but that usage is deprecated.

special_chars_nl

  my $estring = special_chars_nl($string)

Returns $estring where &, <, >, ", and LF are HTML escaped, CR's are removed

HELPERS

These are 'private' functions, not exported, and should not be relied on. The interface to any of these is subject to change at any time.

offset_formula_coords

  $updatedformula = offset_formula_coords($formula, $coloffset, $rowoffset);

Change relative cell references by offsets (even those to other worksheets so fill, paste, sort work as expected). If not what you want, use absolute references.

adjust_formula_coords

  $updatedformula = adjust_formula_coords($formula, $col, $coloffset, $row, $rowoffset)

Change all cell references to cells starting with $col/$row by offsets

format_value_for_display

  $displayvalue = format_value_for_display(\%sheetdata, $value, $cr, $linkstyle)

format_text_for_display

  $displayvalue = format_text_for_display($rawvalue, $valuetype, $valueformat, $sheetdata, $linkstyle)

format_number_for_display

  $displayvalue = format_number_for_display($rawvalue, $valuetype, $valueformat)

format_number_with_format_string

  $result = format_number_with_format_string($value, $format_string, $currency_char)

Use a format string to format a numeric value. Returns a string with the result. This is a subset of the normal styles accepted by many other spreadsheets, without fractions, E format, and @, and with any number of comparison fields and with [style=style-specification] (e.g., [style=color:red])

parse_format_string

  parse_format_string(\%format_defs, $format_string)

Takes a format string (e.g., "#,##0.00_);(#,##0.00)") and fills in %foramt_defs with the parsed info

 %format_defs
    {"#,##0.0"}->{} # elements in the hash are one hash for each format
       {operators}->[] # array of operators from parsing the format string (each a number)
       {operands}->[] # array of corresponding operators (each usually a string)
       {sectioninfo}->[] # one hash for each section of the format
          {start}
          {integerdigits}
          {fractiondigits}
          {commas}
          {percent}
          {thousandssep}
          {hasdates}
       {hascomparison} # true if any section has [<100], etc.

parse_format_bracket

  ($operator, $operand) = parse_format_bracket($bracketstr)

check_and_calc_cell

  $circref = check_and_calc_cell(\%sheetdata, $coord)

Recalculates one cell after making sure dependencies are calc'ed, too. If circular reference, returns non-null. Circular referenced detected by using $sheetdata->{checked}->{$coord}: null - not evaluated 1 - cell is being recursed into -- if get back here => circular reference 2 - cell was fully recursed into and calculated this recalc cycle

parse_formula_into_tokens

  \%parseinfo = parse_formula_into_tokens($line)

Parses a text string as if it was a spreadsheet formula

This uses a simple state machine run on each character in turn. States remember whether a number is being gathered, etc. The result is %parseinfo which has the following arrays with one entry for each token: {tokentext}->[] - the characters making up the parsed token, {tokentype}->[] - the type of the token, {tokenopcode}->[] - a single character version of an operator suitable for use in the precedence table and distinguishing between unary and binary + and -.

evaluate_parsed_formula

  ($value, $valuetype, $errortext) = evaluate_parsed_formula(\%parseinfo, \%sheetdata, $allowrangereturn)

Does the calculation expressed in a parsed formula, returning a value, its type, and error info.

If $allowrangereturn is present and true, can return a range (e.g., "A1:A10" - translated from "A1|A10|")

operand_as_coord

  $value = operand_as_coord(\%sheetdata, \@operand, \$errortext)

Gets top of stack and pops it. Returns coord value. All others are treated as an error.

operands_as_coord_on_sheet

  $value = operands_as_coord_on_sheet(\%sheetdata, \@operand, \$returntype, \$errortext)

Gets 2 at top of stack and pops them, treating them as sheetname!coord-or-name. Returns stack-style coord value (coord!sheetname, or coord!sheetname|coord|) and sets $returntype to coord or range. All others are treated as an error.

operands_as_range_on_sheet

  $value = operands_as_range_on_sheet(\%sheetdata, \@operand, \$returntype, \$errortext)

Gets 2 at top of stack and pops them, treating them as coord2-or-name:coord1. Name is evaluated on sheet of coord1. Returns stack-style range value (coord!sheetname|coord|) and sets $returntype to range. All others are treated as an error.

operand_as_sheetname

  $value = operand_as_sheetname(\%sheetdata, \@operand, \$errortext)

Gets top of stack and pops it. Returns sheetname value. All others are treated as an error. Accepts text, cell reference, and named value which is one of those two.

lookup_name

  $value = lookup_name(\%sheetdata, $name, \$valuetype, \$errortext)

Returns value and type of a named value. Names are case insensitive. Names may have a definition which is a coord (A1), a range (A1:B7), or a formula (=OFFSET(A1,0,0,5,1))

step_through_range_up

  $value = step_through_range_up(\@operand, $rangevalue, \$operandtype)

Returns next coord in a range, keeping track on the operand stack. Goes from bottom right across and up to upper left.

step_through_range_down

  $value = step_through_range_down(\@operand, $rangevalue, \$operandtype)

Returns next coord in a range, keeping track on the operand stack. Goes from upper left across and down to bottom right.

col_to_number

  $col = col_to_number($colname)

Turns B into 2. The default is 1.

number_to_col

  $coord = number_to_col($col)

Turns 2 into B. The default is 1.

special_chars_markup

  my $estring = special_chars_markup($string)

Returns $estring where &, <, >, " are HTML escaped ready for expand markup

url_encode

  my $estring = url_encode($string)

Returns $estring with special chars URL encoded.

Based on Mastering Regular Expressions, Jeffrey E. F. Friedl, additional legal characters added

url_encode_plain

  my $estring = url_encode_plain($string)

Returns $estring with special chars URL encoded for sending to others by HTTP, not publishing.

Based on Mastering Regular Expressions, Jeffrey E. F. Friedl, additional legal characters added.

find_in_sheet_cache

  my $othersheet_sheetdata = find_in_sheet_cache(\%sheetdata, $datafilename)

Load additional sheet's information for worksheet references as a sheetdata structure stored in $sheetdata->{sheetcache}->{sheets}->{$datafilename} if necessary. Return that structure as \%othersheet_sheetdata

HISTORY

Modified version of SocialCalc::Sheet from SocialCalc 1.1.0

COPYRIGHT

Portions (c) Copyright 2005, 2006, 2007 Software Garden, Inc. All Rights Reserved.

Portions (c) Copyright 2007 Socialtext, Inc. All Rights Reserved.

Portions (c) Copyright 2007, 2008 Tony Bowden

LICENSE

The contents of this file are subject to the Artistic License 2.0; you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.perlfoundation.org/artistic_license_2_0