NAME

PICA::XMLParser - Parse PICA+ XML

VERSION

version 0.584

SYNOPSIS

  use PICA::XMLParser;

  my $rcount = 1;
  my $parser = PICA::XMLParser->new(
      Field => \&field_handler,
      Record => \&record_handler
  );
  $parser->parsefile($filename);

  # equivalent (requires "use PICA::Parser;"):
  PICA::Parser->parsefile($filename,
      Field => \&field_handler,
      Record => \&record_handler,
      Format => 'xml'
  );

  sub field_handler {
      my $field = shift;
      print $field->string;
      # no need to save the field so do not return it
  }

  sub record_handler {
      print "$rcount\n"; $rcount++;
  }

DESCRIPTION

This module contains a parser for PICA+ XML. PICA+ XML is not yet fully standardized, so this parser may change slightly in the future.

This module can read multiple collections per file or data stream, but only the records of the current collection are saved and returned by the records method. Use the Collection handler to parse files with multiple collections.
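As a minimal sketch of the Collection handler, the following counts the records in each collection of a file. The helper name count_per_collection is hypothetical. The text does not say which arguments the Collection handler receives, so the sketch ignores them and keeps its own counter. It also assumes that a Record handler which returns nothing causes the record not to be stored.

```perl
use strict;
use warnings;
use PICA::XMLParser;

# Hypothetical helper: returns one record count per collection in the file.
sub count_per_collection {
    my ($filename) = @_;
    my @counts;          # one entry per finished collection
    my $in_current = 0;  # records seen since the last collection end tag

    my $parser = PICA::XMLParser->new(
        # Called once per record. Returning nothing is assumed to mean
        # "do not store this record".
        Record => sub { $in_current++; return; },
        # Called on each collection end tag. Handler arguments are ignored
        # because the documentation does not specify them.
        Collection => sub {
            push @counts, $in_current;
            $in_current = 0;
            return;
        },
    );
    $parser->parsefile($filename);
    return @counts;
}

# Run only when a file name is given on the command line.
if (@ARGV) {
    my @counts = count_per_collection($ARGV[0]);
    printf "collection %d: %d record(s)\n", $_ + 1, $counts[$_]
        for 0 .. $#counts;
}
```

Keeping the per-collection state in closures avoids relying on what the records method returns at the moment a collection ends.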

PUBLIC METHODS

new ( [ %params ] )

Creates a new Parser. See PICA::Parser for a description of parameters to define handlers (Field and Record). In addition this parser supports the Collection handler that is called on a collection end tag.

parsedata

Parses data from a string, array reference or function. Data from arrays and functions will be read and buffered before parsing. Do not directly call this method without a PICA::XMLParser object that was created with new().
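The following sketch shows the intended calling order: create the object with new() first, then pass a string to parsedata. The helper name parse_xml_string is hypothetical, and the XML is read from a file named on the command line only so that the example does not have to assume a particular PICA+ XML layout.

```perl
use strict;
use warnings;
use PICA::XMLParser;

# Hypothetical helper: parse PICA+ XML held in a string.
sub parse_xml_string {
    my ($xml, $record_handler) = @_;
    # The parser object must exist before parsedata is called.
    my $parser = PICA::XMLParser->new( Record => $record_handler );
    $parser->parsedata($xml);
    return $parser;
}

# Run only when a file name is given on the command line.
if (@ARGV) {
    # Slurp the whole file into one string.
    open my $fh, '<', $ARGV[0] or die "Cannot open $ARGV[0]: $!";
    my $xml = do { local $/; <$fh> };
    close $fh;

    # The handler returns nothing, so no record is meant to be stored.
    my $parser = parse_xml_string($xml, sub { return; });
    print $parser->counter, " record(s) read\n";
}
```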

parsefile ( $filename | $handle )

Parses data from a file (given by name), a filehandle, or an IO::Handle object.
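A short sketch of passing an already opened filehandle instead of a file name. The helper name parse_from_handle is hypothetical. The Record handler returns the record, which is assumed to be how a record gets stored.

```perl
use strict;
use warnings;
use PICA::XMLParser;

# Hypothetical helper: parse from an open filehandle and return the parser.
sub parse_from_handle {
    my ($fh) = @_;
    my $parser = PICA::XMLParser->new(
        # Returning the record is assumed to store it in the parser.
        Record => sub { return shift },
    );
    $parser->parsefile($fh);
    return $parser;
}

# Run only when a file name is given on the command line.
if (@ARGV) {
    open my $fh, '<', $ARGV[0] or die "Cannot open $ARGV[0]: $!";
    my $parser = parse_from_handle($fh);
    close $fh;
    print $parser->counter, " record(s) read\n";
}
```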

records ( )

Get an array of the records read so far (if they have been stored).

counter ( )

Get the number of read records so far. Please note that the number of records as returned by the records method may be lower because you may have filtered out some records.
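To illustrate the difference between counter and records, this sketch stores only every other record. It assumes that a Record handler stores a record by returning it and skips it by returning undef. The helper name keep_every_other and the filtering rule are purely illustrative.

```perl
use strict;
use warnings;
use PICA::XMLParser;

# Hypothetical helper: returns (number of records read, number stored).
sub keep_every_other {
    my ($filename) = @_;
    my $seen = 0;

    my $parser = PICA::XMLParser->new(
        Record => sub {
            my $record = shift;
            $seen++;
            # Keep odd-numbered records, filter out even-numbered ones.
            return $seen % 2 ? $record : undef;
        },
    );
    $parser->parsefile($filename);

    my @stored = $parser->records;   # only the records that were kept
    return ($parser->counter, scalar @stored);
}

# Run only when a file name is given on the command line.
if (@ARGV) {
    my ($read, $stored) = keep_every_other($ARGV[0]);
    print "read $read record(s), stored $stored\n";
}
```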

finished ( )

Returns whether the parser will stop parsing further records. This is the case if the number of read records is larger than the limit.
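A sketch of checking finished after parsing with a record limit. The parameter name Limit is an assumption taken from the handler parameters described for PICA::Parser and is not confirmed by this page. The helper name parse_with_limit is hypothetical.

```perl
use strict;
use warnings;
use PICA::XMLParser;

# Hypothetical helper: parse with a record limit and return the parser.
sub parse_with_limit {
    my ($filename, $limit) = @_;
    my $parser = PICA::XMLParser->new(
        Limit  => $limit,               # assumed parameter name
        Record => sub { return shift }, # assumed to store each record
    );
    $parser->parsefile($filename);
    return $parser;
}

# Run only when a file name is given on the command line.
if (@ARGV) {
    my $parser = parse_with_limit($ARGV[0], 10);
    if ($parser->finished) {
        print "stopped at the limit after ", $parser->counter, " record(s)\n";
    } else {
        print "input ended after ", $parser->counter, " record(s)\n";
    }
}
```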

PRIVATE HANDLERS

Do not call these methods directly.

start_document

Called at the beginning.

end_document

Called at the end. Does nothing so far.

start_element

Called for each start tag.

end_element

Called for each end tag.

characters

Called for character data.

_getPosition

Get the current position (file name and line number). This method is deprecated.

AUTHOR

Jakob Voß <voss@gbv.de>

COPYRIGHT AND LICENSE

This software is copyright (c) 2012 by Verbundzentrale Goettingen (VZG) and Jakob Voss.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.