NAME

Blog::BlogML::Reader - read data from a BlogML formatted document

SYNOPSIS

  use Blog::BlogML::Reader;
  
  my $reader = Blog::BlogML::Reader->new('some/file/blogml.xml');
  my @posts = @{$reader->posts()};

DEPENDENCIES

  • XML::Parser::Expat

    This module uses XML::Parser::Expat to parse the XML in the BlogML source file.

  • HTTP::Date

    This module uses HTTP::Date to transform date strings into sortable timestamps.

EXPORT

None.

INTERFACE

filters

When creating a new reader, the default behaviour is to parse and return every post in the entire BlogML file. This can be inefficient if, for example, you have ten thousand posts and only want the first one. For this reason it is recommended that you give the parser some limits. This is done by adding "filters" to the constructor call. Note that once a reader is constructed its filters cannot be modified; you must create a new reader if you wish to apply new filters.

  • to=>count

    Limits the parser to only the first count posts (starting from the top of the file and working down) in the BlogML file; that is, the parser stops working after count posts. Note that posts with an "approved" attribute of false do not count toward this limit: unapproved posts are always invisible to the parser.

      $reader = Blog::BlogML::Reader->new('blogml.xml', to=>3);

  • from=>count

    The parser starts working only at post number count in the BlogML file. This can optionally be used in conjunction with the to filter to limit the parser to a range of posts.

      $reader = Blog::BlogML::Reader->new('blogml.xml', from=>11, to=>20);

  • before=>date

    Limits the parser to posts with a creation-date before (older than) the given date. The date format can either be a string that complies with the HTTP date protocol or a number representing the Unix time.

      $reader = Blog::BlogML::Reader->new('blogml.xml', before=>"2006-05-01T00:00:00");

  • after=>date

    Limits the parser to posts with a creation-date on or after (the same as or younger than) the given date. Can optionally be used in conjunction with the before filter to limit the parser to a range of dates. The date format can either be a string that complies with the HTTP date protocol or a number representing the Unix time.

      $reader = Blog::BlogML::Reader->new('blogml.xml', after=>1154979460);

  • id=>n

    Limits the parser to the single post with the given id. If you know the exact post you want, there is no need to force the parser to work on the entire file.

      $reader = Blog::BlogML::Reader->new('blogml.xml', id=>123);

  • cat=>n

    Limits the parser to only the posts that belong to the category with the given id.

      $reader = Blog::BlogML::Reader->new('blogml.xml', cat=>'123');
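
The counting rules for the to and from filters can be illustrated without the module itself. The following is a minimal sketch, not the module's actual implementation: select_range() and the sample data are hypothetical, and it assumes that from and to are 1-based, inclusive positions counted over approved posts only.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the posts in a BlogML file, in file order.
my @file_posts = (
    { id => 1, approved => 1 },
    { id => 2, approved => 0 },   # unapproved: invisible to the parser
    { id => 3, approved => 1 },
    { id => 4, approved => 1 },
    { id => 5, approved => 1 },
);

# select_range() is an illustrative helper, not part of Blog::BlogML::Reader.
# Assumption: "from" and "to" are 1-based, inclusive positions that are
# counted over approved posts only.
sub select_range {
    my ($posts, %filter) = @_;
    my $from = $filter{from} // 1;
    my $to   = $filter{to};
    my $seen = 0;
    my @kept;
    for my $post (@$posts) {
        next unless $post->{approved};       # never counted, never returned
        $seen++;
        next if $seen < $from;               # still before the range
        last if defined $to && $seen > $to;  # past the range: stop early
        push @kept, $post;
    }
    return \@kept;
}

my $first_two = select_range(\@file_posts, to => 2);
my $range     = select_range(\@file_posts, from => 2, to => 3);

print join(",", map { $_->{id} } @$first_two), "\n";   # prints 1,3
print join(",", map { $_->{id} } @$range), "\n";       # prints 3,4
```

Because the unapproved post with id 2 is skipped before counting, to=>2 yields the posts with ids 1 and 3 rather than 1 and 2.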

methods

  • meta()

    Returns a HASHREF of meta information about the blog.

      my $meta = $reader->meta();
      print $meta->{title};
      print $meta->{author}, $meta->{email};

  • posts()

    Returns an ARRAYREF of blog posts (in the same order as they are in the file). The number of posts returned will be limited by any filters applied when the reader was constructed.

      my $posts = $reader->posts();
      print $posts->[0]{title};

  • cats()

    Returns a HASHREF of blog categories (keys are the category id).

      my $cats = $reader->cats();
      print $cats->{'123'}{title};
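
The structures returned by posts() and cats() are plain Perl references, so they can be sorted and indexed with ordinary Perl. The sketch below uses hand-written stand-ins instead of a real reader; the field names (title, time, cats) are the ones used elsewhere in this documentation, while the values are made up for illustration.

```perl
use strict;
use warnings;

# Hand-written stand-ins shaped like the return values of posts() and cats();
# the values are invented for this example.
my $posts = [
    { title => 'First',  time => 1143936000, cats => ['10'] },
    { title => 'Second', time => 1146528000, cats => ['10', '20'] },
    { title => 'Third',  time => 1144800000, cats => ['20'] },
];
my $cats = {
    '10' => { title => 'News' },
    '20' => { title => 'Perl' },
};

# Post times are Unix timestamps, so a numeric sort orders them by date.
my @newest_first = sort { $b->{time} <=> $a->{time} } @$posts;

# Index post titles by category title, keeping the newest-first order.
my %titles_by_cat;
for my $post (@newest_first) {
    for my $cat_id (@{ $post->{cats} }) {
        push @{ $titles_by_cat{ $cats->{$cat_id}{title} } }, $post->{title};
    }
}

for my $cat_title (sort keys %titles_by_cat) {
    print "$cat_title: ", join(", ", @{ $titles_by_cat{$cat_title} }), "\n";
}
```

With a real reader, $posts and $cats would instead come from $reader->posts() and $reader->cats().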

EXAMPLE

        use Blog::BlogML::Reader;
        use Date::Format;

        # parse all posts in the month of April
        my $reader = Blog::BlogML::Reader->new('t/example.xml',
          after  => "2006-04-01T00:00:00",
          before => "2006-05-01T00:00:00",
        );

        my $posts = $reader->posts();
        my $meta  = $reader->meta();
        my $cats  = $reader->cats();

        print "<h1>", $meta->{title}, "</h1>";
        print $meta->{author};

        foreach my $post (@$posts) {
          print "<h2>", $post->{title}, "</h2>";

          # post dates are returned in Unix time, so format as desired
          print "posted:", time2str("%o of %B %Y", $post->{time});

          print " categories:",
          join(", ",  map{$cats->{$_}{title}} @{$post->{cats}});

          print " link:", $post->{url};

          print $post->{content}, "<hr />";
        }
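
Since the before and after filters also accept Unix time, the same April range could be computed numerically. This is a minimal sketch using the core Time::Local module; it assumes that the date boundaries are meant in local time, which this documentation does not state, and the %filters hash is only an illustration of values that could be passed to the constructor.

```perl
use strict;
use warnings;
use Time::Local qw(timelocal);
use POSIX qw(strftime);

# timelocal(sec, min, hour, mday, month, year); month is 0-based (3 = April).
# Assumption: the range boundaries are wanted in the local time zone.
my $after  = timelocal(0, 0, 0, 1, 3, 2006);   # 2006-04-01 00:00:00 local
my $before = timelocal(0, 0, 0, 1, 4, 2006);   # 2006-05-01 00:00:00 local

# Values in this shape could be passed as the after and before filters.
my %filters = (after => $after, before => $before);

print strftime("%Y-%m-%d", localtime $filters{after}), "\n";
print strftime("%Y-%m-%d", localtime $filters{before}), "\n";
```

Use timegm() from the same module instead of timelocal() if the boundaries should be interpreted as UTC.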

SEE ALSO

The website http://BlogML.com has the latest documentation on the BlogML standard. Note that the reference document "t/example.xml" included with this distribution illustrates the expected format of BlogML documents used by this module.

AUTHOR

Michael Mathews, <mmathews@cpan.org>

COPYRIGHT AND LICENSE

Copyright (C) 2006 by Michael Mathews

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.6 or, at your option, any later version of Perl 5 you may have available.