NAME

WWW::SitemapIndex::XML - XML Sitemap index protocol

VERSION

version 2.02

SYNOPSIS

    use WWW::SitemapIndex::XML;

    my $index = WWW::SitemapIndex::XML->new();

    # add new sitemaps
    $index->add( 'http://mywebsite.com/sitemap1.xml.gz' );

    # or
    $index->add(
        loc => 'http://mywebsite.com/sitemap1.xml.gz',
        lastmod => '2010-11-26',
    );

    # or
    $index->add(
        WWW::SitemapIndex::XML::Sitemap->new(
            loc => 'http://mywebsite.com/sitemap1.xml.gz',
            lastmod => '2010-11-26',
        )
    );

    # read sitemaps from an existing sitemap_index.xml file
    my @sitemaps = $index->read( location => 'sitemap_index.xml' );

    # load sitemaps from an existing sitemap_index.xml file
    $index->load( location => 'sitemap_index.xml' );

    # get XML::LibXML object
    my $xml = $index->as_xml;

    print $xml->toString(1);

    # write to file
    $index->write( 'sitemap_index.xml', my $pretty_print = 1 );

    # write compressed
    $index->write( 'sitemap_index.xml.gz' );

DESCRIPTION

Read and write sitemap index XML files as defined at http://www.sitemaps.org/.

METHODS

add($sitemap|%attrs)

    $index->add(
        WWW::SitemapIndex::XML::Sitemap->new(
            loc => 'http://mywebsite.com/sitemap1.xml.gz',
            lastmod => '2010-11-26',
        )
    );

Adds the $sitemap object, which represents a single sitemap in the sitemap index.

Accepts blessed objects implementing WWW::SitemapIndex::XML::Sitemap::Interface.

Otherwise the arguments %attrs are passed as-is to create a new WWW::SitemapIndex::XML::Sitemap object.

    $index->add(
        loc => 'http://mywebsite.com/sitemap1.xml.gz',
        lastmod => '2010-11-26',
    );

    # single URL argument
    $index->add( 'http://mywebsite.com/sitemap1.xml.gz' );

    # is the same as
    $index->add( loc => 'http://mywebsite.com/sitemap1.xml.gz' );

Performs basic validation of the sitemaps added:

  • maximum of 50 000 sitemaps in a single sitemap index

  • URL no longer than 2048 characters

  • all URLs should use the same protocol and reside on the same host
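
The three checks above can be sketched in plain Perl. This is only an illustration of the rules, not the module's actual implementation: the helper name check_sitemap_locs, the two constants, and the simple "scheme://host" pattern are all assumptions made for this example.

```perl
use strict;
use warnings;

# Limits taken from the list above (constant names are hypothetical).
my $MAX_SITEMAPS = 50_000;
my $MAX_URL_LEN  = 2048;

# Hypothetical helper: returns a list of problems found in @locs,
# or an empty list when every check passes.
sub check_sitemap_locs {
    my @locs = @_;
    my @errors;

    push @errors, "more than $MAX_SITEMAPS sitemaps"
        if @locs > $MAX_SITEMAPS;

    my $first_origin;
    for my $loc (@locs) {
        push @errors, "URL longer than $MAX_URL_LEN characters"
            if length($loc) > $MAX_URL_LEN;

        # Extract "scheme://host" with a simple pattern;
        # a real URL parser would be stricter.
        my ($origin) = $loc =~ m{^([a-z][a-z0-9+.-]*://[^/?#]+)}i;
        if ( !defined $origin ) {
            push @errors, "not an absolute URL: $loc";
            next;
        }

        # Compare every URL against the first one seen.
        $origin = lc $origin;
        $first_origin //= $origin;
        push @errors, "protocol or host differs from $first_origin: $loc"
            if $origin ne $first_origin;
    }

    return @errors;
}

my @errors = check_sitemap_locs(
    'http://mywebsite.com/sitemap1.xml.gz',
    'https://mywebsite.com/sitemap2.xml.gz',    # different protocol
);
print "$_\n" for @errors;
```

Collecting the problems into a list rather than dying on the first one is a choice made for this sketch; how the module itself reports a failed check is not stated here.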

sitemaps

    my @sitemaps = $index->sitemaps;

Returns a list of all Sitemap objects added to the sitemap index.

load(%sitemap_index_location)

    $index->load( location => $sitemap_index_file );

It is a shortcut for:

    $index->add($_) for $index->read( location => $sitemap_index_file );

Please see "read" for details.

read(%sitemap_index_location)

    # file or url to sitemap index
    my @sitemaps = $index->read( location => $file_or_url );

    # file handle
    my @sitemaps = $index->read( IO => $fh );

    # xml string
    my @sitemaps = $index->read( string => $xml );

Reads the sitemap index from a file, URL, open file handle, or string and returns a list of WWW::SitemapIndex::XML::Sitemap objects representing the <sitemap> elements.
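
As a rough illustration of the string form, the sketch below parses a small, made-up sitemap index and prints each location. It assumes the returned objects expose a loc accessor matching the loc constructor argument shown earlier; the URLs are placeholders.

```perl
use strict;
use warnings;
use WWW::SitemapIndex::XML;

# Made-up sitemap index in the sitemaps.org format.
my $xml = <<'XML';
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap>
    <loc>http://mywebsite.com/sitemap1.xml.gz</loc>
    <lastmod>2010-11-26</lastmod>
  </sitemap>
  <sitemap>
    <loc>http://mywebsite.com/sitemap2.xml.gz</loc>
  </sitemap>
</sitemapindex>
XML

my $index    = WWW::SitemapIndex::XML->new();
my @sitemaps = $index->read( string => $xml );

# Assumption: each object has a loc accessor mirroring the loc attribute.
print $_->loc, "\n" for @sitemaps;
```

The objects returned by read are not added to $index; pass them to add, or use load instead, to do that.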

write($file, $format = 0)

    # write to file
    $index->write( 'sitemap_index.xml', my $pretty_print = 1);

    # or write to an open file handle (requires: use IO::File;)
    my $fh = IO::File->new();
    $fh->open('sitemap_index.xml', 'w');
    $index->write( $fh, my $pretty_print = 1);
    $fh->close;

    # write compressed
    $index->write( 'sitemap_index.xml.gz' );

Writes the XML sitemap index to $file, which is either a file name or an IO::Handle object.

If the file name ends in .gz, the output file is compressed by setting compression on the XML object. Note that this requires libxml2 to be compiled with zlib support.

The optional $format is passed to the toFH method (when $file is a file handle) or the toFile method (when $file is a file name), as described in XML::LibXML.

as_xml

    my $xml = $index->as_xml;

    # pretty print
    print $xml->toString(1);

    # write compressed
    $xml->setCompression(8);
    $xml->toFile( "sitemap_index.xml.gz" );

Returns an XML::LibXML::Document object representing the sitemap index in XML format.

The <sitemap> elements are built by calling as_xml on all Sitemap objects added to the sitemap index.

SEE ALSO

http://www.sitemaps.org/

AUTHOR

Alex J. G. Burzyński <ajgb@cpan.org>

COPYRIGHT AND LICENSE

This software is copyright (c) 2014 by Alex J. G. Burzyński <ajgb@cpan.org>.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.