NAME

Data::FlexSerializer - Pluggable (de-)serialization to/from compressed/uncompressed JSON/Storable/Sereal/Whatever

DESCRIPTION

This module was written to convert away from Storable throughout the Booking.com codebase to other serialization formats such as Sereal and JSON.

Since we needed to do these migrations in production, we had to do them with zero downtime and deal with data stored on disk, in memcached, or in a database, all of which we could only gradually migrate to the new format as we read/wrote it.

So we needed a module that deals with dynamically detecting what kind of existing serialized data you have, and can dynamically convert it to something else as it's written again.

That's what this module does. Depending on the options you give it, it can read/write any combination of compressed/uncompressed/maybe compressed Storable/JSON/Sereal data. You can also easily extend it to add support for your own input/output format in addition to the defaults.

SYNOPSIS

When we originally wrote this we meant to convert everything over from Storable to JSON. Since then, mostly due to various issues with JSON not being able to accurately represent Perl data structures (e.g. preserve encoding flags), we've started to migrate to Sereal (a new serialization format we wrote, implemented by Sereal::Encoder and Sereal::Decoder) instead.

However, the API of this module is now slightly awkward, because it needs to deal with the possible detection and emission of multiple formats, and it still uses the JSON format by default, which is no longer the recommended way to use it.

  # For all of the below
  use Data::FlexSerializer;

Reading and writing compressed JSON

  # We *only* read/write compressed JSON by default:
  my $strict_serializer = Data::FlexSerializer->new;
  my @blobs = $strict_serializer->serialize(@perl_datastructures);
  my @perl_datastructures = $strict_serializer->deserialize(@blobs);

Reading maybe compressed JSON and writing compressed JSON

  # We can optionally detect whether JSON input is compressed, and
  # will then accept mixed compressed/uncompressed data. This works
  # for all the input formats.
  my $lax_serializer = Data::FlexSerializer->new(
    detect_compression => 1,
  );

Reading definitely compressed JSON and writing compressed JSON

  # If we know that all our data is compressed we can skip the
  # detection step. This works for all the input formats.
  my $lax_compress = Data::FlexSerializer->new(
    assume_compression => 1,
    compress_output => 1, # This is the default
  );

Migrate from maybe compressed Storable to compressed JSON

  my $storable_to_json = Data::FlexSerializer->new(
    detect_compression => 1, # check whether the input is compressed
    detect_storable => 1, # accept Storable images as input
    compress_output => 1, # This is the default
  );

Migrate from maybe compressed JSON to Sereal

  my $json_to_sereal = Data::FlexSerializer->new(
    detect_sereal => 1,
    output_format => 'sereal',
  );

Migrate from Sereal to JSON

  my $sereal_backcompat = Data::FlexSerializer->new(
    detect_sereal => 1, # accept Sereal images as input
  );

Migrate from JSON OR Storable to Sereal

  my $flex_to_sereal = Data::FlexSerializer->new(
    detect_compression => 1,
    detect_json => 1, # this is the default
    detect_sereal => 1,
    detect_storable => 1,
    output_format => 'sereal',
  );

Migrate from JSON OR Storable to Sereal with custom Sereal objects

  my $flex_to_custom_sereal = Data::FlexSerializer->new(
    detect_compression => 1,
    detect_json => 1, # this is the default
    detect_sereal => 1,
    detect_storable => 1,
    output_format => 'sereal',
    sereal_decoder => Sereal::Decoder->new(...),
    sereal_encoder => Sereal::Encoder->new(...),
  );

Add your own format using Data::Dumper

See the documentation for add_format below.

ATTRIBUTES

This is a Moose-powered module so all of these are keys you can pass to "new". They're all read-only after the class is constructed, so you can look but you can't touch.

assume_compression

assume_compression is a boolean flag that makes the deserialization assume that the data will be compressed. It won't have to guess, making the deserialization faster. Defaults to true.

You almost definitely want to turn "compress_output" off too if you turn this off, unless you're doing a one-off migration or something.

detect_compression

detect_compression is a boolean flag that also affects only the deserialization step.

If set, it'll auto-detect whether the input is compressed. Mutually exclusive with assume_compression (we'll die if you try to set both).

If you set detect_compression without also setting assume_compression, we'll disable assume_compression for you, since it doesn't make any sense to assume when you're going to detect.

Defaults to false.
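
As an illustration of how the two deserialization flags interact, here is a minimal sketch that reads a mix of compressed and uncompressed JSON blobs through one lax reader, and then shows that asking for both flags at once is rejected. The variable names and sample data are made up for the example; only the constructor options come from this documentation.

```perl
use strict;
use warnings;
use Data::FlexSerializer;

# Two writers that differ only in whether they compress their output.
my $compressing = Data::FlexSerializer->new;    # compress_output defaults to true
my $plain       = Data::FlexSerializer->new(compress_output => 0);

my $compressed_blob   = $compressing->serialize({ id => 1 });
my $uncompressed_blob = $plain->serialize({ id => 2 });

# A reader that inspects each blob instead of assuming compression.
my $lax_reader = Data::FlexSerializer->new(detect_compression => 1);
my @records = $lax_reader->deserialize($compressed_blob, $uncompressed_blob);

# Explicitly asking for both behaviours is documented to die.
my $both_ok = eval {
    Data::FlexSerializer->new(
        assume_compression => 1,
        detect_compression => 1,
    );
    1;
};
print "constructor refused both flags: $@" unless $both_ok;
```

The lax reader pays for a detection step on every blob, so once all stored data is known to be compressed it is reasonable to switch back to assume_compression.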

compress_output

compress_output is a flag indicating whether compressed or uncompressed dumps are to be generated during the serialization. Defaults to true.

You probably want to turn "assume_compression" off too if you turn this off, unless you're doing a one-off migration or something.

compression_level

compression_level is an integer indicating the compression level (0-9).
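
To get a feel for what compress_output and compression_level do to blob size, the following sketch serializes the same highly repetitive structure with compression off and with compression at level 9, then prints both lengths. The sample data is invented for the example, and the exact byte counts will depend on your input.

```perl
use strict;
use warnings;
use Data::FlexSerializer;

# Deliberately repetitive data, which compresses well.
my $data = [ ('the same string again and again') x 1000 ];

my $plain = Data::FlexSerializer->new(compress_output => 0);
my $small = Data::FlexSerializer->new(
    compress_output   => 1,    # This is the default
    compression_level => 9,    # strongest compression
);

my $plain_blob = $plain->serialize($data);
my $small_blob = $small->serialize($data);

printf "uncompressed: %d bytes, level 9: %d bytes\n",
    length($plain_blob), length($small_blob);

# The compressed blob still decodes to the original structure.
my ($copy) = $small->deserialize($small_blob);
```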

output_format

output_format can be set to one of the strings json (the default), storable or sereal, or to the name of your own format that you've added via "add_format".

detect_FORMAT_NAME

Whether we should detect this incoming format. By default only detect_json is true. You can also set detect_storable, detect_sereal or detect_YOUR_FORMAT for formats added via "add_format".
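
The detect_* flags and output_format together are what make a gradual migration possible. As a rough sketch, the code below writes a blob with a default (compressed JSON) serializer, then reads it with a migrating serializer that accepts JSON or Sereal and always writes Sereal. The variable names and the sample record are hypothetical.

```perl
use strict;
use warnings;
use Data::FlexSerializer;

# Stand-in for data written before the migration: compressed JSON.
my $legacy   = Data::FlexSerializer->new;
my $old_blob = $legacy->serialize({ user => 'alice', visits => 3 });

# Accepts JSON (detect_json is on by default) or Sereal, writes Sereal.
my $migrator = Data::FlexSerializer->new(
    detect_sereal => 1,
    output_format => 'sereal',
);

my ($record)   = $migrator->deserialize($old_blob);    # old format in
my $new_blob   = $migrator->serialize($record);        # new format out
my ($migrated) = $migrator->deserialize($new_blob);    # new format reads back too
```

Because the same object reads both formats, stored data can be rewritten one record at a time as it is touched, without a flag day.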

sereal_encoder

sereal_decoder

You can supply sereal_encoder or sereal_decoder arguments with your own Sereal encoder/decoder objects. Handy if you want to pass custom options to the encoder or decoder.

By default we create objects for you at BUILD time. So you don't need to supply this for optimization purposes either.
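
A short sketch of passing your own Sereal objects follows. The particular options chosen here (sort_keys on the encoder, refuse_objects on the decoder) are only examples of options those classes accept, not something this module requires.

```perl
use strict;
use warnings;
use Sereal::Encoder;
use Sereal::Decoder;
use Data::FlexSerializer;

my $serializer = Data::FlexSerializer->new(
    detect_sereal  => 1,
    output_format  => 'sereal',
    # Example options only: pick whatever your application needs.
    sereal_encoder => Sereal::Encoder->new({ sort_keys      => 1 }),
    sereal_decoder => Sereal::Decoder->new({ refuse_objects => 1 }),
);

my $blob   = $serializer->serialize({ b => 2, a => 1 });
my ($data) = $serializer->deserialize($blob);
```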

METHODS

serialize

Given a list of things to serialize, this does the job on each of them and returns a list of serialized blobs.

In scalar context, this will return a single serialized blob instead of a list. If called in scalar context, but passed a list of things to serialize, this will croak because the call makes no sense.
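
The context rules above can be sketched as follows. The data is invented for the example; the failing call is wrapped in eval so the script keeps running.

```perl
use strict;
use warnings;
use Data::FlexSerializer;

my $serializer = Data::FlexSerializer->new;

# List context: one blob per input structure.
my @blobs = $serializer->serialize({ id => 1 }, [ 2, 3 ]);

# Scalar context with a single argument: a single blob.
my $blob = $serializer->serialize({ id => 1 });

# Scalar context with several arguments croaks.
my $ok = eval {
    my $single = $serializer->serialize({ id => 1 }, [ 2, 3 ]);
    1;
};
print "scalar-context call failed as documented: $@" unless $ok;
```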

deserialize

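The opposite of serialize: given a list of serialized blobs, this deserializes each of them and returns a list of Perl data structures.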

deserialize_from_file

Given a (single!) file name, reads the file contents and deserializes them. Returns the resulting Perl data structure.

Since this works on one file at a time, this doesn't return a list of data structures like deserialize() does.

serialize_to_file

  $serializer->serialize_to_file(
    $data_structure => '/tmp/foo/bar'
  );

Given a (single!) Perl data structure, and a (single!) file name, serializes the data structure and writes the result to the given file. Returns true on success, dies on failure.
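
Putting the two file methods together, a minimal round trip might look like this. The temporary directory, file name and sample data are assumptions made for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use Data::FlexSerializer;

my $serializer = Data::FlexSerializer->new;

# Write into a scratch directory that is removed when the program exits.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/record.bin";

my $data = { name => 'example', tags => [ 'a', 'b' ] };

# Dies on failure, so no return value check is needed here.
$serializer->serialize_to_file($data => $file);

# One file in, one data structure out.
my $copy = $serializer->deserialize_from_file($file);
```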

CLASS METHODS

add_format

add_format is a class method that adds support for custom formats. Each format is described by three callbacks: serialize, deserialize and detect.

  use Data::Dumper;

  Data::FlexSerializer->add_format(
      data_dumper => {
          serialize   => sub { shift; goto \&Data::Dumper::Dumper },
          deserialize => sub { shift; my $VAR1; eval "$_[0]" },
          detect      => sub { $_[1] =~ /\$[\w]+\s*=/ },
      }
  );

  my $flex_to_dd = Data::FlexSerializer->new(
    detect_data_dumper => 1,
    output_format => 'data_dumper',
  );
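
For completeness, here is a sketch of a full round trip through the data_dumper format defined above. The block repeats the add_format call so that it runs on its own, and the sample data is invented. Keep in mind that this deserializer uses string eval, which executes whatever Perl code the blob contains, so it is only suitable for data you trust.

```perl
use strict;
use warnings;
use Data::Dumper;
use Data::FlexSerializer;

# Same format definition as in the example above.
Data::FlexSerializer->add_format(
    data_dumper => {
        serialize   => sub { shift; goto \&Data::Dumper::Dumper },
        deserialize => sub { shift; my $VAR1; eval "$_[0]" },
        detect      => sub { $_[1] =~ /\$[\w]+\s*=/ },
    }
);

my $flex_to_dd = Data::FlexSerializer->new(
    detect_data_dumper => 1,
    output_format      => 'data_dumper',
);

# compress_output is still on by default, so the blob is compressed
# Data::Dumper text rather than something human-readable.
my $blob   = $flex_to_dd->serialize({ colour => 'green' });
my ($data) = $flex_to_dd->deserialize($blob);
```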

AUTHOR

Steffen Mueller <smueller@cpan.org>

Ævar Arnfjörð Bjarmason <avar@cpan.org>

Burak Gürsoy <burak@cpan.org>

Elizabeth Matthijsen <liz@dijkmat.nl>

Caio Romão Costa Nascimento <cpan@caioromao.com>

Jonas Galhordas Duarte Alves <jgda@cpan.org>

ACKNOWLEDGMENT

This module was originally developed at and for Booking.com. With approval from Booking.com, this module was generalized and put on CPAN, for which the authors would like to express their gratitude.

COPYRIGHT AND LICENSE

 (C) 2011, 2012, 2013 Steffen Mueller and others. All rights reserved.

 This code is available under the same license as Perl version
 5.8.1 or higher.

 This program is distributed in the hope that it will be useful,
 but WITHOUT ANY WARRANTY; without even the implied warranty of
 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.