The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.

NAME

Tie::StoredOrderHash - ordered associative arrays for Perl

SYNOPSIS

# Standard hash operations use Tie::StoredOrderHash;

    tie my %hash, 'Tie::StoredOrderHash', (a => 1, b => 2, c => 3);

    $hash{d} = 4;
    print join " ", %hash;    # => a 1 b 2 c 3 d 4
    $hash{a} = 5;
    print join " ", %hash;    # => b 2 c 3 d 4 a 5

 

# Optional utility functions use Tie::StoredOrderHash qw/ ordered is_ordered /;

    my $hash = ordered [ one => 1, two => 2, three => 3 ];
    while (my($k, $v) = each %$hash) {
        print "$k: $v  ";
    } # one: 1  two: 2  three: 3

    print "stored-order hashref is ordered"
        if is_ordered($hash);
    print "regular hashref is NOT ordered"
        unless is_ordered({});

DESCRIPTION

Tie::StoredOrderHash is a(nother) implementation of a Perl hash that preserves the order in which elements are stored.

While uncooked Perl hashes make no guarantees about the order in which elements are iterated, T::SOH objects iterate over elements in a consistent order, starting with the least-recently-updated element, and finishing with the element that was most-recently added or updated.

Standard usage

The module implements TIEHASH, as one would expect. Any extra parameters to tie are treated as key-value pairs for initialisation:

    tie %hash, 'Tie::StoredOrderHash', ('key1' => 'value1', ...);

This hash is then available for lookup, storage and deletion using the standard notation:

    print $hash{"key1"};    # value1
    $hash{"foo"} = "bar";
    delete $hash{"key1"};
    exists $hash{"key1"};   # falsey

The difference from a regular hash would only be observed when iterating, either explicitly,

    tie %hash, 'Tie::StoredOrderHash', qw( 1 one 2 two 3 three );

    while (my ($key, $value) = each %hash) {
        print "[$key]: $value\n";
        # [1]: one
        # [2]: two
        # [3]: three
    }

or implicitly,

    print %hash;                # 1 one 2 two 3 three

    my @values = values %hash;  # ('one', 'two', 'three')
    my @keys = keys %hash;      # (1, 2, 3)
    my @arrayified = %hash;     # (1, 'one', 2, 'two', 3, 'three')

    $hash{2} = "re-ordered";
    print %hash;                # 1 one 3 three 2 re-ordered

Note that T::SOH is not strictly LRU (least-recently-used), as elements only "float" to the end when inserted or updated, not when accessed. Such behaviour is not hard to emulate, however, if the element is "changed to what it was" whenever it's read:

    tie my %lru, 'Tie::StoredOrderHash';

    # ... some things happen, the hash is populated somehow ...
    my $elt = ($lru{$key} = $lru{$key});    # update this value in-place

    # ... more things happen, things are said that can't be unsaid ...

    # Delete the least-recently-used element in the hash.
    my @keys = keys(%lru);
    delete $lru{$keys[0]};

Note that T::SOH attempts to "do the right thing" as much as possible during iteration:

  • if an entry after the current position is updated, then obviously no harm is done (it will be iterated eventually),

  • if an entry before the current position is updated, then the entry moves to the end of the order (and will be re-iterated),

  • if the current entry is updated, then it is moved to the end of the order and will be re-iterated.

If updates must be made to the elements during a once-through iteration, then maybe what you actually want to do is grab the keys before starting:

    foreach my $key (keys(%hash)) {
        my $value = $hash{$key};
        # ... destructive fun goes here ...
    }

... but, hey, I don't wanna harsh your buzz.

Other usage

Although the tie interface is possessing of a certain austere beauty, it can be somewhat cumbersome, especially when constructing nested data structures. A single hashref tied to Tie::StoredOrderHash may be constructed in the familiar OO fashion, using new, with an optional list of key-value pairs:

    my $hashref = Tie::StoredOrderHash->new(
        'key1' => 'value1',
        'key2' => 'value2',
        ...
    );

As a special case, if a single list-ref is passed to new, then its contents are used as the key-value pairs for initialisation:

    # Equivalent to previous code snippet
    my $hashref = Tie::StoredOrderHash->new(
        [
            'key1' => 'value1',
            'key2' => 'value2',
            ...
        ]
    );

As a beautiful and terse alternative, you may import the ordered function from Tie::StoredOrderHash, and provide it with a list-ref of initial values:

    use Tie::StoredOrderHash qw/ ordered /;

    # This is spartan (and equivalent to the previous code snippets)
    my $hashref = ordered [
        'key1' => 'value1',
        'key2' => 'value2',
        ...
    ];

    # Also does what you'd expect with nested structures
    my $nested_hashref = ordered [
        one => 1,
        two => ordered [
            a => "a",   # a is a
            b => "b",
            c => "c",
            d => ordered [
                'us' => 'all your base'
            ]
        ],
        three => 3
    ];

Also exported (but not by default) is the is_ordered function, which takes a single hashref as parameter, and returns truthfulness if the hashref is tied to Tie::StoredOrderHash (or one of its subclasses). This is useful when you're working with a bunch of maniacs who can't be consistent about their configuration files, err, for instance.

IMPLEMENTATION AND MOTIVATION

There are several other CPAN modules that iterate over insert-order, which is very close to store-order, but these typically use an array-list to keep track of ordering. This makes deletion and reordering-on-update expensive, but they may be more suitable for you than T::SOH if you have a hash that doesn't change often, or at least mostly grows at the ends. Tie::StoredOrderHash maintains a linked-list of its entries, at a little extra cost, which makes the basic insert/update/delete operations cheap. List-refs rather than hash-refs are used internally wherever possible, and this does avoid a significant performance penalty from previous versions.

Tie::StoredOrderHash might not be the Swiss Army Knife you're looking for, but it's already been cheap and useful in a variety of situations.

BUGS

Could be a few ... it's been in major production use for about a year, and it was only a couple of revisions ago that a problem with duplicate keys in the initialisation list was spotted.

More likely is that it doesn't quite do what you want. Should be simple enough to subclass it, e.g.

    package Tie::LRUHash;
    use base 'Tie::StoredOrderHash';

    # Hash with full least-recently-used behaviour
    sub FETCH {
        my ($self, $key) = @_;
        my $value = $self->SUPER::FETCH($key);
        $self->SUPER::STORE($key, $value);
    }

    1;

It would potentially also be useful to have the stack and queue style operations of Tie::IxHash, but I'm not convinced those belong in this package (maybe, once again, in a fairly trivial subclass in a future revision).

SEE ALSO

  • Tie::IxHash - Gurusamy Sarathy's original insert-order hash. It's even mentioned in one of the O'Reilly books, I think that makes it "venerable".

  • Tie::InsertOrderHash - very similar implementation to IxHash without the stack/queue utilities

  • Tie::Hash::Indexed - XS module, similar to IxHash

  • Tie::LLHash - pure Perl implementation of the insert-order hash built on top of a linked list

    Any of these is particularly suitable for hashes which are built once and not modified thereafter. Tie::StoredOrderHash is geared towards LRU-like applications.

AUTHOR

tom murtagh <cpan@notto.be>

Copyright (c) 2009 tom murtagh. All rights reserved. This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.