Data::Match - Complex data structure pattern matching
use Data::Match qw(:all);
my ($match, $results) = match($structure, $pattern);

use Data::Match;
my $obj = new Data::Match;
my ($match, $results) = $obj->execute($structure, $pattern);
Data::Match provides extensible complex Perl data structure searching and matching.
None are exported by default. :func exports match and matches, :pat exports all the pattern element generators below, :all exports :func and :pat.
A data pattern is a complex data structure that possibly matches another complex data structure. For example:
matches([ 1, 2 ], [ 1, 2 ]); # TRUE
matches([ 1, 2, 3 ], [ 1, ANY, 3 ]); # TRUE
matches([ 1, 2, 3 ], [ 1, ANY, 2 ]); # FALSE: 3 != 2
ANY matches anything, including an undefined value.
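The wildcard idea can be sketched in plain Perl without the module. This is only an illustration of the matching rule, not Data::Match's implementation: `$ANY` and `sketch_matches` are hypothetical names, and only arrays and plain scalars are handled.

```perl
use strict;
use warnings;

# Hypothetical sentinel standing in for ANY; not a Data::Match object.
my $ANY = \'ANY';

# Recursively compare $data against $pat.
sub sketch_matches {
    my ($data, $pat) = @_;

    # The wildcard matches any value, including undef.
    return 1 if ref($pat) && $pat == $ANY;

    if (ref($pat) eq 'ARRAY') {
        # Arrays must have the same length and match element by element.
        return 0 unless ref($data) eq 'ARRAY' && @$data == @$pat;
        for my $i (0 .. $#$pat) {
            return 0 unless sketch_matches($data->[$i], $pat->[$i]);
        }
        return 1;
    }

    # Other reference types (hashes, objects) are left out of this sketch.
    return 0 if ref($data) || ref($pat);

    # Plain scalars: undef only equals undef, otherwise compare as strings.
    return !defined($pat) if !defined($data);
    return defined($pat) && $data eq $pat;
}

my $hit  = sketch_matches([ 1, 2, 3 ], [ 1, $ANY, 3 ]) ? 1 : 0;
my $miss = sketch_matches([ 1, 2, 3 ], [ 1, $ANY, 2 ]) ? 1 : 0;
```

The two calls mirror the second and third examples above: the first succeeds, the second fails on the last element.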
my $results = matches([ 1, 2, 1 ], [ BIND('x'), ANY, BIND('x') ]); # TRUE
BIND($name) matches anything and remembers each match and its position with every BIND($name) in $result->{'BIND'}{$name}. If a value matched by BIND($name) is not the same as the first value bound to BIND($name), it does not match. For example:
my $results = matches([ 1, 2, 3 ], [ BIND('x'), 2, BIND('x') ]); # FALSE: 3 != 1
COLLECT($name) is similar to BIND but does not compare first bound values.
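The difference between the two rules can be sketched as follows. The helpers `sketch_bind` and `sketch_collect` are hypothetical, compare plain scalars only, and store values in a simple hash of arrays rather than in the module's real results structure.

```perl
use strict;
use warnings;

# BIND-like rule: every later value must equal the first value bound under $name.
sub sketch_bind {
    my ($store, $name, $value) = @_;
    my $seen = $store->{$name} ||= [];
    return 0 if @$seen && $seen->[0] ne $value;
    push @$seen, $value;
    return 1;
}

# COLLECT-like rule: remember the value, never compare it with earlier ones.
sub sketch_collect {
    my ($store, $name, $value) = @_;
    push @{ $store->{$name} }, $value;
    return 1;
}

my (%bound, %collected);
my $bind_first   = sketch_bind(\%bound, 'x', 1);          # first binding is always accepted
my $bind_second  = sketch_bind(\%bound, 'x', 3);          # rejected: 3 differs from 1
my $collect_both = sketch_collect(\%collected, 'x', 1)
                && sketch_collect(\%collected, 'x', 3);   # both values are kept
```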
REST matches all remaining elements of an array or hash.
matches([ 1, 2, 3 ], [ 1, REST() ]); # TRUE
matches({ 'a'=>1, 'b'=>1 }, { 'b'=>1, REST() => REST() }); # TRUE
FIND searches at all depths for matching sub-patterns.
matches([ 1, [ 1, 2 ], 3 ], FIND(COLLECT('x', [ 1, REST() ]))); # TRUE
See the test script t/t1.t in the package distribution for more pattern examples.
When a BIND or COLLECT matches a datum, an entry is collected in $result->{BIND} and $result->{COLLECT}, respectively. (This might change in the future)
Each entry for the binding name is a hash containing 'v', 'p' and 'ps' lists.
'v' is a list of the values at each match.
'p' is a list of match paths describing where the corresponding match was found, relative to the root of the search. See match_path_*. 'p' is not collected if $matchobj->{'no_collect_path'} is set.
'ps' is a list of code strings (see match_path_str) describing where each match was found. 'ps' is collected only if $matchobj->{'collect_path_str'} is set.
All patterns can have sub-patterns. Most patterns match the AND-ed results of their sub-patterns and their own behavior, first trying the sub-patterns before attempting to match the intrinsic behavior. However, OR and ANY match if any of their sub-patterns match.
For example:
match([ ['a', 1 ], ['b', 2], ['a', 3] ], EACH(COLLECT('x', ['a', ANY() ]))) # TRUE
The above pattern means:
For EACH element in the root structure (an array):
COLLECT each element, into a collection named 'x', that is
an ARRAY of length 2 that starts with 'a'.
On the other hand:
match( [ ['a', 1 ], ['b', 2], ['a', 3] ], ALL(COLLECT('x', [ 'a', ANY() ])) ) # IS FALSE
That is false because the second root element (an array) does not start with 'a'. But:
match( [ ['a', 1 ], ['a', 2], ['a', 3] ], ALL(COLLECT('x', [ 'a', ANY() ])) ) # IS TRUE
The pattern below flattens the nested array into atoms:
match(
  [ 1, 'x', [ 2, 'x', [ 3, 'x' ], [ 4, [ 5, [ 'x' ] ], 6 ] ] ],
  FIND(COLLECT('x', EXPR(q{! ref}))),
  { 'no_collect_path' => 1 }
)->{'COLLECT'}{'x'}{'v'};
no_collect_path causes COLLECT and BIND to not collect any paths.
Match slices are objects that contain slices of matched portions of a data structure. This is useful for making changes to substructures matched by patterns like REST.
do {
  my $a = [ 1, 2, 3, 4 ];
  my $p = [ 1, ANY, REST(BIND('s')) ];
  my $r = matches($a, $p);
  ok($r); # TRUE
  ok(Compare($r->{'BIND'}{'s'}{'v'}[0], [ 3, 4 ])); # TRUE
  $r->{'BIND'}{'s'}{'v'}[0][0] = 'x'; # Change match slice
  matches($a, [ 1, 2, 'x', 4 ]); # TRUE
}
Hash match slices are generated for each key-value pair for a hash matched by EACH and ALL. Each of these match slices can be matched as a hash with a single key-value pair.
Match slices are useful for search and replace missions.
By default Data::Match is blind to Perl object interfaces. To instruct Data::Match to not traverse object implementation containers and honor object interfaces you must provide a visitation adapter. A visitation adapter tells Data::Match how to traverse through an object interface and how to keep track of how it got through.
package Foo;
sub new
{
  my ($cls, %opts) = @_;
  bless \%opts, $cls;
}
sub x { shift->{x}; }
sub parent { shift->{parent}; }
sub children { shift->{children}; }
sub add_child
{
  my $self = shift;
  for my $c ( @_ ) {
    $c->{parent} = $self;
  }
  push(@{$self->{children}}, @_);
}

my $foos = [ map(new Foo('x' => $_), 1 .. 10) ];
for my $f ( @$foos ) {
  $f->add_child($foos->[rand($#$foos)]);
}

my $pat = FIND(COLLECT('Foo', ISA('Foo', { 'parent' => $foos->[0], REST() => REST() })));
$match->match($foos, $pat);
The problem with the above example is that FIND will not honor the interface of class Foo by default and will eventually find a Foo where $_->parent eq $foos->[0] through all the parent and child links in the objects' implementation container. To force Data::Match to honor an interface (or a subset of an interface) during FIND traversal, we create a 'find' adapter sub that will do the right thing.
my $opts = {
  'find' => {
    'Foo' => sub {
      my ($self, $visitor, $match) = @_;

      # Always do 'x'.
      $visitor->($self->x, 'METHOD', 'x');

      # Optional children traversal.
      if ( $match->{'Foo_find_children'} ) {
        $visitor->($self->children, 'METHOD', 'children');
      }

      # Optional parent traversal.
      if ( $match->{'Foo_find_parent'} ) {
        $visitor->($self->parent, 'METHOD', 'parent');
      }
    }
  }
};
my $match = new Data::Match($opts, 'Foo_find_children' => 1);
$match = $match->execute($foos, $pat);
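How such an adapter table might be consulted during traversal can be sketched in plain Perl. The driver `sketch_visit` and the class `Sketch::Node` are hypothetical and only mimic the calling convention shown above (an adapter receives the object and a visitor callback); the module's real traversal code is not reproduced here.

```perl
use strict;
use warnings;

package Sketch::Node;    # hypothetical class with one public accessor
sub new { my ($cls, %o) = @_; return bless {%o}, $cls }
sub x   { $_[0]{x} }

package main;

# Adapter table keyed by class name, in the same shape as the 'find' option.
my %find = (
    'Sketch::Node' => sub {
        my ($self, $visitor) = @_;
        $visitor->($self->x, 'METHOD', 'x');    # expose only x(), not the hash internals
    },
);

# Hypothetical driver: when an adapter exists for ref($thing), let it decide
# which parts of the object are visited.
my @visited;
sub sketch_visit {
    my ($thing) = @_;
    my $adapter = $find{ ref($thing) } or return;
    $adapter->($thing, sub {
        my ($value, $kind, $name) = @_;
        push @visited, "$kind:$name=$value";
    });
}

sketch_visit(Sketch::Node->new(x => 42, secret => 'hidden'));
```

Only the value returned by `x` is recorded; the `secret` slot in the implementation hash is never visited.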
See t/t4.t for more examples of visitation adapters.
Data::Match employs a mostly-functional external interface since this module was inspired by a Lisp tutorial ("The Little Lisper", maybe) I read too many years ago; besides, pattern matching is largely recursively functional. The optional control hashes and traverse adapter interfaces are better represented by an object interface so I implemented a functional veneer over the core object interface.
Internally, objects are used to represent the pattern primitives because most of the pattern primitives have common behavior. There are a few design patterns that are particularly applicable in Data::Match: Visitor and Adapter. Adapter is used to provide the extensibility for the traversal of blessed structures such that Data::Match can honor the external interfaces of a class and not blindly violate encapsulation. Visitor is the basis for some of the FIND pattern implementation. The Data::Match::Slice classes that provide the match slices are probably a Veneer on the array and hash types through the tie meta-behaviors.
Does not have regexp-like operators like '?', '*', '+'.
Should probably have more interfaces with Data::DRef and Data::Walker.
The visitation adapter lookup does not use UNIVERSAL::isa to search for the adapter; it uses ref. This will be fixed in a future release.
Since hash keys do not retain blessedness (what was Larry thinking?) it is difficult to have patterns match keys without resorting to some bizarre regexp instead of using isa.
match_path_set and match_path_ref do not work through 'METHOD' path boundaries. This will be fixed in a future release.
BIND and COLLECT need scoping operators for deeply collected patterns.
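Two of the caveats above follow from ordinary Perl behavior and can be reproduced without the module. The classes `Sketch::Base` and `Sketch::Derived` are hypothetical and exist only for this demonstration.

```perl
use strict;
use warnings;

package Sketch::Base;
sub new { my ($cls) = @_; return bless {}, $cls }

package Sketch::Derived;
our @ISA = ('Sketch::Base');

package main;

my $obj = Sketch::Derived->new;

# ref() reports only the exact class, so a lookup keyed on ref() misses
# adapters registered for a base class.
my $ref_is_base = ref($obj) eq 'Sketch::Base' ? 1 : 0;

# isa() follows the inheritance chain.
my $isa_base = $obj->isa('Sketch::Base') ? 1 : 0;

# A blessed reference used as a hash key is stringified, so the key that
# comes back is a plain string and no longer an object.
my %by_key = ($obj => 'value');
my ($key)  = keys %by_key;
my $key_is_ref = ref($key) ? 1 : 0;
```

Here `$ref_is_base` is 0 while `$isa_base` is 1, and `$key_is_ref` is 0 because the key has lost its blessedness.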
If you find this to be useful please contact the author. This is alpha software; all APIs, semantics and behaviors are subject to change.
This section describes the external interface of this module.
Default options for match.
Matches a structure against a pattern. In a list context, returns both the match success and results; in a scalar context returns the results hash if match succeeded or undef.
use Data::Match;
my $obj = new Data::Match();
my $matched = $obj->execute($thing, $pattern);
use Data::Match qw(match);
match($thing, $pattern, @opts);
is equivalent to:
use Data::Match;
Data::Match->new(@opts)->execute($thing, $pattern);
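The context-dependent return described for match can be sketched with Perl's wantarray. `sketch_match` is a hypothetical stand-in that takes a ready-made success flag and results hash; it illustrates only the return convention, not any matching logic.

```perl
use strict;
use warnings;

sub sketch_match {
    my ($ok, $results) = @_;
    return ($ok, $results) if wantarray;    # list context: success flag and results
    return $ok ? $results : undef;          # scalar context: results, or undef on failure
}

my ($ok, $res)  = sketch_match(1, { note => 'matched' });
my $scalar_hit  = sketch_match(1, { note => 'matched' });
my $scalar_miss = sketch_match(0, { note => 'ignored' });
```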
Same as match in scalar context.
Returns a perl expression that will generate code to point to the element of the path.
$matchobj->match_path_str($path, $str);
$str defaults to '$_'.
Returns a string suitable for Data::DRef.
$matchobj->match_path_DRef_path($path, $str, $sep);
$str is used as a prefix for the Data::DRef path. $str defaults to ''; $sep defaults to $Data::DRef::Separator or '.'.
Returns the value pointing to the location for the match path in the root.
$matchobj->match_path_get($path, $root);
$root defaults to $matchobj->{'root'}.
Example:
my $results = matches($thing, FIND(BIND('x', [ 'x', REST ])));
my $x = $results->match_path_get($results->{'BIND'}{'x'}{'p'}[0], $thing);
The above example returns the first array that begins with 'x'.
$matchobj->match_path_set($path, $value, $root);
my $results = matches($thing, FIND(BIND('x', [ 'x', REST ])));
$results->match_path_set($results->{'BIND'}{'x'}{'p'}[0], 'y', $thing);
The above example replaces the first array found that starts with 'x' with 'y'.
Returns a scalar ref pointing to the location for the match path in the root.
$matchobj->match_path_ref($path, $root);
my $results = matches($thing, FIND(BIND('x', [ 'x', REST ])));
my $ref = $results->match_path_ref($results->{'BIND'}{'x'}{'p'}[0], $thing);
$$ref = 'y';
The above example replaces the first array that starts with 'x' with 'y'.
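Replacing an element through a scalar ref works the same way in plain Perl. The sketch below takes the reference by hand instead of calling match_path_ref, and the structure in `$root` is made up for illustration.

```perl
use strict;
use warnings;

my $root = [ 'a', [ 'x', 1, 2 ], 'b' ];

# Reference to the slot that holds the nested array, not to the array itself.
my $ref = \$root->[1];

# Assigning through the reference replaces the element in place.
$$ref = 'y';
```

After the assignment, `$root` holds 'a', 'y', 'b'.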
Version 0.05, $Revision: 1.12 $.
Kurt A. Stephens <ks.perl@kurtstephens.com>
Copyright (c) 2001, 2002 Kurt A. Stephens and ION, INC.
perl, Array::PatternMatcher, Data::Compare, Data::Dumper, Data::DRef, Data::Walker.