Sort::Key::Merger - Perl extension for merging sorted things
    use Carp;    # provides croak
    use Sort::Key::Merger qw(keymerger);

    sub line_key_value {
        # $_[0] is available as a scratchpad that persists
        # between calls for the same $_;
        unless (defined $_[0]) {
            # so we use it to cache the file handle when we
            # open a file on the first read
            open $_[0], "<", $_
                or croak "unable to open $_";
        }

        # don't get confused by this while loop, it's only
        # used to ignore empty lines
        my $fh = $_[0];
        local $_; # break $_ aliasing;
        while (<$fh>) {
            next if /^\s*$/;
            chomp;
            if (my ($key, $value) = /^(\S+)\s+(.*)$/) {
                return ($key, $value)
            }
            warn "bad line $_"
        }

        # signals the end of the data by returning an
        # empty list
        ()
    }

    # create a merger object; the scratchpad is passed on
    # explicitly so that line_key_value sees it as $_[0]:
    my $merger = keymerger { line_key_value($_[0]) } @ARGV;

    # sort and write the values:
    my $value;
    while (defined($value=$merger->())) {
        print "value: $value\n"
    }
Sort::Key::Merger merges presorted collections of things based on some (calculated) key.
Nothing is exported by default.
The functions described below can be exported on request, e.g.:
use Sort::Key::Merger qw(keymerger);
keymerger merges the (presorted) generated values, ordering them lexicographically by their keys.
Every item in @source is aliased to $_ and then the user-defined subroutine generate_key_value_pair is called. The result of that call should be a (key, value) pair. Keys determine the order in which the values are sorted and returned.
generate_key_value_pair can return an empty list to indicate that a source has become exhausted.
The result from keymerger is another subroutine that works as a generator. It can be called as:
my $next = &$merger;
or
my $next = $merger->();
In scalar context it returns the next value or undef if all the sources have been exhausted. In list context it returns all the values remaining from the sources merged in a sorted list.
NOTE: an additional argument is passed to the generate_key_value_pair callback in $_[0]. It is meant to be used as a scratchpad: its value is associated with the current source and persists between calls made by the same generator, e.g.:
    my $merger = keymerger {
        # use $_[0] to cache an open file handle:
        $_[0] or open $_[0], '<', $_
            or croak "unable to open $_";
        my $fh = $_[0];
        local $_;
        while (<$fh>) {
            chomp;
            return $_ => $_;
        }
        ();
    } ('/tmp/foo', '/tmp/bar');
This function honours the use locale pragma.
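The contract described above (source aliased to $_, scratchpad in $_[0], a (key, value) pair or an empty list as the result, and a generator that is sensitive to calling context) can be illustrated with a small pure-Perl stand-in. This is only a sketch under stated assumptions: sketch_keymerger is a hypothetical name, it is not the module's implementation, it finds the smallest key by a linear scan, and it compares keys with a plain string comparison without locale support.

```perl
use strict;
use warnings;

# Hypothetical stand-in for keymerger; NOT the module's real code.
sub sketch_keymerger {
    my ($pair_sub, @sources) = @_;

    # One record per live source: the source, its scratchpad and
    # its pending (key, value) pair.
    my @live;

    # Ask a source for its next pair; false means it is exhausted.
    my $refill = sub {
        my ($rec) = @_;
        my @kv;
        for ($rec->{source}) {                 # alias $_ to the source
            @kv = $pair_sub->($rec->{pad});    # scratchpad arrives as $_[0]
        }
        return 0 unless @kv;                   # empty list: exhausted
        @$rec{qw(key value)} = @kv;
        return 1;
    };

    for my $source (@sources) {
        my $rec = { source => $source, pad => undef };
        push @live, $rec if $refill->($rec);
    }

    # Return the value with the smallest pending key, or undef.
    my $next = sub {
        return undef unless @live;
        my $best = 0;
        for my $i (1 .. $#live) {
            $best = $i if $live[$i]{key} lt $live[$best]{key};
        }
        my $rec   = $live[$best];
        my $value = $rec->{value};
        splice @live, $best, 1 unless $refill->($rec);
        return $value;
    };

    # Scalar context: next value. List context: everything left.
    return sub {
        return $next->() unless wantarray;
        my @all;
        while (defined(my $v = $next->())) { push @all, $v }
        return @all;
    };
}

# Sources are presorted array references; the scratchpad is a cursor.
my $merger = sketch_keymerger(
    sub {
        my $i = $_[0]++;
        return $i < @$_ ? ($_->[$i], uc $_->[$i]) : ();
    },
    [qw(apple cherry)], [qw(banana date)],
);

my $first = $merger->();    # scalar context: one value
my @rest  = $merger->();    # list context: all remaining values
print "$first @rest\n";     # APPLE BANANA CHERRY DATE
```

Because the generator signals exhaustion with undef, values returned by the callback need to be defined for the loop in list context to see all of them.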
is like keymerger but compares the keys numerically.
This function honours the use integer pragma.
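The difference between lexicographic and numeric key comparison matters because the sources must already be sorted in the order the merger uses. The following sketch uses only core Perl operators (not the module itself) to show how the two orders differ, and how use integer changes a numeric comparison by truncating its operands.

```perl
use strict;
use warnings;

my @keys = (9, 10, 100, 25);

# String comparison orders the keys character by character.
my @lexical = sort { $a cmp $b } @keys;
# Numeric comparison orders them by value.
my @numeric = sort { $a <=> $b } @keys;

print "lexicographic: @lexical\n";    # 10 100 25 9
print "numeric:       @numeric\n";    # 9 10 25 100

# Under "use integer" both operands are truncated before comparing,
# so keys that differ only in their fractional part compare as equal.
my $plain   = 7.9 <=> 7.2;
my $integer = do { use integer; 7.9 <=> 7.2 };
print "plain: $plain, under use integer: $integer\n";    # 1 and 0
```

A source sorted numerically is therefore not, in general, a valid presorted input for a merger that compares keys as strings, and vice versa.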
filekeymerger returns a merger subroutine that returns lines read from @files, sorted by the keys that generate_key generates.
@files can contain file names or handles for already open files.
generate_key is called with the line just read in $_ and has to return the sorting key for it. If it returns undef, the line is ignored.
The line can be modified inside generate_key by changing $_, e.g.:
    my $merger = filekeymerger {
        chomp($_); #  <-- here
        return undef if /^\s*$/;
        substr($_, -1, 10)
    } @ARGV;
Finally, $/ can be changed from its default value to read the files in chunks other than lines.
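As an illustration of what changing $/ does to reading, the sketch below uses core Perl only (an in-memory file handle rather than the module) and switches to paragraph mode, so that each read returns a block of lines delimited by blank lines. At which point the module itself consults $/ is not stated here, so where to set it when using filekeymerger is left as an assumption to verify.

```perl
use strict;
use warnings;

# Two records separated by a blank line; the first spans two lines.
my $data = "b 2\nmore\n\na 1\n";
open my $fh, '<', \$data
    or die "cannot open in-memory file: $!";

my @records;
{
    local $/ = "";    # paragraph mode: a record ends at blank line(s)
    while (my $record = <$fh>) {
        chomp $record;    # in paragraph mode, strips all trailing newlines
        push @records, $record;
    }
}
close $fh;

printf "%d records\n", scalar @records;    # 2 records
```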
The return value from this function is a subroutine reference that returns the sorted elements one at a time on successive calls in scalar context, or all the remaining elements in one go when called in list context, e.g.:
    my $merger = filekeymerger { (split)[0] } @ARGV;
    my @sorted = $merger->();
is like filekeymerger but the keys are compared numerically.
Sort::Key, locale, integer, perl core sort function.
Salvador Fandiño, <sfandino@yahoo.com>
Copyright (C) 2005 by Salvador Fandiño.
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.4 or, at your option, any later version of Perl 5 you may have available.