NAME

Algorithm::DistanceMatrix - Compute distance matrix for any distance metric

VERSION

version 0.04

SYNOPSIS

 use Algorithm::DistanceMatrix;
 my $m = Algorithm::DistanceMatrix->new(
     metric=>\&mydistance, objects=>\@myarray);
 my $distmatrix =  $m->distancematrix;
 
 use Algorithm::Cluster qw/treecluster/;
 # method=>
 # s: single-linkage clustering
 # http://en.wikipedia.org/wiki/Single-linkage_clustering
 # m: maximum- (or complete-) linkage clustering
 # http://en.wikipedia.org/wiki/Complete_linkage_clustering
 # a: average-linkage clustering (UPGMA)
 # http://en.wikipedia.org/wiki/UPGMA
 
 my $tree = treecluster(data=>$distmatrix, method=>'a');
 
 # Get your objects and the cluster IDs they belong to, assuming 5 clusters
 my $cluster_ids = $tree->cut(5);
 # Index corresponds to that of the original objects
 print $myarray[2], ' belongs to cluster ', $cluster_ids->[2], "\n";

DESCRIPTION

This is a small helper package for Algorithm::Cluster. That module provides many facilities for clustering data. It also provides a distancematrix function, but assumes tabular data, which is the standard for gene expression data.

If your data is tabular, you should first have a look at distancematrix in Algorithm::Cluster:

 http://cpansearch.perl.org/src/MDEHOON/Algorithm-Cluster-1.48/doc/cluster.pdf

Otherwise, this package computes a simple distance matrix from an arbitrary distance function. It does not assume anything about your data. You simply provide a callback function that measures the distance between any two objects. By default it produces a lower diagonal distance matrix that is suitable for the clustering algorithms of Algorithm::Cluster.

METHODS

mode

One of qw/lower upper full/ for a lower diagonal, upper diagonal, or full distance matrix.
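
The three layouts can be pictured with a small pure-Perl sketch. It does not use this module: pairwise_matrix is a hypothetical helper, and the ragged row layout (row i of the lower form holds i entries and the diagonal is omitted) is an assumption made for illustration, not a statement of what this module returns.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Algorithm::DistanceMatrix.
# Builds a matrix of pairwise values for the requested layout.
sub pairwise_matrix {
    my ($metric, $objects, $mode) = @_;
    $mode ||= 'lower';
    my $n = @$objects;
    my @matrix;
    for my $i (0 .. $n - 1) {
        # Which columns are filled depends on the layout (assumed, see above).
        my @cols = $mode eq 'lower' ? (0 .. $i - 1)
                 : $mode eq 'upper' ? ($i + 1 .. $n - 1)
                 :                    (0 .. $n - 1);
        my @row;
        push @row, $metric->($objects->[$i], $objects->[$_]) for @cols;
        $matrix[$i] = \@row;
    }
    return \@matrix;
}

my $abs_diff = sub { abs($_[0] - $_[1]) };

my $lower = pairwise_matrix($abs_diff, [1, 4, 9], 'lower');
# $lower is [ [], [3], [8, 5] ]

my $full = pairwise_matrix($abs_diff, [1, 4, 9], 'full');
# $full is [ [0, 3, 8], [3, 0, 5], [8, 5, 0] ]
```

For a symmetric metric the lower and upper forms carry the same information; the full form simply repeats each value on both sides of the diagonal.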

metric

Callback for computing the distance, similarity, or whatever measure you like.

 $matrix->metric(\&mydistance);

Where mydistance receives two objects as its first two arguments.

If you need to pass special parameters to your method:

 $matrix->metric(sub { my ($x, $y) = @_; mydistance(first=>$x, second=>$y, mode=>'fast') });

You may use any metric, and it may return any number or object. Note that if you plan to use this with Algorithm::Cluster, the callback needs to return a distance. So, if you measure how similar two things are on a scale of 1-10, you should return 10-$similarity to get a distance.
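
One way to apply that conversion is to wrap an existing similarity function in a closure. In this sketch as_distance and shared_chars are hypothetical names, and the shared-character score is only a stand-in for a real similarity measure; the one thing taken from the text above is the shape of the callback (two objects in, one number out).

```perl
use strict;
use warnings;

# Hypothetical wrapper: turns a similarity callback with a known maximum
# into a distance callback by subtracting the score from that maximum.
sub as_distance {
    my ($similarity, $max) = @_;
    return sub { $max - $similarity->(@_) };
}

# Stand-in similarity: number of characters of $y that also occur in $x,
# capped at 10 so that it stays on the assumed scale.
sub shared_chars {
    my ($x, $y) = @_;
    my %seen;
    $seen{$_} = 1 for split //, $x;
    my $shared = grep { $seen{$_} } split //, $y;
    return $shared > 10 ? 10 : $shared;
}

my $distance = as_distance(\&shared_chars, 10);
print $distance->('perl', 'pearl'), "\n";
```

The resulting code reference could then be passed as the metric, for example $matrix->metric($distance).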

Default is the absolute value of the scalar difference (i.e. abs(X-Y)).

objects

Array reference. It doesn't matter what kind of objects are in the array, as long as your metric can process them.
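
As an illustration of non-scalar objects, the sketch below pairs an array of hash references with a metric that knows how to read them. The point records and the euclidean function are hypothetical examples, not part of this module.

```perl
use strict;
use warnings;

# Hypothetical objects: hash references describing 2D points.
my @points = (
    { name => 'a', x => 0, y => 0 },
    { name => 'b', x => 3, y => 4 },
    { name => 'c', x => 6, y => 8 },
);

# Metric callback: Euclidean distance between two point hashes.
sub euclidean {
    my ($p, $q) = @_;
    return sqrt( ($p->{x} - $q->{x})**2 + ($p->{y} - $q->{y})**2 );
}

print euclidean($points[0], $points[1]), "\n";   # prints 5
```

Such an array and callback would be supplied as objects=>\@points and metric=>\&euclidean, following the constructor call in the SYNOPSIS.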

distancematrix

2D array of distances (or similarities, or whatever) between your objects.

(An ArrayRef of ArrayRefs.)

AUTHOR

Chad A. Davis <chad.a.davis@gmail.com>

COPYRIGHT AND LICENSE

This software is copyright (c) 2011 by Chad A. Davis.

This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.