Bio::Translator::Utils - Utilities that require a translation table
    use Bio::Translator::Utils;

    # Same constructor as Bio::Translator
    my $utils = new Bio::Translator::Utils();
    my $utils = custom Bio::Translator( \$custom_table );

    my $codons  = $utils->codons( $residue );
    my $regex   = $utils->regex( $residue );
    my $indices = $utils->find( $seq_ref, $residue );
    my $orf     = $utils->getORF( $seq_ref );
    my $cds     = $utils->getCDS( $seq_ref );
    my $frames  = $utils->nonstop( $seq_ref );
See Bio::Translator for more info. Utils contains utilities that require knowledge of the translation table.
    my $codon_array = $translator->codons( $residue );
    my $codon_array = $translator->codons( $residue, \%params );
Returns a list of codons for a particular residue or start codon. In addition to the one-letter codes for amino acids, the following are valid inputs for the residue:
    start: Start codons (you may also use "+", which is what the translator uses as the 1-letter code for start codons)
    stop:  Stop codons (you may also use "*", which is the 1-letter code)
    lower: Start or stop codons, depending on strand
    upper: Start or stop codons, depending on strand

"lower" and "upper" match the respective ends of a CDS for a given strand (i.e. on the positive strand, lower matches the start, and upper matches the stop). Valid options for the params hash are:
strand: 1 or -1; default = 1
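The lookup described above can be pictured as inverting a codon table. The following is a minimal sketch, not the module's implementation: the six-codon `%TABLE`, the `%STARTS` set, and the helper names `revcomp`, `resolve_residue` and `codons_for` are all hypothetical. It also assumes that on the negative strand "lower" maps to the stop and that codons are returned reverse-complemented, which the text above does not state explicitly.

```perl
use strict;
use warnings;

# Hypothetical excerpt of a codon table (codon => residue); a real
# translation table covers all 64 codons.
my %TABLE = (
    ATG => 'M',
    AAA => 'K', AAG => 'K',
    TAA => '*', TAG => '*', TGA => '*',
);
my %STARTS = ( ATG => 1 );    # assumed start codons for this excerpt

# Reverse-complement a DNA string.
sub revcomp {
    my ($seq) = @_;
    my $rc = scalar reverse $seq;
    $rc =~ tr/ACGT/TGCA/;
    return $rc;
}

# Map the symbolic names onto '+' (start) or '*' (stop).
sub resolve_residue {
    my ( $residue, $strand ) = @_;
    return '+' if $residue eq 'start';
    return '*' if $residue eq 'stop';
    # Assumption: on the negative strand the lower end of a CDS is its stop.
    return $strand == 1 ? '+' : '*' if $residue eq 'lower';
    return $strand == 1 ? '*' : '+' if $residue eq 'upper';
    return $residue;
}

# Return an array ref of codons for a residue, start or stop.
sub codons_for {
    my ( $residue, $params ) = @_;
    $params //= {};
    my $strand = $params->{strand} // 1;
    my $code   = resolve_residue( $residue, $strand );
    my @codons = grep { $code eq '+' ? $STARTS{$_} : $TABLE{$_} eq $code }
        sort keys %TABLE;
    # Assumption: minus-strand codons are reported as reverse complements.
    @codons = map { revcomp($_) } @codons if $strand == -1;
    return \@codons;
}

print join( ' ', @{ codons_for('K') } ), "\n";    # AAA AAG
```

Resolving "lower" and "upper" to a start or stop first means the rest of the lookup only needs to handle one-letter codes.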
    my $regex = $translator->regex( $residue );
    my $regex = $translator->regex( $residue, \%params );
Returns a regular expression matching codons for a particular amino acid residue. In addition to the one-letter codes for amino acids, the following are valid inputs for the residue:
    start: Start codons (you may also use "+", which is what the translator uses as the 1-letter code for start codons)
    stop:  Stop codons (you may also use "*", which is the 1-letter code)
    lower: Start or stop codons, depending on strand
    upper: Start or stop codons, depending on strand
"lower" and "upper" match the respective ends of a CDS for a given strand (i.e. on the positive strand, lower matches the start, and upper matches the stop). Valid options for the params hash are:
    my $locations = $translator->find( $seq_ref, $residue );
    my $locations = $translator->find( $seq_ref, $residue, \%params );
Find the indexes of a given residue in a sequence. In addition to the one-letter codes for amino acids, the following are valid inputs for the residue:
    my $orf_arrayref = $translator->getORF( $seq_ref );
    my $orf_arrayref = $translator->getORF( $seq_ref, \%params );
This will get the longest region between stops and return lower and upper bounds, and the strand. Valid options for the params hash are:
    strand: 0, 1 or -1; default = 0 (meaning search both strands)
    lower:  integer between 0 and length; default = 0
    upper:  integer between 0 and length; default = length
Lower and upper are used to specify bounds between which you are searching. Suppose the following was the longest ORF:
    0 1 2 3 4 5 6 7 8 9 10
     T A A A T C T A A G
     *****       *****
           <--------->
This will return:
[ 3, 9, 1 ]
You can also specify which strand you are looking for the ORF to be on.
For ORFs that start at the very beginning of the strand or trail off the end, but are not in phase with the start or end, this method will cut at the last complete codon. For example, if the following was the longest ORF:
    0 1 2 3 4 5 6 7 8 9 10
     A C G T A G T T T A
           *****
       <--------------->
getORF will return:
[ 1, 10, 1 ]
The distance between lower and upper will always be a multiple of 3. This is to make it clear which frame the ORF is in. The resulting hash may be passed to the translate method.
Example:
    my $orf_ref = $translator->getORF( \'TAGAAATAG' );
    my $orf_ref = $translator->getORF( \$seq, { strand => -1 } );
    my $orf_ref = $translator->getORF( \$seq, { lower => $lower, upper => $upper } );
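To make the bounds convention concrete, here is a minimal forward-strand sketch of "longest region between stops". It is an illustration under stated assumptions, not the module's code: the stop codons are hard-coded to the standard TAA, TAG and TGA rather than read from a translation table, a region is taken to include its terminating stop (as the first diagram above suggests), and the helper names `regions_in_frame` and `longest_orf_forward` are hypothetical.

```perl
use strict;
use warnings;

# Assumed stop codons (standard genetic code); the real module reads them
# from its translation table.
my %IS_STOP = map { $_ => 1 } qw(TAA TAG TGA);

# Collect [lower, upper] for every stop-delimited region in one forward
# frame. A region includes its terminating stop, or ends at the last
# complete codon if no stop follows.
sub regions_in_frame {
    my ( $seq, $offset ) = @_;
    my @regions;
    my $lower = $offset;
    my $pos   = $offset;
    for ( ; $pos + 3 <= length $seq; $pos += 3 ) {
        next unless $IS_STOP{ substr $seq, $pos, 3 };
        push @regions, [ $lower, $pos + 3 ];
        $lower = $pos + 3;
    }
    # $pos now sits at the end of the last complete codon in this frame.
    push @regions, [ $lower, $pos ] if $pos > $lower;
    return @regions;
}

# Longest region over the three forward frames, as [lower, upper, strand].
sub longest_orf_forward {
    my ($seq) = @_;
    my @best = ( 0, 0, 1 );
    for my $offset ( 0 .. 2 ) {
        for my $r ( regions_in_frame( $seq, $offset ) ) {
            @best = ( @$r, 1 ) if $r->[1] - $r->[0] > $best[1] - $best[0];
        }
    }
    return \@best;
}

print join( ', ', @{ longest_orf_forward('ACGTAGTTTA') } ), "\n";    # 1, 10, 1
```

Because each frame is walked in steps of three from its own offset, upper minus lower is a multiple of 3 by construction. Searching the negative strand would apply the same walk to the reverse complement and map the coordinates back.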
    my $cds_ref = $translator->getCDS( $seq_ref );
    my $cds_ref = $translator->getCDS( $seq_ref, \%params );
Returns the strand and boundaries of the longest CDS, similar to getORF.
    0 1 2 3 4 5 6 7 8 9 10
     A T G A A A T A A G
     >>>>>       *****
     <--------------->
Will return:
[ 0, 9, 1 ]
Valid options for the params hash are:
    strand: 0, 1 or -1; default = 0 (meaning search both strands)
    lower:  integer between 0 and length; default = 0
    upper:  integer between 0 and length; default = length
    strict: 0, 1 or 2; default = 1
Strict controls how strictly getCDS functions. There are 3 levels of strictness, enumerated 0, 1 and 2. Level 2 is the most strict: in that mode, a region is only considered a CDS if both the start and the stop are found. In strict level 1, if a start is found but no stop is present before the end of the sequence, the CDS runs until the end of the sequence. Strict level 0 assumes that a start codon is present in each frame just before the start of the molecule. Level 1 is a pretty safe bet, so that is the default.
    my $cds_ref = $translator->getCDS(\'ATGAAATAG');
    my $cds_ref = $translator->getCDS(\$seq, { strand => -1 } );
    my $cds_ref = $translator->getCDS(\$seq, { strict => 2 } );
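The three strictness levels can be compared with a small forward-strand sketch. This is one reading of the description above, not the module's implementation: ATG is assumed to be the only start codon, the stops are the standard TAA, TAG and TGA, the stop is counted inside the CDS bounds, and `longest_cds_forward` is a hypothetical helper name.

```perl
use strict;
use warnings;

# Assumed start and stop codons (standard genetic code); the real module
# takes both from its translation table.
my %IS_START = ( ATG => 1 );
my %IS_STOP  = map { $_ => 1 } qw(TAA TAG TGA);

# Longest CDS on the forward strand as [lower, upper, 1], or undef if none.
sub longest_cds_forward {
    my ( $seq, $strict ) = @_;
    $strict //= 1;
    my $best;
    my $keep = sub {
        my ( $lower, $upper ) = @_;
        $best = [ $lower, $upper, 1 ]
            if !$best || $upper - $lower > $best->[1] - $best->[0];
    };
    for my $offset ( 0 .. 2 ) {
        # Level 0 pretends a start codon sits just before the sequence.
        my $lower = $strict == 0 ? $offset : undef;
        my $pos   = $offset;
        for ( ; $pos + 3 <= length $seq; $pos += 3 ) {
            my $codon = substr $seq, $pos, 3;
            if ( !defined $lower ) {
                $lower = $pos if $IS_START{$codon};
            }
            elsif ( $IS_STOP{$codon} ) {
                $keep->( $lower, $pos + 3 );
                undef $lower;
            }
        }
        # Levels 0 and 1 let an unterminated CDS run to the last full codon.
        $keep->( $lower, $pos )
            if defined $lower && $strict < 2 && $pos > $lower;
    }
    return $best;
}

print join( ', ', @{ longest_cds_forward( 'ATGAAATAAG', 2 ) } ), "\n";    # 0, 9, 1
```

With the sequence `CCATGAAA`, which has a start but no in-frame stop, this sketch returns `[ 2, 8, 1 ]` at level 1 and nothing at level 2, which is the difference the paragraph above describes.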
    my $frames = $translator->nonstop( $seq_ref );
    my $frames = $translator->nonstop( $seq_ref, \%params );
Returns the frames that contain no stop codons for the sequence. Frames are numbered -3, -2, -1, 1, 2 and 3.
     3   ---->
     2  ----->
     1 ------>
       -------
    -1 <------
    -2 <-----
    -3 <----
strand: 0, 1 or -1; default = 0 (meaning search both strands)
    my $frames = $translator->nonstop(\'TACGTTGGTTAAGTT');         # [ 2, 3, -1, -3 ]
    my $frames = $translator->nonstop(\$seq, { strand => 1 } );    # [ 2, 3 ]
    my $frames = $translator->nonstop(\$seq, { strand => -1 } );   # [ -1, -3 ]
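The frame numbering can be checked by hand with a short standalone sketch. It is not the module's code: the stop codons are assumed to be the standard TAA, TAG and TGA, frame -n is assumed to start at offset n-1 of the reverse complement, and `frame_is_open` and `nonstop_frames` are hypothetical helper names. Under those assumptions it reproduces the results quoted in the example above.

```perl
use strict;
use warnings;

# Assumed stop codons (standard genetic code).
my %IS_STOP = map { $_ => 1 } qw(TAA TAG TGA);

# True if the frame starting at $offset contains no complete stop codon.
sub frame_is_open {
    my ( $seq, $offset ) = @_;
    for ( my $pos = $offset; $pos + 3 <= length $seq; $pos += 3 ) {
        return 0 if $IS_STOP{ substr $seq, $pos, 3 };
    }
    return 1;
}

# Frames without stops; strand 0 searches both strands, 1 or -1 only one.
sub nonstop_frames {
    my ( $seq, $strand ) = @_;
    $strand //= 0;
    my $rc = scalar reverse $seq;
    $rc =~ tr/ACGT/TGCA/;
    my @frames;
    if ( $strand >= 0 ) {
        push @frames, grep { frame_is_open( $seq, $_ - 1 ) } 1 .. 3;
    }
    if ( $strand <= 0 ) {
        # Assumption: frame -n begins at offset n-1 of the reverse complement.
        push @frames,
            map { -$_ } grep { frame_is_open( $rc, $_ - 1 ) } 1 .. 3;
    }
    return \@frames;
}

print join( ', ', @{ nonstop_frames('TACGTTGGTTAAGTT') } ), "\n";    # 2, 3, -1, -3
```

In this sequence, frame 1 is closed by the TAA at positions 9 to 12, and frame -2 by a TAA in the reverse complement `AACTTAACCAACGTA`.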
Kevin Galinsky, <kgalinsky plus cpan at gmail dot com>