treebankFreq.pl - Perl program for finding the frequencies of words in the Treebank corpus
treebankFreq.pl [--compfile=COMPFILE --outfile=OUTFILE [--stopfile=STOPFILE] [--wnpath=WNPATH] [--resnik] [--smooth=SCHEME] PATH | --help | --version]
--compfile=filename
The name of a file containing the compound words (collocations) in WordNet
--outfile=filename
The name of a file to which output should be written
--stopfile=filename
A file containing a list of stop words that are excluded from the frequency counts. A sample file can be downloaded from http://www.d.umn.edu/~tpederse/Group01/WordNet/words.txt
--wnpath=path
Location of the WordNet data files (e.g., /usr/local/WordNet-2.1/dict)
--resnik
Use Resnik (1995) frequency counting
--smooth=SCHEME
Apply smoothing to the computed probabilities. SCHEME can only be ADD1 at this time
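As a rough illustration of what an add-one scheme does, the sketch below assumes that ADD1 denotes conventional add-one (Laplace) smoothing, in which every count is incremented by 1 so that unseen items no longer have zero frequency. The add1 helper and the "item count" input format are hypothetical and are not part of treebankFreq.pl or its output format.

```shell
#!/bin/sh
# Hypothetical helper: reads "item count" pairs, one per line,
# and prints each item with its count incremented by 1.
add1() {
    awk '{ print $1, $2 + 1 }'
}

# Made-up example counts: "river" was never observed (count 0).
printf 'bank 3\nriver 0\n' | add1
```

After smoothing, the previously unseen item has a count of 1, so any probability derived from the counts is nonzero.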
--help
Show a help message
--version
Display version information
PATH
Path to the raw Wall Street Journal portion of the Treebank corpus. This is usually the /raw/wsj subdirectory of the Treebank installation. Thus, you might run this program as treebankFreq.pl [OPTIONS] /home/sid/treebank/raw/wsj
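A fuller invocation that combines the options above might look like the following sketch. The compound-word and output file names are hypothetical placeholders; the WordNet and Treebank paths are the example paths used in this document and will differ on your system. The script only prints the assembled command so that it can be reviewed before being run.

```shell
#!/bin/sh
# Hypothetical file names; replace them with your own.
COMPFILE="compounds.txt"
OUTFILE="treebank-freq.dat"
# Example paths taken from the option descriptions above.
WNPATH="/usr/local/WordNet-2.1/dict"
TREEBANK="/home/sid/treebank/raw/wsj"

# Assemble the command line from the documented options and print it.
build_cmd() {
    printf '%s\n' "treebankFreq.pl --compfile=$COMPFILE --outfile=$OUTFILE --wnpath=$WNPATH --smooth=ADD1 $TREEBANK"
}

build_cmd
```

Once the printed command looks right, it can be pasted into the terminal. Add --stopfile=STOPFILE or --resnik in the same way if those options are needed.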