bioyatea - Perl script for extracting terms from a corpus of biomedical texts (based on the module Lingua::YaTeA).
bioyatea [-help] [-man] [--rcfile=file] file
BioYaTeA is an adaptation of YaTeA (Lingua::YaTeA) for biomedical text. The tuning concerns the configuration files (in the directory share/BioYaTeA), the pre-processing of the input file, and the post-processing of the XML output.
BioYaTeA takes as input the output of TreeTagger (<http://www.ims.uni-stuttgart.de/projekte/corplex/TreeTagger/DecisionTreeTagger.html>) or GeniaTagger (<http://www.nactem.ac.uk/GENIA/tagger/>), so one of these taggers must be run on the corpus first.
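Before running the extractor, it can help to check that the tagger output is well-formed. The following is a minimal sketch, not part of the BioYaTeA distribution. It assumes that TreeTagger was run with its -token and -lemma options, so that each line holds three tab-separated fields (token, part-of-speech tag, lemma). The function name check_ttg is hypothetical, and the check would have to be adapted for GeniaTagger output or for files containing markup lines.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with BioYaTeA).
# Assumption: one token per line, three tab-separated fields
# (token, POS tag, lemma), as written by TreeTagger with -token -lemma.
# Reads the files given as arguments, or standard input if there are none.
# Reports every line that does not have exactly three fields and
# returns a non-zero status if at least one such line was found.
check_ttg() {
    awk -F '\t' '
        NF != 3 {
            printf "line %d: expected 3 fields, found %d\n", NR, NF
            bad = 1
        }
        END { exit (bad ? 1 : 0) }
    ' "$@"
}

# Example on two well-formed lines given on standard input.
printf 'Cells\tNNS\tcell\nproliferate\tVVP\tproliferate\n' | check_ttg \
    && echo "input looks well-formed"
```

On a real corpus the call would be check_ttg TreeTaggerOutputFile.ttg, followed by the bioyatea command only if the check succeeds.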
To run bioyatea, a configuration file is needed (usually bioyatea.rc in /etc/bioyatea). This file describes the behaviour of the term extractor. You have to indicate the language of the configuration file you use (see the section CONFIGURATION FILE FORMAT of Lingua::YaTeA for more details). The file also indicates the path of the configuration files for the linguistic analysis; adapt this path if your installation is not standard.
An example of the configuration file is available in etc/bioyatea/bioyatea.rc from the archive directory.
bioyatea -e TreeTaggerOutputFile.ttg
It is assumed that the directory containing the program bioyatea is in your PATH variable and that the configuration file is /etc/bioyatea/bioyatea.rc.
If bioyatea.rc is not in /etc/bioyatea, give its location with the --rcfile option:
bioyatea -e --rcfile MyBioYaTeAConfig.rc TreeTaggerOutputFile.ttg
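The choice between the default configuration file and --rcfile can be wrapped in a small shell function. This is a minimal sketch, not part of the BioYaTeA distribution: the function name bioyatea_cmd is hypothetical, and the function only prints the command line so that it can be inspected before being run.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with BioYaTeA): print the bioyatea
# command line for one tagger output file.
#   $1  input file (TreeTagger output)
#   $2  optional configuration file; when it is omitted, no --rcfile
#       option is added and bioyatea uses /etc/bioyatea/bioyatea.rc
bioyatea_cmd() {
    input=$1
    rcfile=${2:-}
    if [ -n "$rcfile" ]; then
        printf 'bioyatea -e --rcfile %s %s\n' "$rcfile" "$input"
    else
        printf 'bioyatea -e %s\n' "$input"
    fi
}

# Default configuration file.
bioyatea_cmd TreeTaggerOutputFile.ttg
# Configuration file given explicitly.
bioyatea_cmd TreeTaggerOutputFile.ttg MyBioYaTeAConfig.rc
```

The two calls print the two command lines shown above. The sketch does no quoting, so it is only suitable for file names without spaces.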
More examples of the use of the bioyatea script are given below.
See Documentation in Lingua::YaTeA
Processing of a file without post-processing, with the default configuration file (/etc/bioyatea/bioyatea.rc):
bioyatea -e sampleEN.ttg
Processing of a file without post-processing. The configuration file is given in the option --rcfile:
bioyatea -e --rcfile etc/bioyatea.rc sampleEN.ttg
Processing of a file with post-processing:
bioyatea -e --rcfile etc/bioyatea.rc --post-processing-config etc/post-processing-filtering.conf --post-processing sampleEN-PP.xml sampleEN.ttg
Only post-processing a file (XML YaTeA output format):
bioyatea --post-processing-config etc/post-processing-filtering.conf --post-processing sampleEN-PP.xml sampleEN-output.xml
Processing of a file with pre-processing:
bioyatea -e --rcfile etc/bioyatea.rc --pre-processing sampleEN-prepro.ttg sampleEN.ttg
Only pre-processing a file (TreeTagger output format):
bioyatea --pre-processing sampleEN-prepro.ttg sampleEN.ttg
Processing of a file with pre-processing and post-processing:
bioyatea -e --rcfile etc/bioyatea.rc --post-processing-config etc/post-processing-filtering.conf --post-processing sampleEN-PP.xml --pre-processing sampleEN-prepro.ttg sampleEN.ttg
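When several tagger output files have to go through pre-processing, extraction and post-processing, the command above can be generated once per file. The following is a minimal sketch under stated assumptions: the function name full_pipeline_cmds is hypothetical, the output names are derived from the input name by replacing the .ttg suffix (following the sampleEN-prepro.ttg and sampleEN-PP.xml naming used in the examples), and the commands are only printed, not executed.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with BioYaTeA): print one complete
# bioyatea command (pre-processing, extraction, post-processing) per input.
#   $1     configuration file passed to --rcfile
#   $2     post-processing configuration file
#   $3...  tagger output files, assumed to end in .ttg
full_pipeline_cmds() {
    rcfile=$1
    ppconf=$2
    shift 2
    for ttg in "$@"; do
        base=${ttg%.ttg}   # strip the .ttg suffix to build the output names
        printf 'bioyatea -e --rcfile %s --post-processing-config %s --post-processing %s --pre-processing %s %s\n' \
            "$rcfile" "$ppconf" "${base}-PP.xml" "${base}-prepro.ttg" "$ttg"
    done
}

# Example: reproduces the command shown above for sampleEN.ttg.
full_pipeline_cmds etc/bioyatea.rc etc/post-processing-filtering.conf sampleEN.ttg
```

Once the printed commands look right, they can be piped to sh to run them, provided the file names contain no spaces or shell metacharacters.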
Documentation of Lingua::YaTeA
Wiktoria Golik <wiktoria.golik@jouy.inra.fr>, Zorana Ratkovic <Zorana.Ratkovic@jouy.inra.fr>, Robert Bossy <Robert.Bossy@jouy.inra.fr>, Claire Nédellec <claire.nedellec@jouy.inra.fr>, Thierry Hamon <thierry.hamon@univ-paris13.fr>
Copyright (C) 2012 Wiktoria Golik, Zorana Ratkovic, Robert Bossy, Claire Nédellec and Thierry Hamon
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.6 or, at your option, any later version of Perl 5 you may have available.