NAME

preProcessingRewriting - Perl script for rewriting the POS-tagged terms provided by TreeTagger.

SYNOPSIS

preProcessingRewriting [--help] [--man] [--configuration file] input_file output_file

OPTIONS

--help, -h, -? brief help message
--man, -m full documentation
input_file, -i BioYaTeA input file in TreeTagger output format
output_file, -o Rewriting output file (TreeTagger format)

DESCRIPTION

This script performs the pre-processing of the TreeTagger output in order to improve the extraction of both terms containing prepositional phrases (with TO and AT prepositions) and terms containing participles (past participles -ED and gerunds -ING). Context-based rules are applied to the POS tags either to trigger the extraction of relevant structures or to prevent the extraction of irrelevant ones. The modified file becomes a new input file for BioYaTeA.
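
The idea of a context-based rule can be sketched as follows. This is only an illustration, not the rule set actually shipped with the script: the rule itself (retagging a past participle as an adjective when a noun follows), the helper name rewrite_tags, and the sample tokens are all hypothetical. The sketch also assumes the usual TreeTagger layout of one token per line with tab-separated form, tag and lemma.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical rule, for illustration only (not taken from BioYaTeA):
# retag a past participle (VVN) as an adjective (JJ) when the next
# token is a noun (tag starting with NN).
sub rewrite_tags {
    my @tokens = @_;    # each token is an array ref: [form, tag, lemma]
    for my $i (0 .. $#tokens - 1) {
        my ($tag, $next_tag) = ($tokens[$i][1], $tokens[$i + 1][1]);
        if ($tag eq 'VVN' && $next_tag =~ /^NN/) {
            $tokens[$i][1] = 'JJ';
        }
    }
    return @tokens;
}

# Assumed input layout: one token per line, tab-separated form, tag, lemma.
my @input = (
    "activated\tVVN\tactivate",
    "cells\tNNS\tcell",
    "were\tVBD\tbe",
    "activated\tVVN\tactivate",
    ".\tSENT\t.",
);
my @tokens = map { [ split /\t/ ] } @input;

# Print the rewritten tokens in the same tab-separated layout.
print join("\t", @$_), "\n" for rewrite_tags(@tokens);
```

In this sketch only the first "activated" is retagged, because it directly precedes a noun; the second one is followed by sentence-final punctuation and keeps its VVN tag. A rule of this kind changes only the tag column, so the result stays in the same format as the input.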

If no input file is specified, the input data are read from stdin. If no output file is specified, the output data are printed to stdout.

INPUT/OUTPUT FILE FORMATS

See the documentation of Lingua::YaTeA.

EXAMPLES

preProcessingRewriting -i examples/sampleEN.ttg -o examples/sampleEN-prepro

preProcessingRewriting < examples/sampleEN.ttg > examples/sampleEN-prepro

SEE ALSO

Documentation of Lingua::BioYaTeA::PostProcessing, Lingua::BioYaTeA and Lingua::YaTeA

AUTHORS

Wiktoria Golik <wiktoria.golik@jouy.inra.fr>, Zorana Ratkovic <Zorana.Ratkovic@jouy.inra.fr>, Robert Bossy <Robert.Bossy@jouy.inra.fr>, Claire Nédellec <claire.nedellec@jouy.inra.fr>, Thierry Hamon <thierry.hamon@univ-paris13.fr>

LICENSE

Copyright (C) 2012 Wiktoria Golik, Zorana Ratkovic, Robert Bossy, Claire Nédellec and Thierry Hamon

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.6 or, at your option, any later version of Perl 5 you may have available.