Parse::ExuberantCTags::Merge - Efficiently merge large exuberant ctags files
  use Parse::ExuberantCTags::Merge;
  my $merger = Parse::ExuberantCTags::Merge->new();
  $merger->add_file('perltags.old', sorted => 0);
  $merger->add_file('perltags.new', sorted => 1);
  $merger->add_file('perltags.new2', sorted => 1);
  # potentially add more files...

  # sorting happens only when you call 'write':
  $merger->write('perltags.out');
This Perl module merges multiple exuberant ctags files. The synopsis covers essentially the whole interface. To be as efficient as possible, the module chooses between different sort methods depending on the input data. In the general case, it uses the Sort::External module to process the data. There are a few exceptions:
If two or more input files contain sorted data, we use a merge sort to combine them efficiently before merging the result with the remaining data.
If the total size of the input files is small, we load them into memory and use Perl's fast sort function. Default limit: 2^21B == 4MB.
If the total size of the input files is extremely small, we ignore whether they're sorted or not and simply resort to Perl's sort. Default limit: 2^17B == 128kB.
The sorting modules are loaded at run-time on demand only.
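The selection logic described above can be sketched roughly as follows. This is only an illustration: `choose_strategy`, its argument names, the returned labels, and the order in which the checks are applied are all assumptions, not the module's actual code or API. The thresholds are passed in by the caller, and the values in the demo are illustrative rather than the module's defaults.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the module's API: pick a sort strategy
# from the total input size and the number of pre-sorted input files.
# The order of the checks is an assumption made for this sketch.
sub choose_strategy {
    my (%args) = @_;
    my $total  = $args{total_size};            # bytes, all inputs combined
    my $sorted = $args{sorted_files};          # number of pre-sorted inputs
    my $small  = $args{small_threshold};       # caller-supplied limit
    my $tiny   = $args{super_small_threshold}; # caller-supplied limit

    # Extremely small input: ignore sortedness and sort everything with sort().
    return 'perl_sort_everything'       if $total < $tiny;

    # Small input: load into memory and use Perl's sort.
    return 'perl_sort_in_memory'        if $total < $small;

    # Two or more sorted inputs: merge those first, then handle the rest.
    return 'merge_sorted_then_external' if $sorted >= 2;

    # General case: external sort (Sort::External in the module).
    return 'external_sort';
}

# Illustrative limits only; these are not the module's defaults.
my %limits = (small_threshold => 1_000_000, super_small_threshold => 100_000);

for my $case ([50_000, 3], [500_000, 0], [5_000_000, 2], [5_000_000, 1]) {
    my ($size, $n_sorted) = @$case;
    printf "%9d bytes, %d sorted file(s): %s\n", $size, $n_sorted,
        choose_strategy(total_size => $size, sorted_files => $n_sorted, %limits);
}
```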
Creates a new merger object.
Adds a file to the merging process. The first argument must be the file name, followed by an optional named argument 'sorted' (default: false), which affects how the data is merged. Sorted and unsorted files can be mixed; the output is sorted either way.
Pre-sorted files are naturally somewhat faster to merge.
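To illustrate why pre-sorted input is cheaper to handle, here is a minimal sketch of the merge step of a merge sort: two lists that are already sorted can be combined in a single pass, without sorting again. The helper name `merge_sorted` is hypothetical, and comparing lines with plain `cmp` is an assumption; the module's home-grown merge sort may differ in both respects.

```perl
use strict;
use warnings;

# Hypothetical helper: merge two already-sorted lists of lines into one
# sorted list in a single pass (the merge step of merge sort).
sub merge_sorted {
    my ($left, $right) = @_;
    my @out;
    my ($i, $j) = (0, 0);

    while ($i < @$left && $j < @$right) {
        # Plain string comparison (an assumption); take from the left on ties.
        if (($left->[$i] cmp $right->[$j]) <= 0) {
            push @out, $left->[$i++];
        }
        else {
            push @out, $right->[$j++];
        }
    }

    # One list is exhausted; append whatever remains of the other.
    push @out, @{$left}[$i .. $#{$left}], @{$right}[$j .. $#{$right}];
    return \@out;
}

my $merged = merge_sorted([qw(alpha gamma)], [qw(beta delta)]);
print "$_\n" for @$merged;
```

Each input line is looked at once, so the cost grows linearly with the total number of lines.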
Set this to the threshold under which the total size of the input files is to be considered small enough to be sorted in memory (see above). The default should be fine.
Set this to the threshold under which the total size of the input files is to be considered small enough to be sorted in memory regardless of whether the input was partly sorted (see above). The default should be fine.
This makes more sense than it sounds: Perl's sort function is fast, and for small amounts of data its low overhead matters far more than the algorithmic advantage of exploiting pre-sorted input.
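The in-memory path for small inputs amounts to something like the sketch below: read every line of every file and hand the whole list to Perl's built-in sort, regardless of whether any file was already sorted. The helper `sort_in_memory` is hypothetical, and plain string ordering is an assumption about how the lines are compared.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Hypothetical helper: slurp all lines of the given files and sort them
# with Perl's built-in sort (plain string order is an assumption).
sub sort_in_memory {
    my (@files) = @_;
    my @lines;
    for my $file (@files) {
        open my $fh, '<', $file or die "Cannot open $file: $!";
        push @lines, <$fh>;
        close $fh;
    }
    return [ sort @lines ];
}

# Demo with two throwaway files; UNLINK => 1 removes them at program exit.
my @files;
for my $content ("zeta\nalpha\n", "mu\nbeta\n") {
    my ($fh, $name) = tempfile(UNLINK => 1);
    print {$fh} $content;
    close $fh;
    push @files, $name;
}

print @{ sort_in_memory(@files) };
```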
You can use this to set the location of the temporary files that are used for sorting and merging large files. By default, they go into File::Spec->tmpdir().
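For reference, this is roughly what the default location looks like in practice. The sketch only uses File::Spec and File::Temp from the Perl core; the file name template `ctags-merge-XXXXXX` is made up for the example, and the module's own handling of temporary files may differ.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempfile);

# File::Spec->tmpdir() returns the system's temporary directory.
my $tmpdir = File::Spec->tmpdir();

# Create a scratch file there; UNLINK => 1 removes it at program exit.
# The template name is hypothetical.
my ($fh, $filename) = tempfile('ctags-merge-XXXXXX', DIR => $tmpdir, UNLINK => 1);
print {$fh} "example\ttag.pl\t/^sub example/\n";
close $fh;

print "temporary directory: $tmpdir\n";
print "scratch file:        $filename\n";
```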
Benchmark.
Exuberant ctags homepage: http://ctags.sourceforge.net/
Wikipedia on ctags: http://en.wikipedia.org/wiki/Ctags
Module that can produce ctags files from Perl code: Perl::Tags
Module that can parse exuberant ctags files: Parse::ExuberantCTags
Sorting modules: Sort::External, File::MergeSort (though we use a home-grown merge-sort)
File::PackageIndexer
Steffen Mueller, <smueller@cpan.org>
Copyright (C) 2009 by Steffen Mueller
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.6 or, at your option, any later version of Perl 5 you may have available.