String::Comments::Extract - Extract comments from C/C++/JavaScript/Java source
version 0.023
use String::Comments::Extract; my $source = <<_END_ /* A Hello World program Copyright Ty Coon // ...and Buckaroo Banzai "Yoyodyne"*/ void main() { printf("Hello, World.\n"); printf("/* This is not a real comment */"); printf("// Neither is this */"); // But this is } // Last comment _END_ my $comments = String::Comments::Extract::C->extract($source) # ... returns the following result: /* A Hello World program Copyright Ty Coon // ...and Buckaroo Banzai "Yoyodyne"*/ // But this is // Last comment my @comments = String::Comments::Extract::C->collect($source) # ... returns the following list: ( ' A Hello World program Copyright Ty Coon // ...and Buckaroo Banzai "Yoyodyne"', ' But this is', ' Last comment', )
String::Comments::Extract is a tool for extracting comments from C/C++/JavaScript/Java source. The extractor is implemented using an actual tokenizer (written in C via XS [adapted from JavaScript::Minifier::XS]). By using a tokenizer, it can correctly deal with notoriously problematic cases, such as comment-like structures embedded in strings:
std::cout << "This is not a // real C++ comment " << std::endl printf("/* This is not a real C comment */\n"); # The extractor will ignore both of the above
String::Comments::Extract considers C/C++/JavaScript/Java comment structures the same, so, for now, it doesn't really matter which method you use (this means it will not complain about C++ style comments in C source).
The language agnostic interface to C/C++/JavaScript/Java comment extractor is accessible via String::Comments::Extract::SlashStar
# Can handle slash-star (/* */) and slash-slash (//) comments String::Comments::Extract::SlashStar->extract String::Comments::Extract::SlashStar->collect
Returns a string representing the comments in <source>
Comment delimeters ( /* */ // ) are left in as-is
/* */ //
Whitespace of <source> is otherwise preserved, so you'll probably have to do some post-processing to get rid of some cruft.
Returns a list containing an item for each block- or line-comment in <source>
Comment delimeters ( /* */ // ) around the comment are removed
Whitespace outside of comments may not be preserved exactly
File::Comments
Robert Krimen <robertkrimen@gmail.com>
This software is copyright (c) 2010 by Robert Krimen.
This is free software; you can redistribute it and/or modify it under the same terms as the Perl 5 programming language system itself.
To install String::Comments::Extract, copy and paste the appropriate command in to your terminal.
cpanm
cpanm String::Comments::Extract
CPAN shell
perl -MCPAN -e shell install String::Comments::Extract
For more information on module installation, please visit the detailed CPAN module installation guide.