
NAME

App::Mowyw::Lexer - Simple Lexer

SYNOPSIS

    use App::Mowyw::Lexer qw(lex);
    # suppose you want to parse simple math expressions;
    # each entry is [token name, regex, optional callback]
    my @input_tokens = (
        ['Int',         qr/(?:-|\+)?\d+/],
        ['Op',          qr/\+|\*|-|\//],
        ['Brace_Open',  qr/\(/],
        ['Brace_Close', qr/\)/],
        # a callback that returns undef drops the token from the output
        ['Whitespace',  qr/\s/, sub { return undef; }],
    );
    my $text = "-12 * (3+4)";
    foreach (lex($text, \@input_tokens)) {
        my ($name, $text, $position, $line) = @$_;
        print "Found Token $name: '$text'\n";
        print "    at position $position line $line\n";
    }

DESCRIPTION

App::Mowyw::Lexer is a simple lexer that breaks up a text into tokens according to regexes you provide.

The only exported subroutine is lex. It expects the input text as its first argument and an array reference as its second argument; that array contains one array reference per token definition, each holding a token name and a regex.

Each input token consists of a token name (which you can choose freely), a regex which matches the desired token, and optionally a reference to a function that takes the matched token text as its argument. The token text is replaced by the return value of that function. If the function returns undef, that token will not be included in the list of output tokens.
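As a rough sketch of how the optional callback might be used, the table below turns matched integers into numbers and drops whitespace. The sample strings and the loop are assumptions made for illustration: the loop calls each callback by hand to show the value that, per the description above, would replace the token text. It does not call lex itself.

```perl
use strict;
use warnings;

# Token table in the documented format: [name, regex, optional callback].
my @input_tokens = (
    ['Int',        qr/(?:-|\+)?\d+/, sub { return $_[0] + 0 }],  # "+12" becomes 12
    ['Whitespace', qr/\s/,           sub { return undef }],      # undef drops the token
);

# Illustration only: invoke each callback directly on a sample match.
my %sample = (Int => '+12', Whitespace => ' ');
my %result;
for my $token (@input_tokens) {
    my ($name, $regex, $callback) = @$token;
    $result{$name} = $callback->($sample{$name});
    printf "%s: %s\n", $name,
        defined $result{$name} ? $result{$name} : '(dropped)';
}
```

Doing the conversion in the callback keeps the code that consumes the tokens free of per-token cleanup.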

lex returns a list of output tokens. Each output token is a reference to an array which contains the token name, the matched text, the position of the match in the input string (zero-based, suitable for passing to substr), and the line number of the start of the match (one-based, suitable for humans).
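The layout of an output token can be illustrated with a hand-written example. The token values below are assumed for the sake of the sketch and are not actual output of lex; the point is only that the zero-based position can be passed straight to substr.

```perl
use strict;
use warnings;

my $text = "-12 * (3+4)";

# Hand-written token in the documented layout [name, text, position, line];
# the values are assumed for illustration, not produced by lex.
my $token = ['Brace_Open', '(', 6, 1];

my ($name, $matched, $position, $line) = @$token;

# The zero-based position works directly as a substr offset.
my $slice = substr($text, $position, length $matched);
print "$name '$slice' at offset $position, line $line\n";
```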

If there is unmatched text, it is returned with the token name UNMATCHED.
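One possible way to report unmatched input is to filter the returned list by token name. In this sketch the token list is hand-written and merely stands in for the return value of lex; its contents are hypothetical.

```perl
use strict;
use warnings;

# Hand-written token list standing in for the return value of lex.
my @tokens = (
    ['Int',       '12', 0, 1],
    ['UNMATCHED', '?',  2, 1],
    ['Int',       '7',  3, 1],
);

# Keep only the tokens that no regex matched.
my @unmatched = grep { $_->[0] eq 'UNMATCHED' } @tokens;

for my $token (@unmatched) {
    my (undef, $text, $position, $line) = @$token;
    warn "Cannot lex '$text' at position $position, line $line\n";
}
```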

COPYRIGHT AND LICENSE

Copyright (C) 2007,2009 by Moritz Lenz, http://perlgeek.de/, moritz@faui2k3.org

This Program and its Documentation is free software. You may distribute it under the terms of the Artistic License 2.0 as published by The Perl Foundation.

However, all code examples are in the public domain, so you can use them in any way you want.
