SYNOPSIS

The goal of this module is to ease the drudgery of string cleaning by letting data structures define what to clean and how to clean it.

EXAMPLES

   use String::Clean;

   my $clean = String::Clean->new();

   $clean->replace( { this => 'that', is => 'was' } , 'this is a test' ); 
      # returns 'that was a test'
   
   # see the tests for more examples 

THE OPTIONS HASH

Each function can take an optional hash that changes its behaviour. This hash can be passed to new, where it sets the defaults, or to each call as needed.

   opt :
         Any regex options that you want to pass, e.g. {opt => 'i'} will allow
         for case-insensitive manipulation.

   replace :
         If the value is set to 'word' then the replace function will look for
         whole words instead of just a sequence of characters.
         Example:

            replace( { is => 'was' },
                     'this is a test',
                   ); 

            returns 'thwas was a test', whereas

            replace( { is => 'was' },
                     'this is a test',
                     { replace => 'word' },
                   ); 

            will return 'this was a test' 

   strip :
         Just like replace, if the value is set to 'word' then strip will look
         for whole words instead of just a sequence of characters.

   word_boundary :
         Hook to change what String::Clean uses as the word boundary; by
         default it uses '\b'. Mainly this allows String::Clean to
         deal with strings like 'this,is,a,test'.

   escape :
         If this is set to 'no' then String::Clean will not try to escape any 
         of the things that you've asked it to look for.  
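
The difference between substring and word matching described above can be sketched in plain Perl. This is only an illustration under assumptions, not String::Clean's actual implementation: sketch_replace is a hypothetical helper, and the way it combines the opt, replace, word_boundary and escape options is a guess based on the descriptions in this section.

```perl
use strict;
use warnings;

# Hypothetical helper, NOT part of String::Clean: applies each key => value
# pair of %$map to $string, using option names borrowed from this section.
sub sketch_replace {
    my ( $map, $string, $opts ) = @_;
    $opts ||= {};
    my $boundary = defined $opts->{word_boundary} ? $opts->{word_boundary} : '\b';
    my $flags    = defined $opts->{opt}           ? $opts->{opt}           : '';

    for my $find ( sort keys %$map ) {
        # escape => 'no' leaves regex metacharacters in the key untouched
        my $pattern = ( $opts->{escape} || '' ) eq 'no' ? $find : quotemeta $find;

        # replace => 'word' anchors the pattern on both sides
        $pattern = $boundary . $pattern . $boundary
            if ( $opts->{replace} || '' ) eq 'word';

        # Inline modifiers such as (?i:...) carry the regex options
        my $regex = '(?' . $flags . ':' . $pattern . ')';
        $string =~ s/$regex/$map->{$find}/g;
    }
    return $string;
}

my $loose = sketch_replace( { is => 'was' }, 'this is a test' );
my $word  = sketch_replace( { is => 'was' }, 'this is a test', { replace => 'word' } );
print "$loose\n";    # thwas was a test
print "$word\n";     # this was a test
```

The keys are sorted here only to make the sketch deterministic; the order in which a Perl hash yields its keys is otherwise not fixed.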

You can also override options for an individual call. The per-call hash is merged with the options given to new, so any option you do not override still applies. For example:

   my $clean = String::Clean->new({replace => 'word', opt => 'i'});
   $clean->strip( [qw{a}], 'an Array', {replace =>'non-word'} );
   #returns 'n rray' because opt => 'i' was pulled in from the options at new.
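
One way to picture that merge is a plain hash merge in which the per-call values win. The helper name merged_options is hypothetical, and this is an assumption about the behaviour rather than the module's own code.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the merge described above, NOT String::Clean code.
sub merged_options {
    my ( $defaults, $call_opts ) = @_;

    # Later pairs win, so per-call values override the constructor defaults
    return { %{ $defaults || {} }, %{ $call_opts || {} } };
}

my $effective = merged_options(
    { replace => 'word', opt => 'i' },    # options passed to new
    { replace => 'non-word' },            # options passed to a single call
);

# replace was overridden; opt was inherited from the defaults
print join( ', ', map { "$_ => $effective->{$_}" } sort keys %$effective ), "\n";
# opt => i, replace => non-word
```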
 

CORE FUNCTIONS

new

The only thing exciting here is that you can pass the same options hash at construction, and this will cascade down to each function call.

replace

Takes a hash where the key is what to look for and the value is what to replace the key with.

   replace( $hash, $string, $opts );

replace_word

A shortcut that does the same thing as passing {replace => 'word'} to replace.

   replace_word( $hash, $string, $opts ); 

strip

Takes an arrayref of items to completely remove from the string.

   strip( $list, $string, $opt );

strip_word

A shortcut that does the same thing as passing {strip => 'word'} to strip.

   strip_word( $list, $string, $opt);

WRAPPING THINGS UP AND USING YAML

clean_by_yaml

Because there are two basic functions that take two separate data types (a hash for replace, an array for strip), why not wrap them up? Enter YAML.

   clean_by_yaml( $yaml, $string, $opt );

But how do we do that? Here's an example:

OLD CODE

   $string = 'this is still just a example for the YAML stuff';
   $string =~ s/this/that/;
   $string =~ s/is/was/;
   $string =~ s/\ba\b/an/;
   $string =~ s/still//;
   $string =~ s/for/to explain/;
   $string =~ s/\s\s/ /g;
   # 'that was just an example to explain the YAML stuff'

NEW CODE

   $string = 'this is still just a example for the YAML stuff';
   $yaml = q{
   ---
   this : that
   is   : was
   a    : an
   ---
   - still
   ---
   for : to explain
   '  ': ' '
   };
   $string = $clean->clean_by_yaml( $yaml, $string, { replace => 'word' } );
   # 'that was just an example to explain the YAML stuff'
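
To make the idea concrete without depending on a YAML parser, the sketch below uses plain Perl data in place of the three parsed YAML documents. It assumes that a hash document is handled as a replace and a list document as a strip, which is inferred from the function descriptions above; sketch_clean_by_steps is a hypothetical name, not part of String::Clean.

```perl
use strict;
use warnings;

# Hypothetical sketch: each element of @$steps stands in for one parsed YAML
# document. A hashref is treated as a replace map, an arrayref as a strip list.
sub sketch_clean_by_steps {
    my ( $steps, $string ) = @_;

    for my $step (@$steps) {
        if ( ref $step eq 'HASH' ) {
            for my $find ( sort keys %$step ) {
                # Word-mode replace: anchor the escaped key with \b
                my $pattern = '\b' . quotemeta($find) . '\b';
                $string =~ s/$pattern/$step->{$find}/g;
            }
        }
        elsif ( ref $step eq 'ARRAY' ) {
            for my $find (@$step) {
                # Strip: remove every occurrence of the escaped item
                my $pattern = quotemeta $find;
                $string =~ s/$pattern//g;
            }
        }
    }
    return $string;
}

my $steps = [
    { 'this' => 'that', 'is' => 'was', 'a' => 'an' },
    ['still'],
    { 'for' => 'to explain', '  ' => ' ' },
];

my $cleaned = sketch_clean_by_steps( $steps,
    'this is still just a example for the YAML stuff' );
print "$cleaned\n";    # that was just an example to explain the YAML stuff
```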

ISSUES TO WATCH FOR:

  • Order matters:

    As you can see in the example, there are three separate YAML documents. This allows replacements to be done in a specific sequence if that is needed. In the example above the order would not have mattered much; here is a better example:

       #swap all instances of 'ctrl' and 'alt' 
       $yaml = q{
       ---
       ctrl : __was_ctrl__
       ---
       alt  : ctrl
       ---
       __was_ctrl__ : alt
       };

  • Options are global to the YAML doc:

    If you need separate options applied to separate sets, they will have to happen as separate calls.
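
The reason the 'ctrl'/'alt' swap above needs three ordered passes can be checked with ordinary substitutions. This sketch does not use String::Clean at all; the sample string and variable names are made up for illustration.

```perl
use strict;
use warnings;

# Each inner pair is one pass; passes run strictly in the order listed.
my @passes = (
    [ 'ctrl',         '__was_ctrl__' ],
    [ 'alt',          'ctrl' ],
    [ '__was_ctrl__', 'alt' ],
);

my $keys = 'press ctrl then alt';
for my $pass (@passes) {
    my ( $find, $with ) = @$pass;
    $keys =~ s/\Q$find\E/$with/g;
}
print "$keys\n";    # press alt then ctrl

# Without the placeholder pass, the first substitution destroys the
# information the second one needs:
my $naive = 'press ctrl then alt';
$naive =~ s/ctrl/alt/g;    # 'press alt then alt'
$naive =~ s/alt/ctrl/g;    # 'press ctrl then ctrl'
print "$naive\n";          # press ctrl then ctrl
```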

AUTHOR

ben hengst, <notbenh at CPAN.org>

BUGS

Please report any bugs or feature requests to bug-string-clean at rt.cpan.org, or through the web interface at http://rt.cpan.org/NoAuth/ReportBug.html?Queue=String-Clean. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.

SUPPORT

You can find documentation for this module with the perldoc command.

    perldoc String::Clean

You can also look for information at:

ACKNOWLEDGEMENTS

Lindsey Kuper and Jeff Griffin for giving me a reason to cook up this scheme.

COPYRIGHT & LICENSE

Copyright 2007 ben hengst, all rights reserved.

This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.