NAME

Unicode::Util - Unicode grapheme-level versions of built-in Perl functions

VERSION

This document describes Unicode::Util version 0.07.

SYNOPSIS

    use Unicode::Util qw( grapheme_length grapheme_reverse );

    # grapheme cluster ю́ (Cyrillic small letter yu, combining acute accent)
    my $grapheme = "\x{044E}\x{0301}";

    say length($grapheme);           # 2 (length in code points)
    say grapheme_length($grapheme);  # 1 (length in grapheme clusters)

    # Spın̈al Tap; n̈ = Latin small letter n, combining diaeresis
    my $band = "Sp\x{0131}n\x{0308}al Tap";

    say scalar reverse $band;     # paT länıpS
    say grapheme_reverse($band);  # paT lan̈ıpS

DESCRIPTION

This module provides Unicode grapheme cluster–level versions of Perl’s built-in string functions, tailored to work on grapheme clusters as opposed to code points or bytes.

This is an early release and major revisions are planned for the near future.

FUNCTIONS

Each function can be exported by name, or all of them can be exported at once with the :all tag.

grapheme_length($string)

Returns the length of the given string in grapheme clusters. This is the closest match to the number of “characters” that most people would count in a printed string.
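
To illustrate what counting grapheme clusters means, here is a minimal sketch that uses only core Perl's \X regex escape, which matches one extended grapheme cluster. The helper count_clusters is hypothetical and is not part of this module; it is not necessarily how grapheme_length is implemented, and results may differ in edge cases.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Unicode::Util): counts extended
# grapheme clusters by matching \X repeatedly across the string.
sub count_clusters {
    my ($string) = @_;
    my $count = () = $string =~ /\X/g;   # force list context, then count matches
    return $count;
}

# Cyrillic small letter yu followed by a combining acute accent
my $yu = "\x{044E}\x{0301}";

print length($yu), "\n";          # 2 (code points)
print count_clusters($yu), "\n";  # 1 (grapheme clusters)
```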

grapheme_chop($string)

Returns the given string with the last grapheme cluster chopped off. Does not modify the original value, unlike the built-in chop.
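
The contrast with the built-in chop can be sketched with core Perl alone. The helper chop_cluster below is hypothetical and only approximates the behavior described above using \X; it is not taken from this module's source.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Unicode::Util): returns a copy of the
# string without its last extended grapheme cluster.
sub chop_cluster {
    my ($string) = @_;
    my @clusters = $string =~ /\X/g;   # split into grapheme clusters
    pop @clusters;                     # drop the last cluster, if any
    return join '', @clusters;
}

# "cafe" followed by a combining acute accent: 5 code points, 4 clusters
my $word = "cafe\x{0301}";

# Cluster-level chop returns a new value; $word itself is unchanged.
my $by_cluster = chop_cluster($word);   # "caf"

# The built-in chop modifies its argument in place and removes only the
# last code point, leaving the base letter "e" behind.
my $copy = $word;
chop $copy;                             # $copy is now "cafe"
```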

grapheme_reverse($string)

Returns the given string with its grapheme clusters in reverse order.
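
As a rough illustration of why cluster-level reversal keeps combining marks attached to their base letters, the following sketch reverses a list of \X matches. The helper reverse_clusters is hypothetical, relies only on core Perl, and is an assumption about equivalent behavior rather than this module's actual code.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Unicode::Util): reverses a string one
# extended grapheme cluster at a time.
sub reverse_clusters {
    my ($string) = @_;
    return join '', reverse $string =~ /\X/g;
}

# "n" followed by a combining diaeresis forms a single cluster
my $band = "Sp\x{0131}n\x{0308}al Tap";

# Code point reversal moves the diaeresis onto the "a" ...
my $by_code_point = scalar reverse $band;

# ... while cluster reversal keeps it on the "n".
my $by_cluster = reverse_clusters($band);
```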

TODO

grapheme_substr, grapheme_index, grapheme_rindex, canonical_eq, compatibility_eq

SEE ALSO

Unicode::GCString, String::Multibyte, Perl6::Str, http://perlcabal.org/syn/S32/Str.html

AUTHOR

Nick Patch <patch@cpan.org>

COPYRIGHT AND LICENSE

© 2011–2013 Nick Patch

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.