Name

Text::Diff::HTML - XHTML format for Text::Diff::Unified

Synopsis

    use Text::Diff;

    my $diff = diff "file1.txt", "file2.txt", { STYLE => 'Text::Diff::HTML' };
    my $diff = diff \$string1,   \$string2,   { STYLE => 'Text::Diff::HTML' };
    my $diff = diff \*FH1,       \*FH2,       { STYLE => 'Text::Diff::HTML' };
    my $diff = diff \&reader1,   \&reader2,   { STYLE => 'Text::Diff::HTML' };
    my $diff = diff \@records1,  \@records2,  { STYLE => 'Text::Diff::HTML' };
    my $diff = diff \@records1,  "file.txt",  { STYLE => 'Text::Diff::HTML' };

Description

This class subclasses Text::Diff::Unified, a formatting class provided by the Text::Diff module, to add XHTML markup to the unified diff format. For details on the interface of the diff() function, see the Text::Diff documentation.

In the XHTML generated by this module, the output of diff() is wrapped in a <div> element, as is each hunk of the diff. Within each hunk, all content is HTML-encoded using HTML::Entities, and the sections of the diff are marked up with XHTML elements. The elements used are as follows:

  • <div class="file">

    This element contains the entire contents of the diff "file" returned by diff(). All of the following elements are nested inside it.

    • <span class="fileheader">

      The header section for the files being diffed, usually something like:

        --- in.txt    Thu Sep  1 12:51:03 2005
        +++ out.txt   Thu Sep  1 12:52:12 2005

      This element immediately follows the opening "file" <div> element.

    • <div class="hunk">

      This element contains a single diff "hunk". Each hunk may contain the following elements:

      • <span class="hunkheader">

        Header for a diff hunk. The hunk header is usually something like:

          @@ -1,5 +1,7 @@

        This element immediately follows the opening "hunk" <div> element.

      • <span class="ctx">

        Context around the changed part of a diff hunk. These are lines that did not change between the files being diffed.

      • <ins>

        Inserted content, each line starting with +.

      • <del>

        Deleted content, each line starting with -.

      • <span class="hunkfooter">

        The footer section of a hunk; it has no content.

    • <span class="filefooter">

      The footer section of a file; it has no content.
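To make the element structure concrete, here is a minimal sketch that diffs two in-memory strings and prints the resulting markup. The sample strings and variable names are assumptions made for illustration; they are not part of the module. Because the inputs are string references rather than files, the "fileheader" element is expected to be empty.

```perl
use strict;
use warnings;
use Text::Diff;

# Hypothetical sample inputs; any pair of sources accepted by diff() would do.
my $old = "alpha\n<b>beta</b>\ngamma\n";
my $new = "alpha\n<b>BETA</b>\ngamma\n";

# Request the XHTML style instead of the default unified style.
my $html = diff \$old, \$new, { STYLE => 'Text::Diff::HTML' };

# The output should contain a "file" <div> holding one "hunk" <div>, with
# the unchanged lines in "ctx" spans and the changed line in <del> and <ins>.
# Markup in the input is entity-encoded, so it displays literally.
print $html, "\n";
```

Note that the angle brackets in the sample text come out as &lt; and &gt;, so the diffed content cannot be mistaken for the module's own markup.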

You may do whatever you like with these elements and classes; I highly recommend that you style them using CSS. You'll find an example CSS file in the eg directory in the Text-Diff-HTML distribution. You will also likely want to wrap the output of your diff in its own element (a <div> will do) styled with "white-space: pre".
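As a rough sketch of the wrapping suggested above, the following helper places the diff inside a <div> styled with "white-space: pre". The helper name wrap_diff, the class name "diff", and the monospace font are assumptions chosen for this example; the CSS file in the eg directory remains the reference for styling.

```perl
use strict;
use warnings;
use Text::Diff;

# Hypothetical helper: wrap diff markup in a container that preserves
# the line breaks and indentation of the unified diff.
sub wrap_diff {
    my ($diff_html) = @_;
    return qq{<div class="diff" style="white-space: pre; font-family: monospace">}
         . $diff_html
         . qq{</div>\n};
}

# Hypothetical sample inputs.
my $old = "one\ntwo\n";
my $new = "one\n2\n";

my $wrapped = wrap_diff( diff( \$old, \$new, { STYLE => 'Text::Diff::HTML' } ) );
print $wrapped;
```

An inline style keeps the sketch self-contained; in a real page the same rule would more likely live in a stylesheet alongside rules for the ins, del, and ctx elements.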

See Also

Text::Diff
Algorithm::Diff

Support

This module is stored in an open GitHub repository. Feel free to fork and contribute!

Please file bug reports via GitHub Issues or by sending mail to bug-Text-Diff-HTML@rt.cpan.org.

Author

David E. Wheeler <david@justatheory.com>

Copyright and License

Copyright (c) 2005-2011 David E. Wheeler. Some Rights Reserved.

This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.