NAME

B::Graph - Perl compiler backend to produce graphs of OP trees

SYNOPSIS

  perl -MO=Graph,-text prog.pl >graph.txt

  perl -MO=Graph,-vcg prog.pl >graph.vcg
  xvcg graph.vcg

  perl -MO=Graph,-dot prog.pl | dot -Tps >graph.ps

DESCRIPTION

This module is a backend to the perl compiler (B::*) which, instead of outputting bytecode or C based on perl's compiled version of a program, writes descriptions in graph-description languages specifying graphs that show the program's structure. It currently generates descriptions for the VCG tool (http://www.cs.uni-sb.de/RW/users/sander/html/gsvcg1.html) and Dot (part of the graph visualization toolkit from AT&T: http://www.research.att.com/sw/tools/graphviz/). It can also produce plain text output (which is more useful for debugging the module itself than anything else, though you might be able to cut the nodes out and make a mobile or something similar).

OPTIONS

Like any other compiler backend, this module needs to be invoked using the O module to run correctly:

  perl -MO=Graph,-opt,-opt,-opt program.pl
  OR
  perl -MO=Graph,-opt,obj -e 'BEGIN {$obj = ["hi"]}; print $obj'
  OR EVEN
  perl -e 'use O qw(Graph -opt obj obj); print "hi!\n";'

Obj is the name of a perl variable whose contents will be examined. It can't be a my() variable, and it shouldn't have a prefix symbol ('$@^*'), though you can specify a package -- the name will be used to look up a GV, whose various fields will lead to the scalar, array, and other values that correspond to the named variable. If no object is specified, the whole main program, including the CV that points to its pad, will be displayed.
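
The name-to-GV lookup described above can be illustrated with the core B module alone. This is a minimal sketch, not B::Graph's own code: the helper gv_for_name is hypothetical, and defaulting an unqualified name to package main is an assumption made for the example.

```perl
use strict;
use warnings;
use B qw(svref_2object class);

our $obj = ["hi"];    # a package variable, so it lives in a GV (my() variables do not)

# Hypothetical helper: turn a sigil-less name such as "obj" or "Foo::bar"
# into the B::GV object for that glob. Unqualified names are assumed to
# belong to package main.
sub gv_for_name {
    my ($name) = @_;
    $name = "main::$name" unless $name =~ /::/;
    no strict 'refs';
    return svref_2object(\*{$name});
}

my $gv = gv_for_name('obj');
printf "%s *%s::%s, scalar slot holds a %s\n",
    class($gv), $gv->STASH->NAME, $gv->NAME, class($gv->SV);
```

From the GV, the SV, AV, HV and CV methods lead to the individual slots, which is roughly the set of edges a graph of the variable would follow.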

Each of the opts can come from one of the following (each set is mutually exclusive; case and underscores are insignificant):

-text, -vcg, -dot

Produce output of the appropriate type. The default is '-text', which isn't useful for much of anything (it does draw some nice ASCII boxes, though).

-addrs, -no_addrs

Each of the nodes on the graph produced corresponds to a C structure that has an address and includes pointers to other structures. The module uses these addresses to decide how to draw edges, but it makes the graph more compact if they aren't printed. The default is '-no_addrs'.

-compile_order, -run_order

The collection of OPs that perl compiles a script into has two different layers of structure. It has a tree structure which corresponds roughly to the syntactic nesting of constructs in the source text, and a roughly linked-list representation, essentially a postorder traversal of this tree, which is used at runtime to decide what to do next. The graph can be drawn to emphasize one structure or the other. The former, 'compile_order', is the default, as it tends to lead to graphs with aspect ratios close to those of standard paper.
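
The two structures can be seen side by side with a short script built on the core B module. This is only a sketch under stated assumptions: the sub demo and the helpers tree_order and run_order are hypothetical names, and the run-order walk follows op_next only, ignoring the extra successors that branching OPs have.

```perl
use strict;
use warnings;
use B qw(svref_2object OPf_KIDS);

sub demo { my $x = shift; return $x + 1 }    # hypothetical sub to inspect

# Compile order: recurse through op_first/op_sibling, i.e. the syntax tree.
sub tree_order {
    my ($op, $depth, $out) = @_;
    push @$out, ('  ' x $depth) . $op->name;
    if ($op->flags & OPf_KIDS) {
        for (my $kid = $op->first; $$kid; $kid = $kid->sibling) {
            tree_order($kid, $depth + 1, $out);
        }
    }
    return $out;
}

# Run order: follow op_next from the start OP until the chain ends.
# (Branching OPs also have other successors, which this sketch ignores.)
sub run_order {
    my ($op) = @_;
    my @names;
    for (; $$op; $op = $op->next) {
        push @names, $op->name;
    }
    return \@names;
}

my $cv = svref_2object(\&demo);
print "$_\n" for @{ tree_order($cv->ROOT, 0, []) };
print join(' -> ', @{ run_order($cv->START) }), "\n";
```

The tree listing starts at the root (leavesub for an ordinary sub), while the run-order chain ends there, which reflects the postorder relationship between the two views.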

-SVs, -no_SVs

If OPs represent a program's compiled code, SVs represent its data. This includes literal numbers and strings (IVs, NVs, PVs, PVIVs, and PVNVs), regular arrays, hashes, and references (AVs, HVs, and RVs), but also the structures that correspond to individual variables (special HVs for symbol tables and GVs to represent values within them, and special AVs that hold my() variables (as well as compiler temporaries)), structures that keep track of code (CVs), and a variety of others. The default is to display all these too, to give a complete picture, but if you aren't in a holistic mood, you can make them disappear.
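
To get a feel for which structure stands behind which kind of value, the core B module can report the class of any SV. The following is a small sketch, independent of B::Graph; sv_kind and the sample names are hypothetical.

```perl
use strict;
use warnings;
use B qw(svref_2object class);

# Hypothetical helper: map a reference to the B class name ("IV", "AV", ...)
# of the structure it points at.
sub sv_kind { return class(svref_2object($_[0])) }

our $pkg_var = 1;    # gives us a glob to look at

my %samples = (
    'integer literal' => \42,
    'float literal'   => \1.5,
    'string literal'  => \"hi",
    'array'           => [],
    'hash'            => +{},
    'code'            => sub { 1 },
    'glob'            => \*pkg_var,
);
printf "%-16s %s\n", $_, sv_kind($samples{$_}) for sort keys %samples;
```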

-ellipses, -rhombs

The module tries to give the nodes representing SVs a different shape from those of OPs. OPs are usually rectangular, so two obvious shapes for SVs are ellipses and rhombuses (stretched diamonds). This option currently only makes a difference for VCG (ellipse is the default).

-stashes, -no_stashes

The hashes that perl uses to represent symbol tables are called 'stashes'. Since every GV has a pointer back to its stash, it's virtually inevitable for the links in a graph to lead to the main stash. Unfortunately stashes, especially the main one, can be quite big, and lead to forests of other structures -- there's one GV and another SV for each magic variable, plus all of @INC and %ENV, and so on. To prevent information overload, then, the display of stashes is disabled by default.

-fileGVs, -no_fileGVs

Another kind of graph element that can be annoying is the set of pointers from every GV and COP (a kind of OP that occurs for every statement) to the GV that represents the file from which that code came (used for error messages). By default, these links aren't shown, to keep them from cluttering the graph. Also, perl's internal interfaces changed in a recent version, so in perl 5.005_63 or later you can't see the fileGVs at all.

-SEQs, -no_SEQs

As it is visited in the peephole optimization phase, each OP gets a sequence number, which isn't currently used by anything (except the peephole optimizer, to avoid visiting OPs twice). If you want to see these, ask for them. (COPs have their own sequence numbers too, but they're more interesting to look at -- for instance, they're used to bound the lifetimes of lexicals).

-types, -no_types

B::Graph always gives the type of each OP symbolically ('entersub'), but it can also print the numeric value of the type field, if you want. The default is no_types.

-float, -no_float

Almost every OP has an op_next and an op_sibling pointer, and B::Graph colors them distinctively (pink and light blue, respectively). Because of this, it isn't strictly necessary to 'anchor' the arrow on a line in the OP's box saying 'op_next'. The float option lets the graph layout engine start these arrows wherever it wants, which can sometimes lead to a more pleasing layout, at the expense of being less obvious. The default is not to float.

Lexical (my()) variables and temporary values used by individual OPs are stored in 'pads', per-code arrays linked to the CV. OPs store indexes into these arrays in the 'op_targ' field, but B::Graph can often also draw links directly from the OP to the SV that stores the name of the variable. These links don't correspond to any real pointers, however, and they can make the graph more complicated, so they are disabled by default.
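
The relationship between op_targ and the pad can be explored with the core B module. The sketch below is not how B::Graph draws these links; it only resolves each OP's op_targ index against the pad's list of names. The names demo, collect_targs and targ_names are hypothetical, and the checks on each pad entry are assumptions meant to cope with differences between perl versions.

```perl
use strict;
use warnings;
use B qw(svref_2object class OPf_KIDS);

sub demo { my $count = shift; return $count }    # hypothetical sub to inspect

# Walk the OP tree and record "opname -> variable" wherever op_targ
# indexes a named pad entry.
sub collect_targs {
    my ($op, $names, $found) = @_;
    # A nulled OP reuses op_targ to remember its old type, so skip those.
    if ($op->name ne 'null' && $op->targ) {
        my $entry = $names->[ $op->targ ];
        if ($entry && class($entry) ne 'SPECIAL' && $entry->can('PV')) {
            my $name = $entry->PV;
            push @$found, $op->name . ' -> ' . $name
                if defined $name && length $name;
        }
    }
    if ($op->flags & OPf_KIDS) {
        for (my $kid = $op->first; $$kid; $kid = $kid->sibling) {
            collect_targs($kid, $names, $found);
        }
    }
    return $found;
}

sub targ_names {
    my ($cv) = @_;
    my ($names) = $cv->PADLIST->ARRAY;    # first padlist entry holds the names
    return @{ collect_targs($cv->ROOT, [ $names->ARRAY ], []) };
}

print "$_\n" for targ_names(svref_2object(\&demo));
```

Entries without a name are temporaries, which is why only some op_targ values can be turned into a link to a variable name.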

WHAT DOES THIS ALL MEAN?

SvFLAGS abbreviations

    Pb     SVs_PADBUSY   reserved for tmp or my already
    Pt     SVs_PADTMP    in use as tmp
    Pm     SVs_PADMY     in use as a "my" variable
    T      SVs_TEMP      string is stealable?
    O      SVs_OBJECT    is "blessed"
    Mg     SVs_GMG       has magical get method
    Ms     SVs_SMG       has magical set method
    Mr     SVs_RMG       has random magical methods
    I      SVf_IOK       has valid public integer value
    N      SVf_NOK       has valid public numeric (float) value
    P      SVf_POK       has valid public pointer (string) value
    R      SVf_ROK       has a valid reference pointer
    F      SVf_FAKE      glob or lexical is just a copy
    L      SVf_OOK       has valid offset value (mnemonic: lvalue)
    B      SVf_BREAK     refcnt is artificially low
    Ro     SVf_READONLY  may not be modified
    i      SVp_IOK       has valid non-public integer value
    n      SVp_NOK       has valid non-public numeric value
    p      SVp_POK       has valid non-public pointer value
    S      SVp_SCREAM    has been studied?
    V      SVf_AMAGIC    has magical overloaded methods
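
As an illustration of how such abbreviations relate to the flags word, here is a sketch that decodes a few of them with the core B module. It covers only a subset of the table (I, N, P and R), because the set of flags and their bit values differ between perl versions; sv_flag_abbrevs is a hypothetical helper, not part of B::Graph.

```perl
use strict;
use warnings;
use B qw(svref_2object SVf_IOK SVf_NOK SVf_POK SVf_ROK);

# Abbreviation => bit mask, for a small subset of the table above.
my @flag_table = (
    [ I => SVf_IOK ],
    [ N => SVf_NOK ],
    [ P => SVf_POK ],
    [ R => SVf_ROK ],
);

# Hypothetical helper: abbreviations of the flags set on the referenced SV.
sub sv_flag_abbrevs {
    my ($ref) = @_;
    my $flags = svref_2object($ref)->FLAGS;
    return join ',', map { $_->[0] } grep { $flags & $_->[1] } @flag_table;
}

my $aref = [];
printf "%-8s %s\n", @$_ for
    [ '42',    sv_flag_abbrevs(\42) ],
    [ '"hi"',  sv_flag_abbrevs(\"hi") ],
    [ '$aref', sv_flag_abbrevs(\$aref) ];
```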

op_flags abbreviations

    V      OPf_WANT_VOID    Want nothing (void context)
    S      OPf_WANT_SCALAR  Want single value (scalar context)
    L      OPf_WANT_LIST    Want list of any length (list context)
    K      OPf_KIDS         There is a firstborn child.
    P      OPf_PARENS       This operator was parenthesized.
                             (Or block needs explicit scope entry.)
    R      OPf_REF          Certified reference.
                             (Return container, not containee).
    M      OPf_MOD          Will modify (lvalue).
    T      OPf_STACKED      Some arg is arriving on the stack.
    *      OPf_SPECIAL      Do something weird for this op (see op.h)
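
One detail worth spelling out is that the three context flags share a two-bit field, so they have to be decoded by masking rather than by testing single bits. The sketch below shows this with constants from the core B module; op_flag_abbrevs and demo are hypothetical names, and only a subset of the flags above is handled.

```perl
use strict;
use warnings;
use B qw(svref_2object
         OPf_WANT OPf_WANT_VOID OPf_WANT_SCALAR OPf_WANT_LIST
         OPf_KIDS OPf_REF OPf_STACKED OPf_SPECIAL);

# Hypothetical helper: render an op_flags value using the abbreviations above.
sub op_flag_abbrevs {
    my ($flags) = @_;
    my @out;
    # Context is a two-bit field: compare the masked value instead of
    # testing single bits, since list context sets both bits.
    my $want = $flags & OPf_WANT;
    if    ($want == OPf_WANT_VOID)   { push @out, 'V' }
    elsif ($want == OPf_WANT_SCALAR) { push @out, 'S' }
    elsif ($want == OPf_WANT_LIST)   { push @out, 'L' }
    push @out, 'K' if $flags & OPf_KIDS;
    push @out, 'R' if $flags & OPf_REF;
    push @out, 'T' if $flags & OPf_STACKED;
    push @out, '*' if $flags & OPf_SPECIAL;
    return join '', @out;
}

sub demo { return 1 }    # hypothetical sub to inspect
my $root = svref_2object(\&demo)->ROOT;
printf "%s: %s\n", $root->name, op_flag_abbrevs($root->flags);
```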

BUGS

VCG has a problem with boxes that have more than about 55 arrows coming out of them, so with large arrays and hashes B::Graph will stop outputting edges and some boxes may be disconnected.

AUTHOR

Stephen McCamant <smcc@CSUA.Berkeley.EDU>

SEE ALSO

dot(1), xvcg(1), perl(1), perlguts(1).

If you like B::Graph, you might also be interested in Gisle Aas's PerlGuts Illustrated, at http://gisle.aas.no/perl/illguts/.