Text::Tradition::Collation - a software model for a text collation
    use Text::Tradition;
    my $t = Text::Tradition->new(
        'name'  => 'this is a text',
        'input' => 'TEI',
        'file'  => '/path/to/tei_parallel_seg_file.xml'
    );

    my $c = $t->collation;
    my @readings = $c->readings;
    my @paths = $c->paths;
    my @relationships = $c->relationships;

    my $svg_variant_graph = $t->collation->as_svg();
Text::Tradition is a library for representation and analysis of collated texts, particularly medieval ones. The Collation is the central feature of a Tradition, where the text, its sequence of readings, and its relationships between readings are actually kept.
The constructor. Takes a hash or hashref of the following arguments:
tradition - The Text::Tradition object to which the collation belongs. Required.
linear - Whether the collation should be linear; that is, whether transposed readings should be treated as two linked readings rather than one, and therefore whether the collation graph is acyclic. Defaults to true.
baselabel - The default label for the path taken by a base text (if any). Defaults to 'base text'.
wit_list_separator - The string to join a list of witnesses for purposes of making labels in display graphs. Defaults to ', '.
ac_label - The extra label to tack onto a witness sigil when representing another layer of path for the given witness - that is, when a text has more than one possible reading due to scribal corrections or the like. Defaults to ' (a.c.)'.
wordsep - The string used to separate words in the original text. Defaults to ' '.
Simple accessors for collation attributes.
The meta-reading at the start of every witness path.
The meta-reading at the end of every witness path.
Returns all Reading objects in the graph.
Returns the Reading object corresponding to the given ID.
Adds a new reading object to the collation. See Text::Tradition::Collation::Reading for the available arguments.
Removes the given reading from the collation, implicitly removing its paths and relationships.
Predicate to see whether a given reading ID is in the graph.
Returns a list of sigils whose witnesses contain the reading.
Returns all reading paths within the document - that is, all edges in the collation graph. Each path is an arrayref of [ $source, $target ] reading IDs.
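Because each path is a plain [ $source, $target ] pair, the list is easy to turn into a lookup structure. The following is a minimal sketch that builds a successor map from such pairs; the reading IDs are invented for illustration, and the sample list stands in for what the paths accessor would return.

```perl
use strict;
use warnings;

# Hypothetical sample data: each path is [ $source, $target ], as described
# above. The IDs 'start', 'end' and 'r1'..'r3' are invented.
my @paths = (
    [ 'start', 'r1' ],
    [ 'r1',    'r2' ],
    [ 'r1',    'r3' ],
    [ 'r2',    'end' ],
    [ 'r3',    'end' ],
);

# Build a successor map: reading ID => list of reading IDs that follow it.
my %successors;
for my $path (@paths) {
    my ( $source, $target ) = @$path;
    push @{ $successors{$source} }, $target;
}

# Sort each successor list so that the output is stable.
for my $id ( keys %successors ) {
    $successors{$id} = [ sort @{ $successors{$id} } ];
}

for my $id ( sort keys %successors ) {
    print "$id -> ", join( ', ', @{ $successors{$id} } ), "\n";
}
```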
Links the given readings in the collation in sequence, under the given witness sigil. The readings may be specified by object or ID.
Returns true if the two readings are linked in sequence in any witness. The readings may be specified by object or ID.
Returns all Relationship objects in the collation.
Adds a new relationship of the type given in $options between the two readings, which may be specified by object or ID. Returns ( $status, @vectors ), where $status is true on success and @vectors is the list of relationship edges that were ultimately added. If an array reference is passed in as $changed_readings, any readings that were altered by the relationship creation are added to that array.
See Text::Tradition::Collation::Relationship for the available options.
Add a relationship type definition to this collation. The argument can be either a hash or a hashref, defining the properties of the relationship. For relationship types and their properties, see Text::Tradition::Collation::RelationshipType.
Retrieve the RelationshipType object for the relationship with the given name.
Merges the $second reading into the $main one. If $concatenate is true, the merged node will carry the text of both readings, concatenated with either $with_str (if specified) or a sensible default: the empty string if the appropriate 'join_*' flag is set on either reading, or else $self->wordsep.
The first two arguments may be either readings or reading IDs.
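The choice of separator for a concatenating merge can be sketched as follows. This is an illustration of the rule as described above, not the module's own code: merge_separator is a hypothetical helper, the readings are plain hashes, and the assumption that the "appropriate" flags are join_next on the first reading and join_prior on the second is mine.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of the module: choose the string used to
# join the texts of two readings that are merged with concatenation.
sub merge_separator {
    my ( $main, $second, $wordsep, $with_str ) = @_;

    # An explicitly supplied separator always wins.
    return $with_str if defined $with_str;

    # Assumption: the "appropriate" flags are join_next on the first
    # reading and join_prior on the second.
    return '' if $main->{join_next} || $second->{join_prior};

    # Otherwise fall back to the collation's word separator.
    return $wordsep;
}

my $wordsep = ' ';
my $main    = { text => 'in',   join_next  => 0 };
my $second  = { text => 'deed', join_prior => 1 };

my $merged = $main->{text}
    . merge_separator( $main, $second, $wordsep )
    . $second->{text};
print "$merged\n";
```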
Merge all readings linked with the relationship types given. If any of the selected type(s) is not a colocation, the graph will no longer be linear. The majority/plurality reading in each case will be the one kept.
WARNING: This operation cannot be undone.
Where possible in the graph, compresses plain sequences of readings into a single reading. The sequences must consist of readings with no relationships to other readings, with only a single witness path between them and no other witness paths from either that would skip the other. The readings must also not be marked as nonsense or bad grammar.
Split the given reading into two, so that the new reading is in the path for the witnesses given in @witlist. If the result is that certain non-colocated relationships (e.g. transpositions) are no longer valid, these will be removed. Returns the newly-created reading.
Clear the given witnesses out of the collation entirely, removing references to them in paths, and removing readings that belong only to them. Should only be called via $tradition->del_witness.
Return a list of sigils corresponding to the witnesses in which the reading appears.
Returns an SVG string that represents the graph, generated via as_dot and Graphviz. See as_dot for a list of options. Requires Graphviz (dot) to be installed.
Returns a string that is the collation graph expressed in dot (i.e. GraphViz) format. Options include:
from
to
color_common
Returns the list of sigils whose witnesses are associated with the given edge. The edge can be passed as either an array or an arrayref of ( $source, $target ).
Returns a JSON structure that represents the collation sequence graph.
Returns a GraphML representation of the collation. The GraphML will contain two graphs. The first expresses the attributes of the readings and the witness paths that link them; the second expresses the relationships that link the readings. This is the native transfer format for a tradition.
Returns a CSV alignment table representation of the collation graph, one row per witness (or per uncorrected witness layer).
Returns a tab-separated alignment table representation of the collation graph, one row per witness (or per uncorrected witness layer).
Return a reference to an alignment table, in a slightly enhanced CollateX format which looks like this:
    $table = {
        alignment => [
            { witness => "SIGIL",
              tokens  => [ { t => "TEXT" }, ... ] },
            { witness => "SIG2",
              tokens  => [ { t => "TEXT" }, ... ] },
            ...
        ],
        length => TEXTLEN
    };
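A table in the format shown above can be walked with ordinary Perl data-structure code. The sketch below uses a hand-built table with invented sigils and texts; it assumes that a gap in a witness row is represented by an undefined token, and texts_at_column is a hypothetical helper rather than a method of the module.

```perl
use strict;
use warnings;

# Hand-built sample in the format shown above. Sigils and texts are
# invented; an undef token standing for a gap is an assumption.
my $table = {
    alignment => [
        { witness => 'A', tokens => [ { t => 'the' }, { t => 'quick' }, { t => 'fox' } ] },
        { witness => 'B', tokens => [ { t => 'the' }, undef,            { t => 'fox' } ] },
    ],
    length => 3,
};

# Hypothetical helper: collect the distinct token texts found at a
# 0-based table column, skipping gaps.
sub texts_at_column {
    my ( $tbl, $idx ) = @_;
    my %seen;
    for my $row ( @{ $tbl->{alignment} } ) {
        my $token = $row->{tokens}[$idx];
        $seen{ $token->{t} } = 1 if defined $token;
    }
    my @texts = sort keys %seen;
    return @texts;
}

for my $idx ( 0 .. $table->{length} - 1 ) {
    print "column $idx: ", join( ' | ', texts_at_column( $table, $idx ) ), "\n";
}
```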
Returns the ordered list of readings, starting with $first and ending with $last, for the witness given in $sigil. If a $backup sigil is specified (e.g. when walking a layered witness), it will be used wherever no $sigil path exists. If there is a base text reading, that will be used wherever no path exists for $sigil or $backup.
Returns a list of readings at a given rank, taken from the alignment table.
Returns the reading that follows the given reading along the given witness path.
Returns the reading that precedes the given reading along the given witness path.
Returns the list of common readings in the graph, i.e. those readings that are shared by all non-lacunose witnesses.
Returns the text of a witness (plus its backup, if we are using a layer) as stored in the collation. The text is returned as a string, where the individual readings are joined with spaces and the meta-readings (e.g. lacunae) are omitted. Optional specification of $start and $end allows the generation of a subset of the witness text. Optional specification of $use_normal_form produces a text based on the normal form, rather than the raw text, of the reading.
Returns the text of a given sequence of readings. No attempt is made to validate the sequence in question. If $use_normal_form is set to true, the normal form of each reading in the sequence will be used to construct the text.
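The text assembly described for witness text and path text can be sketched with plain hashes standing in for Reading objects. The keys text, normal_form and is_meta, the sample words, and the helper sequence_text are all assumptions made for illustration; the sketch joins with a single space as described above and does not model any other joining rules.

```perl
use strict;
use warnings;

# Hypothetical plain-hash stand-ins for a sequence of readings.
my @sequence = (
    { text => '#START#',  is_meta => 1 },
    { text => 'ye',       normal_form => 'the' },
    { text => 'kyng',     normal_form => 'king' },
    { text => '#LACUNA#', is_meta => 1 },
    { text => 'spak',     normal_form => 'spoke' },
    { text => '#END#',    is_meta => 1 },
);

# Hypothetical helper: join the readings with spaces, omitting
# meta-readings, and optionally using the normal form of each reading.
sub sequence_text {
    my ( $readings, $use_normal_form ) = @_;
    my @words;
    for my $r (@$readings) {
        next if $r->{is_meta};
        push @words, $use_normal_form ? $r->{normal_form} : $r->{text};
    }
    return join( ' ', @words );
}

print sequence_text( \@sequence ), "\n";
print sequence_text( \@sequence, 1 ), "\n";
```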
These are mostly for use by parsers.
Link the array of readings contained in $witness->path (and in $witness->uncorrected_path if it exists) into collation paths. Clear out the arrays when finished.
Call make_witness_path for all witnesses in the tradition.
Calculate the reading ranks (that is, their aligned positions relative to each other) for the graph. This can only be called on linear collations.
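One common way to assign aligned positions in an acyclic graph is to give each node the length of the longest path leading to it from the start node. The sketch below shows that technique on an invented graph; it is not taken from the module's source, it ignores relationships between readings, and it assumes the start node has rank 0.

```perl
use strict;
use warnings;

# Hypothetical acyclic graph: reading ID => list of successor IDs.
my %successors = (
    start => [ 'a', 'b' ],
    a     => ['c'],
    b     => ['end'],
    c     => ['end'],
    end   => [],
);

# Count incoming edges so that nodes can be visited in topological order.
my %indegree = map { $_ => 0 } keys %successors;
for my $source ( keys %successors ) {
    $indegree{$_}++ for @{ $successors{$source} };
}

# Rank = length of the longest path from the start node (assumed rank 0).
my %rank  = ( start => 0 );
my @queue = ('start');
while (@queue) {
    my $node = shift @queue;
    for my $next ( @{ $successors{$node} } ) {
        my $candidate = $rank{$node} + 1;
        $rank{$next} = $candidate
            if !defined $rank{$next} || $candidate > $rank{$next};

        # A node is ready once all of its predecessors have been visited.
        push @queue, $next if --$indegree{$next} == 0;
    }
}

print "$_: $rank{$_}\n"
    for sort { $rank{$a} <=> $rank{$b} || $a cmp $b } keys %rank;
```

A topological visit of this kind only terminates with every node ranked when the graph has no cycles, which is consistent with the restriction to linear collations.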
A convenience method for parsing collation data. Searches the graph for readings with the same text at the same rank, and merges any that are found.
Goes through the graph identifying all pairs of readings that appear to be identical, and therefore able to be merged into a single reading. Returns the relevant identical pairs. Can be restricted to run over only a part of the graph, specified either by node or by rank.
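The idea of finding mergeable readings, taken here to mean readings with the same text at the same rank, can be sketched by grouping on both values. The readings below are invented plain hashes, and the pairing policy (first ID in a group paired with each of the others) is an assumption made for illustration.

```perl
use strict;
use warnings;

# Hypothetical plain-hash stand-ins for readings.
my @readings = (
    { id => 'r1', rank => 1, text => 'the' },
    { id => 'r2', rank => 1, text => 'the' },
    { id => 'r3', rank => 2, text => 'cat' },
    { id => 'r4', rank => 2, text => 'dog' },
    { id => 'r5', rank => 3, text => 'the' },
);

# Group reading IDs by rank, then by text.
my %group;
for my $r (@readings) {
    push @{ $group{ $r->{rank} }{ $r->{text} } }, $r->{id};
}

# Pair the first ID in each group with every other ID in that group.
my @pairs;
for my $rank ( sort { $a <=> $b } keys %group ) {
    for my $text ( sort keys %{ $group{$rank} } ) {
        my ( $keep, @others ) = @{ $group{$rank}{$text} };
        push @pairs, [ $keep, $_ ] for @others;
    }
}

print join( ' = ', @$_ ), "\n" for @pairs;
```

Note that r5 is not paired with r1 or r2: it has the same text but sits at a different rank.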
Goes through the graph identifying the readings that appear in every witness (apart from those with lacunae at that spot.) Marks them as common and returns the list.
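The test for a common reading can be sketched over per-witness rows of reading IDs. Everything in the example is hypothetical: the sigils, the IDs, the use of the string 'LACUNA' to mark a lacuna, and the use of undef to mark a rank where a witness simply has no reading.

```perl
use strict;
use warnings;

# Hypothetical data: witness sigil => reading ID at each rank.
# 'LACUNA' marks a lacuna; undef would mark a plain gap.
my %row = (
    A => [ 'r1', 'r2', 'r4' ],
    B => [ 'r1', 'r3', 'r4' ],
    C => [ 'r1', 'r2', 'LACUNA' ],
);

my @common;
for my $idx ( 0 .. 2 ) {
    my %seen;
    my $gap = 0;
    for my $sigil ( keys %row ) {
        my $id = $row{$sigil}[$idx];
        if ( !defined $id ) { $gap = 1; next }
        next if $id eq 'LACUNA';    # lacunose witnesses do not count
        $seen{$id} = 1;
    }

    # Common: exactly one reading, present in every non-lacunose witness.
    my @ids = keys %seen;
    push @common, $ids[0] if !$gap && @ids == 1;
}

print "common readings: @common\n";
```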
Calculate the text array for all witnesses from the path, for later consistency checking. Only to be used if there is no non-graph-based way to know the original texts.
Find the last reading that occurs in sequence before both the given readings. At the very least this should be $self->start.
Find the first reading that occurs in sequence after both the given readings. At the very least this should be $self->end.
Rework XML serialization in a more modular way
This package is free software and is provided "as is" without express or implied warranty. You can redistribute it and/or modify it under the same terms as Perl itself.
Tara L Andrews <aurum@cpan.org>