NAME

anvl - command to convert and manipulate ANVL records

SYNOPSIS

anvl [--format format] [other_options] [file ...]

DESCRIPTION

The anvl utility converts ANVL records to a variety of formats, including XML, Turtle, JSON, ANVL (long form), and Plain. An ANVL (A Name Value Language) record is a text-based sequence of elements ending in a blank line. Each element consists of a label, a colon, and a value; long values may be continued on subsequent indented lines.
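
As an illustration of the record structure described above, here is a minimal Python sketch of a reader for ANVL text. It is not the parser used by anvl; the function name parse_anvl is hypothetical, and the handling of comments (skipped) and whitespace-only lines (treated as blank) are simplifying assumptions.

```python
def parse_anvl(text):
    """Split ANVL text into records; each record is a list of (label, value) pairs."""
    records, current = [], []
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current record (assumption: so does a
            # whitespace-only line).
            if current:
                records.append(current)
                current = []
        elif line.startswith("#"):
            # Comment line; simply skipped in this sketch.
            continue
        elif line[0] in " \t":
            # An indented line continues the value of the previous element.
            if current:
                label, value = current[-1]
                current[-1] = (label, (value + " " + line.strip()).strip())
        else:
            # "label: value"; split on the first colon only, so values such
            # as URLs may themselves contain colons.
            label, _, value = line.partition(":")
            current.append((label.strip(), value.strip()))
    if current:
        records.append(current)
    return records
```

Each record comes back as an ordered list of pairs rather than a dictionary, because a label may legitimately occur more than once in a record.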

This utility reads one or more file arguments (or the standard input if none) and writes on the standard output. The current version assumes input to be a stream of ANVL records. More information is given in the OPTIONS section.

EXAMPLES

The special label "erc" in front of a short form ERC (Electronic Resource Citation) record is recognized and the record is converted to long form before other processing is done.

   $ echo 'erc: a | b | c | d' | anvl --format json
   [
     {
       "erc": "",
       "who": "a",
       "what": "b",
       "when": "c",
       "where": "d"
     }
   ]
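
The expansion shown above can be sketched in Python as follows. This is an illustration only, not the code anvl runs: the function name expand_short_erc is hypothetical, the label order is taken from the example output, and the sketch assumes at most four '|'-separated values.

```python
# Label order as it appears in the example output above.
ERC_LABELS = ["who", "what", "when", "where"]

def expand_short_erc(value):
    """Expand a short form ERC value such as 'a | b | c | d' into long form pairs."""
    parts = [p.strip() for p in value.split("|")]
    pairs = [("erc", "")]                  # the "erc" label is kept with an empty value
    pairs.extend(zip(ERC_LABELS, parts))   # zip() silently drops any values past the fourth
    return pairs
```

Calling expand_short_erc("a | b | c | d") yields the same label and value pairs as the JSON output in the example.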

Comments may be passed through to any output format; pseudo-comments are produced if the target format doesn't natively support comments.

   $ echo '# A way to kernel knowledge.
   > erc: Kunze, John A. | A Metadata Kernel for Electronic Permanence
   >      | 20011106 | http://journals.tdl.org/jodi/article/view/43
   > ' > myfile
   $ anvl --comments -m turtle myfile
   @prefix erc: <http://purl.org/kernel/elements/1.1/> .
   <http://journals.tdl.org/jodi/article/view/43>
   # A way to kernel knowledge.

       erc:erc """""" ;
       erc:who """Kunze, John A.""" ;
       erc:what """A Metadata Kernel for Electronic Permanence""" ;
       erc:when """20011106""" ;
       erc:where """http://journals.tdl.org/jodi/article/view/43""" .

The default conversion target is the ANVL format, which does little except expand short form ERCs and regularize some of the whitespace.

   $ anvl myfile
   erc:
   who: Kunze, John A.
   what: A Metadata Kernel for Electronic Permanence
   when: 20011106
   where: http://journals.tdl.org/jodi/article/view/43

The --verbose option adds extra information to the output, such as record and line numbers in comments.

   $ echo 'a: b
   > #note to self
   > c: d' | anvl --verbose --comments -m xml
   <recs>
     <rec>   <!-- from record 1, line 1 -->
       <a>b</a>
       <!-- #note to self -->
       <c>d</c>
     </rec>
   </recs>

That XML conversion output can be converted back to the ANVL record,

   erc:
   a: b
   c: d

with this style sheet

   <xsl:template match="/">
   <xsl:for-each select="recs/rec">
   erc:
   <xsl:for-each select="*">
   <xsl:value-of select="local-name(.)"/>: <xsl:value-of select="."/>
   <xsl:text>
   </xsl:text>
   </xsl:for-each>
   </xsl:for-each>
   </xsl:template>
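
For readers without an XSLT processor at hand, the same round trip can be approximated in Python using only the standard library. This is a substitute technique, not part of anvl; the function name xml_to_anvl is hypothetical, and the sketch assumes the recs/rec layout shown in the example. XML comments are dropped, because the default ElementTree parser discards them.

```python
import xml.etree.ElementTree as ET

def xml_to_anvl(xml_text):
    """Convert <recs><rec>...</rec></recs> XML back to ANVL text."""
    root = ET.fromstring(xml_text)          # root is the <recs> element
    lines = []
    for rec in root.findall("rec"):
        lines.append("erc:")                # mirrors the literal "erc:" in the style sheet
        for child in rec:
            # Strip any "{namespace}" prefix, similar to local-name() in XSLT.
            name = child.tag.rsplit("}", 1)[-1]
            lines.append(f"{name}: {child.text or ''}")
        lines.append("")                    # a blank line ends each record
    return "\n".join(lines)
```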

OPTIONS

--comments

Preserve comments during --format conversion. Pseudo-comments are produced when the target format does not natively support comments.

--find regexp

Only output records that match the regular expression, regexp. (The match is done before any expansion of short form ERCs.)

-m format[:order], --format format[:order]

Convert to the given format, currently one of "ANVL" (default), "XML", "Turtle", "JSON", "CSV", "PSV" (Pipe Separated Value), or "Plain". When converting comments to the JSON or Plain formats, pseudo-comments are output. Some options (below) apply only to specific target formats.

Optionally, format may be followed by a colon and order, a list of '|'-separated element names specifying the particular set and ordering in which to output record elements. For example, "CSV:name|phone|email" specifies the "CSV" format with records consisting of exactly name, phone, and email. Currently, only the first instance of a named element in a record is output, and a missing element is output as if it had an empty value.
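
The selection rule described above can be sketched in Python. The helper names parse_format_spec and select_elements are hypothetical and the code only illustrates the documented behavior (first instance wins, missing elements become empty values); it is not taken from anvl itself.

```python
def parse_format_spec(spec):
    """Split 'CSV:name|phone|email' into ('CSV', ['name', 'phone', 'email'])."""
    fmt, sep, order = spec.partition(":")
    # No colon means no explicit ordering was requested.
    return fmt, (order.split("|") if sep else None)

def select_elements(record, order):
    """Return the values of the named elements, in the requested order.

    record is a list of (label, value) pairs.
    """
    first = {}
    for label, value in record:
        first.setdefault(label, value)      # keep only the first instance of a label
    # A missing element is output as if it had an empty value.
    return [first.get(name, "") for name in order]
```

For example, a record with two email elements and no phone element, selected with the order from "CSV:name|phone|email", produces the name, an empty value, and the first email.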

--invert

Convert element values that end with one or more commas (used in ANVL to designate sort-friendly values that may contain inversion points) to natural word order. The more terminal commas, the more inversion points tried. For example, the values

     Smith, Pat,
     McCartney, Paul, Sir,,
     Hu Jintao,

convert to the following natural word orderings

     Pat Smith
     Sir Paul McCartney
     Hu Jintao
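
One way to read the rule above is sketched below in Python: count the terminal commas, split the remaining value at up to that many commas, and reverse the pieces. This interpretation reproduces the three examples, but it is an assumption; the function name invert is hypothetical and the actual anvl logic may differ in edge cases.

```python
def invert(value):
    """Convert a sort-friendly value ending in commas to natural word order."""
    stripped = value.rstrip()
    core = stripped.rstrip(",")
    points = len(stripped) - len(core)      # number of terminal commas
    if points == 0:
        return value                        # not marked for inversion; leave as is
    # Split at no more than `points` commas, then reverse the pieces.
    parts = [p.strip() for p in core.split(",", points)]
    return " ".join(reversed(parts))
```

A value with a terminal comma but no internal comma, such as "Hu Jintao,", has nothing to reverse and loses only the marker.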

-h, --help

Print extended help documentation.

--listformats

Print known conversion formats.

--man

Print full documentation.

--predns namespace

For Turtle conversion, use the given namespace for assertion predicates. The default is "http://purl.org/kernel/elements/1.1/".

--show regexp

Show only those elements matching the regular expression, regexp. Matching, which can include labels and values, is done against a string (re-)constructing the element as a "label: value".
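
The matching described above can be illustrated with a short Python sketch. The function name show_elements is hypothetical, and Python's re module stands in for the regular expressions that anvl itself accepts, which may differ in syntax details.

```python
import re

def show_elements(record, pattern):
    """Keep only the elements whose reconstructed 'label: value' string matches."""
    regex = re.compile(pattern)
    return [(label, value) for label, value in record
            if regex.search(f"{label}: {value}")]
```

Because the pattern is applied to the whole "label: value" string, an anchored pattern such as "^wh(o|en):" selects by label, while an unanchored one such as "2001" can also select by value.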

--subjelpat pattern

For Turtle conversion, use the given pattern as a regular expression to match the first instance of an ANVL element name in each input record, the corresponding value of which will become the Subject of Turtle assertions about the containing record. By default, the first element matching "^identifier$" or "^subject$" is used, unless the record appears to be an ERC (Electronic Resource Citation), in which case the first element matching "^where$" is used. Failing all else, the first non-empty element will be used.
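
The fallback order described above can be sketched in Python. The function name pick_subject is hypothetical, and two points are assumptions rather than documented behavior: a record is treated as an ERC when it contains an element labeled "erc", and a user-supplied pattern that matches nothing falls through to the first non-empty element.

```python
import re

def pick_subject(record, pattern=None):
    """Choose the value that becomes the Turtle subject for a record.

    record is a list of (label, value) pairs.
    """
    if pattern is not None:
        patterns = [pattern]                         # explicit --subjelpat pattern
    elif any(label == "erc" for label, _ in record):
        patterns = ["^where$"]                       # assumed test for "appears to be an ERC"
    else:
        patterns = ["^identifier$", "^subject$"]     # documented defaults
    # First element whose name matches any of the active patterns.
    for label, value in record:
        if any(re.search(p, label) for p in patterns):
            return value
    # Failing all else, use the first non-empty element.
    for _, value in record:
        if value:
            return value
    return None
```

Applied to the expanded ERC record from the EXAMPLES section, this selects the "where" value, which is the subject shown in the Turtle output there.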

-v, --verbose

Show more information, such as record numbers in output comments.

--version

Print the current version number and exit.

SEE ALSO

A Name Value Language (ANVL) http://www.cdlib.org/inside/diglib/ark/anvlspec.pdf

A Metadata Kernel for Electronic Permanence (pdf) http://journals.tdl.org/jodi/article/view/43

AUTHOR

John Kunze jak at ucop dot edu

COPYRIGHT

Copyright 2009-2011 UC Regents. Open source BSD license.