The Perl Toolchain Summit needs more sponsors. If your company depends on Perl, please support this very important event.

TITLE

Synopsis 22: Package Format [DRAFT]

AUTHORS

    Jos Boumans <kane@cpan.org>
    Audrey Tang <audreyt@audreyt.org>
    Florian Ragwitz <rafl@debian.org>
    Tim Nelson <wayland@wayland.id.au>

VERSION

    Created: 3 Nov 2005

    Last Modified: 19 Dec 2008
    Version: 2

OVERVIEW

Terminology and Scope

I'll start by listing a few terms, and whether this document is supposed to cover them or not.

  • .jib files; this is the source package format, and is specified in this document

  • CPAN6; this is a piece of software for managing an archive network (such as CPAN). This is not specified in this document; see http://cpan6.org/

  • PAUSE6; this is an actual network based on the cpan6 software (see above). It also is not documented here.

  • CPAN6.pm; this is a piece of software that starts with what it can get on PAUSE6, and attempts to give you an installed perl module (this is a replacement for CPANPLUS/cpan2dist)

Inspirations

The following can be useful inspirations:

PACKAGE LAYOUT

Project directory

Step 1 of the general flow should ideally be done by an automated tool, like p5's current Module::Starter or somesuch. Suffice to say, it should produce a layout something along these lines (note, this is just an example):

    p5-Foo-Bar/
        lib/
            Foo/
                Bar.pm
        t/
            00_load.t
        _jib/
            META.info

The files in the _jib dir are part of the package metadata. The most important file is the META.info file that holds all the collected metadata about the package, which ideally gets filled (mostly) by what is described in step 2 of the General Flow. Any pre/posthook files should also go in this directory. This directory should be extensible, so new files can be added to extend functionality. See the section on Metadata Spec for details.

.jib files

These files are created in step 3 of the General Flow

JIB is a simple 3 letter combination that's not yet 'taken' as a known extension. It's purposely not perl specific, as there's nothing about the JIB specification that's limiting it to perl only.

# XXX - Also package is carrying double meaning in P6 as both namespace and source distribution. Can we remove the former meaning and refer to them as module and namespace from now on?

.jib files are archives designed to distribute source packages, not installable packages. As we will need to compile things on the client side (things that have C bits or equivalent), and because we can not know the install path before hand, a source package is an obvious choice. A binary, installable package like .deb is therefor no option.

These .jib contain metadata and installable code quite analogous to the .deb packages we know, except that the metadata is also used to compile (for the lack of a better term so far) the code on the user side.

The name of a .jib file is determined as follows:

    <prefix>-<package-name>-<version>-<authority>.<extension>

In praxis, this will produce a name along these lines:

    p5-Foo-Bar-1.1-cpan+kane.jib

The Internal layout is as follows:

    - control.tgz
        * contains the data in the _jib directory
    - data.tgz
        * contains the following directories the other directories.
            This may be limited in the future, by say, a manifest.skip
            like functionality, or by dictating a list of directories/
            files that will be included

There is room to ship more files alongside the 2 above mentioned archives. This allows us to ship an extra md5sum, version, signature, anything.

METADATA SPEC

    - Define no more than needed to get started for now
        - Allow for future extensions
        - Use YAML as metadata format as it's portable and available standard
            in perl6

Supported fields

    - Prefix        Package prefix category     (p5)
    - Name          Perl module name            (Foo-Bar)
    - Version       Perl module version         (1.2.3)
    - Authority     From S11                    (cpan+KANE)
    - Package       Full package name           (p5-Foo-Bar-1.2.3-cpan+kane)
    - Description   Description of function     (This is what it does)
    - Author        CPAN author id              (KANE)
    - Depends       Packages it depends on[1][2](p5-Foo)
    - Provides      Packages this one provides  (p5-Foo-Bar,
                                                    p5-Foo-Bar-cpan+kane)

As the <Prefix>-<Name>-<Version>-<Authority> combination make up the <Package> name, arguably, we can leave the former out. The upside is to make sure all fields contain unique information. The downside is that 3rd party parsers will need to understand the Package syntax.

Again, arguably, the Author and Authority fields overlap, and Authority can be made to hold both cases.

    [1] This is packages, *not* modules. If we need a module -> package
        mapping, this needs to be done when extracting the data from the
        compiler, and queried against the available packages cache.
    [2] See the section on L<Dependencies>

Suggested fields[3]

    - Build-Depends Packages needed to build this package
    - Suggests      Packages suggested by this package
    - Recommends    Packages recommended by this package
    - Enhances      Packages that are enhanced by this package
    - Conflicts     Packages this one conflicts with
    - Replaces      Packages this one replaces
    - Tags          Arbitrary metadata about the package,
                    like flickr and debtags
    - Contains      List of modules (and scripts?) contained
                    in the package

    [3] Steal more tags from debian policy

DEPENDENCIES

Dependency Notation

Dependency notation allows you to express the following concepts:

OR

Specifies alternatives

AND

Specifies cumulative requirements

associate VERSION requirement

Specifies a criteria for the version requirement

grouping

This allows nesting of the above expressions

Basic notation:

    a, b                        # a AND b
    [a, b]                      # a OR b
    { a => "> 2" }              # a greater than 2
    { a => 1 }                  # shorthand for a greater or equal to 1
    \[ ... ]                    # grouping

More complex examples:

    a, [b,c]                    # a AND (b OR c)
    { a => 1 }, { a => '< 2' }  # a greater or equal to 1 AND smaller than 2
    [a, \[ b, c ] ]             # a OR (b AND c) [1]

    [1] This is possibly not portable to other languages. Options seem
        thin as we don't have some /other/ grouping mechanism than [ ], { }
        and \[ ]; ( ) gets flattened and \( ) == [ ].
        We could abuse { } to create { OR => [ ] } and { AND => [ ] }
        groups, but it would not read very intuitively. It would also mean
        that the version requirements would have to be in the package naming,
        ie. 'a > 2' rather than a => '> 2'

Serialization Examples

    # a, b -- AND
    - a
    - b

    # [a, b] -- OR
    -
      - a
      - b

    # { a => "> 2" } -- VERSIONED
    a: > 2

    # { a => 1 } -- VERSIONED
    a: 1


    # \[ ... ]  -- GROUPING
    - !perl/ref:
      =:
        - ...