NAME

psort - a perl-enhanced sort

SYNOPSIS

    psort [OPTION]... [FILE]...

DESCRIPTION

A perl-enhanced variant of sort(1). The specified files (or standard input) are written sorted to standard output.

By default, sorting is done using perl's cmp operator, without any use of locales or encodings.
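The practical effect of string comparison shows up with numeric input. The following is a small illustration in plain Perl (not psort's own code) of how cmp and <=> order the same values:

```perl
use strict;
use warnings;

my @values = qw(10 9 100);

# String comparison (cmp) compares character by character,
# so "10" and "100" both sort before "9".
my @by_string = sort { $a cmp $b } @values;
print "@by_string\n";    # 10 100 9

# Numeric comparison (<=>), the operator the -n option is documented to use.
my @by_number = sort { $a <=> $b } @values;
print "@by_number\n";    # 9 10 100
```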

OPTIONS

-b, --ignore-leading-blanks

Ignore any whitespace character (\s) at the beginning of a line.

-c, --check

Do not output anything. Just check if the input is sorted and return the exit value 0 for sorted and 1 for unsorted.

-C, --compare-function

Sort using a custom perl function. For your convenience, the enclosing "sub {" and "}" must not be specified. As in perl's sort, the variables $a and $b are available.

Examples

  • Reimplementing the -n switch:

        -C '$a <=> $b'

  • Using locale comparisons:

        -C 'use locale; $a cmp $b'

Note that it is possible to put BEGIN { ... } blocks into the comparison function.
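As a rough mental model, the snippet given to -C behaves like the body of a comparison sub passed to perl's sort. The sketch below assumes a hypothetical helper, make_comparator, that wraps a snippet in "sub { ... }" with a string eval; psort's actual implementation may well differ.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of psort): turn a comparison snippet
# into a code reference by wrapping it in "sub { ... }".
sub make_comparator {
    my ($snippet) = @_;
    my $cmp = eval "sub { $snippet }";
    die "Cannot compile comparison snippet: $@" if $@;
    return $cmp;
}

# Same snippet as in the -C '$a <=> $b' example
my $numeric = make_comparator('$a <=> $b');
my @input   = (10, 9, 100, 1);
my @sorted  = sort $numeric @input;
print "@sorted\n";    # 1 9 10 100
```

Because the snippet is compiled as ordinary Perl code, a BEGIN block inside it runs once at compile time, not once per comparison.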

-e, --field-function

Extract the sorting field (or the sorting key) using a custom perl function. For your convenience, the enclosing "sub {" and "}" must not be specified. The current line is available in the variable $_. It is expected that the last expression is the field to be used for comparisons.

Examples:

  • Using just the identity:

        -e '$_'

  • Using only the first four characters for comparisons:

        -e 'substr($_, 0, 4)'

  • Using a regular expression:

        -e '/(\d+) wallclock/ && $1'

Note that it is possible to put BEGIN { ... } blocks into the field function.
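Conceptually, a field function computes a sort key for each line, and lines are then ordered by that key. The following sketch shows the idea in plain Perl; sort_by_field is a hypothetical name, and this is an illustration under that assumption, not psort's actual code.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of psort): sort lines by a key that
# $field computes from the current line in $_.
sub sort_by_field {
    my ($field, @lines) = @_;
    return map  { $_->[1] }                      # 3. keep only the line
           sort { $a->[0] cmp $b->[0] }          # 2. compare the keys
           map  { [ scalar($field->()), $_ ] }   # 1. pair each line with its key
           @lines;
}

my @lines = ("pear 3\n", "apple 7\n", "fig 5\n");

# Comparable in spirit to: -e 'substr($_, 0, 4)'
print sort_by_field(sub { substr($_, 0, 4) }, @lines);
```

Computing each key once and sorting the pairs avoids re-evaluating the field function on every comparison.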

--rx

Use a regular expression for extracting the sorting field. If a capture group is detected in the regexp, then this capture group is used for the extraction, otherwise the whole matched portion is used.

For example, the -e snippet shown above

    -e '/(\d+) wallclock/ && $1'

could be written as

    --rx '(\d+) wallclock'

Only the first capture group is used, others are ignored (for now).

The capture group detection code just uses a heuristic, which may fail in special cases.
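The extraction rule can be sketched as follows. Note the assumption: this hypothetical rx_field helper inspects the capture groups after a successful match (via perl's @+ array), whereas psort is described as using a heuristic on the regexp itself, so the two may disagree in edge cases.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of psort): return the first capture
# group if the pattern has one, otherwise the whole matched text.
sub rx_field {
    my ($rx, $line) = @_;
    return undef unless $line =~ $rx;
    # $#+ is the number of capture groups in the last successful match
    return $#+ >= 1 ? $1 : $&;
}

my $line = "took 12 wallclock secs";
print rx_field(qr/(\d+) wallclock/, $line), "\n";   # 12
print rx_field(qr/\d+ wallclock/,   $line), "\n";   # 12 wallclock
```

A line that does not match yields an undefined field, which is the situation the -X option's warnings are about.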

-f, --ignore-case

Fold all characters to their uppercase versions for comparison.

-i, --ignore-nonprinting

Ignore non-printing characters (everything matching the [[:^print]] character class) for comparison.

-Mmodule[=import]

Load a perl module. The syntax is the same as for perl's -M option.

-mmodule[=import]

Load a perl module without default import. The syntax is the same as for perl's -m option.

-n, --numeric-sort

Sort numerically, using perl's <=> operator.

-N, --natural-sort

Sort using Sort::Naturally, if available.

-r, --reverse

Reverse the result of comparisons.

-u, --unique

Remove adjacent duplicate lines from the output. If -c is also specified, check for strict ordering instead (adjacent equal lines are considered unsorted).
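The difference between the plain check and the strict check can be illustrated with a short sketch. The is_sorted helper is hypothetical and uses the default cmp comparison; it only mirrors the behavior described for -c and -c -u, not psort's actual code.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of psort): return 1 if @lines is sorted.
# With $strict set, adjacent equal lines also count as unsorted.
sub is_sorted {
    my ($strict, @lines) = @_;
    for my $i (1 .. $#lines) {
        my $order = $lines[$i - 1] cmp $lines[$i];
        return 0 if $order > 0;                # out of order
        return 0 if $strict && $order == 0;    # duplicate under strict check
    }
    return 1;
}

my @lines = qw(a a b);
print is_sorted(0, @lines) ? "sorted\n" : "unsorted\n";   # sorted
print is_sorted(1, @lines) ? "sorted\n" : "unsorted\n";   # unsorted
```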

-v, --version

Print psort's version.

-V, --version-sort

Sort versions using CPAN::Version, if available.

-X, --no-warnings

By default psort warns if a custom field function or rx returns an undefined value. These warnings may be suppressed with this option.

COMPATIBILITY

Some options found in GNU/POSIX sort are also available in psort, but no attempt was made to make psort compatible with GNU/POSIX sort. In particular, there is no locale support (but see above for how to use locale in the -C option). There is also no encoding support (though it can probably be emulated by using the decode function from Encode in the -e or -C option).

TODO

Here are some ideas for future options:

--encoding

Specify the input and output encoding.

Unicode sorting

An option to use Unicode::Collate.

Currently the following longish one-liner has to be used:

    psort -MUnicode::Collate -MEncode=decode -e 'decode("utf-8", $_)' -C 'BEGIN { $Collator = Unicode::Collate->new } $Collator->cmp($a,$b)'

Sort specific columns (-k)

Currently one has to use something like the following to sort by columns:

    psort -e '@F=split; $F[...]'

--locale

Specify a locale.

-o

Instead of writing to standard output, write to the specified output file.

-m

Assume that input files are already sorted.

-u

Output only unique lines.

AUTHOR

Slaven Rezić

COPYRIGHT AND LICENSE

Copyright (C) 2009,2011,2013,2015,2016,2018,2019,2022,2023 by Slaven Rezić

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.8 or, at your option, any later version of Perl 5 you may have available.

SEE ALSO

sort(1), Sort::Naturally, CPAN::Version.

An alternative perl-enhanced sort program: subsort (in App::subsort).