Crypt::IDA::ShareFile - Archive file format for Crypt::IDA module
  use Crypt::IDA::ShareFile ":default";

  @list  = sf_split( ... );
  $bytes = sf_combine( ... );
This module implements a file format for creating, storing and distributing shares created with Crypt::IDA. Created files contain share data and (by default) the corresponding transform matrix row used to split the input file. This means that share files are (again, by default) stand-alone and may be recombined later without needing any other stored key or the involvement of the original issuer.
In addition to creating a number of shares, the module can also break the input file into several chunks before processing, in a way similar to multi-volume PKZIP, ARJ or RAR archives. Each of the chunks may be split into shares using a different transform matrix. Individual groups of chunks may be re-assembled independently, as they are collected and as the quorum for each is satisfied.
No methods are exported by default. All methods may be called by prefixing the method names with the module name, eg:
$foo=Crypt::IDA::ShareFile::sf_split(...)
Alternatively, routines can be exported by adding ":default" to the "use" line, in which case the routine names do not need to be prefixed with the module name, ie:
  use Crypt::IDA::ShareFile ":default";

  $foo = sf_split(...);
  # ...
Some extra ancillary routines can also be exported with the ":extras" (just the extras) or ":all" (":extras" plus ":default") parameters to the use line. See the section "ANCILLARY ROUTINES" for details.
The template for a call to sf_split, showing all possible inputs and default values, is as follows:
  @list = sf_split(
      shares   => undef,
      quorum   => undef,
      width    => 1,
      filename => undef,
      # supply a key, a matrix or neither
      key      => undef,
      matrix   => undef,
      # misc options
      version  => 1,               # header version
      rand     => "/dev/urandom",
      bufsize  => 4096,
      save_transform => 1,
      # chunking methods; pick one at most
      n_chunks       => undef,
      in_chunk_size  => undef,
      out_chunk_size => undef,
      out_file_size  => undef,
      # allow creation of a subset of shares, chunks
      sharelist => undef,          # [ $row1, $row2, ... ]
      chunklist => undef,          # [ $chunk1, $chunk2, ... ]
      # specify pattern to use for share filenames
      filespec  => undef,          # default value set later on
  );
The minimal set of inputs is:
  @list = sf_split(
      shares   => $number_of_shares,
      quorum   => $quorum_value,
      filename => "filename",
  );
The function returns a list of [$key,$mat,$bytes_read,@output_files] listrefs corresponding to each chunk that was created, or undef in the case of an error.
The n_chunks, in_chunk_size, out_chunk_size and out_file_size options allow control over how (or if) the input file is broken into chunks. At most one of these options may be specified. The n_chunks option divides the input into the specified number of chunks, which will be of (more-or-less) equal size.
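As a rough illustration of the n_chunks arithmetic, the following sketch divides an input into a given number of more-or-less equal chunks. This is a hypothetical helper for illustration only, not the module's code; the real calculation (see sf_calculate_chunk_sizes below) also deals with share headers and padding:

```perl
use strict;
use warnings;

# Hypothetical sketch: divide $file_size bytes into $n chunks of
# more-or-less equal size, with earlier chunks absorbing any remainder.
sub sketch_chunk_sizes {
    my ($file_size, $n) = @_;
    my $base  = int($file_size / $n);   # minimum chunk size
    my $rem   = $file_size % $n;        # leftover bytes to distribute
    my $start = 0;
    my @chunks;
    for my $i (0 .. $n - 1) {
        my $size = $base + ($i < $rem ? 1 : 0);
        push @chunks, {
            chunk_start => $start,
            chunk_next  => $start + $size,
            chunk_size  => $size,
        };
        $start += $size;
    }
    return @chunks;
}

# Splitting 100 bytes into 3 chunks gives sizes 34, 33 and 33.
my @chunks = sketch_chunk_sizes(100, 3);
```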
The filespec option allows control over the naming of output files. By default this is set to '%f-%c-%s.sf' when a file is being split into several chunks, or '%f-%s.sf' where no chunking is performed. Before creating the output files, the '%f', '%c' and '%s' patterns are replaced with the input filename, the chunk number and the share number, respectively.
If an error is encountered during the creation of one set of shares in a multi-chunk job, then the routine returns immediately without attempting to split any other remaining chunks.
The template for a call to sf_combine, showing all possible inputs and default values, is as follows:
  $bytes_written = sf_combine(
      infiles => undef,            # [ $file1, $file2, ... ]
      outfile => undef,            # "filename"
      # If specified, the following must agree with the values stored
      # in the sharefiles. There's normally no need to set these.
      quorum  => undef,
      width   => undef,
      # optional matrix, key parameters
      key     => undef,
      matrix  => undef,
      shares    => undef,          # required if key supplied
      sharelist => undef,          # required if key supplied
      # misc options
      bufsize => 4096,
  );
  $bytes_written = sf_combine(
      infiles => [ $file1, $file2, ... ],
      outfile => $output_filename,
  );
The return value is the number of bytes written to the output file, or undef in the case of an error.
The current version of the module only supports combining a single chunk with each call to sf_combine. Apart from being used in the call to open the input files, the routine does not examine the input filenames at all, since all the information necessary to combine the file is expected to be contained within the files themselves (along with any key/matrix parameters passed in, in the case where this information is not stored in the files).
Chunks may be combined in any order. When the final chunk is processed, if any padding bytes were added to it during the sf_split routine, these will be removed by truncating the output file.
The extra routines are exported by using the ":extras" or ":all" parameter with the initial "use" module line. The extra routines are as follows:
$filename = sf_sprintf_filename($format,$infile,$chunk,$share);
This routine creates share file names from the given parameters. It is used internally by sf_split.
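As an illustration of what this routine does, here is a hypothetical re-implementation (my_sprintf_filename is not part of the module). It assumes, per the filespec description above, that '%f' expands to the input filename, '%c' to the chunk number and '%s' to the share number:

```perl
use strict;
use warnings;

# Hypothetical re-implementation of the pattern expansion, for
# illustration only; the real routine is sf_sprintf_filename.
sub my_sprintf_filename {
    my ($format, $infile, $chunk, $share) = @_;
    my %expand = ( f => $infile, c => $chunk, s => $share );
    (my $name = $format) =~ s/%([fcs])/$expand{$1}/g;
    return $name;
}

my $name = my_sprintf_filename('%f-%c-%s.sf', 'secret.doc', 0, 2);
# -> "secret.doc-0-2.sf"
```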
  @chunk_info = sf_calculate_chunk_sizes(
      quorum   => undef,
      width    => undef,
      filename => undef,
      # misc options
      version  => 1,               # header version
      save_transform => 1,         # whether to store transform in header
      # chunking method: pick at most one
      n_chunks       => undef,
      in_chunk_size  => undef,
      out_chunk_size => undef,
      out_file_size  => undef,
  );
This returns a list of hashrefs containing information about chunk sizes, ranges, etc., with one element for each chunk which would be created with the given parameters. All input values match those which would be passed to sf_split except for the save_transform value, which specifies whether the transform matrix row for each share should be stored within the file header. Each hash in the returned list has the following keys:
  chunk_start   first byte of chunk
  chunk_next    first byte of next chunk
  chunk_size    chunk_next - chunk_start
  file_size     share file size, including header
  opt_final     is this the last chunk in the file?
  padding       number of padding bytes in (final) chunk
This routine is used internally by sf_split to calculate chunk sizes. It is available for calling routines since it may be useful to know in advance how large output files will be before any shares are created, such as for cases where there is limited space (eg, network share or CD image) for creation of those output shares.
Provided the default settings are used, created sharefiles will have all the information necessary to reconstruct the file once sufficient shares have been collected. For systems where an alternative scheme is required, see the discussion in the Crypt::IDA man page.
Each share file consists of a header and some share data. For the current version of the file format (version 1), the header format is as follows:
  Bytes  Name         Value
  2      magic        marker for "Share File" format; "SF" = {5346}
  1      version      file format version = 1
  1      options      options bits (see below)
  1-2    k,quorum     quorum k-value
  1-2    s,security   security level (ie, field width, in bytes)
  var    chunk_start  absolute offset of chunk in file
  var    chunk_next   absolute offset of next chunk in file
  var    transform    transform matrix row (optional)
All values stored in the header (and the share data) are stored in network (big-endian) byte order.
The options bits are as follows:
  Bit  Name           Settings
  0    opt_large_k    Large (2-byte) k value?
  1    opt_large_w    Large (2-byte) w value?
  2    opt_final      Final chunk in file? (1=full file/final chunk)
  3    opt_transform  Is transform data included?
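The options byte can be unpacked into these flags with ordinary bit operations. A minimal sketch, using the bit assignments above (decode_options is a hypothetical helper, not part of the module):

```perl
use strict;
use warnings;

# Hypothetical sketch: unpack the version-1 options byte into named
# flags according to the bit assignments in the table above.
sub decode_options {
    my ($byte) = @_;
    return {
        opt_large_k   => ($byte >> 0) & 1,
        opt_large_w   => ($byte >> 1) & 1,
        opt_final     => ($byte >> 2) & 1,
        opt_transform => ($byte >> 3) & 1,
    };
}

# 0b1100: final chunk, with transform data included.
my $opts = decode_options(0b1100);
```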
All file offsets are stored in a variable-width format. They are stored as the concatenation of two values:
the number of bytes required to store the offset, and
the actual file offset.
So, for example, the offset "0" would be represented as the single byte "0", while the offset 0x4321 would be represented as the hex bytes "02", "43", "21".
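This encoding can be sketched in a few lines of Perl. The two routines below are hypothetical helpers for illustration, not the module's internal code:

```perl
use strict;
use warnings;

# Encode an offset as a length byte followed by that many big-endian
# offset bytes, as described above. The offset 0 encodes as the single
# byte 0x00 (zero length, no offset bytes).
sub encode_offset {
    my ($offset) = @_;
    my $bytes = '';
    while ($offset > 0) {
        $bytes = chr($offset & 0xff) . $bytes;   # network byte order
        $offset >>= 8;
    }
    return chr(length $bytes) . $bytes;
}

# Reverse the encoding: read the length byte, then accumulate that
# many big-endian bytes into an offset.
sub decode_offset {
    my ($data) = @_;
    my $len    = ord substr($data, 0, 1);
    my $offset = 0;
    $offset = ($offset << 8) | ord substr($data, 1 + $_, 1)
        for 0 .. $len - 1;
    return $offset;
}

# encode_offset(0)      -> "\x00"
# encode_offset(0x4321) -> "\x02\x43\x21"
```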
Note that the chunk_next field is 1 greater than the actual offset of the chunk end. In other words, each chunk ranges from the byte starting at chunk_start up to, but not including, the byte at chunk_next. That's why it's called chunk_next rather than chunk_end.
The current implementation is limited to handling input files less than 4GB in size. This is merely a limitation of the current header handling code, and the restriction may be removed in a later version.
Currently, the only chunking options available are for no chunking or for chunking a file into a given number of chunks (with n_chunks option).
It is possible that the following changes/additions will be made in future versions:
implement a routine to scan a given list of input files and group them into batches that can be passed to sf_combine (as well as weeding out sub-quorum batches and broken, overlapping or non-compliant files);
implement a file format that can keep track of created sharefiles along with parameters used to create them;
implement encryption of input data stream along with dispersed storage of decryption key in share files;
implement a cryptographic accumulator to eliminate the possibility of a cheater presenting an invalid share at the combine stage;
implement regular checksum/hash to detect damage to share data;
implement storing a row number with shares in the case where the transform data for that share is not stored in the sharefile header;
implement a scatter/gather function to disperse shares over the network and download (some of) them again to reconstitute the file (probably in a new module);
implement a network-based peer protocol which allows peers with the same file to co-ordinate key generation so that they can generate compatible shares and also share the burden of ShareFile creation and distribution (long-term goal).
I'm open to suggestions on any of these features or any other feature that anybody might want ...
See the documentation for Crypt::IDA for more details of the underlying algorithm for creating and combining shares.
This distribution includes two command-line scripts called rabin-split.pl and rabin-combine.pl which provide simple wrappers to access all functionality of the module.
Declan Malone, <idablack@sourceforge.net>
Copyright (C) 2009-2019 by Declan Malone
This package is free software; you can redistribute it and/or modify it under the terms of version 2 (or, at your discretion, any later version) of the "GNU General Public License" ("GPL").
Please refer to the file "GNU_GPL.txt" in this distribution for details.
This package is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.