IPC::Exe - Execute processes or Perl subroutines & string them via IPC. Think shell pipes.
use IPC::Exe qw(exe bg);

my @pids = &{
       exe qw( ls /tmp a.txt ), \"2>#",
    bg exe qw( sort -r ),
       exe sub { print "[", shift, "] 2nd cmd: @_\n"; print "three> $_" while <STDIN> },
    bg exe 'sort',
       exe "cat", "-n",
       exe sub { print "six> $_" while <STDIN>; print "[", shift, "] 5th cmd: @_\n" },
};
is like doing the following in a modern Unix shell:
ls /tmp a.txt 2> /dev/null | { sort -r | [perlsub] | { sort | cat -n | [perlsub] } & } &
except that [perlsub] is really a perl child process with access to main program variables in scope.
This module was written to provide a secure and highly flexible way to execute external programs with an intuitive syntax. In addition, more information is returned with each string of executions, such as the list of PIDs and $? of the last external pipe process (see "RETURN VALUES"). Execution uses Perl's exec, and the shell is never invoked.
The two exported subroutines perform all the heavy lifting of forking and executing processes. In particular, exe( ) implements the KID_TO_READ version of
http://perldoc.perl.org/perlipc.html#Safe-Pipe-Opens
while bg( ) implements the double-fork technique illustrated at
http://perldoc.perl.org/perlfaq8.html#How-do-I-start-a-process-in-the-background?
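For readers unfamiliar with the KID_TO_READ idiom, here is a minimal sketch in plain core Perl, assuming a Unix-like platform. It is not IPC::Exe's implementation; capture_lines() is a hypothetical helper name used only for illustration.

```perl
use strict;
use warnings;

# Run a command without a shell and return the lines it prints to stdout.
# capture_lines() is a hypothetical helper, not part of IPC::Exe.
sub capture_lines {
    my @cmd = @_;

    # Implicit fork: the parent receives the child's PID and a read handle;
    # the child receives 0 and has its STDOUT connected to that handle.
    my $pid = open(my $kid_to_read, "-|");
    defined($pid) or die "cannot fork: $!";

    if ($pid == 0) {
        # child: exec with a LIST, so no shell ever parses the arguments
        exec { $cmd[0] } @cmd;
        die "cannot exec $cmd[0]: $!";
    }

    # parent: read everything the child writes to its stdout
    my @lines = <$kid_to_read>;

    # close() on a piped handle waits for the child and sets $?
    close($kid_to_read);
    return @lines;
}
```

Calling capture_lines('ls', '/tmp') would return the listing one line per element, with $? holding the wait status of ls afterwards.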
Let's dive right away into some examples. To begin:
my $exit = system( "myprog $arg1 $arg2" );
can be replaced with
my $exit = &{ exe 'myprog', $arg1, $arg2 };
exe( ) returns a LIST of PIDs, the last item of which is $? (of default &READER). To get the actual exit value $exitval, shift $? right by eight bits: $exitval = $? >> 8.
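The shift works because $? packs several fields into one integer. The following sketch, which uses only core Perl, splits a wait status into its parts; decode_status() is a hypothetical helper and is not provided by IPC::Exe.

```perl
use strict;
use warnings;

# Split a wait status (the value Perl stores in $?) into its parts.
# decode_status() is a hypothetical helper, not part of IPC::Exe.
sub decode_status {
    my ($status) = @_;
    return (
        exit_value  => $status >> 8,               # value the program passed to exit()
        signal      => $status & 127,              # signal that ended it, 0 if none
        core_dumped => ($status & 128) ? 1 : 0,    # true if a core dump was written
    );
}

my %info = decode_status(3 << 8);
print "exit value: $info{exit_value}\n";   # exit value: 3
```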
Extending the previous example,
my $exit = system( "myprog $arg1 $arg2 > out.txt" );
can be replaced with
my $exit = &{ exe 'myprog', $arg1, $arg2, [ '>', 'out.txt' ] };
The previous two examples will wait for 'myprog' to finish executing before continuing the main program.
Extending the previous example again,
# cannot obtain $exit of 'myprog' because it is in background
system( "myprog $arg1 $arg2 > out.txt &" );
can be replaced with
# just add 'bg' before 'exe' in previous example
my $bg_pid = &{ bg exe 'myprog', $arg1, $arg2, [ '>', 'out.txt' ] };
Now, 'myprog' will be put in background and the main program will continue without waiting.
To monitor the exit value of a background process:
my $bg_pid = &{
    bg sub {
        # same as 2nd previous example
        my ($pid) = &{
            exe 'myprog', $arg1, $arg2, [ '>', 'out.txt' ]
        };

        # check if exe() was successful
        defined($pid) or die("Failed to run process in background");

        # handle exit value here
        print STDERR "background exit value: " . ($? >> 8) . "\n";
    }
} or die("Failed to send process to background");
Instead of using backquotes or qx( ),
# slurps entire STDOUT into memory
my @stdout = `$program @ARGV`;

# handle STDOUT here
for my $line (@stdout) {
    print "read_in> $line";
}
we can read the STDOUT of one process with:
my ($pid) = &{
    # execute $program with arguments
    exe $program, @ARGV,

    # handle STDOUT here
    sub {
        while (my $line = <STDIN>) {
            print "read_in> $line";
        }

        # set exit status of main program
        waitpid($_[0], 0);
    },
};

# check if exe() was successful
defined($pid) or die("Failed to run process");

# exit value of $program
my $exitval = $? >> 8;
Perform tar copy of an entire directory:
use Cwd qw(chdir);

my @pids = &{
    exe sub { chdir $source_dir or die $! }, qw(/bin/tar cf - .),
    exe sub { chdir $target_dir or die $! }, qw(/bin/tar xBf -),
};

# check if exe()'s were successful
defined($pids[0]) && defined($pids[1])
    or die("Failed to run processes");

# was un-tar successful?
my $error = pop(@pids);
Here is an elaborate example to pipe STDOUT of one process to the STDIN of another, consecutively:
my @pids = &{
    # redirect STDERR to STDOUT
    exe $program, @ARGV, \"2>&1",

    # 'perl' receives STDOUT of $program via STDIN
    exe sub {
        my ($pid) = &{
            exe qw(perl -e), 'print "read_in> $_" while <STDIN>; exit 123',
        };

        # check if exe() was successful
        defined($pid) or die("Failed to run process");

        # handle exit value here
        print STDERR "in-between exit value: " . ($? >> 8) . "\n";

        # this is executed in child process
        # no need to return
    },

    # 'sort' receives STDOUT of 'perl'
    exe qw(sort -n),

    # [perlsub] receives STDOUT of 'sort'
    exe sub {
        # find out command of previous pipe process
        # if @_[1..$#_] is an empty list, previous process was a [perlsub]
        my ($child_pid, $prog, @args) = @_;

        # output: "last_pipe[12345]> sort -n"
        print STDERR "last_pipe[$child_pid]> $prog @args\n";

        # print sorted, 'perl' filtered, output of $program
        print while <STDIN>;

        # find out exit value of previous 'sort' pipe process
        waitpid($_[0], 0);
        warn("Bad exit for: @_\n") if $?;

        return $?;
    },
};

# check if exe()'s were successful
defined($pids[0]) && defined($pids[1]) && defined($pids[2])
    or die("Failed to run processes");

# obtain exit value of last process on pipeline
my $exitval = pop(@pids) >> 8;
Shown below is an example of how to capture STDERR and STDOUT after sending some input to STDIN of the child process:
# reap child processes 'xargs' when done
local $SIG{CHLD} = 'IGNORE';

# like IPC::Open3; filehandles are returned on-the-fly
my ($pid, $TO_STDIN, $FROM_STDOUT, $FROM_STDERR) = &{
    exe +{ stdin => 1, stdout => 1, stderr => 1 }, qw(xargs ls -ld),
};

# check if exe() was successful
defined($pid) or die("Failed to run process");

# ask 'xargs' to 'ls -ld' three files
print $TO_STDIN "/bin\n";
print $TO_STDIN "does_not_exist\n";
print $TO_STDIN "/etc\n";

# cause 'xargs' to flush its stdout
close($TO_STDIN);

# print captured outputs
print "stderr> $_" while <$FROM_STDERR>;
print "stdout> $_" while <$FROM_STDOUT>;

# close filehandles
close($FROM_STDOUT);
close($FROM_STDERR);
Of course, more exe( ) calls may be chained together as needed:
# reap child processes when done
local $SIG{CHLD} = 'IGNORE';

# like IPC::Open2; filehandles are returned on-the-fly
my ($pid1, $TO_STDIN, $pid2, $FROM_STDOUT) = &{
    exe +{ stdin  => 1 }, sub { "2>&1" }, qw(perl -ne), 'print STDERR "360.0 / $_"',
    exe +{ stdout => 1 }, qw(bc -l),
};

# check if exe()'s were successful
defined($pid1) && defined($pid2)
    or die("Failed to run processes");

# ask 'bc -l' results of "360 divided by given inputs"
print $TO_STDIN "$_\n" for 2 .. 8;

# we redirect stderr of 'perl' to stdout
#   which, in turn, is fed into stdin of 'bc'

# print captured outputs
print "360 / $_ = " . <$FROM_STDOUT> for 2 .. 8;

# close filehandles
close($TO_STDIN);
close($FROM_STDOUT);
Important: Some non-Unix platforms, such as Win32, require interactive processes (shown above) to know when to quit, and can rely neither on close($TO_STDIN) nor on kill(TERM => $pid).
Both exe( ) and bg( ) are optionally exported. They each return CODE references that need to be called.
exe \%EXE_OPTIONS, &PREEXEC, LIST, @REDIRECTS, &READER
\%EXE_OPTIONS is an optional hash reference to instruct exe( ) to return STDIN / STDERR / STDOUT filehandle(s) of the executed child process. See "SETTING OPTIONS".
LIST is exec( )'ed in the child process after the parent has forked, with the child's stdout redirected to &READER's stdin. It is optional if &PREEXEC is provided.
&PREEXEC is called right before exec( ) in the child process, so we may reopen filehandles or do some child-only operations beforehand. It is optional if LIST is provided.
&PREEXEC could return a LIST of @REDIRECTS to perform common filehandle redirections and/or modify binmode settings. The @REDIRECTS may be optionally specified (as references) after LIST. Returning these strings (or references to them) will do the following preset actions:
"2>#"  or "2>null"     silence stderr
">#"   or "1>null"     silence stdout
"2>&1"                 redirect stderr to stdout
"1>&2" or ">&2"        redirect stdout to stderr
"2>&-"                 close stderr
"1><2" or "2><1"       swap stdout and stderr
                       (+) shell-way works too: \"3>&1", \"1>&2", \"2>&3", \"3>&-"

"0:crlf"               does binmode(STDIN, ":crlf")
"1:raw" or "1:"        does binmode(STDOUT, ":raw")
"2:utf8"               does binmode(STDERR, ":utf8")
&PREEXEC could also return array references in the mix to perform open operations. If open fails, IPC::Exe will die. Minimal validation is done for the array items, so be careful. Examples:
[ ">",  "/path/file" ]     does open(STDOUT, ">",  "/path/file")
[ ">>", "/path/file" ]     does open(STDOUT, ">>", "/path/file")
[ "2>", "/path/file" ]     does open(STDERR, ">",  "/path/file")
[ *FH, "+>>", $file ]      does open(FH, "+>>", $file)
If references to array refs are returned by &PREEXEC, then sysopen will be used instead:
\[ *FH, $file, O_RDWR ]             does sysopen(FH, $file, O_RDWR)
\[ *FH, $file, O_WRONLY, 0644 ]     does sysopen(FH, $file, O_WRONLY, 0644)
It is important to note that the actions and return values of &PREEXEC matter, as they may be used to redirect filehandles before the child process running &PREEXEC is replaced by the exec'ed program. If @REDIRECTS are provided along with &PREEXEC, the filehandle operations returned by &PREEXEC are done first, before @REDIRECTS, in the order returned.
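To make the "redirect in the child, then exec" ordering concrete, here is a sketch of what a "2>&1" redirection amounts to in plain core Perl on a Unix-like platform. How IPC::Exe performs it internally is not shown in this document, so treat this as an analogy; capture_merged() is a hypothetical name.

```perl
use strict;
use warnings;

# Run @cmd with its stderr merged into stdout and return all output lines.
# capture_merged() is a hypothetical helper, not part of IPC::Exe.
sub capture_merged {
    my @cmd = @_;

    my $pid = open(my $from_kid, "-|");
    defined($pid) or die "cannot fork: $!";

    if ($pid == 0) {
        # child only, before exec: the plain-Perl counterpart of "2>&1"
        open(STDERR, ">&", \*STDOUT) or die "cannot dup STDOUT: $!";
        exec { $cmd[0] } @cmd;
        die "cannot exec $cmd[0]: $!";
    }

    # parent: both streams of the child arrive on the same handle
    my @lines = <$from_kid>;
    close($from_kid);
    return @lines;
}
```

Because the dup happens in the child only, the parent's own STDERR is left untouched.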
&PREEXEC is called with arguments passed to the CODE reference returned by exe( ).
&READER is called with ($child_pid, LIST) as its arguments. LIST corresponds to the positional arguments passed in-between &PREEXEC and @REDIRECTS.
If exe( )'s are chained, &READER calls itself as the next exe( ) in line, which in turn, calls the next &PREEXEC, LIST, etc.
&READER is always called in the parent process.
&PREEXEC is always called in the child process.
Call waitpid( $_[0], 0 ) in &READER to set the exit status $? of the previous process executing on the pipe. close( $IPC::Exe::PIPE ) can also be used to close the input filehandle and set $? at the same time (for Unix platforms only).
If LIST is not provided, &PREEXEC will still be called.
If &PREEXEC is not provided, LIST will still exec.
If &READER is not provided, it defaults to something like
sub { print while <STDIN>; waitpid($_[0], 0); return $? } # $_[0] is the $child_pid
exe( &READER ) simply returns &READER.
exe( ) with no arguments returns an empty list.
bg \%BG_OPTIONS, &BACKGROUND
\%BG_OPTIONS is an optional hash reference to instruct bg( ) to wait a certain amount of time for &PREEXEC to complete (for non-Unix platforms only). See "SETTING OPTIONS".
&BACKGROUND is called after it is sent to the init process.
If &BACKGROUND is not a CODE reference, bg( ) returns an empty list upon execution.
bg( ) with no arguments returns an empty list.
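As background on the double-fork technique that bg( ) is described as using, the following is a minimal sketch in core Perl for Unix-like platforms. It is not the module's own code; run_detached() is a hypothetical helper name.

```perl
use strict;
use warnings;
use POSIX qw(_exit);

# Run $code in a grandchild that gets re-parented to init, so the caller
# never has to wait for it. run_detached() is a hypothetical helper.
sub run_detached {
    my ($code) = @_;

    my $pid = fork();
    defined($pid) or die "cannot fork: $!";

    if ($pid == 0) {
        # first child: fork again and exit right away
        my $grandchild = fork();
        defined($grandchild) or _exit(1);

        if ($grandchild == 0) {
            # grandchild: does the real work. _exit() skips END blocks and
            # does not flush buffered output, so $code should close its handles.
            $code->();
            _exit(0);
        }
        _exit(0);    # leaving now orphans the grandchild
    }

    # parent: reap the short-lived first child so no zombie is left behind
    waitpid($pid, 0);
    return $? == 0;
}
```

The parent only waits for the first child, which exits almost immediately, so the call returns without waiting for the background work to finish.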
This experimental feature is not enabled by default:
If sending &BACKGROUND to the init process fails, bg( ) can fall back to calling &BACKGROUND in the parent or child process if $IPC::Exe::bg_fallback is true. To enable the fallback feature, set
$IPC::Exe::bg_fallback = 1;
\%EXE_OPTIONS is a hash reference that can be provided as the first argument to exe( ) to control returned values. It may be used to return or assign STDIN / STDERR / STDOUT filehandle(s) of the child process to emulate IPC::Open2 and IPC::Open3 behavior.
The default values are:
%EXE_OPTIONS = (
    pid        => undef,
    stdin      => 0,
    stdout     => 0,
    stderr     => 0,
    autoflush  => 1,
    binmode_io => undef,
);
These are the effects of setting the following options:
pid
Set $pid to the child process PID, given a SCALAR reference. The PID will not be returned as part of the return values of exe( ).

stdin
Return a WRITEHANDLE to STDIN of the child process. The filehandle will be set to autoflush on write if $EXE_OPTIONS{autoflush} is true.
If given a SCALAR reference, set $TO_STDIN to the WRITEHANDLE described above. The WRITEHANDLE then will not be returned as part of the return values of exe( ).

stdout
Return a READHANDLE from STDOUT of the child process, so output to stdout may be captured. When this option is set and &READER is not provided, the default &READER subroutine will NOT be called.
If given a SCALAR reference, set $FROM_STDOUT to the READHANDLE described above. The READHANDLE then will not be returned as part of the return values of exe( ).

stderr
Return a READHANDLE from STDERR of the child process, so output to stderr may be captured.
If given a SCALAR reference, set $FROM_STDERR to the READHANDLE described above. The READHANDLE then will not be returned as part of the return values of exe( ).

autoflush
When set to a false value, disable autoflush on the WRITEHANDLE to STDIN of the child process. This option only has effect when $EXE_OPTIONS{stdin} is true.

binmode_io
Set binmode of STDIN and STDOUT of the child process to layer $EXE_OPTIONS{binmode_io}. This is automatically done for subsequently chained exe( )cutions. To stop this, set it to an empty string "", or to another layer to bring a different mode into effect.
NOTE: This only applies to non-Unix platforms.
\%BG_OPTIONS is a hash reference that can be provided as the first argument to bg( ) to set wait time (in seconds) before relinquishing control back to the parent thread. See "CAVEAT" for reasons why this is necessary.
The default value is:
%BG_OPTIONS = (
    wait => 2,  # Win32 option
);
By chaining exe( ) and bg( ) statements, calling the single returned CODE reference sets off the chain of executions. This returns a LIST in which each element corresponds to each exe( ) or bg( ) call.
When exe( ) executes an external process, the PID for that process is returned, or an EMPTY LIST if exe( ) failed in any operation prior to forking. If an EMPTY LIST is returned, the chain of execution stops there and the next &READER is not called, guaranteeing the final return LIST to be truncated at that point. Failure after forking causes die( ) to be called.
When exe( ) executes a &READER subroutine, the subroutine's return value is returned. If there is no explicit &READER, the implicit default &READER subroutine is called instead. It returns $?, which is the status of the last pipe process close. This allows code to be written like:
my $exit = &{ exe 'myprog', $myarg };
# $exit = ($myprog_pid, $myprog_exit_status);
When non-default \%EXE_OPTIONS are specified, each exe( ) returns additional filehandles in the following LIST:
(
    $PID,                 # undef if exec failed
    $STDIN_WRITEHANDLE,   # only if $EXE_OPTIONS{stdin}  is true
    $STDOUT_READHANDLE,   # only if $EXE_OPTIONS{stdout} is true
    $STDERR_READHANDLE,   # only if $EXE_OPTIONS{stderr} is true
)
The positional LIST form return allows code to be written like:
my ($pid, $TO_STDIN, $FROM_STDOUT) = &{ exe +{ stdin => 1, stdout => 1 }, '/usr/bin/bc' };
SCALAR references may be passed in \%EXE_OPTIONS for their scalars to be assigned in-place, instead of returning them in the positional LIST:
my ($pid, $FROM_STDOUT);
my ($TO_STDIN) = &{
    exe +{ pid => \$pid, stdin => 1, stdout => \$FROM_STDOUT }, '/usr/bin/bc'
};
Note: It is necessary to disambiguate \%EXE_OPTIONS (also \%BG_OPTIONS) as a hash reference by including a unary + before the opening curly bracket:
+{ stdin => 1, autoflush => 0 }
+{ wait => 2.5 }
Calling the CODE reference returned by bg( ) returns the PID of the background process, or an EMPTY LIST if bg( ) failed in any operation prior to forking. Failure after forking causes die( ) to be called.
To determine if either exe( ) or bg( ) was successful until the point of forking, check whether the returned $PID is defined.
See "EXAMPLES" for examples on error checking.
WARNING: This may get slightly complicated for chained exe( )'s when non-default \%EXE_OPTIONS cause the positions of $PID in the overall returned LIST to be non-uniform (caveat emptor). Remember, the chain of executions is doing a lot for just a single CODE call, so due diligence is required for error checking.
A minimum count of items (PIDs and/or filehandles) can be expected in the returned LIST to determine whether forks were initiated for the entire exe( ) / bg( ) chain.
Failures after forking result in a call to die( ). To handle these errors, use eval.
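One way to apply that advice is a small wrapper around the call. This is only a sketch in core Perl: call_guarded() is a hypothetical helper, and the CODE reference passed to it would be the one returned by exe or bg.

```perl
use strict;
use warnings;

# Call a CODE reference and trap any die() raised while it runs.
# Returns (1, undef, @values) on success and (0, $error) on failure.
# call_guarded() is a hypothetical helper, not part of IPC::Exe.
sub call_guarded {
    my ($code, @args) = @_;

    my @ret;
    my $ok = eval { @ret = $code->(@args); 1 };

    return (0, $@ || "unknown error") unless $ok;
    return (1, undef, @ret);
}
```

A caller could then write my ($ok, $err, @pids) = call_guarded($chain); and inspect $err before looking at @pids, where $chain stands for the CODE reference being run.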
In taint mode, exe( ) will die if it is called with tainted arguments or environment variables. By default, the following environment variables are checked:
PATH PATHEXT IFS CDPATH ENV BASH_ENV PERL5SHELL
We may add to this list with:
BEGIN { push @IPC::Exe::TAINT_ENV, qw(PATH_LOCALE TERMINFO TERMPATH) }
It is highly recommended to avoid unnecessary parentheses ( )'s when using exe( ) and bg( ).
IPC::Exe relies on Perl's LIST parsing magic in order to provide the clean intuitive syntax.
As a guide, the following syntax should be used:
my @pids = &{                                             # call CODE reference
    [ bg ] exe [ sub { ... }, ] $prog1, $arg1, @ARGV,     # end line with comma
           exe [ sub { ... }, ] $prog2, $arg2, $arg3,     # end line with comma
    [ bg ] exe sub { ... },                               # this bg() acts on last exe() only
               sub { ... },
};
where brackets [ ]'s denote optional syntax.
Note that Perl sees
my @pids = &{
    bg exe $prog1, $arg1, @ARGV,
    bg exe sub { "2>#" }, $prog2, $arg2, $arg3,
       exe sub { 123 }, sub { 456 },
};
as
my @pids = &{
    bg( exe( $prog1, $arg1, @ARGV,
        bg( exe( sub { "2>#" }, $prog2, $arg2, $arg3,
            exe( sub { 123 }, sub { 456 }
            )
        ) )
    ) );
};
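The nesting can be observed without running any processes. In this sketch, my_exe and my_bg are stand-in subs invented for illustration (they are not the real exe and bg); each one only records the arguments it receives.

```perl
use strict;
use warnings;

# Stand-in subs, declared before use so they can be called without
# parentheses. They are NOT IPC::Exe's exe/bg; they just show their arguments.
sub my_exe { return "exe(" . join(", ", @_) . ")" }
sub my_bg  { return "bg("  . join(", ", @_) . ")" }

# Without parentheses, each call swallows everything to its right.
my $nested = my_bg my_exe 'prog1', 'arg1',
                   my_exe 'prog2', 'arg2';

print "$nested\n";   # bg(exe(prog1, arg1, exe(prog2, arg2)))
```

Adding parentheses around any one call's arguments would cut that call's list short, which is why they are best avoided in a chain.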
Code declared in END blocks will be called upon exit, whether it be after &PREEXEC sub without a LIST command, from a die failure, or even a failed exec call.
The user should make provisions to handle this situation. This is desirable when END blocks must only be called in the main process (or thread).
$IPC::Exe::is_forked is set to true after the code forks in &PREEXEC and &BACKGROUND. It can be used to tell the main process/thread apart from child processes/threads:
END {
    # only run in main process/thread
    return if $IPC::Exe::is_forked;

    ### REST OF THE CODE GOES HERE ###
    ...
}
This module is targeted for Unix environments, using techniques described in perlipc and perlfaq8. Development is done on FreeBSD, Linux, and Win32 platforms. It may not work well on other non-Unix systems, let alone Win32.
Some care was taken to rely on Perl's Win32 threaded implementation of fork( ). To get things to work almost like Unix, redirections of filehandles have to be performed in a certain order. More specifically: let's say STDOUT of a child process (read: thread) needs to be redirected elsewhere (anywhere, it doesn't matter). It is important that the parent process (read: thread) does not use STDOUT until after the child is exec'ed. At the point after exec, the parent must restore STDOUT to a previously dup'ed original and may then proceed along as usual. If this order is violated, deadlocks may occur, often manifesting as an apparent stall in execution when the parent tries to use STDOUT.
Since fork( ) is emulated with threads, &PREEXEC and &READER really do begin their lives in the same process, but in separate threads. This imposes limitations on how they can be used. One limitation is that, as separate threads, either one MUST NOT block, or else the other thread will not be able to continue.
Writing to, or reading from a pipe will block when the pipe buffer is full or empty, respectively.
Putting the facts together, it means that a pipe writer and reader should not function (as separate threads or otherwise) in the same process for fear that one may block and not let the other continue (a deadlock).
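The size of that pipe buffer can be probed directly. The sketch below uses core Perl on a Unix-like platform (non-blocking fcntl on pipes is not expected to work on Win32) and fills a pipe that nobody reads until a write would block. pipe_capacity() is a hypothetical helper, and the number it returns depends on the operating system.

```perl
use strict;
use warnings;
use Fcntl qw(F_GETFL F_SETFL O_NONBLOCK);

# Count how many bytes fit into a pipe before a write would block.
# pipe_capacity() is a hypothetical helper used only for illustration.
sub pipe_capacity {
    pipe(my $reader, my $writer) or die "pipe failed: $!";

    # make the write end non-blocking, so a full buffer gives an error
    # instead of stalling this process
    my $flags = fcntl($writer, F_GETFL, 0) or die "F_GETFL failed: $!";
    fcntl($writer, F_SETFL, $flags | O_NONBLOCK) or die "F_SETFL failed: $!";

    my $chunk = "a" x 512;
    my $total = 0;
    while (1) {
        my $written = syswrite($writer, $chunk);
        if (!defined $written) {
            # buffer is full: a blocking writer would hang at this point
            last if $!{EAGAIN} || $!{EWOULDBLOCK};
            die "syswrite failed: $!";
        }
        $total += $written;
    }

    close($writer);
    close($reader);
    return $total;
}

print "pipe holds about ", pipe_capacity(), " bytes\n";
```

Any writer that produces more than this amount without a concurrent reader will block.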
For example, this code below will block:
&{
    exe sub { print "a" x 9000, "\n" for 1 .. 3 },  # sub is &PREEXEC
        sub { @result = <STDIN> }                   # sub is &READER
};
The execution stalls, and the program just hangs there. &PREEXEC is writing out more data than the pipe buffer can fit. Once the buffer is full, print will block to wait for the buffer to be emptied. However, &READER is not able to continue and read off some data from the pipe buffer because it is in the same blocked process. If it were in a separate process (as in a real fork), then a blocking &PREEXEC could not affect &READER.
The way to ensure exe( ) works smoothly on Win32 is to exec processes on the pipeline chain. This code will work instead:
&{
    exe qw(perl -e), 'print "a" x 9000, "\n" for 1 .. 3',  # &PREEXEC exec'ed perl
        sub { @result = <STDIN> }                          # sub is &READER
};
Now, &PREEXEC is no longer running in the same process, and cannot affect &READER. If the new perl process blocks, &READER in the original process can still continue to read the pipe.
Writing and reading small amounts of data (to not cause blocking) between &PREEXEC and &READER is possible, but not recommended.
On Win32, bg( ) unfortunately has to substantially rely on timer code to wait for &PREEXEC to complete in order to work properly with exe( ). The example shown below illustrates that bg( ) has to wait at least until $program is exec'ed. Hence, $wait_time > $work_time must hold true and this requires a priori knowledge of how long &PREEXEC will take.
&{ bg +{ wait => $wait_time }, exe sub { sleep($work_time) }, $program };
This essentially renders bg &BACKGROUND useless if &BACKGROUND does not exec any programs (Win32).
In summary: (on Win32)
Only use bg( ) to exec programs into the background.
Keep &PREEXEC as short-running as possible, or make sure the $BG_OPTIONS{wait} time is longer than &PREEXEC takes to run.
No &PREEXEC (or code running in parallel thread) == no problems.
Some useful information:
http://perldoc.perl.org/perlfork.html#CAVEATS-AND-LIMITATIONS
http://www.nntp.perl.org/group/perl.perl5.porters/2003/11/msg85488.html
http://www.nntp.perl.org/group/perl.perl5.porters/2003/08/msg80311.html
http://www.perlmonks.org/?node_id=684859
http://www.perlmonks.org/?node_id=225577
http://www.perlmonks.org/?node_id=742363
Perl v5.8.8+ is required.
No non-core modules are required.
Gerald Lai <glai at cpan dot org>