NAME

GRID::Machine - Remote Procedure Calls over an SSH link

SYNOPSIS

  use GRID::Machine;

  my $host = shift || 'mylogin@remote.machine';

  my $machine = GRID::Machine->new(host => $host, uses => [ 'Sys::Hostname' ]);

  # Install function 'rmap' on remote.machine
  my $r = $machine->sub( 
    rmap => q{
      my $f = shift;        
      die "Code reference expected\n" unless UNIVERSAL::isa($f, 'CODE');

      my @result;
      for (@_) {
        die "Array reference expected\n" unless UNIVERSAL::isa($_, 'ARRAY');

        print hostname().": processing row [ @$_ ]\n";
        push @result, [ map { $f->($_) } @$_ ];
      }
      return @result;
    },
  );
  die $r->errmsg unless $r->ok;

  my $cube = sub { $_[0]**3 };

  # RPC involving code references and nested structures ...
  $r = $machine->rmap($cube, [1..3], [4..6], [7..9]);
  print $r; # Dumps remote stdout and stderr

  for ($r->Results) {               # Output:
    my $format = "%5d"x(@$_)."\n";  #    1    8   27
    printf $format, @$_             #   64  125  216
  }                                 #  343  512  729

DESCRIPTION

This module is inspired by the IPC::PerlSSH module by Paul Evans. It provides Remote Procedure Calls (RPC) via an SSH connection. What made IPC::PerlSSH appealing to me was that

  'no special software is required on the remote end, other than the
  ability to run perl nor are any special administrative rights required;
  any account that has shell access and can execute the perl binary on
  the remote host can use this module'.

The only requirement is that automatic SSH authentication between the local and remote hosts has been established. I have tried to expand the capabilities while preserving this feature.

  • Provide Remote Procedure Calls (RPC). Subroutines on the remote side can be called with arbitrary nested structures as arguments from the local side.

  • The result of a remote call is a GRID::Machine::Result object. Among the attributes of such an object are the results of the call, the output produced on stdout and stderr, errmsg, etc. The remote function can produce output without risk of confusing the protocol.

  • Services for transferring files are provided.

  • Support for writing and managing Remote Modules and for transferring Classes and Modules between machines.

  • An Extensible Protocol

METHODS ON THE LOCAL SIDE

The Constructor new

The typical call looks like:

    my $machine = GRID::Machine->new(host => 'user@remote.machine.domain');

This function returns a new instance of an object. The object is blessed into a unique class that inherits from GRID::Machine. That is, the new object is a singleton. When the machine object is later given new methods, those are installed in the singleton class. The following example illustrates the point.

  $ cat -n classes.pl
   1  #!/usr/local/bin/perl -w
   2  use strict;
   3  use GRID::Machine;
   4
   5  my @m = qw(orion beowulf);
   6
   7  my $m = GRID::Machine->new( host => shift @m, uses => [qw(Sys::Hostname)]);
   8  print ref($m)."\n";
   9
  10  $m->sub( one => q { print hostname().": one\n"; } );
  11  print $m->one;
  12
  13  my $p = GRID::Machine->new( host => shift @m,  uses => [qw(Sys::Hostname)] );
  14  print ref($p)."\n";
  15
  16  $p->sub( one => q { print hostname().": 1\n"; } );
  17  print $p->one;

There are two GRID::Machine objects involved: $m (a connection to a machine named orion) and $p (a connection to a machine named beowulf), created at lines 7 and 13. Two subroutines with the same name, one, are installed on both machines (lines 10 and 16). As remote functions they don't collide, since they are executed on two different machines. As local methods they don't collide either, since the method one of $m lives in a different namespace than the method one of $p. The remote functions are called in lines 11 and 17. The result of such a call is a GRID::Machine::Result object, which describes the result of the RPC. It has attributes like:

results

A reference to an ARRAY holding the results returned by the call

stdout

The output produced in the remote stdout during the execution of the RPC

stderr

The output produced in the remote stderr during the execution of the RPC

etc.

Whenever it is evaluated in string context, a GRID::Machine::Result object returns a string containing the output produced during the execution of the RPC (both stdout and stderr, plus any specific perl error messages, as in $@). When executed, the program above produces an output similar to this:

                          $ classes.pl
                          GRID::Machine::138737228
                          orion: one
                          GRID::Machine::139666876
                          beowulf: 1
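
The per-object class idea can be imitated in plain Perl. The following is only a minimal sketch and not the module's actual code: My::Machine and its install method are hypothetical names, and nothing is sent to a remote host.

```perl
use strict;
use warnings;

package My::Machine;    # hypothetical stand-in for GRID::Machine

my $counter = 0;

sub new {
    my ($class, %args) = @_;
    # Each object gets a class of its own, derived from the base class.
    my $singleton = $class . '::' . ++$counter;
    { no strict 'refs'; @{"${singleton}::ISA"} = ($class); }
    return bless {%args}, $singleton;
}

# Install a method that only this one object can see.
sub install {
    my ($self, $name, $code) = @_;
    no strict 'refs';
    *{ ref($self) . "::$name" } = $code;
    return $self;
}

package main;

my $m = My::Machine->new(host => 'orion');
my $p = My::Machine->new(host => 'beowulf');

$m->install(one => sub { "$_[0]{host}: one" });
$p->install(one => sub { "$_[0]{host}: 1" });

print ref($m), "\n";    # My::Machine::1
print $m->one, "\n";    # orion: one
print ref($p), "\n";    # My::Machine::2
print $p->one, "\n";    # beowulf: 1
```

Because every object lives in its own package, installing one for $m never touches the one that belongs to $p.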

Exceptions

The constructor doesn't return on failure: It raises an exception if the connection can't be established. See the result of an attempt to connect to a machine when there is no automatic authentication:

  $ perl -MGRID::Machine -e " GRID::Machine->new( host => 'user@not.available')"
  ssh: connect to host not.available port 22: No route to host
  Can't execute perl in user@not.available using ssh connection with automatic authentication

Arguments of new

The following arguments are legal:

host

The host to connect to. The user and the port can also be specified here, i.e. it can be something like:

  my $machine = GRID::Machine->new(host => 'casiano@orion:2048');

If host is the empty string:

  my $machine = GRID::Machine->new(host => '');

a process executing perl on the local machine is opened via open2 (no SSH call is involved).

Instead of specifying the user, port and other ssh parameters here, the recommended way to work is to insert a section inside the /home/user/.ssh/config file:

 ...

 # A new section inside the config file: 
 # it will be used when writing a command like: 
 #                     $ ssh orion 

 Host orion

 # My username in the remote machine
 user casiano

 # The actual name of the machine: by default the one provided in the
 # command line
 Hostname orion.at.some.domain

 # The port to use: by default 22
 Port 2048

 # The identity pair to use. By default ~/.ssh/id_rsa and ~/.ssh/id_dsa
 IdentityFile /home/user/.ssh/orionid

 # Useful to detect a broken network
 BatchMode yes

 # Useful when the home directory is shared across machines,
 # to avoid warnings about changed host keys when connecting
 # to local host
 NoHostAuthenticationForLocalhost yes

command

This argument is an alternative to the host argument. Use one or the other. It allows a more specific control of the command executed.

It can be a reference to a list or a string. It fully specifies the command to execute.

Example 1: Using password authentication

The following example uses Net::OpenSSH to open an SSH connection using password authentication instead of asymmetric cryptography:

  $ cat -n openSSH.pl 
     1  use strict;
     2  use warnings;
     3  use Net::OpenSSH;
     4  use GRID::Machine;
     5  
     6  my $host = (shift() or $ENV{GRID_REMOTE_MACHINE});
     7  my @ARGS;
     8  push @ARGS, (user      => $ENV{USR})   if $ENV{USR};
     9  push @ARGS, ( password => $ENV{PASS}) if $ENV{PASS};
    10  
    11  my $ssh = Net::OpenSSH->new($host, @ARGS); 
    12  $ssh->error and die "Couldn't establish SSH connection: ". $ssh->error;
    13  
    14  my @cmd = $ssh->make_remote_command('perl');
    15  { local $" = ','; print "@cmd\n"; }
    16  my $grid = GRID::Machine->new(command => \@cmd);
    17  my $r = $grid->eval('print "hello world!\n"');
    18  print "$r\n";

when executed produces an output like this:

  $ perl openSSH.pl 
  ssh,-S,/Users/localuser/.libnet-openssh-perl/user-machine-2413-275647,-o,User=user,--,machine,perl
  hello world!

Example 2: X11 forwarding

The argument associated with command can be a string. The following example initiates an SSH connection to the remote machine with X11 forwarding:

  $ cat -n testptkdb_2.pl 
     1  #!/usr/local/bin/perl -w
     2  # Execute this program being the user
     3  # that initiated the X11 session
     4  use strict;
     5  use GRID::Machine;
     6  
     7  my $host = $ENV{GRID_REMOTE_MACHINE};
     8  
     9  my $machine = GRID::Machine->new(
    10     command => "ssh -X $host perl", 
    11  );
    12  
    13  print $machine->eval(q{ 
    14    print "$ENV{DISPLAY}\n" if $ENV{DISPLAY};
    15    CORE::system('xclock') and  warn "Mmmm.. something went wrong!\n";
    16    print "Hello world!\n";
    17  });

It will produce an output like:

  $ testptkdb_2.pl
  localhost:11.0

and a graphical clock will pop up on your screen.

Example 3: Debugging GRID::Machine programs

Another use of the command option is to put the remote side in debugging mode:

  pp2@nereida:~/LGRID_Machine/examples$ cat netcat3.pl
  #!/usr/local/bin/perl -w
  use strict;
  use GRID::Machine;

  my $port = shift || 12345;

  my $debug = qq{PERLDB_OPTS="RemotePort=beowulf:$port"};

  my $machine = GRID::Machine->new(
     command => qq{ssh beowulf '$debug perl -d'},
  );

  print $machine->eval(q{
    system('ls');
    print %ENV,"\n";
  });

Start by running netcat on the remote side:

  pp2@nereida:~/LGRID_Machine/examples$ ssh beowulf nc  -v -l beowulf -p 12345

and now run the program:

  pp2@nereida:~/LGRID_Machine/examples$ netcat3.pl

The debugger prompt will appear in the netcat terminal.

No host and No command: no SSH connection. Just a process

If neither the host nor the command argument is specified, a process executing perl on the local machine is opened via open2:

  $ cat -n commandlocal.pl 
     1  use strict;
     2  use warnings;
     3  use GRID::Machine;
     4  use Sys::Hostname;
     5  
     6  my $machine = GRID::Machine->new(uses => [ 'Sys::Hostname' ]);
     7  
     8  my $remote =  $machine->eval(q{hostname()});
     9  my $local  =  hostname();
    10  
    11  print "Local and remote machines are the same\n" if ($local eq $remote->result);
  

When executed, this program produces the following output:

  $ perl commandlocal.pl 
  Local and remote machines are the same
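
For readers unfamiliar with open2, here is a minimal sketch of a parent talking to a local perl child over two pipes. It only illustrates the mechanism; the toy child below (which upper-cases its input) is an assumption made for the example and has nothing to do with the real GRID::Machine server.

```perl
use strict;
use warnings;
use IPC::Open2;

# $^X is the path of the perl binary running this script.
# The child is a toy "server" that upper-cases every line it reads.
my $pid = open2(my $from_child, my $to_child,
                $^X, '-e', 'while (<STDIN>) { print uc $_ }');

print {$to_child} "hello from the parent\n";
close $to_child;                 # EOF lets the child finish

my $reply = <$from_child>;
waitpid $pid, 0;

print $reply;                    # HELLO FROM THE PARENT
```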

logic_id

An integer. Contains the logical identifier associated with the GRID::Machine. By default, 0 if it was the first GRID::Machine created, 1 if it was the second, etc. See an example:

  $ cat -n logic_id.pl 
       1  #!/usr/bin/perl -w
       2  use strict;
       3  use GRID::Machine;
       4  
       5  my $m1 = GRID::Machine->new( host => shift());
       6  my $m2 = GRID::Machine->new( host => shift());
       7  my $m3 = GRID::Machine->new( host => shift());
       8  
       9  print $m1->logic_id."\n";
      10  print $m2->logic_id."\n";
      11  print $m3->logic_id."\n";

the execution produces the following output:

  $ ./logic_id.pl machine othermachine somemachine
  0
  1
  2

log

Relative path of the file where remote STDOUT will be redirected. Each time an RPC occurs, STDOUT is redirected to a file. By default the name of this file is $TMP/rperl/$LOCALPID_$REMOTEPID.log, where $TMP is the name of the temporary directory as returned by File::Spec->tmpdir(), $LOCALPID is the PID of the process running on the local machine and $REMOTEPID is the PID of the process running on the remote machine.

err

Relative path of the file where remote STDERR will be redirected. Each time an RPC occurs, STDERR is redirected to a file. By default the name of this file is $TMP/rperl/$LOCALPID_$REMOTEPID.err, where $TMP is the name of the temporary directory as returned by File::Spec->tmpdir(), $LOCALPID is the PID of the process running on the local machine and $REMOTEPID is the PID of the process running on the remote machine.

report

Relative path of the report file where the (remote) method remotelog writes. By default the name of this file is $TMP/rperl/$LOCALPID_$REMOTEPID.report, where $TMP is the name of the temporary directory as returned by File::Spec->tmpdir(), $LOCALPID is the PID of the process running on the local machine and $REMOTEPID is the PID of the process running on the remote machine. Set cleanup to false to keep this file.

When executing the following program:

  $ cat logerr.pl 
  use strict;
  use GRID::Machine;

  my $machine = GRID::Machine->new( host => $ENV{GRID_REMOTE_MACHINE}, cleanup => 0);
  print $machine->eval(q{ 
    print File::Spec->tmpdir()."\n";
    my @files =  glob(File::Spec->tmpdir().'/rperl/*');
    local $" = "\n";
    print "@files\n";
    SERVER->remotelog("This message will be saved in the report file");
  });

the output will be similar to this:

  ~/grid-machine/examples$ perl logerr.pl 
  /tmp
  /tmp/rperl/1309_4318.err
  /tmp/rperl/1309_4318.log
  /tmp/rperl/1309_4318.report

The report file contains:

  $ ssh $GRID_REMOTE_MACHINE cat /tmp/rperl/1309_4318.report
  4318:Sat Apr 16 20:28:22 2011  => This message will be saved in the report file
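
The naming scheme visible in the sample output above (/tmp/rperl/1309_4318.log and friends) can be reproduced with File::Spec. This is a sketch under that assumption only; default_file_names is a hypothetical helper, not a function of the module.

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper: build the three default file names from two PIDs.
sub default_file_names {
    my ($local_pid, $remote_pid) = @_;
    my $base = File::Spec->catfile(File::Spec->tmpdir, 'rperl',
                                   "${local_pid}_${remote_pid}");
    return map { $_ => "$base.$_" } qw(log err report);
}

my %file = default_file_names(1309, 4318);
print "$file{$_}\n" for sort keys %file;
```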

wait

Maximum number of seconds to wait for the connection to be set up. The constructor calls the is_operative function (see section "The Function is_operative") to check whether an automatic connection can be established within that time. The default value is 15 seconds.

ssh

A string. Specifies the ssh command to be used. Take advantage of this if you want to specify some special parameters. Defaults to ssh.

sshoptions

An ARRAY ref or a string. Specifies options for the ssh command. Here is an example in which it is a string:

  my $machine = GRID::Machine->new(
                  host => $host,
                  sshoptions => '-p 22 -l casiano',
                  uses => [ 'Sys::Hostname' ]
  );

and another in which it is an array ref:

  my $machine = GRID::Machine->new(
                  host => $host,
                  sshoptions => [ '-l', 'casiano'],
                  uses => [ 'Sys::Hostname' ]
  );
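
Accepting either a string or an ARRAY ref is a common Perl idiom. One plausible way to normalize both forms is sketched below; option_list is a hypothetical helper, and the naive whitespace split is an assumption that would not handle quoted arguments.

```perl
use strict;
use warnings;

# Hypothetical helper: accept a string or an ARRAY ref, return a flat list.
sub option_list {
    my ($opts) = @_;
    return ()     unless defined $opts;
    return @$opts if ref $opts eq 'ARRAY';
    return split ' ', $opts;     # split on runs of whitespace
}

my @from_string = option_list('-p 22 -l casiano');
my @from_array  = option_list([ '-l', 'casiano' ]);

my @command = ('ssh', @from_string, 'orion', 'perl');
print "@command\n";              # ssh -p 22 -l casiano orion perl
```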

scp

A string defining the program to use to transfer files between the local and remote machines. Defaults to scp -q -p.

cleanup

Boolean. If true the remote log files for STDOUT and STDERR will be erased when the connection ends. True by default.

sendstdout

Boolean. If true, the contents of STDOUT and STDERR after each RPC are sent to the client. True by default. The following example illustrates its use:

  $ cat -n package.pl
     1  #!/usr/local/bin/perl -w
     2  use strict;
     3  use GRID::Machine;
     4
     5  my $s = shift || 0;
     6  my $machine = 'user@remote.machine.domain';
     7
     8  my $m = GRID::Machine->new( host => $machine, sendstdout => $s);
     9
    10  my $p = $m->eval(
    11    q{
    12      print "Name of the Caller Package: ";
    13      return caller(0)
    14    }
    15  );
    16  print "$p",$p->result,"\n";

when executed with argument 0 the remote output is neither saved nor sent, but the returned result is still available:

                    $ package.pl 1
                    Name of the Caller Package: GRID::Machine
                    $ package.pl 0
                    GRID::Machine

perl

A string. The perl interpreter to use in the remote machine. See an example:

  $ cat -n poption.pl
     1  #!/usr/local/bin/perl -w
     2  use strict;
     3  use GRID::Machine;
     4
     5  my $machine = shift || 'remote.machine.domain';
     6  my $m = GRID::Machine->new(
     7    host => $machine,
     8    perl => 'perl -I/home/user/prefix -I/home/user/perl',
     9  );
    10
    11  print $m->eval( q{
    12      local $" = "\n";
    13      print  "@INC";
    14    }
    15  );

when executed the program produces an output similar to this:

    $ poption.pl
    /home/user/prefix
    /home/user/perl
    /etc/perl
    /usr/local/lib/perl/5.8.4
    etc. etc.

perloptions

A string or an array ref. Contains the options to be passed to the Perl interpreter. See an example:

  my $host = "orion:22";
  my $machine = GRID::Machine->new(
                  host => $host,
                  sshoptions => '-p 22 -l casiano',
                  perloptions => [ '-w', '-MSys::Hostname' ],
  );

Note that -MSys::Hostname takes effect at a very early stage of the boot process, so the functions are exported to the main package. Therefore the function hostname exported by Sys::Hostname must be used inside a remote sub as in this example:

  my $r = $machine->sub(
    rmap => q{
      ...
      gprint ::hostname(),": Processing @$_\n";
      ...
    }
  );

remotelibs

An ARRAY reference. The referenced array contains the list of modules that will be loaded when bootstrapping the remote perl server. It is used to extend the GRID::Machine protocol. By default the following modules are loaded:

            GRID::Machine::MakeAccessors  
            GRID::Machine::Message
            GRID::Machine::Result
            GRID::Machine::REMOTE

See the section "EXTENDING THE PROTOCOL" for a full example.

startdir

The string specifying the directory where the remote execution starts. By default the home directory. For example:

    my $m = GRID::Machine->new(host => $host, startdir => '/tmp');

If it does not exist, it is created.

startenv

A reference to a hash. It will be used to modify the remote %ENV.

pushinc

Reference to a list of directories. All these directories will be pushed onto the @INC list of the remote machine.

unshiftinc

Reference to a list of directories. All these directories will be unshifted onto the @INC list of the remote machine. See an example:

    use GRID::Machine;

    my $m = GRID::Machine->new(
      host => 'remote.machine.domain',
      unshiftinc => [ qw(/home/user/prefix /home/user/perl) ],
    );

    print $m->eval(q{ local $" = "\n"; print  "@INC"; });

prefix

Libraries can be transferred from the local to the remote server. The prefix option is a string containing the directory where the libraries will be stored. The default is $ENV{HOME}/perl5lib.

uses

A reference to an ARRAY of strings. Determines the modules that will be loaded when the remote Perl interpreter is started. The enumerated modules must be available on the remote side. For instance:

  my $machine = GRID::Machine->new(host => $host, uses => [ 'POSIX qw( uname )' ]);

See the section "Opaque Structures" for a full example.

includes

A reference to an ARRAY of strings. Determines the "remote modules" that will be included when the remote Perl interpreter is started. The enumerated modules must be available on the local side. For instance, the following program loads the "remote module" SomeFunc:

  pp2@nereida:~/LGRID_Machine/examples$ cat -n includes.pl
     1  #!/usr/local/bin/perl -w
     2  use strict;
     3  use GRID::Machine;
     4
     5  my $host = shift || die "Usage:\n$0 machine\n";
     6
     7  my $machine = GRID::Machine->new(
     8     host => $host,
     9     includes => [ qw{SomeFunc} ],
    10  );
    11
    12  my $r = $machine->max(7, 9, 2, 8);
    13
    14  print $r;
    15  print "Max of (7, 9, 2, 8) is: ".$r->result."\n";

A "remote module" resembles a module but contains only subroutines and use declarations. The functions are placed directly in the name space of the GRID::Machine object (just as the sub method does; see section "The sub Method"). Here are the contents of the "remote module" SomeFunc.pm:

  pp2@nereida:~/LGRID_Machine/examples$ cat -n SomeFunc.pm
     1  use List::Util qw{max};
     2  use Sys::Hostname;
     3
     4  sub max {
     5    print "machine: ".hostname().": Inside sub two(@_)\n";
     6    List::Util::max(@_)
     7  }

when executed, the program produces the following output:

  pp2@nereida:~/LGRID_Machine/examples$ includes.pl beowulf
  machine: beowulf: Inside sub two(7 9 2 8)
  Max of (7, 9, 2, 8) is: 9

debug

The value must be a port number higher than 1024. It is used to run the remote side under the control of the debugger. See the section "REMOTE DEBUGGING". For example:

     my $machine = GRID::Machine->new(
        host => $host,
        debug => $port,
        includes => [ qw{SomeFunc} ],
     );

survive

No exception will be raised if the connection fails; instead undef is returned. This is often useful when building several GRID::Machine objects and you don't care (too much) if one of them fails:

  $ cat pi8.pl 
  #!/usr/bin/perl -w
  use strict;
  use GRID::Machine;
  use GRID::Machine::Group;
  use Data::Dumper;

  my @MACHINE_NAMES = split /\s+/, $ENV{MACHINES};
  my @m = map { GRID::Machine->new(host => $_, wait => 5, survive => 1) } @MACHINE_NAMES;

  my $c = GRID::Machine::Group->new(cluster => [ @m ]);

  $c->sub(suma_areas => q{
     my ($id, $N, $np) = @_;
       
     my $sum = 0;
     for (my $i = $id; $i < $N; $i += $np) {
         my $x = ($i + 0.5) / $N;
         $sum += 4 / (1 + $x * $x);
     }
     $sum /= $N; 
  });

  my ($N, $np, $pi)  = (1000, 4, 0);

  print Dumper($c->suma_areas(args => [ map {  [$_, $N, $np] } 0..$np-1 ]));
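
To see why the partial sums add up to an approximation of pi, the same arithmetic can be run locally, without any machines. This sketch copies the body of suma_areas from the listing above and simply adds the $np partial results; how GRID::Machine::Group distributes and returns them is not shown here.

```perl
use strict;
use warnings;

# Same body as the remote sub suma_areas in the listing above.
sub suma_areas {
    my ($id, $N, $np) = @_;

    my $sum = 0;
    for (my $i = $id; $i < $N; $i += $np) {
        my $x = ($i + 0.5) / $N;       # midpoint of the i-th interval
        $sum += 4 / (1 + $x * $x);
    }
    return $sum / $N;
}

my ($N, $np) = (1000, 4);

# Worker $id handles intervals $id, $id + $np, $id + 2 * $np, ...
my $pi = 0;
$pi += suma_areas($_, $N, $np) for 0 .. $np - 1;

printf "%.6f\n", $pi;                  # 3.141593
```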

The eval Method

The syntax is:

            $result = $machine->eval( $code, @args )

This method evaluates code in the remote host, passing arguments and returning a GRID::Machine::Result object. See an example:

    use GRID::Machine qw(is_operative);
    use Data::Dumper;

    my $machine = GRID::Machine->new(host => 'user@remote.machine.domain');

    my $p = { name => 'Peter', familyname => [ 'Smith', 'Garcia'], age => 31 };

    print Dumper($machine->eval(q{
      my $q = shift;

      $q->{familyname}

      }, $p
    ));

The Result of a RPC

When executed, the former code produces the following output:

    $ struct.pl
    $VAR1 = bless( {
                     'stderr' => '',
                     'errmsg' => '',
                     'type' => 'RETURNED',
                     'stdout' => '',
                     'errcode' => 0,
                     'results' => [
                                    [ 'Smith', 'Garcia' ]
                                  ]
                   }, 'GRID::Machine::Result' );

A GRID::Machine::Result object describes the result of an RPC. The results attribute is an ARRAY reference holding the result returned by the call. The other attributes stdout, stderr, etc. hold the respective outputs. See section "THE GRID::Machine::Result CLASS" for a more detailed description of GRID::Machine::Result objects.
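
An object that carries attributes and also turns into its output when used as a string can be built with the overload pragma. The class below is a minimal sketch with hypothetical names (My::Result); it assumes nothing about how GRID::Machine::Result is really implemented.

```perl
use strict;
use warnings;

package My::Result;    # hypothetical stand-in for GRID::Machine::Result

# In string context the object yields stdout, stderr and the error message.
use overload
    '""'     => sub { join '', @{ $_[0] }{qw(stdout stderr errmsg)} },
    bool     => sub { 1 },
    fallback => 1;

sub new {
    my ($class, %f) = @_;
    return bless { results => [], stdout => '', stderr => '',
                   errmsg  => '', type   => 'RETURNED', %f }, $class;
}
sub Results { return @{ $_[0]{results} } }     # the whole list
sub result  { return $_[0]{results}[0] }       # first element only
sub ok      { return $_[0]{type} ne 'DIED' }

package main;

my $r = My::Result->new(
    stdout  => "orion: one\n",
    results => [ [ 'Smith', 'Garcia' ] ],
);

print "$r";                                    # orion: one
print join(', ', @{ $r->result }), "\n";       # Smith, Garcia
die "$r" unless $r->ok;
```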

The Algorithm of eval

When a call

            $result = $machine->eval( $code, @args )

occurs, the code $code should be passed in a string, and is compiled using a string eval in the remote host:

               my $subref = eval "use strict; sub { $code }";

Files STDOUT and STDERR are redirected and the subroutine referenced by $subref is called inside an eval with the specified arguments:

                my @results = eval { $subref->( @_ ) };
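
The two steps can be tried locally. The sketch below is only an approximation of what happens on the remote side: run_code is a hypothetical function, only STDOUT is captured (into an in-memory buffer rather than a file), and the result is a plain hash instead of a GRID::Machine::Result object.

```perl
use strict;
use warnings;

# Hypothetical local stand-in for the remote side of eval.
sub run_code {
    my ($code, @args) = @_;
    my %r = (type => 'RETURNED', errmsg => '', stdout => '', results => []);

    # Step 1: compile the string into an anonymous sub.
    my $subref = eval "use strict; sub { $code }";
    unless ($subref) {
        @r{qw(type errmsg)} = ('DIED', "Error while compiling eval: $@");
        return \%r;
    }

    # Step 2: call it inside an eval, with STDOUT redirected to a buffer.
    my $stdout = '';
    {
        local *STDOUT;
        open STDOUT, '>', \$stdout or die "Can't redirect STDOUT: $!";
        my @results;
        if (eval { @results = $subref->(@args); 1 }) {
            $r{results} = \@results;
        }
        else {
            @r{qw(type errmsg)} = ('DIED', $@);
        }
        close STDOUT;
    }
    $r{stdout} = $stdout;
    return \%r;
}

my $good = run_code(q{ print "squaring\n"; map { $_ * $_ } @_ }, 3 .. 5);
my $bad  = run_code(q{ $q = shift; $q->{familyname} }, {});

print "@{ $good->{results} }\n";   # 9 16 25
print $bad->{type}, "\n";          # DIED
```

The second call fails at step 1, because $q is not declared and the wrapper is compiled under use strict.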

Errors and Exceptions

If there are errors at compile time, they will be collected into the GRID::Machine::Result object. In the following example the code to eval has an error (variable $q is not declared):

  ~/grid-machine/examples$ cat -n syntaxerr2.pl 
     1  #!/usr/local/bin/perl -w
     2  use strict;
     3  use GRID::Machine;
     4  use Data::Dumper;
     5  
     6  my $machine = GRID::Machine->new(host => 'user@machine.domain.es');
     7  
     8  my $p = { name => 'Peter', familyname => [ 'Smith', 'Garcia'] };
     9  
    10  my $r = $machine->eval( q{ $q = shift; $q->{familyname} }, $p);
    11  
    12  die  Dumper($r) unless $r->ok;
    13  
    14  print "Still alive\n";

When executed this code produces something like:

  $VAR1 = bless( {
                 'stderr' => '',
                 'errmsg' => 'user@machine.domain.es: Error while compiling eval \'$q = shift; $q->{fam...\'
                  Global symbol "$q" requires explicit package name at syntaxerr2.pl line 10, <STDIN> line 230.
                  Global symbol "$q" requires explicit package name at syntaxerr2.pl line 10, <STDIN> line 230.',
                 'type' => 'DIED',
                 'stdout' => '',
                 'errcode' => 0
               }, 'GRID::Machine::Result' );

The error message reports the offending source line accurately.

GRID::Machine::Result objects have an ok method which returns TRUE if the RPC didn't die. Therefore a common idiom after an RPC is:

                          die "$r" unless $r->ok;

Scope and Visibility Issues

Since the eval method wraps the code into a subroutine (see section "The Algorithm of eval") like this

               my $subref = eval "use strict; sub { $code }";

variables declared using our inside an eval must be redeclared in subsequent evals to make them visible. The following code produces an error message:

 $ cat -n vars1.pl
     1  #!/usr/local/bin/perl -w
     2  use strict;
     3  use GRID::Machine qw(qc);
     4
     5  my $machine = GRID::Machine->new(host => 'user@remote');
     6
     7  $machine->eval(q{
     8    our $h;
     9    $h = [4..9];
    10  });
    11
    12  my $r = $machine->eval(qc q{
    13    $h = [map {$_*$_} @$h];
    14  });
    15
    16  die $r unless $r->noerr;

The interpreter complains about $h:

  $ vars1.pl
  user@remote: Error while compiling eval. \
    Global symbol "$h" requires explicit package name at ./vars1.pl line 13,\
                                                           <STDIN> line 198.
    Global symbol "$h" requires explicit package name at ./vars1.pl line 13, \
                                                           <STDIN> line 198.

The problem can be solved by redeclaring our $h in the second eval or by replacing the declaration at line 8 with use vars:

      7 $machine->eval(q{
      8   use vars qw{$h};
      9   $h = [4..9];
     10 });
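
The same effect can be reproduced without a remote machine, since it only depends on each snippet being compiled separately under use strict. In this sketch $wrap is a hypothetical helper that mimics the wrapping step; it is not part of the module.

```perl
use strict;
use warnings;

# Mimic successive evals: each string is compiled on its own.
my $wrap = sub { my ($code) = @_; return eval "use strict; sub { $code }" };

my $first  = $wrap->(q{ our $h; $h = [4 .. 9]; scalar @$h });
my $second = $wrap->(q{ $h = [ map { $_ * $_ } @$h ]; scalar @$h });
my $error  = $@;        # compile error of the second snippet
my $third  = $wrap->(q{ our $h; $h = [ map { $_ * $_ } @$h ]; "@$h" });

my $count   = $first->();
my $squares = $third->();

print "$count\n";                                   # 6
print defined $second ? "compiled\n" : "failed: $error";
print "$squares\n";                                 # 16 25 36 49 64 81
```

The first and third snippets share the package variable $h because both declare it; the second one does not compile at all.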

Closures

One of the consequences of wrapping $code inside a sub is that any lexical variable is limited to the scope of the eval. Another is that nested subroutines inside $code will live in an (involuntary) closure. See the example:

  $ cat -n vars5.pl
     1  #!/usr/local/bin/perl -w
     2  use strict;
     3  use GRID::Machine qw(qc);
     4
     5  my $machine = GRID::Machine->new(host => 'casiano@beowulf.pcg.ull.es');
     6
     7  my $r = $machine->eval(qc q{
     8    my $h = 1;
     9
    10    sub dumph {
    11      print "$h\n";
    12      $h++
    13    }
    14
    15    dumph();
    16  });
    17
    18  print "Result: ".$r->result."\nWarning: ".$r->stderr;
    19
    20  $r = $machine->eval(qc q{
    21    dumph();
    22  });
    23
    24  print "Result: ".$r->result."\nWarning: ".$r->stderr;

When executed, the program produces the following warning:

  $ vars5.pl
  Result: 1
  Warning: Variable "$h" will not stay shared at ./vars5.pl line 11\
                                                 , <STDIN> line 194.
  Result: 2
  Warning: Variable "$h" will not stay shared at ./vars5.pl line 11,\
                                                   <STDIN> line 194.

The warning announces that later calls (in subsequent evals) to sub dumph can no longer reach $h (other than through dumph itself). If you want lexically nested subroutines, declare them through a reference:

 $ cat -n vars6.pl
     1  #!/usr/local/bin/perl -w
     2  use strict;
     3  use GRID::Machine qw(qc);
     4
     5  my $machine = GRID::Machine->new(host => 'casiano@beowulf.pcg.ull.es');
     6
     7  my $r = $machine->eval(qc q{
     8    my $h = 1;
     9
    10    use vars '$dumph';
    11    $dumph = sub {
    12      print "$h";
    13      $h++;
    14    };
    15
    16    $dumph->();
    17  });
    18
    19  print "$r\n";
    20
    21  $r = $machine->eval(qc q{
    22    $dumph->();
    23  });
    24
    25  print "$r\n";
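
The reference-based approach can also be checked locally. In the sketch below two separately compiled snippets share the code reference through a package variable; unlike vars6.pl, the variable is declared once with our at file scope, which is an assumption made to keep the example short.

```perl
use strict;
use warnings;

our $dumph;    # package variable, visible to every later snippet

# First snippet: creates the counter and the closure over it.
my $first = eval q{ sub {
    my $h = 1;
    $dumph = sub { return $h++ };   # anonymous sub: a true closure over $h
    return $dumph->();
} };

# Second snippet: compiled separately, reaches $h only through $dumph.
my $second = eval q{ sub { return $dumph->() } };

my @seen = ($first->(), $second->(), $second->());
print "@seen\n";                    # 1 2 3
```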

The compile Method

Syntax:

                   $machine->compile( $name, $code )
                   $machine->compile( $name, $code, politely => $politely )
                   $machine->compile( $name, $code, filter => $filter )
                   $machine->compile( $name, $code, politely => $politely, filter => $filter )

This method sends code to the remote host to store it inside the remote side of the GRID::Machine object. Namely, the stored_procedures attribute of the remote object is a hash reference containing the stored subroutines. The string $code is compiled into a CODE reference which can be executed later through the call method.

The first two arguments are the name $name of the subroutine and the code $code. The order of the other arguments is irrelevant.

The subroutine name $name must be an identifier, i.e. it must match the regexp [a-zA-Z_]\w*. Full names aren't allowed.

The following example uses compile to install handlers for the most common file-testing functions -r (is readable), -w (writeable), etc. (lines 8-15):

  $ cat -n compile.pl
     1  #!/usr/local/bin/perl -w
     2  use strict;
     3  use GRID::Machine;
     4
     5  my $machine = $ENV{GRID_REMOTE_MACHINE} || shift;
     6  my $m = GRID::Machine->new( host => $machine );
     7
     8  for (qw(r w e x z s f d  t T B M A C)) {
     9    $m->compile( "$_" => qq{
    10        my \$file = shift;
    11
    12        return -$_ \$file;
    13      }
    14    );
    15  }
    16
    17  my @files = $m->eval(q{ glob('*') })->Results;
    18
    19  for (@files) {
    20    print "$_ is a directory\n" if $m->call('d', $_)->result;
    21  }

After the testing functions are installed (lines 8-15), a list of files in the current (remote) directory is obtained (line 17) and those which are directories are printed (lines 19-21).
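
The store-then-call pattern behind compile and call can be sketched with a plain hash of code references. Everything here is a local approximation: compile_sub and call_sub are hypothetical names, and the real methods return GRID::Machine::Result objects rather than dying.

```perl
use strict;
use warnings;

my %stored_procedures;   # name => CODE ref

# Hypothetical local version of compile.
sub compile_sub {
    my ($name, $code) = @_;
    # Only plain identifiers are accepted, no package-qualified names.
    die "Illegal name '$name'\n" unless $name =~ /\A[a-zA-Z_]\w*\z/;
    my $subref = eval "use strict; sub { $code }"
        or die "Error while compiling '$name': $@";
    $stored_procedures{$name} = $subref;
    return 1;
}

# Hypothetical local version of call.
sub call_sub {
    my ($name, @args) = @_;
    my $subref = $stored_procedures{$name}
        or die "No stored procedure named '$name'\n";
    return $subref->(@args);
}

# Install one handler per file test, as in compile.pl.
compile_sub($_ => qq{ my \$file = shift; return -$_ \$file; }) for qw(e d f);

print "'.' is a directory\n" if call_sub('d', '.');
```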

Collisions

When two functions are installed with the same name the last prevails:

            use GRID::Machine;

            my $machine = shift || 'remote.machine.domain';
            my $m = GRID::Machine->new( host => $machine );

            $m->compile(one => q{print "one\n"; });

            $m->compile(one => q{ print "1\n"; });

            my $r= $m->call("one");
            print $r; # prints 1

To avoid overwriting an existing function, the exists method can be used:

            use GRID::Machine;

            my $machine = shift || 'remote.machine.domain';
            my $m = GRID::Machine->new( host => $machine );

            $m->compile(one => q{ print "one\n"; });

            $m->compile(one => q{ print "1"; }) unless $m->exists('one');

            my $r= $m->call("one");
            print $r; # prints "one"

The politely argument

An alternative solution is to use the politely argument of compile. If true the function won't be overwritten:

            use GRID::Machine;

            my $machine = shift || 'remote.machine.domain';
            my $m = GRID::Machine->new( host => $machine );

            my $r = $m->compile(one => q{ print "one\n"; });

            $r = $m->compile(
              one => q{ print "1"; },
              politely => 1 # Don't overwrite if exists
            );
            print $r->errmsg."\n";

            $r= $m->call("one");
            print $r; # prints "one"

When executed, the former program produces this output:

    $ compile5.pl
    Warning! Attempt to overwrite sub 'one'. New version was not installed.
    one
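
The difference between the default behaviour (the last definition prevails) and a polite installation fits in a few lines. This sketch uses a hypothetical install function over a local hash; the warning text is copied from the output above.

```perl
use strict;
use warnings;

my %stored;   # name => CODE ref

# Hypothetical installer; returns a message, empty when the sub was installed.
sub install {
    my ($name, $subref, %opt) = @_;
    if ($opt{politely} && exists $stored{$name}) {
        return "Warning! Attempt to overwrite sub '$name'. "
             . "New version was not installed.";
    }
    $stored{$name} = $subref;    # default: the last definition prevails
    return '';
}

install(one => sub { "one" });
my $msg = install(one => sub { "1" }, politely => 1);

print "$msg\n";
print $stored{one}->(), "\n";    # one
```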

The sub Method

Syntax:

                    $machine->sub( $name, $code, %args )

Valid arguments (%args) are:

  • politely => $politely

  • filter => $filter

  • around => sub { ... }

This method is identical to the compile method, except that the remote $code will be available as a (singleton) method of the $machine object within the local perl program. Therefore, two methods of two different GRID::Machine objects with the same $name are installed in different name spaces. See the example in section "The Constructor new".

The installed method $name can also be accessed as an ordinary function $name on the remote side. If a function with the same name already exists, the oldest prevails. See the call to function hi at line 15 in this example:

     1  #!/usr/bin/perl -w
     2  use strict;
     3  use GRID::Machine;
     4  
     5  my $host = $ENV{GRID_REMOTE_MACHINE}; 
     6  my $debug = @ARGV ? 1234 : 0;
     7  
     8  my $machine = GRID::Machine->new(host => $host, debug => $debug);
     9  
    10  $machine->sub( hi => q{ my $n = shift; "Hello $n\n"; } );
    11  
    12  print $machine->hi('Jane')->result;
    13  
    14  # same thing
    15  print $machine->eval(q{ hi(shift()) }, 'Jane')->result;

The execution produces the following output:

  $ perl subfromserver.pl 
  Hello Jane
  Hello Jane

  

The filter Argument

By default, the result of a subroutine call is a GRID::Machine::Result object. However, for a given subroutine this behavior can be changed using the filter argument. Thus, the subroutine filter_results installed in lines 12-15 of the code below returns, when called, the results attribute instead of the whole GRID::Machine::Result object. The subroutine filter_result installed in lines 17-20 returns the first element of the resulting list:

  $ cat -n filter.pl
     1  #!/usr/local/bin/perl -w
     2  use strict;
     3  use GRID::Machine;
     4  use Data::Dumper;
     5
     6  my $machine = GRID::Machine->new( host => $ENV{GRID_REMOTE_MACHINE} || shift);
     7
     8  $machine->sub(
     9    nofilter => q{ map { $_*$_ } @_ },
    10  );
    11
    12  $machine->sub(
    13    filter_results => q{ map { $_*$_ } @_ },
    14    filter => 'results'
    15  );
    16
    17  $machine->sub(
    18    filter_result => q{ map { $_*$_ } @_ },
    19    filter => 'result',
    20  );
    21
    22  my @x = (3..5);
    23  my $content = $machine->nofilter(@x);
    24  print Dumper($content);
    25
    26  $content = $machine->filter_results(@x);
    27  print Dumper($content);
    28
    29  $content = $machine->filter_result(@x);
    30  print Dumper($content);

When executed, the program above produces this output:

  $ filter.pl
  $VAR1 = bless( {
                   'stderr' => '',
                   'errmsg' => '',
                   'type' => 'RETURNED',
                   'stdout' => '',
                   'errcode' => 0,
                   'results' => [ 9, 16, 25 ]
                 }, 'GRID::Machine::Result' );
  $VAR1 = [ 9, 16, 25 ];
  $VAR1 = 9;

In general, a call to a subroutine installed using

                    $machine->sub( $name, $code, filter => $filter )

applies the $filter method to the GRID::Machine::Result object resulting from the call and returns the outcome. The filter $filter must be a GRID::Machine::Result method and must return a scalar. The filter is executed on the remote side of the GRID::Machine.

The results and result filters are convenient when the programmer isn't interested in the other attributes (stdout, stderr, etc.).
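
The way a filter named by a string selects part of a result object can be sketched locally. The class Toy::Result and the helper apply_filter below are hypothetical stand-ins written for this illustration; they are not part of GRID::Machine and only mimic the results and result getters described above.

```perl
use strict;
use warnings;

# Toy::Result is a hypothetical stand-in for GRID::Machine::Result,
# with just enough behaviour to show how a filter is applied.
package Toy::Result;

sub new     { my ($class, %args) = @_; return bless { %args }, $class }
sub results { return $_[0]{results} }        # reference to the whole list
sub result  { return $_[0]{results}[0] }     # first element only
sub stdout  { return $_[0]{stdout} }

package main;

# Apply the filter (a method name) to the result object, if one was given.
sub apply_filter {
    my ($r, $filter) = @_;
    return defined $filter ? $r->$filter() : $r;
}

my $r = Toy::Result->new(results => [ 9, 16, 25 ], stdout => '');

my $whole = apply_filter($r);               # the object itself
my $list  = apply_filter($r, 'results');    # [ 9, 16, 25 ]
my $first = apply_filter($r, 'result');     # 9

print ref($whole), "\n";
print "@$list\n";
print "$first\n";
```

Because the filter is just a method name, any getter of the result class that returns a scalar can play the same role.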

The around argument

A CODE reference. By default GRID::Machine produces a proxy representative on the local side for the sub being installed. The code of the proxy simply calls the corresponding sub on the remote side:

      sub { my $self = shift; $self->call( $name, @_ ) };

You can replace the proxy code with your own code using the around parameter. The proxy code receives the GRID::Machine object and the arguments for the remote side of the subroutine. See the following example:

  $ cat around.pl 
  #!/usr/bin/perl -w
  use strict;
  use GRID::Machine;

  my $machine = GRID::Machine->new( host => $ENV{GRID_REMOTE_MACHINE} || shift);

  $machine->sub( 
    squares => q{ map { $_*$_ } @_ },
    filter => 'results',
    around => sub { 
                my $self = shift; 
                my $r = $self->call( 'squares', @_ ); 
                map { $_+1 } @$r; 
              }
  );

  my @x = (3..5);
  my @r = $machine->squares(@x);
  print "@r\n";

When called, the squares function computes the squares on the remote side and adds one to each element on the local side:

  $ perl around.pl
  10 17 26
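
The relation between the default proxy and an around replacement can be tried without a remote host. The following is a minimal sketch: Toy::Machine, install and invoke are hypothetical names invented here, and the "remote" subs are plain code references run in the same process.

```perl
use strict;
use warnings;

# Toy::Machine is a hypothetical local stand-in: "remote" subs are plain
# code references kept in a hash, and call() runs them in this process.
package Toy::Machine;

sub new {
    my $class = shift;
    return bless { remote => {}, proxy => {} }, $class;
}

sub call {
    my ($self, $name, @args) = @_;
    return [ $self->{remote}{$name}->(@args) ];   # like filter => 'results'
}

# Install a "remote" sub plus its local proxy; 'around' replaces the default.
sub install {
    my ($self, $name, $code, %args) = @_;
    $self->{remote}{$name} = $code;
    $self->{proxy}{$name}  = $args{around}
        || sub { my $self = shift; $self->call($name, @_) };
    return $self;
}

# Run the proxy registered for $name.
sub invoke {
    my ($self, $name, @args) = @_;
    return $self->{proxy}{$name}->($self, @args);
}

package main;

my $m = Toy::Machine->new;

$m->install(plain => sub { map { $_ * $_ } @_ });
$m->install(
    squares => sub { map { $_ * $_ } @_ },
    around  => sub {
        my $self = shift;
        my $r = $self->call('squares', @_);   # "remote" part
        map { $_ + 1 } @$r;                   # local post-processing
    },
);

my $plain   = $m->invoke('plain', 3 .. 5);
my @wrapped = $m->invoke('squares', 3 .. 5);

print "@$plain\n";     # 9 16 25
print "@wrapped\n";    # 10 17 26
```

The around code decides what the caller finally sees, so it is a natural place for local post-processing of remote results.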

The makemethod Method

Syntax:

                    $machine->makemethod( $name, %args )

Valid arguments (%args) are:

  • politely => $politely

  • filter => $filter

  • around => sub { ... }

This method is identical to the sub method, except that it assumes the sub $name has already been installed on the remote side. The $name can be a fully qualified name, but the method call must use the short name. The following example produces a proxy method for the function reduce, which is already available on the remote machine:

  $ cat -n makemethod.pl 
     1  #!/usr/bin/perl -w
     2  use strict;
     3  use GRID::Machine;
     4  
     5  my $host = shift || $ENV{GRID_REMOTE_MACHINE};
     6  
     7  my $m = GRID::Machine->new(host => $host, uses => [q{List::Util qw{reduce}}]);
     8  
     9  $m->makemethod( 'reduce', filter => 'result' );
    10  my $r = $m->reduce(sub { $a > $b ? $a : $b }, (7,6,5,12,1,9));
    11  print "\$r =  $r\n";
    12  
    13  my $m2 = GRID::Machine->new(host => $host, uses => [q{List::Util}]);
    14  
    15  $m2->makemethod( 'List::Util::reduce' );
    16  $r = $m2->reduce(sub { $a > $b ? $a : $b }, (7,6,5,12,1,9));
    17  die $r->errmsg unless $r->ok;
    18  print "\$r =  ".$r->result."\n";

The execution produces:

  $ perl -w makemethod.pl 
  $r =  12
  $r =  12

The call Method

Syntax:

                  $result = $machine->call( $name, @args )

This method invokes a remote method that has earlier been defined using the compile or sub methods. The arguments are passed and the result is returned in the same way as with the eval method.

The makemethods Method

Convenience method to install several methods in a row. The following call installs methods fork, waitpid, kill and poll:

     $self->makemethods(
        [ 'fork', filter=>'result',
           around => sub { 
              my $self = shift; 
              my $r = $self->call( 'fork', @_ ); 
              $r->{machine} = $self; 
              $r 
           },
         ],
         [ 'waitpid', filter=>'result', ],
         [ 'kill', filter=>'result', ],
         [ 'poll', filter=>'result', ],
     );

Nested Structures

Nested Perl Data Structures can be transferred between the local and remote machines transparently:

      use Data::Dumper;

      my $host = shift || 'user@remote.machine.domain';

      my $machine = GRID::Machine->new(host => $host);

      my $r = $machine->sub(
        rpush => q{
          my $f = shift;
          my $s = shift;

          push @$f, $s;
          return $f;
        },
      );
      $r->ok or die $r->errmsg;

      my $f = [[1..3], { a => [], b => [2..4] } ];
      my $s = { x => 1, y => 2};

      $r = $machine->rpush($f, $s);
      die $r->errmsg unless $r->ok;

      $Data::Dumper::Indent = 0;
      print Dumper($r->result)."\n";

When executed, the program above produces:

      $ nested4.pl
      $VAR1 = [[1,2,3],{'a' => [],'b' => [2,3,4]},{'y' => 2,'x' => 1}];

Aliasing

Aliasing between parameters is correctly detected. The following code presents (line 24) a remote procedure call to a function iguales where the two local arguments $w and $z are the same reference:

 $ cat -n alias.pl
     1  #!/usr/local/bin/perl -w
     2  use strict;
     3  use GRID::Machine qw(qc);
     4
     5  my $machine = GRID::Machine->new(host => shift(), uses => [ 'Sys::Hostname' ]);
     6
     7  my $r = $machine->sub( iguales => qc q{
     8      my ($first, $sec) = @_;
     9
    10      print hostname().": $first and $sec are ";
    11
    12      if ($first == $sec) {
    13        print "the same\n";
    14        return 1;
    15      }
    16      print "Different\n";
    17      return 0;
    18    },
    19  );
    20  $r->ok or die $r->errmsg;
    21
    22  my $w = [ 1..3 ];
    23  my $z = $w;
    24  $r = $machine->iguales($w, $z);
    25  print $r;

When executed, the program produces the following output:

    $ alias.pl beowulf
    beowulf: ARRAY(0x8275040) and ARRAY(0x8275040) are the same

The converse is also true: equality on the remote side translates to equality on the local side. The program:

  $ cat -n aliasremote.pl
     1  #!/usr/local/bin/perl -w
     2  use strict;
     3  use GRID::Machine;
     4
     5  my $machine = GRID::Machine->new(host => shift(), uses => [ 'Sys::Hostname' ]);
     6
     7  $machine->sub( iguales => q{
     8      my $first = [1..3];
     9      my $sec = $first;
    10
    11      return (hostname(), $first, $sec);
    12    },
    13  );
    14
    15  my ($h, $f, $s)  = $machine->iguales->Results;
    16  print "$h: same\n" if $f == $s;
    17  print "$h: different\n" if $f != $s;

produces the following output:

                            $ aliasremote.pl beowulf
                            beowulf: same
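
Both properties (nested data and aliasing surviving the transfer) can be observed locally with any serializer that tracks shared references. The sketch below uses the core module Storable purely as a stand-in; it is an assumption made for illustration and says nothing about the wire format GRID::Machine actually uses.

```perl
use strict;
use warnings;
use Storable qw(freeze thaw);

# Nested structure, as in the rpush example.
my $f = [ [ 1 .. 3 ], { a => [], b => [ 2 .. 4 ] } ];

# Two arguments that are the same reference, as in alias.pl.
my $w = [ 1 .. 3 ];
my $z = $w;

# Serialize the whole argument list as one unit and rebuild it,
# as a receiving process would.
my ($f2, $w2, $z2) = @{ thaw(freeze([ $f, $w, $z ])) };

print "nested data survived\n"    if "@{ $f2->[1]{b} }" eq '2 3 4';
print "copy is a new structure\n" if $w2 != $w;
print "aliasing preserved\n"      if $w2 == $z2;
```

The key point is that the arguments travel together as a single structure; serializing $w and $z separately would produce two unrelated copies.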

The run Method

Syntax:

                             $m->run($command)

It is equivalent to

                             print $m->system($command)
 

Returns true if there were no messages on stderr.

The exists Method

Syntax:

                         $machine->exists(q{subname})

Returns true if, and only if, a subroutine named subname has been previously installed on that machine (via sub, compile or some other trick). See an example:

    use GRID::Machine;

    my $host = shift || 'user@remote.machine.domain';

    my $machine = GRID::Machine->new(host => $host);

    $machine->sub( one => q{ print "one\n" });

    print "<".$machine->exists(q{one}).">\n";
    print "<".$machine->exists(q{two}).">\n";

When executed, the code above produces the following output:

    $ exists.pl
    <1>
    <>

FUNCTIONS ON THE LOCAL SIDE

The read_modules Function

Syntax:

                use GRID::Machine qw(read_modules)
                read_modules(qw(Module::One Module::Two Module::Three ...))

Searches for the specified modules Module::One, etc. in the local Perl installation. Returns a string with the concatenation of the contents of these modules.

For example, the line:

       read_modules(qw(Parse::Eyapp Parse::Eyapp::))

returns a string containing the concatenation of the contents of all the modules in the Parse::Eyapp distribution. Modules are searched by name (like 'YAML') or by subcategories ('DBD::' means all modules under the DBD subdirectories of your Perl installation, matching both 'DBD::Oracle' and 'DBD::ODBC::Changes').
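
The search-and-concatenate idea can be sketched with core modules only. The helper slurp_modules below is hypothetical and written for this illustration; it handles plain module names and deliberately leaves out the subcategory form ('DBD::').

```perl
use strict;
use warnings;
use File::Spec;

# Hypothetical helper, not the module's code: return the source of each
# named module found in @INC, concatenated into one string.
sub slurp_modules {
    my @names = @_;
    my $text  = '';

    NAME: for my $name (@names) {
        my @parts = split /::/, $name;
        $parts[-1] .= '.pm';                       # Some::Module -> Some/Module.pm
        for my $dir (grep { !ref $_ } @INC) {      # skip @INC hooks
            my $path = File::Spec->catfile($dir, @parts);
            next unless -f $path;
            open my $fh, '<', $path or die "Can't read $path: $!";
            local $/;                              # slurp mode
            $text .= <$fh>;
            close $fh;
            next NAME;                             # first match wins
        }
        warn "Module $name not found in \@INC\n";
    }
    return $text;
}

my $source = slurp_modules(qw(strict File::Spec));
printf "%d bytes read\n", length $source;
```

Taking the first match in @INC mirrors what require would load, so the text collected is the version the local perl would actually use.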

The Function is_operative

The syntax is:

  is_operative($ssh, $machine, $command, $wait)

Returns true if $machine is available through ssh using automatic authentication and $command can be executed on the remote machine in less than $wait seconds. The following example illustrates its use:

  $ cat notavailable.pl
  #!/usr/local/bin/perl -w
  use strict;
  use GRID::Machine qw(is_operative);

  my $host = shift || 'user@machine.domain.es';
  my $command = shift || 'perl -v';
  my $delay = shift || 1;

  die "$host is not operative\n" unless is_operative('ssh', $host, $command, $delay);
  print "host is operative\n";

When not specified $command is perl -v and $wait is 15 seconds. The following two executions of the former example check the availability of machine beowulf:

  $ notavailable.pl beowulf
  host is operative
  $ notavailable.pl beowulf chum
  beowulf is not operative

The negative answer for the second execution is due to the fact that no command called chum is available on that machine.

If $machine is the empty string i.e. $machine eq '', it refers to a direct connection to the local machine and thus, it succeeds most of the time.

The Function qc

Prefixes the string passed as argument with the string #line $LINE $FILE where $LINE and $FILE are the calling line and calling file respectively. Used to provide more accurate error messages when evaluating remote code. See section "Errors and Exceptions".
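
The effect of such a #line prefix can be seen with a local string eval. The function my_qc below is a hypothetical re-implementation for illustration; the real qc may differ in detail.

```perl
use strict;
use warnings;

# Hypothetical re-implementation for illustration. It prefixes the code
# with a '#line LINE "FILE"' directive taken from the caller's position.
sub my_qc {
    my ($code) = @_;
    my (undef, $file, $line) = caller;
    return qq{#line $line "$file"\n$code};
}

my $code = q{ die "something failed" };

eval $code;
my $without = $@;     # location reported as "(eval N) line 1"

eval my_qc($code);
my $with = $@;        # location reported as the calling file and line

print "without qc: $without";
print "with qc:    $with";
```

With the directive in place the error message points at the file and line where the code string was written, which is far more useful than an anonymous eval number when the code runs on another machine.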

THE TRANSFERENCE OF FILES

The put Method

Syntax:

               $m->put([ 'file1', 'file2', ... ], 'targetdir/')
               $m->put([ 'file1', 'file2', ... ])

Transfers files from the local machine to the remote machine. When no target directory is specified, the files are copied into the current directory (i.e. $ENV{PWD}). If targetdir/ is a relative path, it is taken relative to the current directory on the remote machine. It returns TRUE on success. See an example:

    $ cat put.pl
    #!/usr/local/bin/perl -w
    use strict;
    use GRID::Machine;

    my $m = GRID::Machine->new( host => shift());

    $m->chdir('/tmp');
    $m->put([ $0 ]);
    $m->run("uname -a; ls -l $0");

When executed the program produces:

  $ put.pl orion
  Linux orion 2.6.8-2-686 #1 Tue Aug 16 13:22:48 UTC 2005 i686 GNU/Linux
  -rwxr-xr-x  1 casiano casiano 171 2007-07-01 11:46 ./put.pl

If there is only one source file we can specify a new name for the target. Thus, the line:

                  $m->put([ $0 ], '/tmp/newname.pl')

will copy the file containing the current program to the remote machine as /tmp/newname.pl

The get Method

Syntax:

               $m->get( [ 'file1', 'file2', ... ], 'targetdir/')
               $m->get( [ 'file1', 'file2', ... ])

Performs the reverse action of put: it transfers files from the remote machine to the local machine. When the paths of the files to transfer ('file1', 'file2', etc.) are relative, they are interpreted as relative to the current directory on the remote machine. See an example:

    use GRID::Machine;

    my $m = GRID::Machine->new( host => shift(), startdir => 'tutu',);

    $m->put([ glob('nes*.pl') ]);
    $m->run('uname -a; pwd; ls -l n*.pl');

    print "*******************************\n";

    my $progs = $m->glob('nes*.pl')->results;
    $m->get($progs, '/tmp/');
    system('uname -a; pwd; ls -l n*.pl');

When executed the program produces an output similar to this:

  $ get.pl remote
  Linux remote 2.6.15-1-686-smp #2 SMP Mon Mar 6 15:34:50 UTC 2006 i686 GNU/Linux
  /home/casiano/tutu
  -rwxr-xr-x 1 casiano casiano 569 2007-05-16 13:45 nested2.pl
  -rwxr-xr-x 1 casiano casiano 756 2007-05-22 10:10 nested3.pl
  -rwxr-xr-x 1 casiano casiano 511 2007-06-27 13:08 nested4.pl
  -rwxr-xr-x 1 casiano casiano 450 2007-06-27 15:20 nested5.pl
  -rwxr-xr-x 1 casiano casiano 603 2007-05-16 14:49 nested.pl
  *******************************
  Linux local 2.4.20-perfctr #6 SMP vie abr 2 18:36:12 WEST 2004 i686 GNU/Linux
  /tmp
  -rwxr-xr-x  1 pp2 pp2 569 2007-05-16 13:45 nested2.pl
  -rwxr-xr-x  1 pp2 pp2 756 2007-05-22 10:10 nested3.pl
  -rwxr-xr-x  1 pp2 pp2 511 2007-06-27 13:08 nested4.pl
  -rwxr-xr-x  1 pp2 pp2 450 2007-06-27 15:20 nested5.pl
  -rwxr-xr-x  1 pp2 pp2 603 2007-05-16 14:49 nested.pl

The copyandmake Method

Syntax:

    $m->copyandmake(
          dir => $dir,
          files => [ @files ],      # files to transfer
          make => $command,         # execute $command $commandargs 
          makeargs => $commandargs, # after the transference
          cleanfiles => $cleanup,   # remove files at the end
          cleandirs => $cleanup,    # remove the whole directory at the end
    )

copyandmake copies (using scp) the files @files to a directory named $dir on the remote machine. The directory $dir will be created if it does not exist. After the file transfer, the command specified by the copyandmake option

                     make => 'command' 

will be executed with the arguments specified in the option makeargs. If the make option isn't specified but there is a file named Makefile among the transferred files, the make program will be executed. Set the make option to the number 0 or the string '' if you want to avoid the execution of any command after the transfer. The transferred files will be removed when the connection finishes if the option cleanfiles is set. If the option cleandirs is set, the created directory and all the files below it will be removed. Observe that the directory and the files will be kept if they weren't created by this connection. By default, the call to copyandmake sets dir as the current directory on the remote machine. Use the option keepdir => 1 to avoid this.

LOADING CLASSES ONTO THE REMOTE SIDE

The modput Method

Syntax:

            $machine->modput(@Modulenames)

Where @Modulenames is a list of strings describing modules. Descriptors can be names (like 'YAML') or subcategories (like 'DBD::' meaning all modules under the DBD subdirectories of your Perl installation, matching both 'DBD::Oracle' and 'DBD::ODBC::Changes').

The following example will copy all the files in the distribution of Parse::Eyapp to the remote machine inside the directory $machine->prefix. After the call to

            my $r = $machine->modput('Parse::Eyapp', 'Parse::Eyapp::')

the module is available for use on the remote machine:

  use GRID::Machine;
  use Data::Dumper;

  my $host = $ENV{GRID_REMOTE_MACHINE} ||shift;

  my $machine = GRID::Machine->new(host => $host, prefix => q{perl5lib/});

  my $r = $machine->modput('Parse::Eyapp', 'Parse::Eyapp::');

  $r = $machine->eval(q{
      use Parse::Eyapp;

      print Parse::Eyapp->VERSION."\n";
    }
  );
  print Dumper($r);

When executed, the program above produces an output like this:

  $ modput.pl
  $VAR1 = bless( {
                   'stderr' => '',
                   'errmsg' => '',
                   'type' => 'RETURNED',
                   'stdout' => "1.07\n",
                   'errcode' => 0,
                   'results' => [ 1 ]
                 }, 'GRID::Machine::Result' );

The include Method

"The sub Method" permits the installation of a remote subroutine as a method of the GRID::Machine object. This is efficient when only a few subroutines are involved. However, for a large number of subroutines that procedure is error prone. It is better to keep the code in a separate module. This way we can test the components on the local machine and, once we are confident of their correct behavior, proceed to load them onto the remote machine. This is what include is for.

Syntax of include

    $m->include(
          "Some::Module", 
          exclude => [ qw( f1 f2 ) ], 
          alias => { g1 => 'min', g2 => 'max' }
    )

This call will search in the paths in @INC for Some/Module.pm. Once Some/Module.pm is found all the subroutines inside the module will be loaded as methods of the GRID::Machine (singleton) object $m. Code outside subroutines, plain old documentation and comments will be ignored. Everything after the markers __END__ or __DATA__ will also be ignored. The presence of the parameter 'exclude => [ qw( f1 f2 ) ]' means that subroutines f1 and f2 will be excluded from the process. Subroutine g1 will be renamed as min and subroutine g2 will be renamed as max.

Consider the following Remote Module:

  $ cat -n Include5.pm
     1  use strict;
     2
     3  sub last {
     4    $_[-1]
     5  }
     6
     7  sub one {
     8    print 'sub one'."\n";
     9  }
    10
    11  sub two {
    12    print "sub two\n";
    13  }

The following program includes the remote module Include5.pm:

  $ cat -n include5.pl
     1  #!/usr/local/bin/perl -w
     2  use strict;
     3  use GRID::Machine;
     4
     5  my $host = $ENV{GRID_REMOTE_MACHINE} || 'user@remote.machine';
     6
     7  my $machine = GRID::Machine->new( host => $host);
     8
     9  $machine->include("Include5", exclude => [ qw(two) ], alias => { last => 'LAST' });
    10
    11  for my $method (qw(last LAST one two)) {
    12    if ($machine->can($method)) {
    13      print $machine->host." can do $method\n";
    14    }
    15  }
    16
    17  print $machine->LAST(4..9)->result."\n";

Function two is excluded and the subroutine last is renamed as LAST:

  $ include5.pl
  user@remote.machine can do LAST
  user@remote.machine can do one
  9
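
The selection rules of include (only subroutines, nothing after __END__ or __DATA__, exclude and alias applied) can be approximated with a few regular expressions. The function remote_sub_names below is a deliberately naive, hypothetical sketch: it only lists the method names that would result and does not extract subroutine bodies, which the real include has to do.

```perl
use strict;
use warnings;

# Naive sketch: list the subs a Remote Module would contribute,
# honouring exclude and alias. Names only; bodies are not extracted.
sub remote_sub_names {
    my ($source, %args) = @_;
    my %exclude = map { $_ => 1 } @{ $args{exclude} || [] };
    my %alias   = %{ $args{alias} || {} };

    $source =~ s/^__(?:END|DATA)__\b.*//ms;        # drop everything after the marker
    $source =~ s/^=[a-zA-Z].*?^=cut\b.*?$//msg;    # drop pod blocks (simplified)

    my @names;
    while ($source =~ /^\s*sub\s+(\w+)/mg) {       # only subs that start a line
        next if $exclude{$1};
        push @names, $alias{$1} || $1;
    }
    return @names;
}

my $module = <<'EOM';
use strict;

sub last { $_[-1] }

sub one { print 'sub one'."\n"; }

sub two { print "sub two\n"; }

my $a = "sub five {}\n";

__DATA__

sub four { print "four\n"; }
EOM

my @names = remote_sub_names(
    $module,
    exclude => [ qw(two) ],
    alias   => { last => 'LAST' },
);
print "@names\n";    # LAST one
```

The string assigned to $a shows why the pattern is anchored at the start of a line: text that merely contains the word sub must not be mistaken for a definition.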

Remote Modules

The use of include leads us to the concept of Remote Modules. A Remote Module contains a family of subroutines that will be loaded onto the remote machine via the sub method of GRID::Machine objects.

Here is a small example of Remote Module:

  $ cat -n Include.pm
     1  sub one {
     2    print "sub one\n";
     3  }
     4
     5  sub two {
     6    print 'sub two'."\n";
     7  }
     8
     9  sub three {
    10    print "sub three\n";
    11  }
    12
    13  my $a = "sub five {}\n";
    14  my $b = 'sub six {}';
    15
    16  __DATA__
    17
    18  sub four {
    19    print "four\n";
    20  }

Source after the __DATA__ or __END__ delimiters is ignored. Code outside subroutines (for example lines 13 and 14) and pod documentation are also ignored. Only the subroutines defined in the module are loaded. See a program that includes the former remote module:

  $ cat -n include.pl
     1  #!/usr/local/bin/perl -w
     2  use strict;
     3  use GRID::Machine;
     4
     5  my $host = $ENV{GRID_REMOTE_MACHINE} || 'user@remote.machine.es';
     6
     7  my $machine = GRID::Machine->new( host => $host);
     8
     9  $machine->include(shift() || "Include");
    10
    11  for my $method (qw(one two three four five six)) {
    12    if ($machine->can($method)) {
    13      print $machine->host." can do $method\n";
    14      print $machine->$method();
    15    }
    16  }

When executed the former program produces an output like:

  $ include.pl
  user@remote.machine.es can do one
  sub one
  user@remote.machine.es can do two
  sub two
  user@remote.machine.es can do three
  sub three

The use and LOCAL directives in Remote Modules

Two directives that can be used inside a Remote Module are use and LOCAL:

  • A use Something pragma inside a Remote Module indicates that the module Something must be loaded onto the remote machine. Of course, the module must be available there. If it isn't installed, an alternative is to transfer the module(s) from the local machine to the remote machine using modput (see section "The modput Method").

  • A LOCAL { code } directive inside a Remote Module wraps code that will be executed on the local machine. LOCAL directives can be used to massively load subroutines as in the example below.

The following remote module contains a use pragma in line 2.

  $ cat -n Include4.pm
     1  use strict;
     2  use List::Util qw(sum); # List::Util will be loaded on the Remote Side
     3
     4  sub sigma {
     5    sum(@_);
     6  }
     7
     8  LOCAL {
     9    print "Installing new functions\n";
    10    for (qw(r w e x z s f d  t T B M A C)) {
    11      SERVER->sub( "$_" => qq{
    12          my \$file = shift;
    13
    14          return -$_ \$file;
    15        }
    16      );
    17    }
    18  }

Lines 9-17 are surrounded by a LOCAL directive and thus they will be executed on the local side. The effect is to install new methods for the GRID::Machine object that will be equivalent to the classic Perl file tests: -r, -w, etc. Inside a LOCAL directive the function SERVER returns a reference to the current GRID::Machine object (see line 11).

See a program that loads the former Remote Module. The call to include will load List::Util on the remote machine importing the sum function. Furthermore, methods with names sigma, r, w, etc. will be installed:

  $ cat -n include4.pl
     1  #!/usr/local/bin/perl -w
     2  use strict;
     3  use GRID::Machine;
     4
     5  my $host = $ENV{GRID_REMOTE_MACHINE} || 'user@remote.machine.es';
     6
     7  my $machine = GRID::Machine->new(host => $host,);
     8
     9  $machine->include(shift() || "Include4");
    10
    11  print "1+..+5 = ".$machine->sigma( 1..5 )->result."\n";
    12
    13  $machine->put([$0]);
    14
    15  for my $method (qw(r w e x s t f d)) {
    16    if ($machine->can($method)) {
    17      my $r = $machine->$method($0)->result || "";
    18      print $machine->host."->$method( include4.pl ) = <$r>\n";
    19    }
    20  }

When executed the program produces an output like:

  $ include4.pl
  Installing new functions
  1+..+5 = 15
  user@remote.machine.es->r( include4.pl ) = <1>
  user@remote.machine.es->w( include4.pl ) = <1>
  user@remote.machine.es->e( include4.pl ) = <1>
  user@remote.machine.es->x( include4.pl ) = <1>
  user@remote.machine.es->s( include4.pl ) = <498>
  user@remote.machine.es->t( include4.pl ) = <>
  user@remote.machine.es->f( include4.pl ) = <1>
  user@remote.machine.es->d( include4.pl ) = <>

Specifying filters and proxies via #gm

We can include grid machine comments between the name of the subroutine and the opening curly bracket to specify the options for the sub installation. These comments must start with #gm. See an example:

  $ cat -n Include6.pm 
     1  use strict;
     2  
     3  sub last 
     4  #gm (filter => 'result', ) 
     5  {
     6    $_[-1] 
     7  }
     8  
     9  sub LASTitem 
    10  {
    11    $_[-1] 
    12  }
    13  
    14  sub one 
    15  #gm (
    16  #gm   filter => 'result', 
    17  #gm   around => sub { 
    18  #gm     my $self = shift; 
    19  #gm     my $r = $self->call( 'one', @_ ); 
    20  #gm     use Sys::Hostname;
    21  #gm     $r."Local machine: ".hostname()."\n" 
    22  #gm   },
    23  #gm )
    24  {
    25    SERVER->host." received: <@_>\n";
    26  }

Sub last will return just the result instead of the full GRID::Machine::Result object. We have also substituted the proxy representative of method one using the around parameter. Consider the following script example:

  $ cat -n include6.pl 
     1  #!/usr/bin/perl -w
     2  use strict;
     3  use GRID::Machine;
     4  use Data::Dumper;
     5  
     6  my $host = $ENV{GRID_REMOTE_MACHINE};
     7  
     8  my $machine = GRID::Machine->new( host => $host);
     9  
    10  $machine->include("Include6");
    11  
    12  print $machine->last(4..9)."\n";
    13  
    14  my $r = $machine->LASTitem(4..9);
    15  print Dumper($r);
    16  
    17  print $machine->one(4..9)."\n";

When executed, it produces:

  $ perl include6.pl 
  9
  $VAR1 = bless( {
                   'stderr' => '',
                   'errmsg' => '',
                   'type' => 'RETURNED',
                   'stdout' => '',
                   'errcode' => 0,
                   'results' => [
                                  9
                                ]
                 }, 'GRID::Machine::Result' );
  some.machine received: <4 5 6 7 8 9>
  Local machine: my.local.machine

THE GRID::Machine::Core REMOTE MODULE

The creation of a GRID::Machine object through a call to GRID::Machine->new implies the loading of a Remote Module called GRID::Machine::Core which is delivered with the GRID::Machine distribution. Another module that is being included at construction time is GRID::Machine::RIOHandle.

One of the final goals of the GRID::Machine::Core remote module is to provide a homonymous method for each of the Perl CORE:: functions. At present only a few are supported.

The following functions defined in the Remote Module GRID::Machine::Core are loaded via the include mechanism on the remote machine. Therefore, they work as methods of the GRID::Machine object on the local machine. They perform the same operations as their Perl namesakes:

Function getcwd

Function chdir

Function umask

Function mkdir

Function system

Executes system on the remote machine. See an example:

  $ cat -n examples/transfer2.pl 
     1  #!/usr/local/bin/perl -w
     2  use strict;
     3  use GRID::Machine qw(is_operative);
     4  use Data::Dumper;
     5  
     6  my $host = shift || $ENV{GRID_REMOTE_MACHINE};
     7  
     8  my $machine = GRID::Machine->new( 
     9        host => $host,
    10        cleanup => 1,
    11        sendstdout => 1,
    12        startdir => '/tmp/perl5lib',
    13        prefix => '/tmp/perl5lib/',
    14     );
    15  
    16  my $dir = $machine->getcwd->result;
    17  print "$dir\n";
    18  
    19  $machine->modput('Parse::Eyapp::') or die "can't send module\n";
    20  
    21  print $machine->system('tree');
    22  my $r =  $machine->system('doesnotexist');
    23  print Dumper $r;

Observe the overloading of bool at line 19. modput returns a GRID::Machine::Result object which is evaluated in a Boolean context as a call to the result getter.

When executed, the program produces an output like:

  $ perl -w examples/transfer2.pl 
  /tmp/perl5lib
  .
  `---- Parse
      `---- Eyapp
          |---- Base.pm
          |---- Cleaner.pm
          |---- Driver.pm
          |---- Grammar.pm
          |---- Lalr.pm
          |---- Node.pm
          |---- Options.pm
          |---- Output.pm
          |---- Parse.pm
          |---- Scope.pm
          |---- TokenGen.pm
          |---- Treeregexp.pm
          |---- _TreeregexpSupport.pm
          |---- Unify.pm
          `---- YATW.pm

  2 directories, 15 files
  $VAR1 = bless( {
                   'stderr' => 'Can\'t exec "doesnotexist":',
                   'errmsg' => '',
                   'type' => 'RETURNED',
                   'stdout' => '',
                   'errcode' => -1,
                   'results' => [ -1 ]
                 }, 'GRID::Machine::Result' );

Function qx

Similar to backtick quotes. The result depends on the context. In a list context it returns a list with the lines of the output. In a scalar context it returns a string with the output. The value of $" on the local machine decides the record separator used. See an example:

  $ cat -n transfer3.pl
     1  #!/usr/local/bin/perl -w
     2  use strict;
     3  use GRID::Machine qw(is_operative);
     4  use Data::Dumper;
     5
     6  my $host = shift || 'casiano@remote.machine.es';
     7
     8  my $machine = GRID::Machine->new( host => $host );
     9  my $DOC = << "DOC";
    10  one. two. three.
    11  four. five. six.
    12  seven.
    13  DOC
    14
    15  # List context: returns  a list with the lines
    16  {
    17    local $/ = '.';
    18    my @a = $machine->qx("echo '$DOC'");
    19    local $"= ",";
    20    print "@a";
    21  }
    22
    23  # scalar context: returns a string with the output
    24  my $a = $machine->qx("echo '$DOC'");
    25  print $a;

When executed, the program produces the following output:

  $ transfer3.pl
  one., two., three.,
  four., five., six.,
  seven.,

  one. two. three.
  four. five. six.
  seven.

Function glob

Function tar

It is equivalent to:

            system('tar', $options, '-f', $file)

Where $options is a string containing the options. Returns the error code from tar. Example:

  $m->tar($dist, '-xz')->ok or warn "$host: Can't extract files from $dist\n";

Function version

Syntax:

              $machine->version('Some::Module')

Returns the VERSION of the module if the given module is installed on the remote machine and has a VERSION number.

See an example of use:

  $ cat version.pl
  #!/usr/bin/perl -w
  use strict;
  use GRID::Machine;
  use Data::Dumper;

  my $host = $ENV{GRID_REMOTE_MACHINE} ||shift;

  my $machine = GRID::Machine->new(host => $host,);

  print Dumper($machine->version('Data::Dumper'));
  print Dumper($machine->version('Does::Not::Exist::Yet'));

When executed the program produces an output similar to this:

  $ version.pl
  $VAR1 = bless( {
                   'stderr' => '',
                   'errmsg' => '',
                   'type' => 'RETURNED',
                   'stdout' => '',
                   'errcode' => 0,
                   'results' => [ '2.121_08' ]
                 }, 'GRID::Machine::Result' );
  $VAR1 = bless( {
                   'stderr' => 'Can\'t locate Does/Not/Exist/Yet.pm in @INC \
                                (@INC contains: /etc/perl /usr/local/lib/perl/5.8.8 ...
                                BEGIN failed--compilation aborted.
                               ',
                   'errmsg' => '',
                   'type' => 'RETURNED',
                   'stdout' => '',
                   'errcode' => 0,
                   'results' => [ '' ]
                 }, 'GRID::Machine::Result' );

Function installed

Syntax:

              $machine->installed('Some::Module')

Returns TRUE if the given module is installed on the remote machine. It is equivalent to:

            system("$^X -M$module -e 0")
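
The same check can be tried against the local perl. The helper module_installed below is a hypothetical sketch: it runs the command shown above through the list form of system and turns the exit status into a boolean. The name validation is an extra precaution added here, not something the text above describes.

```perl
use strict;
use warnings;

# Hypothetical local helper: true if "perl -MModule -e 0" exits with status 0.
sub module_installed {
    my ($module) = @_;
    die "Bad module name '$module'\n" unless $module =~ /\A\w+(?:::\w+)*\z/;
    # List form of system: no shell is involved.
    return system($^X, "-M$module", '-e', '0') == 0 ? 1 : 0;
}

print "strict: ",                module_installed('strict'), "\n";
print "Does::Not::Exist::Yet: ", module_installed('Does::Not::Exist::Yet'), "\n";
```

For the missing module the child perl still prints its own "Can't locate ..." message on stderr; only the exit status is used for the answer.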

File Status Methods

Methods equivalent to the file test operators

      -r -w -e -x -z -s -f -d  -t -T -B -M -A -C

are provided. Since hyphens aren't legal in Perl identifiers the hyphen has been substituted by an underscore. See an example:

  $ cat copyandmkdir.pl
  #!/usr/local/bin/perl -w
  use strict;
  use GRID::Machine;

  my $host = 'remote.machine.es';
  my $dir = shift || "somedir";
  my $file = shift || $0; # By default copy this program

  my $machine = GRID::Machine->new(
    host => $host,
    uses => [qw(Sys::Hostname)],
  );

  # Create the directory only if it is not already writable
  unless ($machine->_w($dir)) {
    $machine->mkdir($dir, 0777)->ok or die "Can't make dir\n";
  }
  $machine->chdir($dir)->ok or die "Can't change dir\n";
  $machine->put([$file]) or die "Can't copy file\n";
  print "HOST: ",$machine->eval(" hostname ")->result,"\n",
        "DIR: ",$machine->getcwd->result,"\n",
        "FILE: ",$machine->glob('*')->result,"\n";

When this program runs we get an output similar to this:

                    $ copyandmkdir.pl
                    HOST: orion
                    DIR: /home/casiano/somedir
                    FILE: copyandmkdir.pl
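
The naming rule (the test -r becomes a method _r, and so on) can be illustrated locally. The sketch below generates one subroutine per file test operator in a made-up package Local::FileTests; it only shows the naming convention and is not taken from the GRID::Machine sources.

```perl
package Local::FileTests;   # hypothetical package, for illustration only
use strict;
use warnings;

# Build one sub per file test operator: -r becomes _r, -d becomes _d, ...
for my $op (qw(r w e x z s f d)) {
  my $code = eval "sub { my \$file = shift; return -$op \$file }"
    or die "Can't build _$op: $@";
  no strict 'refs';
  *{ __PACKAGE__ . "::_$op" } = $code;
}

package main;

print "The current directory is a directory\n" if Local::FileTests::_d('.');
print "This script exists\n"                   if Local::FileTests::_e($0);
```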

ON THE REMOTE SIDE

The Structure of the Remote Server

As with most servers, the server side of the GRID::Machine object consists of an infinite loop waiting for requests:

  while( 1 ) {
     my ( $operation, @args ) = $server->read_operation();

     if ($server->can($operation)) {
       $server->$operation(@args);
       next;
     }

     $server->send_error( "Unknown operation $operation\nARGS: @args\n" );
  }
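
The dispatch idea in this loop can be tried out locally without any SSH link. The sketch below is a toy: the class Toy::Server, its in-memory request queue and the ECHO tag are all invented for illustration and do not belong to GRID::Machine. Unlike the infinite loop above, it stops when the queue is empty.

```perl
package Toy::Server;   # hypothetical stand-in, not the real remote server
use strict;
use warnings;

sub new {
  my ($class, @requests) = @_;
  return bless { queue => [@requests], sent => [] }, $class;
}

# Pretend to read from the link: take the next queued request
sub read_operation {
  my $self = shift;
  my $request = shift @{ $self->{queue} } or return;
  return @$request;
}

sub send_result { my $self = shift; push @{ $self->{sent} }, "RETURNED: @_" }
sub send_error  { my $self = shift; push @{ $self->{sent} }, "DIED: @_" }

# One handler method per protocol tag
sub ECHO { my ($self, @args) = @_; $self->send_result(@args) }

package main;

my $server = Toy::Server->new([ ECHO => 'hello' ], [ NOSUCH => 1, 2 ]);

while (my ($operation, @args) = $server->read_operation) {
  if ($server->can($operation)) {
    $server->$operation(@args);
    next;
  }
  $server->send_error("Unknown operation $operation");
}

print "$_\n" for @{ $server->{sent} };
```

Because the loop looks handlers up with can, adding a method to the class is enough to add a new tag.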

The Protocol

The protocol simply consists of the name of the method to execute and the arguments for such method. The programmer - using inheritance - can extend the protocol with new methods (see the section "EXTENDING THE PROTOCOL"). The following operations are currently supported:

  • GRID::Machine::EVAL

    Used by the local method eval

  • GRID::Machine::STORE

    Used by the local methods compile and sub to install code on the remote side.

  • GRID::Machine::EXISTS

    Used by the local method exists

  • GRID::Machine::CALL

    Used by the local method call

  • GRID::Machine::MODPUT

    Used by the modput method. A list of pairs (Module::Name, code for Module::Name) is sent to the remote machine. For each pair, the remote side writes to disk a file Module/Name.pm with the contents of the string code for Module::Name. The file is stored in the directory referenced by the prefix attribute of the GRID::Machine object.

  • GRID::Machine::OPEN

    Used by the open method. It receives as argument a string defining the way the file will be accessed.

  • GRID::Machine::QUIT

    Usually called automatically when the GRID::Machine object goes out of scope.

  • GRID::Machine::GPRINT

    Most requests go from the local machine to the remote Perl server. This request and the next one go in the other direction: they are generated on the remote machine and served by the local machine. It is used when immediate printing is required (see section "Functions gprint and gprintf").

  • GRID::Machine::GPRINTF

    This request is generated on the remote machine and served by the local machine. It is used when immediate printing is required (see section "Functions gprint and gprintf").

  • GRID::Machine::CALLBACK

    Used to implement callbacks

The SERVER function

The SERVER function is available on the remote machine. It returns the object representing the remote side of the GRID::Machine object, so that code running on the remote side can gain access to it. See an example:

    my $m = GRID::Machine->new( host => 'beowulf');

    $m->sub(installed => q { return  keys %{SERVER->stored_procedures}; });
    my @functions = $m->installed()->Results;
    local $" = "\n";
    print "@functions\n";

The stored_procedures method returns a reference to the hash containing the subroutines installed via the sub and compile methods. The keys are the names of the subroutines, the values are the CODE references implementing them. When executed the former program produces the list of installed subroutines:

                    $ accessobject.pl
                    tar
                    system
                    installed
                    getcwd
                    etc.

The read_operation Method

Syntax:

     my ( $operation, @args ) = $server->read_operation( );

Reads a message from the link. Returns the type of operation (the tag) followed by its arguments.

The send_error Method

Syntax:

     $server->send_error( "Error message" );

Code executed on the remote machine can use send_error to send error messages to the client.

The send_result Method

Syntax:

    $server->send_result( 
        stdout  => $stdout,
        stderr  => $stderr,
        errmsg  => $errmsg,
        results => [ @results ],
    );

Code executed on the remote machine can use send_result to send results to the client.

EXTENDING THE PROTOCOL

Let us see a simple example. We will extend the protocol with a new tag MYTAG. We have to write a module that will be used on the remote side of the link:

  $ cat MyRemote.pm
  package GRID::Machine;
  use strict;

  sub MYTAG {
    my ($server, $name) = @_;

    # Answer exactly once: greet if a name was given, report an error otherwise
    return $server->send_operation("RETURNED", "Hello $name!\n") if defined($name);
    $server->send_operation("DIED", "Error: Provide a name to greet!\n");
  }

  1;

This component will be loaded on the remote machine via the SSH link. The name of the handling method MYTAG must be the same as the name of the tag (operation type) used to send the request. Here is a client program using the new tag:

  $ cat -n extendprotocol.pl
     1  #!/usr/local/bin/perl -w
     2  use strict;
     3  use GRID::Machine;
     4
     5  my $name = shift;
     6  my $host = 'user@remote.machine';
     7
     8  my $machine = GRID::Machine->new(host => $host, remotelibs => [ qw(MyRemote) ]);
     9
    10  $machine->send_operation( "MYTAG", $name);
    11  my ($type, $result) = $machine->read_operation();
    12
    13  die $result unless $type eq 'RETURNED';
    14  print $result;

When the program is executed we get the following output:

                          $ extendprotocol.pl Larry
                          Hello Larry!
                          $ extendprotocol.pl
                          Error: Provide a name to greet!

IMMEDIATE PRINTING

Functions gprint and gprintf

When running an RPC, the output generated during the execution of the remote subroutine isn't available until the RPC returns. Use gprint and gprintf if you want immediate output (for debugging purposes, for instance). They work like print and printf respectively.

See an example:

  $ cat -n gprint.pl
     1  #!/usr/local/bin/perl -w
     2  use strict;
     3  use GRID::Machine;
     4
     5  my $host = $ENV{GRID_REMOTE_MACHINE};
     6
     7  my $machine = GRID::Machine->new(host => $host, uses => [ 'Sys::Hostname' ]);
     8
     9  my $r = $machine->sub(
    10    rmap => q{
    11      my $f = shift; # function to apply
    12      die "Code reference expected\n" unless UNIVERSAL::isa($f, 'CODE');
    13
    14
    15      print "Inside rmap!\n"; # last message
    16      my @result;
    17      for (@_) {
    18        die "Array reference expected\n" unless UNIVERSAL::isa($_, 'ARRAY');
    19
    20        gprint hostname(),": Processing @$_\n";
    21
    22
    23        push @result, [ map { $f->($_) } @$_ ];
    24      }
    25
    26      gprintf "%12s:\n",hostname();
    27      for (@result) {
    28        my $format = "%5d"x(@$_)."\n";
    29        gprintf $format, @$_
    30      }
    31      return @result;
    32    },
    33  );
    34  die $r->errmsg unless $r->ok;
    35
    36  my $cube = sub { $_[0]**3 };
    37  $r = $machine->rmap($cube, [1..3], [4..6], [7..9]);
    38  print $r;

When executed the program produces the following output:

          $ gprint.pl
          orion: Processing 1 2 3
          orion: Processing 4 5 6
          orion: Processing 7 8 9
                 orion:
              1    8   27
             64  125  216
            343  512  729
          Inside rmap!

Observe how the message 'Inside rmap!', generated at line 15 using print, comes last: it is actually sent to STDOUT at line 38. The messages generated using gprint and gprintf (lines 20, 26 and 29) were immediately sent to STDOUT.
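
The difference between buffered and immediate output can be reproduced on a single machine. In the sketch below, run_buffered and immediate are hypothetical helpers written for this illustration: the first collects plain print output in a string, much as an RPC holds it until it returns, and the second plays the role of gprint by writing straight to STDOUT. It is not the mechanism GRID::Machine uses over the SSH link.

```perl
use strict;
use warnings;

# Run $code with the default output handle pointed at an in-memory buffer,
# then hand back the buffer together with the results.
sub run_buffered {
  my ($code, @args) = @_;

  my $buffer = '';
  open my $capture, '>', \$buffer or die "Can't open in-memory handle: $!";
  my $previous = select $capture;   # plain print now goes to $buffer
  my @results = $code->(@args);
  select $previous;
  close $capture;

  return ($buffer, @results);
}

# Hypothetical stand-in for gprint: bypasses the selected handle
sub immediate { print STDOUT @_ }

my ($output, @results) = run_buffered(sub {
  print "Inside the sub!\n";        # buffered until the call returns
  immediate("Processing @_\n");     # shown at once
  return map { $_ ** 3 } @_;
}, 1 .. 3);

print $output;                      # the buffered text appears last
print "Results: @results\n";
```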

REMOTE DEBUGGING

To run the remote side under the control of the perl debugger use the debug option of new. The associated value must be a port number higher than 1024:

     my $machine = GRID::Machine->new(
        host => $host,
        debug => $port,
        includes => [ qw{SomeFunc} ],
     );

Before running the example, open an SSH session to the remote machine in a different terminal and execute netcat to listen (option -l) on the chosen port:

  pp2@nereida:~/LGRID_Machine$ ssh beowulf 'netcat -v -l -p 12345'
  listening on [any] 12345 ...
                              

Now run the program in the first terminal:

  pp2@nereida:~/LGRID_Machine/examples$ debug1.pl beowulf:12345
  Debugging with 'ssh beowulf PERLDB_OPTS="RemotePort=beowulf:12345" perl -d'
  Remember to run 'netcat -v -l -p 12345' in beowulf

The program looks blocked. If you go to the other terminal you will find the familiar perl debugger prompt:

  casiano@beowulf:~$ netcat -v -l -p 12345
  listening on [any] 12345 ...
  connect to [193.145.102.240] from beowulf.pcg.ull.es [193.145.102.240] 38979

  Loading DB routines from perl5db.pl version 1.28
  Editor support available.

  Enter h or `h h' for help, or `man perldebug' for more help.

  GRID::Machine::MakeAccessors::(/home/pp2/LGRID_Machine/lib/GRID/Machine/MakeAccessors.pm:33):
  33:     1;
  auto(-1)  DB<1> c GRID::Machine::main
  GRID::Machine::main(/home/pp2/LGRID_Machine/lib/GRID/Machine/REMOTE.pm:490):
  490:      my $server = shift;
    DB<2>                        

From now on you can execute almost any debugger command. Unfortunately you are now inside GRID::Machine code and - until you gain some familiarity with GRID::Machine code - it is a bit difficult to find where your code is and where to put your breakpoints. Future work: write a proper debugger front end.

THE GRID::Machine::Result CLASS

The class GRID::Machine::Result is used by both the local and remote sides of the GRID::Machine, though most of its methods are called on the remote side.

The result of an RPC is a GRID::Machine::Result object. Such an object has the following attributes:

  • type

    The type of result returned. A string. Fixed by the protocol. Common values are RETURNED and DIED.

  • stdout

    A string containing the contents of STDOUT produced during the duration of the RPC

  • stderr

    A string containing the contents of STDERR produced during the duration of the RPC

  • results

    A reference to an ARRAY containing the results returned by the RPC

  • errcode

    The contents of $? as produced during the RPC

  • errmsg

    The contents of $@ as produced during the RPC

The Constructor new

Syntax:

  GRID::Machine::Result->new(
    stdout => $rstdout, 
    errmsg  => $err, 
    stderr => $rstderr, 
    results => \@results
  )

Builds a new result object.

The ok Method

Returns TRUE if the RPC didn't die, i.e. if the type attribute is not the string 'DIED'.

The noerr Method

Returns TRUE if the RPC didn't die and didn't send any messages through stderr. See an example. When running the following program:

  $ cat noerrvsok.pl
  #!/usr/local/bin/perl -w
  use strict;
  use GRID::Machine;

  my $machine = shift || $ENV{GRID_REMOTE_MACHINE};
  my $m = GRID::Machine->new( host => $machine );

  my $r = $m->eval( q{print STDERR "This is the end\n" });

  print "print to STDERR:\n";
  print "<".$r->ok.">\n";
  print "<".$r->noerr.">\n";

  $r = $m->eval( q{warn "This is a warning\n" });

  print "Warn:\n";
  print "<".$r->ok.">\n";
  print "<".$r->noerr.">\n";

we get the following output:

                $ noerrvsok.pl
                print to STDERR:
                <1>
                <>
                Warn:
                <1>
                <>

The result Method

Returns the first element of the list referenced by the results attribute. This method is also called when a GRID::Machine::Result object is evaluated in a Boolean context (i.e. bool is overloaded).

The Results Method

Returns the list referenced by the results attribute

The str Method. Stringification of a Result object

Returns the string made by concatenating stdout, stderr and errmsg. The Perl operator q("") is overloaded using this method. Thus, wherever a GRID::Machine::Result object is used in a string context the str method will be called.
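
The behaviour of ok, noerr, result, Results and the two overloaded operators can be mimicked with a few lines of Perl. The class Toy::Result below is a hypothetical stand-in written from the descriptions in this section; it is a sketch of the documented behaviour, not the source of GRID::Machine::Result.

```perl
package Toy::Result;   # hypothetical class, mimics the documented behaviour
use strict;
use warnings;
use overload
  'bool'   => sub { $_[0]->result },   # Boolean context: first result
  '""'     => sub { $_[0]->str },      # string context: all the output
  fallback => 1;

sub new {
  my ($class, %args) = @_;
  my %defaults = (type => 'RETURNED', stdout => '', stderr => '',
                  errmsg => '', errcode => 0, results => []);
  return bless { %defaults, %args }, $class;
}

sub ok      { $_[0]{type} ne 'DIED' }
sub noerr   { $_[0]->ok && $_[0]{stderr} eq '' }
sub result  { $_[0]{results}[0] }
sub Results { @{ $_[0]{results} } }
sub str     { join '', @{ $_[0] }{qw(stdout stderr errmsg)} }

package main;

my $r = Toy::Result->new(
  stdout  => "done\n",
  stderr  => "careful\n",
  results => [ 42, 43 ],
);

print "ok\n"        if $r->ok;
print "but noisy\n" unless $r->noerr;
print "true\n"      if $r;
print "Text: $r";
```

A result whose first element is false, such as results => [ 0 ], is false in Boolean context even though the call itself succeeded, so ok is the safer test for success.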

THE GRID::Machine::Message CLASS

This class is used by both the local and the remote sides of the GRID::Machine. It implements the low level communication layer. It is responsible for marshalling the data.

The read_operation Method

Syntax:

   my ( $operation, @args ) = $server->read_operation( );

Returns the kind of operation and the data sent by the other side of the SSH link.

The send_operation Method

Examples:

  $server->send_operation("RETURNED", GRID::Machine::Result->new( %arg ));

  $server->send_operation("DIED", GRID::Machine::Result->new( 
                                    errmsg  => "$server->{host}: $message")
  );

  $server->send_operation("RETURNED", exists($server->{stored_procedures}{$name}));

Sends the type of the message and the arguments to the other side of the link. It uses Data::Dumper to serialize the data structures.
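
Serializing with Data::Dumper means that the receiver can rebuild the data with eval. The round trip below is a minimal sketch of that idea: the helpers freeze_message and thaw_message and the single-line format are assumptions made for illustration, and the actual wire format used by GRID::Machine::Message is not shown in this document.

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical helpers: turn a message into one string and back again
sub freeze_message {
  my @message = @_;
  local $Data::Dumper::Terse  = 1;   # no '$VAR1 =' prefix
  local $Data::Dumper::Indent = 0;   # no pretty-printing
  return Dumper(\@message);
}

sub thaw_message {
  my $string  = shift;
  my $message = eval $string;        # only acceptable for trusted input
  die "Can't decode message: $@" if $@;
  return @$message;
}

my $wire = freeze_message(
  RETURNED => { stdout => "hi\n", results => [ 1, [ 2, 3 ] ] },
);
my ($type, $payload) = thaw_message($wire);

print "type: $type\n";
print "nested: $payload->{results}[1][0]\n";
```

Since decoding relies on eval, this technique is only appropriate when both ends of the link are trusted.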

REMOTE INPUT/OUTPUT

GRID::Machine objects have an open method that returns a GRID::Machine::IOHandle object. Such objects behave very much like IO::Handle objects, but they refer to handles and files on the associated machine. See a simple example:

  use GRID::Machine;

  my $machine = shift || 'remote.machine';
  my $m = GRID::Machine->new( host => $machine );

  my $f = $m->open('> tutu.txt'); # Creates a GRID::Machine::IOHandle object
  $f->print("Hola Mundo!\n");
  $f->print("Hello World!\n");
  $f->printf("%s %d %4d\n","Bona Sera Signorina", 44, 77);
  $f->close();

  $f = $m->open('tutu.txt');
  my $x = <$f>;
  print "\n******diamond scalar********\n$x\n";
  $f->close();

  $f = $m->open('tutu.txt');
  my $old = $m->input_record_separator(undef);
  $x = <$f>;
  print "\n******diamond scalar context and \$/ = undef********\n$x\n";
  $f->close();
  $old = $m->input_record_separator($old);

A remote GRID::Machine::IOHandle object is created through the call

                   my $f = $m->open('> tutu.txt')

From that moment on we can write to the file using the print and printf methods of GRID::Machine::IOHandle objects. Later in the same code the diamond operator is used to read from a remote file:

                             my $x = <$f>;

When we run the example above we get an output similar to this:

    $ synopsisiohandle.pl

    ******diamond scalar********
    Hola Mundo!


    ******diamond scalar context and $/ = undef********
    Hola Mundo!
    Hello World!
    Bona Sera Signorina 44   77

See also the documentation in GRID::Machine::IOHandle for more detailed information.
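
The effect of the input record separator on the diamond operator is the same as in ordinary local Perl. The following sketch uses an in-memory file and the built-in variable $/ instead of the input_record_separator method, so it runs without any remote machine:

```perl
use strict;
use warnings;

my $content = "Hola Mundo!\nHello World!\n";

# In-memory file, so the sketch needs no disk access
open my $fh, '<', \$content or die "Can't open: $!";
my $first_line = <$fh>;                   # reads up to the first "\n"
close $fh;

open $fh, '<', \$content or die "Can't open: $!";
my $everything = do { local $/; <$fh> };  # $/ undefined: read to end of file
close $fh;

print "Line:  $first_line";
print "Slurp: $everything";
```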

REMOTE PIPES

Opening pipes for input

The open method of GRID::Machine objects can be used to pipe programs as in the following example:

  pp2@nereida:~/LGRID_Machine/examples$ cat -n pipes1.pl
     1  #!/usr/local/bin/perl -w
     2  use strict;
     3  use GRID::Machine;
     4
     5  my $machine = shift || 'remote.machine.domain';
     6  my $m = GRID::Machine->new( host => $machine );
     7
     8  my $f = $m->open('uname -a |');
     9  my $x = <$f>;
    10  print "UNAME result: $x\n"

In a scalar context open returns the handle. In list context it returns the pair (handle, PID). See GRID::Machine::perlparintro for a more detailed example.

When executed the program produces an output similar to this:

  pp2@nereida:~/LGRID_Machine/examples$ pipes1.pl
  UNAME result: Linux remote 2.6.8-2-686 #1 Tue Aug 16 13:22:48 UTC 2005 i686 GNU/Linux

Opening pipes for output

Pipes can also be opened for output, as the following example shows:

  pp2@nereida:~/LGRID_Machine/examples$ cat -n pipes.pl
   1  #!/usr/local/bin/perl -w
   2  use strict;
   3  use GRID::Machine;
   4
   5  my $machine = shift || 'remote.machine';
   6  my $m = GRID::Machine->new( host => $machine );
   7
   8  my $i;
   9  my $f = $m->open('| sort -n > /tmp/sorted.txt');
  10  for($i=10; $i>=0;$i--) {
  11    $f->print("$i\n")
  12  }
  13  $f->close();
  14
  15  my $g = $m->open('/tmp/sorted.txt');
  16  print while <$g>;

When executed, the program produces the following output:

  pp2@nereida:~/LGRID_Machine/examples$ pipes.pl
  0
  1
  2
  3
  4
  5
  6
  7
  8
  9
  10

When opening a pipe for output, as in line 9 of the example above

  my $f = $m->open('| sort -n > /tmp/sorted.txt')

be sure to redirect the STDOUT of the program. Otherwise, GRID::Machine will redirect it to the null device and the output will be lost.

Bidirectional pipes: open2

Synopsis:

  my $fromchild = IO::Handle->new();
  my $tochild = IO::Handle->new();
  my $pid = $m->open2($fromchild, $tochild, 'command and args');

The open2 method runs the given command on machine $m and connects $fromchild for reading from the command and $tochild for writing to the command. It returns the PID of the process executing the command.

Bidirectional pipes: open3

Synopsis:

  my $pid = $m->open3($tochild, $fromchild, $errfromchild, 'command and args');

Spawns the given command and connects $fromchild for reading from the child, $tochild for writing to the child, and $errfromchild for errors.

See an example that opens the Unix calculator bc in a remote machine:

  $ cat -n open3bc.pl
     1  #!/usr/bin/perl -w
     2  use strict;
     3  use GRID::Machine;
     4
     5  my $machine = shift || 'orion.pcg.ull.es';
     6  my $m = GRID::Machine->new( host => $machine );
     7
     8  my $WTR = IO::Handle->new();
     9  my $RDR = IO::Handle->new();
    10  my $ERR = IO::Handle->new();
    11  my $pid = $m->open3($WTR, $RDR, $ERR, 'bc');
    12
    13  my $line;
    14
    15  print $WTR "3*2\n";
    16  $line = <$RDR>;
    17  print STDOUT "3*2 = $line";
    18
    19  print $WTR "3/(2-2)\n";
    20  $line = <$ERR>;
    21  print STDOUT "3/(2-2) produces error = $line\n";
    22
    23  print $WTR "quit\n";
    24  wait;

When executed, the former program produces an output like this:

  $ open3bc.pl
  3*2 = 6
  3/(2-2) produces error = Runtime error (func=(main), adr=11): Divide by zero
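
The argument order of the open3 method is the same as that of the core function IPC::Open3::open3. For readers who want to experiment without a remote machine, here is a local sketch using that core function. The child program is a small Perl one-liner invented for this example (it stands in for bc), so nothing beyond a perl binary is assumed.

```perl
use strict;
use warnings;
use IPC::Open3;
use Symbol 'gensym';

# Child program: doubles numeric lines, complains on STDERR otherwise
my $child = q{
  while (my $line = <STDIN>) {
    chomp $line;
    if ($line =~ /^\d+$/) { print 2 * $line, "\n" }
    else                  { print STDERR "not a number: $line\n" }
  }
};

# Argument order: handle to the child, from the child, errors from the child
my $pid = open3(my $to_child, my $from_child, my $err_from_child = gensym,
                $^X, '-e', $child);

print $to_child "21\n";
print $to_child "abc\n";
close $to_child;                 # the child sees end of input and exits

my $answer = <$from_child>;
my $error  = <$err_from_child>;
waitpid $pid, 0;

print "21 doubled = $answer";
print "error = $error";
```

Closing the write handle before reading avoids waiting forever on a child that buffers its output until its input ends.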

REMOTE PROCESSES (FORKING)

The fork method

The fork method of GRID::Machine objects can be used to fork a process in the remote machine, as shown in the following example:

  $ cat -n fork5.pl 
     1  #!/usr/bin/perl -w
     2  use strict;
     3  use GRID::Machine;
     4  use Data::Dumper;
     5  
     6  my $host = $ENV{GRID_REMOTE_MACHINE};
     7  my $machine = GRID::Machine->new( host => $host );
     8  
     9  my $p = $machine->fork( q{
    10  
    11     print "stdout: Hello from process $$. args = (@_)\n";
    12     print STDERR "stderr: Hello from process $$\n";
    13  
    14     use List::Util qw{sum};
    15     return { s => sum(@_), args => [ @_ ] };
    16   },
    17   args => [ 1..4 ],
    18  );
    19  
    20  # GRID::Machine::Process objects are overloaded
    21  print "Doing something while $p is still alive  ...\n" if $p; 
    22  
    23  my $r = $machine->waitpid($p);
    24  
    25  print "Result from process '$p': ",Dumper($r),"\n";
    26  print "GRID::Machine::Process::Result objects are overloaded in a string context:\n$r\n";

When executed, the former program produces an output similar to this:

  $ perl fork5.pl 
  Doing something while 5220:5230:some.machine:5234:5237 is still alive  ...
  Result from process '5220:5230:some.machine:5234:5237': $VAR1 = bless( {
                   'machineID' => 0,
                   'stderr' => 'stderr: Hello from process 5237
  ',
                   'descriptor' => 'some.machine:5234:5237',
                   'status' => 0,
                   'waitpid' => 5237,
                   'errmsg' => '',
                   'stdout' => 'stdout: Hello from process 5237. args = (1 2 3 4)
  ',
                   'results' => [ { 'args' => [ 1, 2, 3, 4 ], 's' => 10 } ]
                 }, 'GRID::Machine::Process::Result' );

  GRID::Machine::Process::Result objects are overloaded in a string context:
  stdout: Hello from process 5237. args = (1 2 3 4)
  stderr: Hello from process 5237

The fork method returns a GRID::Machine::Process object. The first argument must be a string containing the code that will be executed by the forked process in the remote machine. Such code is always called in a list context. The fork method admits the following arguments:

  • stdin

    The name of the file to which stdin will be redirected

  • stdout

    The name of the file to which stdout will be redirected. If not specified a temporary file will be used

  • stderr

    The name of the file to which stderr will be redirected. If not specified a temporary file will be used

  • result

    The name of the file to which the result computed by the child process will be dumped. If not specified a temporary file will be used

  • args

    The arguments for the code executed by the remote child process

GRID::Machine::Process objects

GRID::Machine::Process objects have been overloaded. In a string context a GRID::Machine::Process object produces a colon-separated descriptor that includes the host name and the PIDs involved. In a boolean context it returns true if the process is alive and false otherwise. This way, the execution of line 21 in the program above:

    21  print "Doing something while $p is still alive  ...\n" if $p; 

produces an output like:

  Doing something while 5220:5230:some.machine:5234:5237 is still alive  ...

if the remote process is still alive. The descriptor of the process 5220:5230:some.machine:5234:5237 is a colon separated sequence of five components:

1 - The PID of the local process executing GRID::Machine
2 - The PID of the local process in charge of the connection with the remote machine
3 - The name of the remote machine
4 - The PID of the remote process executing GRID::Machine::REMOTE
5 - The PID of the child process created by fork

When evaluated in a boolean context, a GRID::Machine::Process returns 1 if it is alive and 0 otherwise.
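
The descriptor can be taken apart with split when one of its components is needed. This is a small sketch: the field names are made up to follow the list above, and it assumes that the host name itself contains no colon.

```perl
use strict;
use warnings;

my $descriptor = '5220:5230:some.machine:5234:5237';

# Field names are hypothetical; they follow the five components listed above
my %process;
@process{qw(local_pid link_pid host remote_pid child_pid)} = split /:/, $descriptor;

print "Child $process{child_pid} runs on $process{host}\n";
```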

The waitpid method

The waitpid method waits for the GRID::Machine::Process received as first argument to terminate. Additional FLAGS as in perl waitpid can be passed as arguments. It returns a GRID::Machine::Process::Result object, whose attributes contain:

  • stdout

    A string containing the output to STDOUT of the remote child process

  • stderr

    A string containing the output to STDERR of the remote child process

  • results

    The list of values returned by the child process. The forking code is always called in a list context.

  • status

    The value associated with $? as returned by the remote child process.

  • waitpid

    The value returned by the Perl waitpid function when synchronized with the remote child process. Usually the value is either the PID of the deceased process, or -1 if there was no such child process. On some systems, a value of 0 indicates that there are processes still running.

  • errmsg

    The child error as in $@

  • machineID

    The logical identifier of the associated GRID::Machine. By default, 0 if it was the first GRID::Machine created, 1 if it was the second, etc.

The waitall method

It is similar to waitpid but instead waits for any child process.

Behaves like the wait(2) system call on your system: it waits for a child process to terminate and returns

  • The GRID::Machine::Process::Result object associated with the deceased process if it was called via the GRID::Machine fork method, or

  • The PID of the deceased process if there is no GRID::Machine::Process associated (it was called using an ordinary fork)

  • -1 if there are no child processes. Note that a return value of -1 could mean that child processes are being automatically reaped, as described in perlipc.

See an example:

  $ cat -n wait1.pl 
     1  #!/usr/bin/perl -w
     2  use strict;
     3  use GRID::Machine;
     4  use Data::Dumper;
     5  
     6  my $host = $ENV{GRID_REMOTE_MACHINE};
     7  my $machine = GRID::Machine->new( host => $host );
     8  
     9  my $p = $machine->fork( q{
    10  
    11     print "stdout: Hello from process $$. args = (@_)\n";
    12     print STDERR "stderr: Hello from process $$\n";
    13  
    14     use List::Util qw{sum};
    15     return { s => sum(@_), args => [ @_ ] };
    16   },
    17   args => [ 1..4 ],
    18  );
    19  
    20  # GRID::Machine::Process objects are overloaded
    21  print "Doing something while $p is still alive  ...\n" if $p; 
    22  
    23  my $r = $machine->waitall();
    24  
    25  print "Result from process '$p': ",Dumper($r),"\n";
    26  print "GRID::Machine::Process::Result objects are overloaded in a string context:\n$r\n";

When executed, it produces:

  $ perl wait1.pl 
  Doing something while 1271:1280:local:1284:1287 is still alive  ...
  Result from process '1271:1280:local:1284:1287': $VAR1 = bless( {
                   'machineID' => 0,
                   'stderr' => 'stderr: Hello from process 1287
  ',
                   'descriptor' => 'local:1284:1287',
                   'status' => 0,
                   'waitpid' => 1287,
                   'errmsg' => '',
                   'stdout' => 'stdout: Hello from process 1287. args = (1 2 3 4)
  ',
                   'results' => [ { 'args' => [ 1, 2, 3, 4 ], 's' => 10 } ]
                 }, 'GRID::Machine::Process::Result' );

  GRID::Machine::Process::Result objects are overloaded in a string context:
  stdout: Hello from process 1287. args = (1 2 3 4)
  stderr: Hello from process 1287

The following example uses the fork method and waitall to compute in parallel a numerical approximation to the value of pi:

  $ cat -n waitpi.pl 
     1  #!/usr/bin/perl -w
     2  use strict;
     3  use GRID::Machine;
     4  
     5  my $host = $ENV{GRID_REMOTE_MACHINE};
     6  my $machine = GRID::Machine->new( host => $host );
     7  
     8  my ($N, $np, $pi)  = (1000, 4, 0);
     9  for (0..$np-1) {
    10     $machine->fork( q{
    11         my ($id, $N, $np) = @_;
    12           
    13         my $sum = 0;
    14         for (my $i = $id; $i < $N; $i += $np) {
    15             my $x = ($i + 0.5) / $N;
    16             $sum += 4 / (1 + $x * $x);
    17         }
    18         $sum /= $N; 
    19      },
    20      args => [ $_, $N, $np ],
    21    );
    22  }
    23  
    24  $pi += $machine->waitall()->result for 1..$np;
    25  
    26  print "pi = $pi\n";
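
The way the work is divided can be checked without any remote machine. The sketch below runs the same partial sums sequentially on the local side: worker number id takes the subintervals id, id + np, id + 2*np and so on of the midpoint rule for the integral of 4/(1+x*x) between 0 and 1, and the partial sums add up to the approximation of pi. The function name partial_sum is introduced here only for illustration.

```perl
use strict;
use warnings;

# Partial sum computed by worker $id out of $np workers
sub partial_sum {
  my ($id, $N, $np) = @_;

  my $sum = 0;
  for (my $i = $id; $i < $N; $i += $np) {
    my $x = ($i + 0.5) / $N;        # midpoint of the i-th subinterval
    $sum += 4 / (1 + $x * $x);
  }
  return $sum / $N;
}

my ($N, $np, $pi) = (1000, 4, 0);
$pi += partial_sum($_, $N, $np) for 0 .. $np - 1;

printf "pi is approximately %.6f\n", $pi;
```

Since every index between 0 and N-1 belongs to exactly one worker, the total does not depend on the number of workers.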

The async method

The async method is quite similar to the fork method, but it receives as arguments the name of a GRID::Machine method and the arguments for that method. It executes the method asynchronously and returns a GRID::Machine::Process object. Basically, the call

            $m->async($subname => @args) 

is equivalent to:

            $m->fork($subname.'(@_)', args => [ @args ] )

The following example uses async to compute in parallel an approximation to the value of pi:

  $ cat -n async.pl 
     1  #!/usr/bin/perl -w
     2  use strict;
     3  use GRID::Machine;
     4  
     5  my $host = $ENV{GRID_REMOTE_MACHINE};
     6  my $machine = GRID::Machine->new( host => $host );
     7  
     8  $machine->sub(sumareas => q{
     9         my ($id, $N, $np) = @_;
    10           
    11         my $sum = 0;
    12         for (my $i = $id; $i < $N; $i += $np) {
    13             my $x = ($i + 0.5) / $N;
    14             $sum += 4 / (1 + $x * $x);
    15         }
    16         $sum /= $N; 
    17  });
    18  
    19  my ($N, $np, $pi)  = (1000, 4, 0);
    20  
    21  $machine->async( sumareas =>  $_, $N, $np ) for (0..$np-1);
    22  $pi += $machine->waitall()->result for 1..$np;
    23  
    24  print "pi = $pi\n";

GRID::Machine::Process::Result objects

In a string context a GRID::Machine::Process::Result object produces the concatenation of its output to STDOUT followed by its output to STDERR. In a boolean context it evaluates according to its results attribute: it is true if results is an array reference with more than one element or if its only element is true. Otherwise it is false.

CALLBACKS

It may happen that the local machine has a useful set of modules installed that are not present on the remote side. It may also be impossible to transfer the modules to the remote machine using the mechanisms provided by GRID::Machine. In such situations - and many others - the callback mechanism can be helpful to achieve the task at hand.

The callback method provides a way to make a subroutine on the local side callable from the remote side. The ideas and implementation mechanisms used for callbacks are the work of Dmitriy Kargapolov (Thanks Dmitri!).

The syntax is:

                $r = $machine->callback( 'localsubname' );
                $r = $machine->callback( localsub => sub { ... } );
                $r = $machine->callback( localsub => subref );
                $r = $machine->callback( sub { ... } );
                $r = $machine->callback( subref );

On success it returns a true value: the address of the subroutine on the remote side that works as a proxy for the localsub subroutine on the local side. Exceptions are thrown in case of failure.

The following example shows a remote subroutine (lines 16-20) that calls a subroutine test_callback that will be executed on the local side (line 19). The call to the method callback at line 23 makes the local subroutine test_callback available from the remote side.

 $ cat -n callbackbyname2.pl
     1  #!/usr/bin/perl -w
     2  use strict;
     3  use GRID::Machine;
     4  use Sys::Hostname;
     5
     6  my $host = shift || $ENV{GRID_REMOTE_MACHINE};
     7
     8  sub test_callback {
     9    print 'Inside test_callback(). Host is '.&hostname."\n";
    10
    11    return shift()+1;
    12  }
    13
    14  my $machine = GRID::Machine->new(host => $host, uses => [ 'Sys::Hostname' ]);
    15
    16  my $r = $machine->sub( remote => q{
    17      gprint hostname().": inside remote\n";
    18
    19      return 1+test_callback(2);
    20  } );
    21  die $r->errmsg unless $r->ok;
    22
    23  $r = $machine->callback( 'test_callback' );
    24  die $r->errmsg unless $r->ok;
    25
    26  $r = $machine->remote();
    27  die $r->errmsg unless $r->noerr;
    28
    29  print "Result: ".$r->result."\n";

When the former program is executed (local machine is nereida) we get an output similar to this:

                $ callbackbyname2.pl
                beowulf: inside remote
                Inside test_callback(). Host is nereida.deioc.ull.es
                Result: 4

Callbacks and Namespaces

The callback subroutine is, in a sense, exported onto the remote side. That is, when turning a local subroutine into a callback you can specify it by its full name (see line 24 below), but it is called from the remote side using its unqualified name (line 18):

  $ cat -n callbackbyname3.pl
     1  #!/usr/bin/perl -w
     2  use strict;
     3  use GRID::Machine;
     4  use Sys::Hostname;
     5
     6  my $host = $ENV{GRID_REMOTE_MACHINE};
     7
     8  sub Tutu::test_callback {
     9    print 'Inside test_callback() host: ' . &hostname . "\n";
    10    return 3.1415;
    11  }
    12
    13  my $machine = GRID::Machine->new(host => $host, uses => [ 'Sys::Hostname' ]);
    14
    15  my $r = $machine->sub( remote => q{
    16      gprint hostname().": inside remote\n";
    17
    18      my $r = test_callback(); # scalar context
    19
    20      gprint hostname().": returned value from callback: $r\n";
    21  } );
    22  die $r->errmsg unless $r->ok;
    23
    24  $r = $machine->callback( 'Tutu::test_callback' );
    25  die $r->errmsg unless $r->ok;
    26
    27  $r = $machine->remote();
    28
    29  die $r->errmsg unless $r->noerr;

When executed, this program produces output similar to this (beowulf is the remote machine):

            $ callbackbyname3.pl
            beowulf: inside remote
            Inside test_callback() host: nereida.deioc.ull.es
            beowulf: returned value from callback: 3.1415

Context and Callbacks

When a callback subroutine is called in a scalar context it returns the first element of the returned list. See line 18 in the previous code.
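
Note that this differs from a plain Perl subroutine, where returning a literal list in scalar context yields its last element. The following is a minimal local sketch of the contrast. It does not use GRID::Machine, the names plain_list and first_in_scalar are hypothetical, and the wantarray wrapper is only one way to obtain the first-element convention, not necessarily how the module implements it:

```perl
use strict;
use warnings;

# A plain Perl sub: in scalar context the comma operator
# yields the LAST element of the returned list.
sub plain_list { return (3.1415, 2.7182, 1.4142) }

# Hypothetical wrapper mimicking the callback convention described above:
# in scalar context it returns the FIRST element of the list.
sub first_in_scalar {
  my @results = plain_list();
  return wantarray ? @results : $results[0];
}

my $last  = plain_list();       # last element of the list
my $first = first_in_scalar();  # first element of the list
my @all   = first_in_scalar();  # list context: every element

print "plain sub in scalar context: $last\n";
print "first-element convention:    $first\n";
print "list context:                @all\n";
```

If a callback returns several values and you need all of them on the remote side, calling it in list context avoids relying on this convention.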

Anonymous Callbacks

The callback subroutine can be anonymous. The callback method supports the syntax:

              $machine->callback( sub { ... } )

See an example:

 $ cat -n anonymouscallback.pl
     1  #!/usr/bin/perl
     2  use strict;
     3  use GRID::Machine;
     4  use Sys::Hostname;
     5
     6  my $host = shift || $ENV{GRID_REMOTE_MACHINE};
     7
     8  my $machine = GRID::Machine->new(host => $host, uses => [ 'Sys::Hostname' ]);
     9
    10  my $r = $machine->sub( remote => q{
    11      my $rsub = shift;
    12
    13      gprint &hostname.": inside remote sub\n";
    14      my $retval = $rsub->(3);
    15
    16      return  1+$retval;
    17  } );
    18
    19  die $r->errmsg unless $r->ok;
    20
    21  my $a =  $machine->callback(
    22             sub {
    23               print hostname().": inside anonymous inline callback. Args: (@_) \n";
    24               return shift() + 1;
    25             }
    26           );
    27
    28  $r = $machine->remote( $a );
    29
    30  die $r->errmsg unless $r->noerr;
    31
    32  print "Result = ".$r->result."\n";

When the previous example is executed with 'nereida' as the local machine, it produces output similar to this:

          $ anonymouscallback.pl
          beowulf: inside remote sub
          nereida.deioc.ull.es: inside anonymous inline callback. Args: (3)
          Result = 5

Recursive Remote Procedure Calls and Callbacks

The existence of callbacks opens the possibility of nested sets of RPCs and callbacks. The following example recursively computes the factorial of a number. The execution of recursive calls alternates between remote and local sides:

  $ cat -n nestedcallback.pl
     1  #!/usr/bin/perl
     2  use strict;
     3  use GRID::Machine;
     4  use Sys::Hostname;
     5
     6  my $host = $ENV{GRID_REMOTE_MACHINE};
     7
     8  my $machine = GRID::Machine->new( host => $host, uses => [ 'Sys::Hostname' ] );
     9
    10  my $r = $machine->sub(
    11    fact => q{
    12      my $x = shift;
    13
    14      gprint &hostname . ": fact($x)\n";
    15
    16      if ($x > 1) {
    17        my $r = localfact($x-1);
    18        return $x*$r;
    19      }
    20      else {
    21        return 1;
    22      }
    23    }
    24  );
    25  die $r->errmsg unless $r->ok;
    26
    27  $r = $machine->callback(
    28
    29      localfact => sub {
    30        my $x = shift;
    31
    32        print &hostname . ": fact($x)\n";
    33
    34        if ($x > 1) {
    35          my $r = $machine->fact($x-1)->result;
    36          return $x*$r;
    37        }
    38        else {
    39          return 1;
    40        }
    41
    42      }
    43
    44  );
    45  die $r->errmsg unless $r->ok;
    46
    47  my $n = shift;
    48
    49  $r = $machine->fact($n);
    50
    51  die $r->errmsg unless $r->ok;
    52  print "=============\nfact($n) is ".$r->result."\n";

When executed, this program produces output similar to this (beowulf is the remote machine):

                      $ nestedcallback.pl 6
                      beowulf: fact(6)
                      nereida.deioc.ull.es: fact(5)
                      beowulf: fact(4)
                      nereida.deioc.ull.es: fact(3)
                      beowulf: fact(2)
                      nereida.deioc.ull.es: fact(1)
                      =============
                      fact(6) is 720

LIMITATIONS

Operating System

I will be surprised if this module works on anything that is not UNIX.

Opaque Structures

The RPC provided by GRID::Machine uses Data::Dumper to serialize the data. It consequently suffers from the same limitations as Data::Dumper.

Namely, opaque structures, like those built by modules written in external languages such as C, can't be correctly transferred by the RPC system provided by GRID::Machine. An example is the transfer of PDL objects (see PDL). In such cases, the programmer must transform (marshal or project) the structure into a (linear) string on one side and rebuild (uplift) the (multidimensional) structure from the string on the other side. For example:

    use GRID::Machine;
    use PDL;
    use PDL::IO::Dumper;

    my $host = shift || 'user@remote.machine.domain';

    my $machine = GRID::Machine->new(host => $host, uses => [qw(PDL PDL::IO::Dumper)]);

    my $r = $machine->sub( mp => q{
        my ($f, $g) = @_;

        my $h = (pdl $f) x (pdl $g);

        sdump($h);
      },
    );
    $r->ok or die $r->errmsg;

    my $f = [[1,2],[3,4]];
    $r = $machine->mp($f, $f);
    die $r->errmsg unless $r->ok;
    my $matrix =  eval($r->result);
    print "\$matrix is a ".ref($matrix)." object\n";
    print "[[1,2],[3,4]] x [[1,2],[3,4]] = $matrix";

Here the sdump method of PDL::IO::Dumper solves the problem: it gives a string representation of the PDL object that is later evaluated to rebuild the matrix data structure. When executed, this program produces the following output:

            $ uses.pl
            $matrix is a PDL object
            [[1,2],[3,4]] x [[1,2],[3,4]] =
            [
             [ 7 10]
             [15 22]
            ]
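
The same marshal-and-rebuild pattern can be tried without PDL or a remote host. The following is a minimal local sketch that assumes only the core Data::Dumper module. The helper names marshal and rebuild are hypothetical and are not part of GRID::Machine:

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical helper: project a structure onto a (linear) string.
sub marshal {
  my ($data) = @_;
  local $Data::Dumper::Terse  = 1;  # emit a bare expression, no '$VAR1 ='
  local $Data::Dumper::Indent = 0;  # keep everything on a single line
  return Dumper($data);
}

# Hypothetical helper: uplift the string back into a structure.
sub rebuild {
  my ($string) = @_;
  my $data = eval $string;
  die "Can't rebuild structure: $@" if $@;
  return $data;
}

my $matrix = [ [ 1, 2 ], [ 3, 4 ] ];
my $string = marshal($matrix);   # what would travel over the link
my $copy   = rebuild($string);   # what the other side would see

print "string form:  $string\n";
print "copy->[1][0]: $copy->[1][0]\n";
```

For plain nested arrays and hashes Data::Dumper already does this work, which is why they need no special treatment. An opaque object needs its own pair of functions playing the roles of marshal and rebuild, as sdump and eval do in the PDL example.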

Nested Uses of GRID::Machine

The remote server can't use GRID::Machine to connect to a second server. That is, a program like this fails:

  use GRID::Machine;

  my $host = shift || 'user@machine';

  my $machine = GRID::Machine->new(host => $host, uses => [ 'GRID::Machine' ]);

  my $r = $machine->eval(q{ my $t = GRID::Machine->new(host => 'orion'); });

  print $r->result;

Call by Reference

Calling a remote subroutine by reference is not supported in this version. See the following example:

      use GRID::Machine;

      my $machine = GRID::Machine->new(
            host => 'user@remote.machine.domain',
            startdir => '/tmp',
         );

      my $r = $machine->sub(byref => q{ $_[0] = 4; });
      die $r->errmsg unless $r->ok;

      my ($x, $y) = (1, 1);

      $y = $machine->byref($x)->result;

      print "$x, $y\n"; # 1, 4

Observe that variable $x is not modified. The only way for a remote subroutine to modify a variable on the local side is through its result, as is done for $y in the previous example.
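
One way to picture why: the arguments travel as a serialized copy, so an assignment to $_[0] on the far side only touches that copy. The sketch below is a local simulation under that assumption, using only the core Data::Dumper module. The function fake_rpc is a hypothetical stand-in for a remote call and is not part of GRID::Machine:

```perl
use strict;
use warnings;
use Data::Dumper;

# Hypothetical stand-in for a remote call: the arguments are serialized,
# rebuilt as copies, and only the return value travels back.
sub fake_rpc {
  my ($code, @args) = @_;

  local $Data::Dumper::Terse = 1;
  my $copies = eval Dumper(\@args);   # what the "remote" side receives
  die "Serialization failed: $@" if $@;

  return $code->(@$copies);           # the result is the only way back
}

my $byref = sub { $_[0] = 4 };        # same body as the byref sub above

my ($x, $y) = (1, 1);
$y = fake_rpc($byref, $x);

print "$x, $y\n"; # 1, 4
```

Calling $byref->($x) directly, without the serialization step, would set $x to 4, because Perl aliases @_ to the caller's variables. That aliasing is what is lost across the link.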

Limitations of the include Method

"Remote Modules" is the term used for files containing Perl code that will be loaded onto the remote Perl server via the include method or through the includes argument of new. These files can only contain:

  • Subroutines. Since these subroutines are anonymous on the remote side, the only way to call them from the remote side is through the stored_procedures attribute of the SERVER object:

                        SERVER->stored_procedures->{subname}

  • use Module declarations. The Module must exist on the remote server. Furthermore, module import arguments, as in use Module qw{ w1 w2 w3}, must be on a single line.

  • POD documentation

    Variable declarations and variable initializations are ignored.

The include method parses Perl code with a heuristic parser about one page long (72 lines at the time of writing). It obviously can't parse everything, but it works for most code that stays within the aforementioned limitations.
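
As an illustration, a file along the following lines is intended to stay within those limits. It is a hypothetical example: the sub name mean and the choice of List::Util are assumptions made for this sketch, and it has not been checked against the include parser itself. It contains only a single-line use declaration, POD, and one subroutine:

```perl
use List::Util qw{ sum };

=head1 NAME

mean - arithmetic mean of a list of numbers

=cut

sub mean {
  return unless @_;
  return sum(@_) / @_;
}
```

Any file-level variable that mean relied on would be ignored, so values the subroutine needs are best passed in as arguments.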

EXPORTS

When explicitly requested by the client program, GRID::Machine exports these functions:

  • is_operative

  • read_modules

  • qc

INSTALLATION

To install GRID::Machine follow the traditional steps:

   perl Makefile.PL
   make
   make test
   make install

Using Password Authentication

You can make GRID::Machine work without automatic authentication.

The following example uses Net::OpenSSH to open an SSH connection using password authentication instead of asymmetric cryptography:

  $ cat -n openSSH.pl 
     1  use strict;
     2  use warnings;
     3  use Net::OpenSSH;
     4  use GRID::Machine;
     5  
     6  my $host = (shift() or $ENV{GRID_REMOTE_MACHINE});
     7  my @ARGS;
     8  push @ARGS, (user      => $ENV{USR})   if $ENV{USR};
     9  push @ARGS, ( password => $ENV{PASS}) if $ENV{PASS};
    10  
    11  my $ssh = Net::OpenSSH->new($host, @ARGS); 
    12  $ssh->error and die "Couldn't establish SSH connection: ". $ssh->error;
    13  
    14  my @cmd = $ssh->make_remote_command('perl');
    15  { local $" = ','; print "@cmd\n"; }
    16  my $grid = GRID::Machine->new(command => \@cmd);
    17  my $r = $grid->eval('print "hello world!\n"');
    18  print "$r\n";

When executed, it produces output like this:

  $ perl openSSH.pl 
  ssh,-S,/Users/localuser/.libnet-openssh-perl/user-machine-2413-275647,-o,User=user,--,machine,perl
  hello world!

However, it seems a bad idea to have unencrypted passwords lying around. It is much better to use asymmetric cryptography.

Using Automatic authentication: Asymmetric Cryptography

Set up automatic SSH authentication with the machines where you have an SSH account.

SSH includes the ability to authenticate users using public keys. Instead of authenticating the user with a password, the SSH server on the remote machine will verify a challenge signed by the user's private key against its copy of the user's public key. To achieve this automatic ssh-authentication you have to:

  • Generate a key pair using the ssh-keygen utility. For example:

      local.machine$ ssh-keygen -t rsa -N ''

    The option -t selects the type of key you want to generate. At the time of writing there are three types of keys: rsa1, rsa and dsa. The -N option is followed by the passphrase; -N '' indicates that no passphrase will be used. This is useful when used with key restrictions or when dealing with cron jobs, batch commands and automatic processing, which is the context for which this module was designed. If you still don't like having a private key without a passphrase, provide a passphrase and use ssh-agent to avoid the inconvenience of typing the passphrase each time. ssh-agent is a program you run once per login session and load your keys into. From that moment on, any ssh client will contact ssh-agent and no more passphrase typing will be needed.

    By default, your identification will be saved in a file /home/user/.ssh/id_rsa. Your public key will be saved in /home/user/.ssh/id_rsa.pub.

  • Once you have generated a key pair, you must install the public key on the remote machine. To do it, append the public component of the key in

               /home/user/.ssh/id_rsa.pub

    to file

               /home/user/.ssh/authorized_keys
               

    on the remote machine. If the ssh-copy-id script is available, you can do it using:

      local.machine$ ssh-copy-id -i ~/.ssh/id_rsa.pub user@remote.machine

    Alternatively you can write the following command:

      $ ssh remote.machine "umask 077; cat >> .ssh/authorized_keys" < /home/user/.ssh/id_rsa.pub

    The umask command is needed since the SSH server will refuse to read a /home/user/.ssh/authorized_keys file that has loose permissions.

  • Edit your local configuration file /home/user/.ssh/config (see man ssh_config in UNIX) and create a new section for GRID::Machine connections to that host. Here follows an example:

     ...
    
     # A new section inside the config file: 
     # it will be used when writing a command like: 
     #                     $ ssh gridyum 
    
     Host gridyum
    
     # My username in the remote machine
     user my_login_in_the_remote_machine
    
     # The actual name of the machine: by default the one provided in the
     # command line
     Hostname real.machine.name
    
     # The port to use: by default 22
     Port 2048
    
     # The identity pair to use. By default ~/.ssh/id_rsa and ~/.ssh/id_dsa
     IdentityFile /home/user/.ssh/yumid
    
     # Useful to detect a broken network
     BatchMode yes
    
     # Useful when the home directory is shared across machines,
     # to avoid warnings about changed host keys when connecting
     # to local host
     NoHostAuthenticationForLocalhost yes
    
    
     # Another section ...
     Host another.remote.machine an.alias.for.this.machine
     user mylogin_there
    
     ...

    This way you don't have to specify your login name on the remote machine even if it differs from your login name in the local machine, you don't have to specify the port if it isn't 22, etc. This is the recommended way to work with GRID::Machine. Avoid cluttering the constructor new.

  • Once the public key is installed on the server you should be able to authenticate using your private key:

      $ ssh remote.machine
      Linux remote.machine 2.6.15-1-686-smp #2 SMP Mon Mar 6 15:34:50 UTC 2006 i686
      Last login: Sat Jul  7 13:34:00 2007 from local.machine
      user@remote.machine:~$                                 

    You can also automatically execute commands in the remote server:

      local.machine$ ssh remote.machine uname -a
      Linux remote.machine 2.6.15-1-686-smp #2 SMP Mon Mar 6 15:34:50 UTC 2006 i686 GNU/Linux

  • Once you have installed GRID::Machine you can check that perl can be executed on that machine using this one-liner:

      $ perl -e 'use GRID::Machine qw(is_operative); print is_operative("ssh", "beowulf")."\n"'
      1

DEPENDENCIES

This module requires these other modules and libraries:

SEE ALSO

CONTRIBUTORS

  • Dmitriy Kargapolov (<dmitriy.kargapolov@gmail.com>) suggested, designed and provided an implementation for callbacks.

  • Eric Busto fixed a problem with is_operative hanging on systems that are in an odd state.

  • Alex White fixed bugs in modput and the SSH options.

  • Erik Welch fixed a bug in the (local) DESTROY method.

AUTHOR

Casiano Rodriguez Leon <casiano@ull.es>

ACKNOWLEDGMENTS

This work has been supported by CEE (FEDER) and the Spanish Ministry of Educacion y Ciencia through Plan Nacional I+D+I number TIN2005-08818-C04-04 (ULL::OPLINK project http://www.oplink.ull.es/). Support from Gobierno de Canarias was through GC02210601 (Grupos Consolidados). The University of La Laguna has also supported my work in many ways and for many years.

I wish to thank Paul Evans for his IPC::PerlSSH module: it was the source of inspiration for this module. Thanks to Alex White, Dmitriy Kargapolov, Eric Busto and Erik Welch for their contributions, and to the Perl Monks and the Perl community for generously sharing their knowledge. Finally, thanks to Juana, Coro and my students at La Laguna.

LICENCE AND COPYRIGHT

Copyright (c) 2007 Casiano Rodriguez-Leon (casiano@ull.es). All rights reserved.

These modules are free software; you can redistribute it and/or modify it under the same terms as Perl itself. See perlartistic.

This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.