

Fork and exec with Perl



Recently I had a project that required a number of different programs that mostly run all the time but need to be restarted now and then with different parameters. Normally, the first thing I think of for a program that runs constantly is inittab or svc (daemontools). The svc facility is the more flexible of the two, and will be what I'll use in the final design, but in the "thinking" stages I played with using a Perl program launcher and controller.

What we have is a config file that specifies programs to run and various parameters to pass:


[NWFTP]
mainprog=ftpget.pl
datasource=ftpsite.com/pub
hours=*
timing=5,35
conversion=toscv
outputname=kansas
mylog=ftpgetlog

[PFIELD]
mainprog=httpget.pl
datasource=http://xyz.com?dd=foo
hours=0,4,8,12,16,20,24
timing=4
conversion=none
outputname=fran
mylog=pfieldllog

We want a program that can start up any number of these, stop them, and monitor their progress.

Menu

The menu's responsibility is to read the config file, see what needs to run, and run each program with all its parameters as arguments. It needs to keep track of process ids so that it can kill off jobs. It mostly doesn't care about the various parameters; it just passes those along as arguments. However, it does read the "mylog" variable, log to that, and display that log on request.

Supposedly, it's Windows' poor performance in this area that causes Windows web programmers to shun cgi scripts in favor of asp: running such scripts means constantly creating new processes. I'm no expert in Windows process management and creation though, so all that could be wrong or "wrong now" - sometimes things that used to be true get stuck, and everybody keeps doing things for reasons that no longer apply.

The reason Windows didn't pay much attention to this is that Windows programs are much more apt to use threads - and of course so are more modern Unix programs. Threads aren't the answer to every problem though, and do tend to encourage large, monolithic programs. As the Unix philosophy is standalones that work together, forking efficiency will remain important.

It is interesting how basic differences like that do affect OS design and optimization. Windows programmers don't generally understand the cooperative tools idea of Unix/Linux, and that affects basic design decisions.

When asked to start a program, this script forks, and then execs the other program.

The "fork and exec" scheme is very common in Unix programs: every time you login, you go through similar steps: init forks and execs a login prompt, which in turn does the same thing with an authentication program to check your password, which then forks and execs your shell. Network daemons do the same thing to handle multiple connections; in fact there are very few programs that don't use fork and exec at all. Because of this, Unix/Linux kernels give very special attention to making this work quickly and efficiently. Typically, the child process is set to be a "copy on write" instance of the parent, which means that before it execs, it's just sharing the parent's memory for efficiency.

Just like the system level fork, your Perl program gets duplicated by the kernel and it starts running right at the return from fork: in other words, one program does the fork(), but two programs exist after, and both see the value that fork returns. The parent program (the one that did the fork) sees the process id of the child, and the child sees "0". It can get its process id from "$$" if it needs it.
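A minimal sketch of that split, separate from the article's program: both sides run the same code after the fork, but see different return values. (This snippet reaps its child explicitly with waitpid, since it doesn't set SIGCHLD to IGNORE.)

```perl
#!/usr/bin/perl
# Minimal fork demo: one program before fork(), two after.
use strict;
use warnings;

my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
    # child: fork returned 0; our own pid is available in $$
    print "child $$: fork returned 0\n";
    exit 0;
}
# parent: fork returned the child's process id
print "parent $$: fork returned $pid\n";
waitpid($pid, 0);   # reap the child, since this demo doesn't ignore CHLD
```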

You can't be certain when your program will actually get scheduled to run, so that makes it a little tricky to tell if it immediately failed. In this case, we handle it by sleeping for two seconds before checking that the pid is still around. But there's no guarantee on that: a busy, busy system might take much longer before it gets around to giving that child process any run time. For that reason, and of course because it might fail at any time, we also check the status of each pid each time through the loop: in this case, just pressing enter would cause the status to be updated, but in the final versions we'll need a constantly updated display (see Perl Input for how to do non-blocking input).
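The "kill 0" trick used for that check deserves a closer look: signal 0 delivers nothing, it just reports how many of the listed processes could have been signalled. A small illustrative sketch (not part of the menu program):

```perl
#!/usr/bin/perl
# kill with signal 0 tests whether a pid exists and we may signal it:
# it returns 1 if so, 0 if the process is gone.
use strict;
use warnings;

my $alive = kill 0, $$;      # our own pid: certainly alive
print "self alive: $alive\n";

my $pid = fork();
die "fork failed: $!" unless defined $pid;
exit 0 if $pid == 0;         # child exits immediately

waitpid($pid, 0);            # reap the child...
my $gone = kill 0, $pid;     # ...after which kill 0 reports it gone
print "child alive after reap: $gone\n";
```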

The child process then "execs" the program we really want to run. If everything goes well, the exec never returns: your child is overlaid with that program and that's the last we worry about it. We could "wait" for the child to finish, but in this case we have no need to collect any return values, so we just set SIGCHLD to IGNORE. If you don't either wait for forked children or set this, your child processes become zombies if the parent process exits before they do. We won't be exiting at all normally, but we'll set that just in case.
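The effect of setting CHLD to IGNORE can be seen in a small sketch like this (illustrative only; the one-second sleep is just to give the kernel time to clean up):

```perl
#!/usr/bin/perl
# With CHLD set to IGNORE the kernel reaps exited children itself:
# no wait() anywhere, and no zombies accumulate.
use strict;
use warnings;

$SIG{CHLD} = 'IGNORE';

my $pid = fork();
die "fork failed: $!" unless defined $pid;
exit 0 if $pid == 0;    # child exits immediately

sleep 1;                # give the kernel a moment to clean up
my $alive = kill 0, $pid;
print "child alive: $alive\n";   # 0: gone, not left as a zombie
```

Without the IGNORE line (and without a wait), that kill 0 would still succeed, because the dead child would linger as a zombie until someone collected its status.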

Arguments are passed to exec simply by packing them up into an array and doing "exec $progname, @array". Perl exec will call a shell to execute your program if there is only "$progname" (no additional args) AND it sees shell metacharacters. That's not needed or wanted here, but $progname could be "grep foo *.dat" etc. But for our case here, no shell will be used.
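The list form can be sketched like this, with /bin/echo standing in for the real worker script: because no shell is involved, each array element arrives as exactly one argument, spaces and all.

```perl
#!/usr/bin/perl
# exec with a list never involves a shell: each element of @args
# is passed through as a single argument.
use strict;
use warnings;

my @args = ("one arg with spaces", "another");
my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
    # "or die" only fires if the exec itself failed
    exec("/bin/echo", @args)
        or die "exec failed: $!";
}
waitpid($pid, 0);
```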

It's easy to get very confused and messed up if you don't account for the possibility of the exec failing. In the code below, we write a log entry and exit right after the line that does the exec. If the exec succeeds, those lines are never reached. But if it fails (the program name is spelled wrong in the config, the file has been deleted, or for any other reason), exec returns. Without the explicit exit, the child would fall back into the menu code, and you'd have two copies of your script running, both displaying menus on the screen and both trying to read the keyboard. Failures of that sort aren't always easy to see: I created such a condition deliberately by mis-spelling the "mainprog" name and commenting out the "exit". On a Linux box accessed over the network, you could plainly see the menu lines being duplicated:

Choose:
Choose:
1) NWFTP [running (13870)]
1) NWFTP [not running ]
2) PFIELD [not running]
2) PFIELD [not running]
 

But on my local Mac, you could easily miss it because although both processes were writing, they didn't intermingle on the screen:

Choose:
1) NWFTP [running (13870)]
2) PFIELD [not running]  
Choose:
1) NWFTP [not running ]       
2) PFIELD [not running]  
 

Of course input would be confused in both places, but you might not immediately realize what was happening to every other keystroke or so. Just get in the habit of putting an exit after any exec. I suspect just about all of us either have made this mistake or will someday.
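Perl lets you build that habit right into the exec statement, since exec only ever returns on failure. A sketch of the idiom, using a deliberately bogus path like the mis-spelled "mainprog" experiment above:

```perl
#!/usr/bin/perl
# "exec ... or" is the idiomatic place to bail out of the child:
# if exec worked, nothing after it runs at all.
use strict;
use warnings;

my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
    # /no/such/program is deliberately bogus, so the exec fails
    exec("/no/such/program")
        or do { warn "exec failed: $!"; exit 1; };
}
waitpid($pid, 0);
my $status = $? >> 8;
print "child exit status: $status\n";   # 1: the child bailed out
```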

An easier way to handle these things might be the daemontools package referenced above; your program would simply use "svc" to start and stop the sub-programs. Check that out before you reinvent too many wheels; you may still decide to roll your own, but you may as well know about other ways.

Here's the sample program. You'd want a much smoother and prettier implementation if you were really doing this, but it does show how fork and exec work in Perl. Be careful of Perl books here: there was (still is?) an error in "Programming Perl" that shows "exec PATHNAME LIST" rather than what is actually needed: "exec PATHNAME, LIST" (note the comma). Many other Perl books and web pages have propagated that error so it is quite common to see it, even in supposedly working programs given as examples.

#!/usr/bin/perl
use Errno qw(EAGAIN);
# auto-reap children: exited jobs never become zombies
$SIG{CHLD}='IGNORE';
#
# read in config
#
open(I, "config") or die "Couldn't open config file $!";
#
# array for parameters to be passed
#
%stuff=();
$which="";
while (<I>) {
  chomp;
  if (/^\[[A-Za-z]*\]$/) {
    # bracketed name 
    s/^.//;
    s/.$//;
    $which=$_;
    $jobs{$which}=0;
  }
  if ( /=/) {
    # parameters: split on the first '=' only, since values
    # like the http datasource can themselves contain '='
    ($a,$b)=split /=/,$_,2;
    $key="$which|$a";
    $stuff{$key}=$b;
  }
}
close I;
#
# We now have $jobs{NWFTP}, $jobs{PFIELD} etc, and $stuff has 
# entries like:
# 
# $stuff{NWFTP|mainprog}=ftpget.pl
# $stuff{NWFTP|datasource}=ftpsite.com/pub

# 
# loop forever
#
while (1) {
$x=1;
print "Choose:\n";
foreach (sort keys %jobs) {
  # 
  # The jobs{$_} is pid of running processes
  # Sending a kill 0 actually just checks that a
  # job is still running.  These will be zero at startup
  $s=0;
  $s=kill 0, $jobs{$_} if $jobs{$_};
  $jobs{$_}=0 if (not $s);


  $status="[not running]" if not $jobs{$_};
  $status="[running ($jobs{$_})]" if $jobs{$_};

  print "$x) $_ $status\n";
  $x++;
}
#
# get input
#
$g=<>;
chomp $g;
if ($g > 0 and $g < $x) {
  chose($g);
}
}

sub chose {
 my $g=shift;
 $x=0;
 # which one?
 foreach (sort keys %jobs) {
  $x++;
  next if $x ne $g;
  $which=$_;
  last;
  }
  print "----\n$which\n----\n";
  foreach (sort keys %stuff) {
     ($a,$b)=split /\|/;
     # so $a is 'NWFTP' and $b is 'datasource', etc.
     next if $a ne $which;
     print "$b: $stuff{$_}\n";
  }
  $status="[not running]" if not $jobs{$which};
  $status="[running ($jobs{$which})]" if $jobs{$which};
  print "\n$status\n";
  print "\n(L)og\n";
  print "(K)ill\n" if ($jobs{$which});
  print "(S)tart\n" if (not $jobs{$which});
  $gg=<>;
  chomp $gg;

  $m="$which|mylog";
  $mylog=$stuff{$m};

  if ($gg =~ /S/i ) {
     print "Starting up $which ..\n";
     $jobs{$which}=runit($which);
     #
     # give it time to fail
     # if we check too quickly, it will always be there
     sleep 2;
     $s=kill 0, $jobs{$which};
     if (not $s) {
       print "Couldn't start $which\n";
       $jobs{$which}=0;
     }
    return;
  }
  if ($gg =~ /k/i ) {
    $j=$jobs{$which};
    $cnt= kill 1, $j;
    # if the signal couldn't be sent, $cnt will be zero, so try -9
    kill 9, $j if not $cnt;
    $s=kill 0, $j;
    # if kill 0 now fails, the process really is gone
    $jobs{$which}=0 if not $s;
    $now=time();
    $today=localtime($now);
    open (O,">>$mylog") or print "Can't write to $mylog $!";
    print O "killed $which $j $today\n" if not $s;
    print O "couldn't kill $which $j $today\n" if $s;
    close O;
    return;
  }
  if ($gg =~ /l/i ) {
      open(I,"$mylog") or print "Can't read $mylog $!";
      $x=0;
      while (<I>) {
       $x++;
       $z=$x % 50;
       print;
       if (not $z) {
        print "Enter for more, Q to quit\n";
        $g=<>;
        if  ($g =~ /q/i) {
           close I;
           return;
        }
       }
      }
      close I;
      return;
  }

}

sub runit {


my $which=shift;

FORK: {  
  if ($pid=fork) {
    return $pid;
    # this is the parent, so return the pid
    # everything below here is either the child or a very major 
    # system failure
  }
  elsif (defined $pid) {
    # packup the arguments
    @estuff=($which);
    foreach (sort keys %stuff) {
      ($a,$b)=split /\|/;
      push @estuff,"$b=$stuff{$_}";
    }
    open (O,">>$mylog") or print "Can't write to $mylog $!";
    print O "Running forker.pl for @estuff\n";
    close O;
    exec "forker.pl",@estuff;
    # shouldn't reach this unless exec fails
    open (O,">>$mylog") or print "Can't write to $mylog $!";
    print O "Couldn't exec\n";
    close O;
    exit 1;
    # exit, NOT return.  We're the child process
  }
  elsif ($! == EAGAIN)  {
     sleep 3;
     redo FORK;
  }
  else {
    open (O,">>$mylog") or print "Can't write to $mylog $!";
    print O "Can't fork$!\n";
    close O;
    print " Could not fork $!";
    return 0;
  }
 }

}




"It is interesting how basic differences like that do affect OS design and optimization. Windows programmers don't generally understand the cooperative tools idea of Unix/Linux, and that affects basic design decisions."

I think one of the reasons the UNIX philosophy never caught on in the Windows world is its MS-DOS ancestry. DOS's scripting capabilities are very limited when compared to even the earliest form of the Bourne shell. Also, redirection was very limited and pipes, in the UNIX sense, didn't exist.

Given the weakness of the MS-DOS environment, it shouldn't come as a surprise that Windows programmers aren't tuned into scripting as a way to get things done. After all, Bill Gates would just as soon you never looked behind the curtain. You might be dismayed when you saw who was pulling the levers.

--BigDumbDinosaur

Well, I wasn't thinking of scripting, but that's a valid point also. I was thinking more of general program design.

--TonyLawrence

Has anyone got any tips for spawning and managing processes from Perl scripts under Windows? I've been porting a Perl script over from Unix that carries out a lot of its work by spawning subprocesses and have found a number of difficulties. I've been using both pseudoprocesses (created using the fake ActivePerl fork) and real processes (using Win32::Process). It's been a lot of painful trial-and-error learning and I wonder whether anyone else has been down this road before?

-- TomP

Well, as discussed above, Windows doesn't do this very well.
I'm not surprised you have problems, but I can't offer any help: I just don't do Windows, sorry.

--TonyLawrence

I just ran into the same fix with processes in Windows on automated tasks. Not sure if this info's out of date - but the Win32::Job module looks like it could help. I am having trouble getting Win32::Job/Process to handle spaces in arguments (even when quoted, or maybe I'm doing it wrong), and on Win2000 fork() doesn't return the child pid (?!), it returns a negative number that I'm unsure is the process group id.

-- CW





Fri Jul 22 21:45:23 2005: 853   anonymous


The negative number returned is a "pseudoprocess" id. It just indicates that it's not a true-to-life process, because of the way Windows deals with processes.
