APLawrence.com -  Resources for Unix and Linux Systems, Bloggers and the self-employed

Automating Program Startup


© March 2000 Tony Lawrence

More about startup scripts at Unix and Linux startup scripts, Part 1, including more information about init replacements.

When you need something to start automatically, there are several ways to do it. Which one you use depends on your specific needs.

One of the most common startup tasks is to set a default route. Actually, there are several ways to do that, and which you use really depends on your circumstances and configuration: see Routing for details.

Any program may need environment variables set for it to work. The environment you have when logged in is NOT the same as what a program will have when executed by the means described below. Keep that in mind- you may need to specifically set such things as PATH, TERM, etc.
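
One way to find out what a script really depends on is to run it with almost nothing in its environment before handing it to init or cron. The following is a minimal sketch, not a definitive recipe: the helper name "run_clean" and the particular PATH, TERM and HOME values are assumptions chosen for illustration.

```shell
#!/bin/sh
# run_clean (hypothetical helper): run a command with only the variables
# named here, roughly imitating the sparse environment a program gets
# when it is started by init or cron. The values are assumptions.
run_clean() {
    env -i PATH=/bin:/usr/bin:/usr/local/bin TERM=dumb HOME=/ "$@"
}
```

Something like "run_clean /usr/local/bin/myscript" will then show quickly whether the script was leaning on a variable from your login session. Whatever turns out to be missing can be set and exported at the top of the script itself.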

See Basic Scripting if you need to know how to create a script and run it.

Inittab

If the program should restart automatically whenever it fails, then inittab with "respawn" is a good place for it. Note that on SCO OSR5 you will want to edit /etc/conf/cf.d/init.base or (depending on just what you've done) one of the other inittab sub-files in /etc/conf/init.d to make any change to inittab stick across kernel rebuilds; see "man inittab". Linux doesn't have this complication.

Anytime you change anything in /etc/inittab on SCO, you need to make the corresponding change in the appropriate sub-file. When the kernel environment is rebuilt after a link, these sub-files are gathered together to replace your existing inittab. On a stock system (without third party devices that add their own entries, like multiport serial cards), the file is built from /etc/conf/cf.d/init.base, /etc/conf/init.d/sio and /etc/conf/init.d/scohttp. Linux is straightforward; just make the change in inittab.

Adding to inittab is fairly simple. You have plenty of examples to copy from, and all you really need to understand is the run-level you want (usually 2) and what "action label" (usually "respawn") you want init to use. Here's a sample line:

mine:234:respawn:/usr/local/bin/myscript

The "mine" is just a (required) unique id, "234" means that I want it running at run levels 2, 3 and 4, and "respawn" means to restart it if it gets killed or exits on its own. On SCO, be sure to add the same line to /etc/conf/cf.d/init.base so that it will be there after a kernel environment rebuild.

You also have to tell init to take note of your changes. Rebooting is the extreme method; using "init q" is quicker.
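
Since the id has to be unique, it can be worth checking for a clash before appending a line. The sketch below is only an illustration under stated assumptions: "add_inittab_entry" is a made-up helper name, and it takes the file to edit as an argument so that it can be tried on a copy first.

```shell
#!/bin/sh
# add_inittab_entry FILE ID LEVELS ACTION COMMAND   (hypothetical helper)
# Appends "ID:LEVELS:ACTION:COMMAND" to FILE unless ID is already used.
add_inittab_entry() {
    file=$1 id=$2 levels=$3 action=$4 cmd=$5
    if grep -q "^$id:" "$file"; then
        echo "id '$id' is already present in $file" >&2
        return 1
    fi
    printf '%s:%s:%s:%s\n' "$id" "$levels" "$action" "$cmd" >> "$file"
}

# Example against a scratch copy rather than the live file:
#   cp /etc/inittab /tmp/inittab.test
#   add_inittab_entry /tmp/inittab.test mine 234 respawn /usr/local/bin/myscript
```

Once the real /etc/inittab (and, on SCO, the matching sub-file) has the line, "init q" makes init re-read it.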

There are choices other than "respawn". For example "powerfail" and "powerwait" could be useful- see "man inittab".

Inittab can also be the choice if you just want something to start up once- that's what the "once" flag is for- but generally you probably wouldn't do it that way because it clutters inittab with entries that don't need to be there. Inittab should be used when you need that respawning capability because it's foolproof- your program WILL restart if it dies, no matter what. You can't get that guarantee any other way.

There is another possible trick with inittab, and that's to marry the respawn with one of the special a, b, or c run levels. These special levels (see "man init") only start when you tell them to ("init a" for example), but don't really change the run level- they just add the processes specified for that run level. That could let you combine the bulletproof respawning (use "ondemand" instead of "respawn" here) of init with timing from cron or some other event that would call "init a" when appropriate.

If the thing you called with respawn dies instantly, init will complain, saying "respawning too rapidly" - it could be a getty, for example, that thinks it sees someone logging in. Look for (for example) both tty1A and tty1a enabled, or similar problems. Look for bad arguments to whatever init is calling too. For example, old SCO Unix might complain if there was any trailing whitespace after the speed argument to "getty".

I saw this with X failing on Linux. I changed:

id:5:initdefault:
to
id:3:initdefault:
 

while I worked on the issue.

The rc scripts

If the program needs to both start and stop (that is, there are special things you need to do when the system is shutting down like clean up temporary files, etc.) then it should have proper S and K scripts in the /etc/rc2.d hierarchy.

Typically, you want something to start at run level 2- when the system goes multi-user. If you examine inittab, you'll see that it calls /etc/rc2 with the "wait" keyword when it enters run level 2 (other systems, such as Linux, do the same thing, although script names may be different). The /etc/rc2 script is a "superscript"- it calls other scripts. You could just add your command to /etc/rc2 itself, but that's not the way other administrators would expect you to do it.

What you are expected to do is put a script in /etc/rc2.d. It needs to be named so that "prc_sync" (SCO) or /etc/rc.d/rc (Linux) will recognize it. That means it will begin with an I, K, S or P. So that you can control when it runs in relation to the other scripts in /etc/rc2.d, you name it so that, beginning with its second letter, it sorts alphabetically to the position you want. That's alphabetic, NOT numeric- so S100mine will execute before S80lp. The first letter is ignored for ordering (but it has to be S, I or P for it to run as the system starts- and that has to be uppercase S, I or P- anything else will be ignored by prc_sync).
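
The alphabetic ordering is easy to check with a plain sort. This is a small sketch for illustration only: "rc_order" is a made-up helper, "S90other" is a made-up script name, and the C locale is forced so that the comparison is strictly character by character.

```shell
#!/bin/sh
# rc_order (hypothetical helper): print the given script names in the
# order a plain character-by-character sort puts them.
rc_order() {
    printf '%s\n' "$@" | LC_ALL=C sort
}

rc_order S80lp S100mine S90other
# prints:
#   S100mine
#   S80lp
#   S90other
```

Padding the numbers to the same width (S080lp, S100mine) is the usual way to make alphabetic and numeric order agree.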

Linux uses only S and K at this time. Scripts that begin with "K" are "Kill" scripts- normally used to stop your process (if necessary) as the system goes down (technically, as it LEAVES the run level). Note that your process will be killed anyway; you only need a K script if you need to handle the death specially. Many existing K scripts are links to S or P scripts- this works because the control script that calls them uses "start" as an argument for the startup scripts, and "stop" for the kill scripts- the scripts simply test their arguments and act accordingly.
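
A script that serves as both the S and the K script typically looks something like the sketch below. Everything specific in it is an assumption made for illustration: the function name "rc_mine", the pid file location, and the "sleep 300" that stands in for a real daemon.

```shell
#!/bin/sh
# Hypothetical body for something like /etc/rc2.d/S95mine, with a K script
# linked to the same file. PIDFILE and DAEMON are stand-ins.
PIDFILE=${PIDFILE:-/tmp/mydaemon.pid}
DAEMON=${DAEMON:-"sleep 300"}

rc_mine() {
    case "$1" in
    start)
        # Detach the daemon's output so the caller is never left waiting on it
        $DAEMON >/dev/null 2>&1 &
        echo $! > "$PIDFILE"
        echo "mydaemon started"
        ;;
    stop)
        if [ -f "$PIDFILE" ]; then
            kill "$(cat "$PIDFILE")" 2>/dev/null
            rm -f "$PIDFILE"
        fi
        echo "mydaemon stopped"
        ;;
    *)
        echo "usage: {start|stop}" >&2
        return 1
        ;;
    esac
}

# In a real rc script the last line would pass the argument through:
#   rc_mine "$1"
```

The same file then does the right thing whether the control script calls it with "start" or with "stop".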

The normal choice is "S" or "P"- scripts beginning with "P" are run in parallel (not on Linux), so don't use this if your script depends on something else (like TCP/IP or the LP service) already being started. Both "P" and "S" scripts are given only so much time to finish (120 seconds by default, but this can be changed in /etc/rc2) and will be killed if they do not complete in that time. Scripts beginning with "I" are never killed. One other difference is that the "I" scripts are allowed to read and write stdin and stdout (usually the console) so they can interact with you at startup and so that you see their output immediately. The other scripts cannot do that; their output goes to a log file which prc_sync displays AFTER the script finishes. See "man inittab" (and for SCO "man rc2.d" and "man prc_sync") for more details.

The SCO /etc/rc? scripts are much more complex than their Linux equivalents. Linux systems I've seen recently use /etc/rc.d/rc and call that single script with the argument for the desired new run level. The Linux scripts are much easier to comprehend at first glance, but the concepts are the same- run the K stuff as you leave, the S as you enter (Linux doesn't have the P type for parallel execution- at least not yet!). Different Linux versions do have different rc scripts, and it is interesting to examine them.
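
The K-then-S idea can be sketched in a few lines. This is a deliberately simplified illustration and not the rc script shipped by any particular distribution: "run_rc_dir" is a made-up name, and the scripts are run through /bin/sh rather than executed directly.

```shell
#!/bin/sh
# run_rc_dir DIR (hypothetical): run every K script in DIR with "stop",
# then every S script with "start", in the order the shell's glob sorts them.
run_rc_dir() {
    dir=$1
    for f in "$dir"/K*; do
        if [ -f "$f" ]; then /bin/sh "$f" stop; fi
    done
    for f in "$dir"/S*; do
        if [ -f "$f" ]; then /bin/sh "$f" start; fi
    done
    return 0
}
```

Pointing it at a scratch directory holding a couple of one-line scripts is a harmless way to watch the ordering.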

There are very interesting sections of /etc/rc2 (SCO) that run scripts in other directories, and at first I thought they were trying to prevent hanging. You can look at the sections in /etc/rc2 that start off with a "sleep 100000 &" to see what I mean.

First thing, they capture the PID of that sleep process. Then, they start a "{" block which keeps the code within those brackets in this shell, probably so they can use the saved PID within the block without exporting it.

Within the block, they loop through the scripts, executing them with /bin/sh. Outside of that "for" loop, but within the block, a "kill -ALRM" is sent to the sleep PID. The entire block is piped to the LOGCMD, which has its own "this shell" brackets and is sent to background.

Outside of this, a "wait" is done on the sleep PID.
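
Put together, the structure just described looks roughly like the sketch below. This is a reconstruction from the description, not the actual text of /etc/rc2: the function name "run_dir_scripts" is made up, and a plain "cat" stands in for LOGCMD.

```shell
#!/bin/sh
# run_dir_scripts DIR (hypothetical reconstruction, not SCO's code)
run_dir_scripts() {
    dir=$1
    sleep 100000 &              # the long sleep
    sleep_pid=$!                # capture its PID

    {
        # run each script in turn (sequentially, not in parallel)
        for f in "$dir"/*; do
            if [ -f "$f" ]; then /bin/sh "$f"; fi
        done
        # reached only after every script has returned
        kill -ALRM "$sleep_pid"
    } 2>&1 | { cat; } &         # "cat" is a stand-in for LOGCMD

    # the caller blocks here until the sleep ends or is killed
    wait "$sleep_pid" || :
}
```

Written out this way, it is plain that the "wait" cannot return early unless the loop finishes: a script that hangs inside the loop keeps the kill from ever being sent.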

When I first looked at this, I immediately thought that its purpose is to act like prc_sync and move on if the scripts being run don't complete, and actually this would do that, but not for 1 million seconds, because that "kill -ALRM" can't be reached until the scripts complete- so what's the point anyway? If the scripts finish, it seems like it sends a kill that's only necessary because of the "wait" for the sleep. If they don't finish, you'll eventually move on, but 1 million seconds is a long time to wait.

So- what the heck are they trying to do here? Why not just run the scripts without this sleep/kill/wait packaging? And why use a kill -ALRM anyway? If I were going to do something like this, I would have set a 120 second sleep around each script execution, but why would I kill the sleep with ALRM specifically- unless it's just to keep the message that would appear to match prc_sync's "Alarm Call".

The only thing all this seems to accomplish is to allow the for loop to run in background while the rc2 script waits for it. What's the point of that? If you aren't going to proceed for 277 hours, why bother?

One of SCO's TA's (https://aplawrence.com/cgi-bin/ta.pl?arg=110825) explains this, or at least pretends to. According to the TA, the scripts are run in parallel (they aren't) to speed up execution, and the "sleep" is there to prevent a bad driver load (apparently that's what these scripts are intended for) from hogging the cpu 100%. That makes no sense to me, because there are several processes started before this, including anything that starts from /etc/tcp. If you didn't have telnetd running, there would be no way to use the machine if one of these scripts did hang, so having the sleep seems superfluous. Further, if the sleep time were knocked down to a more reasonable number, a hang wouldn't stop the initialization. Maybe I'm missing something, but I sure don't think this makes any sense.

Anyway, if you put a badly written script in one of these directories, it can stop your system from booting (at least for 277 hours!) because inittab is set to "wait" for /etc/rc2 to finish- if it does not finish, getty's never start on the console. If you have TCP/IP, that will have started before these scripts, so you can telnet in. If you did a "ps", you'd see /etc/rc2 still running- kill it and init would proceed.

The only reason to use rc2.d scripts is if you want the timeout features, the parallel processing, or if you want a matching "K" script in /etc/rc0.d that will get called with "stop" (see /etc/rc0) to shut your process down as the system goes down.

If you just need it to start up, and don't need any of those features, just add your script to the /etc/rc.d/8 directory (SCO). You can name this anything you like; no special conventions are required. But understand that this means ANY script left there will be run- I've seen people make copies of the "userdef" file ("userdef.090198" etc.) thinking that these aren't executed- they will be.

It is usually a good idea to put whatever commands are inside in the background (&) so that they don't delay your startup should they hang or otherwise take a long time to return. That would usually mean that you'd have two scripts- one in some other directory (/usr/local/bin perhaps) that is the actual script you want to run, and one here in /etc/rc.d/8 that just calls that script with an "&" to put it in background.

Another reason to do this is when you need to delay the startup. I often use a script I call "setprint" to set flow control for serial printers. Although the multiport manufacturers all claim that this shouldn't be necessary, I have found that this sometimes won't work unless you delay it a bit. That's easy to do with a "sleep" command, but if I put the "sleep" in a script in /etc/rc.d/8/, my startup would be delayed. So, I put a "printers" script in /etc/rc.d/8 and it contains "/usr/local/bin/setprint &". The actual "setprint" script in /usr/local/bin has a "sleep 60" at its top, but that doesn't delay startup- the rc.d/8/printers script returns immediately, and then "setprint" starts doing its work 60 seconds later.
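
The same effect can be had from a single file by putting the sleep and the command together in a background subshell. This is a variant of the two-script arrangement rather than the arrangement itself, and "start_delayed" is a made-up helper name.

```shell
#!/bin/sh
# start_delayed SECONDS COMMAND [ARGS...]   (hypothetical helper)
# Returns at once; the command itself runs SECONDS later in the background.
start_delayed() {
    delay=$1
    shift
    ( sleep "$delay"; "$@" ) >/dev/null 2>&1 &
}

# For instance, in a startup script:
#   start_delayed 60 /usr/local/bin/setprint
```

Keeping two separate scripts still has one advantage: the real script can be run and tested by hand without touching the startup directory.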

Keep in mind that these rc.d/8 scripts are called by /etc/rc2.d/P88USRDEFINE, so they will actually be executing in parallel with the other P scripts and will be executing before any of the *90 scripts.

On most Linux systems, add commands to /etc/rc.local.

If you need something to run under the identity of some user other than root, use

su user -c "command"
 

Things have changed since this was first written. See Unix and Linux startup scripts, Part 1 for more recent information.

Inetd.conf

If you want your program to respond to a network connection, and don't need it hanging out forever waiting for one, take advantage of "inetd". You need an entry in /etc/inetd.conf (plenty of examples to copy) and also in /etc/services. You then need to signal inetd with a "kill -1" to get it to re-read its files.
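
For an ordinary "stream"/"nowait" service, inetd accepts the connection and runs the program with the connection attached to its standard input and output, so the program needs no socket code at all. The sketch below assumes a made-up service called "myupper" on a made-up port; the two configuration lines are shown only as comments.

```shell
#!/bin/sh
# Hypothetical entries (service name, port and path are assumptions):
#   /etc/services:    myupper   7777/tcp
#   /etc/inetd.conf:  myupper stream tcp nowait nobody /usr/local/bin/myupper myupper

# echo_upper: read lines from the connection (stdin) and send each one
# back in upper case (stdout).
echo_upper() {
    while IFS= read -r line; do
        printf '%s\n' "$line" | tr '[:lower:]' '[:upper:]'
    done
}

# In the real /usr/local/bin/myupper the last line would simply be:
#   echo_upper
```

Because the handler only reads stdin and writes stdout, it can be tested from the command line with a pipe before inetd ever sees it.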

More modern Linux systems use xinetd.

Cron and at

A final way to do this is to use cron to start and stop your script at certain times of day- that's sometimes done so that the program won't be running during backups, etc. You need something that will kill the process at the appropriate time, and then cron or at can start it up again after the backup finishes. You will probably want to put this in a simple shell wrapper that uses ps or some other method to determine if it is already running and abort if so, so the "start" portion can be called regularly if the program is prone to failure from outside causes. Also see Cron, At and Batch (examples)
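
A minimal sketch of such a wrapper follows. It uses a pid file and "kill -0" as the "some other method" rather than parsing ps output; the helper name "start_once", the pid file path and the crontab line are all assumptions for illustration.

```shell
#!/bin/sh
# start_once PIDFILE COMMAND [ARGS...]   (hypothetical helper)
# Starts COMMAND in the background unless the process recorded in
# PIDFILE is still alive.
start_once() {
    pidfile=$1
    shift
    if [ -f "$pidfile" ] && kill -0 "$(cat "$pidfile")" 2>/dev/null; then
        echo "already running"
        return 0
    fi
    "$@" >/dev/null 2>&1 &
    echo $! > "$pidfile"
    echo "started"
}

# Hypothetical crontab line calling a wrapper built on this every 15 minutes:
#   0,15,30,45 * * * * /usr/local/bin/start_myprog
```

A pid file can go stale if the process dies and its number is reused by something else, so on a busy system a ps-based check of the program name may be the safer test.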


Autologin

You may want a program running on a particular terminal or screen. This is sometimes called a "kiosk" application. You can do that with an appropriate entry in inittab, but it does take a little more setup to get everything really working as it would be had you actually logged in and run the program. John Dubois has a nice "autologin" script for SCO at ftp://ftp.armory.com/pub/admin/autologin that you can use for this. You'd create a new user (or use an existing user) whose .profile will do whatever is required to run the application. Edit /etc/inittab (and the appropriate sub-file; see above) to create the autologin. For example, here's "tony" set for autologin on tty11:

c11:234:respawn:/usr/local/bin/autologin tony tty11 38400
 

It's also possible to use commercial products like Facetterm to start multiple programs with one login. Combining that with the "autologin" script can get quite a lot happening on a single terminal!

Starting a process on another machine

You may need to start a process on some other machine because of something that has happened here. There are several ways to do that. The so-called "r" commands ("rsh", "rcmd") are the obvious choice if available, but these are often disabled or removed for security reasons (see General Security). If so, you need to find some other way to do it. An snmp trap running on the other machine is one way, and so is using Expect to automate a telnet session to login and execute the command. If you are going to execute the command in background and log off, you need to use "nohup" to keep it running (or write it to ignore the HUP signal).
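
The nohup part can be sketched as below. The helper name "run_detached", the host name "otherhost" and the job path are assumptions for illustration; the redirections are there because a remote shell session will generally not end while the background job still holds its input or output open.

```shell
#!/bin/sh
# run_detached COMMAND [ARGS...]   (hypothetical helper)
# Start a command that ignores HUP and holds none of the caller's
# terminal streams, so logging off does not take it down.
run_detached() {
    nohup "$@" </dev/null >/dev/null 2>&1 &
}

# Hypothetical use through rsh, where rsh is available:
#   rsh otherhost 'nohup /usr/local/bin/myjob </dev/null >/dev/null 2>&1 &'
```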

C-Kermit can sometimes be used instead of Expect; see Kermit Scripts.


1 comment



Tue Sep 25 18:29:00 2007: 3158   JoshDavis


Thanks for the great howto on the different types of init scripts. I found it very informative and easy to understand. My thoughts on the P scripts and the sleep time is that should a script lock, since sleep is being called it must prevent the whole system from locking. If one or two P scripts were to lock without sleep I would think any CPU would become 0% free causing one heck of a system wide hang.
