How shells call other programs

An article titled "How Linux or UNIX Understand which program to run" got picked up by a few dozen RSS feeds recently. It's not a bad article.

It is slightly incorrect, though.

The article implies that the shell reads the command and decides what to do. It's actually the kernel that makes a lot of the decisions.

For example, when you type "date", the shell searches the directories in its PATH and finds /bin/date, but then it simply execs that file: it is the kernel that loads and runs "date".

You can see this by running bash under strace and giving it various commands. When you type "date", for example, the shell goes looking in its PATH:

stat64("/usr/kerberos/bin/date", 0xbffff750) = -1 ENOENT (No such file or directory)
stat64("/usr/local/bin/date", 0xbffff750) = -1 ENOENT (No such file or directory)
stat64("/bin/date", {st_mode=S_IFREG|0755, st_size=38588, ...}) = 0
stat64("/bin/date", {st_mode=S_IFREG|0755, st_size=38588, ...}) = 0
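
The lookup in that trace can be approximated in a few lines of Python. This is only a sketch of the idea, not bash's actual code: the function name find_in_path is made up for illustration, and the test for "regular file with an execute bit set" is an assumption about what counts as a match.

```python
import os
import stat


def find_in_path(name, path=None):
    """Return the first PATH entry containing an executable `name`, else None.

    Hypothetical helper: it mimics the stat64() calls in the trace by
    stat-ing each candidate in turn and skipping the ones that fail.
    """
    if path is None:
        path = os.environ.get("PATH", "")
    for directory in path.split(os.pathsep):
        # An empty PATH entry conventionally means the current directory.
        candidate = os.path.join(directory or ".", name)
        try:
            st = os.stat(candidate)
        except OSError:
            continue  # e.g. ENOENT, as for /usr/kerberos/bin/date above
        # Assumption: accept regular files with any execute bit set.
        if stat.S_ISREG(st.st_mode) and st.st_mode & 0o111:
            return candidate
    return None
```

Calling find_in_path("date") on the system traced above would presumably return "/bin/date", since the two earlier directories have no such file.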

The shell then clones itself with clone (the equivalent of "fork" for us older Unix folk; see the man page), and the child process execs /bin/date:

[pid  2088] --- SIGSTOP (Stopped (signal)) @ 0 (0) ---
[pid  2088] getpid()                    = 2088
[pid  2088] rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
[pid  2088] rt_sigaction(SIGTSTP, {SIG_DFL}, {SIG_IGN}, 8) = 0
[pid  2088] rt_sigaction(SIGTTIN, {SIG_DFL}, {SIG_IGN}, 8) = 0
[pid  2088] rt_sigaction(SIGTTOU, {SIG_DFL}, {SIG_IGN}, 8) = 0
[pid  2088] rt_sigaction(SIGINT, {SIG_DFL}, {0x8082b40, [], SA_RESTORER, 0x4004feb8}, 8) = 0
[pid  2088] rt_sigaction(SIGQUIT, {SIG_DFL}, {SIG_IGN}, 8) = 0
[pid  2088] rt_sigaction(SIGTERM, {SIG_DFL}, {SIG_IGN}, 8) = 0
[pid  2088] rt_sigaction(SIGCHLD, {SIG_DFL}, {0x8074ff0, [], SA_RESTORER, 0x4004feb8}, 8) = 0
[pid  2088] execve("/bin/date", ["date"], [/* 22 vars */]) = 0
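
The fork-then-exec pattern in the trace looks roughly like this in Python. It is a minimal sketch: the function name run is invented here, the signal handling that bash performs in the child is left out, and exiting with 127 on a failed exec is just the usual convention.

```python
import os


def run(path, argv):
    """Fork, exec `path` in the child, and return the child's exit status."""
    pid = os.fork()
    if pid == 0:
        # Child: ask the kernel to replace this process with the new program.
        # On success execve never returns.
        try:
            os.execve(path, argv, os.environ)
        except OSError:
            os._exit(127)  # exec failed; 127 is the conventional status
    # Parent: wait for the child, as an interactive shell would.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)  # child was killed by a signal
```

For instance, run("/bin/date", ["date"]) corresponds to the execve line above: the shell's only job is to supply the path and arguments, and the kernel does the loading.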

If you type the name of a shell script that has no "#!" line instead, the shell does exactly the same thing, but the exec fails with ENOEXEC, so the shell opens the script and interprets it itself:

[pid  2086] execve("./shellscript", ["./shellscript"], [/* 21 vars */]) = -1 ENOEXEC (Exec format error)
[pid  2086] open("./shellscript", O_RDONLY|O_LARGEFILE) = 3
[pid  2086] read(3, "echo foo\n", 80)   = 9
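
That fallback can be sketched as follows. Note one deliberate difference from the trace: bash reads and interprets the script in the same child process, whereas this sketch takes the simpler route of handing the file to /bin/sh. The function name run_with_fallback and the exit statuses 126 and 127 are assumptions made for illustration.

```python
import errno
import os


def run_with_fallback(path, shell="/bin/sh"):
    """Exec `path`; if the kernel rejects it with ENOEXEC, run it via `shell`."""
    pid = os.fork()
    if pid == 0:
        try:
            os.execve(path, [path], os.environ)
        except OSError as err:
            if err.errno != errno.ENOEXEC:
                os._exit(126)  # some other failure, e.g. permission denied
        # The kernel did not recognise the file format, so treat the file
        # as a shell script and let the shell interpret it.
        try:
            os.execve(shell, [shell, path], os.environ)
        except OSError:
            os._exit(127)
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -os.WTERMSIG(status)
```

The point to notice is the order of events: the kernel gets the first chance to run the file, and the shell only steps in after the kernel has refused.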

However, if the script starts with "#!" and names some other interpreter (Perl, for example), it's the kernel, not the shell, that runs that interpreter:

[pid  2141] execve("./", ["./"], [/* 21 vars */]) = 0
[pid  2141] uname({sys="Linux", node="kerio", ...}) = 0
[pid  2141] brk(0)                      = 0x804bc18
[pid  2141] open("/etc/", O_RDONLY) = -1 ENOENT (No such file or directory)
[pid  2141] open("/usr/lib/perl5/5.8.0/i386-linux-thread-multi/CORE/tls/i686/mmx/", O_RDONLY) = -1 ENOENT (No such file or directory)
.. similar lines deleted
[pid  2141] stat64("/usr/lib/perl5/5.8.0/i386-linux-thread-multi/CORE/tls/i686/mmx", 0xbffff220) = -1 ENOENT (No such file or directory)
[pid  2141] open("/usr/lib/perl5/5.8.0/i386-linux-thread-multi/CORE/", O_RDONLY) = 3
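
To make the kernel's part concrete, here is a rough Python model of how Linux treats a "#!" line: the interpreter is the first word, and everything after it is passed as a single optional argument, followed by the script path and the original arguments. This is a sketch of that behaviour as I understand it on Linux (other Unix systems split the line differently); the function names and the script name "./myscript" used below are hypothetical.

```python
def parse_shebang(first_line):
    """Return (interpreter, optional_arg) for a '#!' line, or None.

    `first_line` is the first line of the file as bytes.
    """
    if not first_line.startswith(b"#!"):
        return None
    rest = first_line[2:].split(b"\n", 1)[0].strip(b" \t")
    if not rest:
        return None
    parts = rest.split(None, 1)
    interpreter = parts[0].decode()
    # Assumption (Linux behaviour): the remainder is ONE argument, not split.
    arg = parts[1].strip().decode() if len(parts) > 1 else None
    return interpreter, arg


def kernel_argv(script, script_args, first_line):
    """Model the argv the interpreter receives, or None without a '#!' line."""
    parsed = parse_shebang(first_line)
    if parsed is None:
        return None
    interpreter, arg = parsed
    return [interpreter] + ([arg] if arg else []) + [script] + list(script_args)
```

With a first line of "#!/usr/bin/perl -w", kernel_argv("./myscript", [], ...) gives ["/usr/bin/perl", "-w", "./myscript"], which is why the trace shows Perl's own startup activity right after the execve: the shell never looked inside the file.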

A minor point, perhaps, but more accurate.


