This article discusses how multiple operating systems
(OSes)
hosted under an emulator or virtual machine can be set up on a single
host and used to automate cross-platform testing of code compilation or
shell
scripts.
It provides details of how to do this
on a Unix-like OS, using Gentoo Linux on an Intel Pentium as an example
host and using QEMU
as the emulator. The example includes running the following
Unix-like operating systems as guests under the host: NetBSD, OpenBSD,
FreeBSD,
Solaris and Gentoo Linux (configured as a guest in case the host
OS is changed).
Fully-functional scripts and reasonably detailed configuration steps
are included. Problems and alternative approaches are documented.
This document assumes a single-user system; however, it does describe
the issues (largely security-related) involved in a multi-user
implementation and
how to deal with them. Two user types are defined so that it will
be
easier for a later revision of this document to default to multi-user
instructions.
Whilst this example does not use guest OSes other than
free-as-in-beer Unix-like OSes running on x86, the approach does not
preclude their use. Some modifications to the approach may be
required for
OSes significantly different from Unix, in particular when a
shell-scripting environment is not present.
Most documentation occurred long after installation and
configuration so I may have missed some steps.
Some scripts were tidied up or fully rewritten for
public release, so they are recent and have not been long tested in
production.
I
would have liked to have used Apple's OSX as a guest OS - as well as
being a popular Unix-family OS it would have been the only non-x86
(PowerPC) test of QEMU's emulation - but I don't have a copy of OSX and
I am not
willing to pay for one.
A complete and ordered instruction
summary can be followed independently of the full article by reading
only the text formatted like this. At times the instructions
assume Gentoo Linux as the host OS. This instruction summary is
not, however, standalone, as it refers to scripts/commands/code that
must be copied from the full article.
Any outstanding problems or issues to be aware of
are formatted like this.
Variables that should be
replaced with appropriate values are formatted like this.
Reminders to myself of incompleteness or things to
be improved (todos) are formatted like this.
How often have you wondered whether the code you have just written
will compile on
NetBSD or whether a particular shell script will work with the Solaris
version of sed?
What if you need to know the answer for many different operating systems, and want to test on all of those OSes locally (rather than on remote internet-accessed test hosts) so that you have control over their configuration?
You could partition your PC and install all of those OSes on
separate
partitions, and then boot into each OS and test separately. This
could even be automated so that as one OS shuts down it modifies the
boot loader to boot the next OS in sequence. The problem is that
time is wasted during booting up and shutting down, and this means that
each
test run has a long duration during which the machine is unavailable
for other uses.
The availability problem can be solved by setting up one or more
separate networked machines
that handle one or more OSes and invoking test processes over the
network. This is
fine if you have more than one machine, but unless you have a dedicated
machine for each OS you will still have to waste time booting between
the
OSes. If you don't want to waste power by leaving the test machine(s)
turned on all the time then you will also still have to mess around
switching them on and off manually, or buy or build some control
hardware to do the job.
Use an open-source emulator or virtualisation technique to run all
the
different operating systems as processes on a host OS. Now there
is no need for more than a
single physical machine, no need to reboot the host OS and no period
of
unavailability - you can occupy yourself with your usual tasks while
the tests run.
Let me describe the final setup before giving instructions on how to achieve it.
An arbitrary script can be run on all of the following guest OSes: NetBSD, OpenBSD, FreeBSD, Solaris and Gentoo Linux.
This can be done using a single command:
runall-os-test -s my_script
The runall-os-test
script connects to each OS in turn
(first booting
it under QEMU if
necessary) through ssh and runs my_script
, piping all the
steps it
takes as well as any error message from my_script
to stderr
with a
time stamp for logging purposes.
The my_script script can assume that three specific local directories
will exist on the guest OS under which it is running. These
directories are network-mapped to directories on the host OS to make it
possible to review script output when the guest OSes are offline: a
read-only directory shared by all guests (my_script must be located
under this directory and will be run from this directory as mounted
locally by the guest; any read-only files that my_script needs may be
located here), a writable directory shared by all guests, and a
writable directory exclusive to each guest. In addition the script can
assume that it runs on each guest OS under an identical username with
an identical user id (UID).
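As an illustrative sketch (not one of the scripts provided later in this article), a my_script that uses these directories might look like the following; /netshare-ro, /netshare-rw and /netshare-rw/common are the reference guest locations given later in this document, and the filenames are placeholders:
#!/bin/sh
# my_script - a trivial cross-OS test: check how this OS's sed behaves
# and record the result in this guest's exclusive writable directory
OS=`uname -sr`
OUT=/netshare-rw/sed-test-output
echo "sed test on $OS" > $OUT
echo xAx | sed 's/A/B/' >> $OUT 2>&1
# a one-line summary goes to the directory shared by all guests
echo "$OS done" >> /netshare-rw/common/summary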
A dynamic IP address on a unique
subnet is assigned to each guest OS. Using some scripting the
/etc/hosts files are modified so that a guest OS can always be referred
to by its hostname from the host
OS and the host OS can always be referred to by its hostname from
the guest OS.
The main problem with non-Unices is that a shell scripting
environment can't be guaranteed. This would require modifications
to the approach perhaps so that depending on what was being tested, a
different launch or helper program was run on the guest OS - e.g. a
version of make to test a compilation.
At the moment this approach maps directories over the network using
NFS. I have, however, tested running Windows 98 and XP as guest
OSes under QEMU and they can successfully connect to a SAMBA server on
the host OS; dhcp assignment of an IP address to the virtual network
card succeeds under these OSes too. So realistically NFS is not a
limitation. I haven't investigated the feasibility of
automatically scripting changes to the hosts file on those OSes.
I suspect that it would be a little harder than for all the UNIX-like
OSes that I currently use as guests, under which it was fairly easily
achieved. Possibly it would require a custom executable;
alternatively the problem could be avoided by using a name server
approach as discussed below.
Given that a virtualisation technique would offer better
performance, why use an emulator? Firstly, because I wanted to
use only open
source or freely available software, and the only virtualisation
technique that I know of that fits that bill is Xen.
However not all of the OSes that I want to run have had the required
modifications made to their kernels so that they can run as guests
under Xen, so Xen is not yet adequate for the task. It may soon
be ready, especially since hardware support has been promised by at
least Intel and AMD. This will allow any unmodified OS kernel to
run under Xen, although a modified kernel will run even faster.
Also under Linux on x86 processors the kqemu module speeds up QEMU
when emulating an x86 machine so that performance is
reasonable and the additional performance offered by virtualisation is
not quite so compelling - although it's still a good reason to switch
to Xen when it is capable of running any x86 OS.
The other advantage of QEMU over virtualisers
like Xen or the commercial VMWare is that it can emulate CPUs and
hardware other
than that of the host. In particular this allows operating
systems designed
for PowerPCs (i.e. OSX) or Sparcs to be run. I haven't yet
attempted
to run a guest OS other than for x86 though.
An alternative to QEMU is Bochs
but my reading suggests that Bochs
is
very much slower than QEMU even when running QEMU without
the kqemu module. Bochs also does not emulate hardware other
than x86.
QEMU may not be such an appropriate choice when the host machine is
not x86-compatible and the guest OSes are to be native to the hardware
of the host. In that case other options should at least be
considered, and another option will be necessary if the host's hardware
is not emulated by QEMU.
Why not use
static
IPs? Because QEMU doesn't provide a means to set the IP address
that it assigns to each tun interface as it boots the virtual
machine. This means that to maintain a static IP address, the
guest OSes would have to be booted in a
specific sequence, and for example the last OS in the sequence could
not be booted without booting the rest of them. It is far
preferable to have the flexibility to boot them in any
order.
Some sort of name service like bind could have been used to maintain
IP address to hostname mappings rather than relying on
scripted modifications to /etc/hosts files. However I have no
other need for bind and I've chosen to avoid it - partly because of the
extra security risk of running unnecessary network software and partly
because I prefer the minimalist approach.
The name service and /etc/hosts approaches
each
require scripting on the host; the drawback to the chosen /etc/hosts
approach is that scripting is also required on each guest OS.
The choice of the three directories is somewhat arbitrary. A simpler approach might use a single writable directory. The reasoning behind the read-only directory is to prevent buggy scripts from deleting themselves or other scripts so that other guest OSes can no longer access them. The shared writable directory is included for situations where it is desirable to make it easier to look at output from separate guest OSes on the host.
It would be useful to map one or more of the
specific directories chosen to the home directory/ies of the
test-script user(s) on the guest OS, however I have not done that for
this version of the document.
A reasonable level of proficiency in UNIX would be helpful, but the
instructions are hopefully detailed enough for anyone to follow -
provided they have access to documentation.
The reference host on which this approach was developed currently
has this configuration:
Some modification to these instructions may be required for
different
hosts.
The initial goal is to set up a virtual network of virtual machines,
each running a different operating system. Each virtual machine
may be booted or shutdown independently of any other machine (although
rebooting the host on which QEMU is running will have some unfortunate
consequences for the rest of the virtual network). There must
only be one running instance of each virtual machine.
Each virtual machine will be dynamically assigned an IP address as
it boots up. This IP address will be on a
separate subnet to all other virtual machines. A separate dhcp
server will be invoked for each virtual machine. The process that
configures and runs the dhcp server is triggered by QEMU. The
(re)mapping of hostnames to IP addresses as an address is dynamically
assigned will occur automatically on both guest OS and host OS through
scripted changes to /etc/hosts.
There are two user types that I will refer to. One is the
single qemu-invoking user. This is the user under which the QEMU
processes will run. This user needs to be able to do things that
require root privileges, such as configure network interfaces, start
dhcp servers for those interfaces and modify entries in
/etc/hosts. This is achieved through the use of the sudo package.
The other user type is the test-script user. This is a user
with an account on the host OS and matching accounts on each of the
guest OSes (same username and UID, but GID isn't required to
match). This is the user under whose account the runall-os-test
script will be run.
This document assumes a single-user system
where the test-script user is the same as the qemu-invoking user.
The provided scripts and setup instructions make the same assumption,
however the changes that would need to be made for a multi-user system
are discussed. For this reason, and because in
a future revision of this document I may remove the assumption of a
single-user system and provide specific multi-user instructions,
the single user has been separated into two. The main issue is of
course security.
I highly discourage using
the root account for either of these user types for the same reasons as
always. Anyone who doesn't
understand what this means would be well advised to find out.
These steps are all specific to Gentoo Linux. For all other
hosts, do whatever is necessary to install QEMU, preferably with kqemu
support if using Linux on x86.
As the root user for this and subsequent
commands
unless told otherwise, add kqemu to the USE flags in /etc/make.conf.
Currently (July 2005) the earliest version of QEMU with kqemu support - 0.7.0 - is masked under portage. To gain access to it, add this line to /etc/portage/package.keywords:
app-emulation/qemu ~x86
Then add this line to /etc/portage/package.mask:
>app-emulation/qemu-0.7.0
The above step prevents versions later than 0.7.0 from
being installed. Given that version 0.7.0 is currently masked, it
is taking a risk running it in the first place and given also that I
have tested qemu-0.7.0 on my machine and know that it works without
as-yet obvious problems, there is no need to risk a later
version. Those willing to take the risk - or if this article is
old enough that versions have significantly changed - may choose to
omit this step or change the version number.
Now, emerge QEMU:
emerge qemu
This will download, compile and install QEMU version 0.7.0 as well as the kernel module kqemu.
Be aware that kqemu is not open source and has specific licence requirements that you should read before using it.
Add the qemu-invoking user to the kqemu group. This will give the user read and write permissions to /dev/kqemu which at some point will be automatically created (if not, the k/qemu documentation describes how to create it manually).
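On the reference Gentoo host this can be done with a command like the following (replace the username as appropriate):
gpasswd -a username_of_qemu-invoking_user kqemu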
First ensure that the qemu-invoking user has read/write permission on the /dev/net/tun device:
chmod ugo+rw /dev/net/tun
will be sufficient unless restrictions are desired in a multi-user
system.
The scripts provided assume the use of the udhcpd server from the
udhcp package. For Gentoo Linux, emerge udhcp:
emerge udhcp
It may be necessary to obtain the udhcp package from its website for hosts without a
package manager or whose package manager does not include it.
For other dhcp servers the script may need modification - particularly the location and format of the option file created by the /sbin/start-tun-interface script as provided below.
QEMU by default invokes /etc/qemu-ifup at startup, passing as a
single parameter the name of the
tun interface that it is assigning to this guest OS (e.g. tun0, tun1,
etc). From the
html documentation and observation, the interface tunX corresponds to
host IP 172.<20+X>.0.1 and guest IP 172.<20+X>.0.2.
This script must achieve several things: configure the IP address of
the tun interface, start (or restart) a dhcp server for that interface,
and update /etc/hosts so that the guest OS's hostname maps to its newly
assigned IP address.
The /etc/qemu-ifup script will be run with the privileges of the qemu-invoking user. As I've already said, root should not be used to run QEMU. The script must perform privileged operations so it is invoked through sudo, which must now be installed if it is not already. Under Gentoo Linux, emerge app-admin/sudo:
emerge app-admin/sudo
The script /etc/qemu-ifup will be a stub that invokes the real script - /sbin/start-tun-interface - with root privileges granted by sudo. So give the qemu-invoking user permission to run /sbin/start-tun-interface as root. Run visudo and add this line to the sudoers file:
username_of_qemu-invoking_user ALL = (root) NOPASSWD: /sbin/start-tun-interface
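To confirm that the entry has taken effect, the qemu-invoking user can list the commands they are permitted to run through sudo:
sudo -l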
Since the script will modify the contents of /etc/hosts it must be
careful not to allow a user to corrupt this file. There are two
issues here. One is intentional malicious vandalism and the other
is accidental error. The first issue only
applies to a multi-user system, which this document isn't specifically
tailored to. The
way to mitigate it is by not allowing a QEMU process to be invoked
directly by a test-script user. Instead the
runall-os-test script (described later) would be broken into two
parts. The first
part would run under the test-script user account, and the second part
would be solely concerned with invoking QEMU processes.
The second part would make sure that options not specified in the guest
OS config file could not be passed to qemu. It would be
non-writable by test-script users and it would be executed from the
first script by using sudo to run it as the qemu-invoking
user. So a test-script user would be defined as a user given
permission in the sudoers file to run the second script as the
qemu-invoking user. Unless there were an error in the script
allowing the
test-script user to break to a shell and assume the identity of the
qemu-invoking user, this would prevent malicious access to the
/etc/qemu-ifup script. Even if that were to occur,
though, there is a second layer of safety:
A third script - /sbin/update-dynamic-hosts - is used to limit any
changes to a specific section of
the hosts
file and prevent the addition of hostnames or IP addresses that exist
outside of this section. This could further
be extended to limit changes to those specific hostnames listed in the
guest OS
config file and to the possible range of IP addresses assigned by
qemu. Also
it would be appropriate (if possible) for /sbin/start-tun-interface to
check that the tun interface
specified is not already in use before continuing.
Create the /etc/qemu-ifup
script by
copying-and-pasting the following block of commands into a terminal
window (this should be done still as root). Don't do this
as the qemu-invoking user. It's safer that the qemu-invoking user
not have ownership or write permission on the scripts. A warning:
for convenience I've included the cat command to automatically create
the scripts provided in this document, however pasting into some
terminal programs, especially the
larger scripts, is problematic - I've had some terminal programs skip
lines and corrupt scripts. So be wary of this approach and if you
suspect corruption, then create each script in an editor and copy and
paste from the code provided (ie the lines between the
<<'ENDOFSCRIPT' and the ENDOFSCRIPT).
cat >/etc/qemu-ifup <<'ENDOFSCRIPT'
#!/bin/sh
# /etc/qemu-ifup
# Takes one parameter which should be tun<num> eg tun0, tun9, tun12, ...
# This will run under the qemu-invoking user account
# The qemu-invoking user must be given sudo permission to run
# /sbin/start-tun-interface
# which must be owned by root and should have permissions 500 (r-x------)
sudo /sbin/start-tun-interface $1
ENDOFSCRIPT
chmod 755 /etc/qemu-ifup
Create the /sbin/start-tun-interface script by copying-and-pasting the following block of commands into a terminal window. Edit it so that the variables set in the top block have appropriate values (refer to later instructions if the meaning of any variables is unclear).
cat >/sbin/start-tun-interface <<'ENDOFSCRIPT'
#!/bin/sh
# /sbin/start-tun-interface
# Takes one parameter which should be tun<num> eg tun0,tun9,tun12,...
# Configures the address of the interface and starts a new dhcp server
# for that interface; killing any existing dhcp server.
# Searches for a file giving the hostname of the guest OS as left by a
# parent process; if a file is found, /etc/hosts is updated so the
# guest OS's hostname aliases the newly assigned IP address.
# Note that running qemu as a background process after storing the
# OS's hostname will dissociate the qemu process so that it will
# not be able to find the hostname file - it will have lost its parent.
# This needs to run as root using sudo from the qemu-invoking user
# account.
# It should be owned by root and have permissions 544 (r-xr--r--)
# Configurable variables
TMPHOSTNAMEDIR=/tmp/qemuhostnames
CONFTMPDIR=/tmp
LEASEFILEDIR=/var/lib/misc
# paths to commands
UDHCPD=/sbin/udhcpd # path to udhcpd
IFCONFIG=/sbin/ifconfig # path to ifconfig
GREP=grep # path to grep
PS=ps # path to ps
SED=sed # path to sed
KILL=kill # path to kill
STAT=stat # path to stat
DATE=date # path to date
# misc config
BASE=172 # First part of the interface's IP address
DELAGE=3600 # hostname files older than one hour may be deleted
TOOOLD=600 # a hostname file ten minutes or older is out of date
# (should average around 10 seconds)
# strip the leading "tun" (if any) from $1
TUN=${1#tun}
# convert $TUN to a definite number (0 if $TUN is not numeric) stored
# in $TUNNUM
let TUNNUM=$TUN+0 2>/dev/null || let TUNNUM=0
# if $1 was not in the form tun<num> where <num> has no superfluous
# preceding zero digits then exit with error 1
if [ "$TUN" == "$1" ] || [ $TUN != $TUNNUM ] ||
( [ $TUNNUM -eq 0 ] && [ "$TUN" != "0" ] )
then exit 1; fi
# TUN is validated as a number; now add 20 to it since
# interface tun<n> gets assigned address 172.<n+20>.0.1
let TUN=$TUN+20;
# Set up variables
IF=$1 # interface eg tun0
CFG_FILE=$CONFTMPDIR/udhcpd.$TUN.conf # dhcp server config file
LEASE_FILE=$LEASEFILEDIR/udhcpd.leases.$IF # dhcp server lease file
DHCPCMD="$UDHCPD $CFG_FILE" # cmd to start dhcp server on interface
# configure the interface
$IFCONFIG $IF $BASE.$TUN.0.1;
# check for an existing dhcp server for the interface and stop it if
# one exists
# first store into $TMP a line(s) in the format "<PID><space(s)><DHCPCMD>"
TMP=`$PS ax o pid,cmd | $GREP "$DHCPCMD" | $GREP -v grep`
if [ -n "$TMP" ]
then
# separate the TMP variable into the positional parameters
set $TMP
# store the PID (if any) of the running dhcp server (ignoring
# the remote possibility of multiple running dhcp servers on
# the interface or a dhcp server invoked with a different
# command and from other than this script)
DHCPD_PID="$1"
# kill any existing dhcp server
if [ -n "$DHCPD_PID" ]; then $KILL $DHCPD_PID; fi
fi
# remove any existing config file or lease file
rm -f $CFG_FILE 2>/dev/null
rm -f $LEASE_FILE 2>/dev/null
touch $LEASE_FILE
# get a list of name servers from /etc/resolv.conf
DIG3="[[:digit:]]\{1,3\}"
IP="\($DIG3\.\)\{1,3\}$DIG3"
NS="^[[:space:]]*nameserver[[:space:]]*"
NAME_SERVERS=`$SED -n "s/$NS\($IP\).*/\1/p" /etc/resolv.conf`
# translate newlines into spaces
NAME_SERVERS=`echo $NAME_SERVERS`
# write a specific config file for the dhcp server for this interface
cat >$CFG_FILE <<EOF
start $BASE.$TUN.0.2
end $BASE.$TUN.0.3
interface $IF
opt dns $NAME_SERVERS
option subnet 255.255.0.0
opt router $BASE.$TUN.0.1
option domain local
# short lease expiry time as Solaris 10 doesn't seem to renew its
# address on reboot if its lease hasn't expired
option lease 120
# since the MAC address is the same for all the different virtual
# machines, each ip address must be associated with the MAC address in
# a different file for each OS
lease_file $LEASE_FILE
EOF
# start the dhcp server for this interface
$DHCPCMD
# search for any file in TMPHOSTNAMEDIR containing the hostname of the
# OS being invoked; the file must have been created by an ancestor
# process and be named for that process's PID; closer ancestors are
# favoured over those more distant.
# If a hostname is found, the /etc/hosts file is updated
unset HN
# if we can't get the current time then skip the search
# the %s specifier is GNU-specific and returns the date
# in unix seconds-since-the-Epoch format
# this script must be modified if the date command does
# not support %s
if NOW=$($DATE +%s)
then
COUNT=1
# start at the current process
PID=$$;
# iterate up the process tree looking for a file named for its
# process ID until we reach init (PID of 1)
MAXCOUNT=100 # to prevent any bugs causing an infinite loop
while [ $COUNT -lt $MAXCOUNT ] && ! [ $PID -eq 1 ]
do
let COUNT=$COUNT+1
FILE=$TMPHOSTNAMEDIR/$PID
# file must exist and we must be able to stat it
if [ -f $FILE ] && TSTAMP=$($STAT -c %Y $FILE)
then
let AGE=$NOW-$TSTAMP
if [ $AGE -gt $DELAGE ]
then
# file too old - delete
rm -f $FILE 2>/dev/null
elif [ $AGE -lt $TOOOLD ] &&
read HN < $FILE && [ -n "$HN" ]
then
rm -f $FILE 2>/dev/null
# update /etc/hosts to reflect the
# guest OS hostname's newly assigned
# IP address
/sbin/update-dynamic-hosts $HN \
$BASE.$TUN.0.2
break;
fi
fi
# obtain from ps the parent process id preceded by a
# header line
PID=$($PS -p $PID -o ppid) || break # avoid inf loop
# ${#} removes the first line header returned by ps
# `echo $` removes the newline left by the header
PID=`echo ${PID#*PPID}`
done
fi
if [ -z "$HN" ]
then echo "Guest OS's hostname not found; /etc/hosts not updated" 1>&2
fi
ENDOFSCRIPT
chmod 544 /sbin/start-tun-interface
Different versions of sed may not support the [[:digit:]] or
[[:space:]] constructs, in which case they must be replaced with an
appropriate substitute: [[:space:]] represents any whitespace and
includes tab as well as space; [[:digit:]] is simply a digit from 0 to
9.
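As a sketch of such a substitution, the variable definitions near the top of the script could be rewritten along these lines (the bracket expressions below contain a space followed by a literal tab character, typed directly, since "\t" is not portable sed syntax):
# [[:digit:]] can simply become [0-9]:
DIG3="[0-9]\{1,3\}"
# [[:space:]] becomes a bracket expression containing a space and a tab:
NS="^[ 	]*nameserver[ 	]*"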
Create the /sbin/update-dynamic-hosts script by copying-and-pasting the following block of commands into a terminal window:
cat >/sbin/update-dynamic-hosts <<'ENDOFSCRIPT'
#!/bin/sh
# /sbin/update-dynamic-hosts hostname ip-address
#
# Updates an IP address - hostname mapping in the /etc/hosts file
# Must be run as root
# Changes are limited to the section of the file enclosed by
# the delimiters (at line beginning) ##STARTDYNAMIC and ##ENDDYNAMIC
# The rules are: a hostname/IP address may be added/modified if both
# the hostname and the ip address do not exist outside the dynamic
# section.
HN="$1" # hostname
IP="$2" # IP address
HOSTS=/etc/hosts # hosts file to use
TMPFILE=/tmp/hosts.$$ # temporary file
# must be 2 arguments and $IP must be a valid-seeming IP address
DIG3="[[:digit:]]\{1,3\}"
IPREGEXP="\($DIG3\.\)\{1,3\}$DIG3"
if [ -z "$2" ] || ! echo "$IP" |
grep "$IPREGEXP[[:space:]]*$" >/dev/null 2>&1
then exit 1; fi
# delete a matching hostname or IP within the dynamic section (by not
# printing it)
# return error on matching hostname or IP outside dynamic section
# if no matching ##ENDDYNAMIC for a ##STARTDYNAMIC, then add it at end
# of file
# when end of dynamic section reached, insert the new $IP, $HN line
if ! awk "
BEGIN { dyn = 0 }
/^##STARTDYNAMIC/ { if (dyn == 0) dyn = 1;
print \$0; next; }
/^##ENDDYNAMIC/ { if (dyn == 1) {
dyn = 2;
print \"$IP\", \"$HN\";}
print \$0; next; }
/^[[:space:]]*$IP/ { if (dyn != 1) exit 1;
else next; }
/^[^#]*[[:space:]]$HN([[:space:]]|#|\$)/ {
if (dyn != 1) exit 1;
else next; }
{ print \$0; }
END { if (dyn == 0) {
dyn = 1;
print \"##STARTDYNAMIC\"; }
if (dyn == 1) {
print \"$IP\", \"$HN\";
print \"##ENDDYNAMIC\"; }}
" $HOSTS > $TMPFILE
then
rm $TMPFILE
exit 1
fi
# save a backup copy of hosts file
mv -f $HOSTS $HOSTS.old
# copy the new hosts file over the original
# clean up on error
if ! mv -f $TMPFILE $HOSTS; then rm $TMPFILE; exit 1; fi
ENDOFSCRIPT
chmod 544 /sbin/update-dynamic-hosts
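As a quick manual check of the script (the hostname and address below are placeholders, and it is assumed neither already appears elsewhere in /etc/hosts), running:
/sbin/update-dynamic-hosts netbsdguest 172.21.0.2
should leave a block like this at the end of /etc/hosts, or update the entry in place if the ##STARTDYNAMIC/##ENDDYNAMIC delimiters already exist:
##STARTDYNAMIC
172.21.0.2 netbsdguest
##ENDDYNAMIC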
That's the host network setup done. Now create the qemu-invoking user account on the host OS if it does not yet exist.
Qemu prefers to use /dev/shm, so if shm is supported by
the host kernel then add an entry
for
/dev/shm in /etc/fstab such as this:
none /dev/shm tmpfs size=400m,defaults 0 0
That size is just enough to handle running all the operating systems
simultaneously using the post-install memory sizes shown in the table
below. If there is not enough space on the shm mount, Qemu
will
not run and will print an error message explaining how to add
space. No memory is actually used by the shm device until
requested by a process and it is pageable like other process memory in
Linux.
Grant the qemu-invoking user access to /dev/shm. Unless the host is multi-user and it is desired to limit access to the shm device, this command - as root - will suffice:
chmod 777 /dev/shm
Change over to the qemu-invoking user
account
and install the guest operating systems. On each guest OS:
I won't detail the
installation process because each OS has its own sufficient
installation documentation, although these issues deserve mention:
Some general quick-start tips:
Use qemu-img
to create disk images (virtual hard disks
in a
single file) to install the operating systems onto. Ensure that
the qemu-invoking user owns the disk images and that they are
non-writable by other users; create them all in the same directory, as
runall-os-test assumes this:
chmod 644 diskimage
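For example, a 1 GB raw image for the NetBSD guest could be created in the reference image directory with something along these lines (the directory, filename and username are placeholders; check qemu-img's usage output for your version):
cd /data/qemu
qemu-img create netbsd.img 1G
chown username_of_qemu-invoking_user netbsd.img
chmod 644 netbsd.img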
Use qemu
to invoke an instance of the operating system installer
using the disk image to install onto and booting from a
floppy disk or cdrom image using the -boot
, -cdrom
and -fda
options. The amount of memory seen by the guest OS can be
specified with
-m
. Often it is necessary to pass -localtime
as an option (see the qemu manpage).
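As a sketch, booting the NetBSD installer from a CD-ROM image onto the disk image created above might look like this (the ISO filename is a placeholder; -boot d selects the CD-ROM as the boot device):
qemu -hda /data/qemu/netbsd.img -cdrom netbsd-2.0.2-i386.iso -boot d -m 64 -localtime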
For reference, these are the versions of the guest OSes I have
installed as well as the approximate used space of their disk
images, the size of the -m option used during install and the size of
the -m option used post-install. All versions are for x86
and I recommend
considering these disk sizes as minimums (although if you are familiar
with Solaris you may not have to do a full install as I did).
Operating System | Version                     | -m for install | -m post-install | Disk Image Size
FreeBSD          | 5.4                         | 64             | 64              | 1 Gb
Gentoo Linux     | May 2005; kernel 2.6.11.10  | 64             | 64              | 600 Mb
NetBSD           | 2.0.2                       | 64             | 64              | 400 Mb
Solaris          | 10                          | 96             | 128             | 3.4 Gb
OpenBSD          | 3.7                         | 64             | 64              | 400 Mb
If IP masquerading - also known as NAT - is enabled on the host and
the host is internet-connected then the internet will be accessible
from those guest OS installers that recognise the emulated network card
(NE2000) and use dhcp to configure it (since in the previous section
the host's dhcp server was set up) - which is most of them. So
installations over the internet using ftp are possible for those OS
installers that support it.
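As a sketch of such a masquerading setup on the Linux host (assuming iptables is in use and that eth0 is the internet-facing interface; adjust to your own firewall configuration):
echo 1 > /proc/sys/net/ipv4/ip_forward
iptables -t nat -A POSTROUTING -s 172.0.0.0/255.0.0.0 -o eth0 -j MASQUERADE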
Since the IP subnet of a guest OS may change between boots, it
cannot have a constant mapping for the host OS's hostname in its
/etc/hosts file. This is handled by some scripting triggered
by the dhcp client's acceptance of an IP
address. The details vary slightly for each
guest OS. Reminder to self: the grep -v
should
check that the host_os_hostname
occurs at the beginning
of the line with an optional arbitrary amount of whitespace preceding
it.
In the FreeBSD, NetBSD and OpenBSD guest
OSes, as root, find the first line in /sbin/dhclient-script that
contains route add default $router
and add this code
immediately after that line:
grep -v host_os_hostname /etc/hosts > /etc/hosts.new
echo "$router host_os_hostname" >> /etc/hosts.new
cp -f /etc/hosts /etc/hosts.old
mv -f /etc/hosts.new /etc/hosts
This is not an ideal approach because
/sbin/dhclient-script is a system file liable to be replaced on upgrade.
A more maintainable approach would be preferable.
In the Gentoo guest OS as root, create the /var/lib/dhcpc/dhcpd.exe file as follows:
cat >/var/lib/dhcpc/dhcpd.exe << 'ENDOFSCRIPT'
#!/bin/sh
GW=`grep GATEWAY "$1"`
grep -v host_os_hostname /etc/hosts > /etc/hosts.new
echo "${GW#*=} host_os_hostname" >> /etc/hosts.new
cp -f /etc/hosts /etc/hosts.old
mv -f /etc/hosts.new /etc/hosts
ENDOFSCRIPT
chmod 755 /var/lib/dhcpc/dhcpd.exe
and to /etc/conf.d/net add:
iface_eth0="dhcp"
gateway="eth0"
In the Solaris guest OS as root, create
the
/etc/dhcp/eventhook file as follows:
cat >/etc/dhcp/eventhook << 'ENDOFSCRIPT'
#!/bin/sh
if [ "$2" = "BOUND" ]
then
read HOSTNAME < /etc/nodename
SERVERIP=`/sbin/dhcpinfo -i $1 ServerID`
grep -v host_os_hostname /etc/hosts > /etc/hosts.new
echo "$SERVERIP host_os_hostname" >> /etc/hosts.new
cp -f /etc/hosts /etc/hosts.old
mv -f /etc/hosts.new /etc/hosts
fi;
ENDOFSCRIPT
chmod 755 /etc/dhcp/eventhook
Create the following directories on the
host
OS and on each guest OS; the directories should all be owned by root or
the qemu-invoking user; for a single-user system I suggest the
qemu-invoking user own them and that they have permission mode 744
- although
the guest directory permissions are irrelevant as they will be
overridden by the NFS server. Note that if creating directories
as per the scheme suggested below, the "common" subdirectory must be
created in each of the RW_HOST_BASE/guest_os_name directories on the
host so that it can be used as an NFS mount-point by the guest OS. The
three directories
to be used as NFS client mount-points on each guest OS are:
Name (replace this in scripts) | Purpose                                 | Location in the reference guest OS
COMMON_RO_GUEST                | Read-only; shared by all guest OSes     | /netshare-ro
COMMON_RW_GUEST                | Read-write; shared by all guest OSes    | /netshare-rw/common
RW_GUEST                       | Read-write; exclusive to each guest OS  | /netshare-rw
On the host OS the corresponding directories that are mapped are as
below. The host
directory corresponding to RW_GUEST is determined by
RW_HOST_BASE/guest_os_name.
Name (replace this in scripts) | Purpose                                      | Location in the reference host OS
COMMON_RO_HOST                 | Read-only; shared by all guest OSes          | /data/qemu/netshare/common-ro
COMMON_RW_HOST                 | Read-write; shared by all guest OSes         | /data/qemu/netshare/common-rw
RW_HOST_BASE                   | Read-write; base for guest OS-specific dirs  | /data/qemu/netshare
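Using the reference locations above, the host-side directories could be created like this (run as root; the guest OS names are placeholders and must match those used in /etc/exports and in the runall-os-test config file):
mkdir -p /data/qemu/netshare/common-ro /data/qemu/netshare/common-rw
for os in openbsd netbsd freebsd solaris gentoo
do
    mkdir -p /data/qemu/netshare/$os/common
done
chown -R username_of_qemu-invoking_user /data/qemu/netshare
chmod -R 744 /data/qemu/netshare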
Set up the host OS's NFS server
permissions. Under Linux this can be done by adding the lines
below to the
/etc/exports file. Some of the
mount
options may be redundant, but don't remove any without checking.
COMMON_RO_HOST 172.0.0.0/255.0.0.0(ro,no_root_squash,nohide,sync,insecure)
COMMON_RW_HOST 172.0.0.0/255.0.0.0(rw,no_root_squash,nohide,sync,insecure)
RW_HOST_BASE/openbsd 172.0.0.0/255.0.0.0(rw,no_root_squash,nohide,sync,insecure)
RW_HOST_BASE/netbsd 172.0.0.0/255.0.0.0(rw,no_root_squash,nohide,sync,insecure)
RW_HOST_BASE/freebsd 172.0.0.0/255.0.0.0(rw,no_root_squash,nohide,sync,insecure)
RW_HOST_BASE/solaris 172.0.0.0/255.0.0.0(rw,no_root_squash,nohide,sync,insecure)
RW_HOST_BASE/gentoo 172.0.0.0/255.0.0.0(rw,no_root_squash,nohide,sync,insecure)
Start or restart the NFS server on the host OS (or otherwise get it to reread its export permissions).
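On the reference Gentoo host either of the following (run as root) should do; other distributions will differ:
/etc/init.d/nfs restart
exportfs -ra     # alternatively, just re-read /etc/exports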
As noted previously,
these host NFS server export permissions are not specific enough to
prevent guest OSes other than the intended owner from connecting and
writing to each supposedly OS-specific directory. This is
because the IP address associated with each guest OS is subject to
change and that
this change is only reflected in the /etc/hosts file of the host rather
than using a name service like bind. There doesn't appear to be a
way to get the
NFS server to recognise the changed permissions without restarting
it, which is not possible as it would destroy existing
connections. So instead of specific hostname-based permissions,
generic permissions on the IP
block 172.X.X.X are used. I don't believe that using bind would
solve the problem, but I haven't checked this out.
This is only a security problem on multi-user machines where a
non-root user (user1) has root access on "their own" OS run under
QEMU. This OS would not be one of the guest OSes set up as part of
the configuration described in this document - it would be user1's
"personal" OS. If permissions were set up such that user1's
personal QEMU OS is properly networked through the tun interface to the
host OS then user1 could access the "exclusive" NFS directories
of the guest
OSes through NFS connections from their personal OS; indeed they could
override permissions by creating specific users with the same
username/UID as those on the host OS. Clearly this is a situation
to avoid.
The other reason it is a (minor) problem is that some random bug or
mistake could lead one of the guest OSes to accidentally connect to
another guest OS's exclusive directory and remove/modify/create files.
This is pretty unlikely, and on a single-user machine it's not a
significant concern, but it would still be nice to fix it.
If a reboot of the guest OS has occurred since /etc/hosts changes
were automated, the host OS's hostname should already be mapped in
/etc/hosts. If the guest OS has not
yet
been rebooted, add the host OS's hostname to the /etc/hosts file.
Tailor the lines below to each OS's
required
file
format and add them to each guest OS's /etc/fstab (/etc/vfstab for
Solaris). Include an option specifying to mount the
directories automatically at boot (the auto
below
accomplishes
this). Specifying read-only (ro) or read-write (rw) may be
unnecessary for some OSes but it's useful for the mount options to
match the permissions granted by the NFS server. Specifying rw
when the NFS server only grants ro will obviously not allow the
directory to be written to. Order is important if you name
directories as suggested since the common writable directory is mounted
off a sub-directory of the exclusive writable directory.
host_os_hostname:COMMON_RO_HOST COMMON_RO_GUEST nfs auto,ro 0 0
host_os_hostname:RW_HOST_BASE/guest_os_name RW_GUEST nfs auto,rw 0 0
host_os_hostname:COMMON_RW_HOST COMMON_RW_GUEST nfs auto,rw 0 0
Then try mounting each directory and test the permissions.
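A quick check from one of the guests might look like this (using the reference mount points; the write to the read-only mount should fail):
mount /netshare-ro && mount /netshare-rw && mount /netshare-rw/common
touch /netshare-rw/write-test && echo "exclusive dir writable"
touch /netshare-rw/common/write-test && echo "common dir writable"
touch /netshare-ro/write-test || echo "read-only dir correctly refuses writes"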
Note that Solaris 10 uses NFS version 4 by
default and won't connect to Gentoo Linux's NFS server unless told to
downgrade to version 3. This can be achieved by setting the following
in /etc/default/nfs and then restarting the NFS client:
NFS_CLIENT_VERSMAX=3
svcadm restart nfs/client
SSH is the
means by which the host OS connects to the guest OS and runs whichever
test script is specified.
Generate a ssh key for the qemu-invoking
user and each test-script
user
on
the host OS using a command like (this must be run under the account of
the user in question):
ssh-keygen -t rsa
Copy the public key for each user
on the host OS (~/.ssh/id_rsa.pub) into the authorized keys file for
the same user on each guest OS (~/.ssh/authorized_keys). This
will allow ssh connections without asking for a password.
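One way to copy a key, assuming the guest is already running, reachable by hostname and still accepts password authentication:
ssh guest_os_hostname 'mkdir -p ~/.ssh'
cat ~/.ssh/id_rsa.pub | ssh guest_os_hostname 'cat >> ~/.ssh/authorized_keys'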
Remember to ensure that the sshd service runs at startup on each
guest OS.
The virtual network of virtual machines is now set up. What is
missing is a convenient (and due to permissions, in a multi-user setup,
necessary) way to bring up any not-yet-running guest OSes and run a
single script on each OS. This is achieved through
runall-os-test. As was explained above
, the script is not appropriate for multi-user hosts and would require
modification for such usage. By the way, if you independently
perform such modifications before I get around to it, I would
appreciate you forwarding them to me.
The usage of the script is explained in the initial block of
comments. Basically it boots up the guest OSes specified as
parameters and optionally runs a script on each. If no OSes are
specified, all OSes in the config file are booted. If no script is
specified, the OSes are simply booted. If a script is specified
it must be the second option and the first option must be -s, -sf or
-sn. If the script is not a full path that starts with
$COMMON_RO_LOCAL (set in the top block with the script's global
variables) then it is assumed not to exist under the $COMMON_RO_LOCAL
directory already and is copied there. A multi-user system would
need to copy it to the user's specific directory within this directory,
but on a single-user system it is sufficient to copy it to the
root. The copy is interactive by default (cp -i
) but
this can be suppressed using -sn. To perform a cp -f
,
specify -sf. The script logs all steps it takes to $LOGFILE or
stderr if this variable is not set. The stdout and stderr of the
scripts as run by ssh on the guest OSes is set to the same terminal as
runall-os-test.
The first time a ssh connection is made to a guest OS, ssh will ask
for confirmation of the host key. By default it adds IP addresses
as well as hostnames to the ~/.ssh/known_hosts file. This causes
repeated confirmation requests when the guest OS's IP address
changes. Thus the script includes a function to intelligently
strip IP addresses from the known_hosts file. Be aware if running as a single-user that
runall-os-test will remove any IP addresses matching the
$KNOWN_HOSTS_IP_REGEXP from the ~/.ssh/known_hosts file.
runall-os-test reads configuration data for each guest OS from a
config file and
relies on a couple of helper scripts - runqemuosbgnoint and runqemuos -
so that interrupts can be handled properly and the PID of the qemu
process can be
known. The helper script execs qemu
so it knows
that qemu's PID is the same as its own. Interrupts are a problem
because without job control enabled, they are passed through to child
processes. Unfortunately, the qemu child process happens to
terminate on this signal, which is not appropriate (it should allow the
guest OS to shut down properly). This could be handled by using
set -m in the main script to turn on job control, but unfortunately for
some reason this causes keyboard interrupt to be ignored during the
builtin sleep command which occurs in polling loops. So it is
deferred to the runqemuosbgnoint script, which then calls the runqemuos
script in the background. For more details see the scripts
themselves.
The script relies on lsof so this command needs to be installed if
it is not already. Under Gentoo Linux, as
root emerge lsof:
emerge lsof
runall-os-test relies on the command ~/hostname to return the hostname
of the guest OS as specified in the config file on the host OS. So
this command must be present in the qemu-invoking user's home directory
on each guest OS. An appropriate way to achieve this is to create
~/hostname as an executable script that runs the appropriate command to
determine the OS's hostname. For Gentoo Linux this is /bin/hostname
-a, for Solaris it is /usr/bin/hostname and on the BSDs it is
/bin/hostname -s.
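For example, on one of the BSD guests the script could be created like this (substitute the appropriate hostname command for the Gentoo and Solaris guests as noted above):
cat > ~/hostname <<'EOF'
#!/bin/sh
/bin/hostname -s
EOF
chmod 755 ~/hostname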
The runall-os-test script
depends on the guest OS to
provide notification that it has completed booting. This could
alternatively have been achieved through polling for a ssh connection,
but
the chosen approach ensures that the OS is in a stable, fully
booted state (in actual fact polling is used in the script provided,
but on my system I use a simple Linux-specific utility that I wrote to
avoid polling. It is named waitfile and uses the inotify kernel
interface. If anyone is interested in the code I will provide it,
but it would bloat this document too much). The means by which
the OS notifies that it has
completed booting is by writing to a file in one of the writable
directories NFS-mapped to the host. I chose this file as COMMON_RW_HOST/guest_os_hostname
but arguably RW_HOST_BASE/guest_os_name/bootnotify
would be more appropriate. I should point out here that I've
given my guest OSes different hostnames than their OS names, so the two
variables guest_os_hostname and guest_os_name are distinct, but this
needn't be the case.
The means by which this is achieved differs for each guest OS.
Here are the details:
Append a line to the end of /etc/rc. As root, run this command on each guest BSD OS:
echo 'date >> COMMON_RW_GUEST/`hostname -s`' >> /etc/rc
Again, this is not ideal as /etc/rc is a system file liable to be replaced on upgrade. There does not seem to be a maintainable way to configure a script to run at the end of the boot process in BSD, but such an approach would be preferable.
Append a line to the end of /etc/conf.d/local.start. In
contrast to the BSD approach above, this is maintainable because the
local.start file is a configuration file intended to be
user-modifiable. There is no specification anywhere that it will
run last in the boot process, but it does, so this suits our
purposes. As
root, run this command on the guest Gentoo Linux OS:
echo 'date >> COMMON_RW_GUEST/`hostname -a`' >> /etc/conf.d/local.start
Create a new file in the /etc/rc3.d directory and give it a high number so that it runs last in the boot process. This, as with the Gentoo approach, is maintainable, although the rc boot-up sequence seems to be unofficially deprecated in favour of the services approach. As root, run this command on the guest Solaris OS:
cat <<'ENDOFSCRIPT' >/etc/rc3.d/S999999signal_host
#!/sbin/sh
if [ "$1" = start ]
then
date >> /netshare-rw/common/solocrat
fi
exit 0
ENDOFSCRIPT
chmod 744 /etc/rc3.d/S999999signal_host
The runall-os-test script performs quite a lot of checking to
ensure that two
instances of a guest OS are not booted simultaneously. This is
important because QEMU does not lock the disk image and does nothing to
prevent multiple sessions from opening the disk image for
writing. It is
possible that in some very unusual conditions (one example is where
/usr/sbin/lsof is replaced with a dummy) that the script's checks fail
and that a second OS instance is booted, but this is very
unlikely. The slender possibility of accidentally booting an OS
that is
already
running could be avoided by not having the runall-os-test script boot OSes
and instead have it return an error if an OS is not reachable.
Guest OSes would instead be booted manually prior to running the tests
and a
manual check would be performed to ensure that only one instance of
each OS is running. This would require an administrator to
perform the task on a multi-user system and could be a bit of a support
problem.
Another potential problem is that on hosts where memory is limited or
many OSes need to be run, it may be necessary to shut down each OS (or
a select group of them) as the test script completes on it.
runall-os-test does not currently cater for this, so by default at the
end of the test run all OSes will still be running, but it could be
extended to do so.
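A sketch of such an extension: after the test script completes on a guest, one extra ssh command could ask the guest to shut itself down, assuming the connecting user has sufficient privileges (the exact command differs per OS; for example shutdown -h now on the BSDs and Linux, init 5 on Solaris):
ssh guest_os_hostname 'shutdown -h now'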
There is no functionality in the runall-os-test for it to run the
specified test script on any guest OSes that have already booted while
waiting for another guest OS to boot. This is because batch
processing is more efficient than multitasking where processes are
CPU-bound and don't do a lot of waiting - e.g. on IO. This seems
to be the best model for this situation, where booting the guest OS
under QEMU is fairly CPU-intensive. Not jumping around between OSes
also keeps the script simpler (KISS).
As root, cut and paste the following set
of commands to create the runall-os-test script. Edit the script
and
set the variables in the top block of the script to appropriate values.
A final
reminder that this script is not suitable for use in a multi-user
environment without modifications, because to use this script a user
would need sudo permission to run /sbin/start-tun-interface as root,
and that gives them permission not only to introduce undesired entries
into the /etc/hosts file, but more importantly to start their own QEMU
OSes with full - i.e. root - access to those parts of the host OS
filesystem exported by NFS. It assumes a single-user system where
the qemu-invoking user is identical to the test-script user. In fact
there is not really any need for it to be owned by root; it might as
well be owned by the qemu-invoking user and stored in their personal
bin directory, but I prefer to locate it in /bin and have it owned by
root but executable by all, given that my system is single-user.
cat << 'ENDOFSCRIPT' >/bin/runall-os-test
#!/bin/bash
# runall-os-test
#
# Launches the qemu-hosted OSes specified on the command line (or all
# if none specified). If a script is also specified, that script is
# run on each guest OS through ssh. A config file stores
# information on which OSes are available and how to invoke them. The
# location of this file is set below in the CONFIGFILE variable.
#
# $1 => -s[n|f] indicates that a test script is to be run
# [non-interactive copy | force copy]
# (default is an interactive copy)
# $2 => scriptname if $1 is -s, -sn or -sf
# $1|3..$n => hostnames of OSes to run; "all" for all (starts from $3
# if $1 is -s, -sn or -sf)
#
# The script to run (if any) must be located either
# (a) in a dir based in $COMMON_RO_HOST or
# (b) elsewhere.
# If the beginning of the specified script ($2) compares equal with
# $COMMON_RO_HOST then (a) is assumed; else (b).
# If (b), and the file specified by $2 exists, it is copied to
# $COMMON_RO_HOST with a prompt on overwrite if a file with its name
# already exists in $COMMON_RO_HOST. The prompt can be avoided by
# adding an f after the s. A chmod u+x will then be run on the
# destination file.
# It will run on the guest OS through a ssh connection. The script
# can assume that:
# 1) it will run with the same username/userid as on the host
# 2) it will have read-only access to the nfs-mounted dir
# $COMMON_RO_GUEST which corresponds to the host's directory
# $COMMON_RO_HOST and is shared by all qemu guests.
# 3) it will have write access to the nfs-mounted directory
# $RW_GUEST to which the cwd will be changed prior to the
# script's invocation; but it will actually be invoked from its
# location under $COMMON_RO_GUEST
# 4) the write access to $RW_GUEST is intended to be exclusive to
# the guest OS under which it runs, but currently this is not the
# case and it is actually writable by other guest OSes.
# 5) it will have write access to $COMMON_RW_GUEST; this is
# intentionally shared-writable by all guest OSes.
# 6) the output of ~/hostname == the guest OS's hostname as known to the host
# OS (mapped in the host OS's /etc/hosts file).
# NOTE: (6) only applies on a single-user system where the
# qemu-invoking user is the same as the test-script user
# Configurable variables
CONFIGFILE=/data/qemu/config # where to read config from
QEMUIMGDIR=/data/qemu # where the disk images are located
LAUNCHDIR=/data/qemu/launch_times # where the files containing the
# pids of launched qemu processes
# are created
COMMON_RW_HOST=/data/qemu/netshare/common-rw
COMMON_RW_GUEST=/netshare-rw/common # NFS-mapped to the above
UPDIR=$COMMON_RW_HOST # dir the guest OSes write to upon
# completing boot
COMMON_RO_HOST=/data/qemu/netshare/common-ro
COMMON_RO_GUEST=/netshare-ro # NFS-mapped to the above
RW_GUEST=/netshare-rw
TMPHOSTNAMEDIR=/tmp/qemuhostnames # path to dir where the guest OS's
# hostname is stored in a file with the name
# identical to the invoked qemu process's PID
LSOF=/usr/sbin/lsof # path to lsof
PING=ping # path to ping
SSH=ssh # path to ssh
SED=sed # path to sed
DATE=date # path to date command (assumes gnu options)
STAT=stat # path to stat command
LOGFILE=~/log # if set, logging msgs go here; else stderr. Does not
# catch error messages generated by the shell or
# the user's script as run through ssh. Recommended to
# leave unset initially whilst testing
DEFAULT_START_TIMEOUT=600 # used if not specified by a guest OS ...
DEFAULT_STOP_TIMEOUT=600 # ... entry in the config file
POLLINT=10 # how many seconds to wait between polls for events
# whenever polling is used
POLLCONNECTFACTOR=6 # how many polls to skip between conn tests in
# f_poll_for_qemu_process_shutdown_or_connect
QEMU_INIT_PERIOD=20 # how many seconds to wait for a qemu process to
# open the disk image for writing
KNOWN_HOSTS_IP_REGEXP="172\.[[:digit:]]\{1,3\}\.0\.2" # IP addreses to
# remove from ~/.ssh/known_hosts
#command to print timestamp for logging
LOGTIMECMD="$DATE +%Y/%m/%d-%H:%M.%S"
#command to print timestamp followed by current hostname
#LOGTH="eval echo \$(\$LOGTIMECMD) \$HOST"
#not used anymore but left as a reminder of how to do this and an
#example of when eval is required.
#e.g. usage is: echo "$($LOGTH) :: some message"
#removing the "eval" will cause this command to print the wrong
#message due to the need for parameter substitution
# log the message specified by the passed in parameters
f_log_msg()
{
if [ -n "$LOGFILE" ]
then echo $($LOGTIMECMD) ${HOST:+${HOST}:: }$@ >> $LOGFILE
else echo $($LOGTIMECMD) ${HOST:+${HOST}:: }$@ 1>&2
fi
}
# log the message specified by the passed in parameters; treating it
# as an error message
f_log_err()
{
f_log_msg ERROR:: $@
# always return 1 so that this return value can be passed
# through as the calling function's return
}
# log the command as passed in $@ and then return the result of
# evaluating it
f_log_and_run()
{
f_log_msg CMD:: $@
eval $@
}
# remove numerical IP addresses matching the given regexp from the
# known_hosts file otherwise they provoke prompts requiring user
# response when IP address for the host changes. Remove entire
# lines if the IP address is at the beginning of the line with a
# space following it; otherwise strip it if it is preceded by a comma
# and an arbitrary amount (including none) of whitespace.
function f_strip_known_hosts_ips()
{
$SED -n "s/^$KNOWN_HOSTS_IP_REGEXP[[:space:]]//; t; \
s/,[[:space:]]*$KNOWN_HOSTS_IP_REGEXP//; p" \
~/.ssh/known_hosts >/tmp/known_hosts.$$
mv /tmp/known_hosts.$$ ~/.ssh/known_hosts
}
# attempt to run the script in $SCRIPT (if one was specified) on the
# guest OS using ssh. Returns 0 if the script runs successfully.
function f_run_script()
{
# only run script if one was specified
if [ -z "$SCRIPT" ]; then return 1; fi
f_log_msg "Running script..."
f_strip_known_hosts_ips
f_log_and_run $SSH -q $HOST $COMMON_RO_GUEST/$SCRIPT
RES=$?
if [ $RES -eq 0 ]
then
f_log_msg "Script ran OK"
return 0
fi
f_log_err "$SSH returned $RES"
return 1
}
# polls for the upfile for the current OS. Uses the time in
# $LAUNCHTIME as the boot start time. Returns 0 if a file with later
# modification time than $LAUNCHTIME appears
f_wait_for_upfile()
{
if [ -z "$LAUNCHTIME" ]
then
f_log_err "In f_wait_for_upfile, \$LAUNCHTIME not \
defined (probably a freak case of $LAUNCHDIR/$HOST being deleted \
after being determined to exist, or a date command error)"
return 1
fi
let WAITENDTIME=$LAUNCHTIME+$START_TIMEOUT
while NOW=$($DATE +%s) && [ $NOW -lt $WAITENDTIME ]
do
if UPTIME=$($STAT -c %Y $UPDIR/$HOST) &&
[ $UPTIME -gt $LAUNCHTIME ]
then return 0; fi
sleep $POLLINT
done
}
function f_launch_os_run_script()
{
f_log_msg "Launching OS and waiting for it to come up"
let LAUNCHTIME=$($DATE +%s)
if ! [ -d $TMPHOSTNAMEDIR ]
then f_log_and_run "mkdir $TMPHOSTNAMEDIR"
fi
f_log_and_run "/bin/runqemuosbgnoint $HOST $TMPHOSTNAMEDIR \
$LAUNCHDIR $QEMUOPTS"
if f_wait_for_upfile # Uses $LAUNCHTIME
then
# os has apparently come up
f_log_msg "Finished waiting: OS is up"
f_run_script
return 0
fi
# timed out waiting for upfile
f_log_err "Timed out waiting on upfile"
return 1
}
# returns 0 if the guest OS's qemu disk image is open by any
# process in a mode other than read-only; if /usr/sbin/lsof returns
# an error then this function returns 2 and the caller must not
# proceed, since we cannot be sure no other process is writing to the
# disk image; otherwise returns 1 (disk image not open for writing)
f_is_disk_image_open()
{
# if lsof not exe, return lsof error; caller must abort
if ! [ -x $LSOF ]; then return 2; fi
f_log_and_run "LSOF_OP=\$($LSOF -Fa0 $IMGFILE)"
# can't perform test below as lsof returns 1 if file is not
# open by any process as well as on real error. Not correct.
# Unsafe.
# if ! [ $? -eq 0 ]
# then return 2; fi # return lsof error; caller must abort
set -- $LSOF_OP
for OPT in $@
do
if [ ${OPT:0:1} == a ] && [ ${OPT:1:1} != r ]
then return 0; fi
done
return 1
}
# returns 0 if the PID in $1 or $QEMUPID specifies a qemu process
f_pid_is_qemu_process()
{
if COMM=$(ps -o comm -p ${1:-$QEMUPID}) &&
[ ${COMM#*COMMAND} == qemu ]
then return 0; fi
return 1
}
# returns 0 if the launch file exists and its first line
# is the PID of a Qemu process
f_launch_file_specifies_qemu_process()
{
if [ -f $LAUNCHDIR/$HOST ] &&
read QEMUPID < $LAUNCHDIR/$HOST &&
f_pid_is_qemu_process $QEMUPID
then return 0; fi
return 1
}
# returns 0 if a launch file exists and a later upfile also exists
f_later_upfile_exists()
{
if [ -f "$LAUNCHDIR/$HOST" ] &&
[ "$UPDIR/$HOST" -nt "$LAUNCHDIR/$HOST" ]
then return 0; fi
return 1
}
# returns 0 if launch file exists and at least start_timeout
# seconds have expired since it was created
f_startup_timeout_has_expired()
{
if LAUNCHTIME=$($STAT -c %Y $LAUNCHDIR/$HOST) &&
NOW=$($DATE +%s) &&
[ $(($NOW-$LAUNCHTIME)) -gt $START_TIMEOUT ]
then return 0; fi
return 1
}
# polls for $STOP_TIMEOUT seconds for the PID in $QEMUPID to terminate
# or no longer be a qemu process or for a connection to be possible
# to $HOST. If a connection is possible, 0 is returned; if the
# process in $QEMUPID exits or stops being a qemu process, 1 is
# returned; otherwise (or on error) 2 is returned.
f_poll_for_qemu_process_shutdown_or_connect()
{
if ! POLLSTART=$($DATE +%s); then return 2; fi
NOW=$POLLSTART
NUMPOLLS=1
while [ $(($NOW-$POLLSTART)) -lt $STOP_TIMEOUT ]
do
if ! f_pid_is_qemu_process $QEMUPID
then return 1; fi
sleep $POLLINT
let NUMPOLLS=$NUMPOLLS+1
if [ $NUMPOLLS -gt $POLLCONNECTFACTOR ]
then
if f_can_connect ; then return 0; fi
NUMPOLLS=1
fi
if ! NOW=$($DATE +%s); then return 2; fi
done
return 2
}
# returns 0 if a ssh connection is possible and the output of ~/hostname on
# the guest OS matches $HOST
f_can_connect()
{
if ! f_log_and_run "$PING -c 1 $HOST &>/dev/null"
then return 1; fi
f_strip_known_hosts_ips
if f_log_and_run "HN=\$($SSH -q $HOST ./hostname)" &&
[ "$HN" == "$HOST" ]
then return 0; fi
return 1
}
# main procedure to bring up a guest OS and run any specified script
# performs all checks necessary to ensure that only one instance of
# an OS runs
main_os_startup_function()
{
f_log_msg "Testing connection"
if f_can_connect
then f_log_msg "Test succeeded"
f_run_script
return
fi
# connect failed
f_log_msg "Connection test failed"
f_is_disk_image_open
DIOW=$?
if [ $DIOW -eq 2 ]
then f_log_err "Error whilst checking whether disk image is \
open; aborting."
return 1
elif ! [ $DIOW -eq 0 ]
then
# disk image not open
f_log_msg "Disk image is not open for writing."
if ! f_launch_file_specifies_qemu_process
then f_launch_os_run_script; return
# connect failed, NOT DIOW but existing Qemu process;
# the possibilities are that it is
# (a) initialising && hasn't yet opened the disk image
# (b) bugging and has closed the disk image fd
# (we assume not to later open it again)
elif ! f_later_upfile_exists
then
# assume (a)
f_log_msg "Launch file specifies a qemu \
process; later up file does not exist; waiting for \
${QEMU_INIT_PERIOD}s to give it time to open the disk image"
sleep $QEMU_INIT_PERIOD
f_log_msg "Finished waiting."
if ! f_is_disk_image_open
then f_log_err "Disk image still not \
open; aborting."; return 1
else # drop through without returning
f_log_msg "Disk image now open; \
assuming boot process has started"
fi
else
# assume (b) and don't launch new OS in case
# the "buggy" one decides to re-open the file
f_log_msg "Launch file specifies a qemu \
process and a later upfile exists. Time on upfile is \
$($STAT -c %y $UPDIR/$HOST)"
f_log_err "Not continuing in case active \
qemu process running apparently booted OS re-opens disk image"
return 1
fi
fi
# connect failed but DIOW
f_log_msg "Disk image is open for writing."
if ! f_launch_file_specifies_qemu_process
then
# Could not connect but DIOW and no qemu process found
f_log_msg "PID in launch file is not a qemu process \
or file does not exist"
f_log_err "Can't launch OS as disk image is already \
open for writing; aborting"
return 1
fi
f_log_msg "PID in launch file specifies a qemu process"
# connect failed, but DIOW and a qemu process exists
if ! f_later_upfile_exists
then
f_log_msg "A later upfile does not exist; assuming \
OS is booting. Will wait until launchfile is ${START_TIMEOUT}s old"
# connect failed, DIOW, qemu process, no later upfile
# assume booting.
if f_startup_timeout_has_expired
then f_log_msg "Not waiting - launchfile is \
already older than ${START_TIMEOUT}s"
f_log_err "Can't launch OS; existing OS \
failed to advise of boot completion"
return 1
# $NOW was set by f_startup_timeout_has_expired
elif f_wait_for_upfile
then f_log_msg "Finished waiting - OS is up"
f_run_script
return
else f_log_err "Timed out waiting on upfile for \
previously active guest OS."
return
fi
# connect failed but DIOW, qemu process, later upfile;
# assume shutting down
else
f_log_msg "A later upfile exists; assuming OS is \
shutting down. Waiting ${STOP_TIMEOUT}s for qemu process to exit \
and testing for connection while waiting"
f_poll_for_qemu_process_shutdown_or_connect
RET=$?
f_log_msg "Finished waiting"
if [ $RET -eq 0 ]
then f_log_msg "OS is up"
f_run_script
return
elif [ $RET -eq 1 ]
then
f_log_msg "Process has exited"
if ! f_is_disk_image_open
then f_log_msg "Disk image not open"
f_launch_os_run_script
return
else f_log_err "Disk image still open for \
writing; aborting"
return
fi
fi
f_log_msg "Process still alive"
f_log_err "Cannot launch OS; aborting. Suggest ssh \
or guest os boot problem"
return
fi
f_log_err "Reached end of main_os_startup_function"
}
unset HOST # so f_log_msg doesn't try to print it
# check for script in cmd-line parameters
# copy to $COMMON_RO_HOST if required
if [ "${1:0:2}" == "-s" ]
then
if [ "${1:2:1}" == "f" ]; then CPOPTS="-f"
elif [ "${1:2:1}" != "n" ]; then CPOPTS="-i"; fi
shift
if [ "${1:0:${#COMMON_RO_HOST}}" == $COMMON_RO_HOST ]
then
# found under $COMMON_RO_HOST
SCRIPT=${1#${COMMON_RO_HOST}/}
elif [ -f "$1" ]
then
# copy to $COMMON_RO_HOST
f_log_and_run "cp $CPOPTS $1 $COMMON_RO_HOST"
SCRIPT=${1##*/}
f_log_and_run "chmod u+x $COMMON_RO_HOST/$SCRIPT"
else
echo -n "Script not specified or " 1>&2
echo "not a regular file: $1" 1>&2
echo -n "Usage:: $0 [s[f|n] scriptname] " 1>&2
echo "[OS1 OS2 OS3 ...]" 1>&2
echo " If no OSes specified; all are assumed" 1>&2
echo -n " If f is specified and the script is " 1>&2
echo "not located under $COMMON_RO_HOST " 1>&2
echo -n " then when it is copied there cp -f " 1>&2
echo "will be used;" 1>&2
echo -n " else if n is not specified then " 1>&2
echo "cp -i will be used;" 1>&2
exit 1
fi
shift
else
unset SCRIPT
fi
# check for OSes specified as cmd-line parameters
if [ -z "$1" ]
then
f_log_msg "No operating systems specified; assuming all"
ALL=0
else
ALL=1
OSES=$@
fi
# read OS data from config file and launch OSes, running script if
# specified. OSes specified on the command line will be run in config
# file order, NOT command line order. If no OSes were specified, all
# in the config file are launched
STATUS=start
while read -u 4 LINE
do
DONEOSREAD=1
set -- $LINE
# HOST IMGFILE START_TIMEOUT STOP_TIMEOUT MEM LOCALTIME
if [ -z "$LINE" ]
then
if ! [ $STATUS == start ]
then
DONEOSREAD=0
STATUS=start
fi
elif [ -z "${LINE%%\#*}" ]; then continue # skip comments
elif [ $STATUS == start ]
then
unset QEMUOPTS;
HOST=$1
shift
if [ $ALL -eq 1 ]
then
if [ -z "$OSES" ]
then
unset HOST
f_log_msg "All specified OSes done"
break
fi
OSES2=$OSES
unset OSES
MATCH=1
for OS in $OSES2
do
if [ $OS == "$HOST" ]
then MATCH=0
else OSES="$OSES $OS"; fi
done
if [ $MATCH -eq 1 ]; then continue; fi
fi
# imgfile (must be the last option)
IMGFILE=$QEMUIMGDIR/$1
QEMUOPTS=$IMGFILE
shift
START_TIMEOUT=${1:-$DEFAULT_START_TIMEOUT}
shift
STOP_TIMEOUT=${1:-$DEFAULT_STOP_TIMEOUT}
shift
# memory size option (imgfile must be last option)
QEMUOPTS="${1:+-m $1 }$QEMUOPTS"
shift
# localtime option (imgfile must be last option)
if [ "$1" == "L" ]
then QEMUOPTS="-localtime $QEMUOPTS"; fi
STATUS=opts
elif [ $STATUS == opts ]
then
QEMUOPTS="$LINE $QEMUOPTS"
fi
if [ $DONEOSREAD -eq 0 ]
then
f_log_msg "START"
# When logging to a file, indicate on stdout which
# host we're up to
if [ -n "$LOGFILE" ]; then echo "-----$HOST-----"; fi
main_os_startup_function
f_log_msg "END"
fi
done 4< $CONFIGFILE
# opened on fd 4 so that changes to stdin made by other commands
# don't break this fd. The -u 4 option to "read" at top complements
# this.
if [ -n "$OSES" ]; then f_log_err "Not found: $OSES"; fi
ENDOFSCRIPT
chmod 555 /bin/runall-os-test
Create the directory specified by
$TMPHOSTNAMEDIR with permissions
allowing only the qemu-invoking user to write to it.
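For example, something like the following would do (a minimal sketch only:
the directory path and the user name "qemuuser" are illustrative assumptions,
not values taken from this article):
# Substitute your actual $TMPHOSTNAMEDIR path and your qemu-invoking user
mkdir -p /tmp/qemu-hostnames
chown qemuuser /tmp/qemu-hostnames
chmod 755 /tmp/qemu-hostnames   # only the owner may write to it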
Create the configuration file. It should be writable only by the qemu-invoking user (or, better yet, only by root) but readable by all, and it should live in a directory whose permissions prevent anyone other than root from deleting it. On the reference system it is located at /data/qemu/config. An example follows:
## GUEST OS CONFIGURATION FILE
##
## Commented lines are ignored; blank lines delimit the OSes
## For safety, timeouts must be calculated based on running without
## kqemu. Timeouts are for boot and shutdown respectively.
##
## The first line of each OS is delimited by single spaces with fields
## as below. The next set of consecutive lines are additional options
## to pass to qemu (one specified per line)
# hostname imgfile starttimeout endtimeout memsize -localtime?(L=yes)
netqemu netbsd-2.0.2.img 240 60 64 L
solqemu solaris-10.img 2400 2000 128 L
-cdrom /data/os/solaris-10/sol-10-ccd-GA-x86-iso.iso
gentqemu gentoo.img 1200 1200 64 L
freeqemu freebsd-5.4.img 720 60 64 L
-cdrom /data/os/freeBSD-5.4/5.4-RELEASE-i386-disc1.iso
# openqemu should be last as it suffers least under kqemu-less
# operation
openqemu openbsd-3.7.img 240 60 64 L
# the last OS must be terminated by a blank line
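The ownership and permission requirements described above could be met with
commands along these lines (a sketch only; adjust the paths and the assumed
root ownership to suit your system):
# /data/qemu owned by root so that only root can delete files within it
chown root:root /data/qemu
chmod 755 /data/qemu
# the config file readable by all but writable only by its owner
chown root:root /data/qemu/config
chmod 644 /data/qemu/config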
As root, cut and paste the following set
of commands to create the helper scripts for runall-os-test. Set
the QEMU variable in /bin/runqemuos if required (the default is fine
when qemu is on your PATH).
cat << 'ENDOFSCRIPT1' >/bin/runqemuosbgnoint
#!/bin/sh
# turn on job control so child processes don't get passed keyboard intr
set -m
/bin/runqemuos $@ &
ENDOFSCRIPT1
chmod 555 /bin/runqemuosbgnoint
cat << 'ENDOFSCRIPT2' >/bin/runqemuos
#!/bin/sh
# $1 => hostname
# $2 => temporary hostname directory
# $3 => launch_times directory
# $4.. => qemu options
# Configurable variables
QEMU=qemu # path to qemu executable
echo $1 > $2/$$
echo $$ > $3/$1
shift 3
exec $QEMU $@
ENDOFSCRIPT2
chmod 555 /bin/runqemuos
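If you want to sanity-check the helper chain by hand, you can invoke it the
same way runall-os-test does: hostname, temporary hostname directory and
launch_times directory, followed by the qemu options with the disk image
last. The directory paths below are illustrative assumptions, not values
taken from this article:
/bin/runqemuosbgnoint netqemu /tmp/qemu-hostnames /data/qemu/launch_times \
 -m 64 -localtime /data/qemu/images/netbsd-2.0.2.img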
Right, that's it then. You should be able to run any script
on any or all of the OSes using runall-os-test. If it wasn't all plain
sailing, send me an email and let me know what could be improved.
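For example (a hypothetical invocation; the script path is illustrative), to
run a test script on just the NetBSD and OpenBSD guests, forcing the copy
into the common read-only area:
/bin/runall-os-test -sf /root/buildtest.sh netqemu openqemu
or, to bring up every guest in the configuration file without running a
script:
/bin/runall-os-test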
Last updated Tue 9 Aug 2005
Cross-Platform Compatibility Testing On One Machine Without Rebooting Copyright © August 2005 Netocrat