Jim Mohr's SCO Companion


Copyright 1996-1998 by James Mohr. All rights reserved. Used by permission of the author.

Be sure to visit Jim's great Linux Tutorial web site at http://www.linux-tutorial.info/

System Monitoring


Monitoring your system is more than just watching the amount of free hard disk space or the number of users running a certain application. Many aspects of your system are static over fairly long periods of time, such as the layout of your hard disk. However, such information is as important to really knowing your system as how much free memory there is at any given time.

Both ODT and OpenServer provide a wide range of tools, not only to monitor your system as it is running, but to find out how it is configured. In this chapter, we are going to talk about the tools you need and the files you can look in to find out anything about your system that you need. In the next chapter, we are going to take some of these tools and see how they can be used to help us achieve the goal of administering our system.

Finding out about your system

One of the challenging aspects of tech support at SCO is OSD, or Operating System Direct. This is a direct line to a support engineer. As a support engineer, the challenge lies in the fact that before you talk to the customer, you have no way of knowing what the problem will be. It can be anything from simple questions that are easily answered by reading the manual to long, drawn-out system crashes.

Late one Monday afternoon I was working the OSD line. Since I was the one who had been idle the longest, my phone rang when the next customer came into the queue. The customer on the other end of the line described the situation simply: his computer would no longer boot. For some reason, the system had rebooted itself and now it would not boot.

When I asked the customer how far it got and what, if any, error messages were on the screen, he replied "panic - srmountfun". At that point I knew it was going to be a five-minute call. A panic with the srmountfun message basically means your filesystem is trashed. In almost every case, there is no way to recover from this. On a few rare occasions, fsck can clean things up enough to be able to mount. Since the customer had already tried that, this was not one of those occasions.

We began discussing the options, which were very limited. He could reinstall the operating system and then the data, or he could send his hard disk to a data recovery service. Since this was a county government office, they had the work of dozens of people on the machine. They had backups from the night before, but all of that day's work would be lost.

Since there was no one else waiting in the queue to talk to me I decided to poke around a little longer. Maybe the messages we saw might give an indication of a way to recover. We booted from the emergency boot/root set again and started to look around. The fdisk utility reported the partition table as being valid and divvy reported the division table as being valid. It looked as if just the inode table was trashed, which is enough.

I was about ready to give up when the customer mentioned that the divvy table didn't look right. There were three entries in the table that had starting and ending blocks. This didn't sound right because he only had one filesystem: root.

Since the data was probably already trashed, there was no harm in continuing, so we decided to name the filesystem and try running fsck on it. Amazingly enough, fsck ran through relatively quickly and reported just a few errors. We mounted the filesystem and, holding our breath, we did a listing of the directory. Lo and behold, there was his data. All the files appeared to be intact. Since this was all in a directory named /data, he had simply assumed that there was no /u filesystem, which there wasn't. However, there was a second filesystem.

I suggested backing up the data just to be safe. However, since it was an additional filesystem, a reinstallation of the OS could preserve it. Within a couple of hours, he could be up and running again. The lessons learned? Make sure you know the configuration of your system! If at all possible, keep data away from the root filesystem and back up as often as you can afford to. The lesson for me was to have the customer read each entry one by one.

Being able to manage and administer your system requires that you know something about how your system is defined and configured. What values have been established for various parameters? What is the base address of your SCSI host adapters? What is the maximum UID that you can have on a system? All of these are questions that will eventually crop up, if they haven't already.

The nice thing is that the system can answer these questions for you, if you know what to ask and where to ask it. In this section we are going to take a look at where the system keeps much of its important configuration information and what you can use to get at it.

As a user, much of the information that you can get will only be useful for satisfying your curiosity. Most of the files that I am going to talk about, you can normally read. However, there are a few of the utilities, such as fdisk and divvy, that you won't be able to run. Therefore, what they have to say will be hidden from you.

If you are an administrator, there are probably many nooks and crannies of the system that you never looked in. Many you probably never knew existed. After reading this section, you will hopefully gain some new insights into where information is stored. For the more advanced system administrator, this may only serve as a refresher. Who knows? Maybe the gurus out there will learn a thing or two.

Hardware and the Kernel

The first place we're going to look is the place that causes the most problems and results in the largest number of calls to SCO Support: hardware.

For those of you who have watched the system boot, you may already be familiar with what SCO calls the "hardware" screen. This gives you a good overview as to what kind of hardware you have on your system and how it is configured. Since many hardware problems are the result of misconfigured hardware, knowing what the system thinks about your hardware configuration is very useful.

Fortunately, we don't need to boot every time we want access to this information. SCO Unix provides a utility called hwconfig that shows us our hardware configurations. (At least what the OS thinks is the HW configuration.) On my system, if I run /etc/hwconfig -hc, I get this:

device    address      vec  dma  comment
======    =======      ===  ===  =======
fpu       -            13   -    type=80387
serial    0x3f8-0x3ff  4    -    unit=0 type=Standard nports=1
serial    0x2f8-0x2ff  3    -    unit=1 type=Standard nports=1
floppy    0x3f2-0x3f7  6    2    unit=0 type=135ds18
floppy    -            -    -    unit=1 type=96ds15
console   -            -    -    unit=vga type=0 12 screens=68k
parallel  0x378-0x37a  7    -    unit=0
adapter   0x330-0x332  11   5    type=ad ha=0 id=7 fts=s
tape      -            -    -    type=S ha=0 id=2 lun=0 ht=ad
disk      -            -    -    type=S ha=0 id=0 lun=0 ht=ad fts=s
Sdsk      -            -    -    cyls=1170 hds=64 secs=32
disk      -            -    -    type=S ha=0 id=6 lun=0 ht=ad fts=s
Sdsk      -            -    -    cyls=518 hds=64 secs=32
cd-rom    -            -    -    type=S ha=0 id=5 lun=0 ht=ad

No obvious conflicts in hardware settings

The -h option showed us the output in neat little columns, all with headings over each column (h for headings). Without this option, the same information is there; however, it is not as easy to read. The -c option checked for conflicts of I/O base address, IRQ and DMA channel (c for conflicts). This will catch not only duplicates, but in the case of the I/O base address, it will also tell you if anything is overlapping. For more details about this, check out the hwconfig(ADM) man-page.

From the output you know the base addresses of all devices that have a base address (the address column), their interrupts (the vec column), the DMA channel (the dma column) and often other pieces of information that can be very useful. For example, under the comment on the first line labeled "floppy", you might see that it is unit=0 and the type=135ds18. If this is the way things should be, that's a happy thing.

However, this is not always the case. Customers repeatedly call SCO Support during installation of the OS to report "bad media". Often this is not because the floppy disk is bad, but because the value here does not match what the hardware really is. The values for the floppies are read from the CMOS. If they are incorrect there, the operating system gets them incorrect. As a result, it may be trying to read a 3.5" floppy as if it were a 5.25". That won't work for long.

There is one "serial" entry for each COM port I have. In this case I have two, labeled unit=0 and unit=1. Each is of type Standard (nothing special about them) and each has only one port. If you had a non-intelligent serial board with more than one port, nports= would show you how many ports you have.

One thing to keep in mind here is serial mice. I have one on my system, but you couldn't tell that from the output here. At this point, the system has no way of knowing that a mouse is attached to the serial port.

The "console" entry refers to both your video card and the way your console is configured. Here we see that I have a VGA card, which is standard (type=0), there are 12 multiscreens set up, and 68k of memory is reserved for those screens. From the system's standpoint, it doesn't matter that I actually have an SVGA card, as both use the same driver.

If you have SCSI devices on your system, you will see something like this for your host adapter:

adapter 0x330-0x332 11 5 type=ad ha=0 id=7 fts=s

In the comments column, the type tells me that it is using the ad device driver, so I know that I have an Adaptec 154x or 174x in standard mode. (This is the same as the corresponding entry in /etc/default/scsihas.) The entries ha=0 id=7 tell me that this is the first (0th) host adapter and it is at ID 7. The entry fts= can mean several things, depending on what's there. The entry I have, fts=s, says that I have scatter-gather enabled. The possible entries are:

s = scatter/gather

t = tagged commands

d = 32-bit commands

b = commands are buffered

The two disk entries are for hard disks. Here we see type=S, so we know that both of my disks are SCSI. If I had an IDE or ESDI hard disk drive, this would say type=W. Since this is SCSI, we also need to know what host adapter the device is attached to (ha=0, ht=ad), its ID and LUN (id=0, lun=0), plus similar characteristics as the host adapter (fts=s).

If this were an IDE or ESDI hard disk drive, instead of the SCSI configuration we would have a description of the drive geometry here. In order to be able to show both the SCSI configuration and the hard disk geometry, there is an additional entry (Sdsk) for each disk. This is where the geometry of each drive is listed, showing us the cylinders, heads and sectors per track for each drive.

One question that is useful in debugging hardware problems is knowing just where the system gets this hardware information. Each of these lines is printed by the device driver. These are the configuration parameters that the drivers have been given.

If you remember from our discussion of the link kit, the kernel is composed of a lot of different files that reside somewhere under /etc/conf. It is here that the configuration information that gets handed off to drivers during a kernel relink is stored.

For our purposes, we only need to be concerned about three sub-directories under /etc/conf: pack.d, sdevice.d and cf.d. Since we went into detail about these in the section on the link kit, I will only review them briefly.

The first directory I want to talk about, /etc/conf/cf.d, is the central configuration directory. Here you can find default configuration information, which drivers are being linked into the kernel, the current value of kernel tunable parameters, etc.

When the system is installed, the defaults for the parameters that the kernel needs are kept in the file /etc/conf/cf.d/mtune. As I mentioned before, this is the master tuning file. It is a simple text file, consisting of four columns. The first is the parameter name, followed by the default value, the minimum and lastly the maximum. Any value that has not been changed is set to the default here (the second column).

If any kernel parameter has a value other than its default, the parameter name is placed in stune along with the new value. This is the system tuning file. Although changes can be made to stune by hand, it is "safest" to use the configuration tool provided with the OS you have. If you have ODT 3.0, this tool is sysadmsh; if you have OpenServer, it is the Kernel/Hardware Manager. Both of these start up the utility /etc/conf/cf.d/configure, which you could start yourself if you wanted.

When the kernel is rebuilt, each parameter is first assigned the value defined in mtune and then any value in stune overwrites that default. If you want to figure out what each of these parameters mean, take a look at the Chapter entitled "Kernel Parameter Reference" in the System Administrator's Guide if you have ODT 3.0 and Appendix B (Configuring Kernel Parameters) of the Performance Guide, if you have OpenServer.
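
To make this concrete, take the NBUF parameter, which sets the number of buffers in the buffer cache. A hypothetical mtune entry might look like this (the values here are only illustrative; check your own file for the real ones):

NBUF    0    0    15000

If you had increased the number of buffers to 2000, stune would contain the single line:

NBUF    2000

During the relink, NBUF is first assigned its mtune default of 0 and then overwritten with the 2000 from stune.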

Also in /etc/conf/cf.d is the master device configuration file: mdevice. As we know, this tells us all the devices that can be configured on the system, as well as specific characteristics about that device. This, however, does not tell us what devices are currently configured, just what devices could be configured.

The mdevice file provides a couple of pieces of information that may come in handy. One of them is the device major number. The block major number is column 5 and the character major is column 6. This is useful when trying to determine what device nodes are associated with what device driver. If the name of the device node is well chosen, then it is easy to figure out what kind of device it is. Otherwise, you have to guess.

For example, it's fairly obvious that the device /dev/tty1a has something to do with a tty device. What about /dev/ptmx? Well, it has a major number of 40. Looking in mdevice, I see that this is the clone device driver. Therefore, I know it has something to do with streams. (See the discussion of the /dev directory.) The mdevice file also contains the DMA channel the device uses (column 9), which is useful to know when trying to track down hardware conflicts.
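
You can make this connection yourself with a couple of commands. This is a minimal sketch; the major number 40 is from my system and may well differ on yours:

ls -l /dev/ptmx                        # the character major number comes just before the minor
awk '$6 == 40' /etc/conf/cf.d/mdevice  # column 6 is the character major number

The line that awk prints names the driver that owns that major number.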

What devices are actually configured can be found in the /etc/conf/cf.d/sdevice file. This file is generated during the kernel relink by concatenating all the files in /etc/conf/sdevice.d. The first column of each entry in sdevice matches the first column in mdevice. We can then make the connection, if we need to, from the device node to the corresponding entry in sdevice (major number → mdevice → sdevice).

As we mentioned in the section on the link kit, the sdevice.d files contain (among other things) the IRQ and base address. This is useful when hardware problems arise. We also find in these files a very useful piece of information: whether the device driver will be included at all. If there is a Y in the second column, then this driver will be included. If there is an N, this device will be left out.

Although it is not a common occurrence, it has happened that a device could no longer be accessed after a kernel relink. The reason was that there was an N in the second column. More often than not, this is the result of an overzealous system administrator who wants to reduce the size of his kernel by pulling out "unnecessary" drivers. However, on occasion I have seen it where third-party device driver removal scripts forget to put things back the way they were.

Also in /etc/conf/cf.d is the SCSI configuration file, mscsi. This tells us what SCSI devices the administrator wants to configure on the system. The reason I said it that way is because this is a place where errors regularly occur. What can often happen is that an administrator will be unsure of his SCSI configuration and will try several different ones until he gets it right.

This "shotgun" method of system administration rarely works. You may get the device working, but you can forget adding anything else in the future. This is especially true for SCSI hard disks. Only those devices that are actually configured on the system should be in mscsi.

Let's think back to the hardware screen. If you have SCSI devices on your system, they will be listed here and may be configured exactly as you see it on the screen. Just like other kinds of devices, SCSI devices may be configured incorrectly and still show up during boot.

However, you may have installed a SCSI device that does not appear in the hardware screen or there is one there that you didn't configure. The place to look is the mscsi file. If the device is there, then the odds are either the entry was input incorrectly, or it is not configured the way you expect. More information can be found in the mscsi (F) man-page.
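
As a sketch, the mscsi entries matching the hardware screen at the beginning of this chapter might look something like this. I am reconstructing these from the hwconfig output above, so double-check the exact field order against the mscsi(F) man-page on your release:

ad Sdsk 0 0 0
ad Stp 0 2 0
ad Srom 0 5 0
ad Sdsk 0 6 0

Each line names the host adapter driver, the peripheral device driver, the host adapter number, the target ID and the LUN.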

A thing to note is that some SCSI devices (like hard disks) do not show up until after they have been accessed the first time, usually when the filesystems on them are mounted. Therefore, if you have multiple hard disks, the second and subsequent ones will not show up until the filesystems are mounted. This is usually when you go into multi-user mode. In addition, until you go into multi-user mode, the output of hwconfig may be incomplete. Therefore, you might not see every device at boot.

The reason for this is that the printcfg() routine may not get called during system boot-up. A device driver may not do anything special in its initialization routine (where printcfg() is normally called) and simply print the configuration string there. On the other hand, it may wait to call printcfg() until the device is first accessed, as in the case of hard disks.

We next jump to another directory at the same level as cf.d: /etc/conf/pack.d. We know that the pack.d directory contains a sub-directory for each device that can be configured, essentially matching what is in mdevice. In each sub-directory is the device driver itself (Driver.o) as well as a configuration file, space.c. (Some of the directories contain files called stubs.c. These contain only things like function declarations, but no configuration information.)

The space.c files can contain a wealth of information about the way your hardware is configured. Granted, much of the information requires knowledge of how the specific driver functions. However, skimming through these files can give you some interesting insights into what your system can do.

As a warning, don't go playing with the values in these files unless you know what you're doing. You have the potential for really messing things up.

Despite the fact that they're object code, which makes reading them difficult, the Driver.o files often provide some useful information. Let's assume that your system is panicking and you cannot figure out what's causing it. By running crash on the dump image and entering the command panic, you will get a stack trace that tells you the function the kernel was in when it panicked. If it's not obvious from the name what the function is, you can use either the strings or nm command to search through the Driver.o files to find that function. If the panic is always in the same function, this usually indicates a problem with the hardware or something else related to the driver. (We'll talk more about this technique later in the section on problem solving.)
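
As a rough sketch of the hunt, assume the stack trace pointed at a function called adintr (a made-up name for this example):

cd /etc/conf/pack.d
for f in */Driver.o
do
    nm $f 2>/dev/null | grep adintr > /dev/null && echo $f
done

Whichever directory this loop prints contains the driver that defines that function, and therefore the driver that deserves a closer look.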

Another directory at the same level as pack.d and sdevice.d is init.d. The files in this directory are concatenated onto the end of the file /etc/conf/cf.d/init.base to form your /etc/inittab file. This serves a couple of functions. First, many of the processes that are started as the system boots up are started out of /etc/inittab. These you find in init.base. Second, this is where the initial terminal configuration information is kept. For the console terminal devices this is also found in init.base. However, for terminals on the COM ports or on multiport boards, the configuration information is found in the files in /etc/conf/init.d. See the chapter on starting and stopping your system for more details.

On a default system, you should have at least the file /etc/conf/init.d/sio. This contains the default configuration for the standard serial ports (tty1a and tty2a). When intelligent multiport boards are added, there will probably be an extra file in this directory. If you are running OpenServer, then you will also find the file scohttp, which controls the SCO httpd daemon, /etc/scohttpd.

Terminals

Since we were just talking about terminals, let's take a look at some other information that relates to terminals.

I mentioned that the init.base file and the files in /etc/conf/init.d establish the default configuration for terminals. To a great extent this is true; however, there is a little something extra that needs to be addressed. Whereas the /etc/conf/cf.d/init.base and /etc/conf/init.d/* files tell us about the default configuration (such as what process should be started on the port and at what run levels), it is the /etc/gettydefs file that defines the default behavior of the terminals. (Rather than mentioning both the /etc/conf/cf.d/init.base and /etc/conf/init.d/* files, I'll just talk about /etc/inittab, which is functionally the same thing.)

Each line in /etc/inittab that refers to a terminal device points to an entry in the /etc/gettydefs file. The entry for /dev/tty1a might look like this:

Se1a:234:respawn:/etc/getty tty1a m

From our discussion of the /etc/inittab file in the chapter on Starting and Stopping the System, we see that this entry starts the /etc/getty command. Two arguments are passed to getty: the terminal it should run on (tty1a) and the gettydefs entry that should be used (m). The /etc/gettydefs file defines such characteristics as the default speed, parity and the number of data bits. For example, the m entry which the inittab entry above points to, might look like this:

m # B9600 HUPCL # B9600 CS8 SANE HUPCL TAB3 ECHOE IXANY #\r\nlogin: # m

The fields are

label # initial_flags # final_flags #login_prompt # next_label

The label entry is what is being pointed to in the inittab file. The initial_flags are the default serial line characteristics that are set, unless a terminal type is passed to getty. Normally, the only characteristic that needs to be passed is the speed. However, we also set HUPCL (hang up on last close).

The final_flags are set just prior to getty executing login. Here again, we set the speed and HUPCL. However, we also set the terminal to SANE, which is actually several characteristics. (Look at the gettydefs(F) man-page for more details.) We also set TAB3, which turns tabs into spaces; ECHOE, which echoes the erase character as a backspace-space-backspace combination; and lastly IXANY, which allows any character to restart output if it was stopped by the XOFF character.
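
The last field, next_label, is what lets getty hunt for the right speed: when it receives a BREAK, it moves on to the entry named there and tries again. The m entry above points back to itself, so nothing changes. A hypothetical two-entry chain for a slow modem line (modeled on the m entry, not copied from a real gettydefs) might look like this:

2400# B2400 HUPCL # B2400 CS8 SANE HUPCL TAB3 ECHOE IXANY #\r\nlogin: #1200
1200# B1200 HUPCL # B1200 CS8 SANE HUPCL TAB3 ECHOE IXANY #\r\nlogin: #2400

If the login prompt comes out garbled, pressing BREAK flips the line to the other speed until the caller and the port agree.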

In many cases after you login, you are prompted to input the type of terminal you are working on. This appears as something like:

TERM = (ansi)

This prompt is a result of two things. First, the system checks the file /etc/ttytype. This file consists of two columns. The first is the terminal type, followed by the tty name (e.g., tty02, tty1a). If your tty device is listed here, then the system knows (or thinks) that you are logging in using a specific terminal type on that port. This is port dependent and not user dependent.
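
A few lines from a hypothetical /etc/ttytype might look like this:

ansi tty01
ansi tty02
wy60 tty1a

Here the first two multiscreens are assumed to be ansi consoles, while the first serial port is assumed to have a Wyse 60 attached. The type names themselves come from the terminfo database.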

If your terminal is not listed here, then you are prompted to input it. This is the result of the tset line in either your .profile or .login. Check out the tset(C) man-page for more details.

This is a useful mechanism if you have a lot of serial terminals that are always connected to the same port (say, on a multi-port board). That way the users don't have to be bothered with typing in their terminal type. In addition, you as the system administrator don't have to worry about users calling up saying their terminal doesn't work because they input the wrong terminal type.

Hard Disks and Filesystems

A common problem that causes longer calls to support is the layout of the hard disk. Many administrators are not even aware of the number of partitions and filesystems they have. This is not always their fault, as they often inherit the system without any information on how it's configured.

The first aspect is the geometry. This is such information as the number of cylinders, heads and sectors per track. In most cases, the geometry of the hard disk is reported to you on the hardware screen when the system boots. You can also run the dkinit program by hand.

To find how your hard disk (or hard disks) is laid out, there are several useful programs. The first is fdisk, which is normally used to partition the disk. Using the -p option, you can get fdisk to just print out the partition table. This tells you which partitions are on the disk, their starting and ending tracks, the type of partition and which one is active. The output is not necessarily intuitive, so let's take a quick look at it. On my system, I get output like this:

1 9600 41535 31936 UNIX Active

2 1 9599 9599 DOS (32) Inactive

3 41536 74815 33280 UNIX Inactive

Each line represents a single partition. The fields are:

partition_no. start_track end_track size type status

Here we have three partitions with the first UNIX partition being active. Note that although the DOS partition is physically the first partition, it shows up as the second partition in the fdisk table. In addition, it accurately recognized the fact that it is a 32-bit DOS partition.

If we look carefully and compare the ending blocks with the starting blocks of the physically next partition, we see that, in this case, there are no gaps. Small gaps (just a few tracks) are nothing to have a heart attack over, as you are only losing a couple of kilobytes. However, larger gaps indicate that the whole hard disk was not partitioned and you are losing space.

If you have multiple hard disks on your system, hwconfig may show you this. What happens if it doesn't? Maybe it's a SCSI hard disk that's never mounted, so it doesn't print out the configuration information. How can you figure out if you have more than one hard disk? You could take a look in /dev for any hd device. If you look back on the section on major and minor numbers, you can figure out what hard disks have devices assigned. However, it's possible that the hard disk existed at one time, but doesn't anymore. Maybe the previous administrator liked the "shotgun" approach to system administration and tried to configure every possible combination. The device nodes might be there, but there is no physical device associated with them. Therefore, you need a way to figure out exactly what devices are physically on the system.

No worries! The fdisk utility will tell you. If you try to print out the partition table for all the possible hard disks, the worst that can happen is you get an error message saying it can't open the device. To do this, run these four commands:

fdisk -p -f /dev/rhd00

fdisk -p -f /dev/rhd10

fdisk -p -f /dev/rhd20

fdisk -p -f /dev/rhd30

Once you get a response that fdisk can't open a device, then you know you've probably found your last hard disk. If you actually do have more than four hard disks, you need to try the same fdisk command on the other hard disk devices. If you never get the message that it cannot open the device, then there are probably physical devices associated with every device node.
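
If typing essentially the same command four times bothers you, a quick loop does the same job. This sketch assumes at most four disks, numbered as above:

for disk in 0 1 2 3
do
    fdisk -p -f /dev/rhd${disk}0
done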

To find out what filesystems or divisions are on your disks, you can use the mount command. However, this only tells you which ones are currently mounted. This is useful on a running system to determine if a directory is part of one filesystem or another. Although the df command (more on that later) will tell you what filesystems are mounted, it doesn't tell you what options were used, such as whether the filesystem is read-only or not. On a few occasions I have had customers call in reporting filesystem problems because they could not write to a filesystem, only to find out it was mounted read-only.

What if you suspect there are more filesystems than are mounted? Unfortunately, finding out what filesystems are on your system is not as easy as figuring out what partitions are there. When you run mkdev fs or the Filesystem Manager on OpenServer, an entry for each filesystem is placed in /etc/default/filesys. If you find more here than you see with mount, then they are either not getting mounted properly or the entry is missing the options necessary to mount them automatically. Check the filesys(F) man-page for more details.

Sometimes there are filesystems on your disk without an entry in /etc/default/filesys. This often happens when people are not paying attention during the install and specify a /u filesystem. When they don't run mkdev fs, they find half of their disk missing, and they end up calling SCO Support. Fortunately, SCO recognized the problems that this caused and it no longer occurs in OpenServer. Instead, you are prompted to add the filesystem.

The easiest ones to find are those that you see simply by running divvy -P. This defaults to the partition with your root filesystem on it. For example, if I run it on my system I get:

0         0    14999
1     15000    39999
2     40000   429941
3    429942   469941
4    469942   509941
6    509942   509951
7         0   510975

Each line represents a single division, with the fields in each entry being the division number, the starting block, and the ending block. Note that the block sizes in divvy are 1K and not 512 bytes like other utilities. From this output, I see I have 5 divisions (0-4), plus recover (6) and the whole disk (7). (Think back on our discussion of filesystems.) Since this is the root partition, I know that one of these is probably the swap space. (I know it is division 1)

If I looked in /etc/default/filesys and saw fewer entries than I saw here (taking swap, recover and the whole disk into account, of course), then I would know something is not as it appears. One shortcoming is that this output does not give me the names of the divisions. This is because the -P option is just reading the entries in the division table and displaying them.

So, where are the names coming from? From the device nodes. When divvy is run interactively, you see a table similar to the above output, but with the name of the device included. What divvy does is find the first block device (alphabetically) in /dev that has the correct major and minor number. If I created a new block device node with a major number of 1 and a minor number of 41, instead of seeing swap as the name of the division, I would see jim. This is because /dev/jim shows up alphabetically before /dev/swap.
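
You can watch this happen with a throwaway node. The major and minor numbers 1 and 41 are from my system, so check yours first:

ls -l /dev/swap         # note the major and minor numbers (1 and 41 here)
mknod /dev/jim b 1 41   # create a competing node with the same numbers
divvy                   # the interactive table now shows jim instead of swap
rm /dev/jim             # clean up so swap gets its name back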

We can also pass a device node as an argument to divvy. If this is the name of a filesystem, divvy will figure out what partition it is on and display the appropriate division table. For example, divvy -P /dev/root would give me the exact same output as above.

If we wanted, we could also specify the character device for that partition. If I ran divvy -P /dev/rhd01, the output would again be the same. To get all the divisions, you will need to run divvy for all of the Unix partitions that you found with fdisk. We could run divvy on all the filesystem names. This would show us all the divisions including any that were not yet given names. However, this won't help us on disks where a filesystem spans an entire partition. The nice thing is that the output of fdisk will help us.

We can see from the example above for fdisk that to go through the hard disks, we increase the first number in the device name by one. To go through the partitions, we increase the second number by one. However, we don't have to go through each possible partition number to get the results we want. The output of fdisk gave us the partition numbers already.

Let's take the above fdisk output:

1 9600 41535 31936 UNIX Active

2 1 9599 9599 DOS (32) Inactive

3 41536 74815 33280 UNIX Inactive

From this output, I know to run divvy on /dev/hd01, /dev/hd02 and /dev/hd03. If this were the second hard disk, I would change the devices accordingly. For example, the divvy command to show the first partition would be:

divvy -P /dev/rhd11

You could write a quick shell script that explicitly ran through each value, as in the sketch below. However, I feel that checking everything yourself gives you a better understanding of how your system is configured. Keep in mind that a division with no filesystem on it may not be wrong. There are applications (such as some databases) that require either a raw partition (no divisions) or a raw division (no filesystem).
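
For completeness, here is what such a script might look like for the first disk, using the partition numbers found with fdisk above:

for part in 1 2 3
do
    echo "=== /dev/rhd0$part ==="
    divvy -P /dev/rhd0$part
done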

System Defaults

A lot of default information can be found in the /etc/default directory. This directory contains a wealth of information about the default state of your system. I guess that's why the directory is called /etc/default, huh? If you have just upgraded to OpenServer or are otherwise familiar with ODT 3.0, then I highly recommend looking at this directory right now. You heard me, take a look. There are quite a few changes between the two releases. Because of the significance of this directory, knowing about the differences is an important part of administering your system.

The most obvious change is that virtually all of the files are symbolic links to the "real" files living in /var/opt/K/SCO/Unix/5.0.0Cd/etc/default. You will find that there are new files that didn't exist in previous releases, as well as files that are no longer used. Note that 5.0.0Cd is the release number and may, therefore, be different.

Most of the files have man-pages associated with them or are related to programs that have them. For more details, take a look at the default(F) man-page. It has a list of other man-pages related to these files.

Going through each file and discussing each entry would not be an effective use of our time. Each of the files either has a man-page specifically for it, or there is a program associated with the file that has a man-page. Instead, I am going to talk about some of the changes as well as address some of the more significant files.

Have you ever wondered why when you simply press 'enter' at the Boot: prompt you get hd(40)unix? If so, check out the variable DEFBOOTSTR in /etc/default/boot. The /boot program reads the DEFBOOTSTR (default boot string) variable from /etc/default/boot and when you press enter without any other input, the default boot string is echoed. It is the DEFBOOTSTR variable that defines the default boot behavior, hence the name. (See the section on starting the system for more details)

If we wanted, we can change the default boot string to something else. In fact, that's what I did on my system. Instead of hd(40)unix, in my /etc/default/boot file it looks like this:

DEFBOOTSTR=hd(40)dos

Therefore, any time I get to the Boot: prompt and simply press ENTER, I am brought into DOS. The reason I did that was for my son. Since I have loads of educational programs for him on my DOS partition, I wanted a way for him to get to DOS easily. All he needs to do is press enter at the right place and he gets to where he needs to be. Now he's a little older and understands that DOS is different from UNIX and could type in dos himself. However, I leave it in for historical reasons.

So, how do I get to UNIX? Well, one way would be to type in the old DEFBOOTSTR by hand ( hd(40)unix ). Rather than doing that, I use a trick called boot aliasing. We talked about this in some detail in the section on starting and stopping the system.

In the /etc/default/boot file you will also find out whether the system automatically boots (AUTOBOOT) or not. If there is no TIMEOUT value, the system automatically boots after 60 seconds. Otherwise, the system boots after the number of seconds defined in the TIMEOUT variable. Sometimes an overzealous administrator will want to set TIMEOUT to 0, so the system autoboots immediately after reaching the Boot: prompt. This is not a good thing. You want to give yourself at least a couple of seconds in case you are having problems and need to boot into maintenance mode. Otherwise you will automatically go into multi-user mode.
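
Put together, the relevant lines of a cautious /etc/default/boot might look like this (the values here are only illustrative):

DEFBOOTSTR=hd(40)unix
AUTOBOOT=YES
TIMEOUT=15

This still boots UNIX on its own, but leaves you 15 seconds at the Boot: prompt to interrupt it if the system is in trouble.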

One change to this file in OpenServer is the addition of the BOOTMNT variable. This determines how to mount the /dev/boot filesystem. If it is set to read-only (RO), then you can't simply copy files onto this filesystem. However, certain system utilities need to be able to write to this filesystem no matter what.

The /etc/default/passwd file contains the default settings for passwords. This includes such things as the minimum allowable length of the password, how often you can change it, and how complex the password has to be. Many of the values are dependent on what level of security you have. Therefore, in order to maintain consistency, I wouldn't recommend making changes unless you change the security level to match.

If you have ODT 3.0, then the file /etc/default/authsh is what is used to determine basic aspects of the user's environment, such as their home directory, shell, group, etc. If you are running OpenServer, this file still exists, but the file /etc/default/accounts is used instead. Although the format of the new file is easier to read, the old name had significance since it is the /tcb/bin/authsh utility that actually does the work when a user account is created.

Permissions Files

Another sub-directory of /etc, /etc/perms, contains more useful information about your system. The files within the /etc/perms directory are related to what products and packages are installed on your system. This is useful for finding out where a particular file is without having to do a search of your entire system. For this, do:

grep <file_name> /etc/perms/* | more

where <file_name> is whatever I am looking for. After using this several times, I put it into a shell script. If you are running OpenServer, then the files in /etc/perms may exist, but their content is not the same. Basically, OpenServer no longer uses the files in /etc/perms; they are kept for backwards compatibility. Therefore this command may not work.

Although the contents of the files do not directly tell you what is currently installed, you can find out what programs are or should be available, plus what their permissions, owner and group ought to be. (NOTE: You don't need to correct permissions problems by hand. You can use the fixperm or fixmog utilities.)

Although not quite as verbose or easy to read, OpenServer does have file lists similar to those in /etc/perms. These are the file lists located within the SSO. You find these in several locations throughout the system. Normally these will be in a subdirectory called .softmgmt under the /var/opt directory. For example, the file lists for the operating system portion of OpenServer are found in /var/opt/K/SCO/Unix/5.0.0Cd/.softmgmt, whereas those for TCP are found in /var/opt/K/SCO/tcp/2.0.0Cd/.softmgmt. The nice thing about ODT was that all this information was concentrated in about a dozen files. OpenServer has it spread over a couple of hundred!

If you want to find out what is currently installed, you can use the swconfig program to list the products and packages that are installed. In most cases, custom will also tell you if a particular package is partially installed or not installed at all, as well as whether some programs have been removed. This is usually the case with bundled products such as ODT or the ODT Development System. If you have ODT 3.0 and are curious what has been installed on your system, check out the file /usr/lib/custom/history. Unfortunately, this no longer exists in OpenServer. The closest is /var/opt/K/SCO/Unix/5.0.0Cd/custom/custom.log, but this is rather difficult to read.

On both ODT 3.0 and OpenServer you'll find the file /etc/auth/system/files. This contains a list of files and permissions from the perspective of the TCB and is not necessarily related to what is installed. In many cases, individual files are not mentioned, as it is expected that every file in a specific directory has the same permissions. Therefore, some files may not appear here.

You can use a one-liner like the one above to look through the OpenServer configuration files:

egrep <file_name> {,/var}/opt/K/SCO/*/*/.softmgmt/*.fl

Using egrep is necessary here because of the syntax we are using to look through both the /opt and the /var/opt directories.

User Files

The /etc directory contains the all-important passwd file. This gives important information about what users are configured on the system, what their user ID number is, what their default group is, where their home directory is and even what shell they use by default.

The default group is actually a group ID number rather than a name. However, it's easy to match up the group ID with the group name by looking at /etc/group. This also gives you a list of users, broken down into what groups they belong to. Note that "groups" is plural.
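
As a quick refresher, here is a hypothetical pair of entries (the account and group names are invented). In /etc/passwd:

jimmo:x:12709:50:Jim Mohr:/usr/jimmo:/bin/ksh

and in /etc/group:

techs::50:jimmo,john

The passwd fields are the account name, the password field, the user ID, the default group ID, a comment, the home directory and the login shell. The group fields are the group name, an optional group password, the group ID and the comma-separated list of members. Here, jimmo's default group is 50 (techs), and john picks up techs as one of his additional groups.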

Another aspect of information about users is what privileges they have on the system. As we talked about in the section on security, which users have a particular privilege can be found in the files in /etc/auth/subsystem. These are referred to as the subsystem authorizations. Privileges listed on a per-user basis are found in /tcb/files/auth/?, where ? is the first letter of the user's account name, such as r for root or u for uucp. The default values for these files are kept in /etc/auth/system/default.

Network Files

If you are running TCP/IP, there are a couple of places to look for information about your system. First, check out the file /etc/resolv.conf. If you don't find it and you know you are running TCP/IP, don't worry! The fact that it is missing tells you that you are not running a nameserver in your network. (A nameserver is a machine that contains data on how to communicate with other machines in a network.) If it is not there, you can find a list of machines that your machine knows about and can contact by name in /etc/hosts. If you are running a nameserver, this information is kept on the nameserver itself.

The content of the /etc/hosts file is the IP address of a system, followed by its fully qualified name and then any aliases you might want to use. A common alias is simply the node name, leaving off the domain name. Each line in the /etc/resolv.conf file contains one of a couple of different types of entries. The two most common are the domain entry, which is set to the local domain name, and the nameserver entry, which is followed by the IP address of the name "resolver". See the section on TCP/IP for more information on both of these files.
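
For example, with an invented machine name and address, a line in /etc/hosts might read:

192.168.42.1 siemau.mohr.com siemau

while the /etc/resolv.conf on a machine using that nameserver might contain:

domain mohr.com
nameserver 192.168.42.1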

It's possible that your machine is the nameserver itself. To find this out look at the file /etc/named.boot. If this exists, then you are probably a name server. The /etc/named.boot file will tell you the directory where the name server database information is kept. For information about the meaning of these entries, check out the named(ADMN) man-page as well as the section on TCP/IP.

Another place to look is the TCP startup script in /etc/rc2.d. Often static routes are added there. If these static routes use tokens from either /etc/networks or /etc/gateways that are incorrect, then the routes will be incorrect. By using the -f option to the route command you can flush all of the entries and start over.

Although not as often corrupted or otherwise goofed up, there are a couple of other files that require a quick peek. If you think back to our telephone switchboard analogy for TCP, we can think of the /etc/services file as the phonebook that the operator uses to match up names to phone numbers. Rather than names and phone numbers, /etc/services matches up the service requested to the appropriate port. To determine the characteristics of the connection, inetd uses /etc/inetd.conf. This contains such information as whether to wait for the first process to be finished before allowing new connections.
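
As an illustration, take the telnet service, which appears in both files. The lines might look roughly like this; check your own system, since the server path and the user field vary between releases:

telnet 23/tcp

telnet stream tcp nowait NOLUID /etc/telnetd telnetd

The first line, from /etc/services, maps the name to TCP port 23. The second, from /etc/inetd.conf, says that each connection gets its own copy of telnetd, without waiting for the previous one to finish.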

A common place for confusion, incorrect entries and the inevitable calls to support is user equivalence. As we talked about in the section on TCP/IP, when user equivalence is set up between machines, many remote commands can be executed without the user having to produce a password. One of the more common misconceptions is the universality of the /etc/hosts.equiv file. While this file determines with what other machines user equivalence should be established, the one user it does not apply to is root. To me this is rightly so. While it does annoy administrators who are not aware of it, that is nothing compared to the problems that would arise if root access were allowed when you didn't expect it.

In order to allow root access, you need to create a .rhosts file in root's home directory (usually /) containing the same information as /etc/hosts.equiv, but applying only to the root account. The most common mistake made with this file is the permissions. If the permissions are such that any user other than root (as the owner of the file) can read it, the user equivalence mechanism will fail. Looking in /etc/hosts.equiv and $HOME/.rhosts tells you what remote users have access to what user accounts.
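
A minimal sketch, assuming you trust root on a machine named siemau (an invented name):

echo "siemau root" > /.rhosts
chmod 600 /.rhosts

The chmod is the important part: the file must not be readable by anyone other than its owner, or the equivalence mechanism will fail, just as described above.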

In the section on networking, I introduced the concept of a "chain." This is the link between a network interface and the various network protocols. Which links are configured is kept in /usr/lib/lli/chains. This is simply a list of the various chains, with the upper layer on the left side and the lower layer on the right side.

If you are running NFS, there are two places to check for NFS-mounted filesystems. To check for remote filesystems that your system mounts, take a look at /etc/default/filesys. This is the default location to list all mounted filesystems, not just local ones. If you are the one exporting filesystems or directories, the place to look is /etc/exports.

Other Files

Next we get to a very foreboding and seldom frequented portion of your system: /usr/include. If you are a programmer, you know what's in here. If not, you probably thought that this directory was only for programmers. Well, sort of.

There are really only three times when you need to be concerned with this directory. First, if you are a programmer. Second, if you're relinking the kernel. Do you remember all the space.c files in /etc/conf/pack.d? Well, they all refer to include files somewhere in /usr/include.

The parent directory, /usr/include, basically contains the include files that are consistent across Unix dialects. There are some useful things in here, such as the maximum value that an unsigned integer can take on. This is 4294967295 and you can find it in /usr/include/limits.h. There are some less useful things such as pi divided by four out to 20 decimal places. This is defined in the /usr/include/math.h as 0.78539816339744830962.

The third time? If you are just curious about your system. Even if you have just a basic knowledge of C, poking around in these files can reveal some interesting things about your system.


File                            Purpose                                           Where to find more information

User and Security Files

/etc/auth/subsystems            Manipulation routines for subsystems database     subsystems(S)
/etc/auth/system/authorize      Subsystem authorization file                      authorize(F)
/etc/auth/system/default        System default database file                      default(F)
/etc/auth/system/devassign      Device assignment database file                   devassign(F)
/etc/auth/system/files          File control database                             files(F)
/etc/auth/system/ttys           Terminal control database file                    ttys(F)
/etc/group                      User group information                            group(F), chmod(C)
/etc/passwd                     User account information                          passwd(F), chmod(C)

Kernel Files

/etc/conf/cf.d/init.base        Base for /etc/inittab                             inittab(F)
/etc/conf/cf.d/mdevice          Device driver module description file             mdevice(F)
/etc/conf/cf.d/mevent           Master event file                                 event(FP)
/etc/conf/cf.d/mfsys            Configuration file for filesystem types           mfsys(FP)
/etc/conf/cf.d/mscsi            SCSI peripheral device configuration file         mscsi(F)
/etc/conf/cf.d/mtune            Master kernel tunable parameter file              mtune(F)
/etc/conf/cf.d/sdevice          Local device configuration file                   sdevice(F)
/etc/conf/cf.d/sevent           System event file                                 event(FP)
/etc/conf/cf.d/sfsys            Local filesystem type file                        sfsys(FP)
/etc/conf/cf.d/stune            Local tunable parameter file                      stune(F)
/etc/conf/mfsys.d               Master filesystem configuration files             mfsys(FP)
/etc/conf/node.d                Device node configuration files                   idmknod(ADM)
/etc/conf/pack.d                Device drivers and configuration files            mdevice(F), sdevice(F)
/etc/conf/sdevice.d             Device configuration files                        mdevice(F), sdevice(F)
/etc/conf/sfsys.d               Local filesystem configuration files              sfsys(FP)

Networking Files

/etc/auto.master                Default master automount file                     automount(NADM)
/etc/bootptab                   Internet Bootstrap Protocol server database       bootptab(SFF)
/etc/exports                    Directories to export to NFS clients              exports(NF)
/etc/gateways                   List of gateways                                  routed(ADM)
/etc/hosts                      Hostname to IP address mapping file               hosts(SFF)
/etc/hosts.equiv                Lists of trusted hosts and remote users           hosts.equiv(SFF), .rhosts(SFF)
/etc/inetd.conf                 Configuration file for inetd                      inetd.conf(SFF)
/etc/named.boot                 Default initialization file for named             named.boot(SFF)
/etc/networks                   Known networks                                    networks(SFF)
/etc/pppauth                    Point-to-point authentication database            pppauth(SFF)
/etc/pppfilter                  PPP packet filtering configuration file           packetfilter(SFF)
/etc/ppppool                    IP address pool file for PPP network interfaces   ppppool(SFF)
/etc/ppphosts                   Point-to-point link configuration file            ppphosts(SFF)
/usr/lib/named or /etc/named.d  Configuration files for named                     named(ADMN)
/usr/lib/uucp/Configuration     Protocol configuration file for UUCP              uucp(C), Configuration(F)
/usr/lib/uucp/Devices           Configured UUCP devices                           uucp(C), Devices(F)
/usr/lib/uucp/Permissions       UUCP authorization file                           uucp(C), Permissions(F)
/usr/lib/uucp/Systems           Remote UUCP systems                               uucp(C), Systems(F)

X-Windows Files

$HOME/.mwmrc                    MWM configuration file                            mwm(XC), X(X)
$HOME/.pmwmrc                   PMWM configuration file                           mwm(XC), X(X)
$HOME/Main.dt                   X-Desktop configuration file                      dxt3(XC)
$HOME/Personal.dt               X-Desktop configuration file                      dxt3(XC)
/usr/lib/X11/system.mwmrc       System default MWM configuration file             mwm(XC), X(X)
/usr/lib/X11/system.pmwmrc      System default PMWM configuration file            mwm(XC), X(X)
/usr/lib/X11/app-defaults       Application-specific defaults                     X(X)
$HOME/.Xdefaults-hostname       Host-specific defaults                            X(X)

System Default Files

/etc/default                    System default files directory                    default(F)
/etc/default/archive            Archive devices                                   archive(F)
/etc/default/authsh             Account creation parameters (ODT 3.0)             authsh(ADM)
/etc/default/accounts           Account creation parameters (OpenServer)          authsh(ADM)
/etc/default/backup             XENIX backup devices                              xbackup(ADM)
/etc/default/boot               System boot options                               boot(F)
/etc/default/cc                 Read by /bin/cc                                   cc(CP)
/etc/default/cleantmp           Interval/location for tmp file cleanup            cleantmp(ADM)
/etc/default/cron               Cron configuration file                           cron(C)
/etc/default/device.tab         Device table for package utilities                pkgadd(ADM)
/etc/default/dumpdir            XENIX archive device                              xdumpdir(ADM)
/etc/default/filesys            Filesystem mount table                            filesys(F)
/etc/default/format             Floppy disk format device and verification        format(C)
/etc/default/goodpw             Password checking options                         goodpw(ADM)
/etc/default/idleout            Interval for closing idle logins                  idleout(ADM)
/etc/default/issue              System default banner                             issue(F)
/etc/default/lang               System locales                                    locale(M)
/etc/default/lock               Logout interval for locks on serial terminals     lock(C)
/etc/default/login              System login parameters                           login(M)
/etc/default/lpd                Print service options                             lpadmin(ADM)
/etc/default/man                Man-page configuration file                       man(C)
/etc/default/mapchan            Character device mapping                          mapchan(M)
/etc/default/mapkey             Monitor screen mapping                            mapkey(M)
/etc/default/merge              SCO Merge
/etc/default/mnt                Remote filesystem types                           mnt(C)
/etc/default/msdos              DOS command configuration file                    doscmd(C)
/etc/default/passwd             Password parameters                               passwd(C)
/etc/default/purge              Files to be purged                                purge(C)
/etc/default/pwr                Power management configuration file               pwrd(ADM)
/etc/default/restore            XENIX restore device                              xrestore(ADM)
/etc/default/scsihas            SCSI host adapter driver names                    scsi(HW)
/etc/default/slot               MCA adapter configuration data                    slot(C)
/etc/default/su                 Root command parameters                           su(C), asroot(ADM)
/etc/default/tape               Default tape device                               tape(C)
/etc/default/tar                Archive devices                                   tar(C)
/etc/default/whois              Parameters used by the whois service              whois(TC)

Miscellaneous Files

/etc/checklist                  List of filesystems processed by fsck             checklist(F)
/etc/dktab                      Virtual disk configuration file                   dktab(F)
/etc/inittab                    Configuration file for init                       inittab(F), init.base(F)
/etc/rc*                        System startup scripts                            rc0(ADM), rc2(ADM)
/etc/ttytype                    Terminal type to tty device mapping file          ttytype(F)

Table 0.1 Configuration files and where to find more information

What the system is doing now

At any given moment, there could be dozens, if not hundreds, of different things happening on your system. Each requires system resources, and there may not be enough for everyone to have an equal share. As a result, resources must be shared. As different processes interact and go about their business, which resources a process has and how much of each it is allocated will vary. As a result, the performance of different processes will vary as well. Sometimes the overall performance reaches a point that becomes unsatisfactory. The big question is: what is happening?

Users might be experiencing slow response times and tell you to buy a faster CPU. I have seen many instances where this was the case, and afterwards the poor administrator is once again under pressure because the situation hasn't changed. Users still have slow response times. Sometimes the users tell the administrator to increase the speed on their terminal. Obviously 9600 isn't fast enough when they are doing large queries in the database, so a faster terminal will speed up the query, right?

Unfortunately, things are not that simple. Perhaps you, as the system administrator, understand that increasing the baud rate on the terminal or the CPU speed won't do much to speed up large database queries, but you have a hard time convincing users of that. On the other hand, you might be like the many administrators who were "unlucky" enough to have worked with a computer before and so, as often happens, were thrown into the position. What many of us take as "common knowledge," you have never experienced before.

The simplest solution is to hire a consultant who is familiar with your situation (hardware, software, usage, etc) to evaluate your system and make changes. However, computer consultants are like lawyers. They may charge enormous fees, talk in unfamiliar terms and in the end you still haven't gained anything.

Now, not all computer consultants or lawyers are like that. Often it's simply a matter of not understanding what they are telling you. If you do not require that they speak in terms that you understand, you can end up getting taken to the cleaners.

If you feel you need a consultant, then do two things. Like any other product, you need to shop around. Keep in mind that the best one to get is not necessarily the cheapest, just as the best one is not necessarily the most expensive. The second key aspect is to know enough about your system to understand what the consultant is saying, at least conceptually.

In this section, we are going to combine many of the topics and issues we discussed previously to find out exactly what our system is doing at this moment. By knowing what the system is doing, you are in a better position to judge if it is doing what you expect it to, plus you can make decisions as to what could/should be changed. This also has a side benefit of helping you should you need to call a consultant.

So, where do we start? Well, rather than defining a particular scenario and saying what we should do if this happened, let's talk about the programs and utilities in terms of what they tell us. Therefore, I am going to start with general user activity and proceed to more specifics.

Users

It's often useful to know just how many users are logged onto your system. As I mentioned before, each process requires resources to run. The more users logged on to your system, the more processes there are using your resources. In many cases, just seeing how many users are logged in rings bells and turns on lights in your head to say that something is not right.

The easy way to figure out how many users are logged in is with the who command. Without any options, who simply gives you a list of which users are logged in, plus the terminal they are logged into and the time they logged in. If you use the -q option (for quick), you get just the list of who is logged on, plus the user count. For example:

root root root jimmo

# users=4

For every user logged in, there is at least one process. If the user first gets to a shell and starts their application from there, they probably have two processes. If you are running OpenServer, each time a user logs in, the login process is still there, so you need to add an extra process for this. This brings the total number of processes up to three times the number of users. Granted, the shell is sleeping while waiting for the application to finish, and the login process is sleeping since it is only used to monitor the total number of logins. However, they are still taking up system resources.

Although I rarely use who with any option except -q, it does have several other options that I have used on occasion. One is the -b option, which tells you when the system was last rebooted. Another, the -r option, tells you what run-level you are in; it is used by both /etc/rc2 and /etc/rc3.

If you use the -u option, the last field in each line is the PID of that user's shell. This is a good starting point if you find that problems are limited to a specific user, or group of users. You can then use this PID to search through the output of the ps command to find out what else the user is doing.
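For example, to tie the two together (a quick sketch; the user name and PID here are only illustrative):

who -u | grep jimmo

The last field of the line this prints is the PID of jimmo's shell, which we can then look for in the full process list:

ps -ef | grep 1107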

Processes

The ps command gives you the process status. Without any options, it gives you the process status for the terminal you are running the command on. That is, if you are logged in several times, ps will only show you the processes on that terminal and none of the others. For example, I have four sessions logged in on the system console; when I switch to one and run ps, I get:

PID   TTY     TIME      CMD
625   ttyp0   00:00:03  ksh
991   ttyp0   00:00:00  ps

This only shows those processes running on the terminal where I started the ps (in this case ttyp0). Note that if you are running ODT 3.0 then the output is slightly different.

If I am not on that terminal, but still want to see what is running there, I can use the -t option. A nice aspect of the behavior of the -t is that you don't have to specify the full device name, or even the 'tty' portion. It suffices just to give the tty number. For example, to get the same output as above I could enter (no matter where I was):

ps -tp0

Keep in mind that if I am on a pseudo-terminal, the terminal number also includes the p. If I am on the console or a serial terminal, the p isn't used, as it is not part of the tty name. For example, if I wanted to check processes on tty04, I would enter:

ps -t04

Note also that you do not specify the /dev/ portion of the device name, even if you specify the tty portion. For example, this works:

ps -tttyp0

or

ps -t ttyp0

but this doesn't

ps -t /dev/ttyp0

If we are curious as to what a particular user is running, we can use the -u option. This will tell us every process that is owned by that user.

Although running ps like this does show who is running what, it tells us little about the behavior of the process itself. In the section on processes, I showed you the -l option, which shows much more information. If I add the -l (long) option, I might get output that looks like this:

F  S  UID  PID   PPID  C   PRI  NI  ADDR      SZ   WCHAN     TTY    TIME      CMD
20 S  0    608   607   3   75   24  fb11b9e8  132  f01ebf4c  ttyp0  00:00:02  ksh
20 O  0    1221  608   20  37   28  fb11cb60  184  -         ttyp0  00:00:00  ps

When problems arise, one column that I use quite often is the TIME column. This tells me the total time that this process has been running. Note that the time for ksh is only 2 seconds although I actually logged in on this terminal several hours before I issued the command. The reason is that the shell spends most of its time either waiting for you to input something or waiting for the command that you entered to finish. Nothing out of the ordinary here.

Unless I knew specifically on what terminal the problem existed, I would probably have to show every process in order to get something of value. This is done with the -e option (for everything). The problem is that I then have to look at every single line to see what the total time is. So, to make my life easier, I can pipe the output to sort. My sort field will be field 13 (TIME), so I use the -k 13 option. Since I want to see the list in reverse order (largest value first), I also use the -r option. Since I probably only want the first few entries, piping it through head would not be a bad idea. So, the command would look like this:

ps -el | sort -r -k 13 | head -5

On my system I get this:

F  S  UID    PID  PPID  C  PRI  NI  ADDR      SZ    WCHAN     TTY    TIME      CMD
20 S  12709  654  627   0  76   24  fb11bdf0  8588  f0213424  ?      01:22:25  wabiprog
20 S  0      620  619   3  76   0   fb11b738  2692  f0213424  tty01  00:12:31  Xsco
20 S  12709  655  654   0  76   24  fb11c0a0  1012  f0213424  ?      00:00:49  wabifs
20 S  0      624  623   1  76   24  fb11bb40  928   f0213424  tty01  00:00:29  scoterm

At the very top of the list we see wabiprog. This is the process that essentially is Wabi. Since I have been typing a lot of this text with Microsoft Word for Windows running under Wabi, it is not surprising that I have such a large value for the time. Every time I press a key, every time I scroll, and every time I click on a menu is counted towards the total time of wabiprog. If you are running a database application, or something similar, it is probable that you have at least one process with this high a TIME.

Figuring out what is a reasonable value is not always easy. The most effective method I have found is to monitor these values while the system is behaving "correctly". You then have a rough estimate of the amount of time particular processes need, and you can quickly see when something is out of the ordinary.

Something else that I use regularly is the PID-PPID pair. If I come across a process that doesn't look right, I can follow the PID-to-PPID chain until I find a process with a PPID of 1. Since process 1 is init, I know that this process is the starting point. Knowing this is often useful when I end up having to kill a process. Sometimes the process is in an unkillable state. This happens in two cases. First, the process may be making the transition to becoming defunct, in which case I can ignore it. It may also be stuck in some part of the code in kernel mode, in which case it won't hear my kill signal. In such cases, I have found it useful to kill one of its ancestors (such as a shell). The hung process is inherited by init and will eventually disappear, and in the meantime the user can get back to work. Afterwards comes the task of figuring out what the problem was.
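If you find yourself walking that chain often, it's easy to script. Here is a minimal sketch of my own (not a standard utility), assuming PID and PPID are the fourth and fifth fields of ps -el output, as shown above:

:
# walkup - trace a process's ancestry back to init (PID 1)
# usage: walkup <pid>
pid=$1
while [ -n "$pid" -a "$pid" != "1" ]
do
    # print the ps -el line whose PID (field 4) matches, then
    # move up to its parent (PPID, field 5)
    line=`ps -el | awk '$4 == '$pid''`
    echo "$line"
    pid=`echo "$line" | awk '{ print $5 }'`
done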

In the section on processes, I mentioned the crash program. This is a very useful tool for monitoring your system. Unlike other monitoring programs, crash does not have a pretty graphical interface, so it appears rather unfriendly at first. In fact, it is quite an unfriendly program and doesn't take too kindly to people poking around who don't know what they're doing. Not that you can do any damage; it's just that many of the error messages tell you nothing other than that you did something wrong.

Start it up simply as crash and this is what you see:

dumpfile = /dev/mem, namelist = /unix, outfile = stdout

>

The dumpfile is where the crash program gets its information. The default location is /dev/mem, which is the device used to access memory. Since we want to take a look at the memory on a running system, this is a logical place. If, on the other hand, the system panicked and there was a dump image on the swap device (/dev/swap), you could use that as your dumpfile. From the command line, we specify the dump file with the -d option.

The namelist is basically a table used to translate between addresses in the kernel and their human-readable symbol names. We use /unix, since that's normally the file the kernel was booted from, and crash needs that file as a reference point. If you have relinked (created a new kernel), then the kernel in memory does not match what is on the hard disk in /unix. Therefore, you would need to specify the one you booted with (normally /unix.old). This is done from the command line with the -n option. Because we are reading these files (/dev/mem or /dev/swap and /unix), the user running crash needs to have read permission on them. Since most users don't have permission to read either, crash is usually run by root.

In our case, we want to see the results of our input immediately. Therefore, we want the output to go to stdout. This is our outfile. If we wanted, we could tell crash to send everything to a file for later examination. The file to write to is specified from the command line with the -w option. If we do specify an output file, we still see our prompt (>), but all output is sent to the outfile.

The first thing we need is a process to look at, so we need to take a look at the process table to find one that looks interesting. To see the process table, input either proc or p. Possibly you have more than a screenful of information. To send this through more, use an exclamation mark instead of a pipe symbol. Note that unlike pipes from the shell, the exclamation mark must be preceded by a space, as in:

proc ! more

or

proc !more

We now get a list of all the processes on the system sorted by their slot in the process table. In my case, I took a look at the ksh process we used above. I found it in slot 66, so I reran the command as proc 66, so all I got was the header and the one entry, which looked like this:

PROC TABLE SIZE = 83

SLOT  ST  PID   PPID  PGRP  UID  PRI  CPU  EVENT         NAME  FLAGS
  66  s   1107  1106  1107  0    75   0    spt_tty+0x68  ksh   load

The entries in this output are: slot in the process table, run state, process ID, parent process ID, process group, user ID, priority of the process, CPU usage, the event it is waiting on, the name of the command and flags. The list of possible flags can be found in <sys/proc.h>. Most of these fields are the same as we saw in the ps output.

One difference is the EVENT column. However, the difference is in name only, since this is the wait channel for that process. This makes sense, since we wait on events. However, here it is in a slightly different format (spt_tty+0x68). This is telling us that the wait channel is at an offset of 0x68 from the start of the spt_tty function. We can use the nm function of crash, which will translate a symbol (name) into an address, like this:

nm spt_tty

This gives me:

spt_tty 0xf01ebee4 .bss

This shows the function name, its address in the kernel and what segment it is in. We see here that the spt_tty function is at address 0xf01ebee4. Adding the 0x68 to it, we get 0xf01ebf4c. If we look back at the ps output, this is the same address we saw in the WCHAN column.

Now, this is a little roundabout way of finding the same WCHAN value, but we found out something more valuable than the actual wait channel: the name of the function, in this case spt_tty(). Right off the bat, I can tell that it has something to do with a terminal because of the tty portion of the name. Since I am familiar with the system, I know that the spt driver is used for pseudo-ttys. Since I had the ksh running on a pseudo-tty, this convinces me that the event being waited on had something to do with a terminal.

Unfortunately, for those of you running ODT 3.0, things are a little more complicated. You don't get the nice little symbol name in the EVENT column. No worries; there is just another step you need to add. Whereas nm translates from a symbol to an address, ds translates from an address to the closest symbol it can find prior to the address you gave it. So with ODT, you input the hex value you get in the EVENT column and get a symbol name back.

Okay, so what good is this information? Well, in this case the primary value is educational, satisfying our curiosity. But you might have a process that is asleep and can't seem to wake up (somewhat like me on a Monday morning). This is a way of determining what the process is waiting on. For example, we might have just used ps to show that one process has a large value in the TIME column. We then check that process using crash and discover that it is waiting in the database_query() function (which I just made up). You therefore know it is waiting on something to do with the database. (Probably)

It is also possible to use crash to find out how much memory you have available. There are several variables you can read to find out exactly what you are looking for. To do this you use the od function of crash, which gives you the value of the kernel variable you name. For example, to get the current amount of free memory, the command would be:

> od -d freemem

Which might give you:

f0114c1c: 0000000740

The first number is the location (address) of the freemem variable in the kernel. The second value is the decimal value (because of the -d) of the number of free pages. This is the same value as you would get by running sar -r.

To find the amount of swappable memory, look at the availsmem variable; for non-swappable (resident) memory, look at availsrmem.
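The command is the same as for freemem; only the variable name changes:

> od -d availsmem
> od -d availsrmem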

What other information can you get through crash? One thing is the list of open files a process has. This is accomplished by first looking at that process' uarea with either the u or user command. The top part of my ksh process would look like this:

PER PROCESS USER AREA FOR PROCESS 66

USER ID's: uid: 0, gid: 3, real uid: 0, real gid: 3

supplementary gids: 3 0 1

PROCESS TIMES: user: 62, sys: 252, child user: 1785, child sys: 2300

PROCESS MISC:

command: ksh, psargs: ksh

proc: P#66, cntrl tty: 58,1

start: Wed Jun 7 12:25:14 1995

mem: 0xfdd5, type: exec su-user

proc/text lock: none

current directory: I#538

OPEN FILES AND POFILE FLAGS:

[ 0]: F#305 r [ 1]: F#305 w [ 2]: F#305 w

[ 3]: F#214 r w [ 5]: F#305 [31]: F#263 c r w

FILE I/O:

u_base: 0x806f7ac, file offset: 135349, bytes: 256

segment: data, cmask: 0022, ulimit: 2097151

file mode(s): read

Although I don't have the space to go over each entry, you can see many of the same entries that we had in both the ps output and the proc listing. One additional entry is the one labeled OPEN FILES AND POFILE FLAGS. By using the values here, we can track down what files are currently being used by that process. Since this section is on processes, let's wait a minute and talk about that in the section on Files and Filesystems, below.

Files and Filesystems

Knowing how much space is left on your filesystems is another thing you should monitor. I have seen many instances where the root filesystem got so close to 100% full that nothing more could get done. Since the root filesystem is where unnamed pipes are created by default, many processes die terrible deaths if they cannot create a pipe. If the system does get that full, it can prevent further logins (as each login writes to log files). If root is not already logged in and able to remove some files, then you have problems.

So the solution is to monitor your filesystems to ensure that none of them gets too full, especially the root filesystem. A rule of thumb, whose origins are lost somewhere in UNIX mythology, is that you should make sure there is at least 15% free on your root filesystem. Although 15% on a 200Mb hard disk is one-tenth the amount of free space of 15% on a 2Gb drive, it is a value that is easy to monitor. Think of 10-15Mb as a danger sign and you should be safe. However, you need to be aware of how much the system can change and how fast. If the system could change 15Mb in a matter of hours, then 15Mb may be too small a margin.

Use df to find out how much free space is on each mounted filesystem. Without any options, the output of df is one filesystem per line, showing how many blocks and how many inodes are free. While this is interesting, I am really more concerned with percentages. Very few administrators know how long it takes to use up 1000 blocks; however, most understand the significance if those 1000 blocks mean that the filesystem is 95% full.

Since I am less concerned with the inodes, the option I use most with df is -v, which shows me the data block usage. On my system I get something like this:

Mount Dir   Filesystem     blocks   used     free     %used
/           /dev/root      779884   745434   34450    96%
/stand      /dev/boot      30000    16634    13366    56%
/u1         /dev/u1        80000    39312    40688    50%
/u2         /dev/u2        80000    64680    15320    81%
/odtroot    /dev/odtroot   400002   382100   17902    96%
/usr/dos/c  /dev/dsk/0sC   306840   284952   21888    93%
/usr/dos/d  /dev/dsk/1sD   511440   482024   29416    95%
/usr/dos/e  /dev/dsk/1sE   521672   468056   53616    90%
/usr/dos/g  /dev/dsk/2sC   820784   128      820656   1%
/odtroot/u  /dev/data2     628888   173580   455308   28%

We see that my root filesystem is getting dangerously full. Although I have about 17Mb free (these are 512-byte blocks) and most of my data is kept on either the /u1 or /u2 filesystem, I need to be aware of the situation. Note, however, that although the /u2 filesystem is only 81% used, it has less than half the free space of the root filesystem. Since this is where my data is, I am much more concerned with it being at 81% than I am with root being at 96%. Note that I can also monitor free space on my DOS partitions.

Another utility, dfspace, is a shell script that gets its information from df and shows the data in an easier-to-interpret format. It shows you what's available in megabytes as well as the percentage. Although this is a nice, verbose output, I like df better in that it is easier to read. I can also more easily build a shell script around it to monitor usage automatically.

The shortcoming with df is that it tells you about the filesystem as a whole and can't really point to where the problems are. A full filesystem can be caused by one of two things. First, there are a few large files. This often happens when log files are not cleaned out regularly. One prime example is the MMDF channel log file (/usr/mmdf/log/chan.log). I have had this file grow to over 30Mb in just a few hours! In this case the solution is to monitor these files and clean them out as often as necessary.

The other case is when you have a lot of little files. This is similar to ants at a picnic: individually they are not very large, but hundreds swarming across your hotdog is not very appetizing. If the files are scattered all over your system, you will have a hard time figuring out where they are. On the other hand, if they are scattered across the system, the odds are that no one program created them, so they are probably all wanted (if not needed). Therefore, you simply need a bigger disk.

If, on the other hand, the files are concentrated in one directory, it is more likely that a single program is the cause. As with the large log files, a common culprit is MMDF. Now, I don't want to sound like I am opposed to MMDF; in fact, I like it. The issue is that if there is a configuration problem in MMDF, mail will get backed up. If it then has trouble sending a message to the originator to say that it just had trouble sending mail, this gets backed up as well. On busy systems, this can grow to thousands of messages. This isn't the fault of MMDF, but of the person who configured it. Just as it's not the fault of the water company when you get a large bill after leaving your tap running during your two-week vacation.

To detect either case, you can use a combination of two commands. First is find, which we already know from previous encounters is used to find files. Next is du, which is used to determine disk usage. Without any options, du gives you the disk usage for every file that you specify. If you don't specify any, it gives you the disk usage for every file from your current directory on down.

Note that this usage is in blocks: even if a block contains only a single byte, the whole block is used and no longer available for any other file. If you look at a long listing of a file, however, you see the size in bytes. A one-byte file still takes up one data block. The size indicated in a long directory listing will therefore usually be less than what you get if you multiply the number of blocks by the size of a block (512 bytes).

To get the sum of a directory without seeing the individual files, use the -s option. To look for directories that are exceptionally large, we can find all the directories and use du -s. We also need to be sure that we don't count multiple links more than once, so we include the -u option as well. We then sort the output as numerical values and in reverse order (-nr) to see the larger directories first. Like this:

find / -type d -exec du -us {} \; | sort -nr > /tmp/fileusage

I do the redirection into the file /tmp/fileusage for two reasons. First, I have a copy of the output that I can use later if I need to. Second, this command is going to take a very long time. Since I started in /, the first directory found is /. Therefore, the disk usage for the entire system (including mounted filesystems) will be calculated. Only after it has calculated the disk usage for the entire system does it go on to the individual directories.

We can avoid this problem in a couple of ways. One is to use -print instead of -exec in the find, pipe the output through grep -v to strip out the / entry, and then pipe that to xargs. This way we avoid the root directory.
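That pipeline might look like this (a sketch; the grep pattern simply removes the line containing only the root directory):

find / -type d -print | grep -v '^/$' | xargs du -us | sort -nr > /tmp/fileusage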

Personally, I don't find this very pretty, especially if I were going to be using the command again. I would much rather create a list of directories and use it as the arguments to du. That way we can filter out the directories we don't need to check, or include only those we do want to check. For example, we already know that /usr/mmdf/log might contain a very large file. This would be a good directory to monitor. Another is the MMDF spool directory (/usr/spool/mmdf/lock/home). If you can find out the directories that your applications use, these would also be good ones to include.
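For example, here is a minimal sketch that reads directories, one per line, from a hypothetical list in /usr/local/lib/watchdirs (the file name is my invention):

:
# report disk usage of each directory on the watch list,
# largest first
while read dir
do
    du -us $dir
done < /usr/local/lib/watchdirs | sort -nr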

On occasion, it's nice to figure out what files a process has open. Maybe the process is hung and you want some details before you decide to kill it. We can use crash to find this out. In this case, I took a look at the vi session I used to write this text. The entry looked like this:

PROC TABLE SIZE = 100

SLOT  ST  PID  PPID  PGRP  UID  PRI  CPU  EVENT     NAME  FLAGS
  14  s   264  176   264   0    28   0    f0093e84  vi    load

Since this process is taking up slot 14, I want to look at the uarea of the process in slot 14. To look at the uarea, input either user or just u. So, we would input u 14. This also takes more than a screen, so we need to pipe it through more as well. Near the top is the entry for open files. Mine looks like this:

OPEN FILES AND POFILE FLAGS:

[ 0]: F#25 r [ 1]: F#25 w [ 2]: F#25 w

[ 3]: F#32 [ 4]: F#23 r w [ 5]: F#47 w

These are my file descriptors. The numbers inside the square brackets are the descriptor numbers. The F# numbers are the slots in the file table taken up by those descriptors. Following that are the access rights I have to the descriptor (r-read, w-write). Notice that for file descriptors 0, 1, and 2 the slot in the file table is the same. Do you remember what file descriptors 0, 1, and 2 are? Stdin, stdout and stderr. Since I have not done any file redirection, these are the same. This shows me that all three point to the same place. In addition to the big three, there are three other files I have open.

To find out what entries these are in the inode table, we first look in the file table. As you might have guessed, we need to give the slot number in the file table. If we want to see what slot 25 is, for example, we would input either f 25 or file 25. (This is the file table slot of my stdin, stdout and stderr.) The output would look like this:

FILE TABLE SIZE = 200

SLOT  RCNT  I/FL  OFFSET  FLAGS
  25  6     I178  13b3c   read write

Here we have the slot number, the reference count, the inode table slot number, the offset and the read/write flags.

Next, we need to look at the inode table, which is basically the same process. We input either inode 178 or i 178 and this gives us:

INODE TABLE SIZE = 300

SLOT  MAJ/MIN  FS  INUMB  RCNT  LINK  UID  GID  SIZE  MODE     MNT  M/ST  RCVD  FLAGS
 178  1,40     2   130    1     1     0    15   0     c---600  0    S     0     -

In this case, the column we are looking for is the INUMB entry. This gives us the actual inode number, here 130. The MAJ/MIN column tells me the major and minor numbers of the filesystem that this file is on, in this case 1,40, which I (we?) know to be the root filesystem on ODT 3.0. If I had run this on OpenServer, this entry would probably be 1,42. Since we now have the real inode number, we can go looking for the file it is associated with.

To find the file, I have a few choices. There are two commands that can be used: ncheck and find -inum. If I had no idea where the file was, either of these would work. However, experience has taught me that very low inode numbers (i.e., < 200) are usually in the /dev directory, since numbers this low are usually created early in the install process. Also, since I know that these are the files referred to by stdin, stdout and stderr, I know that these are probably tty devices. So, I only need to look in the /dev directory. In this case, find is the better choice, since ncheck searches entire filesystems and find can be told to look in specific directories. So I would run:

find /dev -inum 130 -exec l -i {} \;

This gives me:

130 crw------- 1 root terminal 0, 0 May 06 12:01 /dev/tty01

If you haven't already figured it out, the kernel doesn't use that last step (the conversion from inode to file name). That is for us humans, to figure out the name of the file. Instead, the kernel takes the inode number and the offset into the fstypesw table (the FS entry in the inode information; in our case, 2 for an EAFS filesystem) to access the file itself.

By following this procedure for any of the file descriptors listed, we can find out every file a process has open. For example, file number 32 in the example uarea above turns out to be /usr/bin/vi, which makes sense since vi has to be open for me to use it.

Checking Kernel Parameters

There are three programs that can be used to check the current state of your kernel parameters: crash, configure and sysdef. When you run crash, many of the kernel parameters can be viewed using the var function. This is more than a screenful, so you will need to pipe it through something. Although configure is normally used to set kernel parameters, the -x option will give you a list of all the configurable kernel parameters as well as their current values. Despite what the man-page says about listing all the kernel tunable parameters, sysdef is missing some that are reported by configure. Because of this, I find configure more informative.
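For example, to list every tunable parameter and its current value (configure normally lives in /etc/conf/cf.d):

cd /etc/conf/cf.d
./configure -x | more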

The System Activity Reporter and Performance Tuning

SCO UNIX performance tuning is often thought of as a black art. I've talked with many customers who call in to SCO Support expecting that we will say a few magic words, wave our wizard's wand and, abracadabra, their system is running better. This is compounded by the fact that support engineers often don't have the time to go into long, detailed explanations. Instead, they quickly look over output from various system utilities and tell the customer to increase kernel parameter X. Miraculously, the system instantly runs better. From the customer's standpoint, this is "magic."

Well, not really. Some customers do express their frustration at not being able to improve the situation themselves. This is not because they aren't smart enough, but for the same reason that many people bring their cars to a mechanic for a tune-up. Compared to replacing the engine block, a tune-up is a relatively simple procedure. However, many people don't have the skills to do it themselves.

This applies to system tuning as well. Since many customers do not have the skills, they turn to the mechanic to do it for them. I remember a couple of my friends from when I was about 18 who were real car enthusiasts. When their cars suddenly started making a strange sound, I can still remember them saying that the franistan had come loose from the rastulator. Well, at least that's what it sounded like to me at the time. The reason I couldn't figure it out myself was that I didn't have the training or experience. They did, and could tell just by listening. The same is true of many system administrators, who don't have the training or experience to tune an SCO UNIX system. However, you can get both.

Although a book like this cannot provide the experience, it can provide some of the training. Keeping with the car analogy, we've talked about the transmission, the brakes, the drive shaft, the electrical system and even the principles of the internal combustion engine. With that knowledge we can now understand why it is necessary to have clean spark plugs or the proper mixture of air and gasoline.

With a car's engine we often get a "feeling" for its proper behavior. When it starts to misbehave, we know something is wrong, even though we may not know how to fix it. The same applies, in principle, to an SCO UNIX system. Many garages have high-tech diagnostic equipment that you plug your car into to show you what the car is doing. From that, it is a simple step for the mechanic to determine the proper course of action. What we need for an SCO UNIX system is a tool that does the same thing. That tool is the system activity reporter, sar.

The system activity reporter is a very useful tool for seeing what your system is doing. It can provide information about a wide range of aspects of your system, such as memory usage, the number of processes and even paging activity. By default, the system gathers information every 20 minutes (on the hour, and 20 and 40 minutes after) during a normal work day (8 AM-5 PM) and once an hour the rest of the time. This is done by the /usr/lib/sa/sa1 program through the sys user's crontab. Information is stored in the /usr/adm/sa directory. The files are of the form sadd, where dd is the day of the month.

On OpenServer, turning off the data collection is fairly easy. There is the sar_enable utility that you use to toggle it on and off. See the sar_enable(ADM) man-page for more details.

The information in the sadd files is processed once a day by the /usr/lib/sa/sa2 program through root's crontab. This information is also stored in /usr/adm/sa. The files here have the form sardd, where dd is again the day of the month. Unlike the sadd files, the sardd files are ASCII and can therefore be read without any problems. They are kept for one month (until the next day of the month with that number), so you have a limited record of system activity. The sadd files, on the other hand, are in a binary format that sar understands and are read when you run sar.
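Typical stock crontab entries look something like this (the exact times and options may differ on your release; the sa1 lines belong to the sys user, the sa2 line to root):

0 * * * 0-6 /usr/lib/sa/sa1
20,40 8-17 * * 1-5 /usr/lib/sa/sa1
5 18 * * 1-5 /usr/lib/sa/sa2 -s 8:00 -e 18:01 -i 1200 -A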

There are so many options to sar that uppercase letters had to be used as well. Because of the large number of different options, it would be difficult to talk about all of them. Since they are all listed in the sar(ADM) man-page and since many of them would require extensive explanation of what they mean, we are only going to talk about those options that are most commonly used. The general syntax for sar is:

sar <options> frequency repetitions

Where frequency is how many seconds between readings and repetitions is how many readings. Therefore, running

sar 5 10

would run sar with the default options, giving a reading every 5 seconds for a total of 10 readings. If you leave off the last number (repetitions), sar waits the number of seconds specified and reports once. It's not recommended that you run sar at intervals of less than 5 seconds, since the fact that sar is running may itself influence the output.

If you don't specify a time interval, sar shows you the activity for the current day. If you want to read the activity for some other day, you specify the file you want read (and therefore the day) with the -f option. You can also specify the time of the day you want by using the -s option for the start time and the -e option for the ending time. For example, if I want to find out about system activity in the afternoon on the 26th of the month, the command might look something like this:

sar -s12 -e17 -f/usr/adm/sa/sa26

If the day I ran this was the 14th, let's say, then this would show me the data for the 26th of the previous month. Keep in mind that the information is gathered by cron every 20 minutes, during "business hours" and once an hour other times. It's possible that the cron job doesn't actually get started until, let's say, 17:00:03. In that case, the above command would not report that entry. If we really wanted that entry, we could use the fact that the start and end times can be specified down to the second. Therefore, I could have used the command:

sar -s12 -e17:00:03 -f/usr/adm/sa/sa26

Much of the information sar provides is in the form of a percentage. In this case, you will see a percent sign (%) in the column heading. If the value reported is units per second, the information is displayed with a trailing /s. The default behavior of sar (that is, with no options specified) is the same as the -u option, which shows CPU usage. We would then see the statistics for today. We could also leave off the reporting criteria (the -u) and just specify the time. For example, we could run it once a second for 5 seconds, like this:

sar 1 5

This might give us something like this:

13:03:13  %usr  %sys  %wio  %idle
13:03:14     5     4     0     92
13:03:15     0     2     0     97
13:03:16     0     2     0     97
13:03:17     0     2     0     97
13:03:18     0     2     0     97

Average      2     2     0     96

Since I am the only user on my system, idle times this high are not unexpected. If you add up the entries in each column, you'll notice that they don't always add up to 100%. Sometimes it's a little over (like the first entry) and sometimes it's under (like entries 2-5). This is due to the manner in which sar does the calculation (rounding and averaging) and is to be expected.

The two columns that need to be monitored more often are %wio and %idle. The %wio column shows what percentage of the time the system is waiting for I/O. The SCO manuals say that if this value is constantly over 15%, then there is an I/O bottleneck, and I agree. (Do you expect that I wouldn't?) The operative word here is "constantly." If you run into cases when the %wio shoots up above 15% and then back down, this is normal. For example, when you are doing your daily backups, the %wio will probably get a lot higher than at other times. However, if you run sar over an extended period (several minutes) and see that most values for %wio are under 15% then you're doing well.

The %usr and %sys columns tell you the percentage of time spent running user and system code, respectively. It is difficult to say what values here are valid or not. This is all dependent on your application.

Another very useful option is -r, which shows memory usage. A 3-second interval might look like this:

13:23:18  freemem  freeswp
13:23:19      243    45968
13:23:20      243    45968
13:23:21      243    45968

Average       243    45968

The freemem column shows how much free physical memory we have, in 4K pages. Therefore, I have just under 1Mb of RAM available. The freeswp column, as you might have guessed, tells us how much free swap we have, in 512-byte blocks. Therefore, I have just under 23Mb of swap. In this case nothing changed, as I am the only one on the system. However, a really good example of changing values is if you were to start sar -r on one console multiscreen, then change to another screen and start up X. Things change dramatically.

Personally, the amount of swap I have free makes me think. When I installed the system, I configured it with 25Mb of swap. Since I have already used some of my swap space, I will want to monitor this. A little swapping can be tolerated, especially if you can't afford the RAM. (Like me) Too much swapping can lead to severe slowdowns in the system. This is because every time the system swaps, it has to access the disk, which is slow in comparison to accessing memory. I therefore want to monitor this value as well as both the %wio and %idle values when I run sar -u. If I am using a lot of my swap space and spending a lot of time waiting for I/O with almost no idle time, then the system is doing a lot of "busy" work and not getting much real work done. Since the system has to access the disk to swap, you also see increases in %wio when you swap.

If you have a large %wio value, things get more complicated if you have multiple hard disks. This is because the %wio value is combined from all the disks. In order to see activity on all drives, you need to use the -d option. We can compare what is being reported for each drive in terms of the percentage of the time the device was busy (%busy) and the number of reads and writes per second (r+w/s). If there is a wide gap, you might want to consider spreading your data across multiple drives to even the load. If you are running OpenServer, this might be the perfect opportunity to consider something like disk striping. For example, when I ran sar -d for three seconds on my system, I got this:

13:50:01  device  %busy  avque  r+w/s  blks/s  avwait  avserv
13:50:02  Sdsk-0   4.50   1.00   1.80   30.63    0.00   25.00
          Sdsk-1   0.90   1.00   0.90   28.83    0.00   10.00
          Sdsk-2   0.90   1.00   0.90   28.83    0.00   10.00

13:50:03  Sdsk-1   0.97   1.00   0.97   31.07    0.00   10.00
          Sdsk-2   0.97   1.00   0.97   31.07    0.00   10.00

13:50:04  Sdsk-0   0.98   1.00   0.98   31.37    0.00   10.00
          Sdsk-1   0.98   1.00   0.98   31.37    0.00   10.00
          Sdsk-2   0.98   1.00   0.98   31.37    0.00   10.00

Average   Sdsk-0   1.90   1.00   0.95   20.89    0.00   20.00
          Sdsk-1   0.95   1.00   0.95   30.38    0.00   10.00
          Sdsk-2   0.95   1.00   0.95   30.38    0.00   10.00

If we look at the %busy column (the percentage of the time the disk was servicing requests), we see that at 13:50:02 my first disk was five times more active than the others. If this were consistent throughout the day, it would tell me that this disk is being overburdened. Since the first disk is where all my applications and other programs are kept (it contains the root filesystem), it is not surprising that this has a higher value. Since the proportion evens out over time, however, I am not concerned.

Note that the intervals here are less than the 5 seconds I suggested earlier. Since sar is already in memory when this is being read, more than likely the system does not need to read the disk because of sar. However, to be sure that you are getting as accurate a reading as possible, it is a good idea to eliminate sar's own influence as much as possible.

When accessing a file, every program does so with that file's name. As we talked about in the section on kernel internals, the name is converted to the inode number by the namei() function. The first time a file is opened, namei() goes through all the gyrations we talked about before to find the inode number. Once it finds the inode numbers, it caches the name and inode in a structure called (what else?) the namei cache. The next time a file is accessed, the system looks in the namei cache. If the filename is there, all is well. If not, namei() will have to look elsewhere. If the directory is in the buffer cache, then no disk access is necessary. Namei() can read the inode from the directory. However, this is still not as fast as getting it from the namei cache as there is still a delay caused by looking in the buffer cache.

The number of times that an entry is found in any cache is referred to as a cache hit. The number of namei cache hits can be monitored using the -n option. Although it shows the same information, the output in OpenServer is different from that in ODT 3.0. This is because OpenServer actually maintains two namei caches. The first holds the name-to-inode mappings for AFS, EAFS and HTFS filesystems; the other is for DTFS filesystems. On a busy system, a 5-second run of sar -n might look like this:

14:23:11  H_hits  Hmisses  (%Hhit)  D_hits  Dmisses  (%Dhit)
14:23:14       4        1   ( 80%)       9       41   ( 18%)
14:23:15       1        1   ( 50%)       5       29   ( 14%)
14:23:16       2        0   (100%)      12       57   ( 17%)
14:23:17       4        6   ( 40%)       1        7   ( 12%)
14:23:18       1        2   ( 33%)       0        0   (  0%)

Average        2        2   ( 54%)       5       26   ( 16%)
( 16%)

If these values were to continue like this, I might want to consider increasing the size of my namei caches. On average, the DTFS finds the name of the file in the cache only 16% of the time. This means that 84% of the time it looks for an inode number, it must, at least, look in the buffer cache. Unless a lot of different files are being accessed in the same directory, a lot of namei cache misses probably also means that the directory is not in the buffer cache.

The size of the namei cache is determined by the S5CACHEENTS kernel parameter in ODT, and by the HTCACHEENTS and DTCACHEENTS kernel parameters in OpenServer, where HTCACHEENTS is for AFS, EAFS and HTFS filesystems and DTCACHEENTS is for DTFS filesystems. Note that names longer than 14 characters are not cached.

Keep in mind that the number of entries in the cache is not the only factor in the speed of access. The cache is an unsorted list; to find a particular entry, we would have to search through the entire list. Well, we would, if it weren't for something called a hash queue. Simply put, a hash queue is a list of entries grouped together. Let's assume we have 26 such hash queues, one for each letter of the alphabet. Entries are placed in the queue that corresponds to the first letter of the filename. As a result, far fewer entries need to be searched to find the right one.

The average length of a hash queue is simply the number of entries in the cache divided by the number of queues. If the hash queues were based on letters of the alphabet, each hash queue would be 1/26th the size of the name cache. The rule of thumb is that the length of each hash queue should be less than or equal to four; for example, a cache of 1024 entries should have at least 256 hash queues (1024/256 = 4). In other words, whenever you change the size of the cache, you should change the number of hash queues accordingly. Since the size of the table may be dynamic in OpenServer, this ratio may not apply there. The kernel parameters for the hash queues are S5HASHQS in ODT 3.0, and HTHASHQS and DTHASHQS in OpenServer.

Although we've sped up the translation from file name to inode, we haven't yet begun to read the hard disk to get our data. As we talked about in the section on filesystems, the inode contains the locations where the data resides on the hard disk. We cannot keep every inode for every filesystem in memory all the time, so as with other aspects of the system, we only keep what is currently needed. It is therefore possible that we found the inode number quickly through the namei cache, but the inode itself is not in memory, so we have to access the disk to read it. If we increased the size of the inode table in memory, we would speed things up even further.

First we need to see if this is even necessary. Remember, you should never try to fix something that's not broken. To see the activity in the inode table, we use the -v option to sar. Running it for three seconds gave me this:

16:02:45  proc-sz   ov  inod-sz   ov  file-sz   ov  lock-sz
16:02:46  101/ 154   0  332/ 819   0  302/ 682   0   29/ 128
16:02:47  101/ 154   0  332/ 819   0  302/ 682   0   29/ 128
16:02:48  102/ 154   0  333/ 819   0  304/ 682   0   29/ 128

This output shows us the size of process table (proc-sz), inode table (inod-sz), file table (file-sz) and the record lock table (lock-sz). The first three also report how many times the table overflowed between the samplings (ov). In each case, the two numbers represent the current value followed by the size of the table. If you are running OpenServer and have not set a maximum table size, this instead shows the maximum that the table has grown to.

If you have OpenServer, then life is easier. With OpenServer the size of the kernel inode table is dynamic. However, you can set the maximum size with the MAX_INODE parameter. On the other hand, if you still have ODT 3.0, then the size of the kernel inode table is determined at relink time by the NINODE parameter. The kernel parameter for the size of the hash queue in both cases is NHINODE.

Here again, we would have to search through every entry were it not for the existence of the inode hash queues. Like those for the namei cache, these decrease the amount of time needed to find the appropriate entry, and you should maintain the same 4:1 relationship between the size of the table and the number of hash queues.

As with other types of I/O, terminal I/O can present a serious bottleneck. If it gets too bad, usually the best solution is an intelligent multi-port board that takes some of the load off the system. In the meantime, you can take a look to see if there is anything you can do to help relieve the situation. The key aspect of serial I/O that causes problems is clists. Both incoming and outgoing characters are placed into clists. When you run out of them, characters get lost.

Using the -g option to sar, you can monitor how many times per second you run out of clists. Although decreasing the baud rate of the terminals would help eliminate clist overflows, you normally don't win friends this way. The solution is to increase the number of clists.
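For example, to sample clist activity every 5 seconds for 10 readings (check the column headings on your release for the overflow counts):

sar -g 5 10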

The next issue we're going to talk about is very much a two-edged sword. This is the size of your buffer cache. When you boot your system, you see the size of your buffer cache when the system displays i/o bufs=. This is determined by the NBUF kernel parameter. If set to 0, the system allocates space based on the total amount of memory you have. Otherwise, the system allocates the number of 1K buffers that NBUF is set to.

Let's take the case where you notice that a great deal of time is being spent writing to and reading from the disk (perhaps using the %wio column of sar -u). You conclude that by increasing the size of the buffer cache, you could decrease the need to go to the disk so often. In principle, this is a valid conclusion. If you are generally accessing the same data all the time, then increasing the size of the buffer cache increases how much of that data is kept in memory. On the other hand, if you are constantly updating and changing different pieces of data, then having a larger buffer cache may do you no good. In fact, it may be counterproductive if you are not careful. If you increase the size of the buffer cache too much, there is less memory for processes and you spend more time paging or even swapping.

To monitor the activity in the buffer cache, use the -b option to sar. Here the things to monitor are the percentage of cache hits when reading (%rcache) and the percentage of cache hits when writing (%wcache). If you think back to the section on the CPU, I mentioned that the system spends about 80% of its time executing 20% of the code. This means that there is a lot of repetition. If you have to go back to the disk every time you need the same portion of code, you are wasting time. Therefore, if you have a low %rcache value, you might want to think about increasing the size of your buffer cache.

What's low? Well, being consistently above 90% is not unexpected. If you have a lot of users, all using the same application, then they are going to be using the same code. If on the other hand you have a lot of different applications, then the value will be lower. However, if it averages below 80% I would seriously consider increasing the size of the buffer cache.

There are a couple of things to point out. First, applications typically do more reading than writing, so the %wcache value is potentially of less importance. Therefore, if you increase your buffer cache and the %wcache value doesn't change much although %rcache does, don't worry about it. Compare the number of logical writes per second to the number of logical reads (lwrit/s and lread/s, respectively). If the number of reads is substantially higher than writes, then you probably won't get much better write performance out of increasing the buffer cache.
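As a sketch (mine, not a stock tool), you could pull the average read hit rate out of a short sar -b run and complain when it drops too low. Here I assume %rcache is the fourth field of the Average line; verify that against the header on your own output first:

:
# warn if the average buffer cache read hit rate falls below 80%
rcache=`sar -b 5 5 | grep '^Average' | awk '{ print $4 }'`
if [ "$rcache" -lt 80 ]; then
    echo "buffer cache read hit rate is only ${rcache}%"
fi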

The next issue is the law of diminishing returns. The only time I see either read or write percentages at 100% is when I am the only one on the system and all I am doing is running different options to sar. The sar code is already in memory. Therefore, there is no need to read the buffer cache at all. Zero hits out of zero requests is still a 100% hit rate. (We're optimists here) So, if you get your %rcache up to 98% (which is completely possible) then be happy. Don't give yourself an ulcer trying to squeeze out that last 2%. You probably won't get it anyway.

Another common problem is when the system runs out of regions. If you think back to the discussion on kernel internals, you'll remember that each process has at least three regions: text, data and stack. If so configured, processes can also have a fourth region: shared data. Unless you are getting awfully close to maxing out the number of processes, shared memory is not usually a problem. However, you do run into problems when you either have a lot of processes using shared memory or are close to filling up your process table.

This problem occurs when you have not defined enough regions in your kernel. If you have OpenServer, the number of regions grows dynamically, unless you set it with the MAX_REGION parameter. In ODT this is the NREGION parameter. Any time you are using shared memory, you need to account for this when determining the ratio of processes to regions.

We can use sar -v (mentioned above) to monitor the number of processes used (proc-sz). For the regions, however, we have to return to our old friend crash. The region function inside of crash shows you the current state of the region table, so by counting the number of lines (minus the header) we find out how many entries are in the region table. Just as we can pipe the output to more, we can pipe it to any other program. So, to get a count of the number of lines, the command would be:

region ! wc -l

Since this also counts the lines in the header, we need to subtract 4 to give us the number of slots in ODT, but only subtract 2 with OpenServer. Once we've figured out the number of regions being used, we need to compare this to the number of processes.
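Since crash reads its commands from standard input, you can even do this check non-interactively. A rough sketch, assuming the count is the last line crash prints (verify the behavior and the header line count on your release):

regions=`echo "region ! wc -l" | crash | tail -1`
expr $regions - 4        # ODT 3.0; subtract 2 on OpenServer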

The key here is to make two checks. The first is immediately after going into multi-user mode, preferably before any third-party apps have started; this gives you a base value from which to make later determinations as to what is expected. Once users start running their applications, make another check to see what the regions and processes are doing.

SCO also provides another useful monitoring tool: vmstat. Although this does not provide the scope of options available with sar, it does provide a quick overview as well as some of the information available with sar. Like sar, vmstat can be told to run over a period of time. For example, vmstat 1 100 would show output once a second for 100 seconds. I find this a useful tool for monitoring system statistics while some other known event is occurring. For example, if I wanted to monitor behavior while running a large application, vmstat would give me a good overview.

For configuring your kernel there is tunesh. When you run it, you are asked a set of questions about your system (such as the number of serial terminals, network connections, etc.). Based on your answers and on information it determines itself, tunesh adjusts several kernel parameters to best fit the kernel to your system. Although this is not always perfect, it does provide a certain amount of tuning and is valuable to the novice administrator. See the tunesh(ADM) man-page for details.

Unfortunately, there is much more to kernel tuning than this. Entire books have been written about the subject. SCO provides a performance guide that covers many more issues than we have space for here. However, I wanted to cover the tuning issues that cause the most problems and are easiest to address.

There is also The SCO Performance Tuning Handbook by Gina Miscovich and David Simons, published by Prentice Hall. Although this was written before the release of OpenServer, it is still applicable when you take into account the dynamic parameters. Not only does it provide some great tuning tips, it also provides more insight into the inner workings of the kernel.

Although it does not address SCO specifically, another book is System Performance Tuning by Mike Loukides, published by O'Reilly and Associates. This provides an excellent overview of the concepts involved.

Getting It All At Once

The basis for this script was provided to me by Tom Melvin. There were a couple of sections that referred specifically to third-party software installed on his system that I didn't include. However, checking the versions of installed software would be a good addition. I also removed the entire section that checked the UUCP configuration, as well as a section that did some benchmarking. While this is all useful information, my intention is not to provide you with the complete monitoring tool, but to give you some ideas on what you can do. Note that this script takes for granted that almost everything is at the default (file names and locations, etc.).

Other things this script is missing are information on your automount filesystems (if you are using automount) and your NIS maps. Another enhancement could be to include the contents of stune to show any changed kernel parameters. If you have a network, you could have this script run out of cron on each machine and then copy the output to a central administration machine. You could also include a section that tests connectivity within the network. Other enhancements could include command line (or menu) options that check only certain aspects of the configuration. You could also add some checks with sar to determine system performance.


:
# Script to poke around on the system

# In this case I do not use any of the functions in std_funcs. However,
# I always include it out of habit. If I ever decide to expand the script
# to be more interactive, I don't have to worry about it not finding
# functions. Watch the dot in front of the file name.
. /usr/lib/sh/std_funcs

# Although I may not use them until much later, I always like to
# define my variables at the top of the script.
SYSTEM=`uname`
CONFIG_DIR=/usr/local/lib
CONFIG_FILE=${CONFIG_DIR}/cnf.${SYSTEM}
LINE_BREAK="-----------------------------------------------"

# This uses the fact that everything inside of the brackets is one
# expression. If it evaluates to false (that is, /usr/local/lib does
# exist), then the second half with the mkdir won't run.
[ ! -d /usr/local/lib ] && mkdir -p $CONFIG_DIR

# Send all standard out to the configuration file. Any echoes or cats all
# go to stdout. This exec has redirected stdout to the config file.
# Messages are explicitly sent to /dev/tty
exec > $CONFIG_FILE

is_root()
{
    # Is the user sufficiently powerful (root)?
    case `id` in
        *root*) ;;   # root
        "") ;;       # single user
        *)           # someone else
            echo "This program must be run by the superuser (root)" >/dev/tty
            exit 1
            ;;
    esac
}

sys_conf()
{
    echo "SYSTEM CONFIGURATION On: `date` \c"
    echo "\nO.S. Configuration: \n `uname -X`"

    if [ -s "/usr/bin/uptime" ]
    then
        echo "\nSystem stats\n"
        uptime
    fi
}

# Note that this shows you only currently mounted filesystems. I added
# the extra line to check the number of inodes used, as it can become
# an issue.
disk_usage()
{
    echo "Checking disk usage" >/dev/tty
    echo $LINE_BREAK
    echo "Disk usage:"
    df -v
    df -i
}

hardware_conf()
{
    echo "Checking installed hardware" >/dev/tty
    echo $LINE_BREAK
    echo "Installed hardware:"
    hwconfig -h 2>/dev/null
}

memory()
{
    echo $LINE_BREAK
    echo "Memory installed"
    grep 'mem:' /usr/adm/messages | tail -1
}

software_conf()
{
    echo "Checking installed software" >/dev/tty
    echo $LINE_BREAK
    echo "Installed software: \n"
    swconfig 2>/dev/null
}

# This doesn't take into account your having more than 2 drives.
drive_settings()
{
    echo "Drive settings" >/dev/tty
    echo $LINE_BREAK
    echo "Drive settings"
    dparam /dev/rhd00 2>/dev/null
    dparam /dev/rhd10 2>/dev/null
}

# This doesn't take into account having more than 2 drives. If the drive
# or partition does not exist, errors are simply sent to /dev/null. There
# are two loops. The outer one first prints the partition table and then
# calls the inner loop. The inner loop runs divvy on all 4 possible
# partitions. (The function name was missing from the original listing;
# partition_info is my reconstruction.)
partition_info()
{
    for disk in 0 1
    do
        fdisk -p -f /dev/rhd${disk}0 >/tmp/ckconfig.$$ 2>/dev/null
        if [ $? -eq 0 ]; then
            echo "--------------------"
            echo "Partitions on disk $disk"
            cat /tmp/ckconfig.$$
            for partition in 1 2 3 4
            do
                divvy -P /dev/hd${disk}${partition} >/tmp/ckconfig.$$ 2>/dev/null
                if [ $? -eq 0 ]; then
                    echo "----------"
                    echo "Divisions on partition $partition of disk $disk"
                    cat /tmp/ckconfig.$$
                fi
            done
        fi
    done
    rm -f /tmp/ckconfig.$$    # clean up the temporary file
}

enabled_ttys()
{
    echo "Determining enabled terminal ports" >/dev/tty
    echo $LINE_BREAK
    echo "Active Terminal ports"
    grep "respawn" /etc/inittab
}

printers()
{
    echo "Checking printer set-up" >/dev/tty
    echo $LINE_BREAK
    echo "\nPrinter setup"
    lpstat -t
    echo "\n"

    for fle in `lc /usr/spool/lp/admins/lp/interfaces/*`
    do
        xyz=`basename $fle`
        echo "Printer - $xyz \c"
        grep "#!" $fle | sed 's/#!//'
        grep "^stty" $fle
        echo "\n"
    done
}

filesystems()
{
    echo $LINE_BREAK
    echo "\n/etc/default/filesys"
    cat /etc/default/filesys 2>/dev/null
    echo $LINE_BREAK
    echo "\nCurrently Mounted filesystems"
    mount
}

root_cron()
{
    echo "Cron settings for root" >/dev/tty
    echo $LINE_BREAK
    echo "\nCron settings for root :"
    crontab -l
}

invalid_dev()
{
    echo "Invalid device files" >/dev/tty
    echo $LINE_BREAK
    echo "Files in /dev"
    find /dev -type f -exec l {} \;
}

# This is not necessarily the best way to check for this information.
# We are making a lot of assumptions here.
check_net()
{
    echo $LINE_BREAK
    NAMED=`ps -ef | grep -v grep | grep named`
    if [ -n "$NAMED" ]
    then
        echo "You are (probably) a name server"
        return
    else
        if [ -f /etc/resolv.conf ]; then
            echo "You are (probably) a nameserver client"
            return
        fi

        echo "You are not using the name server."
        echo "Contents of Hosts file:"
        cat /etc/hosts
    fi
}

show_messages()
{
    echo $LINE_BREAK
    echo "\nMessages file\n"
    tail -30 /usr/adm/messages
}

echo "This script will have a look around your system to see how it is \
configured." >/dev/tty
echo "Information will be stored in the file $CONFIG_FILE" >/dev/tty

is_root
sys_conf
drive_settings
partition_info
disk_usage
hardware_conf
memory
software_conf
enabled_ttys
printers
filesystems
root_cron
invalid_dev
check_net
show_messages

echo "\nFinished" >/dev/tty



