Jim Mohr's SCO Companion

Copyright 1996-1998 by James Mohr. All rights reserved. Used by permission of the author.

Be sure to visit Jim's great Linux Tutorial web site at http://www.linux-tutorial.info/

System Monitoring


Monitoring your system is more than just watching the amount of free hard disk space or the number of users running a certain application. Many aspects of your system are static over fairly long periods of time, such as the layout of your hard disk. However, such information is as important to really knowing your system as how much free memory there is at any given time.

Both ODT and OpenServer provide a wide range of tools to not only monitor your system as it is running, but to find out how it's configured. In this chapter, we are going to talk about the tools you need and the files you can look in to find out anything about your system that you need. In the next chapter, we are going to take some of these tools and see how they can be used to help us achieve the goal of administering our system.

Finding out about your system

One of the challenging aspects of tech support at SCO is OSD, or Operating System Direct. This is a direct line to a support engineer. As a support engineer, the challenge lies in the fact that before you talk to the customer, you have no way of knowing what the problem will be. It can be anything from simple questions that are easily answered by reading the manual to long, drawn out system crashes.

Late one Monday afternoon I was working the OSD line. Since I was the one who had been idle the longest, my phone rang when the next customer came into the queue. The customer on the other end of the line described the situation as simply that his computer would no longer boot. For some reason, the system had rebooted itself and now it would not boot.

When I asked the customer how far it got and what, if any, error messages were on the screen, he replied "panic - srmountfun". At that point I knew it was going to be a five minute call. A panic with the srmountfun message basically means your filesystem is trashed. In almost every case, there is no way to recover from this. On a few rare occasions, fsck can clean things up to be able to mount. Since the customer had already tried that, this was not one of those occasions.

We began discussing the options, which were very limited. He could reinstall the operating system and then the data, or he could send his hard disk to a data recovery service. Since this was a county government office, they had the work of dozens of people on the machine. They had backups from the night before, but all of that day's work would be lost.

Since there was no one else waiting in the queue to talk to me, I decided to poke around a little longer. Maybe the messages we saw might give an indication of a way to recover. We booted from the emergency boot/root set again and started to look around. The fdisk utility reported the partition table as being valid and divvy reported the division table as being valid. It looked as if just the inode table was trashed, which is enough.

I was about ready to give up when the customer mentioned that the divvy table didn't look right. There were three entries in the table that had starting and ending blocks. This didn't sound right because he only had one filesystem: root.

Since the data was probably already trashed, there was no harm in continuing, so we decided to name the filesystem and try running fsck on it. Amazingly enough, fsck ran through relatively quickly and reported just a few errors. We mounted the filesystem and, holding our breath, did a listing of the directory. Lo and behold, there was his data. All the files appeared to be intact. Since this was all in a directory named /data, he simply assumed that there was no /u filesystem, which there wasn't. However, there was a second filesystem.

I suggested backing up the data just to be safe. However, since it was an additional filesystem, a reinstallation of the OS could preserve it. Within a couple of hours, he could be up and running again. The lessons learned? Make sure you know the configuration of your system! If at all possible, keep data away from the root filesystem and back up as often as you can afford to. The lesson for me was to have the customer read each entry one-by-one.

Being able to manage and administer your system requires that you know something about how your system is defined and configured. What values have been established for various parameters? What is the base address of your SCSI host adapters? What is the maximum UID that you can have on a system? All of these are questions that will eventually crop up, if they haven't already.

The nice thing is that the system can answer these questions for you, if you know what to ask and where to ask it. In this section we are going to take a look at where the system keeps much of its important configuration information and what you can use to get at it.

As a user, much of the information that you can get will be useful only to satisfy your curiosity. Most of the files that I am going to talk about, you can normally read. However, there are a few utilities, such as fdisk and divvy, that you won't be able to run. Therefore, what they have to say will be hidden from you.

If you are an administrator, there are probably many nooks and crannies of the system that you never looked in. Many you probably never knew existed. After reading this section, you will hopefully gain some new insights into where information is stored. For the more advanced system administrator, this may only serve as a refresher. Who knows? Maybe the gurus out there will learn a thing or two.

Hardware and the Kernel

The first place we're going to look is the place that causes the most problems and results in the largest number of calls to SCO Support: hardware.

For those of you who have watched the system boot, you may already be familiar with what SCO calls the "hardware" screen. This gives you a good overview as to what kind of hardware you have on your system and how it is configured. Since many hardware problems are the result of misconfigured hardware, knowing what the system thinks about your hardware configuration is very useful.

Fortunately, we don't need to boot every time we want access to this information. SCO Unix provides a utility called hwconfig that shows us our hardware configurations. (At least what the OS thinks is the HW configuration.) On my system, if I run /etc/hwconfig -hc, I get this:

device    address      vec  dma  comment
======    =======      ===  ===  =======
fpu       -            13   -    type=80387
serial    0x3f8-0x3ff  4    -    unit=0 type=Standard nports=1
serial    0x2f8-0x2ff  3    -    unit=1 type=Standard nports=1
floppy    0x3f2-0x3f7  6    2    unit=0 type=135ds18
floppy    -            -    -    unit=1 type=96ds15
console   -            -    -    unit=vga type=0 12 screens=68k
parallel  0x378-0x37a  7    -    unit=0
adapter   0x330-0x332  11   5    type=ad ha=0 id=7 fts=s
tape      -            -    -    type=S ha=0 id=2 lun=0 ht=ad
disk      -            -    -    type=S ha=0 id=0 lun=0 ht=ad fts=s
Sdsk      -            -    -    cyls=1170 hds=64 secs=32
disk      -            -    -    type=S ha=0 id=6 lun=0 ht=ad fts=s
Sdsk      -            -    -    cyls=518 hds=64 secs=32
cd-rom    -            -    -    type=S ha=0 id=5 lun=0 ht=ad

No obvious conflicts in hardware settings

The -h option showed us the output in neat little columns, all with headings over each column (h for headings). Without this option, the same information is there; however, it is not as easy to read. The -c option checked for conflicts of I/O base address, IRQ and DMA channel (c for conflicts). This will catch not only duplicates, but in the case of the I/O base address, it will tell you if anything is overlapping. For more details about this, check out the hwconfig(ADM) man-page.
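If you have saved the column output somewhere (say, from a machine you can no longer get to), a few lines of awk can do a rough version of the interrupt part of the -c check. This is only a sketch: the function name shared_vectors is my own, and it assumes the column layout shown above, where the third column is the interrupt vector. It does not look at I/O addresses or DMA channels.

```shell
#!/bin/sh
# shared_vectors: read "hwconfig -h" style rows on stdin and report every
# interrupt vector (third column) that is claimed by more than one device.
# Heading lines and rows with "-" in the vec column are ignored because
# their third field is not a plain number.
shared_vectors() {
    awk '
        $3 ~ /^[0-9]+$/ {
            count[$3]++
            names[$3] = (names[$3] == "" ? $1 : (names[$3] "," $1))
        }
        END {
            for (v in count)
                if (count[v] > 1)
                    print "vec " v ": " names[v]
        }
    '
}

# Possible use on a live system (not run here):
#   /etc/hwconfig -h | shared_vectors
```

Fed the listing from my system, this prints nothing, which agrees with the "No obvious conflicts" line.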

From the output you know the base addresses of all devices that have a base address (the address column), their interrupts (the vec column), the DMA channel (the dma column) and often other pieces of information that can be very useful. For example, under the comment on the first line labeled "floppy", you might see it is unit=0 and the type=135ds18. If this is the way things should be, that's a happy thing.

However, this is not always the case. Customers repeatedly call SCO Support during installation of the OS with "bad media". This is not because the floppy disk is bad, but because the value here is not what the hardware really is. The values for the floppies are read from the CMOS. If they are incorrect, the operating system gets them incorrect. As a result, it may be trying to read a 3.5" floppy as if it were a 5.25". That won't work for long.

There is one "serial" entry for each COM port I have. In this case I have two, labeled unit=0 and unit=1. Each is of type Standard (nothing special about them) and each has only one port. If you had a non-intelligent serial board with more than one port, nports= would show you how many ports you have.

One thing to keep in mind here is serial mice. I have one on my system, but you couldn't tell that from the output here. At this point, the system has no way of knowing that a mouse is attached to the serial port.

The "console" entry is referring to both your video card and the way your console is configured. Here we see that I have a VGA card, which is standard (type=0), there are 12 multiscreens set up, with 68k of memory reserved for those screens. From the system's standpoint here, it doesn't matter that I actually have an SVGA as they both use the same driver.

If you have SCSI devices on your system, you will see something like this for your host adapter:

adapter 0x330-0x332 11 5 type=ad ha=0 id=7 fts=s

In the comments column, the type tells me that it is using the ad device driver, so I know that I have an Adaptec 154x or 174x in standard mode. (This is the same as the corresponding entry in /etc/default/scsihas.) The entries ha=0 id=7 tell me that this is the first (0th) host adapter and it is at ID 7. The entry fts= can mean several things, depending on what's there. The entry I have, fts=s, says that I have scatter-gather enabled. Other possible entries are:

s = scatter/gather

t = tagged commands

d = 32-bit commands

b = commands are buffered

The two disk entries are for hard disks. Here we see type=S, so we know that both of my disks are SCSI. If I had an IDE or ESDI hard disk drive, this would say type=W. Since this is SCSI, we also need to know what host adapter the device is attached to (ha=0, ht=ad), its ID and LUN (id=0, lun=0), plus similar characteristics as the host adapter (fts=s).

If this were an IDE or ESDI hard disk drive, instead of the SCSI configuration we would have a description of the drive geometry here. In order to be able to show both the SCSI configuration and hard disk geometry, there is an additional entry (Sdsk) for each disk. This is where the geometry of each drive is listed. This shows us cylinders, heads and sectors per track for each drive.

One question that is useful in debugging hardware problems is knowing just where the system gets this hardware information. Each of these lines is printed by the device driver. These are the configuration parameters that the drivers have been given.

If you remember from our discussion of the link kit, we know that the kernel is composed of a lot of different files that reside somewhere under /etc/conf. It is here where we have stored the configuration information that gets handed off to drivers during a kernel relink.

For our purposes, we only need to be concerned about three sub-directories under /etc/conf: pack.d, sdevice.d and cf.d. Since we went into detail about these in the section on the link kit, I will only review them briefly.

The first directory I want to talk about, /etc/conf/cf.d, is the central configuration directory. Here you can find default configuration information, which drivers are being linked into the kernel, the current value of kernel tunable parameters, etc.

When the system is installed, the defaults for the parameters that the kernel needs are kept in the file /etc/conf/cf.d/mtune. As I mentioned before, this is the master tuning file. It is a simple text file, consisting of four columns. The first is the parameter name, followed by the default value, the minimum and lastly the maximum. Any value that has not changed is set to the default here (the second column).

If any kernel parameters have a value other than a default, the parameter name is placed in stune along with the new value. This is the system tuning file. Although changes can be made to stune by hand, it is "safest" to use the configuration tool provided with the OS you have. If you have ODT 3.0, this tool is sysadmsh; if you have OpenServer, it is the Kernel/Hardware Manager. Both start up the utility /etc/conf/cf.d/configure, which you could start yourself if you wanted.

When the kernel is rebuilt, each parameter is first assigned the value defined in mtune and then any value in stune overwrites that default. If you want to figure out what each of these parameters means, take a look at the chapter entitled "Kernel Parameter Reference" in the System Administrator's Guide if you have ODT 3.0 and Appendix B (Configuring Kernel Parameters) of the Performance Guide if you have OpenServer.
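The "default from mtune, override from stune" rule is easy to mimic if you just want to see which values are in effect. The following is a minimal sketch, not what the link kit itself runs. The function name effective_tunables is made up, and I am assuming that both files are whitespace-separated as described above and that lines beginning with * or # are comments.

```shell
#!/bin/sh
# effective_tunables MTUNE STUNE
# For every parameter in MTUNE (name default min max), print the value that
# would be in effect and where it came from: the override in STUNE
# (name value) if there is one, otherwise the default.
effective_tunables() {
    awk '
        # First file on the command line is stune: remember the overrides.
        FILENAME == ARGV[1] {
            if (NF >= 2 && $1 !~ /^[*#]/) override[$1] = $2
            next
        }
        # Second file is mtune: skip comments and short lines.
        /^[*#]/ || NF < 4 { next }
        {
            if ($1 in override) print $1, override[$1], "stune"
            else                print $1, $2, "default"
        }
    ' "$2" "$1"
}

# Possible use on a live system (not run here):
#   effective_tunables /etc/conf/cf.d/mtune /etc/conf/cf.d/stune
```

Each output line has the parameter name, the value and either "stune" or "default", so a quick grep for stune shows everything that has been changed from the shipped defaults.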

Also in /etc/conf/cf.d is the master device configuration file: mdevice. As we know, this tells us all the devices that can be configured on the system, as well as specific characteristics about that device. This, however, does not tell us what devices are currently configured, just what devices could be configured.

The mdevice file provides a couple of pieces of information that may come in handy. One of them is the device major number. The block major number is column 5 and the character major is column 6. This is useful when trying to determine what device nodes are associated with what device driver. If the name of the device node is well chosen, then it is easy to figure out what kind of device it is. Otherwise, you have to guess.

For example, it's fairly obvious that the device /dev/tty1a has something to do with a tty device. What about /dev/ptmx? Well, it has a major number of 40. Looking in mdevice, I see that this is the clone device driver. Therefore, I know it has something to do with streams. (See the discussion of the /dev directory.) The mdevice file also contains the DMA channel the device uses (column 9), which is useful to know when trying to track down hardware conflicts.

What devices are actually configured can be found in the /etc/conf/cf.d/sdevice file. This file is generated during the kernel relink by concatenating all the files in /etc/conf/sdevice.d. The first column of each entry in sdevice matches the first column in mdevice. We can then make the connection, if we need to, from the device node to the corresponding entry in sdevice (major number -> mdevice -> sdevice).

As we mentioned in the section on the link kit, the sdevice.d files contain (among other things) the IRQ and base address. This is useful when hardware problems arise. We also find in these files a very useful piece of information: whether the device driver will be included at all. If there is a Y in the second column, then this driver will be included. If there is an N, this device will be left out.

Although it is not a common occurrence, it has happened that a device could no longer be accessed after a kernel relink. The reason was that there was an N in the second column. More often than not, this is the result of an overzealous system administrator who wants to reduce the size of his kernel by pulling out "unnecessary" drivers. However, on occasion I have seen third-party device driver removal scripts forget to put things back the way they were.
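When a device vanishes after a relink, it is quick to list every driver that has been switched off. Here is a small sketch; the name disabled_drivers is mine, and the only thing it relies on is the Y/N flag in the second column described above.

```shell
#!/bin/sh
# disabled_drivers [FILE]
# Print the name of each driver whose second column is N, once per driver,
# in the order they first appear. FILE defaults to the sdevice file that the
# relink generates.
disabled_drivers() {
    awk '
        /^[*#]/ { next }                     # skip comment lines
        $2 == "N" && !seen[$1]++ { print $1 }
    ' "${1:-/etc/conf/cf.d/sdevice}"
}

# Possible use on a live system (not run here):
#   disabled_drivers
#   disabled_drivers /etc/conf/sdevice.d/sio
```

Comparing the list before and after installing or removing a third-party driver is one way to spot a removal script that did not put things back.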

Also in /etc/conf/cf.d is the SCSI configuration file, mscsi. This tells us what SCSI devices the administrator wants to configure on the system. The reason I said it that way is because this is a place where errors regularly occur. What can often occur is that an administrator will be unsure of his SCSI configuration and will try several different ones until he gets it right.

This "shotgun" method of system administration rarely works. You may get the device working, but you can forget adding anything else in the future. This is especially true for SCSI hard disks. Only those devices that are actually configured on the system should be in mscsi.

Let's think back to the hardware screen. If you have SCSI devices on your system, they will be listed here and may be configured exactly as you see it on the screen. Just like other kinds of devices, SCSI devices may be configured incorrectly and still show up during boot.

However, you may have installed a SCSI device that does not appear in the hardware screen or there is one there that you didn't configure. The place to look is the mscsi file. If the device is there, then the odds are either the entry was input incorrectly, or it is not configured the way you expect. More information can be found in the mscsi(F) man-page.

A thing to note is that some SCSI devices (like hard disks) do not show up until after they have been accessed the first time, usually when the filesystems on them are mounted. Therefore, if you have multiple hard disks, the second and subsequent ones will not show up until the filesystems are mounted. This is usually when you go into multi-user mode. In addition, until you go into multi-user mode, the output of hwconfig may be invalid. Therefore, you might not see every device at boot.

The reason for this is that the printcfg() routine may not get called during system boot-up. The device driver may not do anything special in the initialization routine (where printcfg() is normally called), so it simply prints the configuration string. On the other hand, it may wait to call printcfg() until the device is first accessed, as in the case of hard disks.

We next jump to another directory at the same level as cf.d: /etc/conf/pack.d. We know that the pack.d directory contains a sub-directory for each device that can be configured, essentially matching what is in mdevice. In the sub-directory is the device driver itself (Driver.o) as well as a configuration file, space.c. (Some of the directories contain files called stubs.c. These only contain things like function declarations, but no configuration information.)

The space.c files can contain a wealth of information about the way your hardware is configured. Granted, much of the information requires knowledge of how the specific driver functions. However, skimming through these files can give you some interesting insights into what your system can do.

As a warning, don't go playing with the values in these files unless you know what you're doing. You have the potential for really messing things up.

Despite the fact that they're object code, which makes reading them difficult, the Driver.o files often provide some useful information. Let's assume that your system is panicking and you cannot figure out what's causing it. By running crash on the dump image and entering the command panic, you will get a stack trace that tells you the function the kernel was in when it panicked. If it's not obvious from the name what the function is, you can use either the strings or nm command to search through the Driver.o files to find that function. If the panic is always in the same function, this usually indicates a problem with the hardware or something else related to the driver. (We'll talk more about this technique later in the section on problem solving.)

Another directory at the same level as pack.d and sdevice.d is init.d. The files in this directory are concatenated onto the end of the file /etc/conf/cf.d/init.base to form your /etc/inittab file. This serves a couple of functions. First, many of the processes that are started as the system boots up are started out of /etc/inittab. These you find in init.base. Second, this is where the initial terminal configuration information is kept. For the console terminal devices this is also found in init.base. However, for terminals on the COM ports or multiport boards, the information for how they are configured is found in the files in /etc/conf/init.d. See the chapter on starting and stopping your system for more details.

On a default system, you should have at least the file /etc/conf/init.d/sio. This contains the default configuration for the standard serial ports (tty1a and tty2a). When intelligent multiport boards are added, there will probably be an extra file in this directory. If you are running OpenServer, then you will also find the file scohttp, which controls the SCO httpd daemon, /etc/scohttpd.

Terminals

Since we were just talking about terminals, let's take a look at some other information that relates to terminals.

I mentioned that the init.base file and the files in /etc/conf/init.d established the default configuration for terminals. To a great extent this is true; however, there is a little something extra that needs to be addressed. Whereas the /etc/conf/cf.d/init.base and /etc/conf/init.d/* files tell us about the default configuration (such as what process should be started on the port and at what run levels), it is the /etc/gettydefs file that tells the default behavior of the terminals. (Rather than mentioning both the /etc/conf/cf.d/init.base and /etc/conf/init.d/* files, I'll just talk about /etc/inittab, which is functionally the same thing.)

Each line in /etc/inittab that refers to a terminal device, points to an entry in the /etc/gettydefs file. The entry for /dev/tty1a might look like this:

Se1a:234:respawn:/etc/getty tty1a m

From our discussion of the /etc/inittab file in the chapter on Starting and Stopping the System, we see that this entry starts the /etc/getty command. Two arguments are passed to getty: the terminal it should run on (tty1a) and the gettydefs entry that should be used (m). The /etc/gettydefs file defines such characteristics as the default speed, parity and the number of data bits. For example, the m entry, which the inittab entry above points to, might look like this:

m # B9600 HUPCL # B9600 CS8 SANE HUPCL TAB3 ECHOE IXANY #\r\nlogin: # m

The fields are

label # initial_flags # final_flags #login_prompt # next_label

The label entry is what is being pointed to in the inittab file. The initial_flags are the default serial line characteristics that are set, unless a terminal type is passed to getty. Normally, the only characteristic that needs to be passed is the speed. However, we also set HUPCL (hang up on last close).

The final_flags are set just prior to getty executing login. Here again, we set the speed and HUPCL. However, we also set the terminal to SANE, which is actually several characteristics. (Look at the gettydefs(F) man-page for more details.) We also set TAB3, which turns tabs into spaces, ECHOE, which echoes the erase character as a backspace-space-backspace combination, and lastly IXANY, which allows any character to restart output if stopped by the XOFF character.
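Since the five fields are separated by # characters, pulling one entry apart is a one-liner's worth of awk. The sketch below is only an illustration: show_gettydefs_entry is a name I made up, it assumes each entry sits on a single line as in the example above, and it will be confused by a login prompt that itself contains a #.

```shell
#!/bin/sh
# show_gettydefs_entry LABEL [FILE]
# Print the five fields of the gettydefs entry whose label is LABEL, one per
# line. FILE defaults to /etc/gettydefs.
show_gettydefs_entry() {
    awk -F'#' -v want="$1" '
        # Strip leading and trailing blanks from a field.
        function trim(s) {
            sub(/^[ \t]+/, "", s)
            sub(/[ \t]+$/, "", s)
            return s
        }
        NF >= 5 && trim($1) == want {
            print "label: "         trim($1)
            print "initial_flags: " trim($2)
            print "final_flags: "   trim($3)
            print "login_prompt: "  trim($4)
            print "next_label: "    trim($5)
            exit
        }
    ' "${2:-/etc/gettydefs}"
}

# Possible use on a live system (not run here):
#   show_gettydefs_entry m
```

Following next_label from entry to entry shows the order in which getty cycles through settings; in the m entry above it points back to itself, so there is only the one setting.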

In many cases after you login, you are prompted to input the type of terminal you are working on. This appears as something like:

TERM = (ansi)

This prompt is a result of two things. First, the system checks the file /etc/ttytype. This file consists of two columns. The first is the terminal type, followed by the tty name (e.g., tty02, tty1a). If your tty device is listed here, then the system knows (or thinks) that you are logging in using a specific terminal type on that port. This is port dependent and not user dependent.
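To check what the system believes is attached to a given port, you only need to match on the second column. A tiny sketch, assuming the two-column layout just described (term_for_tty is a made-up name):

```shell
#!/bin/sh
# term_for_tty TTY [FILE]
# Print the terminal type listed for TTY (for example tty1a). Prints nothing
# if the port is not listed. FILE defaults to /etc/ttytype.
term_for_tty() {
    awk -v tty="$1" '$2 == tty { print $1; exit }' "${2:-/etc/ttytype}"
}

# Possible use on a live system (not run here):
#   term_for_tty tty1a
```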

If your terminal is not listed here, then you are prompted to input it. This is the result of the tset line in either your .profile or .login. Check out the tset(C) man-page for more details.

This is a useful mechanism if you have a lot of serial terminals that are always connected to the same port (say, on a multi-port board). That way the users don't have to be bothered with typing in their terminal type. In addition, you as the system administrator don't have to worry about users calling up saying their terminal doesn't work when they input the wrong terminal type.

Hard Disks and Filesystems

A common problem that causes longer calls to support is the layout of the hard disk. Many administrators are not even aware of the number of partitions and filesystems they have. This is not always their fault, as they often inherit the system without any information on how it's configured.

The first aspect is the geometry. This is such information as the cylinders, heads and sectors per track. In most cases, the geometry of the hard disk is reported to you on the hardware screen when the system boots. You can also run the dkinit program by hand.

To find how your hard disk (or hard disks) is laid out, there are several useful programs. The first is fdisk, which is normally used to partition the disk. Using the -p option, you can get fdisk to just print out the partition table. This tells you which partitions are on the disk, their starting and ending tracks, the type of partition and which one is active. The output is not necessarily intuitive, so let's take a quick look at it. On my system, I get output like this:

1 9600 41535 31936 UNIX Active

2 1 9599 9599 DOS (32) Inactive

3 41536 74815 33280 UNIX Inactive

Each line represents a single partition. The fields are:

partition_no. start_track end_track size type status

Here we have three partitions with the first UNIX partition being active. Note that although the DOS partition is physically the first partition, it shows up as the second partition in the fdisk table. In addition, it accurately recognized the fact that it is a 32-bit DOS partition.

If we look carefully and compare the ending tracks with the starting tracks of the physically next partition, we see that, in this case, there are no gaps. Small gaps (just a few tracks) are nothing to have a heart attack over, as you are only losing a couple of kilobytes. However, larger gaps indicate that the whole hard disk was not partitioned and you are losing space.
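Comparing end and start tracks by eye is fine for three partitions, but the same comparison can be sketched in a couple of lines. This assumes the six-column layout of the fdisk listing above (partition number, start track, end track, ...); partition_gaps is a name of my own choosing, and it only looks between partitions, not before the first or after the last one.

```shell
#!/bin/sh
# partition_gaps: read fdisk partition lines on stdin, put them in physical
# order by start track, and report any unused tracks between one partition
# and the next.
partition_gaps() {
    sort -k2,2n | awk '
        NR > 1 && $2 > prev_end + 1 {
            printf "gap of %d tracks before partition %s\n", $2 - prev_end - 1, $1
        }
        { prev_end = $3 }
    '
}

# Possible use on a live system (not run here):
#   fdisk -p | partition_gaps
```

For the table from my system it prints nothing: partition 2 ends at track 9599 and partition 1 starts at 9600, and partition 1 ends at 41535 with partition 3 starting at 41536.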

If you have multiple hard disks on your system, hwconfig may show you this. What happens if it doesn't? Maybe it's a SCSI hard disk that's never mounted, so it doesn't print out the configuration information. How can you figure out if you have more than one hard disk? You could take a look in /dev for any hd device. If you look back on the section on major and minor numbers, you can figure out what hard disks have devices assigned. However, it's possible that the hard disk existed at one time, but doesn't anymore. Maybe the previous administrator liked the "shotgun" approach to system administration and tried to configure every possible combination. The device nodes might be there, but there is no physical device associated with them. Therefore, you need a way to figure out exactly what devices are physically on the system.

No worries! The fdisk utility will tell you. If you try to print out the partition table for all the possible hard disks, the worst that can happen is you get an error message saying it can't open the device. To do this, run these four commands:

fdisk -p -f /dev/rhd00

fdisk -p -f /dev/rhd10

fdisk -p -f /dev/rhd20

fdisk -p -f /dev/rhd30

Once you get a response that fdisk can't open a device, then you know you've probably found your last hard disk. If you actually do have more than four hard disks, you need to try the same fdisk command on the other hard disk devices. If you never do get the message that it cannot open the device, then there are probably physical devices associated with every device node.
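The four commands differ only in the disk number, so they fold naturally into a loop. A minimal sketch follows; the function names are mine, and the device names are exactly the four used above. Whatever fdisk prints, including any "can't open" complaint, simply goes to the screen.

```shell
#!/bin/sh
# disk_nodes: print the whole-disk raw device names for disks 0 through 3.
disk_nodes() {
    for d in 0 1 2 3; do
        echo "/dev/rhd${d}0"
    done
}

# probe_disks: print the partition table of each of those disks in turn,
# with a heading line so you can tell which output belongs to which disk.
probe_disks() {
    for node in $(disk_nodes); do
        echo "== $node"
        fdisk -p -f "$node"
    done
}

# Possible use on a live system (not run here):
#   probe_disks
```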

To find out what filesystems or divisions are on your disks, you can use the mount command. However, this only tells you which ones are currently mounted. This is useful on a running system to determine if a directory is part of one filesystem or another. Although the df command (more on that later) will tell you what filesystems are mounted, it doesn't tell you what options were used, such as whether the filesystem is read-only or not. On a few occasions I have had customers call in reporting filesystem problems because they could not write to them, only to find out they were mounted as read-only.

What if you suspect there are more filesystems than are mounted? Unfortunately, finding out what filesystems are on your system is not as easy as figuring out what partitions are there. When you run mkdev fs or the Filesystem Manager on OpenServer, an entry for each filesystem is placed in /etc/default/filesys. If you find more here than you see with mount, then they are either not getting mounted properly or the entry is missing the options necessary to mount them automatically. Check the filesys(F) man-page for more details.

Sometimes there are filesystems on your disk without an entry in /etc/default/filesys. This happens often when people are not paying attention during the install and specify a /u filesystem. When they don't run mkdev fs they find half of their disk missing, and they end up calling SCO Support. Fortunately, SCO recognized the problems that this caused and it no longer occurs in OpenServer. Instead you are prompted to add the filesystem.

The easiest ones to find are those that you see simply by running divvy -P. This defaults to the partition with your root filesystem on it. For example, if I run it on my system I get:

0  0       14999
1  15000   39999
2  40000   429941
3  429942  469941
4  469942  509941
6  509942  509951
7  0       510975

Each line represents a single division, with the fields in each entry being the division number, the starting block, and the ending block. Note that the block sizes in divvy are 1K and not 512 bytes like other utilities. From this output, I see I have 5 divisions (0-4), plus recover (6) and the whole disk (7). (Think back on our discussion of filesystems.) Since this is the root partition, I know that one of these is probably the swap space. (I know it is division 1.)
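Because divvy counts in 1K blocks, the size of a division in kilobytes is just the ending block minus the starting block plus one. A short sketch that does the arithmetic for each line, assuming the three-column layout described above (division_sizes is a made-up name):

```shell
#!/bin/sh
# division_sizes: read "division start end" lines on stdin and print the
# size of each division in kilobytes (divvy blocks are 1K each).
division_sizes() {
    awk 'NF >= 3 { printf "division %s: %d KB\n", $1, $3 - $2 + 1 }'
}

# Possible use on a live system (not run here):
#   divvy -P | division_sizes
```

On my table, division 0 comes out at 15000 KB and division 1 (swap) at 25000 KB, while division 7, the whole disk, is 510976 KB.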

If I looked in /etc/default/filesys and saw fewer entries than I saw here (taking swap, recover and the whole disk into account, of course), then I would know something is not as it appears. One shortcoming is that this output does not give me the names of divisions. This is because the -P option is just reading the entries in the division table and displaying them.

So, where are the names coming from? From the device nodes. When divvy is run interactively, you see a table similar to the above output, but with the name of the device included. What divvy does is find the first block device (alphabetically) in /dev that has the correct major and minor number. If I created a new block device node named /dev/jim with a major number of 1 and a minor number of 41, instead of seeing swap as the name of the division, I would see jim. This is because /dev/jim shows up alphabetically before /dev/swap.

We can also pass a device node as an argument to divvy. If this is the name of a filesystem, divvy will figure out what partition it is on and display the appropriate division table. For example, divvy -P /dev/root would give me the exact same output as above.

If we wanted, we could also specify the character device for that partition. If I ran divvy -P /dev/rhd01, the output would again be the same. To get all the divisions, you will need to run divvy for all of the Unix partitions that you found with fdisk. We could run divvy on all the filesystem names. This would show us all the divisions including any that were not yet given names. However, this won't help us on disks where a filesystem spans an entire partition. The nice thing is that the output of fdisk will help us.

We can see from the example above for fdisk that to go through the hard disks, we increase the first number in the device name by one. To go through the partitions, we increase the second number by one. However, we don't have to go through each possible partition number to get the results we want. The output of fdisk gave us the partition numbers already.

Let's take the above fdisk output:

1 9600 41535 31936 UNIX Active

2 1 9599 9599 DOS (32) Inactive

3 41536 74815 33280 UNIX Inactive

From this output, I know to run divvy on /dev/hd01 and /dev/hd03, the two UNIX partitions. (Partition 2 is a DOS partition, so divvy does not apply to it.) If this were the second hard disk, I would change the devices accordingly. For example, the divvy command to show the first partition on the second disk would be:

divvy -P /dev/rhd11

You could write a quick shell script that explicitly ran through each value. However, I feel that checking everything yourself gives you a better understanding of how your system is configured. Keep in mind that a division with no filesystem on it may not be wrong. There are applications (such as some databases) that require either a raw partition (no divisions) or a raw division (no filesystem).
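For those who do want the quick shell script, here is one possible sketch. It reads partition lines in the form shown above (number, start, end, size, type, status) and prints the matching divvy command for each UNIX partition rather than running it, so you can look the list over first. The function name divvy_commands is hypothetical, and the device naming simply follows the /dev/rhd11 pattern used above: disk number first, then partition number.

```shell
divvy_commands() {
    # $1 = hard disk number (0 for the first disk, 1 for the second, ...).
    # Partition lines are read from stdin in the form:
    #   number  start  end  size  type  status
    awk -v disk="$1" '
        # Only partitions whose type field is UNIX are of interest to divvy.
        $5 == "UNIX" { printf "divvy -P /dev/rhd%s%s\n", disk, $1 }
    '
}
```

Fed the three lines above with a disk number of 1, this prints a divvy command for /dev/rhd11 and one for /dev/rhd13, and skips the DOS partition.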

System Defaults

A lot of default information can be found in the /etc/default directory. This directory contains a wealth of information about the default state of your system. I guess that's why the directory is called /etc/default, huh? If you have just upgraded to OpenServer or are otherwise familiar with ODT 3.0, then I highly recommend looking at this directory right now. You heard me, take a look. There are quite a few changes between the two releases. Because of the significance of this directory, knowing about the differences is an important part of administering your system.

The most obvious change is that virtually all of the files are symbolic links to the "real" files living in /var/opt/K/SCO/Unix/5.0.0Cd/etc/default. You will find that there are new files that didn't exist in previous releases as well as files that are no longer used. Note that 5.0.0Cd is the release number and may, therefore, be different.
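If you want to see for yourself which entries are links and where they point, a small loop will do. The following is just a sketch: the function name list_links is invented for this example, and it assumes that neither the file names nor the link targets contain spaces.

```shell
list_links() {
    # $1 = directory to examine, for example /etc/default.
    for f in "$1"/*; do
        # -h is true when the name is a symbolic link.
        if [ -h "$f" ]; then
            # Keep only the "name -> target" part of the long listing.
            ls -ld "$f" | sed 's/^.* \([^ ]* -> .*\)$/\1/'
        fi
    done
}
```

Running list_links /etc/default on an OpenServer system should show one line per symbolic link. Regular files are silently skipped.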

Most of the files have man-pages associated with them or are related to programs that have them. For more details, take a look at the default(F) man-page. It has a list of other man-pages related to these files.

Going through each file and discussing each entry would not be an effective use of our time, since the man-pages already do that. Instead, I am going to talk about some of the changes as well as address some of the more significant files.

Have you ever wondered why when you simply press 'enter' at the Boot: prompt you get hd(40)unix? If so, check out the variable DEFBOOTSTR in /etc/default/boot. The /boot program reads the DEFBOOTSTR (default boot string) variable from /etc/default/boot and when you press enter without any other input, the default boot string is echoed. It is the DEFBOOTSTR variable that defines the default boot behavior, hence the name. (See the section on starting the system for more details.)

If we wanted, we could change the default boot string to something else. In fact, that's what I did on my system. Instead of hd(40)unix, in my /etc/default/boot file it looks like this:

DEFBOOTSTR=hd(40)dos

Therefore, any time I get to the Boot: prompt and simply press ENTER, I am brought into DOS. The reason I did that was for my son. Since I have loads of educational programs for him on my DOS partition, I wanted a way for him to get to DOS easily. All he needs to do is press enter at the right place and he gets to where he needs to be. Now he's a little older and understands that DOS is different than UNIX and could type in DOS himself. However, I leave it in for historical reasons.

So, how do I get to UNIX? Well, one way would be to type in the old DEFBOOTSTR by hand ( hd(40)unix ). Rather than doing that, I use a trick called boot aliasing. We talked about this in some detail in the section on starting and stopping the system.

In the /etc/default/boot file you will also find out whether the system automatically boots (AUTOBOOT) or not. If there is no TIMEOUT value, the system automatically boots after 60 seconds. Otherwise, the system boots after the number of seconds defined in the TIMEOUT variable. Sometimes the overzealous administrator will want to set TIMEOUT to 0, so the system autoboots immediately after reaching the Boot: prompt. This is not a good thing. You want to give yourself at least a couple of seconds, in case you are having problems and need to boot into maintenance mode. Otherwise you will automatically go into multi-user mode.
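Since the file is just a list of NAME=value lines like the DEFBOOTSTR entry shown earlier, pulling a single setting out of it is easy. The two functions below are a sketch with made-up names (default_value and boot_timeout). The second one falls back to the 60 seconds mentioned above when there is no TIMEOUT entry. The $( ) syntax needs ksh or another POSIX-style shell.

```shell
default_value() {
    # $1 = file in NAME=value format, $2 = variable name.
    # Prints the value from the first matching line, or nothing if there is none.
    sed -n "s/^$2=//p" "$1" | sed 1q
}

boot_timeout() {
    # $1 = boot defaults file. Uses 60 seconds when no TIMEOUT entry exists.
    t=$(default_value "$1" TIMEOUT)
    echo "${t:-60}"
}
```

For example, default_value /etc/default/boot DEFBOOTSTR prints the current default boot string, and boot_timeout /etc/default/boot prints the number of seconds the system will wait.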

One change to this file in OpenServer is the addition of the BOOTMNT variable. This determines how to mount the /dev/boot filesystem. If set to read-only (RO), then you can't simply copy files onto this filesystem. However, certain system utilities need to be able to write to this filesystem no matter what.

The /etc/default/passwd file contains the default settings for using passwords. This includes such things as the minimum allowable length of the password, how often you can change it, and how complex the password has to be. Many of the values are dependent on what level of security you have. Therefore, in order to maintain consistency, I wouldn't recommend making changes unless you change the security level to match.

If you have ODT 3.0, then the file /etc/default/authsh is what is used to determine basic aspects of the user's environment, such as their home directory, shell, group, etc. If you are running OpenServer, this file still exists, but the file /etc/default/accounts is used instead. Although the format of the new file is easier to read, the old name had significance since it is the /tcb/bin/authsh utility that actually does the work when a user account is created.

Permissions Files

Another sub-directory of /etc, /etc/perms, contains more useful information about your system. The files within the /etc/perms directory are related to what products and packages are installed on your system. This is useful for finding out where a particular file is without having to do a search of your entire system. For this, do:

grep <file_name> /etc/perms/* | more

where <file_name> is whatever I am looking for. After using this several times, I put it into a shell script. If you are running OpenServer, then the files in /etc/perms may exist, but their content is not the same. Basically, OpenServer no longer uses the files in /etc/perms, but they are kept for backwards compatibility. Therefore this command may not work.
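Such a script does not need to be anything fancy. Here is a sketch of what it might look like, written as a shell function. The name whichperm and the optional second argument are my own inventions for this example. The second argument only exists so that you can point it at a different directory.

```shell
whichperm() {
    # $1 = file name (or pattern) to look for
    # $2 = directory holding the permission lists (defaults to /etc/perms)
    dir=${2:-/etc/perms}
    grep "$1" "$dir"/*
}
```

As with the one-liner, the output can be piped through more if it is too long, as in whichperm bin/cat | more.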

Although the contents of the files do not directly tell you what is currently installed, you can find out what programs are/should be available, plus what their permissions, owner and group ought to be. (NOTE: You don't need to correct permission problems by hand. You can use the fixperm or fixmog utilities.)

Although not quite as verbose or easy to read, OpenServer does have file lists similar to those in /etc/perms/. These are the file lists located within the SSO. You find these in several locations throughout the system. Normally these will be in a subdirectory called .softmgmt under the /var/opt directory. For example, the file lists for the operating system portion of OpenServer are found in /var/opt/K/SCO/Unix/5.0.0Cd/.softmgmt, whereas those for TCP are found in /var/opt/K/SCO/tcp/2.0.0Cd/.softmgmt. The nice thing about ODT was that all this information was concentrated in about a dozen files. OpenServer has it spread over a couple of hundred!

If you want to find out what is currently installed, you can use the swconfig program to list products as well as packages that are installed. In most cases, custom will also tell you whether a particular package is only partially installed or not installed at all, as well as which programs have been removed. This is usually the case with bundled products such as ODT or the ODT Development System. If you have ODT 3.0 and are curious what has been installed on your system, check out the file /usr/lib/custom/history. Unfortunately, this no longer exists in OpenServer. The closest is /var/opt/K/SCO/Unix/5.0.0Cd/custom/custom.log, but this is rather difficult to read.

On both ODT 3.0 and OpenServer you'll find the file /etc/auth/system/files. This contains a list of files and permissions from the perspective of the TCB and not necessarily related to what is installed. In many cases, individual files are not mentioned as it is expected that every file in a specific directory has the same permissions. Therefore, some files may not appear here.

You can use a one-liner like the one above to look through the OpenServer configuration files:

egrep <file_name> {,/var}/opt/K/SCO/*/*/.softmgmt/*.fl

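The {,/var} syntax is what lets us look through both the /opt and the /var/opt directories with a single command. Note that it is the shell, not egrep, that expands this syntax, so it only works in a shell that supports brace expansion. In a shell that does not, simply list both paths.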
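The same search can also be spelled out as a loop over both directories, which avoids relying on the {,/var} shorthand altogether. What follows is a sketch only. The function name search_filelists is hypothetical, and the second argument is a prefix I added so the function can be tried out on a copy of the tree. On a live system you would leave it empty.

```shell
search_filelists() {
    # $1 = pattern to look for
    # $2 = prefix placed in front of both trees (empty on a real system)
    for root in "$2/opt" "$2/var/opt"; do
        for fl in "$root"/K/SCO/*/*/.softmgmt/*.fl; do
            # When nothing matches, the pattern itself is left over; skip it.
            if [ -f "$fl" ]; then
                # /dev/null makes grep print the file name with each match.
                grep "$1" /dev/null "$fl" || :
            fi
        done
    done
    return 0
}
```

Calling search_filelists ping "" would then list every file-list entry that mentions ping, each prefixed with the name of the file list it came from.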

User Files

The /etc directory contains the all-important passwd file. This gives important information about what users are configured on the system, what their user ID number is, what their default group is, where their home directory is and even what shell they use by default.

The default group is actually a group ID number rather than a name. However, it's easy to match up the group ID with the group name by looking at /etc/group. This also gives you a list of users, broken down into what groups they belong to. Note that "groups" is plural.
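Matching the two files up by hand gets old quickly, so here is a sketch of how it could be done with awk. The function name default_group is made up. It relies only on the standard colon-separated layout: the fourth field of a passwd entry is the default group ID, and the third field of a group entry is the group ID.

```shell
default_group() {
    # $1 = user name, $2 = passwd-format file, $3 = group-format file.
    # Field 4 of the passwd entry is the numeric default group ID.
    gid=$(awk -F: -v user="$1" '$1 == user { print $4; exit }' "$2")
    # Field 3 of a group entry is the group ID; field 1 is the group name.
    awk -F: -v gid="$gid" '$3 == gid { print $1; exit }' "$3"
}
```

For example, default_group root /etc/passwd /etc/group prints the name of root's default group.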

Another aspect of information about users is what privileges they have on the system. As we talked about in the section on security, which users have a particular privilege can be found in the files in /etc/auth/subsystems. These are referred to as the subsystem authorizations. Privileges listed on a per user basis are found in /tcb/files/auth/?, where ? is the first letter of the user's account name, such as r for root or u for uucp. The default values for these files are kept in /etc/auth/system/default.

Network Files

If you are running TCP/IP, there are a couple of places to look for information about your system. First, check out the file /etc/resolv.conf. If you don't find it and you know you are running TCP/IP, don't worry! The fact that it is missing tells you that you are not running a nameserver in your network. (A nameserver is a machine that contains data on how to communicate with other machines in a network.) In that case, the list of machines that your machine knows about and can contact by name is in /etc/hosts. If you are running a nameserver, this information is kept on the nameserver itself.

The content of the /etc/hosts file is the IP address of a system followed by its fully qualified name and then any aliases you might want to use. A common alias is simply to use the node name, leaving off the domain name. Each line in the /etc/resolv.conf file contains one of a couple different types of entries. The two most common are the domain entry, which is set to the local domain name, and the nameserver entry, which is followed by the IP address of the name "resolver". See the section on TCP/IP for more information on both of these files.
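To make the layout of /etc/hosts a little more concrete, here is a sketch of a lookup that works the way the file reads: the first field is the address, and everything after it is a name or an alias. The function name host_address is invented for this example. It only looks at the file you give it and knows nothing about nameservers.

```shell
host_address() {
    # $1 = host name or alias, $2 = hosts-format file (for example /etc/hosts).
    awk -v name="$1" '
        { sub(/#.*/, "") }    # drop comments
        {
            # Field 1 is the address; every later field is a name or alias.
            for (i = 2; i <= NF; i++)
                if ($i == name) { print $1; exit }
        }
    ' "$2"
}
```

If the name is not in the file, nothing is printed.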

It's possible that your machine is the nameserver itself. To find this out look at the file /etc/named.boot. If this exists, then you are probably a name server. The /etc/named.boot file will tell you the directory where the name server database information is kept. For information about the meaning of these entries, check out the named(ADMN) man-page as well as the section on TCP/IP.
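In the boot file, the line that names that directory starts with the keyword directory. A one-line sketch to pull it out might look like this. The function name named_directory is hypothetical.

```shell
named_directory() {
    # $1 = boot file for named (for example /etc/named.boot).
    # Print the second field of the first line whose first field is "directory".
    awk '$1 == "directory" { print $2; exit }' "$1"
}
```

If the file has no such line, the function prints nothing.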

Another place to look is the TCP startup script in /etc/rc2.d. Often static routes are added there. If these static routes use tokens from either /etc/networks or /etc/gateways that are incorrect, then the routes will be incorrect. By using the -f option to the route command you can flush all of the entries and start over.

Although not as often corrupted or otherwise goofed up, there are a couple of other files that require a quick peek. If you think back to our telephone switchboard analogy for TCP, we can think of the /etc/services file as the phonebook that the operator uses to match up names to phone numbers. Rather than names and phone numbers, /etc/services matches up the service requested to the appropriate port. To determine the characteristics of the connection, inetd uses /etc/inetd.conf. This contains such information as whether to wait for the first process to be finished before allowing new connections.
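To stay with the phonebook picture, looking up a "number" in /etc/services can be sketched like this. Each entry has the service name first, then the port and protocol joined by a slash (such as 23/tcp), then any aliases. The function name service_port is my own.

```shell
service_port() {
    # $1 = service name, $2 = protocol (tcp or udp), $3 = services-format file.
    awk -v svc="$1" -v proto="$2" '
        { sub(/#.*/, "") }    # drop comments
        {
            # Field 2 looks like "23/tcp"; split it into port and protocol.
            if (split($2, pp, "/") != 2 || pp[2] != proto) next
            # Field 1 is the official name; fields after the port are aliases.
            for (i = 1; i <= NF; i++)
                if (i != 2 && $i == svc) { print pp[1]; exit }
        }
    ' "$3"
}
```

For example, service_port telnet tcp /etc/services prints the port that telnet connections are directed to.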

A common place for confusion, incorrect entries and the inevitable calls to support is user equivalence. As we talked about in the section on TCP/IP, when user equivalence is set up between machines, many remote commands can be executed without the user having to produce a password. One of the more common misconceptions is the universality of the /etc/hosts.equiv file. While this file determines with which other machines user equivalence should be established, the one user it does not apply to is root. To me this is rightly so. While it does annoy administrators who are not aware of this, that is nothing compared to the problems you would have if it did apply to root and this was not what you expected.

In order to allow root access, you need to create a .rhosts file in root's home directory (usually /) containing the same information as /etc/hosts.equiv, but only applying to the root account. The most common mistake made with this file is the permissions. If the permissions are such that any user other than root (as the owner of the file) can read it, the user equivalence mechanism will fail. Looking in /etc/hosts.equiv and $HOME/.rhosts tells you what remote users have access to what user accounts.
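Since the permissions are where things usually go wrong, a quick check is worth having. The sketch below looks at the mode string from ls -ld and succeeds only if neither the group nor other users have read permission. The function name rhosts_private is made up, and the check covers only the read bits discussed above.

```shell
rhosts_private() {
    # $1 = file to check. Returns 0 if only the owner can read it,
    # 1 if group or others can read it, 2 if it is not a regular file.
    [ -f "$1" ] || return 2
    # The first ten characters of the long listing are the mode string.
    mode=$(ls -ld "$1" | cut -c1-10)
    case "$mode" in
        ????r?????|???????r??) return 1 ;;   # group-read or other-read is set
        *) return 0 ;;
    esac
}
```

You could use it as rhosts_private /.rhosts || echo "check the permissions". A chmod 600 on the file is the usual fix.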

In the section on networking, I introduced the concept of a "chain." This is the link between the network interface and the various network protocols. Which links are configured is kept in /usr/lib/lli/chains. This is simply a list of the various chains, with the upper layer on the left side and the lower layer on the right side.

If you are running NFS, there are two places to check for NFS mounted filesystems. To check for remote filesystems that your system mounts, take a look at /etc/default/filesys. This is the default location to list all mounted filesystems, not just local ones. If you are the one exporting filesystems or directories, the place to look is /etc/exports.

Other Files

Next we get to a very foreboding and seldom frequented portion of your system: /usr/include. If you are a programmer, you know what's in here. If not, you probably thought that this directory was only for programmers. Well, sort of.

There are really only three times when you need to be concerned with this directory. First, if you are a programmer. Second, if you're relinking the kernel. Do you remember all the space.c files in /etc/conf/pack.d? Well, they all refer to include files somewhere in /usr/include.

The parent directory, /usr/include, basically contains the include files that are consistent across Unix dialects. There are some useful things in here, such as the maximum value that an unsigned integer can take on. This is 4294967295 and you can find it in /usr/include/limits.h. There are some less useful things such as pi divided by four out to 20 decimal places. This is defined in the /usr/include/math.h as 0.78539816339744830962.

The third time? If you are just curious about your system. Even if you have just a basic knowledge of C, poking around in these files can reveal some interesting things about your system.
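You don't even need to open the files in an editor to poke around. Here is a sketch of a function that prints the value of a simple #define from a header file. The function name define_value is invented, and it only handles the plain one-line "#define NAME value" form, not macros with arguments or definitions continued over several lines.

```shell
define_value() {
    # $1 = macro name, $2 = header file. Prints the text after the name on
    # the first matching "#define" line, with any trailing comment removed.
    awk -v name="$1" '
        $1 == "#define" && $2 == name {
            sub(/\/\*.*/, "")         # drop a trailing C comment
            $1 = ""; $2 = ""          # remove "#define" and the macro name
            sub(/^[ \t]+/, ""); sub(/[ \t]+$/, "")
            print; exit
        }
    ' "$2"
}
```

Assuming the two values mentioned above go by their conventional names, UINT_MAX and M_PI_4, you could look them up with define_value UINT_MAX /usr/include/limits.h and define_value M_PI_4 /usr/include/math.h.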


File

Purpose

Where to find more information

User and Security Files



/etc/auth/subsystems

manipulation routines for Subsystems database

subsystems(S)

/etc/auth/system/authorize

subsystem authorization file

authorize(F)

/etc/auth/system/default

system default database file

default(F)

/etc/auth/system/devassign

device assignment database file

devassign(F)

/etc/auth/system/files

file control database

files(F)

/etc/auth/system/ttys

terminal control database file

ttys(F)

/etc/group

User group information

group(F), chmod(C)

/etc/passwd

User account information

passwd(F), chmod(C)

Kernel Files



/etc/conf/cf.d/init.base

Base for /etc/inittab

inittab(F)

/etc/conf/cf.d/mdevice

device driver module description file

mdevice(F)

/etc/conf/cf.d/mevent

Master event file

event(FP)

/etc/conf/cf.d/mfsys

configuration file for filesystem types

mfsys(FP)

/etc/conf/cf.d/mscsi

SCSI peripheral device configuration file

mscsi(F)

/etc/conf/cf.d/mtune

Master kernel tunable parameter file

mtune(F)

/etc/conf/cf.d/sdevice

local device configuration file

sdevice(F)

/etc/conf/cf.d/sevent

System event file

event(FP)

/etc/conf/cf.d/sfsys

local filesystem type file

sfsys(FP)

/etc/conf/cf.d/stune

local tunable parameter file

stune(F)

/etc/conf/mfsys.d

Master filesystem configuration file

mfsys(FP)

/etc/conf/node.d

Device node configuration files

idmknod(ADM)

/etc/conf/pack.d

Device drivers and configuration files

mdevice(F), sdevice(F)

/etc/conf/sdevice.d

Device configuration files

mdevice(F), sdevice(F)

/etc/conf/sfsys.d

Local filesystem configuration files

sfsys(FP)

Networking Files



/etc/auto.master

Default master automount file

automount(NADM)

/etc/bootptab

Internet Bootstrap Protocol server database

bootptab(SFF)

/etc/exports

Directories to export to NFS clients

exports(NF)

/etc/gateways

List of gateways

routed(ADM)

/etc/hosts

Hostname to IP address mapping file

hosts(SFF)

/etc/hosts.equiv

Lists of trusted hosts and remote users

hosts.equiv(SFF), .rhosts(SFF)

/etc/inetd.conf

Configuration file for inetd

inetd.conf(SFF)

/etc/named.boot

Default initialization file for named

named.boot(SFF)

/etc/networks

Known networks

networks(SFF)

/etc/pppauth

point-to-point authentication database

pppauth(SFF)

/etc/pppfilter

PPP packet filtering configuration file

packetfilter(SFF)

/etc/ppppool

IP address pool file for PPP network interfaces

ppppool(SFF)

/etc/ppphosts

point-to-point link configuration file

ppphosts(SFF)

/usr/lib/named or /etc/named.d

Configuration files for named

named(ADMN)

/usr/lib/uucp/Configuration

Protocol configuration file for UUCP

uucp(C), Configuration(F)

/usr/lib/uucp/Devices

Configured UUCP Devices

uucp(C), Devices(F)

/usr/lib/uucp/Permissions

UUCP authorization file

uucp(C), Permissions(F)

/usr/lib/uucp/Systems

Remote UUCP Systems

uucp(C), Systems(F)

X-Windows Files



$HOME/.mwmrc

MWM configuration file

mwm(XC), X(X)

$HOME/.pmwmrc

PMWM configuration file

mwm(XC), X(X)

$HOME/Main.dt

X-Desktop configuration file

xdt3(XC)

$HOME/Personal.dt

X-Desktop configuration file

xdt3(XC)

/usr/lib/X11/system.mwmrc

System default MWM configuration file

mwm(XC), X(X)

/usr/lib/X11/system.pmwmrc

System default PMWM configuration file

mwm(XC), X(X)

/usr/lib/X11/app-defaults

Application specific defaults

X(X)

$HOME/.Xdefaults-hostname

Host specific defaults

X(X)

System Default Files



/etc/default

system default database file

default(F)

/etc/default/archive

archive devices

archive(F)

/etc/default/authsh /etc/default/accounts

Account creation parameters

authsh(ADM)

/etc/default/backup

XENIX backup devices

xbackup(ADM)

/etc/default/boot

System boot options

boot(F)

/etc/default/cc

Read by /bin/cc

cc(CP)

/etc/default/cleantmp

Interval/location for tmp file cleanup

cleantmp(ADM)

/etc/default/cron

Cron configuration file

cron(C)

/etc/default/device.tab

Device table for package utilities

pkgadd(ADM)

/etc/default/dumpdir

XENIX archive device

xdumpdir(ADM)

/etc/default/filesys

Filesystem mount table

filesys(F)

/etc/default/format

Floppy disk format device and verification

format(C)

/etc/default/goodpw

Password checking options

goodpw(ADM)

/etc/default/idleout

Interval for closing idle logins

idleout(ADM)

/etc/default/issue

System default banner

issue(F)

/etc/default/lang

System locales

locale(M)

/etc/default/lock