Kernel Link Failures

This is an old SCO Unix article. If you are having trouble building a Linux kernel, this isn't going to help, sorry. There are other posts on this site that might.

That's a pretty awful feeling, isn't it? You've got to link a new kernel because you need to change a value or add something, and it fails. The near-gibberish it outputs looks completely unhelpful and you haven't a clue where to start. Well, this article hopes to give you some clues.

A cover-your-butt procedure I always follow is to link a kernel BEFORE changing anything. If it fails, you know it was already broken, and didn't break because of something you did. If you are feeling really paranoid, answer "N" to the "Do you want this kernel to boot by default" message, and then do:


sum -r /stand/unix ./unix
 

and see if the two files are the same; they certainly should be if you haven't changed anything yet. If they aren't, I suggest saving a copy of the kernel you are currently booting:

btmnt -w
cd /stand
cp unix unix.good
cd
btmnt -d
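
If you'd rather let the shell make the comparison, a sketch along these lines would do. The helper name `kernels_match` is made up for illustration, and it uses `cmp -s` instead of `sum -r`, on the assumption that a byte-for-byte comparison answers the same question as comparing checksums.

```shell
# kernels_match: hypothetical helper (not part of any SCO tool).
# Succeeds only if both files exist and are byte-for-byte identical.
kernels_match() {
    [ -f "$1" ] && [ -f "$2" ] && cmp -s "$1" "$2"
}

# Example, using the paths from the article (run from /etc/conf/cf.d
# right after the test link):
#   if kernels_match /stand/unix ./unix; then
#       echo "test link matches the booting kernel"
#   else
#       echo "kernels differ: save /stand/unix as unix.good first"
#   fi
```

The example is left as a comment so that pasting the block only defines the function and touches nothing.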
 

We're going to start with an actual case. A local consultant called me because he had tried to increase a kernel variable, but the link failed. The increase was critical to the proper functioning of the system, and he couldn't fix it.

As it turns out, I could have identified the problem in seconds. Unfortunately, I didn't realize that at the time (live and learn), but even if I had thought of that method, I would have dismissed it because I was sure the problem was elsewhere. I'll tell you what I should have done that would have instantly told me what was wrong, but I'll hold off explaining why until later. Here's what would have given me the answer I needed:

cd /etc/conf/cf.d
diff sdevice sdevice.new
 

Think about that as you read along.

This article doesn't go into the whole subject of drivers and the link directories very deeply. You might want to read Understanding Device Drivers if you want to understand more.

The first thing I did was this:

cd /etc/conf/cf.d
script /tmp/linkerr
./link_unix
 

After the script finished belching out its errors, I used CTRL-D to exit "script", and went to look at /tmp/linkerr. Here it is:

# ./link_unix

        The UNIX Operating System will now be rebuilt.
        This will take a few minutes.  Please wait.

        Root for this system build is /
undefined                       first referenced
 symbol                             in file
putctl                              /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/merge/Driver.o
sdistributed                        /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/merge/Driver.o
freemsg                             /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/merge/Driver.o
qreply                              /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/merge/Driver.o
flushq                              /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/merge/Driver.o
putq                                /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/merge/Driver.o
qsize                               /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/merge/Driver.o
getq                                /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/merge/Driver.o
putbq                               /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/merge/Driver.o
allocb                              /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/merge/Driver.o
linkb                               /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/merge/Driver.o
copyb                               /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/log/Driver.o
dupb                                /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/log/Driver.o
freeb                               /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/log/Driver.o
canput                              /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/log/Driver.o
putnext                             /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/log/Driver.o
putctl1                             /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/ptm/Driver.o
qenable                             /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/ptm/Driver.o
bufcall                             /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/ldterm/Driver.o
pullupmsg                           /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/timod/Driver.o
copymsg                             /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/timod/Driver.o
msgdsize                            /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/tirdwr/Driver.o
unlinkb                             /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/tirdwr/Driver.o
rmvq                                /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/tirdwr/Driver.o
insq                                /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/tirdwr/Driver.o
lock_stp                            /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/socket/Driver.o
backq                               /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/socket/Driver.o
unlock_stp                          /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/socket/Driver.o
qdetach                             /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/socket/Driver.o
at_qrunflag                         /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/socket/Driver.o
strwaitbuf                          /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/socket/Driver.o
dupmsg                              /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/socket/Driver.o
lock_str_bfsleep                    /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/socket/Driver.o
strmaxblk                           /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/socket/Driver.o
getclass                            /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/socket/Driver.o
allocq                              /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/socket/Driver.o
streams                             /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/socket/Driver.o
freeq                               /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/socket/Driver.o
setq                                /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/socket/Driver.o
shlock_str_qnext                    /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/socket/Driver.o
clnopen                             /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/socket/Driver.o
noenable                            /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/socket/Driver.o
qdisable                            /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/socket/Driver.o
strdoioctl                          /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/ksl/Driver.o
strwaitq                            /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/ksl/Driver.o
findmod                             /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/ksl/Driver.o
qattach                             /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/ksl/Driver.o
strqset                             /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/ip/Driver.o
adjmsg                              /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/ip/Driver.o
strmsgsz                            /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/rip/Driver.o
unbufcall                           /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/iknt/Driver.o
bsize                               /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/iknt/Driver.o
esballoc                            /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/net0/Driver.o
mblock                              /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/ipl/Driver.o
emblock                             /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/ipl/Driver.o
rbsize                              /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/nfs/Driver.o
i386ld fatal: Symbol referencing errors. No output written to unix
ERROR: Can not link-edit unix


idbuild: idmkunix had errors.
System build failed.
# 
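
With a log this long it can help to boil it down before reading it. The following is only a sketch: `summarize_link_errors` is a hypothetical helper, and it assumes the two-column "symbol, file" layout shown in the log above. It counts how many undefined symbols were first referenced in each driver.

```shell
# summarize_link_errors: hypothetical helper. Reads a captured link log and
# prints "count driver" for every pack.d driver named in the undefined
# symbol table, largest count first.
summarize_link_errors() {
    awk 'NF == 2 && $2 ~ "/pack[.]d/[^/]+/Driver[.]o$" {
             n = split($2, part, "/")   # part[n] is Driver.o
             count[part[n - 1]]++       # part[n-1] is the driver directory
         }
         END { for (d in count) print count[d], d }' "$1" | sort -rn
}

# Example: summarize_link_errors /tmp/linkerr
```

This only tells you who is complaining, not who is missing, but when every complaint is about the same family of symbols that is already a strong hint.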
 

Pretty awful mess, isn't it? I was convinced that a driver file in /etc/conf/pack.d must be missing or horribly corrupted. Actually, though, it couldn't have been a missing driver file: link_unix would have reported that in plain English. A really badly corrupted driver file would also have barfed differently, though the error message wouldn't be as obvious (I'll show examples of that later). Could it be that a good driver had been copied to the wrong place, for example /etc/conf/pack.d/clone/Driver.o somehow copied into /etc/conf/pack.d/kbd? No, because that would give us multiply defined symbols, and there's no mention of that in the output.

How about a Driver.o from a different release, or from a backup made before patches were applied? Yes, that could cause this kind of error, and that was my first thought. Yet I know the local consultant pretty well, and that doesn't sound like something he would have done, even accidentally, so I gave up on that idea and decided that some needed driver was just not being linked into the kernel. Now to find it.

I picked a symbol from the list of errors and went looking for it like this:

cd /etc/conf/pack.d
for i in */Driver.o
do 
strings $i | grep esballoc && echo $i
done
 

Let me say right away: that's NOT the best way to look for symbols in a .o file, but I got lucky and "str" popped up as a match. I checked /etc/conf/sdevice.d/str, and it was marked N:

str     N       0       0       0       0       0       0       0       0
 

Now that's pretty odd. It shouldn't have been: "str" is the Streams driver and is necessary for just about everything on the network. I changed it to "Y" and tried the link again:

# ./link_unix

        The UNIX Operating System will now be rebuilt.
        This will take a few minutes.  Please wait.

        Root for this system build is /
undefined                       first referenced
 symbol                             in file
clnopen                             /var/opt/K/SCO/link/1.1.1Eb/etc/conf/pack.d/socket/Driver.o
i386ld fatal: Symbol referencing errors. No output written to unix
ERROR: Can not link-edit unix


idbuild: idmkunix had errors.
System build failed.
 

That's better: far fewer errors, but still no success. When you are linking a kernel, even one error is one too many. So I tried my loop again, but with clnopen this time:

cd /etc/conf/pack.d
for i in */Driver.o
do 
strings $i | grep clnopen && echo $i
done
 

This didn't work, though. It's not that "clnopen" isn't somewhere in one of those Driver.o files, it's that "strings" isn't good enough to find it. However, I had other weapons: I was dialed in to the customer, but was working from my own machine which happens to be the same OS release. On my machine, I have the Development System installed, and the Development System has "nm". So on my system I did this:

cd /etc/conf/pack.d
for i in */Driver.o
do 
nm $i | grep clnopen && echo $i
done
 

Bingo! The "clone" driver has "clnopen", and sure enough, it too was turned off in /etc/conf/sdevice.d (nobody knows how or why this happened, by the way). I turned it back on, and now the kernel linked successfully.
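
Since two drivers had been switched off, it is reasonable to wonder what else is. Here is a rough sketch that lists every entry marked "N" under an sdevice.d-style directory. `disabled_drivers` is a hypothetical helper; it assumes the flag is the second field, as in the "str" line shown earlier, and it skips lines starting with "#" or "*" on the assumption that those are comments.

```shell
# disabled_drivers: hypothetical helper. Prints the name of every entry in
# the given directory whose second field is "N", one name per line.
disabled_drivers() {
    for f in "$1"/*
    do
        [ -f "$f" ] || continue
        awk '$1 !~ /^[#*]/ && $2 == "N" { print $1 }' "$f"
    done | sort -u
}

# Example: disabled_drivers /etc/conf/sdevice.d
```

Plenty of drivers are legitimately turned off, so the list only means something when you compare it with a known good machine of the same release.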

If I had not had "nm", I could have done this:

cd /etc/conf/pack.d
for i in */Driver.o
do 
hd $i | grep clnopen && echo $i
done
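
One caution about grepping a hex dump: the dump shows 16 bytes per row, so a name that happens to straddle two rows can slip past grep. A variation that avoids this, sketched here with a made-up helper name `find_symbol`, is to use `tr` to break the raw file into printable runs and grep those instead. It assumes a `tr` that understands the `[:print:]` class and a `grep` that accepts `-F`, which may not be true of every old release.

```shell
# find_symbol: hypothetical helper. Prints each Driver.o under DIR whose raw
# bytes contain SYMBOL, whether that file defines the symbol or only
# refers to it.
find_symbol() {
    symbol=$1
    dir=$2
    for f in "$dir"/*/Driver.o
    do
        [ -f "$f" ] || continue
        # Replace every run of non-printable bytes with a single newline so
        # each embedded name ends up on a line of its own.
        if tr -cs '[:print:]' '\n' < "$f" | grep -F -- "$symbol" > /dev/null
        then
            echo "$f"
        fi
    done
}

# Example: find_symbol clnopen /etc/conf/pack.d
```

Like the loops above, this reports the drivers that merely use the symbol as well as the one that provides it, so expect more than one hit.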
 

As I said at the outset, if I had done a diff on the two sdevice files, this would have shown me:

60c60
< clone Y       1       0       0       0       0       0       0       0
---
> clone N       1       0       0       0       0       0       0       0
319c319
< str   Y       0       0       0       0       0       0       0       0
---
> str   N       0       0       0       0       0       0       0       0
 

The reason that works is that link_unix apparently doesn't replace sdevice until the link is successful (sdevice is built from the individual files in /etc/conf/sdevice.d). That's very helpful for this kind of error, because it immediately shows you what has changed since the last successful link.
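
On a system with hundreds of entries, raw diff output can be noisy. A small sketch like the following narrows it down to drivers whose Y/N flag changed. `changed_drivers` is a hypothetical helper, and it assumes one line per driver with the flag in the second field; drivers with several sdevice lines would need more care.

```shell
# changed_drivers: hypothetical helper. Takes the last good sdevice and the
# new one, and prints "name: old -> new" for each driver whose second
# field differs between them.
changed_drivers() {
    awk 'NF < 2 || $1 ~ /^[#*]/ { next }                # skip blanks and comments
         FILENAME == ARGV[1]   { old[$1] = $2; next }   # remember old flags
         ($1 in old) && old[$1] != $2 {
             print $1 ": " old[$1] " -> " $2
         }' "$1" "$2"
}

# Example (from /etc/conf/cf.d): changed_drivers sdevice sdevice.new
```

For the case described in this article it would print one line for clone and one for str, both going from Y to N.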

Other Linking Errors

Of course, there are other things that can go wrong. One I see now and then is where a new device has been partially installed or partially removed, and the kernel fails to link because enough of the driver is still there to confuse the build. In a case like this, you want to look in /etc/conf/cf.d/mdevice, and the offending device will probably be at the end of it. If you are not really sure, you can just comment out the line you think is the problem by putting a "#" at the beginning of the line; if the kernel then relinks, that was it. For example, here's the end of my mdevice; the e3H was the last thing I added to this machine:

vdsp    ocriI   ioc             vdsp    0       126     0       0       -1
vgic    ociI    ioc             vgic    0       127     1       1       -1
vkbd    ocwiI   ioc             vkbd    0       128     0       0       -1
vmouse  ociI    ioc             vmse    0       129     1       1       -1
vw      I       icS             vw      0       130     8       128     -1
net0    I       iSc             net0    0       131     1       256     -1
e3E     I       icSH            e3e     0       132     0       1       -1
ipl     Iocir   ico             ipl     0       133     1       1       -1
net1    -       iSc             net1    0       134     1       256     -1
e3H     I       icSH            e3H     0       135     0       1       -1
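
If you'd rather not hand-edit mdevice, the commenting-out step can be sketched like this. `comment_out_device` is a made-up helper that writes to standard output and never edits in place, so the original file stays untouched until you decide to replace it. It assumes an awk that supports `-v`.

```shell
# comment_out_device: hypothetical helper. Copies an mdevice-style file to
# standard output with the line for one device prefixed by "#".
comment_out_device() {
    awk -v dev="$1" '$1 == dev { print "#" $0; next } { print }' "$2"
}

# Example (from /etc/conf/cf.d), keeping a copy to restore from:
#   cp mdevice mdevice.save
#   comment_out_device e3H mdevice.save > mdevice
```

The match is on the exact name in the first column, so asking for e3H leaves the e3E line alone.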
 

Corruption

What about a corrupted driver? The errors you get will depend upon the nature of the corruption, but let's try some experiments (if you aren't comfortable and sure of yourself, don't try this on a working machine):

cd /etc/conf/pack.d/str
mv Driver.o Safe
date > Driver.o
cd /etc/conf/cf.d/
./link_unix
 

When I did this, I got a message saying that the file "Wed" (it happened to be Wednesday) couldn't be opened for input. Let's try something else:

cd /etc/conf/pack.d/str
cp /bin/ls Driver.o
cd /etc/conf/cf.d/
./link_unix
 

This time I got a message complaining that it couldn't open "file ELF". That would be a very definite sign of corruption: Driver files would always be "COFF".

To put everything back as it was:

cd /etc/conf/pack.d/str
rm Driver.o
mv Safe Driver.o
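
You can also check a suspect Driver.o before the linker ever sees it by looking at its first few bytes. This is a sketch under stated assumptions: `object_format` is a hypothetical helper, it relies on the POSIX-style `od -An -tx1 -N4` options (an older od may want different flags), and it recognizes only two signatures: the ELF magic (0x7f followed by "ELF") and the i386 COFF magic number 0x014c, which is stored low byte first.

```shell
# object_format: hypothetical helper. Prints ELF, COFF-i386 or unknown,
# judging only by the first bytes of the file.
object_format() {
    # First four bytes as a hex string with no spaces, e.g. 7f454c46.
    magic=$(od -An -tx1 -N4 "$1" 2>/dev/null | tr -d ' \n')
    case $magic in
        7f454c46) echo ELF ;;
        4c01*)    echo COFF-i386 ;;
        *)        echo unknown ;;
    esac
}

# Example: object_format /etc/conf/pack.d/str/Driver.o
```

A matching magic number doesn't prove the rest of the file is healthy, but a wrong one tells you right away that the file is not what the linker expects.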
 

I hope this gives you a little more confidence should you ever run into a broken kernel relink. Certainly other errors are possible, but these are the most common I've seen.




© Tony Lawrence


