APLawrence.com -  Resources for Unix and Linux Systems, Bloggers and the self-employed

clfree panic and no logins at console

© November 2003 Tony Lawrence

This is based on a real incident, though the facts have been simplified a bit to make it easier to follow.

"Type cd space slash e t c", I yelled.

Somewhere in Ohio, a slightly frazzled service tech questioned what he'd heard: "cd slash e c t?"

When doing Unix support with a Windows user, I always try to be very patient and very friendly. Having a lousy phone connection doesn't help this process: it's hard to sound friendly when you have to shout. But at least we were making progress. When the call started, the machine had crashed and was not coming up. Actually, that turned out to be not quite true, but it looked that way to the people on-site, and, given the circumstances, that was a reasonable conclusion.

From their point of view, here's what happened: all serial printers suddenly stopped working. At the same time, a "PANIC: clfree - Free block " message appeared on the console screen. Being unable to do anything else, they powered off, and when the system came back up, it of course needed to run fsck. They did that, but the system just "went dead" after the fsck - no logins.

I was a little confused at first, because they told me this was a SCO 5.0.5 system, but you shouldn't see that clfree panic after 3.2v4.2. It's possible to have a similar problem on 5.0.5, but the message would be "PANIC: HTFS: Free block freed on HTFS" normally. Either way, I had a good idea what part of the problem was. The solution would be simple: get to single user mode, run "fsck -b -s -y /dev/root" or "fsck -ofull", possibly update some patches and we'd be done. But it wasn't going to be that easy.

In fact, the scratchy voice at the other end told me that he'd been trying to get to single user mode, but without any luck. No matter what he did, the system would either panic again, or just "go dead" on him, necessitating a power cycle reboot. Hmm. That didn't sound good. Maybe missing some important files, like inittab? But no, as I had him read me what he saw on the screen, it was apparent TCP/IP was starting up. He had no logins on the console, and had no Digiboard connected terminals anyway, but I asked him to try to telnet in from his laptop, and to his surprise, he got in.

I knew what was wrong now: one of the rc scripts hadn't finished. Inittab is set to "wait" for the rc scripts to finish; if one does not finish, the gettys never start on the console or on the Digiboards. If you have TCP/IP, that will have started before these scripts, so you can telnet in.
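To see which entries init will block on, one could list the inittab lines whose action field is "wait". What follows is only a minimal sketch, assuming the usual System V four-field layout (id:runlevels:action:command); the function name wait_entries is hypothetical and is not something shipped with SCO.

```shell
#!/bin/sh
# wait_entries [FILE]
# Print the inittab entries whose third colon-separated field (the action)
# is "wait". init starts each of these commands and waits for it to exit
# before carrying on, which is why a hung rc script holds up the gettys.
# Assumes the System V layout id:runlevels:action:command.
wait_entries() {
    # Skip comment lines, then match on the action field only.
    awk -F: '$1 !~ /^#/ && $3 == "wait" { print }' "${1:-/etc/inittab}"
}
```

Called with no argument it reads /etc/inittab; pass another path to inspect a copy.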

But which script? If this had been 5.0.5, I would have looked at /etc/rc2, as described in "OpenServer 5.0.5, system hangs just before the login prompt when booting to multiuser mode". However, a "uname -X" told me that this was 3.2v4.2, as I had suspected. So, I had the tech do this:

cd /etc/rc2.d
ls -lut

(See Troubleshooting for the why behind that).
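In short, "-u" makes ls use each file's last access time instead of its modification time, and "-t" sorts newest first, so a script that was read most recently floats to the top. A minimal sketch of that check follows; the function name last_accessed is hypothetical, and it assumes the filesystem really records access times (a mount option such as noatime on a modern system would defeat it).

```shell
#!/bin/sh
# last_accessed [DIR]
# Print the name of the most recently accessed entry in DIR.
# -u selects access time instead of modification time; -t sorts newest first.
# Without -l there is no leading "total" line, so the first line of output
# is the newest entry.
last_accessed() {
    ls -ut "${1:-.}" | head -n 1
}
```

For example, "last_accessed /etc/rc2.d" prints a single file name rather than the full long listing.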

I asked him if all the dates he could see were the same. He said most were, but the last was dated several days ago. That told me that this script had never been reached, because some other script was hung. I then had him do

ls -lut | head -2

That told me that the LAST script executed was S88USERDEF. Taking an educated guess, I had him immediately do:

cd /etc/rc.d/8
ls -lut

and asked again if the dates were all the same. He said (as I expected) that there were only three files, "pcu", "digscr", and "userdef", and that "userdef" had an old date on it. I asked if "pcu" was the first file listed, and he said it was. I asked him to look at it with "more", and he said it was "gibberish". That shouldn't be: that is a text script that is part of the Digi initialization. I asked him to edit it and put an "exit 0" as the first line, and then to type "reboot".
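The two steps here (checking whether a script is still text, then neutralizing it with a leading "exit 0") can be sketched in shell. This is an illustration under stated assumptions, not what was typed over the phone: the function names looks_like_text and disable_script are hypothetical, and the backup directory is assumed to be somewhere the rc machinery does not scan.

```shell
#!/bin/sh
# looks_like_text FILE
# Succeed if FILE contains only printable characters and whitespace.
# A healthy rc script should pass; one overwritten with binary junk should not.
looks_like_text() {
    # Delete everything printable or whitespace and count what is left over.
    junk=$(LC_ALL=C tr -d '[:print:][:space:]' < "$1" | wc -c | tr -d ' ')
    [ "$junk" -eq 0 ]
}

# disable_script FILE BACKUP_DIR
# Keep a copy of FILE in BACKUP_DIR, then rewrite FILE so that its first
# line is "exit 0". The shell exits on that line and never reaches the
# damaged remainder, so whatever runs the script can carry on.
disable_script() {
    backup="$2/$(basename "$1").bad"
    cp -p "$1" "$backup" || return 1
    { echo 'exit 0'; cat "$backup"; } > "$1"
}
```

Keeping the damaged copy is a design choice: it leaves something to compare against a fresh copy of the vendor's file later.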

This time, the system happily came to a "Control-D" prompt. I had him put in the root password, and run "fsck -b -s -y /dev/root". That had to clear a lot of files, but I could tell from the modes that these were temporary files and named pipes, so I wasn't too concerned. After fsck finished, we went multi-user, and everything appeared to be working, except that none of the Digiboard printers worked. That didn't particularly surprise me, as we had short-circuited an initialization file and may have had a defective board anyway. I asked the tech if he knew about "mpi" to run Digi's diags, and he was already familiar with that, so at this point I left him, suggesting that he should at least try downloading new Digi software, but that a better idea would be to put the printers on a print server and eliminate all need for serial ports. He agreed that was a good idea.

I do not know what caused the problem with the Digiboard file. I did suggest that this anomaly and the file system corruption might be an indication that the hard drives or memory could be failing, and cautioned that he should be religious about backups and consider an upgrade to new hardware as soon as possible. He assured me that was already planned.

The combination of the clfree panic and the /etc/rc.d/8 hanging made this a more difficult problem than it otherwise would have been.


We used to get a lot of "PANIC: HTFS: Free block freed on HTFS" errors on our 5.0.5 machine, until I applied: (SLS) OSS647A, the Sdsk Supplement for that machine. I also had to apply the proper IBM ipsraid driver, and the machine has been fine since. After a couple of 3AM pages by our night supervisor, and many 'fsck's' later, I'm glad that those patches fixed the problem with that machine!

- Bruce Garlock
