Always try a cold boot

A Linux customer called asking how to shut down his server. He said he had a network problem and wanted to reboot. I gave him the command and asked what the problem was.

He said his switch had "gone nuts" and that he had replaced it but the server still couldn't be reached. I asked him if there were lights at the port where the server was plugged in; there were not. He moved the wire directly to the router; still no response. The server was coming back up by now but was hanging trying to reach his NTP server - it wasn't looking good.

I had him log in and check "ifconfig eth0" and "netstat -rn". All was correct. He changed out the cable; still no lights.
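
The situation above (correct address and routes, but no link lights) can also be confirmed from the kernel's side. Here is a minimal sketch, assuming a Linux kernel that exposes `operstate` and `carrier` under /sys/class/net/<interface>. The function name `link_state` and its `sysfs_root` parameter are hypothetical, made up for illustration; they are not something used on the call.

```python
from pathlib import Path


def link_state(iface, sysfs_root="/sys/class/net"):
    """Return (operstate, carrier) for iface; None for any value that can't be read.

    Hypothetical helper: assumes the Linux sysfs layout /sys/class/net/<iface>/.
    """
    base = Path(sysfs_root) / iface

    def read(name):
        try:
            return (base / name).read_text().strip()
        except OSError:
            # Missing interface, or the kernel refusing the read
            # (carrier is typically unreadable while the interface is down).
            return None

    return read("operstate"), read("carrier")


if __name__ == "__main__":
    state, carrier = link_state("eth0")
    print(f"eth0: operstate={state} carrier={carrier}")
    if carrier != "1":
        print("No link detected: check cable and switch port, then try a full power-off.")
```

A carrier value of "1" means the NIC sees a link partner. Anything else, with a known-good cable and port, points at the card itself.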

"Guess it's time to call Dell - I have 4 hour hardware response", he said.

I said I'd like to try one more thing. I had him do an "init 0", power off the machine, and unplug it. I asked him to wait at least two minutes and then plug it back in and power it on. He said he'd call me back.

When he did, he reported the machine working. He seemed surprised, but I wasn't. It's always worthwhile to completely power off: registers that might be in an "impossible" state get reset. He had said the switch had "gone nuts". I assumed that meant flashing lights, ports going on and off, or worse. That malfunction could have sent almost anything down the wire, confusing the NIC in the server. It could have blown it out too, but that wasn't the case here.

Always try a cold boot. It can't make things any worse and sometimes it brings things back to life.

Wed Apr 1 02:50:38 2009: 5940   MikeHostetler

I've hit this recently while configuring a Stallion Serial Board on SCO. You had to actually power off the board (unplug it from the machine) to get it to reset. And a similar thing with getting a mouse in an AUX port to be recognized -- a cold boot!

Sometimes all this new-fangled technology needs the old school touch.

