APLawrence.com -  Resources for Unix and Linux Systems, Bloggers and the self-employed

Debugging a network failure

Some material is very old and may be incorrect today

© August 2006 Anthony Lawrence

As in much of the U.S. right now, we are experiencing extreme heat in New England. That means high electrical loads from air conditioning, and it also means late afternoon thunderstorms with lightning.

This morning I had a call from a Boston customer. "Everything's down", he said, "Internet and our servers. We can't get to anything. But.. the guys who came in early are still working."

That last bit told me that the servers weren't down. Most likely machines weren't getting IP addresses from the router. People who had come in early had been able to get addresses and still had them. Latecomers hadn't. So.. dead router? Maybe. We tried power cycling it: no change. The lights on it looked right, but he still couldn't get on the network.
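
The difference between the early arrivals and the latecomers can be seen in the address each machine holds. What follows is only a rough sketch and was not part of the original call. It assumes the usual Windows behavior of falling back to a self-assigned 169.254.x.x address when no DHCP server answers; the function name, host names and sample addresses are made up for illustration.

```python
import ipaddress

def lease_status(address):
    """Classify an IPv4 address string as read from ipconfig output."""
    if not address:
        return "no address"
    ip = ipaddress.ip_address(address)
    if ip.is_unspecified:
        # 0.0.0.0: the interface has nothing assigned
        return "no address"
    if ip.is_link_local:
        # 169.254.0.0/16: self-assigned because no DHCP server answered
        return "self-assigned (no DHCP answer)"
    return "has an address"

# Hypothetical machines: one that came in early, two that came in late
for host, addr in [("early-pc", "192.168.1.23"),
                   ("late-pc", "169.254.17.4"),
                   ("other-pc", "0.0.0.0")]:
    print(host, lease_status(addr))
```

A machine reported as "has an address" may of course hold a static address rather than a lease; the check only separates machines that have something usable from those that don't.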

I had him walk back to his office and try "ipconfig /renew" from a DOS window. No response; it just hung. As I couldn't reach his router remotely, a dead router was certainly a possibility. But we needed to be sure.

I asked him if he had a laptop in the building; he didn't. So, if a computer can't be brought to the router, we'll pick up the router and bring it to a computer. "Doesn't it need to be connected to the T1?", he asked. Well, sure, for Internet access, but not to hand out IP addresses. So I had him unplug the router and carry it to his office, unplug his machine from the wall, plug that wire into a LAN port on the router, and do the "ipconfig /renew" again. Bingo - he had an IP address. The router was not dead, at least not on the LAN side. So I had him bring it back and plug it back in where it belonged. Just to be sure, I had him walk back to his office and do "ipconfig /renew" again. No luck.

Ok, we have a wiring problem or a switch port problem. I knew the wiring was new and I had tested it myself, so I doubted that. But sometimes mice will chew wires, so it might have to be checked. Before trying that, I asked him to trace the wire from the LAN side of the router to his office switch. He had free ports there, so I had him move the wire to a different port. I sent him walking back to his office (poor guy was getting a lot of exercise this morning) to try "ipconfig /all". This time he had an address, telling us that a dead port on his switch was the problem. I had him reboot and try to access his server. That now worked.

But the Internet still didn't, and I still couldn't access his router. That was no surprise: a dead port on the LAN side of his system wouldn't have prevented me from accessing his router over the Internet, so fixing that port couldn't have restored my access either. Something else was wrong. Back to look at the router.
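
A remote "can I reach the router?" test can be as simple as attempting a TCP connection. The article doesn't say how the router was accessed remotely, so the sketch below assumes some TCP management port is open on the WAN side. The helper name is invented, and the address and port in the usage comment are placeholders.

```python
import socket

def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name lookup failures
        return False

# Hypothetical usage (192.0.2.1 is a documentation-only address):
#   can_connect("192.0.2.1", 443)
```

A False result only says the path is broken somewhere between the tester and the router. It doesn't say where, which is why the on-site checks are still needed.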

The T1 unit plugs into a five port switch and one wire runs from that to the WAN side of the router. The other ports are for other devices with public IP addresses, but those are still on an older DSL line. As soon as whoever handles those devices gets their act together and reconfigures them, they can plug into this little switch too, but right now the other ports are empty. I asked him to pull both wires out of the little switch and put them back into unused ports. Ayup, instant success: I could access his router.

So, what happened? Probably a power surge of some kind early this morning. Did he have surge protectors? Yes, but.. some were old. And who knows if a surge protector will really work anyway? If only a small spike gets through, that may not bother some equipment at all but could kill other equipment dead. Or it might just temporarily confuse it: unplug the switch for ten seconds and it might be fine. But.. it might also be weakened by the experience, so considering the cost, I recommend just replacing anything suspect. In this case, he needs new switches anyway and it wouldn't hurt to have an electrician look over the wiring rat's nest and improve it.

This kind of debugging and fault resolution is no different from any other: test, isolate, test, repeat. It can be easier if you have a laptop and spare equipment, but mostly it's just old-fashioned logic.
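
The "test, isolate, test, repeat" loop can be written down as an ordered list of checks, working outward from the client, where the first failure marks the segment to look at. This is only an illustrative sketch: the function name is invented, and the check results are hard-coded to mirror the first fault in the story rather than measured.

```python
def first_failure(checks):
    """Run checks in order and return the name of the first one that fails.

    Returns None if every check passes.
    """
    for name, check in checks:
        if not check():
            return name
    return None

# Hypothetical results matching the first fault described above
checks = [
    ("PC gets an address when wired straight to the router", lambda: True),
    ("PC gets an address through the office switch port", lambda: False),
    ("PC can reach the server", lambda: True),
]
print(first_failure(checks))
```

Stopping at the first failure is deliberate. As in the story, there may be more than one fault, so after fixing one you run the whole list again.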

Good time to review your electrical systems, isn't it?



Wed Aug 2 17:12:32 2006: 2351   TonyLawrence

Another Boston customer just needed a router reset.. they have two routers side by side, one needed resetting, the other didn't.

Wed Aug 2 20:19:10 2006: 2354   BigDumbDInosaur

Why on earth isn't this hardware attached to a high quality UPS? If it's that important to their operation it should be powered by something more than just a surge protector (most of which don't do much in the way of stopping the truly damaging stuff).

Aside from that little rant, this article is a classic example of proper troubleshooting technique. Of course, it helps if the guy/gal at the other end of the phone line can follow instructions and doesn't mind hoofing around the office. <Grin>

Wed Aug 2 20:22:36 2006: 2355   TonyLawrence

They are UPS/surge protectors.

