APLawrence.com -  Resources for Unix and Linux Systems, Bloggers and the self-employed

A non-technical guide to understanding and fixing TCP/IP problems on a network


© August 2011 Anthony Lawrence

Obviously the title is a bit incorrect in the "non-technical" claim, because we are dealing with a geekish subject here and I can't avoid being a little "techie". Still, my intent is to make it possible for a non-technical person to understand this and perhaps even identify and fix common networking problems.

By the way, your network is obviously working if you are reading this. If it were not working, you wouldn't be able to read it, would you? You might want to print this out while you can. Just a thought. The site's "Printer Friendly" link is the best way to do that.

Command Line

I apologize to those who cringe at the very thought of it, but this is going to sometimes require working at the command line. For Windows, that's the DOS command program accessed by finding it in your menus or doing Start->Run and typing "cmd" (Vista and Win 7, hold the "Windows key" and hit "R" instead of Start->Run). For Mac OS X, you need Terminal. Hit Apple-space and type Terminal to find it.

Linux users, stop laughing because I had to spell all that out.

Reaching the Internet vs. reaching local machines

I'm going to cover two main issues here: problems in your local network and problems reaching the Internet or other networks. I'll do the latter first because even home users with one machine can have trouble reaching the Internet. The second part is about local network issues - printers, file sharing and so on. You'll find that in the Local Network Problems section below.

I'll also cover VPN problems very briefly.

Quick checks

If you are having trouble right now, you should first get to a command prompt (see above for how) and type "ping 8.8.8.8". If that works (if it doesn't hang or say "0 packets received"), skip directly to the section on DNS Problems and start reading there. If it does not work, keep reading here.

Can't connect to Internet

The first thing to understand about getting to the Internet is that (assuming you aren't using dialup) a router is always involved. You may not realize that and support people may refer to your "modem", but somewhere, whatever you do is passing through a router.

That router has an IP address. That address is your gateway to the Internet. You will see the terms "default route" and "gateway" used; in this context they mean the same thing.

The next important thing to understand is that your computer itself needs an IP address also and it has to be on the same "subnet" as the gateway address. We simply have to get a little geeky here to explain it all, but there is one quick thing you can do that could tell you that you definitely have a subnetting problem.

What is your computer's IP address?

First, you need to find out what your computer is using as its IP address. Before you can do that, you need to know what interface you are using. Do not panic! All I mean is that you need to know if you are trying to make a wireless connection or are going through a plugged in cable. That much is easy, right?

Now let's find it in your computer.

Linux

Type "ifconfig -a" at the command prompt and look for your interface. That might be "eth0" or "eth1" for a wired connection. The result might look something like this (I bolded the important stuff but it will not be bold when you do it):

eth0      Link encap:Ethernet  HWaddr 00:C0:F0:77:FD:AD
          inet addr:192.168.11.1  Bcast:192.168.11.255  Mask:255.255.255.0
          inet6 addr: fe80::2c0:f0ff:fe77:fdad/64 Scope:Link
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:913240 errors:230 dropped:0 overruns:0 frame:230
          TX packets:663990 errors:7 dropped:0 overruns:0 carrier:12
          collisions:0 txqueuelen:1000
          RX bytes:179148797 (170.8 MiB)  TX bytes:53220450 (50.7 MiB)
          Interrupt:9 Base address:0xb000

lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
          inet6 addr: ::1/128 Scope:Host
          UP LOOPBACK RUNNING  MTU:16436  Metric:1
          RX packets:9814 errors:0 dropped:0 overruns:0 frame:0
          TX packets:9814 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:0
          RX bytes:3655065 (3.4 MiB)  TX bytes:3655065 (3.4 MiB)
 

You ignore "lo". The IP of this machine is the "inet addr" value in the eth0 section (192.168.11.1 in this sample) and its MAC address (more on that later) is the "HWaddr" value, 00:C0:F0:77:FD:AD (which also tells me that the card is a Kingston Technology card - I know that by looking it up in a table that translates the first 3 parts of it.)
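If you are curious how that "first 3 parts" lookup works, here is a minimal Python sketch. It is only an illustration: the KNOWN_PREFIXES table and the function names vendor_prefix and lookup_vendor are made up for this example, and the single table entry is the one mentioned above. A real lookup would use the full published vendor list.

```python
# Hypothetical, tiny vendor table; a real lookup would use the full vendor list.
KNOWN_PREFIXES = {
    "00:C0:F0": "Kingston Technology",  # the card from the sample output above
}


def vendor_prefix(mac):
    """Return the first three parts of a MAC address as 'XX:XX:XX'."""
    # Accept both 00:C0:F0:... (Linux/Mac) and 00-C0-F0-... (Windows) styles.
    parts = mac.replace("-", ":").split(":")
    if len(parts) != 6:
        raise ValueError("expected six parts, got %r" % mac)
    # Pad single digits so that '0:16:cb' and '00:16:cb' compare equal.
    return ":".join(p.upper().zfill(2) for p in parts[:3])


def lookup_vendor(mac):
    """Look the prefix up in the table; None means it is not in our small table."""
    return KNOWN_PREFIXES.get(vendor_prefix(mac))


print(lookup_vendor("00:C0:F0:77:FD:AD"))
```

The padding step is there because some systems print MAC addresses without leading zeros.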

To get the gateway, type "netstat -rn":

Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
0.0.0.0         192.168.11.1      0.0.0.0         UG        0 0          0 eth0
192.168.11.0     0.0.0.0         255.255.255.0   U         0 0          0 eth0
 

The gateway (default route) is 192.168.11.1.
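If you would rather have a script pick the gateway out of that output, something like the following Python sketch could do it. The function name default_gateway is my own invention for this example, and it assumes the output looks like the samples in this article (Linux shows the default route as 0.0.0.0, Mac OS X shows it as "default"). Other systems may format things differently.

```python
def default_gateway(netstat_output):
    """Return the gateway from 'netstat -rn' text, or None if none is found."""
    for line in netstat_output.splitlines():
        fields = line.split()
        # Linux lists the default route as 0.0.0.0, Mac OS X as "default".
        if len(fields) >= 2 and fields[0] in ("0.0.0.0", "default"):
            return fields[1]
    return None


sample = """\
Kernel IP routing table
Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
0.0.0.0         192.168.11.1    0.0.0.0         UG        0 0          0 eth0
192.168.11.0    0.0.0.0         255.255.255.0   U         0 0          0 eth0
"""
print(default_gateway(sample))
```

It simply returns the first default route it sees, which is enough for a simple network with one gateway.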

Mac OS X

You could also use "ifconfig -a" on Mac, but if you are command line phobic, the information is available from the Network section of System Preferences.

lo0: flags=8049<UP,LOOPBACK,RUNNING,MULTICAST> mtu 16384
	inet6 ::1 prefixlen 128 
	inet6 fe80::1%lo0 prefixlen 64 scopeid 0x1 
	inet 127.0.0.1 netmask 0xff000000 
gif0: flags=8010<POINTOPOINT,MULTICAST> mtu 1280
stf0: flags=0<> mtu 1280
en0: flags=8863<UP,BROADCAST,SMART,RUNNING,SIMPLEX,MULTICAST> mtu 1452
	ether 00:16:cb:8d:48:f7 
	inet 192.168.11.2 netmask 0xffffff00 broadcast 192.168.11.255
	media: 100baseTX <full-duplex,flow-control>
	status: active

(Screenshot: Mac preferences panel - Network)

Note that in the Mac Preferences, the gateway is shown as "Router".

If you are a recent Mac convert from Windows, you might have gotten a surprise if you typed "ipconfig". OS X has a command of that name and it is useful, but it works a bit differently.

To get the gateway from Terminal, type "netstat -rn". Macs spit out a lot more stuff, but what you want is right at the top:

Routing tables

Internet:
Destination        Gateway            Flags        Refs      Use   Netif Expire
default            192.168.11.1       UGSc           14        0     en0
127                127.0.0.1          UCS             0        0     lo0
127.0.0.1          127.0.0.1          UH              8    27380     lo0
169.254            link#4             UCS             0        0     en0
 

Windows

In the DOS command window, you can use "ipconfig" and "ipconfig /all". On some Windows versions, you can run "ipconfig" from Start->Run. You can also find what you need in Control Panel, Networking. Find the interface you are using and right click on it. Choose Status and then click on the Support tab. Click Details to find all the information you need, including the gateway.

(Screenshot: Windows ipconfig - Network)
(Screenshot: Windows Network Status)

By the way, if Windows tells you that the network cable is unplugged, that's the problem. If you can see that it really is not, you could have a defective cable, a defective network card or a defective router.

We've seen IPv6 cause problems on IPv4 networks. If you don't need it, disable it.

It can't hurt to do these Windows commands (see "Command line" above):

nbtstat -RR
netsh interface ip delete arpcache
ipconfig /flushdns
ipconfig /registerdns
ipconfig /release
ipconfig /renew 
 

I've seen many Windows machines mistakenly configured to use a Proxy Server. Most Windows machines should NOT need that.

See "Proxy Server" for where to shut that off. Also see "LAN Connection settings keep changing back to proxy server after restart" if that keeps happening.

Ready for the techie stuff?

What, you thought that was the techie stuff? Ok, maybe it was, but we have to do a little more.

What you need to do is compare the computer IP address to the gateway address. First thing, they cannot be the same. Every address on a network (a "subnet", which may also be referred to as a "broadcast domain" in this context), has to be unique. No two devices can have the same IP address.

The second thing is that they have to be in the same "subnet". This part can get very techie, but fortunately most simple networks will almost never have a problem with this part. I'll run through it just in case, but you probably don't have to worry about this.

Your subnet is determined by the "subnet mask". In the examples above, that was shown as either 255.255.255.0 or ffffff00 (they mean the same thing).
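To see why those two forms of the mask mean the same thing, here is a small Python sketch that turns the Mac-style hexadecimal mask into the dotted form. The function name hex_mask_to_dotted is just something I made up for this example.

```python
import ipaddress


def hex_mask_to_dotted(hex_mask):
    """Convert a hex mask such as '0xffffff00' to dotted form."""
    # int() with base 16 accepts the text with or without the '0x' prefix.
    value = int(hex_mask, 16)
    return str(ipaddress.IPv4Address(value))


print(hex_mask_to_dotted("0xffffff00"))  # 255.255.255.0
print(hex_mask_to_dotted("ffffff00"))    # 255.255.255.0
```

Each pair of hex digits is one of the four numbers: "ff" is 255 and "00" is 0.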

If you are part of a big network, you may see other numbers in the mask that are not 255. Figuring out which parts have to be the same gets more complicated. I do have other articles here that explain all that.

This is important because, although the numbers cannot be the same, they have to be partially the same. How similar they have to be depends on that "subnet mask".

In this case, we know that the first three parts have to be the same because there are three 255's in the mask. So if our computer address is 192.168.11.2, the gateway MUST begin with 192.168.11 and the remaining number must not be 2.

You don't need to understand subnetting and masks right now. Just realize that this part of the address has to be the same.
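If you want the computer to do the comparison for you, Python's standard "ipaddress" module can apply the mask. This is only a sketch of the two rules described above (not identical, but in the same subnet); the function names same_subnet and usable_pair are my own and the addresses are the examples from this article.

```python
import ipaddress


def same_subnet(address_a, address_b, mask):
    """True if both addresses fall in the same network for the given mask."""
    # strict=False lets us pass a host address and have the network worked out.
    net_a = ipaddress.ip_network("%s/%s" % (address_a, mask), strict=False)
    net_b = ipaddress.ip_network("%s/%s" % (address_b, mask), strict=False)
    return net_a == net_b


def usable_pair(computer, gateway, mask):
    """The two rules from the text: not identical, but in the same subnet."""
    return computer != gateway and same_subnet(computer, gateway, mask)


print(usable_pair("192.168.11.2", "192.168.11.1", "255.255.255.0"))  # True
print(usable_pair("192.168.11.2", "192.168.2.1", "255.255.255.0"))   # False
```

The same functions work for the less common masks that are not all 255's and 0's, which is where doing it by eye gets harder.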

So, if you see that your gateway is (for example) 192.168.2.1 and your computer has an address that begins with 169, you will definitely not be able to access the Internet.

If you do see an address that begins with 169 (specifically 169.254), that's usually a self-assigned address. It usually means that the device has been configured to ask a DHCP server for an address, but did not get one. That could be that it can't "see" the DHCP server or that the server has no more addresses to hand out.

What to do? You can try "repairing" the connection. From the Windows command line, "ipconfig /renew" can do that, or just click on the Repair button in that Status window.
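Strictly speaking, the self-assigned range is 169.254.x.x rather than everything that starts with 169. Python's "ipaddress" module knows about that range, so a quick check could look like this sketch (the function name is_self_assigned is made up for this example):

```python
import ipaddress


def is_self_assigned(address):
    """True for link-local addresses, which for IPv4 means 169.254.x.x."""
    return ipaddress.ip_address(address).is_link_local


print(is_self_assigned("169.254.10.20"))  # True
print(is_self_assigned("192.168.2.1"))    # False
```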

Mac has a similar option under the "Advanced" tab - it's "Renew DHCP Lease". Linux has different methods in different distributions, but "ifdown eth0; ifup eth0" is generic.

You could also assign a manual, "static" address. That is a little more geeky, yes. I'm not going to cover that here in depth, but basically you need to pick an address that is not the same as any other local machine and that is in the proper subnet for the router and the rest of the network. See Networks 101.

It don't mean a thing if it ain't got that ping

(Dirk Hart, a fellow geek type, said that)

Assuming that the addresses seem reasonable, you should attempt to ping the router. If the gateway is 192.168.2.1, you'd type (at a command prompt)

ping 192.168.2.1
 

Windows will stop after 4 pings, Mac and Linux will keep going until you interrupt. That interruption might be Control-C. If that doesn't work, Control-\ should, but if that won't do it either, you may have to close the command or Terminal Window.
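If you want Mac and Linux to stop by themselves the way Windows does, you can ask ping for a fixed number of packets. The flag differs: Windows uses "-n" and Linux and Mac use "-c". Here is a rough Python sketch that picks the right one. The function names ping_command and can_ping are my own, and treating a zero exit code as "it worked" is an assumption that does not hold in every case (Windows in particular can report success when all it got back was an "unreachable" message).

```python
import platform
import subprocess


def ping_command(host, count=4, system=None):
    """Build a ping command that stops by itself after 'count' packets."""
    system = system or platform.system()
    # Windows ping uses -n for the count; Linux and Mac OS X use -c.
    flag = "-n" if system == "Windows" else "-c"
    return ["ping", flag, str(count), host]


def can_ping(host):
    """Run ping and report whether it exited successfully (a rough check only)."""
    result = subprocess.run(ping_command(host), capture_output=True)
    return result.returncode == 0


print(ping_command("192.168.2.1", system="Linux"))
```

Calling can_ping("192.168.2.1") would then run the real command against the example gateway used above.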

You want to see responses from the gateway. Unfortunately, that gateway may not be configured to respond to pings. It's good to know that ahead of time, so ideally you would have tried pinging when everything was working normally and would know whether it should or should not respond. Most routers are configured to respond.

If it does respond, that's good, because it means that stuff is at least partially working.

If it does not and you know that it should, you don't need to check anything else. Something is broken - your computer or the router or the cables or switches. If you have other devices, you could go check Local Network Problems below to do some additional diagnostics. If not, there are things you might try, such as replacing the network card or updating its drivers.

If there is a switch (you might think of it as a "hub", but it is most likely a switch) where the cable from your computer and the router are both plugged in, try moving your computer's cable to a different port. The switch might be built into your modem; you can still try moving the cable. Pay attention to lights on the switch or modem - when you unplug, one light should turn off.

In large companies, it might matter where you plug in. In a home network or a small office, any port on the switch is as good as any other.

Failing that, you want to talk to the people who provide the router. No doubt they will want to lead you through everything we have already done. That's ok, right?

Into the wild

If we can ping the gateway, we now need to try to go beyond that. We need an IP address that exists on the Internet.

A fairly easy one to remember is 8.8.8.8, which happens to be one of Google's DNS servers and should always be available. Google's other DNS is 8.8.4.4. Some other numbers that should be reachable are 208.67.222.222, 208.67.220.220, 208.67.222.220, and 208.67.220.222. Those are DNS servers at OpenDNS.

If you cannot ping any of these, your router may be bad. The problem could be farther out on the Internet, though. We can use another tool to check that.

Traceroute

Before we get to tracing, consider what that ping of 8.8.8.8 told us if it worked. If you could do that, we immediately knew that your computer, your router and all local wiring are good. It saved a lot of checking. That's why I had you try that before you understood why you were doing it.

But right here, that didn't work. We were able to ping the router on its "inside" address, but we can't get beyond. Is it the router? Traceroute will help us find out.

Windows users: your command will be "tracert". Linux and Mac use "traceroute". You'll type "traceroute 8.8.8.8" (or "tracert 8.8.8.8" on Windows).

There are websites that can do traceroutes for you. That's nice, but if the Internet isn't working, you can't get to those, can you? So what's the point? Well, sometimes PART of the Internet works, but other parts do not. That usually means that something is broken somewhere, but not right here. Somebody cut a cable in another state or a router lost power somewhere - traceroute can help find that.

Here's an example trace to 8.8.8.8. I took out some of the intermediate results to save space.

traceroute to 8.8.8.8 (8.8.8.8), 64 hops max, 40 byte packets
1 router (192.168.11.1) 0.997 ms 0.514 ms 0.442 ms
2 l100.bstnma-vfttp-118.verizon-gni.net (71.162.121.1) 5.584 ms 4.609 ms 4.923 ms
3 11-0-9-1718.bstnma-lcr-07.verizon-gni.net (130.81.133.154) 7.504 ms 7.224 ms 7.373 ms
4 so-0-3-0-0.bos-bb-rtr1.verizon-gni.net (130.81.29.252) 7.611 ms 7.029 ms 7.541 ms
...
15 216.239.49.145 (216.239.49.145) 59.689 ms
72.14.232.25 (72.14.232.25) 51.970 ms
72.14.232.21 (72.14.232.21) 47.149 ms
16 google-public-dns-a.google.com (8.8.8.8) 47.146 ms 46.706 ms 47.403 ms
 

This completed after 16 hops, reaching 8.8.8.8. If it had failed to reach even the first hop, that could mean either you or your ISP has a problem. If it fails somewhere in the middle, someone else has a problem.

Traceroute isn't an absolute indicator because it can fail when other things would work, but the diagnostic output can show you where a problem is NOT, and that alone is very helpful. If you know that the problem is out at hop 12, calling your local ISP probably is not going to get it fixed, but if it is hop 2, they likely are at fault. Check Wikipedia for more on traceroute.

DNS Problems

Everything has worked up to this point. You were able to ping 8.8.8.8 which means your computer network is working. However, you still cannot browse the Internet. Why?

It could be DNS. You test by pinging Internet names: ping yahoo.com, for example. Just as you did when pinging numbers, try a few. Ping ibm.com and google.com. If none of those works (and the numeric pings did work), you are having DNS problems.

DNS is what translates names to numbers and numbers are what the Internet really uses. Your computer has been configured to use some DNS server and apparently that server is not doing its job.
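The reasoning here (numbers work, names do not, so it must be DNS) can be written down as a short Python sketch. The function names resolves and diagnose are made up for this example, and the "resolver" argument exists only so the logic can be tried out without a working network. By default it uses the lookup that your system is already configured with.

```python
import socket


def resolves(name, resolver=socket.gethostbyname):
    """Return the IP address for 'name', or None if the lookup fails."""
    try:
        return resolver(name)
    except OSError:  # lookup failures (socket.gaierror) are a kind of OSError
        return None


def diagnose(numeric_ping_ok, names, resolver=socket.gethostbyname):
    """Combine the numeric ping result with name lookups, as described above."""
    if not numeric_ping_ok:
        return "not DNS: the network itself is not working"
    # Try a few names, just as with the pings; one success is enough.
    if any(resolves(name, resolver) for name in names):
        return "DNS is working"
    return "DNS problem: numbers work but names do not"
```

For example, diagnose(True, ["yahoo.com", "ibm.com", "google.com"]) would report a DNS problem only if none of the three names could be looked up.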

Do you remember that I said that 8.8.8.8 and the other numbers I gave you are DNS servers?

You can plug one or more of those into your network configuration and possibly fix your issue instantly. For Linux, you'd edit /etc/resolv.conf and replace the "nameserver" line. With Mac OS X, you'd click on the interface in System Preferences->Network and change the DNS Server boxes. For Windows, you'd right click on the interface, and choose "Properties". That brings up a scary box with a lot of stuff in it. One of them says something about TCP/IP. Double click on that and you'll see the places where you can specify DNS servers.

(Screenshot: Windows DNS settings)

Still does not work

We have DNS working but you still can't browse. Your problem very often is just that your browser is confused. This is unfortunately very common with Internet Explorer, which is why I recommend having Chrome or Safari or some other browser available so that you can quickly see if it is just the browser. If Safari works and IE does not, you know the issue is fully within IE.

That won't help if things don't work now, of course. If IE is not working, you can't go get Chrome or Safari.

So what to do? Sometimes just quitting and restarting will fix it. Sometimes clearing the cache and deleting temporary Internet files will clear it up. Other times you need to get into more geeky stuff. Might I suggest you just use a better browser to start with?

I have found that clearing "Automatically detect settings", quitting and restarting will "wake up" IE. Sometimes just the opposite is true - telling it to "Automatically detect" fixes it.

(Screenshot: Windows IE settings)

Still broken? Try Googling (using a browser that works - maybe you have to go to a neighbor) for "fix Internet Explorer". You'll find enough to keep you busy, I promise.

VPN Problems

Most VPN issues are far too geekish to cover here. However, there is one simple "gotcha" issue I see frequently.

You've been told that you can connect to the VPN server at your workplace. You have been given an account, a password and full instructions. Unfortunately, it doesn't work. It can be as simple as you and your work using the same subnet. That is, if your scheme at home is 192.168.2.x and the internal network at work uses the same scheme, that VPN cannot work.

Changing your whole home network to a different scheme might involve nothing more than reprogramming your local router, but it's all a bit too much to cover here.
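If you know both schemes, Python's "ipaddress" module can tell you whether they collide. This is only a sketch: the function name networks_collide is my own, and the 10.20.0.0/16 work network is an invented example. The "/24" is just another way of writing a 255.255.255.0 mask.

```python
import ipaddress


def networks_collide(home, work):
    """True if the home and work networks share any addresses."""
    home_net = ipaddress.ip_network(home, strict=False)
    work_net = ipaddress.ip_network(work, strict=False)
    return home_net.overlaps(work_net)


# Same 192.168.2.x scheme at both ends: the VPN cannot sort out which is which.
print(networks_collide("192.168.2.0/24", "192.168.2.0/24"))  # True
# Hypothetical work network on a different scheme: no collision.
print(networks_collide("192.168.2.0/24", "10.20.0.0/16"))    # False
```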

Another possible VPN issue is whether or not you are using the remote system for your internet access. That can be slower and can be routing you through a proxy server. However, it may also be the only way to get access to things you might need.

A silly Windows thing

I'm not sure what causes this, but my neighbor's computer keeps getting the "Use a proxy server" box checked (Control Panel, Internet Options, Connections, LAN Settings). If you are supposed to use a proxy server, that would be correct, but most of us are not. Having that checked breaks her browsing; unchecking it returns it to working order instantly.

Most home users will NOT have a proxy server, so that should usually be unchecked. If it IS checked, make a note of the address and try unchecking. Always quit and restart IE after these changes.

Telnet checks

You can eliminate the browser entirely by using telnet from the command line:

telnet yahoo.com 80
Trying 72.30.38.140...
Connected to yahoo.com.
Escape character is '^]'.
get /index.html
<HEAD><TITLE>Redirect</TITLE></HEAD>
<BODY BGCOLOR="white" FGCOLOR="black">
<FONT FACE="Helvetica,Arial"><B>
 "<em>https://failsafe.fp.yahoo.com/404.html</em>".<p></B></FONT>

<!-- default "Redirect" response (302) -->
</BODY>
Connection closed by foreign host.
 

(Modern versions of Windows stupidly disable telnet; see Enabling Telnet in Windows 7.)

If that works, your browser is at fault. Try another browser (Chrome, Firefox, Opera).
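If telnet is not available, the same test can be roughed out in Python. This is a sketch under a few assumptions: the function names build_request and first_response_line are mine, it speaks plain HTTP on port 80 as in the telnet session, and it sends the request in the upper case "GET" form with a Host line, which servers are more likely to accept than the bare lower case line typed above.

```python
import socket


def build_request(host, path="/"):
    """Build a minimal HTTP/1.0 request as bytes."""
    # The request ends with a blank line, which tells the server we are done.
    return ("GET %s HTTP/1.0\r\nHost: %s\r\n\r\n" % (path, host)).encode("ascii")


def first_response_line(host, port=80, timeout=10):
    """Connect like 'telnet host 80' and return the first line of the reply."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        conn.sendall(build_request(host))
        reply = conn.recv(4096)
    return reply.split(b"\r\n", 1)[0].decode("latin-1")
```

Any reply at all from first_response_line("yahoo.com"), even a redirect or an error status, means the network path to the web server works and the browser is the suspect.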

Local Network Problems

Local network problems come from wiring, bad switches, duplicate IP addresses, missing IP addresses, firewall blocking and plain old confusion.

You learned above that every device on your network needs a unique IP address but it is not and cannot be entirely unique: some part of it needs to match. You know how to check IP addresses on computers, but what about printers and other devices?

If you have software that controls the device, check that first. Many printers can print out a configuration sheet that includes IP information. HP print servers often have a button that, if held down for a few seconds, will send configuration info to an attached printer.

Arp

The 'arp' command is another way to determine IP. Here is a sample from a Windows machine:

ARP -a
Interface: 192.168.11.8 --- 0x2
Internet Address Physical Address Type
192.168.11.1 00-18-01-47-4e-01 dynamic
192.168.11.3 00-1a-4b-14-1e-d3 dynamic
192.168.11.5 00-14-51-ed-86-42 dynamic
 

On Linux or Mac, I suggest using "arp -an" to get similar results.

Notice the Physical Address column? That's the Hardware or MAC address I talked about earlier. On your local network, devices actually use that address to talk to each other. So, if you know the hardware address and can see that in arp, you can determine the IP.

You can also use arp to set an IP for many simple devices, but again that's a bit more than we want to cover here.

What we are looking for is a mismatch. Say that we know that the printer that is not working is 192.168.11.5, but its hardware address is NOT 00-14-51-ed-86-42. You may have found the hardware address from configuration information or it may even be on a label on the device itself.

If there is a mismatch, you have a duplicate IP - whatever does own 00-14-51-ed-86-42 is using the IP you want for the printer. If it is easy enough to change the printer, do that. Otherwise you need to hunt down that 00-14-51-ed-86-42 machine or device.

A device must have been recently accessed to show up in arp!
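The mismatch hunt can be sketched in Python as well. This assumes arp output shaped like the samples in this article (an IP address and a MAC address on the same line); the names arp_table, normalize_mac and check_device are my own, and the "expected" address in the usage line is invented for illustration.

```python
import re

IP = re.compile(r"\d{1,3}(?:\.\d{1,3}){3}")
MAC = re.compile(r"[0-9a-fA-F]{1,2}(?:[-:][0-9a-fA-F]{1,2}){5}")


def normalize_mac(mac):
    """Lower case, colon separated, two digits per part."""
    return ":".join(part.zfill(2) for part in re.split("[-:]", mac.lower()))


def arp_table(arp_output):
    """Map each IP address in the arp output to the MAC address on its line."""
    table = {}
    for line in arp_output.splitlines():
        ip, mac = IP.search(line), MAC.search(line)
        if ip and mac:  # header lines have no MAC address and are skipped
            table[ip.group()] = normalize_mac(mac.group())
    return table


def check_device(table, ip, expected_mac):
    """Compare the MAC that arp reports for 'ip' with the one we expect."""
    seen = table.get(ip)
    if seen is None:
        return "not in arp table (try pinging it first)"
    if seen == normalize_mac(expected_mac):
        return "match"
    return "mismatch: %s is held by %s" % (ip, seen)


sample = """\
Interface: 192.168.11.8 --- 0x2
Internet Address Physical Address Type
192.168.11.1 00-18-01-47-4e-01 dynamic
192.168.11.3 00-1a-4b-14-1e-d3 dynamic
192.168.11.5 00-14-51-ed-86-42 dynamic
"""
# The expected address below is a made-up label value for the printer.
print(check_device(arp_table(sample), "192.168.11.5", "00-1B-78-AA-BB-CC"))
```

A "mismatch" result is the duplicate IP situation described above: some other device is answering for the address you wanted.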

Wiring

Ethernet cable testers are not terribly expensive. If you are working for an office, you'd want a good model that can test length as well as connection wiring and includes tone tracing ("fox and hound").

For a home user, it's easier to just swap the cable. If the new one works, throw away the old.

The length testing is often just for a reality check. If you know the cable you are testing should be about 50 feet but the tester says 300, something is screwy. Either your wiring guy left a lot of looped cable in the ceiling or that cable has a problem. The fox and hound tracer can help you find mislabeled wiring - you think this wire is plugged in at the other end, but actually it is not because of a labeling error. Finally, the wiring must be pinned out correctly. The cable tester is simply invaluable for those reasons.

Switches

A switch is not a router. A router may have a switch built into it but you don't want a router where you need a switch. Inexpensive switches don't have IP addresses - they are invisible on the network. Managed switches have their own IP address and can provide functions like enabling or disabling specific ports or segmenting a network. That's all beyond our scope here; what you need to know is SWITCHES GO BAD.

Individual ports go bad and whole switches go bad. You can go crazy trying to figure out a network problem caused by a flaky switch. Move the wires to another switch to test, particularly if this switch is a few years old.

If you are reasonably quick about it, you can swap out one switch for a new one on the fly. Nothing will get interrupted if you just quickly swap the wires. Watch out for accidentally plugging a switch into itself - most switches will go insane if you make that mistake. With some advanced switches you can do that and would do it for redundancy, but that's far beyond this article.

Firewalls

Firewall configuration is way beyond our scope, but firewalls and even antivirus software can interfere with network function. It is worth just shutting anything like that off to see if it is involved in your difficulties. If so, at least you know where it is even if you can't fix it yourself.

How to shut it off - there are too many possibilities. The built in firewalls in Mac and Windows are easy enough to find, but third party software may not be so obvious. Look around the menus of any AntiVirus software - that sometimes includes a firewall component that can be shut off for testing.

Confusion

Step back. Now that you know how things work, pause and try to think about where your problem might come from. For example, if the MAC address of a device you are trying to find indicates that it is an Intel card, you don't need to look at HP printers. If you can pick up a device and move it to a new location and it works, you know it is a physical problem.

Network troubleshooting can be difficult. Obviously I have only skimmed the surface here. I have more technically oriented articles if you want to go deeper. I also sell a Unix and Linux troubleshooting book that can help if you are in that environment.


Got something to add? Send me email.






1 comment








Sun Aug 14 20:00:00 2011: 9704   BigDumbDinosaur



Good synopsis!

Reiteration to everyone who wanders over to this article: Print it and save it somewhere. When (not if) your connectivity goes kaput, you'll be glad to have the "non-volatile" paper copy handy as you attempt to troubleshoot your network.

If you are having trouble right now, you should first get to a command prompt (see above for how) and type "ping 8.8.8.8".

For those who skimmed the article, the IP address 8.8.8.8 is one of Google's DNS servers. You can also try 4.2.2.2 or 4.2.2.3, both of which are top-level (root) domain servers. If you can't reach them you ain't going nowhere!

It don't mean a thing if it ain't got that ping.

Duke Ellington must be groaning... :)


