


Best of the Newsgroups: pings out of order





From: Bela Lubkin <belal@sco.com>
Subject: Re: Curious pings on SCO 5.0.4/6
Date: Thu, 6 Nov 2003 18:16:03 GMT
References: <14biqvckqcobdh1h3mis4undq2tjhof16o@4ax.com> <20031105193824.GT14056@sco.com> <HnwFxH.23Ft@wjv.com> <20031106003805.GW14056@sco.com> <Hnxu33.B3p@wjv.com> 

Bill Vermillion wrote:

> In article <20031106003805.GW14056@sco.com>,
> Bela Lubkin  <belal@sco.com> wrote:



> >For whatever reason (I'll speculate in a moment), OSR5 `ping` does a
> >reverse DNS lookup of _every_ packet it receives.  It doesn't try to
> >cache IP-to-name information.  This is probably so that if you had a
> >long-running ping and one day someone changed that address's name, ping
> >would suddenly start reporting the new name.  This _could_ have been
> >implemented with a cache, some knowledge of DNS record timeouts, etc.,
> >but it wasn't.
> 
> That seems a rather bizarre way to do things.  That's just from my
> way of thinking about things.  Most of the things based on names
> look up an IP for a given name.  The chance that someone would
> change a name on an IP is small, and it would typically be seen only
> on a local network, would it not?  Anything outside is going to rely
> on someone else's DNS, and when the address/IP resolution is made
> upstream, even if it has to go to the root servers to get that IP
> initially, the next level up will cache that name/IP
> resolution as long as the TTL is still valid.   That's my
> impression, but I've never looked at the source code.

You could have a "link watcher daemon" running, something like:

  # Record link status every 2 minutes, forever

  ping -i 120 123.456.789.012 > /usr/spool/linkwatch 2>&1 &

If the owner of "123.456.789.012" changed its name one day, it would
eventually show up in your log.  Maybe not right away, because even
though ping did an RDNS lookup for every packet, the server might not
time out the old name for a few hours.  But eventually it would see the
name change.

ping could have some awareness of DNS TTL, and only re-lookup an address
whose expiration time had passed.  The fact is, it doesn't.
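
As a rough illustration of what a TTL-aware lookup might look like (this
is not how OSR5 ping works, and the class name, the `resolver` hook and
the fixed one-hour `ttl_seconds` are all made up for the sketch; the
standard library resolver call does not report the real DNS TTL, so a
fixed lifetime stands in for it):

```python
import socket
import time


class ReverseLookupCache:
    """Cache IP-to-name answers and re-resolve only after they expire."""

    def __init__(self, resolver=None, ttl_seconds=3600.0, clock=time.monotonic):
        # resolver: callable taking an IP string and returning a host name.
        # ttl_seconds: assumed lifetime of an answer (not the real DNS TTL).
        self._resolver = resolver or self._system_resolver
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries = {}  # ip -> (name, expiry time)

    @staticmethod
    def _system_resolver(ip):
        try:
            return socket.gethostbyaddr(ip)[0]
        except OSError:
            return ip  # no reverse record: fall back to the bare address

    def name_for(self, ip):
        now = self._clock()
        cached = self._entries.get(ip)
        if cached is not None and now < cached[1]:
            return cached[0]  # still fresh, no lookup needed
        name = self._resolver(ip)
        self._entries[ip] = (name, now + self._ttl)
        return name
```

A ping loop would call `name_for()` once per reply; only the first reply
and the first one after each expiry would cost a real lookup.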



> >I think you're right, because of the out of order receipt.  That means
> >that there was blockage somewhere along the way.  Some router between
> >the two machines was holding either the outgoing packets or the replies
> >-- _not_ losing them, just holding them and eventually letting them all
> >fly at once.  During this holding period they got out of order (which is
> >fairly normal, routers do not guarantee in-order delivery).  When ping
> >finally received them back, it reported them as having taken various
> >times about 1.0 second apart, because they all arrived at the same time
> >but were _sent_ 1 second apart.
> 
> I can envision that something somewhere is waiting until it gets a
> packet big enough to send a minimum amount of data, or waits a
> pre-determined interval to return that.  But why I have no idea.
> 
> But what we don't know is just how far apart the pinged IP is.
> The original poster had munged the original IP and gave no clue
> as to what/where it was.   Long delays would/could be indicative
> of a typical land/satellite link.  The sent data goes via land
> line, and the return data is intercepted, as I recall, at level
> 3 of the ISO stack, and diverted to an uplink and then down to the
> end user.  That would almost guarantee a minimum of about 700ms.
> And I would think you really would want to aggregate the data
> and send in bigger chunks.   
> 
> I've read about problems on what are called 'elephants' [ELFN -
> Extremely Long Fat Networks - very high speed distant links where
> they make the packets HUGE and have large windows, otherwise
> the data is slowed by the handshake/protocols/etc of small packets
> and few outstanding].  Probably has nothing to do with this, but it
> reminded me of a discussion I'd recently seen.

The fastest packets in the original ping output were 40ms -- clearly not
a satellite or anything particularly weird.  In fact let me re-quote a
bit of it:

> >> >> >64 bytes from kasseob4 (10.22.136.54): icmp_seq=3 ttl=62 time=40 ms
> >> >> >64 bytes from kasseob4 (10.22.136.54): icmp_seq=0 ttl=62 time=3080 ms
> >> >> >64 bytes from kasseob4 (10.22.136.54): icmp_seq=1 ttl=62 time=2080 ms
> >> >> >64 bytes from kasseob4 (10.22.136.54): icmp_seq=2 ttl=62 time=1090 ms

Bear in mind that each sequential packet is _sent_ 1 second after the
previous one.  So if two packets were sent at times 100000.23 and
100001.23, and both were received simultaneously at 100002.50, what
would ping report?  It would show 2270ms for the first and 1270ms for
the second.  If we assume that the above four packets were sent at exact
1-second intervals, they were received at almost exactly the same time:

  icmp_seq=3 time=40 ms    sent=100003.23 received=100003.27
  icmp_seq=0 time=3080 ms  sent=100000.23 received=100003.31
  icmp_seq=1 time=2080 ms  sent=100001.23 received=100003.31
  icmp_seq=2 time=1090 ms  sent=100002.23 received=100003.32

I built that by assigning an arbitrary time to the first packet
(100000.23), then adding 1.00 second to each according to its sequence
number, then adding the reported round-trip time.  This reads as if some
part of the link was down for at least 3 seconds, delaying either the
outbound or return trip of packets 0-2.  The delay between return
receipt of packets 3 & 0 suggests that other data was also stuck in the
buffers, otherwise all 4 returns would have been received back to back.
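
The same reconstruction can be sketched in a few lines of Python.  The
function name, the regular expression and the arbitrary base time of
100000.23 are choices made for this example; the only inputs are the
ping lines quoted above and the assumption of an exact 1-second send
interval.

```python
import re

# Matches the sequence number and round-trip time in one line of ping output.
REPLY = re.compile(r"icmp_seq=(\d+) ttl=\d+ time=(\d+(?:\.\d+)?) ms")


def reconstruct(lines, base=100000.23, interval=1.0):
    """Return (seq, sent, received) tuples in the order the replies were printed."""
    rows = []
    for line in lines:
        match = REPLY.search(line)
        if match is None:
            continue  # skip anything that is not a reply line
        seq = int(match.group(1))
        rtt_ms = float(match.group(2))
        sent = base + seq * interval        # packet N leaves N intervals after packet 0
        received = sent + rtt_ms / 1000.0   # reported time is receive minus send
        rows.append((seq, round(sent, 2), round(received, 2)))
    return rows


sample = [
    "64 bytes from kasseob4 (10.22.136.54): icmp_seq=3 ttl=62 time=40 ms",
    "64 bytes from kasseob4 (10.22.136.54): icmp_seq=0 ttl=62 time=3080 ms",
    "64 bytes from kasseob4 (10.22.136.54): icmp_seq=1 ttl=62 time=2080 ms",
    "64 bytes from kasseob4 (10.22.136.54): icmp_seq=2 ttl=62 time=1090 ms",
]

for seq, sent, received in reconstruct(sample):
    print(f"icmp_seq={seq} sent={sent:.2f} received={received:.2f}")
```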

The fact that packets 0 & 1 were received during the same 10ms timer
tick sets a lower limit on the speed of the slowest link -- if those
packets were traveling back to back, the slowest link must be able to
transmit at least 64 bytes per 10ms, or 6400 bytes/sec, which is about
51Kbps of ICMP data (approximately 64Kbps once IP header and framing
overhead are counted).  We don't know much about the upper limit (any
number of other, non-ping packets could have been traveling at the same
time).
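
The arithmetic behind that lower bound, as a small sketch.  The function
name is invented for the example, the 100 ticks per second comes from
the 10ms timer tick, and the extra 20 bytes assume a plain IPv4 header
with no options; link-level framing is unknown and left out.

```python
def min_link_rate_bps(packet_bytes, ticks_per_second=100):
    """Lowest link speed (bits/sec) that moves one packet within one timer tick."""
    return packet_bytes * 8 * ticks_per_second


icmp_only = min_link_rate_bps(64)          # the 64 bytes that ping reports
with_ip_header = min_link_rate_bps(64 + 20)  # plus an assumed 20-byte IPv4 header

print(f"ICMP bytes only: {icmp_only} bits/sec")
print(f"with IP header:  {with_ip_header} bits/sec")
```

Either way the slowest hop works out to somewhere in the range of a
single 56K or 64K line, not a fast LAN link.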

> >> >> >64 bytes from kasseob4 (10.22.136.54): icmp_seq=21 ttl=62 time=40 ms
> >> >> >64 bytes from kasseob4 (10.22.136.54): icmp_seq=15 ttl=62 time=6110 ms
> >> >> >64 bytes from kasseob4 (10.22.136.54): icmp_seq=16 ttl=62 time=5110 ms
> >> >> >64 bytes from kasseob4 (10.22.136.54): icmp_seq=17 ttl=62 time=4120 ms
> >> >> >64 bytes from kasseob4 (10.22.136.54): icmp_seq=18 ttl=62 time=3120 ms
> >> >> >64 bytes from kasseob4 (10.22.136.54): icmp_seq=19 ttl=62 time=2120 ms
> >> >> >64 bytes from kasseob4 (10.22.136.54): icmp_seq=20 ttl=62 time=1120 ms

This bit shows that the link sometimes goes down for as much as 6
seconds at a time.

In each of these two examples, we got back the last sequential packet
first, and its time was very low.  Many router-like devices have
behaviors where receiving a packet to a particular known remote
destination will cause link bring-up.  It seems likely that the packet
which caused link bring-up would also travel the link first.  It is also
common that packets already in holding buffers will not actively cause
link bring-up (that is, the router may attempt bring-up when the packet
is first received, but if it is unsuccessful, it won't try again until
_another_ packet triggers the behavior).  This router is pretty good
about keeping packets in received order, but the bring-up triggering
behavior causes the small amount of disordering we see in the output.

Two other segments are different:

> >> >> >64 bytes from kasseob4 (10.22.136.54): icmp_seq=4 ttl=62 time=2240 ms
> >> >> >64 bytes from kasseob4 (10.22.136.54): icmp_seq=5 ttl=62 time=1240 ms
> >> >> >64 bytes from kasseob4 (10.22.136.54): icmp_seq=6 ttl=62 time=240 ms

> >> >> >64 bytes from kasseob4 (10.22.136.54): icmp_seq=7 ttl=62 time=2400 ms
> >> >> >64 bytes from kasseob4 (10.22.136.54): icmp_seq=8 ttl=62 time=1400 ms
> >> >> >64 bytes from kasseob4 (10.22.136.54): icmp_seq=9 ttl=62 time=400 ms

In each of these segments, it looks like some _other_ packet, other
business this system had with stuff over the link, caused link bring-up.
None of the ping packets are out of order, and the newest packet clearly
had been held up for longer than the physical link turnaround time.
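
The two patterns can be told apart mechanically.  The sketch below is
only a toy version of the reasoning above: the function name and the
three labels are invented here, and it assumes the replies have already
been split into bursts by hand, as in the quoted segments.

```python
def classify_burst(replies):
    """Classify one burst of replies.

    replies: list of (icmp_seq, rtt_ms) in the order ping printed them.
    Returns (pattern, longest_hold_ms).
    """
    seqs = [seq for seq, _ in replies]
    longest_hold_ms = max(rtt for _, rtt in replies)
    if seqs == sorted(seqs):
        # Nothing reordered: something other than ping probably
        # brought the link up.
        pattern = "in order"
    elif seqs[0] == max(seqs) and seqs[1:] == sorted(seqs[1:]):
        # The newest packet came back first: it probably triggered
        # bring-up and crossed the link ahead of the buffered ones.
        pattern = "newest first"
    else:
        pattern = "other reordering"
    return pattern, longest_hold_ms


bursts = [
    [(3, 40), (0, 3080), (1, 2080), (2, 1090)],
    [(4, 2240), (5, 1240), (6, 240)],
    [(7, 2400), (8, 1400), (9, 400)],
    [(21, 40), (15, 6110), (16, 5110), (17, 4120),
     (18, 3120), (19, 2120), (20, 1120)],
]

for burst in bursts:
    pattern, hold = classify_burst(burst)
    print(f"seq {burst[0][0]}..: {pattern}, longest hold {hold} ms")
```

The longest hold in each burst is a rough floor on how long the link was
unavailable, since the oldest packet waited at least that long.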

This analysis seems like exactly the sort of thing that an expert system
could be good at.  Sure enough, searching on ``"expert system" ping
routers troubleshooting'' turns up a bunch of matches.  I wonder if any
of them are actually any good?

I suspect the original sample we were shown was during an especially bad
period, that usually pings are clean with only occasional excursions.  I
bet if we had 1000 pings' worth of results, we could diagnose it much
more closely (or at least an expert system could, it would have the
patience...)

>Bela<





/Bofcusm/2379.html copyright 1997-2004 (various authors) All Rights Reserved
