# \f(CBtraceroute rider.fc.net\fP
traceroute to rider.fc.net (207.170.123.194), 30 hops max, 40 byte packets
 1  gw (223.147.37.5)  1.519 ms  1.168 ms  1.100 ms
 2  gw (223.147.37.5)  1.244 ms !H  1.242 ms !H  0.955 ms !


.H3 "The link layer"
.X "link layer, debugging"
.X "debugging, link layer"
.X "ping, command"
.X "command, ping"
The first thing to do is to ensure that the link layer is running.  You can do
this by 
.Command ping ing
another address on the same link.  In the case of a PPP link, you don't get any
choice: ping the address at the other end of the link.  In the case of an
Ethernet, you have a choice of addresses.  In either case, be sure to use the IP
address, not the name: in order to get the address from the name, you may need
to issue a DNS query, which runs at the application layer.  The chances that the
name resolution will fail are much higher than the chances that the
.Command ping
will fail.
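.P
If you script this kind of check, it can help to confirm that the target really
is a literal address, so that no name lookup happens behind your back.  The
following is a minimal Python sketch using the standard \f(CWipaddress\fP
module; the helper name \f(CWis_ip_literal\fP is made up for this illustration.

```python
import ipaddress


def is_ip_literal(target):
    """Return True if target is a literal IPv4 or IPv6 address.

    A literal address can be pinged without any DNS query.
    (Hypothetical helper, not part of ping itself.)
    """
    try:
        ipaddress.ip_address(target)
        return True
    except ValueError:
        return False


print(is_ip_literal("139.130.136.129"))      # a literal address
print(is_ip_literal("free-gw.example.net"))  # a name: needs a DNS lookup
```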
.P
.X "gw.example.org"
.X "free-gw.example.net"
A successful ping, here from \fIgw.example.org\fP\| to
\fIfree-gw.example.net\fP, will look like this:
.Dx
$ \f(CBping 139.130.136.129\fP
PING 139.130.136.129 (139.130.136.129): 56 data bytes
64 bytes from 139.130.136.129: icmp_seq=0 ttl=255 time=145.203 ms
64 bytes from 139.130.136.129: icmp_seq=1 ttl=255 time=140.743 ms
64 bytes from 139.130.136.129: icmp_seq=2 ttl=255 time=138.039 ms
64 bytes from 139.130.136.129: icmp_seq=3 ttl=255 time=139.783 ms
64 bytes from 139.130.136.129: icmp_seq=4 ttl=255 time=136.698 ms
64 bytes from 139.130.136.129: icmp_seq=5 ttl=255 time=138.753 ms
64 bytes from 139.130.136.129: icmp_seq=6 ttl=255 time=208.389 ms
64 bytes from 139.130.136.129: icmp_seq=7 ttl=255 time=187.463 ms
64 bytes from 139.130.136.129: icmp_seq=8 ttl=255 time=128.463 ms
64 bytes from 139.130.136.129: icmp_seq=9 ttl=255 time=333.895 ms
64 bytes from 139.130.136.129: icmp_seq=10 ttl=255 time=180.670 ms
\f(CB^C\fP                              \fIstop by hitting \fBCtrl-C\f(CW
--- 139.130.136.129 ping statistics ---
11 packets transmitted, 11 packets received, 0% packet loss
round-trip min/avg/max/stddev = 128.463/170.736/333.895/57.179 ms
.De

.H3 "The network layer"
.X "/etc/namedb/named.root"
.X "A.ROOT-SERVERS.NET."
After you are sure that the link layer is functional, check the network layer.
Try to ping a system which is not directly connected.  Typically, you'll have
been having communication problems with a specific system, so try to ping that
system.  If you don't have a specific machine, use one of the root name servers,
since we'll probably be performing a name server lookup later on.  The addresses
are in the file 
.File /etc/namedb/named.root ,
but you can usually rely on the address of \fIA.ROOT-SERVERS.NET.\fP\| to be
\f(CW198.41.0.4\fP.  The
.Command ping
looks much as before; this example pings \fIfreefall.FreeBSD.org\fP\| instead:
.Dx
$ \f(CBping freefall.FreeBSD.org\fP
PING freefall.FreeBSD.org (216.136.204.21): 56 data bytes
64 bytes from 216.136.204.21: icmp_seq=0 ttl=244 time=496.426 ms
64 bytes from 216.136.204.21: icmp_seq=1 ttl=244 time=491.334 ms
64 bytes from 216.136.204.21: icmp_seq=2 ttl=244 time=479.077 ms
64 bytes from 216.136.204.21: icmp_seq=3 ttl=244 time=473.774 ms
64 bytes from 216.136.204.21: icmp_seq=4 ttl=244 time=733.429 ms
64 bytes from 216.136.204.21: icmp_seq=5 ttl=244 time=644.726 ms
64 bytes from 216.136.204.21: icmp_seq=7 ttl=244 time=490.331 ms
64 bytes from 216.136.204.21: icmp_seq=8 ttl=244 time=839.671 ms
64 bytes from 216.136.204.21: icmp_seq=9 ttl=244 time=773.764 ms
64 bytes from 216.136.204.21: icmp_seq=10 ttl=244 time=553.067 ms
64 bytes from 216.136.204.21: icmp_seq=11 ttl=244 time=454.707 ms
64 bytes from 216.136.204.21: icmp_seq=12 ttl=244 time=472.212 ms
64 bytes from 216.136.204.21: icmp_seq=13 ttl=244 time=448.322 ms
64 bytes from 216.136.204.21: icmp_seq=14 ttl=244 time=441.352 ms
64 bytes from 216.136.204.21: icmp_seq=15 ttl=244 time=455.595 ms
64 bytes from 216.136.204.21: icmp_seq=16 ttl=244 time=460.040 ms
64 bytes from 216.136.204.21: icmp_seq=17 ttl=244 time=476.943 ms
64 bytes from 216.136.204.21: icmp_seq=18 ttl=244 time=514.615 ms
64 bytes from 216.136.204.21: icmp_seq=23 ttl=244 time=538.232 ms
64 bytes from 216.136.204.21: icmp_seq=24 ttl=244 time=444.123 ms
64 bytes from 216.136.204.21: icmp_seq=25 ttl=244 time=449.075 ms
^C
--- 216.136.204.21 ping statistics ---
27 packets transmitted, 21 packets received, 22% packet loss
round-trip min/avg/max/stddev = 441.352/530.039/839.671/113.674 ms
.De
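The gaps in the \f(CWicmp_seq\fP numbers above are the lost packets.  As a rough
sketch, assuming you have saved the
.Command ping
output as text, a few lines of Python can list the gaps and recompute the loss
rate from the summary line.  The function names here are invented for the
example.

```python
import re
from typing import List


def missing_sequences(ping_output: str) -> List[int]:
    """Return icmp_seq numbers absent between the first and last reply seen."""
    seen = sorted(int(n) for n in re.findall(r"icmp_seq=(\d+)", ping_output))
    if not seen:
        return []
    present = set(seen)
    return [n for n in range(seen[0], seen[-1] + 1) if n not in present]


def loss_percent(transmitted: int, received: int) -> float:
    """Packet loss as a percentage of packets transmitted."""
    return 100.0 * (transmitted - received) / transmitted


# A short excerpt of the output above: the reply to icmp_seq=6 never arrived.
sample = """\
64 bytes from 216.136.204.21: icmp_seq=4 ttl=244 time=733.429 ms
64 bytes from 216.136.204.21: icmp_seq=5 ttl=244 time=644.726 ms
64 bytes from 216.136.204.21: icmp_seq=7 ttl=244 time=490.331 ms
64 bytes from 216.136.204.21: icmp_seq=8 ttl=244 time=839.671 ms
"""
print(missing_sequences(sample))       # [6]
print(round(loss_percent(27, 21), 1))  # 22.2
```

Note that this only finds gaps between replies.  A packet lost after the last
reply, such as the final one in the output above, shows up only in the summary
line.
.P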
In this case, we have a connection.  What about the packet loss rate?  How high
a packet drop rate is still acceptable?  1% or 2% is probably still all right,
and you'll see that often enough.  By the time you get to 10%, though, things
look a lot worse.  A 10% packet drop rate doesn't mean that your connection slows
down by 10%.  For every dropped packet, you have a minimum delay of one second
until TCP retries it.  If that retried packet gets dropped too\(emwhich happens to
one in every 10 dropped packets if you have a 10% drop rate\(emthe second retry takes
another 3 seconds.  If you're transmitting packets of 64 bytes over a 33.6 kb/s
link, you can normally get about 60 packets through per second.  With 10% packet
loss, the time to get these packets through will be about 8 seconds: a
throughput loss of 87.5%.
.P
With 20% packet loss, the results are even more dramatic.  Now 12 of the 60
packets have to be retried, and 2.4 of them will be retried a second time (for 3
seconds delay), and 0.48 of them will be retried a third time (6 seconds
delay).  This makes a total of 22 seconds delay, a throughput degradation of
nearly 96%.
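.P
The arithmetic above can be reproduced with a short Python sketch.  It assumes
the simplified model used in the text: retries are handled one after the other,
the first, second and third retries cost 1, 3 and 6 seconds, and the packets
would otherwise take one second to send.  Real TCP behaviour is more
complicated, and the function names are invented for this example.

```python
def retry_delay(packets, drop_rate, retry_waits=(1.0, 3.0, 6.0)):
    """Expected total seconds spent waiting for retries.

    packets * drop_rate**k packets need a k-th retry, and the k-th
    retry costs retry_waits[k - 1] seconds (values taken from the text).
    """
    return sum(packets * drop_rate ** k * wait
               for k, wait in enumerate(retry_waits, start=1))


def degradation(delay, base_time=1.0):
    """Fraction of throughput lost when base_time grows by delay seconds."""
    return 1.0 - base_time / (base_time + delay)


for rate in (0.10, 0.20):
    delay = retry_delay(60, rate)
    print(f"{rate:.0%} loss: {delay:.2f} s delay, "
          f"{degradation(delay):.1%} degradation")
```

Under these assumptions the 20% case gives a little over 22 seconds of delay,
which matches the figure above.  The 10% case gives a little over 8 seconds of
delay, so the result depends slightly on whether you count the 8 seconds as the
delay or as the total time.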
.P
Theoretically, you might think that the degradation would not be as bad for big
packets, such as you might have with file transfers with 
.Command ftp .
In fact, the situation is worse then: in most cases the packet drop rate rises
sharply with the packet size, and it's common enough that
.Command ftp
will time out completely before it can transfer a file.
.P
.X "freebie"
.X "hub"
.X "freebie.example.org"
.X "hub.FreeBSD.org"
The following example shows the result of sending some text on a
less-than-perfect 
.Command ssh
connection to \fIhub.FreeBSD.org\fP.  To make things more readable, the names
have been truncated to \fIfreebie\fP\| and \fIhub\fP.  In real-life output, they
would be reported as \fIfreebie.example.org\fP\| and \fIhub.FreeBSD.org\fP.  In
addition,
.Command tcpdump
reports a \fItos\fP\| (type of service) field, which has also been removed,
since it doesn't interest us here.
.Dx
# \f(CBtcpdump -i ppp0 host hub.freebsd.org\fP
14:16:35.990506 freebie.1019 > hub.22: P 20:40(20) ack 77 win 17520 (DF)
14:16:36.552149 hub.22 > freebie.1019: P 77:97(20) ack 40 win 17520 (DF)
14:16:36.722290 freebie.1019 > hub.22: . ack 97 win 17520 (DF)
14:16:39.344229 freebie.1019 > hub.22: P 40:60(20) ack 97 win 17520 (DF)
14:16:41.321850 freebie.1019 > hub.22: P 40:60(20) ack 97 win 17520 (DF)
.De
The previous two lines show the same data (\f(CW40:60\fP) sent twice:
\fIfreebie\fP\| retried because \fIhub\fP\| did not acknowledge it in time.
.Dx
14:16:42.316150 hub.22 > freebie.1019: P 97:117(20) ack 60 win 17520 (DF)
.De
This was the missing acknowledgement\(emit came another second later.
.Dx
14:16:42.321773 freebie.1019 > hub.22: . ack 117 win 17520 (DF)
14:16:47.428694 freebie.1019 > hub.22: P 60:80(20) ack 117 win 17520 (DF)
14:16:48.590805 freebie.1019 > hub.22: P 80:100(20) ack 117 win 17520 (DF)
14:16:49.055735 freebie.1019 > hub.22: P 100:120(20) ack 117 win 17520 (DF)
14:16:49.190703 hub.22 > freebie.1019: P 137:157(20) ack 100 win 17520 (DF)
.De
Here, \fIfreebie\fP\| has sent data to \fIhub\fP, and \fIhub\fP\| has replied
with an acknowledgement up to serial number \f(CW100\fP.  Unfortunately, the
data it sent (serial numbers \f(CW137\fP to \f(CW157\fP) don't line up with the
last previously received data (serial number \f(CW117\fP at 14:16:42.316150).
\fIfreebie\fP\| thus repeats the previous acknowledgement and then continues
sending its data:
.Dx
14:16:49.190890 freebie.1019 > hub.22: . ack 117 win 17520 (DF)
14:16:49.538607 freebie.1019 > hub.22: P 120:140(20) ack 117 win 17520 (DF)
14:16:49.599395 hub.22 > freebie.1019: P 157:177(20) ack 120 win 17520 (DF)
.De
.X "freebie"
Here, \fIhub\fP\| has sent yet more data, now acknowledging the data that
\fIfreebie\fP\| sent at 14:16:49.055735.  It still hasn't sent the data in the
serial number range \f(CW117\fP to \f(CW136\fP, so \fIfreebie\fP\| resends the
last acknowledgement again and continues sending data:
.Dx
14:16:49.599538 freebie.1019 > hub.22: . ack 117 win 17520 (DF)
14:16:49.620506 freebie.1019 > hub.22: P 140:160(20) ack 117 win 17520 (DF)
14:16:50.066698 hub.22 > freebie.1019: P 177:197(20) ack 140 win 17520 (DF)
.De
.X "hub"
Again \fIhub\fP\| has sent more data, still without sending the missing packet.
\fIfreebie\fP\| repeats the acknowledgement yet again, and then resends its last
data (\f(CW140:160\fP), which \fIhub\fP\| has not yet acknowledged:
.Dx
14:16:50.066868 freebie.1019 > hub.22: . ack 117 win 17520 (DF)
14:16:51.820708 freebie.1019 > hub.22: P 140:160(20) ack 117 win 17520 (DF)
14:16:52.308992 hub.22 > freebie.1019: . ack 160 win 17520 (DF)
14:16:55.251176 hub.22 > freebie.1019: P 117:217(100) ack 160 win 17520 (DF)
.De
Finally, \fIhub\fP\| resends the missing data, with serial numbers from
\f(CW117\fP to \f(CW217\fP.  \fIfreebie\fP\| is now happy, and acknowledges
receipt of all the data up to \f(CW217\fP.  That's all we transmitted, so after
about 1.5 seconds the two systems exchange final acknowledgements:
.Dx
14:16:55.251358 freebie.1019 > hub.22: . ack 217 win 17420 (DF)
14:16:56.690779 hub.login > freebie.1015: . ack 3255467530 win 17520
14:16:56.690941 freebie.1015 > hub.login: . ack 1 win 17520 (DF)
.De
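Spotting the missing data by eye is tedious.  The following Python sketch shows
one way to flag it, assuming
.Command tcpdump
lines in the abbreviated form shown above.  It is deliberately simple: it only
tracks the end of the contiguous data received so far and does not buffer
out-of-order segments.  The name \f(CWfind_gaps\fP and the regular expression
are assumptions made for this example.

```python
import re

# timestamp, sender, receiver, flags, then first:last(length)
SEGMENT = re.compile(r"^(\S+) (\S+) > (\S+): \S+ (\d+):(\d+)\((\d+)\)")


def find_gaps(lines, sender, next_expected):
    """Yield (timestamp, expected, got) for each data segment from sender
    that starts beyond the contiguous data received so far."""
    for line in lines:
        m = SEGMENT.match(line)
        if not m or m.group(2) != sender:
            continue  # pure acknowledgement, or traffic from the other side
        timestamp, start, end = m.group(1), int(m.group(4)), int(m.group(5))
        if start > next_expected:
            yield timestamp, next_expected, start
        elif end > next_expected:
            next_expected = end


trace = [
    "14:16:42.316150 hub.22 > freebie.1019: P 97:117(20) ack 60 win 17520 (DF)",
    "14:16:49.190703 hub.22 > freebie.1019: P 137:157(20) ack 100 win 17520 (DF)",
    "14:16:49.599395 hub.22 > freebie.1019: P 157:177(20) ack 120 win 17520 (DF)",
    "14:16:55.251176 hub.22 > freebie.1019: P 117:217(100) ack 160 win 17520 (DF)",
]
for timestamp, expected, got in find_gaps(trace, "hub.22", 97):
    print(f"{timestamp}: expected data from {expected}, got {got}")
```

On this excerpt the sketch reports the two segments starting at 137 and 157,
both of which arrived while data from 117 onwards was still missing.
.P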
.X "traceroute, command"
.X "command, traceroute"
This example shows us that the connection is less than perfect.  Why?  You can
use 
.Command traceroute
to find out where it's happening, but unless the place is within your ISP's
network, you can't do much about it.


.H3 "No connection"
But maybe you don't get anything back.  You see something like this:
.Dx
$ \f(CBping rider.fc.net\fP
PING rider.fc.net (207.170.123.194): 56 data bytes
^C
--- rider.fc.net ping statistics ---
8 packets transmitted, 0 packets received, 100% packet loss
.De
.X "tcpdump, command"
.X "command, tcpdump"
.Command tcpdump
shows:
.Dx
# \f(CBtcpdump -i ppp0 host rider.fc.net\fP
13:30:32.336432 freebie.example.org > rider.fc.net: icmp: echo request
13:30:33.355045 freebie.example.org > rider.fc.net: icmp: echo request
13:30:34.374888 freebie.example.org > rider.fc.net: icmp: echo request
13:30:35.394728 freebie.example.org > rider.fc.net: icmp: echo request
13:30:36.414769 freebie.example.org > rider.fc.net: icmp: echo request
.De
.X "rider.fc.net"
For some reason, \fIrider.fc.net\fP\| is not reachable.  But why?  We know it's
not the local link, but somewhere the data isn't getting through.  Where?
.Command traceroute
can help:
.Dx
.ps 7
# \f(CBtraceroute -q 2 rider.fc.net\fP
traceroute to rider.fc.net (207.170.123.194), 30 hops max, 40 byte packets
 1  free-gw.example.net (139.130.136.129)  359.252 ms  138.754 ms 
 2  Ethernet1-0.way1.Adelaide.example.net (139.130.237.65)  417.399 ms  641.075 ms 
 3  Fddi0-0.way-core1.Adelaide.example.net (139.130.237.226)  653.558 ms  807.843 ms
 4  Serial5-5.pad-core2.Sydney.example.net (139.130.249.209)  861.472 ms  165.041 ms 
 5  Fddi0-0.pad8.Sydney.example.net (139.130.249.228)  163.000 ms  207.969 ms
 6  bordercore4-hssi0-0.SanFrancisco.mci.net (166.48.19.249)  347.656 ms  404.727 ms
 7  core2.Dallas.mci.net (204.70.4.69)  383.040 ms  639.875 ms 
 8  borderx1-fddi-1.Dallas.mci.net (204.70.114.52)  436.243 ms  575.502 ms 
 9  smart-technologies.Dallas.mci.net (204.70.114.110)  1025.478 ms  936.228 ms 
10  freeside-100Mb.smart-nap.net (208.10.195.146)  1166.775 ms  1216.596 ms
11  6jane.fc.net (207.170.70.133)  1235.122 ms  398.822 ms 
12  * * *
13  *^C
.De
The \f(CW-q\fP option tells
.Command traceroute
how many packets to send for each hop.  By default it's 3.  Reducing it to 1 or
2 speeds things up.
.P
This example shows that the data gets through fine as far as \fI6jane.fc.net\fP,
after which it disappears completely.  This is a pretty good sign that the
problem lies in the network \fIfc.net\fP.  If, as in this case, \fIrider\fP\|
has a PPP connection, it's a good assumption that it's currently not connected.
.P
On the other hand, you might see something like:
.Dx
# \f(CBtraceroute rider.fc.net\fP
traceroute to rider.fc.net (207.170.123.194), 30 hops max, 40 byte packets
 1  gw (223.147.37.5)  1.519 ms  1.168 ms  1.100 ms
 2  * * *
 3  * * *
.De
In this case, there is obviously something wrong on the local network.  You can
get the data as far as \fIgw\fP, but that's as far as it goes.
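.P
In both examples the interesting line is the last hop that answered.  As a
minimal sketch, assuming
.Command traceroute
output in the format shown above, the following Python picks out that hop.  The
function name and the regular expression are illustrative only, and other
.Command traceroute
implementations may format their output differently.

```python
import re

# hop number, host name, then the address in parentheses
HOP = re.compile(r"^\s*(\d+)\s+(\S+)\s+\(([\d.]+)\)")


def last_responding_hop(traceroute_output):
    """Return (hop_number, name, address) of the last hop that answered,
    or None if no hop answered at all."""
    last = None
    for line in traceroute_output.splitlines():
        m = HOP.match(line)
        if m:  # header lines and "* * *" lines do not match
            last = (int(m.group(1)), m.group(2), m.group(3))
    return last


sample = """\
traceroute to rider.fc.net (207.170.123.194), 30 hops max, 40 byte packets
 1  gw (223.147.37.5)  1.519 ms  1.168 ms  1.100 ms
 2  * * *
 3  * * *
"""
print(last_responding_hop(sample))  # (1, 'gw', '223.147.37.5')
```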
