#1236 - 08/30/04 04:45 PM local ISP has a high number of PL and Error
bleeaua Offline


Registered: 08/30/04
Posts: 2
The local ISP routers at hop #3 (before the border routers) are constantly switching to other routers: 209.88.128.129 -to- ....137 -to- ....145 -to- ....153. There are persistent PL and errors at hop #3, whereas hop #4 has very minimal PL and errors. Why would there be such a high number of PL and errors at hop #3?

It is also interesting to note that hops 5 through 18 have recently been experiencing a very high number of PL and errors as well. In the past these hops have been very clean (low PL and errors).

"VoIP" is our main application, and with these types of errors and PL, the VoIP quality becomes unacceptable. Can anyone please give me your expert opinion on what may be happening?

Does the growing spyware and adware phenomenon have any effect on this problem?


Attachments
1244-sip1.net2phone 08312004 blee.png



#1237 - 08/30/04 05:16 PM Re: local ISP has a high number of PL and Error [Re: bleeaua]
Pete Ness Offline



Registered: 08/30/99
Posts: 1106
Loc: Boise, Idaho
It looks like you already have a route change mask set up on hop 3, right? If not, you probably want to add one.

Having an intermediate router show packet loss where downstream routers do not isn't a sign that anything's wrong with this router (or set of routers), just that it doesn't care as much about ICMP as it could. While not entirely normal, this isn't so unusual these days. In most cases, you can "ignore" any packet loss or latency that is added at this router that is not visible in downstream hops.
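One way to apply this rule of thumb to a set of per-hop numbers is sketched below in Python. It is only an illustration: the function name `carried_loss` and the sample percentages are made up for the example, and nothing here is read from the attached graph or produced by PingPlotter. For each hop, it keeps only the loss that is still visible at that hop and at every hop after it.

```python
from typing import List


def carried_loss(hop_loss_pct: List[float]) -> List[float]:
    """Return, per hop, the loss that is also visible at every later hop.

    Loss seen at a hop but not downstream is most likely that router
    down-prioritizing ICMP replies, so each hop is capped at the smallest
    loss seen at that hop or at any hop after it.
    """
    carried = []
    running_min = float("inf")
    # Walk from the final destination back toward the source.
    for loss in reversed(hop_loss_pct):
        running_min = min(running_min, loss)
        carried.append(running_min)
    carried.reverse()
    return carried


# Hypothetical loss percentages, hop 1 first (assumed values for illustration).
example = [0.0, 0.0, 35.0, 1.0, 12.0, 11.0, 10.0]
print(carried_loss(example))
```

With these made-up numbers, the 35% at hop 3 collapses to 1%, while the jump between hop 4 and hop 5 remains.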

Now hop 4 does have a tiny bit of latency and packet loss, but not enough to really cause you much grief in your VoIP conversation.

On the other hand, the hop 4 -> hop 5 link has *significant* problems that are almost certainly affecting your VoIP experience. The same packet loss is being reflected downstream as well, although the further you get from this router, the less packet loss there is, which is a bit interesting.

You might try switching to single-outstanding-request mode (see http://www.nessoft.com/kb/22 for details) just to make sure there's no reporting issue with hop 4/5. This is pretty unlikely, but is a small possibility. It looks like you're using a 1 second trace interval, and that could also be affecting things a bit (some ISPs put rules in place to limit ICMP packet use). Note that I reproduced some of the same information by running from me back to you, though, so I don't think this is much of an issue for you.

More likely is that the link between hop 4 and hop 5 is oversubscribed, is being overused somehow, or has some other problem. You should probably contact your ISP and ask about this packet loss / latency jump. If they are seeing data as acute as what you are seeing, they are probably already working on solving this problem.

You probably picked a period that was the worst-case packet loss / latency. Does it get better at different times of day? If you run PingPlotter for 24 or 48 hours, do you see differences based on the time of the day? If so, then you have a pretty strong case that it's a bandwidth limit, or something based on load rather than a hardware failure. Your ISP, setarnet.aw, may need to add additional capacity to solve this problem, or maybe fix some misbehaving hardware.
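A minimal sketch of that time-of-day comparison, in Python, is shown here. It assumes the samples have already been parsed into (time sent, latency) pairs, with `None` standing for a lost packet; that layout, the `Sample` type, and the function name `loss_by_hour` are assumptions made for illustration, not a PingPlotter export format.

```python
from collections import defaultdict
from datetime import datetime
from typing import Dict, Iterable, Optional, Tuple

# (time the packet was sent, latency in ms, or None if the packet was lost)
Sample = Tuple[datetime, Optional[float]]


def loss_by_hour(samples: Iterable[Sample]) -> Dict[int, float]:
    """Group samples by hour of day and return the loss percentage per hour."""
    sent = defaultdict(int)
    lost = defaultdict(int)
    for when, latency_ms in samples:
        sent[when.hour] += 1
        if latency_ms is None:
            lost[when.hour] += 1
    # Only hours that actually have samples appear in the result.
    return {hour: 100.0 * lost[hour] / sent[hour] for hour in sorted(sent)}


# Made-up samples: a quiet early-morning hour and a busy evening hour.
samples = [
    (datetime(2004, 8, 31, 3, 0, 0), 80.0),
    (datetime(2004, 8, 31, 3, 0, 1), 82.0),
    (datetime(2004, 8, 31, 20, 0, 0), None),
    (datetime(2004, 8, 31, 20, 0, 1), 300.0),
]
for hour, pct in loss_by_hour(samples).items():
    print(f"{hour:02d}:00  {pct:.1f}% loss")
```

If the busy hours consistently show far more loss than the quiet ones, that supports a load-related cause over a hardware failure.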

For a discussion of bandwidth limits / saturation, see http://www.pingplotter.com/tutorial/ScenarioSaturatedPipe.html.

If you want a bit more analysis, post another graph covering more time and more samples and we'll comment further. In reality, your best bet is to contact your ISP and ask them to explain this problem (focusing on the fact that your VoIP experience is somewhat lacking, and that you've used PingPlotter to show where the problem originates, rather than focusing specifically on the PingPlotter results).

Spyware / adware probably isn't a huge factor here. It could impact bandwidth use a bit, but it probably doesn't represent the majority of network traffic.

- Pete

#1238 - 08/31/04 01:21 PM Re: local ISP has a high number of PL and Error [Re: bleeaua]
bleeaua Offline


I have attached a longer sampling of the data, as you requested.

It does get better during sleeping hours as suspected.
The ISP backbone consists of 3x T3 (45 Mbps each, or 135 Mbps total) as its connection. Their customer base is about 3000-4000 subscribers. Many subscribed to the cheaper 128 kbps (down) / 64 kbps (up) plan, but are actually receiving about 1-3 Mbps when things are working properly. Although there are several price/speed packages, it appears that every customer is getting the same speed. Higher-priced plans are sometimes even slower than the smallest plan.
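As a rough back-of-the-envelope check on these figures, the Python sketch below works out how far 135 Mbps stretches. It takes the numbers quoted above at face value, reads "MB" as megabits per second, and treats 1 Mbps as 1000 kbps; the helper names `share_kbps` and `max_concurrent` are hypothetical.

```python
T3_MBPS = 45   # per-link figure quoted in the post
LINKS = 3
TOTAL_KBPS = T3_MBPS * LINKS * 1000   # 135 Mbps expressed in kbps


def share_kbps(subscribers: int) -> float:
    """Bandwidth per subscriber if everyone were active at the same time."""
    return TOTAL_KBPS / subscribers


def max_concurrent(rate_kbps: int) -> int:
    """How many users could pull rate_kbps at once before the links are full."""
    return TOTAL_KBPS // rate_kbps


for subs in (3000, 4000):
    print(f"{subs} subscribers -> {share_kbps(subs):.2f} kbps each if all active")

for rate in (128, 1000, 3000):
    print(f"{rate} kbps per user -> room for {max_concurrent(rate)} simultaneous users")
```

Under these assumptions, an even split gives 45 kbps per subscriber at 3000 subscribers and 33.75 kbps at 4000, and only 135 users at 1 Mbps (or 45 users at 3 Mbps) would be enough to fill all three links. That is consistent with the link saturating during busy hours if customers are not held to their plan speeds.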

If you need a complete 24 hours sample set, I can have this posted on a website for your download and further analysis.

Thanks for your expert comments


Attachments
1246-sip1.net2phone08312004-1pm.png



#1239 - 08/31/04 02:05 PM Re: local ISP has a high number of PL and Error [Re: bleeaua]
Pete Ness Offline



Here's a shot of monitoring back to you (or a core router inside your ISP's network at least) from my DSL line.


Attachments
1247-core-1.setardsl.aw.png



#1240 - 08/31/04 02:22 PM Re: local ISP has a high number of PL and Error [Re: Pete Ness]
Pete Ness Offline



There are some interesting characteristics here. First off, there's the difference between the final destination's packet loss and the previous hop's packet loss. This is almost certainly occurring because of some difference in the return route.

There are a couple of possible reasons for this. 1) The data is coming back through a route that we can't see when monitoring from your side, and that extra network link adds latency and packet loss. Or ... 2) Because of oversaturation, ICMP TTL-expired packets are down-prioritized and dropped.

I can't say for sure which of these conditions is in play from the information I have. Remember, though, that the intermediate hops only offer information about possible problems. We don't really care about the performance characteristics of the intermediate hops unless they contribute to what we see at the final destination. A lot of the badness in the intermediate hops is *not* being translated to the final destination, so we can only use it as "clues".

What we *can* say is that your packet loss and latency at the final destination are unacceptable, and that the link between hops 4 and 5 is the likely culprit. When I trace back from me to you, the link between sprintlink and setarnet also looks like the problem. We show very similar latency, but slightly different packet loss characteristics.

All evidence that I see points to a problem with the link between Sprintlink and Setarnet. It looks like you are a customer of Setarnet, so you should contact them with your problem. It might be a problem with the physical connection between these two services (especially if you're in Aruba where Setarnet's contact information shows them).

You don't have a whole lot of choices here besides just letting Setarnet know there's a problem, and then encouraging them to fix it. If they don't acknowledge the problem (or the severity of the problem), use your collected PingPlotter data to convince them, along with a description of your VoIP problems.

Good luck! Let us know what Setarnet has to say about this!

- Pete
