Web Hosting Talk

Peer1 bandwidth let me down? Experts: Please suggest alternative


helios3
07-23-2004, 07:50 PM
I have a dedicated server with NDCTech (excellent company!) that is hosted with Peer1.net bandwidth in New York.

It's a VoIP server, and 50% of my traffic gets sent to a single IP at the NAP of the Americas in Miami.

Unfortunately, that route was dropping packets and the VoIP calls sounded terrible several times today. The IP route taken was Peer1 NY to Ashburn, VA and then Verio from there to Miami.

Pings from my home machine to the same final destination showed zero packet loss. With ICMP filtering and so on, I couldn't really tell what the problem was from NY to MIA... but I am pretty sure it was one of the Verio routers.

My question is this: what should I do about it? I suppose I could get NDCTech to speak to Peer1 and change the route to NY->Miami via Global Crossing or via C&W. I have tested both those routes using online looking glasses, and they seem good.

But there is no easy way of altering the VoIP packets returning from Miami back to New York. That's someone else's network setup, I don't think I can affect it. And I do know that it takes the same route back (Verio/Peer1).

What do you experts recommend?

I was thinking of asking NDCTech to switch me to their Level(3) bandwidth, which I believe means IP packets go NY->Miami along Level(3) only. I'm guessing going across one network start to finish is a lot better than two (Peer1+Verio).

The reason why I chose Peer1 bandwidth in the first place is because gamers use it and say the latency is great. But clearly something is wrong. Peer1 is supposed to be a blend of GLBX, ATT, C+W and MCI (but not MCI in Peer1 New York). If the data was taking ANY of those routes, it would go straight to Miami on a single network. But it doesn't.

So my customers might not appreciate me using Level(3) bandwidth as much as they liked Peer1, but I have to move that VoIP data to its final destination and that's not working right now.

Any comments or suggestions would be greatly appreciated.

Thanks,

Lars

P.S. When I first looked at Peer1 months ago, a trace from Montreal to Miami went through New York, and from NY to Miami it was all GLBX. Not any more, I guess.

P.P.S. Here is the traceroute. Remember, pings from another machine (Sprint IP network) consistently showed 0 packet loss. Pings from the NY machine on Peer1 had 2% packet loss for several 30 minute stretches, and pings from Miami to my NY machine showed the same.

1 69.90.121.3 (69.90.121.3) 1.018 ms 0.822 ms 0.856 ms
2 GIG4-0.nyc-gsr-c.peer1.net (216.187.123.14) 0.584 ms 0.618 ms 0.703 ms
3 OC48POS0-0.nyc-gsr-d.peer1.net (216.187.123.2) 1.165 ms 1.219 ms 1.130 ms
4 GIG2-0.nyc-gsr-b.peer1.net (216.187.123.5) 1.125 ms 1.147 ms 0.783 ms
5 OC48POS1-0.wdc-gsr-a.peer1.net (216.187.123.226) 6.583 ms 6.832 ms 6.505 ms
6 ge-2-3-0.r01.asbnva01.us.bb.verio.net (206.223.115.12) 7.511 ms 7.306 ms 7.110 ms
7 p16-1-0-0.r21.asbnva01.us.bb.verio.net (129.250.5.21) 7.203 ms 7.247 ms 7.117 ms
8 p16-4-0-0.r02.asbnva01.us.bb.verio.net (129.250.2.63) 7.526 ms 7.526 ms 7.388 ms
9 p4-1-2-0.r01.miamfl02.us.bb.verio.net (129.250.5.69) 35.510 ms 33.397 ms 33.323 ms
10 ge-1-2.a00.miamfl02.us.ra.verio.net (129.250.30.18) 33.323 ms 32.914 ms 33.228 ms
11 ge-4-2.a00.miamfl02.us.ce.verio.net (130.94.195.118) 32.954 ms 33.350 ms 33.119 ms
12 border5.ge4-1.bbnet2.mia003.pnap.net (69.25.0.77) 32.852 ms 33.087 ms 33.243 ms
13 63.251.144.69 (63.251.144.69) 33.865 ms * 33.433 ms

wheimeng
07-23-2004, 08:07 PM
Packet loss does affect the quality of the call, so it is extremely vital to have good bandwidth. Try Internap, I prefer Internap to Peer1.

IdealBandwidth
07-23-2004, 08:08 PM
I'd pick Savvis over Internap.

Defcon|Rich
07-23-2004, 08:08 PM
Give them a call. I'm sure they will help you out.

helios3
07-23-2004, 08:43 PM
Can someone comment on whether I can affect both the route TO Miami as well as the route the data takes from Miami (not my computers) back to New York?

helios3
07-24-2004, 12:11 AM
OH NO, NDCTech doesn't have Level(3) bandwidth ready quite yet.

ARRRRRRRRRRRRRRRRRGH

rusko
07-24-2004, 01:12 AM
STOP. you obviously don't have sufficient clue to diagnose the problem, so why are you posting your half-baked guesses at what the issue might be? based on what you posted and the ping and traceroute test i've done from our own peer1 feed in new york, this has nothing to do with peer1's network. pulver has gear at peer1-nyc and peer1 offers an onnet 0% packet loss SLA - eightball points to... 'bullcrap'.

see ping flood test:
---
[root@liszt]# ping -f 63.251.144.69
PING 63.251.144.69 (63.251.144.69) 56(84) bytes of data.
.
--- 63.251.144.69 ping statistics ---
1944 packets transmitted, 1943 received, 0% packet loss, time 27231ms
rtt min/avg/max/mdev = 32.451/32.868/44.310/0.689 ms, pipe 4, ipg/ewma 14.015/32.845 ms
---

as you can see, no packet loss and excellent jitter (or lack of it). prime voip performance.

see traceroute:
---
[root@liszt]# traceroute 63.251.144.69
traceroute to 63.251.144.69 (63.251.144.69), 30 hops max, 38 byte packets
1 69.28.209.3 (69.28.209.3) 0.417 ms 0.224 ms 0.474 ms
2 OC48POS1-0.wdc-gsr-a.peer1.net (216.187.123.226) 6.300 ms 6.149 ms 6.622 ms
3 ge-2-3-0.r01.asbnva01.us.bb.verio.net (206.223.115.12) 6.434 ms 6.496 ms 6.537 ms
4 p16-1-0-0.r21.asbnva01.us.bb.verio.net (129.250.5.21) 6.671 ms 6.953 ms 6.541 ms
5 p16-4-0-0.r02.asbnva01.us.bb.verio.net (129.250.2.63) 72.894 ms 6.670 ms 16.419 ms
6 p4-1-2-0.r01.miamfl02.us.bb.verio.net (129.250.5.69) 32.325 ms 32.399 ms 32.459 ms
7 ge-1-2.a00.miamfl02.us.ra.verio.net (129.250.30.18) 32.437 ms 32.332 ms 32.432 ms
8 ge-4-2.a00.miamfl02.us.ce.verio.net (130.94.195.118) 32.908 ms 32.540 ms 32.412 ms
9 border5.ge4-1.bbnet2.mia003.pnap.net (69.25.0.77) 32.533 ms 32.493 ms 32.469 ms
10 63.251.144.69 (63.251.144.69) 33.058 ms * 32.927 ms
---

this shows the same results as your traceroute. namely, there *appears* to be packet loss on hop 10, which is the destination and *is not* on the peer1 network. however, this is likely not to be packet loss at all - i would vote for icmp rate limiting which is widely deployed nowadays.

just out of curiosity, how did you establish that there is 2% packet loss, that it is on peer1's network and that changing routes will help?

paul

rusko
07-24-2004, 01:20 AM
Originally posted by helios3
If the data was taking ANY of those routes, it would go straight to Miami on a single network. But it doesn't.


bgp makes routing decisions based on AS path length (the number of networks in the route). this is widely known to be a poor performance metric. so why are you advocating it?

moreover, internap uses verio transit. if you are unhappy with the fact that your data travels over verio to reach your internap box, don't use internap =] internap is a tier 2 with *no* settlement-free peering - your data will not get to internap through a 'single network', no matter what the route.

paul

2uantuM
07-24-2004, 02:03 AM
--- 63.251.144.69 ping statistics ---
1944 packets transmitted, 1943 received, 0% packet loss, time 27231ms
rtt min/avg/max/mdev = 32.451/32.868/44.310/0.689 ms, pipe 4, ipg/ewma 14.015/32.845 ms
---

as you can see, no packet loss and excellent jitter (or lack of it). prime voip performance.

Check again... haha

rusko
07-24-2004, 03:24 AM
Originally posted by 2uantuM
Check again... haha

why? the test was a ping -f (ping with flood option). since i ctrl-c'ed it, one packet was on the wire mid flight at the time of program termination. this is expected behaviour - try it yourself.

paul
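
For reference, the summary line from the flood test can be checked by hand. Below is a minimal Python sketch that parses a ping summary line in the format quoted in this thread and computes the unrounded loss. The function name `exact_loss_percent` is made up for illustration, and the parsing assumes the "N packets transmitted, M received" wording shown above; other ping implementations may word it differently.

```python
import re

def exact_loss_percent(summary: str) -> float:
    """Return packet loss in percent from a ping summary line.

    Assumes the 'N packets transmitted, M received' wording quoted in
    this thread.
    """
    match = re.search(r"(\d+) packets transmitted, (\d+) received", summary)
    if match is None:
        raise ValueError("unrecognised ping summary line")
    sent, received = int(match.group(1)), int(match.group(2))
    if sent == 0:
        raise ValueError("no packets were sent")
    return 100.0 * (sent - received) / sent

flood = "1944 packets transmitted, 1943 received, 0% packet loss, time 27231ms"
print(f"{exact_loss_percent(flood):.3f}%")  # 0.051%
```

One missing reply out of 1944 works out to roughly 0.05%, which is why the summary still reads 0% packet loss.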

helios3
07-24-2004, 11:42 AM
Rusko: You ran your test early Saturday morning. I ran my tests during business hours of a weekday. BIG DIFFERENCE. So who is the clueless one?

Obviously you are going to defend Peer1 and advise against switching because that is WHAT YOU SELL.

The problem is NOT at hop 10, if you read my post you will see that simultaneous pinging to the same IP from another computer over Sprint's IP network showed ZERO PACKET LOSS.

After much pinging to all points on the route, I came to the same conclusion as what your own traceroute above shows. Look at the pings from hop #5. That Verio router is unreliable/overloaded.

Rusko: Who's clueless now?

nickn
07-24-2004, 11:57 AM
Originally posted by helios3
Rusko: You ran your test early Saturday morning. I ran my tests during business hours of a weekday. BIG DIFFERENCE. So who is the clueless one?

Obviously you are going to defend Peer1 and advise against switching because that is WHAT YOU SELL.

The problem is NOT at hop 10, if you read my post you will see that simultaneous pinging to the same IP from another computer over Sprint's IP network showed ZERO PACKET LOSS.

After much pinging to all points on the route, I came to the same conclusion as what your own traceroute above shows. Look at the pings from hop #5. That Verio router is unreliable/overloaded.

Rusko: Who's clueless now?

I have to agree. Rusko, you don't seriously think that because you chose to run a ping flood and it didn't result in packet loss, this makes the poster clueless, do you? Because that would be pretty clueless.

Why would you doubt the poster? He says he sees packet loss, he's the customer, it would definitely be easiest for him to stay where he is, so I don't see him looking for a reason to leave Peer1.

Poster, if I was you, I'd contact ndctech and have them contact peer1 with the traceroutes/packet loss evidence and timestamps, give peer1 a chance to resolve the issue first.

mainarea
07-24-2004, 12:47 PM
root@dallas [~]# traceroute 63.251.144.69
traceroute to 63.251.144.69 (63.251.144.69), 30 hops max, 38 byte packets
1 57.67-18-145.reverse.theplanet.com (67.18.145.57) 1.152 ms 0.605 ms 2.541 ms
2 18.67-18-144.reverse.theplanet.com (67.18.144.18) 1.045 ms 0.446 ms 0.566 ms
3 ibr1-ge-1-0-0-v2.dllstx3.theplanet.com (12.96.160.33) 0.827 ms 0.877 ms 0.725 ms
4 ge-9-3.a00.dllstx04.us.ra.verio.net (157.238.228.37) 0.951 ms 1.115 ms 1.006 ms
5 xe-0-3-0-4.r20.dllstx09.us.bb.verio.net (129.250.31.46) 1.216 ms 0.998 ms 1.250 ms
6 p16-0-0-0.r01.atlnga03.us.bb.verio.net (129.250.4.195) 25.812 ms 25.898 ms 25.685 ms
7 p4-1-2-0.r00.miamfl02.us.bb.verio.net (129.250.5.7) 38.700 ms 38.225 ms 38.372 ms
8 ge-1-1.a00.miamfl02.us.ra.verio.net (129.250.30.3) 38.227 ms 38.778 ms 38.400 ms
9 ge-4-2.a00.miamfl02.us.ce.verio.net (130.94.195.118) 33.391 ms 33.224 ms 33.358 ms
10 border5.ge3-1.bbnet1.mia003.pnap.net (69.25.0.13) 38.938 ms 38.799 ms 38.798 ms
11 * 63.251.144.69 (63.251.144.69) 39.992 ms *

Packetloss from ThePlanet in Dallas shows up at the last hop only. Looks like it could just be rate limiting, I get no packetloss on pings or pingfloods.

Look at the pings from hop #5. That Verio router is unreliable/overloaded.
If you don't see the pings continuing to be high down the line, then that latency most likely isn't real. Can somebody else confirm that for me?

- Matt

mainarea
07-24-2004, 01:34 PM
With ICMP filtering and so on, I couldn't really tell what the problem was from NY to MIA
Contact your provider & see what info they can give you. Basically, you have no clue where the problem lies, so switching to Level3 might not even fix your issues.

Show Level 3 (New York, NY) Traceroute to 63.251.144.69

1 ae-0-55.bbr1.NewYork1.Level3.net (64.159.17.129) 0 msec
ae-0-51.bbr1.NewYork1.Level3.net (64.159.17.1) 0 msec
ae-0-53.bbr1.NewYork1.Level3.net (64.159.17.65) 4 msec
2 as0.mp2.Miami1.Level3.net (64.159.3.249) 32 msec
as-1-0.mp1.Miami1.Level3.net (64.159.0.1) 32 msec 44 msec
3 unknown.Level3.net (64.159.1.174) 56 msec
ge-10-0.hsa1.Miami1.Level3.net (64.159.1.90) 32 msec 52 msec
4 INTERNAP.hsa1.Level3.net (64.156.216.10) 32 msec 32 msec 36 msec
5 border5.ge3-1.bbnet1.mia003.pnap.net (69.25.0.13) [AS12180 {INTERNAP-2BLK}] 32 msec 36 msec 32 msec
6 * * *
7 63.251.144.69 [AS12180 {INTERNAP-2BLK}] 32 msec * 32 msec

You still get that packetloss in traces at the end.

- Matt

rusko
07-24-2004, 06:02 PM
Originally posted by helios3
Rusko: You ran your test early Saturday morning. I ran my tests during business hours of a weekday. BIG DIFFERENCE. So who is the clueless one?


where is the traceroute that shows packet loss on peer1?


Obviously you are going to defend Peer1 and advise against switching because that is WHAT YOU SELL.


agreed. however, i am not going to lie to defend what i sell. if i had seen anything, anything at all that would point the finger at peer1, i would have suggested that you invoke the sla. however, there is no data to pin the issue on peer1 and as such, the title of the thread is inaccurate at best.


After much pinging to all points on the route, I came to the same conclusion as what your own traceroute above shows. Look at the pings from hop #5. That Verio router is unreliable/overloaded.


<clue> not necessarily. when you see ping spikes pinging a router directly (instead of pinging to a hop past it), they are likely to be caused by cef/bgp scanner running. this is expected and *does not* mean that the router is 'unreliable/overloaded'. </clue>

as you can see in the traceroute, pings to hops past 5 do not show the same spikes, so the above explains the traceroute perfectly.

Rusko: Who's clueless now?

as a friendly suggestion, it would have been much better if you had collected raw data such as traceroutes and posted them here, asking people to diagnose the problem for you. folks with troubleshooting experience and networking clue, as rare as they are here, would have been glad to help. contacting your provider (ndctech) would have worked as well. my problem with this thread is that you don't know where the problem is, yet feel confident enough to pin it on peer1. the rest of your statements with respect to routes serve to show that you do not have an understanding of how routing affects performance. there is nothing wrong with that - you probably know more about, say, voip than i do.

if you are asking for advice, ask the right questions. this includes supplying valid raw data and not making hasty conclusions which cloud the issue.

paul

rusko
07-24-2004, 06:04 PM
Originally posted by mainarea
Contact your provider & see what info they can give you. Basically, you have no clue where the problem lies, so switching to Level3 might not even fix your issues.

agreed.


Show Level 3 (New York, NY) Traceroute to 63.251.144.69

1 ae-0-55.bbr1.NewYork1.Level3.net (64.159.17.129) 0 msec
ae-0-51.bbr1.NewYork1.Level3.net (64.159.17.1) 0 msec
ae-0-53.bbr1.NewYork1.Level3.net (64.159.17.65) 4 msec
2 as0.mp2.Miami1.Level3.net (64.159.3.249) 32 msec
as-1-0.mp1.Miami1.Level3.net (64.159.0.1) 32 msec 44 msec
3 unknown.Level3.net (64.159.1.174) 56 msec
ge-10-0.hsa1.Miami1.Level3.net (64.159.1.90) 32 msec 52 msec
4 INTERNAP.hsa1.Level3.net (64.156.216.10) 32 msec 32 msec 36 msec
5 border5.ge3-1.bbnet1.mia003.pnap.net (69.25.0.13) [AS12180 {INTERNAP-2BLK}] 32 msec 36 msec 32 msec
6 * * *
7 63.251.144.69 [AS12180 {INTERNAP-2BLK}] 32 msec * 32 msec

You still get that packetloss in traces at the end.

- Matt

that's no packet loss, since direct pings aren't dropping any icmp. icmp rate limiting most likely.

paul

rusko
07-24-2004, 06:18 PM
Originally posted by nickn
I have to agree. Rusko, you don't seriously think that because you chose to run a ping flood and it didn't result in packet loss, this makes the poster clueless, do you? Because that would be pretty clueless.

no. the original poster didn't post anything that shows actual packet loss and made several points about routing that do not compute at all. finding the cause of packet loss, if there is indeed packet loss present, is rather easy in the majority of cases, yet all we have here is a traceroute that doesn't show any packet loss and a long post that is based on a conclusion that is unsupported at best and incorrect at worst.


Why would you doubt the poster? He says he sees packet loss, he's the customer, it would definitely be easiest for him to stay where he is, so I don't see him looking for a reason to leave Peer1.


because i see the data he based his diagnosis on and the data does not support his conclusion. indeed, it is easier for him to stay where he is - as such, i am *helping* by pointing out that the conclusion he reached is wrong and that he should investigate further before doing something as drastic as switching carriers. please note that everybody else jumped on the bandwagon without actually giving the issue any thought.


Poster, if I was you, I'd contact ndctech and have them contact peer1 with the traceroutes/packet loss evidence and timestamps, give peer1 a chance to resolve the issue first.

i would take this a bit further. contact ndctech, sure. also do extensive traceroute tests, preferably traceroute -q 200 destination_ip > log, to try and figure out where the packet loss is happening. be aware of icmp rate limiting, cef and all the other nifty tricks that will affect the validity of your tests. give it a think and then contact *the appropriate entity* to have the issue resolved. this may be ndctech if the packet loss is on his gear, peer1 if it is on their network, or verio/internap if the issue rests with them.

paul
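
A log produced by a long traceroute run like the one suggested above still has to be tallied somehow. Below is a minimal Python sketch that counts unanswered probes per hop. The function name `probe_loss_by_hop` is hypothetical, and the parser assumes the classic traceroute text format quoted in this thread, where an answered probe prints "<rtt> ms" and an unanswered one prints "*".

```python
import re
from collections import defaultdict
from typing import Dict, Tuple

HOP_LINE = re.compile(r"^\s*(\d+)\s+(.*)$")   # "<hop number> <rest of line>"
RTT = re.compile(r"\d+(?:\.\d+)? ms")         # one answered probe

def probe_loss_by_hop(log_text: str) -> Dict[int, Tuple[int, int]]:
    """Map hop number -> (probes unanswered, probes sent)."""
    counts = defaultdict(lambda: [0, 0])
    for line in log_text.splitlines():
        m = HOP_LINE.match(line)
        if m is None:
            continue  # header or blank line
        hop, rest = int(m.group(1)), m.group(2)
        unanswered = rest.split().count("*")
        answered = len(RTT.findall(rest))
        counts[hop][0] += unanswered
        counts[hop][1] += unanswered + answered
    return {hop: (lost, sent) for hop, (lost, sent) in counts.items()}

# Two lines taken from the traceroute posted earlier in the thread.
sample = """\
 9 border5.ge4-1.bbnet2.mia003.pnap.net (69.25.0.77) 32.533 ms 32.493 ms 32.469 ms
10 63.251.144.69 (63.251.144.69) 33.058 ms * 32.927 ms
"""
print(probe_loss_by_hop(sample))  # {9: (0, 3), 10: (1, 3)}
```

A "*" only means no reply came back for that probe. As noted above, that can be ICMP rate limiting on the replying device rather than loss of forwarded traffic, so the counts are a starting point and not a verdict.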

helios3
07-24-2004, 07:52 PM
THANK YOU TO ALL who have posted so far!!

A question I have wondered about is why traceroute tends to show more packet loss than pings.

I can run a continuous ping and lose maybe 2 out of every 100 packets. I would say that one out of every 3 traceroutes to the final destination has shown something like 26ms * 28ms

So that particular traceroute showed 33% packet loss to the final destination, and if that happens very often, it equals way more than 2% packet loss.

Any ideas why???

Thanks!

rusko
07-24-2004, 08:00 PM
read above, this is likely a result of icmp rate limiting enabled on a given router. the packets are getting dropped deliberately and as such, this is not a malfunction. naturally, packets that are being forwarded through said router are not mucked with.

paul
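
A rough sanity check on the numbers in the question, as a Python sketch. It assumes (strongly) that probes are lost independently at the 2% rate seen with ping and that traceroute sends 3 probes per hop, as in the outputs posted in this thread. The function name `chance_of_star` is made up for illustration.

```python
def chance_of_star(loss_rate: float, probes: int = 3) -> float:
    """Probability that at least one of `probes` independent probes gets no reply."""
    return 1.0 - (1.0 - loss_rate) ** probes

expected = chance_of_star(0.02)  # 2% loss, 3 probes at the final hop
print(f"{expected:.1%}")         # 5.9%
```

Under those assumptions only about 6 in 100 traceroutes would show a "*" at the final hop, while the question reports roughly one in three. That gap fits the rate-limiting explanation above better than random loss does, though it does not prove it.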

Joshua
07-24-2004, 08:14 PM
Run "mtr 63.251.144.69" during times that you usually experience packetloss. Unlike with the standard traceroute utilities, I haven't seen any packetloss on the last hop with mtr, even due to ICMP limiting. It will show you packetloss at each hop, as well as the best ping time, worst ping time, and average ping time for each hop.

-Josh

lostpacket
07-24-2004, 08:18 PM
helios,

my 2 cents looking at this

Packets Pings
Hostname %Loss Rcv Snt Last Best Avg Worst
1. 69.28.209.3 0% 1127 1127 0 0 1 13
2. wdc-dis-1.ne.peer1.net 0% 1127 1127 6 6 6 19
3. ge-2-3-0.r01.asbnva01.us.bb.verio.n 0% 1127 1127 6 6 6 16
4. p16-1-0-0.r21.asbnva01.us.bb.verio. 0% 1127 1127 8 6 7 50
5. p16-4-0-0.r02.asbnva01.us.bb.verio. 0% 1127 1127 6 6 8 84
6. p4-1-2-0.r01.miamfl02.us.bb.verio.n 0% 1127 1127 32 32 32 52
7. ge-1-2.a00.miamfl02.us.ra.verio.net 0% 1127 1127 32 32 40 241
8. ge-4-2.a00.miamfl02.us.ce.verio.net 0% 1126 1127 31 31 37 242
9. border5.ge4-1.bbnet2.mia003.pnap.ne 0% 1126 1127 31 31 37 240
10. 63.251.144.69 0% 1126 1126 31 31 31 48

is that your packetloss / latency problem is coming from something on this network;

7. ge-1-2.a00.miamfl02.us.ra.verio.net 0% 1127 1127 32 32 40 241
8. ge-4-2.a00.miamfl02.us.ce.verio.net 0% 1126 1127 31 31 37 242
9. border5.ge4-1.bbnet2.mia003.pnap.ne 0% 1126 1127 31 31 37 240

I see constant response time spikes. This would definitely affect VOIP / RTP traffic.

As stated by rusko, I would definitely have ndc check their side, but I suspect what's above, as those are the only 3 hops that don't have consistent response times.

FYI I'm tracking from Peer1 NYC also.

Good luck getting your issue resolved.

rusko
07-24-2004, 08:41 PM
Originally posted by lostpacket

I see constant response time spikes. This would definitely affect VOIP / RTP traffic.


that's likely to be just bgp scanner/cef, not relevant here imo.

paul
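
One way to test whether spikes at intermediate hops matter is to check whether they also show up at the destination, since traffic to the destination has to pass through those hops. Below is a minimal Python sketch using the Worst column from the mtr output posted above. The function name and the factor of 2 are arbitrary choices for illustration, not an established threshold.

```python
# (hop, worst RTT in ms) from the mtr output posted above
WORST_MS = [(1, 13), (2, 19), (3, 16), (4, 50), (5, 84),
            (6, 52), (7, 241), (8, 242), (9, 240), (10, 48)]

def spikes_not_seen_at_destination(rows, factor=2.0):
    """Return hops whose worst RTT exceeds `factor` times the destination's.

    The last row is treated as the destination.
    """
    dest_worst = rows[-1][1]
    return [hop for hop, worst in rows[:-1] if worst > factor * dest_worst]

print(spikes_not_seen_at_destination(WORST_MS))  # [7, 8, 9]
```

Hops 7 to 9 peaked around 240 ms while the destination behind them never exceeded 48 ms over the same run. That pattern suggests those routers were slow to answer pings addressed to themselves, not slow to forward traffic.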

helios3
07-27-2004, 02:08 PM
Ok, here is an update:

Pings done at the exact same time, 1:59PM on 27-Jul-2004

From my Peer1 Server in New York (NDCTech)
--- 63.251.144.69 ping statistics ---
401 packets transmitted, 379 received, 5% packet loss, time 84023ms
rtt min/avg/max/mdev = 34.466/35.331/45.459/0.783 ms

From my computer in Canada (data goes via Sprintlink)
--- 63.251.144.69 ping statistics ---
405 packets transmitted, 405 received, 0% packet loss, time 84863ms
rtt min/avg/max/mdev = 55.192/57.392/84.572/2.288 ms


Furthermore, pings to this server (in Miami) from computers right next to it show zero packet loss.


I guess after complaining to Peer1 they moved the route (one-way I'm guessing NY to Miami) to ATT.

Hostname %Loss Rcv Snt Last Best Avg Worst
1. 69.90.121.3 0% 72 72 1 0 1 1
2. 216.187.123.14 0% 72 72 0 0 0 1
3. 216.187.123.2 0% 72 72 1 0 0 1
4. 216.187.123.5 0% 72 72 1 0 0 1
5. 12.118.100.17 0% 72 72 1 0 1 1
6. tbr1-p013802.n54ny.ip.att.net 0% 72 72 1 1 2 3
7. tbr1-cl8.phlpa.ip.att.net 0% 72 72 4 3 4 7
8. tbr2-p013601.phlpa.ip.att.net 0% 72 72 5 4 5 6
9. tbr1-cl9.wswdc.ip.att.net 0% 72 72 7 7 8 11
10. tbr2-p013601.wswdc.ip.att.net 0% 72 72 8 7 7 10
11. tbr1-cl1.attga.ip.att.net 0% 72 72 21 20 21 26
12. gbr4-p50.ormfl.ip.att.net 0% 72 72 29 28 29 30
13. gar1-p360.miufl.ip.att.net 0% 71 71 37 36 36 37
14. 12.118.175.30 9% 65 71 34 34 40 167
15. border5.ge4-1.bbnet2.mia003.pnap.ne 8% 66 71 34 34 42 241
16. 63.251.144.69 8% 66 71 35 34 35 39


And the following is from my home computer

Hostname %Loss Rcv Snt Last Best Avg Worst
1. 64.26.155.1 0% 123 123 9 8 9 22
2. 64.26.173.98 0% 123 123 9 9 10 28
3. 206.191.0.98 0% 123 123 9 9 17 199
4. border3-faste2-0.magma.ca 0% 122 122 11 9 19 263
5. A-pc2-803-S1.gw2.mtl1.sprint-canada 0% 122 122 14 13 25 213
6. g8-0-S1.bb2.mtl1.sprint-canada.net 0% 122 122 12 12 14 24
7. sl-gw1-pen-13-3.sprintlink.net 0% 122 122 22 22 25 33
8. sl-bb20-pen-5-0.sprintlink.net 0% 122 122 26 23 39 231
9. sl-bb24-pen-8-0.sprintlink.net 0% 122 122 30 24 43 225
10. sl-bb26-rly-0-0.sprintlink.net 0% 122 122 27 25 50 258
11. sl-bb20-atl-10-1.sprintlink.net 0% 122 122 41 39 56 235
12. sl-bb22-orl-14-0.sprintlink.net 0% 122 122 51 50 69 257
13. sl-st20-mia-15-1.sprintlink.net 0% 122 122 56 55 64 194
14. sl-internap-122-0.sprintlink.net 0% 122 122 56 55 60 235
15. border5.ge3-1.bbnet1.mia003.pnap.ne 0% 122 122 56 54 61 231
16. 63.251.144.69 0% 122 122 59 54 57 73


ANY comments and suggestions would be greatly appreciated!!

Thanks,

John

mainarea
07-27-2004, 02:20 PM
13. gar1-p360.miufl.ip.att.net 0% 71 71 37 36 36 37
14. 12.118.175.30 9% 65 71 34 34 40 167
15. border5.ge4-1.bbnet2.mia003.pnap.ne 8% 66 71 34 34 42 241
16. 63.251.144.69 8% 66 71 35 34 35 39

That's a problem with Internap down in Miami... you're having a problem with the server/computer in Miami, not a Peer1 issue. I'm not seeing any problems over the Verio link:

Hostname %Loss Rcv Snt Last Best Avg Worst
1. 57.67-18-145.reverse.theplanet.com 0% 241 241 0 0 1 7
2. 18.67-18-144.reverse.theplanet.com 0% 241 241 0 0 4 235
3. ibr1-ge-0-1-0-v1.dllstx3.theplanet.com 0% 241 241 3 0 1 7
4. ge-9-3.a00.dllstx04.us.ra.verio.net 0% 240 240 3 0 6 225
5. xe-0-3-0-4.r20.dllstx09.us.bb.verio.net 0% 240 240 2 0 2 45
6. p16-0-0-0.r01.atlnga03.us.bb.verio.net 0% 240 240 25 25 27 60
7. p4-1-2-0.r00.miamfl02.us.bb.verio.net 0% 240 240 41 38 39 47
8. ge-1-1.a00.miamfl02.us.ra.verio.net 0% 240 240 38 38 46 226
9. ge-4-2.a00.miamfl02.us.ce.verio.net 0% 240 240 38 37 43 241
10. border5.ge3-1.bbnet1.mia003.pnap.net 0% 240 240 35 34 41 236
11. 63.251.144.69 0% 240 240 37 34 36 85


- Matt
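
The reading above relies on loss that starts at one hop and carries through to the destination. Below is a minimal Python sketch of that rule of thumb, using the %Loss column from the mtr run over the ATT route posted above. The function name `first_hop_with_persistent_loss` is hypothetical.

```python
# (hop, loss %) from the mtr run over the ATT route posted above
LOSS_PCT = [(1, 0), (2, 0), (3, 0), (4, 0), (5, 0), (6, 0), (7, 0), (8, 0),
            (9, 0), (10, 0), (11, 0), (12, 0), (13, 0), (14, 9), (15, 8), (16, 8)]

def first_hop_with_persistent_loss(rows):
    """Return the first hop from which every later hop, including the
    destination (last row), also shows loss. None if the destination is clean."""
    first = None
    for hop, loss in rows:
        if loss > 0:
            if first is None:
                first = hop
        else:
            first = None  # loss did not carry through, so reset
    return first

print(first_hop_with_persistent_loss(LOSS_PCT))  # 14
```

Here the loss begins at hop 14 (12.118.175.30) and persists to the destination, while hops 1 to 13 are clean. This only narrows down where the loss first becomes visible on the forward path. It says nothing on its own about the return path.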

helios3
07-27-2004, 05:14 PM
Hi,

Thanks for the reply!!

But Sprint connects via the InterNap routers, too.

What do you think the exact problem is? How should I go about fixing it?

rusko
07-27-2004, 05:54 PM
i've done a traceroute with 100 queries against the destination from a box on our peer1 gig-e. see the results at:

http://www.rusko.us/voip.log

as you can see, there is nothing that even appears to look like packet loss up until hop 11. hop 11 is still an att ip address, but my guess is that is the att side of an interface on an internap router.

if i had to put 5 bucks down on something, i would point to internap having capacity issues in miami. a few days ago it was capacity to verio, today capacity to att.

since i'm drinking my coffee for the next 5 minutes, let's look at some bgp stuff and see if we can find some clues:

63.251.144.69 - internap - AS12180
12.118.175.26 - att - AS7018
130.94.195.118 - verio - AS2914
144.223.245.146 - sprintlink - AS1239

let's ask route-views.oregon-ix.net what 12180 is trying to do to influence ingress routes:

route-views.oregon-ix.net>sh ip bgp 63.251.144.69 | include 7018
7018 12180 12180 12180 12180 12180 12180 12180

comment: att route is prepended with 12180 six times - 12180 (internap) *really really really* doesn't want to get ingress traffic through 7018 (att)

route-views.oregon-ix.net>sh ip bgp 63.251.144.69 | include 2914
--- snip ---
2914 12180
--- snip ---

comment: 12180 seems happy to get ingress via verio today - enough capacity to them today?

route-views.oregon-ix.net>sh ip bgp 63.251.144.69 | include 1239
--- snip ---
1239 12180
--- snip ---

comment: same here

if we didn't have the data from a few days ago when the verio route was problematic, i'd say ask att first. in the context of previous happenings, this just looks like 12180 trying to load balance a lot of load and not doing very well (prepend is _not_ king, the only surefire way is deaggregating and announcing to one peer only).

paul
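
The prepending seen in the route-views output above can be counted mechanically. Below is a minimal Python sketch that takes the three AS paths as quoted and reports each path's length and how many extra copies of the origin AS it carries. The labels and the function name `prepend_count` are made up for illustration. The sketch compares AS path length only, whereas real BGP best-path selection checks other attributes (such as local preference) first.

```python
# AS paths toward 63.251.144.69 as quoted from route-views above
PATHS = {
    "att (7018)": [7018] + [12180] * 7,
    "verio (2914)": [2914, 12180],
    "sprintlink (1239)": [1239, 12180],
}

def prepend_count(path):
    """Number of extra copies of the origin AS at the end of the path."""
    origin = path[-1]
    run = 0
    for asn in reversed(path):
        if asn != origin:
            break
        run += 1
    return run - 1

for name, path in PATHS.items():
    print(name, "length", len(path), "prepends", prepend_count(path))
```

The att path comes out 8 ASes long with 6 prepends, against 2 for the verio and sprintlink paths. A network that picks routes on AS path length alone would therefore tend to avoid sending traffic to AS12180 via att.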

rusko
07-27-2004, 05:57 PM
Originally posted by helios3
Hi,

Thanks for the reply!!

But Sprint connects via the InterNap routers, too.

What do you think the exact problem is? How should I go about fixing it?

my guess is the problem is internap-miami. consider moving that endpoint =]

p