Web Hosting Talk







WHT keeps going down during the day


utropicmedia-karl
12-28-2007, 11:29 AM
For a few months now I have noticed, from multiple locations, that WHT is unroutable for 5-20 minutes at a time, several times a day. Anyone else notice this?


Before you ask:

-No, it's not just me or my connection
-We have verified this from multiple countries



Thanks.

SoftWareRevue
12-28-2007, 11:32 AM
Strange. I'm here every day from Kalamazoo and haven't noticed it.

utropicmedia-karl
12-28-2007, 11:39 AM
Here's a link to a posted trace from a week ago when I last mentioned it:

http://www.webhostingtalk.com/showpost.php?p=4859935&postcount=13



69.20.126.9 is from United States(US) in region North America

TraceRoute to 69.20.126.9 [webhostingtalk.com]
Hop (ms) (ms) (ms) IP Address Host name
1 0 0 1 66.98.244.1 gphou-66-98-244-1.ev1servers.net
2 0 0 0 66.98.241.7 gphou-66-98-241-7.ev1servers.net
3 0 0 0 66.98.240.14 gphou-66-98-240-14.ev1servers.net
4 86 49 4 129.250.11.141 ge-1-12.r04.hstntx01.us.bb.gin.ntt.net
5 1 1 1 129.250.4.233 xe-1-3-0.r20.hstntx01.us.bb.gin.ntt.net
6 9 6 6 129.250.3.129 as-0.r20.dllstx09.us.bb.gin.ntt.net
7 8 8 6 129.250.2.59 ae-0.r21.dllstx09.us.bb.gin.ntt.net
8 12 12 12 144.232.8.121 sl-st20-dal-14-2.sprintlink.net
9 10 9 12 144.232.20.82 sl-bb27-fw-6-0.sprintlink.net
10 31 31 29 144.232.8.65 sl-bb21-nsh-4-0-0.sprintlink.net
11 47 49 49 144.232.18.185 sl-crs2-dc-0-0-0-1.sprintlink.net
12 49 51 49 144.232.18.228 sl-st20-ash-9-0-0.sprintlink.net
13 43 46 46 144.223.246.118 sl-racks-4-0.sprintlink.net
14 43 44 43 69.20.1.40 vlan903.core3.iad1.rackspace.com
15 47 47 44 69.20.3.19 aggr104a.iad1.rackspace.com
16 Timed out Timed out Timed out -
17 Timed out Timed out Timed out -
18 Timed out Timed out Timed out -
19 Timed out Timed out Timed out -

Trace aborted.



This happens a handful of times each day and I've seen it from different physical locations. Already happened once this morning. I've noticed this behavior for maybe 4 months?
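
To put numbers on "a handful of times each day", here is a minimal sketch that groups a log of timestamped reachability checks into outage windows. The sample format, the name outage_windows, and the one-minute probe spacing in the demo are assumptions made for illustration; nothing here comes from the trace tool above.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

# One probe result: (time of the check, was the site reachable?)
Sample = Tuple[datetime, bool]


def outage_windows(samples: List[Sample]) -> List[Tuple[datetime, datetime]]:
    """Group consecutive failed probes into (first_failure, last_failure) windows."""
    windows = []
    start = None       # time of the first failure in the current run
    last_fail = None   # time of the most recent failure in the current run
    for when, ok in sorted(samples, key=lambda s: s[0]):
        if not ok:
            if start is None:
                start = when
            last_fail = when
        elif start is not None:
            # A successful probe closes the current outage window.
            windows.append((start, last_fail))
            start = None
    if start is not None:
        # The log ended while the site was still unreachable.
        windows.append((start, last_fail))
    return windows


if __name__ == "__main__":
    # Hypothetical log: one probe per minute, failing from minute 5 through 11.
    t0 = datetime(2007, 12, 28, 11, 0)
    log = [(t0 + timedelta(minutes=i), not (5 <= i < 12)) for i in range(20)]
    for first, last in outage_windows(log):
        print(first, "->", last)
```

Running the same probe from several locations and comparing the windows would show whether the outages line up in time.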

adam
12-28-2007, 12:02 PM
I have seen this also.

Outlaw Web Master
12-28-2007, 12:12 PM
I have noticed it loading slower but took this to be either my ISP or server load...but it's never been unroutable for me here in the UK.


owm

utropicmedia-karl
12-28-2007, 12:33 PM
From the traces it's clear that there is a problem with the next hop inside Rackspace. I don't know if it's the WHT server, a load balancer, etc.

On the other hand, we have been getting a fair amount of attrition (read: clients) from Rackspace the past few months. Maybe it's related? :) ;) (I kid!)

Seriously though, having several outages during the day, which is what these are, is not good by any measure.



Regards,

othellotech
12-28-2007, 01:32 PM
"From the traces it's clear that there is a problem with the next hop inside rackspace"
No, from the traces it's clear that there was an issue with either the outbound *or* the return from hop 16 - unless you're monitoring both ways over the same out/in paths it's impossible to even guess where the issue lies (beyond it being most likely one of either ev1/planet or rackspace regular daily route changes)

utropicmedia-karl
12-28-2007, 02:22 PM
"No, from the traces it's clear that there was an issue with either the outbound *or* the return from hop 16 - unless you're monitoring both ways over the same out/in paths it's impossible to even guess where the issue lies (beyond it being most likely one of either ev1/planet or rackspace regular daily route changes)"


You repeated what I said, except you first told me "no". :eek:

TCP and ICMP data is failing on the next hop inside Rackspace's infrastructure, whatever that is. In or out, it doesn't matter. A fail is a fail.

That trace was from an EV1 box, but is just an example. I can get traces from other places as well.

othellotech
12-28-2007, 07:13 PM
"TCP and ICMP data is failing on the next hop inside rackspace's infrastructure"

Not necessarily, please re-read what I said :P
The fault could be at *any* of the points in or out, and we don't even know what the out route looks like. ALL that trace tells you is there is *probably* something wrong somewhere...
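
A minimal sketch of what a trace can actually report: the last hop that answered. It only assumes that a responding hop line starts with a hop number and contains an IPv4 address, as in the traces posted in this thread; the function name is made up for illustration. The result is the last hop whose reply made it back, not the location of the fault.

```python
import re
from typing import Optional, Tuple

# A hop line starts with the hop number followed by whitespace.
HOP_LINE = re.compile(r"^\s*(\d+)\s+(.*)$")
# Dotted-quad IPv4 address anywhere in the rest of the line.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")


def last_responding_hop(trace_text: str) -> Optional[Tuple[int, str]]:
    """Return (hop number, IP) of the last hop that answered, or None.

    This is the last hop whose reply was received by the sender. It says
    nothing about whether the loss happened on the way out or on the way back.
    """
    last = None
    for line in trace_text.splitlines():
        match = HOP_LINE.match(line)
        if not match:
            continue  # header lines, blank lines, "Trace aborted." and so on
        ip = IPV4.search(match.group(2))
        if ip:
            last = (int(match.group(1)), ip.group(0))
    return last


if __name__ == "__main__":
    sample = (
        "14 43 44 43 69.20.1.40 vlan903.core3.iad1.rackspace.com\n"
        "15 47 47 44 69.20.3.19 aggr104a.iad1.rackspace.com\n"
        "16 Timed out Timed out Timed out -\n"
    )
    print(last_responding_hop(sample))
```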

utropicmedia-karl
12-28-2007, 07:19 PM
"Not necessarily, please re-read what I said :P
The fault could be at *any* of the points in or out, and we don't even know what the out route looks like. ALL that trace tells you is there is *probably* something wrong somewhere..."

I get what you're saying, and in theory IP does not guarantee a particular route for a series of packets. But we both know that in practice, when I run two traces in a row and they show the same hop information, I'm almost always getting the exact same route. Couple that with the lack of HTTP connectivity during these outages and it's obvious something is wrong. I wonder if it has to do with that proxy shield thing they have. Granted, it may keep bad people out, but you don't commit suicide if you just need to amputate an infected limb!
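
Since a traceroute can show timeouts even when the site is fine, the HTTP side is worth checking separately. A minimal sketch, assuming the web server listens on TCP port 80 and that a completed TCP handshake is good enough evidence of reachability; the function name and the 5-second timeout are arbitrary choices.

```python
import socket


def tcp_reachable(host: str, port: int = 80, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port completes within timeout."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers timeouts, refused connections, unreachable networks, DNS failures.
        return False


if __name__ == "__main__":
    print(tcp_reachable("webhostingtalk.com"))
```

If this returns False at the same moments the trace dies, the outage is real for web traffic regardless of where the ICMP replies are being lost.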

othellotech
12-28-2007, 07:46 PM
example:

1 gw1-shared-devices.uk.othellotech.net (80.82.140.66) 0.186 ms 0.151 ms 0.226 ms
2 transit1.as29527.net (80.82.140.41) 0.460 ms 0.453 ms 0.439 ms
3 peering1.as29527.net (80.82.140.43) 0.426 ms 0.414 ms 0.402 ms
4 193.109.219.50 (193.109.219.50) 0.941 ms 0.955 ms 1.033 ms
5 linx.peer.nac.net (195.66.224.94) 130.567 ms 130.537 ms 130.522 ms
6 0.so-5-0-0.gbr1.mmu.nac.net (209.123.11.53) 80.277 ms 80.174 ms 80.154 ms
7 0.ge-0-1-0.dar2.mmu.nac.net (209.123.11.206) 80.262 ms 0.ge-3-0-0.dar2.mmu.nac.net (209.123.11.166) 86.587 ms 0.ge-0-1-0.dar2.mmu.nac.net (209.123.11.206) 86.688 ms
8 * * *
9 * * *

Does this mean there is a problem? Is there 100% packet loss at points 8 or 9?
Simple answer - no, because the *RETURN* path, as seen on a trace from the other end, shows it's going through a provider who is deliberately dropping ICMP packets from the IP range - so the "trace" packet gets to hop 8 (and 9) just fine, but the reply gets "dumped" at hop 5 on the way back.

In a BGP network you decide the policy of how packets go *out* from your routers; someone else decides how they come *in* (your inbound is someone else's outbound and therefore subject to their policies/settings/metrics/tweaks/c**kups).
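
The same point as a toy model, in Python. Everything in it is hypothetical (router names, network names, the simulate_trace function). It assumes the outbound path delivers every probe, and the only thing that varies is which networks the replies cross on the way back.

```python
from typing import Dict, List, Set


def simulate_trace(forward: List[str],
                   return_paths: Dict[str, List[str]],
                   icmp_droppers: Set[str]) -> List[str]:
    """Return what a traceroute would display for each outbound hop.

    forward       -- routers on the outbound path, in order
    return_paths  -- for each router, the networks its reply crosses coming back
    icmp_droppers -- networks that silently drop those ICMP replies
    """
    shown = []
    for router in forward:
        reply_route = return_paths.get(router, [])
        # The probe reached the router; the reply is lost if any network
        # on the way back filters it.
        dropped = any(net in icmp_droppers for net in reply_route)
        shown.append("* * *" if dropped else router)
    return shown


if __name__ == "__main__":
    forward = ["gw", "transit", "peer", "dest-edge", "dest"]
    return_paths = {
        "gw": [],
        "transit": ["netA"],
        "peer": ["netA"],
        "dest-edge": ["netB", "netA"],  # replies from the far end come back via netB
        "dest": ["netB", "netA"],
    }
    for line in simulate_trace(forward, return_paths, {"netB"}):
        print(line)
```

With netB filtering ICMP, the last two hops show as stars even though every probe arrived, which is the situation described above.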

Oddly enough, I can never traceroute to WHT whether it's "up" or not ...
9 so-3-0-0.mpr1.iad2.us.above.net (64.125.29.134) 23.621 ms 22.343 ms 9.293 ms
10 so-3-0-0.mpr1.iad10.us.above.net (64.125.30.117) 9.143 ms 9.144 ms 10.305 ms
11 209.249.11.37.available.above.net (209.249.11.37) 9.227 ms 9.264 ms 9.187 ms
12 vlan903.core3.iad1.rackspace.com (69.20.1.40) 9.121 ms 9.311 ms 9.206 ms
13 aggr104a.iad1.rackspace.com (69.20.3.19) 9.174 ms 9.075 ms 9.019 ms
14 * * *
15 * *

Maybe it's all part of racksplat's fictitious 100% uptime? It's *always* available - we just have to all take turns when that availability is ;)

utropicmedia-karl
12-29-2007, 11:28 AM
"Maybe it's all part of racksplat's fictitious 100% uptime? It's *always* available - we just have to all take turns when that availability is ;)"


LOL.


Round-robin uptime.


I want that on a t-shirt.

suntexssupport
01-27-2008, 10:35 AM
Maybe it's because of too many visitors :)

plumsauce
01-28-2008, 02:39 AM
I have found that if I allow JavaScript on WHT it will try to load the top banner. That top banner consistently hung my browser. Once I disallowed JavaScript, the page loads very quickly.

This is also true on other sites for various ad servers.

The thing that site owners have to remember when including off-server content in their pages is that they are then at the mercy of the external servers.
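
A minimal sketch for seeing which external servers a page depends on for its scripts. It uses only the Python standard library; the function and class names and the example.com / example.net hosts are made up, and it only looks at script tags with a src attribute, not images, iframes or scripts loaded by other scripts.

```python
from html.parser import HTMLParser
from typing import List
from urllib.parse import urlparse


class ScriptSrcCollector(HTMLParser):
    """Collect the src attribute of every <script> tag in a page."""

    def __init__(self) -> None:
        super().__init__()
        self.sources: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            for name, value in attrs:
                if name == "src" and value:
                    self.sources.append(value)


def external_script_hosts(html: str, own_host: str) -> List[str]:
    """Return the distinct hosts, other than own_host, that serve the page's scripts."""
    collector = ScriptSrcCollector()
    collector.feed(html)
    hosts: List[str] = []
    for src in collector.sources:
        host = urlparse(src).hostname  # None for relative URLs such as /js/site.js
        if host and host != own_host and host not in hosts:
            hosts.append(host)
    return hosts


if __name__ == "__main__":
    page = (
        '<html><head>'
        '<script src="/js/site.js"></script>'
        '<script src="http://ads.example.net/banner.js"></script>'
        '</head><body></body></html>'
    )
    print(external_script_hosts(page, "www.example.com"))
```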

utropicmedia-karl
01-29-2008, 02:08 PM
It's not an ad thing - nothing loads and the connection just times out.

Woooo
02-02-2008, 01:36 PM
WHT.. time to get a good webhost now.