monitoring outages
coolhand 03-04-2003, 02:08 AM I read somewhere about services that ping your web site every 15 minutes and notify you if it goes down. Some were free and some were subscription. If anyone knows the names of these services, I would be very appreciative.
iamdotca 03-04-2003, 02:20 AM People have been raving about Alertra. I, however, spent a lot of [expletive deleted] time compiling Nagios. It can now tell me when a drop of water falls in the kitchen sink, and when my oil needs changing! Pretty amazing what OSS can do for ya!
hostsol 03-04-2003, 02:24 AM http://www.alertradar.com
http://www.internetseer.com
But I don't think these services are worth the price. A script of a few lines can do outage monitoring.
If your ISP allows cron jobs, it is always better to do the outage monitoring yourself.
Just my $0.02.
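For anyone wondering what such a "few-line script" might look like, here is a minimal sketch in Python. It is only an illustration, not hostsol's actual script: the function name, the timeout, and the example URL are all assumptions, and it checks an HTTP response rather than sending an ICMP ping.

```python
#!/usr/bin/env python3
"""Minimal outage check meant to be run from cron (a sketch, not a product)."""
import sys
import urllib.error
import urllib.request

TIMEOUT_SECONDS = 10  # assumed value; tune for your own site


def site_is_up(url, timeout=TIMEOUT_SECONDS, opener=urllib.request.urlopen):
    """Return True if the URL answers with an HTTP status below 400."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.status < 400
    except (urllib.error.URLError, OSError):
        # Covers DNS failures, refused connections, timeouts, and HTTP errors.
        return False


def main(argv):
    """Check the URL given on the command line; return a shell exit code."""
    url = argv[1]
    if site_is_up(url):
        return 0
    # Anything printed here ends up in cron's output for the job.
    print(f"DOWN: {url}")
    return 1


# Only runs when a URL is passed on the command line.
if __name__ == "__main__" and len(sys.argv) == 2:
    sys.exit(main(sys.argv))
```

A crontab entry along these lines would run it every 15 minutes (the script path and URL are hypothetical):

    */15 * * * * /usr/bin/python3 /path/to/check_site.py http://www.example.com/

If cron is set up to mail job output, the "DOWN" line becomes the notification. The check is most useful when it runs on a machine outside the network hosting the site.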
sprintserve 03-04-2003, 03:08 AM Yes. I am about to install and configure Nagios in several top-notch datacenters (all supposedly with 100% uptime) and have them monitor each other. Cheaper than Alertra for sure. I cringe when I see what they charge for a per-minute ping.
iamdotca 03-04-2003, 03:25 AM I know ... all that for a simple 56-byte packet. Again, I just have the monkeys working in my NOC ping my servers. I had them ping yours, sprintserve, and everything looks good:
[me@host]$ ping sprintserve.net
PING sprintserve.net (64.157.176.109) from 64.230.113.171 : 56(84) bytes of data.
64 bytes from ip64-157-176-109.neutelligent.com (64.157.176.109): icmp_seq=0 ttl=45 time=67.792 msec
64 bytes from ip64-157-176-109.neutelligent.com (64.157.176.109): icmp_seq=1 ttl=45 time=69.348 msec
64 bytes from ip64-157-176-109.neutelligent.com (64.157.176.109): icmp_seq=2 ttl=45 time=70.136 msec
64 bytes from ip64-157-176-109.neutelligent.com (64.157.176.109): icmp_seq=3 ttl=45 time=65.900 msec
64 bytes from ip64-157-176-109.neutelligent.com (64.157.176.109): icmp_seq=4 ttl=45 time=67.032 msec
--- sprintserve.net ping statistics ---
5 packets transmitted, 5 packets received, 0% packet loss
round-trip min/avg/max/mdev = 65.900/68.041/70.136/1.560 ms
BTW ... Sprintserve ... if you want kick-ass graphs, screw MRTG, get your hands on Cacti (http://www.raxnet.net/). It rocks and is much more versatile than MRTG!
iamdotca 03-04-2003, 03:29 AM Originally posted by hostsol
If your ISP allows cron jobs, it is always better to do the outage monitoring yourself.
There's only one thing wrong with this. If my servers and ISP links go down, how am I supposed to find out? That's the whole reason why I monitor my sites off-net.
sprintserve 03-04-2003, 03:31 AM Thanks for the tip. I will take a look. Alerons are being added now... 3 Gig-Es, fully burstable to 6 Gbit/s. It should get better :) Things are still in progress and being added... when completed, it will total almost 11 Gbit/s.
Cacti's graphs look similar to MRTG's. I guess I need to delve deeper.
iamdotca 03-04-2003, 03:40 AM My reading indicates MRTG doesn't scale as well as Cacti.
iamdotca 03-04-2003, 03:42 AM coolhand, you know the tried-and-true way of monitoring your servers ... sit beside them and watch the LEDs. It's worked for me! ;)
atjeu 03-04-2003, 04:08 AM Cacti is just a PHP graphing front end for rrdtool. MRTG is a different product. You can make rrdtool graphs look however you want, including just like MRTG's. MRTG is very limited compared to what rrdtool can do, depending on the front end.
ServeForce 03-04-2003, 06:11 AM Originally posted by coolhand
I read somewhere about services that ping your web site every 15 minutes and notify you if it goes down. Some were free and some were subscription. If anyone knows the names of these services, I would be very appreciative.
Screw commercial services, get some cheapie shell accounts and have them all ping each other... It'll more than pay for itself.
hostsol 03-04-2003, 06:29 AM Originally posted by iamdotca
There's only one thing wrong with this. If my servers and ISP links go down, how am I supposed to find out? That's the whole reason why I monitor my sites off-net.
In any case, self-monitoring is more accurate than off-net monitoring.
The former counts "non-uptime" (any period when the self-monitoring stops) as a server outage, and the latter counts observed "downtime" as a server outage.
When your monitored server is down, both can detect the outage. However, when your monitoring server is down, the former can still detect a server outage but the latter cannot.
That's why I say my original post is not wrong.
Of course, it all depends on your needs. If you need to act immediately on a server outage, off-net monitoring is the best choice. If you only want to measure overall uptime, choose self-monitoring.
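To make the "non-uptime" idea concrete, here is a rough sketch of how overall uptime could be worked out from self-monitoring. It assumes a cron job on the server records a timestamp (a "heartbeat") at a fixed interval, so any stretch with missing heartbeats is treated as an outage. The function names, the tolerance value, and the sample timestamps are hypothetical and only for illustration.

```python
from datetime import datetime, timedelta


def find_gaps(heartbeats, interval, tolerance=timedelta(seconds=30)):
    """Return (last_seen, next_seen) pairs where heartbeats are missing.

    A gap is any pair of consecutive heartbeats that are further apart
    than the expected interval plus a small tolerance for cron jitter.
    """
    ordered = sorted(heartbeats)
    gaps = []
    for previous, current in zip(ordered, ordered[1:]):
        if current - previous > interval + tolerance:
            gaps.append((previous, current))
    return gaps


def uptime_fraction(heartbeats, interval, tolerance=timedelta(seconds=30)):
    """Estimate uptime between the first and last heartbeat as a fraction."""
    ordered = sorted(heartbeats)
    if len(ordered) < 2:
        raise ValueError("need at least two heartbeats")
    total = ordered[-1] - ordered[0]
    # Each gap counts as downtime, minus the one interval that would
    # have elapsed between heartbeats anyway.
    down = sum(
        (end - start - interval
         for start, end in find_gaps(ordered, interval, tolerance)),
        timedelta(),
    )
    return 1 - down / total


# Hypothetical heartbeat log: one entry every 15 minutes, with a hole.
beats = [
    datetime(2003, 3, 4, 0, 0),
    datetime(2003, 3, 4, 0, 15),
    datetime(2003, 3, 4, 0, 30),
    datetime(2003, 3, 4, 1, 30),  # nothing was recorded for about an hour
    datetime(2003, 3, 4, 1, 45),
]
gaps = find_gaps(beats, timedelta(minutes=15))
fraction = uptime_fraction(beats, timedelta(minutes=15))
```

Note that this only tells you about an outage after the server is back and the log can be read, which is the trade-off against off-net monitoring described above.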