Web Hosting Talk







RaQ550 Rebooting Issues


bek444
05-23-2006, 11:12 AM
Please forgive my technical inabilities, I will try to explain as best I can...
(my technician left my business high and dry, so I would appreciate any help)

My RaQ550 needed rebooting. I thought I knew the IP address to use,
so I tried to ssh admin@2............................. (I used the IP address of my web sites, rather than the server, which I didn't know at the time)

When I tried to ssh onto the server, it said the IP wasn't recognised, so I said
yes when it asked whether I wanted the IP to be recognised, then tried rebooting from there. It didn't work.

I found the proper IP address I was supposed to use...

But the server hasn't been right since: people can't receive their emails. Every time they try, it asks for their password (which it doesn't normally do) and won't let them log on. THIS IS THE URGENT ISSUE!!

Does anyone know whether what I did, when I ssh'd to the wrong IP address, could cause such problems? And if so, how do I fix it?

Or if this is not the problem, what is? What can I do?

Also now the Active Monitor keeps sending this message :

* There is a problem with disk integrity.
- The system is resynchronizing the information on the disks. Please look at the Disk Integrity entry in Active Monitor for more information.
How do I do this? And how do I fix this problem?

Thanks in Advance

BruceT
05-23-2006, 08:48 PM
Don't worry about the disk rebuild; that's the RAID doing its thing. It will stop eventually, and you will get another Active Monitor message to that effect. Active Monitor should only send one message per status change, though; you shouldn't keep getting messages about the disk integrity unless you're rebooting the server again and again...

Re your other problems: maybe some of the services didn't restart properly? SSHing into the server and rebooting wouldn't cause an authentication issue with email.

Were any patches installed, etc., prior to rebooting the server?

bek444
05-23-2006, 09:07 PM
Hi,

No patches were installed on the server...

You said:
SSHing into the server and rebooting wouldn't cause an authentication issue with email.

So even if I tried to SSH in on the website IP address and not the server IP, that wouldn't make a difference??

What else could be causing the problems??

Also, you said:
Don't worry about the disk rebuild; that's the RAID doing its thing.

It has come up 5 times in the last 24 hours and I have only rebooted twice...
And it hasn't given me an Active Monitor message saying everything is OK yet...

Is that still normal??

Thanks Again

BruceT
05-23-2006, 09:14 PM
1. Correct. Whether you SSH to an IP address or to a domain name does not matter (the domain name is resolved to an IP address via DNS anyway).

2. If you are getting repeated messages, your server may be rebooting itself for some reason. When you get an AM message, log in via SSH and run the 'uptime' command. That will tell you how long the server has been up since the last reboot. For example, on my server, when I do uptime I get

[admin admin]$ uptime
6:12pm up 110 days, 19:33, 1 user, load average: 0.05, 0.01, 0.00

Some causes of "spontaneous" reboots are failed cooling fans, or disk errors. You can look in /var/log/messages or use the 'dmesg' command to look for errors...


24 hours of RAID rebuild is not normal; I think you have something else going on there...
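
The uptime check from point 2 can be wrapped in a small shell function. This is only a minimal sketch: it assumes the classic uptime output format shown above, where the word "day" appears only once the machine has been up for at least a day, and the function name check_uptime is made up for illustration.

```shell
# check_uptime: classify one line of `uptime` output.
# Prints "recent" when the line shows less than one day of uptime
# (a hint that the box rebooted not long ago), otherwise "ok".
# Assumption: the line uses the "up N days, HH:MM" layout shown above.
check_uptime() {
    case "$1" in
        *" day"*) echo "ok" ;;
        *)        echo "recent" ;;
    esac
}
```

Running check_uptime "$(uptime)" each time an Active Monitor message arrives would show whether the message lines up with a fresh reboot.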

bek444
05-23-2006, 09:39 PM
Hi again,

I tried the uptime command :

[admin admin]$ uptime
8:47am up 5:19, 1 user, load average: 2.02, 2.01, 2.00

So does that mean my machine has been up for the past five hours??
Also what does the load average mean??

When I tried the /var/log/messages command, it gave me "permission denied", even though I am logged in as "SU".

You said I could use the 'dmesg' command to look for errors... do I just type in "dmesg"?

So what causes an authentication issue with email ??

I'm so sorry about all the questions...
I just don't know what else to do

Thanks

BruceT
05-23-2006, 09:48 PM
Yes, uptime of 5:19 is 5 hours, 19 minutes. Load average shows the average number of processes running or waiting for the CPU (on Linux, also those waiting on disk I/O) over the last 1, 5, and 15 minutes. If your RAID is rebuilding, a load average of 2 is normal.
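
If you want just the three load figures without the rest of the line, something like the following should work. It is a rough sketch that assumes the line ends with "load average: a, b, c" as in the output above; load_averages is a made-up helper name.

```shell
# load_averages: print the 1-, 5- and 15-minute load averages from a line
# of `uptime` output, separated by spaces.
# Assumption: the line ends with "load average: a, b, c" as shown above.
load_averages() {
    printf '%s\n' "$1" | sed 's/.*load average: //' | tr -d ','
}
```

Called as load_averages "$(uptime)", the first number is the most recent minute; if it keeps climbing well after the RAID rebuild should have finished, something else is keeping the machine busy.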

Did you become root from admin by doing "su -" or "su" ?? You need to add the "-" to the command to get root's search path etc.

Yes, as root, type dmesg. It will spew out all the boot info. If there are any errors in there, that might help you figure out what's going on with the server.

Re the authentication problems with mail, it could be a lot of things. Check /var/log/maillog (as root) to see if anything shows up there that points in one direction or another... without knowing specifics, it's too hard to narrow down the possibilities...

bek444
05-23-2006, 10:14 PM
When I did the dmesg command, a whole heap of the following was in the list :

end_request : I/O error, dev 03:04 (hda) sector 142657360
raid 1 : hda4 : rescheduling block 142657360
raid 1: hda4 : unrecoverable I/O read error for block 142657360
hda : dma_intr : status = 0x51 {DriveReady SeekComplete Error}
hda : dma_intr : error = 0x40 { Uncorrectable Error} LBAsect = 154971107, sector = 142657376

Do you know what this means??

I also checked /var/log/maillog; it said "no such file or directory".

"without knowing specifics, it's too hard to narrow down the possibilities.."
What specifics would you need??

Thank you again

bek444
05-23-2006, 10:23 PM
"Did you become root from admin by doing "su -" or "su" ?? You need to add the "-" to the command to get root's search path etc."

When I tried the "su-" it said - bash : su- : command not found

Also I tried the /var/log/maillog (as root) again, the following error was given -
bash : /var/log/maillog : Permission Denied ...

This is a puzzle to me, as it is the password I use to reboot the system...

Thanks

BruceT
05-23-2006, 10:27 PM
end_request : I/O error, dev 03:04 (hda) sector 142657360
raid 1 : hda4 : rescheduling block 142657360
raid 1: hda4 : unrecoverable I/O read error for block 142657360
hda : dma_intr : status = 0x51 {DriveReady SeekComplete Error}
hda : dma_intr : error = 0x40 { Uncorrectable Error} LBAsect = 154971107, sector = 142657376


Your hard drive is failing. You need to get whatever data you can off this thing and replace the drive. That's probably why the RAID rebuild is taking forever, or not happening at all, and why the server keeps rebooting. It could also be related to your mail issues.
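
To get a feel for how often the kernel is logging errors like these, the dmesg output can be piped through a small filter. A minimal sketch follows; the three patterns are taken straight from the messages quoted above, other disk drivers may word their errors differently, and count_disk_errors is a hypothetical name.

```shell
# count_disk_errors: read kernel log text on stdin and print how many lines
# look like disk read failures. Patterns come from the messages quoted above;
# they are not an exhaustive list of kernel disk errors.
count_disk_errors() {
    awk '/I\/O error|unrecoverable|Uncorrectable Error/ { n++ } END { print n + 0 }'
}
```

As root, dmesg | count_disk_errors prints a single number. Anything above zero on a drive that is part of the RAID is a reason to copy data off sooner rather than later.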

If you don't have a maillog, sendmail is messed up. That also could be related to your mail problem.

One way or another, IMO your server is failing pretty badly. If you have data that's not backed up someplace else, I'd get it off as quickly as you can.

For RaQ repairs, Gerald Waugh at raqware.com is the guy to talk to. That's here in the US though; I don't know what your recourse might be in Australia.

The RaQ 550 takes a specific model of Seagate hard drive, as they plug directly onto the motherboard rather than using a cable. So only 1 or 2 specific models of disks will fit in there.

bek444
05-23-2006, 11:01 PM
Thanks for all your help...

Would you mind answering some more run of the mill questions, when I get them??

And do I put them in the forums??

I will go and try and get these hard drives replaced...

Do you have any more clues as to what might cause the email issues??
The customer says it's coming up with "email server rejected login username and password". Is there some way I can test this myself??


Anyway....Thanks

BruceT
05-23-2006, 11:08 PM
I'm on here sporadically, so I can't make any guarantees about timeliness of answers, etc. There are many other folks here who are pretty helpful, so I'd say yes, post away...

Regarding the email issue, pick a username that's having problems, reset the user's password, and then try to connect using the new info.
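
One way to try the login by hand, without a mail client in the way, is to speak POP3 to the server directly. This is a sketch under assumptions: I'm guessing the mailboxes are read over POP3 on port 110 (nothing in this thread confirms which protocol is used), and the username, password, and mail.example.com are placeholders. The helper pop3_login_commands is made up; it only prints the standard USER, PASS, and QUIT commands.

```shell
# pop3_login_commands: print the POP3 commands for a plain login attempt.
# USER, PASS and QUIT are standard POP3 commands; each line ends in CRLF.
# $1 = mailbox username, $2 = password (both are placeholders).
# Note: this sends the password unencrypted, so use a freshly reset one.
pop3_login_commands() {
    printf 'USER %s\r\nPASS %s\r\nQUIT\r\n' "$1" "$2"
}
```

You can type the same three commands into an interactive telnet mail.example.com 110 session, or pipe the function's output into a TCP client such as netcat if one is installed. A reply starting with +OK after the PASS line means the server accepted the login; -ERR means it rejected it.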

Without a logfile though, there won't be much diagnostic info. You could look in the general system log file, /var/log/messages. Sometimes there is a security/authorization logfile, either /var/log/secure or /var/log/auth. You can see the last x lines in a file by using the "tail" command:

tail -20 /var/log/messages

will show you the last 20 lines of that file.
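
To narrow that down to lines that might relate to the mail login problem, tail can be combined with a filter. A rough sketch, assuming the interesting lines contain words like "auth", "password", or "login" (the real wording depends on which mail daemon is running); recent_auth_lines is a hypothetical helper name.

```shell
# recent_auth_lines: show lines among the last N of a log file that mention
# authentication, passwords or logins (case-insensitive match).
# $1 = log file path, $2 = number of trailing lines to scan (default 20).
recent_auth_lines() {
    tail -n "${2:-20}" "$1" | awk 'tolower($0) ~ /auth|password|login/'
}
```

For example, as root, recent_auth_lines /var/log/messages 200 scans the last 200 lines. Doing this right after a failed login attempt makes the relevant entries easier to spot.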