  1. #1

    Centos Problem HDD

    I have a colo server, and lately it has been crashing every night between 2 and 3 o'clock in the morning. I asked my server management company to have a look at it, but no luck: they installed some RPMs, lowered the HTTP max, etc. (I don't want to mention their name, since they have helped me a lot and it would not be fair to them if I said something bad about them.)

    The server is an iServ R254 that I bought from Silicon Mechanics:

    CPU: 2 x Intel Xeon E5410 Quad-Core 2.33GHz, 12MB Cache, 1333MHz FSB, 45nmHi-k
    RAM: 12GB (6 x 2GB) DDR2-667 Registered ECC - Interleaved
    NIC: Intel 82573V & 82573L Gigabit Ethernet Controllers - Integrated
    Hot-Swap Drive - 1: 150GB Western Digital Raptor (1.5Gb/s,10Krpm,16MB Cache,NCQ) SATA
    Hot-Swap Drive - 2: 500GB Seagate Barracuda ES.2 (3Gb/s, 7.2Krpm, 32MB Cache, NCQ) SATA
    Optical Drive: Low-Profile DVD-ROM Drive
    Power Supply: 520W Power Supply with PFC - 87% Maximum Efficiency
    Rail Kit: 2-Piece Ball-Bearing Rail Kit
    OS: CentOS 5 - 64-bit - Preload, No Media
    Warranty: Standard 3 Year - Return to Depot - Advanced Component Exchange

    Configured Power: 255 W, 262 VA, 871 BTU/h, 2.4 Amps (110V), 1.3 Amps (208V)

    I'm using CentOS 5.2 x86_64.

    I checked the messages log, and these are the errors logged before each crash:

    hdc: status timeout: status=0xd0 { Busy }
    ide: failed opcode was: unknown
    hdc: no DRQ after issuing MULTWRITE_EXT
    ide1: reset: success
    hdc: status timeout: status=0xd0 { Busy }
    ide: failed opcode was: unknown
    hdc: no DRQ after issuing MULTWRITE_EXT
    ide1: reset: success
    hdc: status timeout: status=0xd0 { Busy }
    ide: failed opcode was: unknown
    hdc: no DRQ after issuing MULTWRITE_EXT
    ide1: reset: success
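
To check whether these errors really cluster around the 2-3 AM crashes, the log can be bucketed by hour. This is only a minimal sketch, assuming the usual syslog line prefix (`Mon DD HH:MM:SS host kernel: ...`). The sample lines and their timestamps are made up for illustration (the excerpt above was posted without timestamps), and `count_ide_errors_by_hour` is a hypothetical helper name.

```python
import re
from collections import Counter

# Matches a syslog-style kernel line such as:
#   "Dec  3 02:41:07 server kernel: hdc: status timeout: status=0xd0 { Busy }"
# Group 1 is the hour, group 2 the IDE device name, group 3 the message text.
LINE_RE = re.compile(
    r"^\w{3}\s+\d+\s+(\d{2}):\d{2}:\d{2}\s+\S+\s+kernel:\s+(hd[a-z]):\s+(.*)$"
)


def count_ide_errors_by_hour(lines):
    """Return a Counter keyed by (device, hour) for IDE timeout / DRQ errors."""
    counts = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # not a per-device IDE message (e.g. "ide1: reset: success")
        hour, device, message = int(m.group(1)), m.group(2), m.group(3)
        if "status timeout" in message or "no DRQ" in message:
            counts[(device, hour)] += 1
    return counts


# Hypothetical sample lines; on the server you would iterate over
# open("/var/log/messages") instead.
SAMPLE = [
    "Dec  3 02:41:07 server kernel: hdc: status timeout: status=0xd0 { Busy }",
    "Dec  3 02:41:07 server kernel: ide: failed opcode was: unknown",
    "Dec  3 02:41:07 server kernel: hdc: no DRQ after issuing MULTWRITE_EXT",
    "Dec  3 02:41:08 server kernel: ide1: reset: success",
    "Dec  3 03:02:15 server kernel: hdc: status timeout: status=0xd0 { Busy }",
]

if __name__ == "__main__":
    for (device, hour), n in sorted(count_ide_errors_by_hour(SAMPLE).items()):
        print(f"{device} {hour:02d}:00 {n} error(s)")
```

If nearly all the hits land in the same hour, it is worth checking what is scheduled to run at that time (for example nightly cron jobs that write heavily to that disk).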

    Motherboard manual
    http://files.siliconmechanics.com/Do...d/MNL-0957.pdf

    I read on the CentOS forum that it's a problem with the drive, the 500 GB one (the server sees that drive as ATA instead of SATA).

    I tried going into the BIOS and changing to AHCI, but then the server doesn't recognize the drive.

    http://img212.imageshack.us/img212/5890/img331.jpg

    Any help will be appreciated.

  2. #2
    Join Date
    Apr 2003
    Location
    NC
    Posts
    3,080
    It sees the drive as IDE because you are in Compat mode and not AHCI.

    It sounds like it was set to Compat for the install, so the OS thought the other drive was the only drive. When you set it to AHCI, I bet the OS drive is being put as sdb, not sda.

    You could try having them swap the two hot-swap drives into the opposite bays, load up a rescue disk and manually fix the bootloader, or, if you have no clue what you are doing, reload the OS with both drives in AHCI mode.

    The hard drive is probably not bad, and this is probably just related to it being in compat mode.
    John W, CISSP, C|EH
    MS Information Security and Assurance
    ITEagleEye.com - Server Administration and Security
    Yawig.com - Managed VPS and Dedicated Servers with VIP Service

  3. #3
    Thanks eth00, I will ask my server management to do that. For tonight I'm just going to unmount the 500 GB drive, to be sure that's the one causing the problem. Thanks again.

  4. #4
    No problem. It should be the cause: if you look, the errors are all on "hdc", which is an IDE device name, which means something that is not in AHCI mode: the 500 GB drive.
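
The naming argument above can be sketched in a few lines. This assumes the convention on kernels of that era: `hdX` names come from the legacy IDE driver, while `sdX` names come from the SCSI/libata layer. An `sdX` name does not by itself prove AHCI, since other libata drivers also produce it. `driver_family` and `devices_in_messages` are hypothetical helper names.

```python
import re


def driver_family(device):
    """Guess which kernel driver family named a disk, from its name alone."""
    if re.fullmatch(r"hd[a-z]+", device):
        # For a SATA disk this suggests the port is presented in IDE/compat mode.
        return "legacy IDE driver"
    if re.fullmatch(r"sd[a-z]+", device):
        return "SCSI/libata layer"
    return "unknown"


def devices_in_messages(lines):
    """Collect disk names that prefix kernel messages like 'hdc: no DRQ ...'."""
    found = set()
    for line in lines:
        m = re.match(r"\s*((?:hd|sd)[a-z]+):", line)
        if m:
            found.add(m.group(1))
    return found


# Lines taken from the log excerpt in the first post.
LOG_LINES = [
    "hdc: status timeout: status=0xd0 { Busy }",
    "ide: failed opcode was: unknown",
    "hdc: no DRQ after issuing MULTWRITE_EXT",
    "ide1: reset: success",
]

if __name__ == "__main__":
    for dev in sorted(devices_in_messages(LOG_LINES)):
        print(dev, "->", driver_family(dev))
```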

  5. #5
    Just another question: why does the server see my 1st HD as SATA, but the 2nd HD as ATA? Both HDs are SATA. Thanks.

  6. #6
    BIOS configuration; you already said it "will not boot in AHCI mode". AHCI mode is basically what lets the OS see the hard drive as SATA. In compatibility mode, the motherboard makes it look like an IDE drive to the OS.

  7. #7
    Hi eth00, I asked the datacenter to do it, but they said it is too risky and that there is a 90 percent chance it would cause a kernel panic. Is there any other way of doing it? Thanks.

  8. #8
    They said that changing the hard drive over to AHCI would cause a panic? I don't see what the issue would be if both drives are on the same controller, which I would assume they are.

    The only thing I can think of is that perhaps, like I mentioned earlier, the primary drive is becoming sdb and the server is trying to boot off of the 500 GB drive when they switch it over. That is why I was suggesting the drive bay swap. They could always check this via a live CD and see which drive is labeled which way.
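
One rough way to do that live CD check is to list each disk's name next to its size, since the two drives here differ clearly (150 GB vs 500 GB). The sketch below assumes a Linux sysfs layout where `/sys/block/<dev>/size` holds the size in 512-byte sectors, and that Python 3 is available on the live CD (CentOS 5 itself ships an older Python). The `device/model` file may be missing under the legacy IDE driver, so it is treated as optional. `list_disks` is a hypothetical helper name.

```python
import os

SECTOR_BYTES = 512  # /sys/block/<dev>/size is expressed in 512-byte sectors


def read_text(path):
    """Return the stripped contents of a file, or '' if it cannot be read."""
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        return ""


def list_disks(sys_block="/sys/block"):
    """Map each hdX/sdX entry to (size in GB rounded, model string or '')."""
    disks = {}
    for name in sorted(os.listdir(sys_block)):
        if not name.startswith(("hd", "sd")):
            continue  # skip loop devices, ram disks, etc.
        sectors = read_text(os.path.join(sys_block, name, "size"))
        if not sectors.isdigit():
            continue
        size_gb = int(sectors) * SECTOR_BYTES / 1e9
        model = read_text(os.path.join(sys_block, name, "device", "model"))
        disks[name] = (round(size_gb), model)
    return disks


if __name__ == "__main__":
    if os.path.isdir("/sys/block"):
        for name, (size_gb, model) in list_disks().items():
            print(f"{name}: ~{size_gb} GB {model}")
```

Running it once in compat mode and once in AHCI mode would show whether the 150 GB OS drive keeps the same name. Note that an optical drive can also show up under an `hdX` name in compat mode.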

    I am not sure of any way to fix those errors short of the real fix of going AHCI.

    The real question is: did they set up the server like this in the first place?

  9. #9
    When I bought the server everything was fine, until one day I rebooted it and a kernel was missing, so I asked the datacenter to reinstall the OS for me. The problem started after that.

  10. #10
    I checked the 500 GB drive, and this is what I found:

    (smartctl -A /dev/hdc)
    === START OF READ SMART DATA SECTION ===
    SMART Attributes Data Structure revision number: 10
    Vendor Specific SMART Attributes with Thresholds:
    ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
      1 Raw_Read_Error_Rate     0x000f   102   099   006    Pre-fail  Always   -           4256474
      3 Spin_Up_Time            0x0003   095   095   000    Pre-fail  Always   -           0
      4 Start_Stop_Count        0x0032   100   100   020    Old_age   Always   -           17
      5 Reallocated_Sector_Ct   0x0033   100   100   036    Pre-fail  Always   -           0
      7 Seek_Error_Rate         0x000f   072   060   030    Pre-fail  Always   -           8625499575
      9 Power_On_Hours          0x0032   089   089   000    Old_age   Always   -           10058
     10 Spin_Retry_Count        0x0013   100   100   097    Pre-fail  Always   -           0
     12 Power_Cycle_Count       0x0032   100   100   020    Old_age   Always   -           17
    184 Unknown_Attribute       0x0032   100   100   099    Old_age   Always   -           0
    187 Unknown_Attribute       0x0032   100   100   000    Old_age   Always   -           0
    188 Unknown_Attribute       0x0032   099   001   000    Old_age   Always   -           4295037167
    189 Unknown_Attribute       0x003a   100   100   000    Old_age   Always   -           0
    190 Unknown_Attribute       0x0022   075   063   045    Old_age   Always   -           437780505
    194 Temperature_Celsius     0x0022   025   040   000    Old_age   Always   -           25 (Lifetime Min/Max 0/18)
    195 Hardware_ECC_Recovered  0x001a   054   028   000    Old_age   Always   -           4256474
    197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always   -           0
    198 Offline_Uncorrectable   0x0010   100   100   000    Old_age   Offline  -           0
    199 UDMA_CRC_Error_Count    0x003e   200   200   000    Old_age   Always   -           0

    Is it possible that this is causing the server to crash? And do I need to replace the drive, or just reformat it?

    thanks
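
A table like this can be screened mechanically. The sketch below flags an attribute when its normalized VALUE is at or below a nonzero THRESH, when WHEN_FAILED is set, or when one of the bad-sector counters (IDs 5, 197, 198) has a nonzero raw value. That rule set is a common rule of thumb, not an official smartmontools check, and `parse_attributes` / `smart_warnings` are hypothetical helper names. The input is a subset of the rows posted above. The large raw numbers for Raw_Read_Error_Rate and Seek_Error_Rate are deliberately ignored: as far as I know these raw fields are vendor-specific packed counters on Seagate drives, so only the normalized values are compared.

```python
# Subset of the `smartctl -A /dev/hdc` rows from the post above.
SMART_OUTPUT = """\
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x000f 102 099 006 Pre-fail Always - 4256474
5 Reallocated_Sector_Ct 0x0033 100 100 036 Pre-fail Always - 0
7 Seek_Error_Rate 0x000f 072 060 030 Pre-fail Always - 8625499575
194 Temperature_Celsius 0x0022 025 040 000 Old_age Always - 25 (Lifetime Min/Max 0/18)
197 Current_Pending_Sector 0x0012 100 100 000 Old_age Always - 0
198 Offline_Uncorrectable 0x0010 100 100 000 Old_age Offline - 0
199 UDMA_CRC_Error_Count 0x003e 200 200 000 Old_age Always - 0
"""

# Reallocated, pending and offline-uncorrectable sector counters.
SECTOR_HEALTH_IDS = {5, 197, 198}


def parse_attributes(text):
    """Parse smartctl -A attribute rows into a list of dicts."""
    rows = []
    for line in text.splitlines():
        parts = line.split(None, 9)  # RAW_VALUE may itself contain spaces
        if len(parts) < 10 or not parts[0].isdigit():
            continue  # header or unrelated line
        rows.append({
            "id": int(parts[0]),
            "name": parts[1],
            "value": int(parts[3]),
            "thresh": int(parts[5]),
            "when_failed": parts[8],
            "raw": parts[9],
        })
    return rows


def smart_warnings(rows):
    """Return human-readable warnings based on the rule of thumb above."""
    found = []
    for r in rows:
        if r["thresh"] > 0 and r["value"] <= r["thresh"]:
            found.append(f"{r['name']}: value {r['value']} <= threshold {r['thresh']}")
        if r["when_failed"] != "-":
            found.append(f"{r['name']}: WHEN_FAILED is {r['when_failed']}")
        if r["id"] in SECTOR_HEALTH_IDS and int(r["raw"].split()[0]) > 0:
            found.append(f"{r['name']}: raw count {r['raw']}")
    return found


if __name__ == "__main__":
    problems = smart_warnings(parse_attributes(SMART_OUTPUT))
    print("\n".join(problems) if problems else "no SMART warnings by these rules")
```

By these rules nothing in the posted rows is flagged, which fits eth00's guess that the drive itself is probably not bad. SMART attributes say nothing about the controller mode, though, so a clean result does not rule out the compat-mode explanation.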
