Thread: Faulty IOstat?

  1. #1

    Faulty IOstat?

    We have a RAID 10 with 6 drives, and here's the iostat output:

    Device: rrqm/s wrqm/s r/s w/s rsec/s wsec/s avgrq-sz avgqu-sz await svctm %util
    sda 22.72 138.23 24.66 12.88 2227.01 1208.87 91.53 0.19 5.03 3.08 11.55
    sda1 0.01 0.00 0.00 0.00 0.01 0.00 10.58 0.00 7.10 6.49 0.00
    sda2 0.00 0.00 0.00 0.00 0.01 0.00 47.20 0.00 3.98 3.68 0.00
    sda3 0.01 0.13 0.00 0.17 0.06 2.36 14.17 15.02 0.77 5857.96 99.98
    sda4 0.00 0.00 0.00 0.00 0.00 0.00 2.00 1.00 0.00 89353908.00 99.99
    sda5 22.71 138.10 24.66 12.71 2226.93 1206.51 91.88 0.19 5.05 3.09 11.55
    sdc 0.13 0.08 0.53 0.03 5.31 0.83 11.04 0.01 18.24 2.32 0.13
    sdc1 0.13 0.08 0.53 0.03 5.31 0.83 11.04 0.01 18.24 2.32 0.13
    See the svctm and %util for sda3 and sda4. They always stay up in that high range and never drop, even when our web servers aren't running.

    Does any guru know what could possibly be wrong?
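    For anyone who wants to reproduce this, we watch it with repeated extended samples (this assumes the Linux sysstat iostat); the first report is only a since-boot average, so the later intervals are the ones that matter:
    # iostat -dx 5 3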

  2. #2
    Join Date
    Oct 2010
    Location
    Kent, UK
    Posts
    185
    svctm and %util are indicators of how long requests are taking, and those numbers say those two disks are taking a very long time.
    Without more details I'd be looking at drive, controller, or cabling errors.
    What does a dd test look like in iostat?
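    As a rough sketch (the file name and sizes are only placeholders), run the write in one terminal and sample in another while it runs:
    # dd if=/dev/zero of=test.file bs=8k count=100k
    # iostat -dx 2 10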
    Cloud Pixies Ltd. Adding some Pixie magic into the Cloud!

  3. #3
    Join Date
    Oct 2009
    Posts
    822
    As long as sda shows a low load, you can probably ignore any high loads indicated for its partitions. I never use iostat on the partition level, as the numbers tend to be a bit off.
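    If you want to sanity-check that, sample just the whole device and leave the partitions out of it (assuming your iostat accepts a device argument, as the sysstat one does):
    # iostat -dx sda 5 3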
    Your faithful student,
    Twilight Sparkle

  4. #4
    Quote Originally Posted by DeanoC View Post
    svctm and %util are indicators of how long requests are taking, and those numbers say those two disks are taking a very long time.
    Without more details I'd be looking at drive, controller, or cabling errors.
    What does a dd test look like in iostat?
    Raid 10, 6 drives (the one with faulty iostat):
    # sync ; dd if=/dev/zero of=test.file bs=8k count=100k
    102400+0 records in
    102400+0 records out
    838860800 bytes (839 MB) copied, 3.11384 seconds, 269 MB/s
    # sync ; dd if=test.file of=/dev/null
    1638400+0 records in
    1638400+0 records out
    838860800 bytes (839 MB) copied, 1.91765 seconds, 437 MB/s


    Raid 5, 4 drives (our other server):
    # sync ; dd if=/dev/zero of=test.file bs=8k count=100k
    102400+0 records in
    102400+0 records out
    838860800 bytes (839 MB) copied, 2.06441 seconds, 406 MB/s
    # sync ; dd if=test.file of=/dev/null
    1638400+0 records in
    1638400+0 records out
    838860800 bytes (839 MB) copied, 1.78293 seconds, 470 MB/s
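    One caveat on the read figures: test.file was just written, so the read-back can be served partly from the page cache. Dropping caches first (needs root; assumes a Linux kernel that exposes drop_caches) gives a more honest number:
    # sync ; echo 3 > /proc/sys/vm/drop_caches
    # dd if=test.file of=/dev/null bs=8k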

  5. #5
    Join Date
    Oct 2010
    Location
    Kent, UK
    Posts
    185
    Even more suggestive of a hardware issue: the write is slowed because with RAID10 a streaming write has to touch every drive, bad ones included, whereas the read is mostly okay because it only sometimes has to hit the bad drives.
    I'd say you have a fair case for swapping the drives or cables.
    Cloud Pixies Ltd. Adding some Pixie magic into the Cloud!

  6. #6
    Join Date
    Oct 2010
    Location
    Kent, UK
    Posts
    185
    If your OS has any fault management tools, see if you can get the read/write/drop-out error counts per drive. That should help pin down the exact problem.
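    On Linux, for example, the per-drive SMART counters are a good first stop (assuming smartmontools is installed; sda here is just a placeholder):
    # smartctl -a /dev/sda | grep -iE 'reallocated|pending|uncorrect|crc'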
    Cloud Pixies Ltd. Adding some Pixie magic into the Cloud!

  7. #7
    Join Date
    Mar 2010
    Posts
    65
    Drives, cables, or a port/controller problem. This is not a software problem.
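    Cabling and controller trouble usually shows up in the kernel log as link resets or CRC errors, so a quick look there can help, for example:
    # dmesg | grep -iE 'ata|scsi' | grep -iE 'error|reset|crc'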

  8. #8
    Join Date
    Feb 2008
    Location
    Houston, Texas, USA
    Posts
    2,875
    Check out this link w.r.t. iostat's %util issues: http://yoshinorimatsunobu.blogspot.c...-on-linux.html

    In brief, it's not as reliable as it's thought to be. The %util numbers had me scratching my head in the past too.

    Regards
    Joe / UNIXY
    UNIXy - Fully Managed Servers and Clusters - Established in 2006
    [ cPanel Varnish Nginx Plugin ] - Enhance LiteSpeed and Apache Performance
    www.unixy.net - Los Angeles | Houston | Atlanta | Rotterdam
    Love to help pro bono (time permitting). joe > unixy.net

  9. #9
    Join Date
    Oct 2010
    Location
    Kent, UK
    Posts
    185
    It's not reliable as a speed metric, but it is fairly reliable as a warning sign. If one or two drives in a RAID set are showing different numbers from the rest of the set (like here), it's a fair sign something is wrong, and here it's backed up by the write speed being lower than expected.
    Cloud Pixies Ltd. Adding some Pixie magic into the Cloud!

  10. #10
    Join Date
    Apr 2009
    Location
    whitehouse
    Posts
    649
    What about the results of hdparm -tT /dev/devicename ?
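    Worth remembering that -tT only measures cached and buffered sequential reads, so running it a few times per drive gives steadier numbers, e.g. (device names are placeholders):
    # for d in sda sdc ; do hdparm -tT /dev/$d ; done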
    James B
    Ezeelogin - Setup your Secure Linux SSH Gateway.
    | Manage & Administer Multiple Linux Servers Quickly & Securely.
