  1. #1
    Join Date
    Jan 2004
    Location
    Pennsylvania
    Posts
    939

    Need opinions - ext3 on a very large filesystem

    Hi,

    We're considering deploying a large server that will have 8x 500GB drives in a RAID-10 config. I intend to use a 3ware 9650SE w/ BBU along with A/B power to each of the PSUs.

    My question is: since this will result in a 2TB array/partition, what do you think the fsck time would be in the event of a crash (kernel panic, etc.; I expect a power outage to be very rare, if it happens at all)? In my experience a RAID BBU reduces it significantly, sometimes to the point that no manual fsck is required. But when a manual fsck is needed, shouldn't the BBU provide more consistent data (fewer errors) and therefore a much shorter fsck? Maybe just a journal recovery?

    Any input?
    Matt Ayres - togglebox.com
    Linux and Windows Cloud Virtual Datacenters powered by Onapp / Xen
    Instant Setup, Instant Scalability, Full Lifecycle Hosting Solutions

    www.togglebox.com

  2. #2
    Join Date
    Jun 2003
    Location
    UK
    Posts
    6,601
    What are you going to be using the system for, and what sort of files will it hold? Lots of big ones, or a normal mix? I'm thinking you could use a larger block size, which would shorten fscks, but you would lose some space to overhead.

    Rus
    Russ Foster - Industry Curmudgeon

  3. #3
    Join Date
    Jan 2004
    Location
    Pennsylvania
    Posts
    939
    It would be ext3 with 4KB block/inode sizes (the max ext3 supports). This will be a host for VPS accounts, so it is really up to the customer. It will certainly have more small files than large ones.

  4. #4
    Join Date
    Mar 2001
    Posts
    1,434
    Just disable the auto fsck settings with tune2fs for "Maximum Mount count" and "check interval" with an ext3 FS, and rely on the journal recovery to shorten reboot times. If there is damage, you can manually run an fsck or force one.
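
As a rough sketch of the tune2fs suggestion above: `-c 0` turns off the check forced by the maximum mount count, and `-i 0` turns off the check forced by the time interval. The helper name and the device path `/dev/sdb1` are placeholders, not from the thread. The snippet only builds and prints the command; it does not run it.

```python
def build_disable_autofsck_cmd(device):
    """Return the tune2fs argv that disables the periodic forced checks."""
    # -c 0: never force a check based on the number of mounts
    # -i 0: never force a check based on elapsed time
    return ["tune2fs", "-c", "0", "-i", "0", device]


# /dev/sdb1 is a hypothetical device name; substitute the real array device.
cmd = build_disable_autofsck_cmd("/dev/sdb1")
print(" ".join(cmd))  # prints: tune2fs -c 0 -i 0 /dev/sdb1
```

Running `tune2fs -l` on the device afterwards shows the "Maximum mount count" and "Check interval" fields mentioned in the post.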

    - John C.

  5. #5
    Join Date
    Jan 2004
    Location
    Pennsylvania
    Posts
    939
    Disabling those forced checks is standard for us. My issue is with the manual filesystem checks needed for problems that fsck -p cannot fix. With a RAID BBU, the number of forced/manual filesystem checks should be greatly reduced, correct?

  6. #6
    Join Date
    May 2008
    Location
    FL
    Posts
    337
    Originally posted by TheWiseOne:
    Disabling those forced checks is standard for us. My issue is with the manual filesystem checks that fsck -p cannot fix. With a RAID BBU the amount of forced/manual filesystem checks should be greatly reduced, correct?
    Eh, RAID doesn't prevent file system corruption from sudden power offs. It mainly protects against drive failure.

  7. #7
    Join Date
    Jan 2004
    Location
    Pennsylvania
    Posts
    939
    It should with a BBU installed on it... that's the entire purpose.

  8. #8
    Join Date
    Jul 2006
    Location
    Detroit, MI
    Posts
    1,955
    ZFS is your friend.

  9. #9
    Join Date
    Jan 2004
    Location
    Pennsylvania
    Posts
    939
    Oh trust me... I LOVE ZFS. But no ZFS on Linux, and Virtuozzo requires ext3. I suppose I could build a ZFS SAN, but the issue there is redundancy and performance.

    I suppose I will have to wait until RHEL supports ext4....

  10. #10
    Join Date
    Jul 2006
    Location
    Detroit, MI
    Posts
    1,955
    Originally posted by TheWiseOne:
    Oh trust me... I LOVE ZFS. But no ZFS on Linux, and Virtuozzo requires ext3. I suppose I could build a ZFS SAN, but the issue there is redundancy and performance.

    I suppose I will have to wait until RHEL supports ext4....
    iSCSI is your new friend then. You can even boot from the iSCSI LUN.


    ZFS + iSCSI target ---> iSCSI initiator + ext3


    Now you still have the issue with power-loss with ext3, but at least you can take snapshots with the ZFS backend.

  11. #11
    Join Date
    Jan 2004
    Location
    Pennsylvania
    Posts
    939
    These are local disks, not SAN.

  12. #12
    Originally posted by TheWiseOne:
    It should with a BBU installed on it... that's the entire purpose.
    No. A BBU powers the RAID card's cache when power is lost, so the contents of the cache are preserved. A BBU usually holds the cache for 2-3 days after an outage, so power must be restored within that window; otherwise, if the cache holds changes that have not yet been flushed to disk, the array can become corrupted by inconsistent data.

    A RAID card/BBU has nothing to do with the consistency/state of the file system.

    We have a few 2TB arrays with ext3, and an fsck on those typically takes around 20-30 minutes to complete.

  13. #13
    Join Date
    May 2008
    Location
    FL
    Posts
    337
    Oops, wrong thread

  14. #14
    Join Date
    Aug 2001
    Location
    Orange County, CA
    Posts
    532
    Originally posted by vibrokatana:
    Eh, RAID doesn't prevent file system corruption from sudden power offs. It mainly protects against drive failure.
    It's critically important to make that point too. A RAID is no substitute for off-server backups.

    We've had an inadvertent hard reset corrupt the currently active parts of a RAID ext3 filesystem. Files turned into hard-linked directories; directories were replaced with files full of garbage; fsck ended up tossing a lot of crap into lost+found on boot. The mirror doesn't do you much good at that point.
    Jeff Standen, Chief of R&D, WebGroup Media LLC. - LinkedIn
    Cerb is a fast and flexible web-based platform for business collaboration and automation. http://www.cerbweb.com/

  15. #15
    Join Date
    Jul 2006
    Posts
    285
    You could look into mounting with "data=journal" and placing the fs journal on a separate flash device or solid state disk. You would be less likely to be left with a dirty fs and journal playback is a lot faster than an fsck.
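
A hedged sketch of that setup, assuming a brand-new filesystem rather than an existing one: format the fast device as an external journal with `mke2fs -O journal_dev`, create the ext3 filesystem pointing at it with `-J device=`, then mount with `data=journal`. The device names and mount point below are hypothetical, and the snippet only builds the commands for review; it does not execute anything.

```python
def external_journal_cmds(fs_dev, journal_dev, mount_point, block_size=4096):
    """Build (but do not run) the commands for ext3 with an external journal."""
    bs = str(block_size)
    return [
        # The journal device must use the same block size as the filesystem.
        ["mke2fs", "-O", "journal_dev", "-b", bs, journal_dev],
        # -j creates an ext3 journal; -J device= places it on the external device.
        ["mke2fs", "-j", "-b", bs, "-J", "device=" + journal_dev, fs_dev],
        # data=journal writes file data, not just metadata, through the journal.
        ["mount", "-t", "ext3", "-o", "data=journal", fs_dev, mount_point],
    ]


# All three paths are placeholders for illustration only.
for argv in external_journal_cmds("/dev/sdb1", "/dev/sdc1", "/vz"):
    print(" ".join(argv))
```

Note that `data=journal` writes all file data twice (once to the journal, once in place), so it is worth benchmarking before committing to it.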


  16. #17
    Join Date
    Aug 2001
    Location
    Orange County, CA
    Posts
    532
    Originally posted by [email protected]:
    Is it true that ext3 can only handle up to 8TB?
    A 4096-byte block size gives a 16TB maximum filesystem size. [http://en.wikipedia.org/wiki/Ext3#Size_limits]

    The difference probably comes down to whether you're using 32-bit or 64-bit.
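
The figures in that table follow from simple arithmetic, sketched below under the assumption that ext3 addresses blocks with a 32-bit number, so the maximum size is 2^32 blocks times the block size. The 31-bit case is included only as one possible explanation for the 8TB figure (for example, a signed 32-bit block number); that is an assumption, not something established in this thread.

```python
TIB = 2 ** 40  # bytes in one tebibyte


def max_fs_bytes(block_size, addr_bits=32):
    """Largest filesystem addressable with addr_bits-wide block numbers."""
    return (2 ** addr_bits) * block_size


for block_size in (1024, 2048, 4096, 8192):
    print(block_size, "->", max_fs_bytes(block_size) // TIB, "TiB")
# 1024 -> 4 TiB, 2048 -> 8 TiB, 4096 -> 16 TiB, 8192 -> 32 TiB

# Assumed explanation for the 8TB figure: only 31 usable bits with 4096-byte blocks.
print(max_fs_bytes(4096, addr_bits=31) // TIB, "TiB")  # 8 TiB
```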
    Last edited by jstanden; 05-24-2008 at 10:33 PM.

  17. #18
    Join Date
    Mar 2003
    Location
    Chicago
    Posts
    285
    Chong, with a 4K block size you can use 16TB partitions, and with an 8K block size you can use 32TB partitions.

    http://en.wikipedia.org/wiki/Ext3

    Thanks for the file server by the way. It arrived on Monday with some UPS dings but due to the double boxing the server itself was in perfect condition!

  18. Originally posted by scooby2:
    Chong, with 4k block size you can use 16tb and with 8k block size you can use 32tb partitions.

    http://en.wikipedia.org/wiki/Ext3

    Thanks for the file server by the way. It arrived on Monday with some UPS dings but due to the double boxing the server itself was in perfect condition!
    Thanks for the pointer!

    When you get a chance, please post some performance test results for the Adaptec 51645 RAID card (1.2GHz dual-core RAID processor, PCI-E 8-lane, 512MB buffer). On a 12x 1TB RAID5 volume we just tested, it achieved almost 300MB/s write speed in a straight "dd" zero-write test to a 20GB file. That was more than double the best write speed that 3ware or Areca can put out! Just wondering what the real-world performance of the Adaptec 5000 series is in production...

  19. #20
    Join Date
    Jan 2004
    Location
    Pennsylvania
    Posts
    939
    Originally posted by jstanden:
    It's critically important to make that point too. A RAID is no substitution for off-server backups.

    We've had an inadvertent hard reset corrupt the currently active parts of a RAID ext3 filesystem. Files turned into hard-linked directories; directories were replaced with files full of garbage; fsck ended up tossing a lot of crap into lost+found on boot. The mirror doesn't do you much good at that point.
    Did you have a BBU on the RAID card? I've seen that too, but only on systems where no BBU was installed. Basically the entire point of this post is to see if people see problems such as this on very large filesystems w/ BBU backed RAID.

  20. #21
    Join Date
    Aug 2001
    Location
    Orange County, CA
    Posts
    532
    Originally posted by TheWiseOne:
    Did you have a BBU on the RAID card? I've seen that too, but only on systems where no BBU was installed.
    Hey Matt,

    We've since ditched that particular server, but it was hosted over at LayeredTech at the time. I'll pass the BBU question on to them to see if they have a record of it. My guess would be that they're not equipping BBUs by default.
