I put a question mark in my title because I'm unsure if that's the correct term for my goal.
Can anyone point me in the right direction on how to use storage on multiple servers as a single cluster?
I thought a storage cluster was for that but, after much googling and even more help from here, I don't think it achieves my goal.
My goal is to have multiple servers share a file system and act as something like a network RAID, so that if node-A goes down the files are still available on other nodes, and hopefully so that when the capacity of the nodes is reached I can add nodes to expand the "cluster".
I have no idea if my terminology is correct but I'd appreciate any feedback.
For example, I read a short blog post from Backblaze named "Petabytes on a budget" where they go into great detail about their infrastructure. I understand a lot of it is proprietary, but is there anything out there that's open source and just requires config?
Gluster is not only stable, the latest version(s) are nearly idiot-proof.
I was about to go with GlusterFS myself until I came across its split-brain problem. I don't relish the idea of having to touch every single file just to get it to self-repair. I'd hardly call it idiot-proof.
Ceph is supposed to deal with it better, with active cluster monitoring, although it's still in development/testing.
If we're not restricted to POSIX, then the world expands greatly.
Cassandra, MongoDB, CouchDB, HDFS, OpenStack (Object Storage). A lot of these use quorum-based methods, which should minimize or alleviate network problems.
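To give a rough idea of what "quorum-based" means, here is a minimal Python sketch of a generic strict-majority rule. It is only an illustration, not how any of the systems above actually implements it, and the function names (has_quorum, partition_can_write) are made up for this example. The point is that when the network splits, at most one side can hold a majority, so two sides never both accept writes and diverge.

```python
def has_quorum(reachable_nodes: int, total_nodes: int) -> bool:
    """Return True if the reachable nodes form a strict majority of the cluster."""
    return reachable_nodes > total_nodes // 2


def partition_can_write(partition_sizes, total_nodes):
    """For each side of a network split, report whether it may keep accepting writes."""
    return [has_quorum(size, total_nodes) for size in partition_sizes]


# Hypothetical 5-node cluster split 3/2: only the 3-node side has a majority.
print(partition_can_write([3, 2], 5))

# Hypothetical 2-node cluster split 1/1: neither side has a majority.
print(partition_can_write([1, 1], 2))
```

The second case is why a plain two-node setup cannot settle a split by majority vote alone: one node out of two is never a strict majority.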
If we're just talking about a simple two-node cluster, then DRBD is also an option.
In any case, google
"distributed file system"
"distributed object storage"