I need some advice. I've been working with shared hosting for the past few years.
I have a client who I have built a site for (PHP/MySQL) who has big plans for the site. It is a special site that will be handling mp3 files (accounts will be able to be created where a user can upload a file and then the files will also be able to be downloaded - user created mp3s - completely legal). The client has some large growth plans for the site and I think it would be wise to start off on a dedicated server.
I am planning to use Amazon's S3 service to host the media files, but I am still going to need a solid server to host the site, database, mailing system, etc.
I've been doing a lot of reading from these forums to try to get an idea of good companies as well as what exactly I should be planning to get. I thought it would probably be best to ask for some advice.
I will be the one the client will look to to make sure that their server is running properly. I am a real quick learner, but I want to make sure that I am fully meeting my client's needs.
The first decision I have been trying to make is whether or not I need managed servers. I am leaning towards wanting to go towards a managed server solution, but at the same time, I have a feeling I will be learning pretty quick and really not need the managed solution that much once I am set up and running (my biggest concern is getting everything up and running properly - PHP, MySQL, mail, daily backups, etc. - all the basics you get out of the box with a shared host). I'm afraid if I go with a managed provider like server4sale, I will end up stuck going through them instead of ultimately dealing with the datacenter that houses my server.
A second solution I am considering is going with SoftLayer and utilizing their $3 support tickets to get everything set up and if I get the feeling that I am not going to be able to handle the system, I could find a third party to manage (rack911, platinumservermanagement, etc).
After that I will need to make some decisions about OS, processor, hard drive, bandwidth, control panel, and upgrade options.
As for those decisions, here are my initial thoughts. For OS, I have a dreamhost account and I understand they use Debian so I have some experience there - although if I go with SL, it seems they don't offer Debian but CentOS, RH, and some others. I have a feeling I can learn any of them pretty quick, just not sure what to choose.
Processor - this one I really don't know. I realize that there are different ones that will really depend on what you need the server for. I'm curious as to how much of a noticeable difference there is between the lower processors and the upper ones (celeron, opteron, pentium iv, dual core, etc.).
Harddrive I don't think is a major concern since I will mainly be using Amazon S3.
Bandwidth, I'm not sure. I'm thinking about unmetered (users will be uploading mp3 files, they will be processed on the server, then sent to S3 - all downloading will be from S3).
Control panel - any thoughts - I understand no CP is best, but I think starting off with a CP I would feel more comfortable.
I guess really I am just looking for some feedback. Any thoughts?
Well, you certainly have done your research. I would say your best bet would be to go with a datacenter that ALSO provides managed services; that way you are not working with two different companies. As far as bandwidth, if you are serving the downloads from Amazon S3 then you won't really need a crazy amount of bandwidth. It will be much cheaper to go with a metered server than an unmetered one. Debian is not bad, but there is a lot of information on the net related to RH and CentOS.
Thanks for the response. As for the bandwidth, I agree that I won't need it for users downloading, but the plan is for uploading to first be handled by the server. When a user uploads an MP3, I will be using Sound Exchange (sox.sourceforge.net/) and LAME (lame.sourceforge.net/) to automatically transform the files to the proper format (We are aware of the MP3 licensing fees).
With this in mind, I'm thinking that I will need a decent amount of bandwidth to handle uploading and then downloading onto the S3 servers.
I'm also thinking that one of the biggest processing concerns will be processing MP3s when they are uploaded. For instance, all uploads will be reencoded to 192 Kbps.
So....any more thoughts about uploading bandwidth concerns and then offloading to the S3 servers?
Also...does anybody have any recommendations for any datacenters that also offer management? I'm currently not aware of any (other than Rackspace - not really interested in their prices).
And...I'm leaning a little towards CentOS right now.
You are going to need a beast of a machine if you are going to be encoding MP3s. I would highly recommend you use some type of queueing process and not do the encoding real time as soon as they upload the file. You may even be better off having a 2nd server just doing the encoding so that it doesn't cause your site to crawl on its knees.
I had a feeling that might be the case. I figured I would set up a queueing system if the system appeared to be slow. I'm kind of liking the idea of a 2nd server to simply handle encoding. Any thoughts on what would be needed for that kind of machine?
Also, since I am coming from a shared server environment...what are the logistics of working with two servers? For instance, when you upload in PHP, the file is available as a tmp file until you save it to a location. I'm assuming you would simply have the move_uploaded_file() command and point it to the second server.
I realize that I'm throwing a lot of questions out here. If anybody has any thoughts on any of my questions, I appreciate it.
Well, you would need to authenticate to the other machine somehow. You could set up an FTP server on the 2nd machine and then FTP the uploaded file to the 2nd machine from the 1st. Then you would have a script on that box to detect the newly uploaded file (you could use cron to make it check automatically every so often) and start encoding it.
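To make the "script on that box" idea a bit more concrete, here is a minimal sketch of what cron might run on the encoding machine. It is only an illustration: the directory layout and the function names are made up, and the LAME flags (`--mp3input`, `-b 192`) are an assumption about how the re-encode would be invoked, not something confirmed in this thread.

```python
import os
import subprocess


def encode_command(src, dst, kbps=192):
    """Build a LAME command line: read MP3 input, write constant-bitrate output.

    The flag choice is an assumption for this sketch, not the poster's setup.
    """
    return ["lame", "--mp3input", "-b", str(kbps), src, dst]


def encode_new_files(incoming_dir, done_dir, run=subprocess.run):
    """Encode every .mp3 in incoming_dir that has no counterpart in done_dir."""
    encoded = []
    for name in sorted(os.listdir(incoming_dir)):
        if not name.lower().endswith(".mp3"):
            continue  # ignore anything that is not an MP3
        dst = os.path.join(done_dir, name)
        if os.path.exists(dst):
            continue  # already handled on an earlier cron run
        src = os.path.join(incoming_dir, name)
        # check=True makes a failed encode raise instead of passing silently
        run(encode_command(src, dst), check=True)
        encoded.append(name)
    return encoded
```

One thing to watch with this approach: cron could fire while an FTP transfer is still in progress, so it is probably safer to upload under a temporary name and rename the file to `.mp3` once the transfer has finished.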
I would recommend either a Dual Xeon or Opteron for the 2nd machine. Processing power is going to be most important.
Makes perfect sense...you laid it out real nicely. When I saw your response, I started thinking about how to check for the file. I figured using a combination of cron and the script itself to check if there were any other files uploaded to the server.
Brings up another question...how does the FTPing affect bandwidth? I know that SL offers unlimited private bandwidth (I'm assuming this falls under private bandwidth if you have both servers located with the same company) - is that pretty standard or am I going to have to be looking for bandwidth for uploading - ftping to processing server - then sending to S3? In other words, will I need to count on that file taking up bandwidth three times or just twice?
There are a bunch of different ways you can handle encoding your files. One reasonable architecture would be to upload files to one server, then have a cronjob send them over to the other server for processing. When the second server is finished it sends it back. This should all be asynchronous. I'd probably do some REST-type service. For example, you've got a web application running on the machine that does the encoding. And of course you have your regular upload-application on the webserver. The process would go something like this:
1. User uploads file to webserver
2. Webserver sticks it in a queue directory
3. cronjob takes each MP3 and sends an HTTP POST to http://encodingserver.mydomain.com/song with params kbps=192 and the mp3 file
4. encoding server just returns a 200 and sticks the file somewhere
5. Process the file in real time, or use a scheduler of some sort
6. When a file is ready, send it back to the webserver using a similar method.
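Steps 2 and 3 could be sketched roughly like this, as a script that cron runs on the webserver. Treat it as an illustration under assumptions, not a finished implementation: sending the file as the raw POST body with `kbps` and `id` in the query string is just one possible wire format, naming each queued file after an identifier the web app already knows is a made-up convention, and the function names are hypothetical.

```python
import os
import urllib.parse
import urllib.request

# URL taken from step 3; the query-string/raw-body format below is assumed.
ENCODER_URL = "http://encodingserver.mydomain.com/song"


def pending_files(queue_dir):
    """Return the queued .mp3 paths, oldest first."""
    paths = [
        os.path.join(queue_dir, name)
        for name in os.listdir(queue_dir)
        if name.lower().endswith(".mp3")
    ]
    return sorted(paths, key=os.path.getmtime)


def build_request(path, kbps=192):
    """Build the POST for one file; the file name (minus .mp3) is used as the id."""
    song_id = os.path.splitext(os.path.basename(path))[0]
    query = urllib.parse.urlencode({"kbps": kbps, "id": song_id})
    with open(path, "rb") as fh:
        body = fh.read()
    return urllib.request.Request(
        ENCODER_URL + "?" + query,
        data=body,
        headers={"Content-Type": "audio/mpeg"},
        method="POST",
    )


def drain_queue(queue_dir, send=urllib.request.urlopen):
    """POST each queued file and remove it locally only after a 200 reply."""
    sent = []
    for path in pending_files(queue_dir):
        response = send(build_request(path))
        if response.status == 200:
            os.remove(path)
            sent.append(path)
    return sent
```

Deleting the local copy only after the 200 comes back means a failed POST leaves the file in the queue for the next cron run, so nothing is lost if the encoding server happens to be down.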
How you implement 4-6 (where to store file and how/when to process it) is really up to you. When the user uploads the file you'll want to have some metadata such as the user id which you can pass along to the encoding server, so that when the web server gets the file back it knows which one it is. Best bet is probably just to create a db entry for the MP3, then send that exact id along to the encoding server. In fact, once you do that, instead of having the encoding server push it back to the web server, you could just poll it. For example:
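One way that polling could look, as a rough sketch only. Everything specific here is an assumption made for illustration: the `/song/<id>` status path, the JSON reply shape `{"status": "done"}`, the retry numbers, and the function names are not from the thread.

```python
import json
import time
import urllib.request

# Hostname from step 3; the status path appended to it is hypothetical.
ENCODER_URL = "http://encodingserver.mydomain.com/song"


def fetch_status(song_id, opener=urllib.request.urlopen):
    """Ask the encoder about one song; assumes a JSON reply like {"status": "done"}."""
    with opener(f"{ENCODER_URL}/{song_id}") as response:
        return json.loads(response.read().decode("utf-8"))["status"]


def poll_until_done(song_id, fetch=fetch_status, interval=30, max_tries=20,
                    sleep=time.sleep):
    """Return True once the encoder reports "done", or False after max_tries."""
    for attempt in range(max_tries):
        if fetch(song_id) == "done":
            return True
        if attempt < max_tries - 1:
            sleep(interval)  # wait before asking again
    return False
```

The `song_id` here would be the id of the db entry created at upload time, so the webserver always knows which record to mark as ready when the poll succeeds.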
This might sound somewhat complicated, but it's really a simple architecture. A very nice benefit of it is that you can put the encoding stuff on your main server at first. Then when you find you need a second server for encoding, you just install the app to the encoding server. In your webserver just set it to look for encoding.mydomain.com and then you can update your DNS when you make the switch.
Thanks pergesu...makes sense. I think I've got the logistics covered now.
If anybody has any more thoughts on the server setups I am going to need...I would really appreciate it. Company recommendations would be really nice if you have any ideas. I'm still trying to find a good datacenter that provides managed servers.