
View Full Version : Large website / Database question
ThreeD 05-23-2005, 02:46 PM Ok, so I'm on the verge of making a huge website, that's the plan anyway, and I need it to run off a HUGE database. What would you guys suggest for the database: MS SQL or MySQL? And for running the website: Apache or IIS? I've pretty much made up my mind about PHP, so I guess the initial aim would be Apache and MySQL, but I'm open to suggestions..
I've made several websites with php, apache and mysql, and it worked fine. I'm just worried about websites with a lot of traffic and a lot of search functions, if that will overload Apache and MySQL.
I'm kind of curious as to what the big websites out there are using? eBay? Amazon? Nba.com? Cnn.com? Or sitepoint for that matter, what database do you guys use? And what kind of server? What do you guys think?
If any of you have experience with huge websites, please share your experience. I want to make sure to get it right from the get-go..
thanks in advance..
ActivI 05-23-2005, 04:02 PM Hello DreamLevel.
If you are going to make a huge website, I'd say for one that you will want some scalability in your project, and to build it in a way that is both scalable and easy to maintain. For that I wouldn't suggest PHP; I'd rather use .Net for the task.
As for the database, if your website requires some advanced database functionality and isn't based on simple queries (simple selects to fetch a list of news or a forum), I'd go with SQL Server or Firebird. Both have way more scalability than MySQL and will provide you a lot more "database toys" than MySQL.
That's just my opinion, hope it helps.
Best regards.
ThreeD 05-23-2005, 04:14 PM Thanks for the informative reply ActivI :) I'll mostly be using the database to store data, huge amounts of data that the visitors can search through and get it listed as needed. I will also be using it for news and other articles. Is that a reason enough to go for MS SQL Server?
So if I decide to go for .NET, I would have tons of new things to learn since I'm kinda new to the entire .NET framework :( I know it's not something that can be taught overnight, but how hard would it be for me to get this website up and running with my novice .NET skills? THAT is the big question hehe..
Thanks again for the quick and informative reply..
Criminal#58369 05-23-2005, 04:15 PM I say apache/php/mysql all the way, stay away from microsoft, go linux, more secure and php has more features/compatibility on linux :D... it's just better :D
ActivI 05-23-2005, 05:11 PM Despite the lack of constructiveness in your post I'll keep considering this can be a decent thread.
DreamLevel, both PHP+MySql (or other) and .Net + MSSQL (or other, even MySql) are able to accomplish this task for you. You need to know your resources, especially when it comes to hosting.
If you're in a host that offers both Windows (with MS SQL) and Linux at comparable rates I'd go with MS SQL for sure even if it was the simplest queries on earth. And I would do that for a couple of reasons:
- I love stored procedures and other features that MySql (at least in its current version) still lacks;
- If I ever needed to scale my project into something more complex I wouldn't need to face a database migration that can be a huge headache...
About the learning curve for the .Net language it's all a question of will to learn. It's indeed easier to know .Net and learn PHP than the other way around but it's never too late nor too hard to learn a proper object-oriented language that will be a big plus for your future, mostly in mid-large scale projects.
I was using ASP 3.0 and PHP4 when .Net came out and I had to get used to the Object Oriented style of the .Net platform. After that I find that my development time is considerably lower than before and that my apps are way more scalable and easier to maintain.
Another big plus became very noticeable when I needed to learn Java: .Net saved my day because it was very similar (although nothing replaces Visual Studio .Net ;) )
Hope this helps, again.
Best regards.
It all depends..Presumably this website would need to run off a dedicated server..So if you use Linux you would have to administer it yourself or pay somebody to keep it up-to-date.
With Windows Server 2003 you probably have a better chance at keeping the server secure (run the updates, perform some service lockdowns etc.) and get .NET working without too much hassle.
The combination I would try is ASP.NET with C# and, to cut down on the expense, mySQL for the back-end..If you can afford it, use MS-SQL though.
ActivI 05-23-2005, 06:01 PM Hello again.
I'd always start on shared hosting, with a good shared host. That would minimize the overall investment and would allow you to benefit from the edge technologies, such as MS SQL, at a very good price. Having a dedicated server at the beginning of a project is usually not ideal. But I'm speaking about a project I don't know the specs of, so a dedicated server might still be highly recommended...
Best regards.
sasha 05-23-2005, 06:05 PM I doubt you will get your answer here. Not many people are familiar enough with more than one platform to give it to you. Well written PHP can beat badly written ASP any day, and the other way around.
My personal choice is that I do not want to do anything with Microsoft or any other closed source software so my only choice for huge database driven site would be apache/PHP/PostgreSQL. For not so huge site MySQL would do just fine.
ActivI 05-23-2005, 06:13 PM Originally posted by sasha
I doubt you will get your answer here. Not many people are familiar enough with more than one platform to give it to you. Well written PHP can beat badly written ASP any day, and the other way around.
My personal choice is that I do not want to do anything with Microsoft or any other closed source software so my only choice for huge database driven site would be apache/PHP/PostgreSQL. For not so huge site MySQL would do just fine.
It's like everything. Even though I like .Net more than PHP, even now with PHP5, I still do things in PHP if it is the best for the project.
This thread is a bit like firing in the dark: we have no specs about the project to determine why something would actually be better. We only have the info that it's supposed to be a big site with a lot of information stored in a database. We also know that there is a certain need for the technologies used to be somewhat scalable for the future of the project. This is nearly nothing, or nothing at all ;)
Best regards.
mfonda 05-23-2005, 11:23 PM <off topic>
Whoever said PHP is bad for large scale applications?
Some people seem to think that PHP is only good for small little scripts, and .NET is a far superior choice for large applications. However, this is very far from the truth. Many many large, huge even, applications are powered by PHP. Yahoo! for example is one, many others as well. PHP is excellent for use in large and enterprise applications.
</off topic>
So back on topic, you are right in choosing PHP; you will find it fits your needs perfectly. Apache is also a good choice, it is arguably the best web server there is. However, MySQL is not the best choice for a site with a huge database. By nature, MySQL does not do so well with large databases. You would be much better off using something like PostgreSQL or Oracle. PostgreSQL is an excellent choice IMO; it handles large databases very effectively.
Actually, I do not think Yahoo! is "powered by" PHP for the heavy-duty front-end widgets but if you can provide proof on the contrary then I will stand corrected. In any case, if they do use it (which is most likely for their back-end processing, CMS or such) it is heavily "scaled up" and modified, and amongst their other technologies that they heavily invested into.
As far as .NET goes..It is suited for bigger projects a lot better. The templating technology makes it very easy to use in large groups (reusing controls etc.) and makes working with designers a very simple task as opposed to PHP-solutions such as Smarty which have a lot of overhead and usually require a lot of time/training for designers to mess around with.
Scalability wise..without a PHP accelerator, PHP doesn't have too much to say on that.
Don't get me wrong, PHP can be used for large projects..It just requires more discipline, and some work arounds in terms of templating.
In the end..calculate the costs of going each route, the server administration costs, upkeep, maintenance etc. And come out with a sensible idea of which path to take.
prices123 05-24-2005, 01:41 AM If you expect your app to be complex and handle huge loads, my opinion is that C++ or Java should be used instead of PHP. I did not mention .NET mainly because of cost. Apache and mySQL should be good enough. Add a middle tier (either Java or C++) to handle caching and complex logic to boost performance.
From doing a quick check, I believe that eBay is using IIS with IBM WebSphere App Server (Enterprise Java) and Amazon is using Apache with cgi (possibly C or C++).
ActivI 05-24-2005, 02:37 AM Hello again.
I'm not going into any PHP vs .Net discussion, it's just a hell of a way to lose time to end up without any overall winner.
I will just say that out of the box .Net is better suited for mid-large projects, and the cost of the Windows licence is easily paid off by the development / tweaking time saved against a PHP approach.
As for the database, it all depends on the resources you have. Once again, if you're going for a dedicated box where MS SQL would cost a fortune and MySql would definitely be a bad choice, I would highly recommend Firebird, which is way better than PSql. This option alone will reduce your database software cost to 0, nil.
PS: Please note that these would be my options given the information about this project. It's not meant to cause any discussion in a technology head-to-head.
Best regards.
I don't see how the cost of .NET can really be any higher than any of the solutions you mentioned above..At the very most it probably wouldn't be more than $1,000 to get the license for Windows Server 2003, and .NET comes free with that. That is, if he is doing it in-house..If he gets a reliable datacenter and pays out around $300 a month for solid performance, the Windows Server 2003 license wouldn't cost an extra dime..
And there are very few reasons to host servers in-house nowadays..
ThreeD 05-24-2005, 02:46 AM thanks for the feedback and input guys.
About hosting - I'm currently using a reseller account to host several of my websites. I could always switch over to a dedicated server, but as activI stated, I won't worry about that until the website has accumulated huge amounts of traffic and needs more bw/space. My current account holds 5 gig of space and 50 gigs of BW.
To put it in more plain words, the website will be a "reference website" where people can search through hundreds of articles, search for words and descriptions, search for data in huge tables and so forth. In other words, a lot of search strings and a lot of data will be sent back and forth between the database and the end user.
When it comes to site traffic, I hope the site will generate about 10000 hits a day after a year or two. Still haven't figured out if that is a realistic goal, but I've worked out some pretty good SEO techniques and general promo ideas, so 10,000 is probably a pretty good estimate.
I guess I could also mention that I'm going to run a forum on the site, which eventually will have thousands of members. For that matter I think the forum of choice is vBulletin. I've tested it a couple of times and it seems like the most stable board around as well as the most customizable one.
In the end I should tell you guys that I'm more of a designer than a hardcore programmer. I can get back into programming, but I haven't used C/C++, VB, Java since I was in school about 4 years ago. The last couple of years I've focused more on php,SQL and Flash AS. That's why I'm looking at getting started with .NET as a huge task that will delay the launch of my site. Then again I gather that .NET really is worth looking into since the end result of my site may be so much more efficient and well structured.
I'll be thinking about this for a couple of days and make my final decision, it's already given me a couple of sleepless nights, and it will continue to do so until I make up my mind :D
thanks again for the feedback guys..feel free to come with more input!
I suggest going with Apache/PHP/mySQL then to start off..In the meantime, while you are building that website, head on over to http://asp.net and check out the Visual Express builder..Play around with it and see how you like it.. That will get your feet wet in the technology enough to know if it warrants switching to something more scalable when you need to jump to a dedicated server :)
Burhan 05-24-2005, 05:00 AM First -- define HUGE. Huge means different things to different people.
MySQL 3.22 had a 4GB (4 gigabyte) limit on table size. With the MyISAM storage engine in MySQL 3.23, the maximum table size was increased to 8 million terabytes (2 ^ 63 bytes). With this larger allowed table size, the maximum effective table size for MySQL databases is usually determined by operating system constraints on file sizes, not by MySQL internal limits.
The InnoDB storage engine maintains InnoDB tables within a tablespace that can be created from several files. This allows a table to exceed the maximum individual file size. The tablespace can include raw disk partitions, which allows extremely large tables. The maximum tablespace size is 64TB.
From the MySQL Manual (http://dev.mysql.com/doc/mysql/en/table-size.html).
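As a quick sanity check on the figures quoted from the manual, the arithmetic can be sketched in a few lines of Python. This is only a back-of-the-envelope conversion, assuming "terabyte" is meant in the binary sense (2^40 bytes); the function and variable names are made up for illustration.

```python
# Back-of-the-envelope check of the table-size limits quoted above.
# Assumption: "terabyte" and "gigabyte" are binary units (2^40 and 2^30 bytes).
GIB = 2 ** 30
TIB = 2 ** 40


def bytes_to_terabytes(n_bytes: int) -> float:
    """Convert a byte count to binary terabytes."""
    return n_bytes / TIB


# MyISAM limit quoted for MySQL 3.23: 2^63 bytes.
myisam_limit_tb = bytes_to_terabytes(2 ** 63)

# Older MySQL 3.22 limit quoted above: 4 GB per table.
old_limit_bytes = 4 * GIB

# How many times larger the 3.23 limit is than the 3.22 limit.
growth_factor = (2 ** 63) // old_limit_bytes

print(f"2^63 bytes = {myisam_limit_tb:,.0f} TB")  # prints 8,388,608 TB, i.e. about 8 million
print(f"growth over the 4 GB limit: 2^{growth_factor.bit_length() - 1}x")
```

So the "8 million terabytes" figure is simply 2^63 / 2^40 = 2^23 terabytes, and in practice the file-size limit of the operating system is reached long before that.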
There is no reason why you shouldn't consider PHP/MySQL as a viable alternative. I don't think you quite understand the scope of your project (mainly because you use words such as "huge"). Please be more specific to get better responses.
ThreeD 05-24-2005, 05:49 AM Originally posted by fyrestrtr
First -- define HUGE. Huge means different things to different people.
HUGE means several hundred or several thousand articles, each several pages long with images and animations (maybe even movie clips). In addition to that, this reference website will have multiple search functions that let people search tables for certain data and information. These information tables will each hold thousands of variables and chars. The search results displayed to the end user will be in tables as well.
maxymizer 05-24-2005, 07:20 AM MySQL is perfectly capable of providing solid performance for such a website.
It all depends on how you develop the website's software.
I see no performance-wise reason for choosing .Net (though some people really do develop faster using .Net).
A decent developer can develop solid, modular and fast application using any platform, be it .Net + Win or LAMP.
Personally, I'd go for open-source solution (Linux, Apache, MySQL, PHP) but I'd get a mid-end dedicated server to host that website.
The reason why I'd go for the mentioned solution is that I find myself developing faster in a LAMP environment than in .Net (I kindly ask people that read this not to "attack" me because of this statement, it's not an issue in this topic).
Performance-wise - MySQL is NOT that bad. It's good for websites such as yours but it would be really bad for something like google's database.
And if you don't require features such as views, stored procedures, triggers - you have no actual reason NOT to use MySQL (version 4.0.x or 4.1).
On top of that, you have some prior experience with MySQL, so it should be no problem for you to pick up some new knowledge on indexes, speeding up queries, using proper columns and some MySQL-specific SQL add-ons.
Having in mind that MySQL is in fifth version of development, you can also transfer your system to that newly developed db without any problem (after it's released as stable version).
People tend to blame RDBMS for some problems, but in most of the cases it's the developer who should be blamed for lack of knowledge (thus poor results).
I say - go for what you've used before, but do some reading to get better performance results. It will be the fastest solution at this point and probably the cheapest.
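To illustrate the point about indexes and speeding up queries, here is a minimal sketch in Python. It uses the built-in sqlite3 module purely as a stand-in so the example runs anywhere; it is not MySQL, and MySQL's own EXPLAIN output looks different. The table name `articles`, the index name `idx_articles_title` and the sample rows are all hypothetical.

```python
import sqlite3


def query_plan(conn, sql, params=()):
    """Return SQLite's query plan for a statement as one readable string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    # The last column of each row holds the human-readable plan detail.
    return " | ".join(row[-1] for row in rows)


# In-memory database standing in for the real server (assumption: SQLite, not MySQL).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT, body TEXT)")
conn.executemany(
    "INSERT INTO articles (title, body) VALUES (?, ?)",
    [(f"Article {i}", "placeholder text") for i in range(10_000)],
)

lookup = "SELECT id, title FROM articles WHERE title = ?"

# Without an index the database has to scan every row of the table.
plan_before = query_plan(conn, lookup, ("Article 42",))

# After adding an index on the searched column it can jump straight to the match.
conn.execute("CREATE INDEX idx_articles_title ON articles (title)")
plan_after = query_plan(conn, lookup, ("Article 42",))

print("before:", plan_before)
print("after: ", plan_after)
```

The exact wording of the plan depends on the SQLite version, but the first plan reports a full scan and the second one names the index. The same habit applies to MySQL: run EXPLAIN on the slow search queries and check that the columns in the WHERE clause are indexed.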
sasha 05-24-2005, 07:58 AM Originally posted by Emil
As far as .NET goes..It is suited for bigger projects a lot better. The templating technology makes it very easy to use in large groups (reusing controls etc.) and makes working with designers a very simple task as opposed to PHP-solutions such as Smarty which have a lot of overhead and usually require a lot of time/training for designers to mess around with.
Scalability wise..without a PHP accelerator, PHP doesn't have too much to say on that.
Don't get me wrong, PHP can be used for large projects..It just requires more discipline, and some work arounds in terms of templating.
....
Off topic.
About PHP templating. There is no such thing built into PHP, and that is good. PHP is open ended and it allows you to make your own choices, including how templating should work. The Smarty engine is just one of the options (and not too much to my liking). My templating engine requires no knowledge of PHP at all, and I have cooperated with designers in the past who needed nothing more than graphic skills and some very basic HTML knowledge to work with it. Just yesterday I used that same templating engine to "spit" out a website and at the same time an XML configuration file for a home-made Flash gallery. Regarding "reusing controls", well, I reuse way more than that. The moment I type in "makeproj projectName", there is already a website out there that has a working templating engine, CMS interface, user administration, shopping cart and more, and it can be completed by a designer without much PHP knowledge.
prices123 05-24-2005, 05:52 PM If you go with MS platforms, you'll pay a lot for the Development toolkits, OS, and possibly other software not already included in the OS bundle. SQL-Server, for example, will be a big expense if you stick with all MS. This gets multiplied if you scale to multiple server setups. With open source platforms, you just need to pay for the hardware. The time to think about scalability is at the design phase so you can architect it in a way that you would just add more bandwidth and hardware to your setup to keep your response times within specifications.
Lets be serious now..Open source doesn't mean you just pay for the hardware unless you are a one-man operation that can handle programming, database design, and server administration with all of its intricate details during the day, in which case it does mean hardware only.
However, not many people are in that position, which means hiring a low-end server administrator to keep up with patching, security alerts, and program updates would cost at least $100 a month for cheap work done remotely, and don't count on getting more than a couple of hours of work or requests.
Suddenly the price difference is not so wide..;)
@sasha
I've worked and work with PHP5 (especially), and I do not see how the lack of a feature becomes an asset. I really don't. Sure, it provides flexibility in being able to choose a different templating mechanism or fine-grained features..The issue is that a templating solution will be slow and inefficient as it essentially becomes another "layer" written in PHP to handle the templating.
The only other option is doing PHP5 Objects that get serialized to XML, and then using XSLT to transform it into something usable..Such a method has its advantages, but still requires a lot of work-arounds.
The faults with ASP.NET are heavy, heavy JavaScript usage and the VIEWSTATE, which can get pretty large, although the size has been cut down in ASP.NET 2.0 and it was made more flexible.
Googled 05-26-2005, 01:18 AM Really depends on how big your site will be OR your database.
For a strong and cheap solution here we go:
- Apache 2.0 or Zeus (high performance)
- PHP is just fine if you don't need anything MS (stay away!)
- PostgreSQL because it's a really stronger/faster database, for BIG data, than MySQL and for the same price (free :P)
Sincerely
TimSG 05-26-2005, 01:26 PM Originally posted by DreamLevel
HUGE means several hundred or several thousand articles, each several pages long with images and animations (maybe even movie clips). In addition to that, this reference website will have multiple search functions that let people search tables for certain data and information. These information tables will each hold thousands of variables and chars. The search results displayed to the end user will be in tables as well.
No offense DreamLevel, but that's not huge. :) Huge is the realm of hundreds of Gigabytes or more, even tens of Terabytes, and 10,000 unique visitors per hour. :) Basically, .NET + mssql can handle your load, and so can PHP + Apache + mysql. Both will be able to scale well.
So in answer to your question, which area are YOU more comfortable in? .NET + Mssql? Or Apache, PHP, Mysql? (Note: i'm not talking OS here... as you CAN run apache + php + mysql on a windows box.)
Also, I seem to remember about a year maybe year and half ago, either Google or Yahoo made a presentation at a PHP conference regarding their usage of PHP + Mysql + Apache.
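To put the two traffic figures from this thread side by side (the 10,000 hits a day goal and the 10,000 unique visitors per hour threshold for "huge"), here is a rough sketch of the arithmetic in Python. It only computes averages; real traffic is bursty, and a visitor usually generates several hits, so the true gap is larger than the plain ratio shown here. The names are made up for illustration.

```python
# Rough average request rates for the two traffic figures mentioned in the thread.
SECONDS_PER_HOUR = 60 * 60
SECONDS_PER_DAY = 24 * SECONDS_PER_HOUR


def average_rate_per_second(count: int, period_seconds: int) -> float:
    """Average events per second if `count` events are spread evenly over the period."""
    return count / period_seconds


# Goal stated for the site: about 10,000 hits per day.
site_goal = average_rate_per_second(10_000, SECONDS_PER_DAY)

# "Huge" as described above: 10,000 unique visitors per hour.
huge_site = average_rate_per_second(10_000, SECONDS_PER_HOUR)

print(f"site goal: {site_goal:.2f} per second")   # prints 0.12 per second
print(f"huge site: {huge_site:.2f} per second")   # prints 2.78 per second
print(f"ratio:     {huge_site / site_goal:.0f}x")  # prints 24x
```

On average the planned site would see roughly one hit every eight or nine seconds, which is why either stack mentioned in the thread should cope as long as the queries are reasonably written.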
nnormal 05-27-2005, 10:53 AM Don't get me wrong, PHP can be used for large projects..It just requires more discipline.
I think this hits the nail on the head. PHP allows you to write really sloppy and disorganized code; however, it also allows you to write reusable OO code with a coherent and easily extendable architecture. If you are going to use PHP for a large enterprise-level project, take some time, write some UML or other design docs, and don't just jump in and start coding. A "proper" OO language (Java, .NET), on the other hand, strongarms you into writing in this fashion, which could be good or bad depending on opinion.
On a side note. SQLserver > mySQL for one major reason IMO : stored procedures.
error404 05-27-2005, 07:18 PM Another vote for PostgreSQL. While there are quite a few large projects out there running MySQL (Wikipedia comes to mind), there are a lot of things you simply cannot do in MySQL that can really affect performance quite a lot. Until recently there weren't any stored procedures in MySQL; while Postgres has supported multiple SP languages for a long time now. Ditto for subqueries which can really improve performance in some cases. That's not even mentioning the design-related improvements like proper constraints and triggers, referential integrity, custom data types, views and all sorts of other fun stuff. Quite simply it's a better database than MySQL, which until recently ignored enough relational database ideas that it wouldn't really be right to call it a 'relational' database. For a project of any sort of size, I wouldn't touch MySQL. Go with a real database like Postgres, Firebird, or if you feel like paying, Oracle or MSSQL.
As for the frontend language, it's really not all that relevant. Code in what you know best. You're most likely to get a working application in a minimal amount of time, and less likely to have introduced security holes or bugs because you didn't really know what you were doing. That said, I'd probably go with PHP or Java, preferring PHP due to its rapid development.
shellprompt 05-31-2005, 12:44 PM Personally, from the sound of your project, I'd go with Oracle HTMLDB...not as widely used as most of the other technologies mentioned in this thread, but more than up to the job. You get the massive scalability of a backend Oracle database, combined with an easy to use web-based GUI to design your project with.
mikaelhg 05-31-2005, 01:38 PM Originally posted by DreamLevel
To put it in more plain words, the website will be a "reference website" where people can search through hundreds of articles, search for words and descriptions, search for data in huge tables and so forth. In other words, a lot of search strings and a lot of data will be sent back and forth between the database and the end user.
Three things you want to get right from day one:
1. use a scalable search engine - I recommend Apache Lucene [1]
2. cache your pages and query answers efficiently
3. measure your resources and understand where they go
[1] http://lucene.apache.org/
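Points 2 and 3 can be sketched together in a few lines. The following is a minimal, hypothetical example in Python of caching query answers with a time-to-live and counting hits and misses so you can see where the work goes. `TTLCache` and `slow_search` are names invented for this sketch, not part of any library, and a production site would more likely use a dedicated cache in front of the database.

```python
import time


class TTLCache:
    """Tiny in-process cache: each answer is kept for `ttl_seconds`, then recomputed."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl = ttl_seconds
        self.clock = clock      # injectable clock, handy for testing expiry
        self._store = {}        # key -> (expiry_time, value)
        self.hits = 0           # answers served from the cache
        self.misses = 0         # answers that had to be computed

    def get_or_compute(self, key, compute):
        now = self.clock()
        entry = self._store.get(key)
        if entry is not None and entry[0] > now:
            self.hits += 1
            return entry[1]
        # Missing or expired: do the expensive work and remember the answer.
        self.misses += 1
        value = compute()
        self._store[key] = (now + self.ttl, value)
        return value


def slow_search(term):
    """Hypothetical stand-in for an expensive database search."""
    time.sleep(0.01)
    return [f"result for {term}"]


cache = TTLCache(ttl_seconds=60)
first = cache.get_or_compute("mysql", lambda: slow_search("mysql"))   # computed
second = cache.get_or_compute("mysql", lambda: slow_search("mysql"))  # served from cache

print(first == second)            # prints True
print(cache.hits, cache.misses)   # prints 1 1
```

The hit and miss counters are the "measure" part: if the miss count stays close to the request count, the cache keys or the time-to-live are wrong for the workload.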
I would recommend against "magical" solutions which might feel advantageous in the short term and to junior developers, but in practise backfire when things go wrong and it's impossible to penetrate the "magic" to find out what you must do to get your application back up again. This is why I'd rather go the Java route than take the One Microsoft Way, which otherwise offers, in the short term, almost the same advantages as Java plus many more salespeople.
ThreeD 06-01-2005, 01:45 AM a big thanks to all of you for the great feedback :) there are definitely a lot of things to consider, so many options, so many possibilities. I hope I can make a final decision before I'm old and grey hehe.
Right now I'm fiddling around a little more with the general layout, while testing different solutions. I'll keep you guys posted on what works best for me. Thanks again for the great feedback..
I would recommend against "magical" solutions which might feel advantageous in the short term and to junior developers, but in practise backfire when things go wrong and it's impossible to penetrate the "magic" to find out what you must do to get your application back up again. This is why I'd rather go the Java route than take the One Microsoft Way, which otherwise offers, in the short term, almost the same advantages as Java plus many more salespeople.
While your post is cunningly written [1], especially the Lucene "footnote", as well as the subtle and not so subtle insults regarding junior developers, you show absolutely nothing in terms of tangibles on why the "One Microsoft way" offers "almost" (I do repeat, almost) the same "advantages" as Java, except of course more salesmen.
[1] Denotes heavy sarcasm. (http://dictionary.reference.com/search?q=sarcasm)
mikaelhg 06-01-2005, 10:39 AM I didn't mean to write something that could possibly be interpreted as insulting juniors - what I meant was that it is quite easy to fall into the trap of believing the salespeople pushing "magical" solutions to problems, when you haven't yet made that mistake and seen the resulting mess a few times. Everyone must start from that point at some time, so I personally wouldn't be insulted when identifying myself in that position.