Web Hosting Talk







View Full Version : Can 'LAMP' light your way no matter how big your site gets?


vk101
02-18-2005, 12:07 AM
Thanks for reading this post! I have some questions about LAMP in terms of scalability:

1) Can the LAMP system (Linux, Apache, MySQL, PHP) theoretically work no matter how big you get?

Sure you can make a small- to medium-sized website with LAMP, but, in theory, can you support a Google.com, Msn.com, or Dell.com with LAMP? Keep in mind the insanely ridiculous amount of traffic such sites get, the constantly-changing, fast-paced environment they operate in, and the need for secure payment processing systems, etc.

2) Do you have any examples of ENORMOUS sites using LAMP?

3) Would you say LAMP is generally as easily scalable as other systems?

4) Does LAMP remain reliable even in totally demanding environments like a Yahoo.com?

5) Are such huge sites using LAMP therefore saving a HUGE FORTUNE by paying nothing for their systems compared to other sites? Why would rational sites not all use LAMP then? Is it too good to be true? Am I missing something here? Is it really not as 'free' as I think? Discuss...

Thanks so much for reading this, I hope you can provide some insight on this, as I'm definitely interested about LAMP, its scalability/reliability for huge sites, and wondering what exactly it's all about and why everyone doesn't use it. Thanks again!

Carp
02-18-2005, 12:11 AM
Sites like Google, Yahoo, etc. run on multiple servers.

ilyash
02-18-2005, 12:18 AM
PHP is not advanced enough imho to work in a large-scale environment.

Throw Java instead, and you have a "LAMJ" lol.

That would work on a large scale.

Maybe replace MySQL with Oracle
and Linux with Unix..

"UAOJ"

There's your new LAMP.. lol

JustinH
02-18-2005, 12:44 AM
I humbly disagree :).

It just so happens... Yahoo, CMG, Motorola, IBM, Lucent Technologies... oh and Google happen to humbly disagree as well ;).

vk101
02-18-2005, 12:50 AM
Thanks for everyone's replies!

It looks like I'm seeing the same disagreement that I found while talking to people about this. Some say LAMP isn't advanced enough or reliable to work in huge environments, while others say it definitely is and there are enormous sites using nothing but this.

Does anybody have any technical reasons regarding the ability of LAMP to support or not support itself in very demanding environments?

(In addition to the specific questions in the thread starter)

Thanks so much, I appreciate it!

ilyash
02-18-2005, 01:11 AM
php has limited OOP functionality.

JustinH
02-18-2005, 01:17 AM
Yep... that's the only reason the major search engines don't use it right in their search results. But as PHP 5 grows and its OOP improves, it will easily replace the Java resource hog.

ilyash
02-18-2005, 01:22 AM
Java does not hog resources.
I would say this is a common myth.
If you run an identical query on a php or jsp, both will have comparable exec times.
However, for advanced jobs, with OOP, the job will run much faster on Java.
Java does not hog resources it does not need, so it all depends on the coding.

hiryuu
02-18-2005, 01:33 AM
Google did mention using Linux because n-thousand OS licenses becomes a non-trivial expense, but most infrastructures spend more on support personnel and development than upfront costs. Measuring real savings is more difficult than simply looking at licensing fees.

Similarly, Apache's process-per-request model is very heavy on memory and CPU, but Amazon is (claims to be) running on 1.3. It's flexible and absolutely rock-solid reliable, and often that's worth more than the added hardware to drive it. Despite the large improvements in 2.0, people have been very slow to 'upgrade.'
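To put rough numbers on the memory cost of process-per-request: each concurrent request occupies a whole worker process, so RAM caps concurrency. The sketch below is only back-of-the-envelope arithmetic in Python; the function name and every figure in it are assumptions for illustration, not measurements of Apache 1.3 or of Amazon's setup.

```python
def max_prefork_workers(total_ram_mb: int, reserved_mb: int, per_process_mb: int) -> int:
    """Upper bound on the worker processes that fit in RAM at once."""
    if per_process_mb <= 0:
        raise ValueError("per_process_mb must be positive")
    # RAM left over after the OS, database, etc. have taken their share.
    usable_mb = max(total_ram_mb - reserved_mb, 0)
    return usable_mb // per_process_mb


if __name__ == "__main__":
    # Hypothetical box: 2048 MB RAM, 512 MB reserved for everything else,
    # 20 MB per web server child with the scripting module loaded.
    print(max_prefork_workers(2048, 512, 20))  # prints 76
```

With those made-up figures the box tops out at 76 simultaneous requests, which is why "added hardware" is the usual price of this model.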

Yahoo is moving (has moved?) their core site TO PHP from their in-house code. I doubt they made that choice lightly, and I'm sure they considered java and such along the way. PHP (and other embedded languages like ASP) encourage people to merge the code and design together into a giant mess, which is how detractors often see it. The larger projects (even horrid PHP Nuke) separate the code and the output to keep both more manageable. PHP5 basically raided the Java object warchest, so there's that much less to complain about.

The weak link in the chain is probably MySQL, which is making strong progress, but still lacks the clustering, locking, and feature set of the big hitters. I think Yahoo uses it for their core site (which is basically read-only), but MySQL buckles under heavy writing. Granted, no boxed solution scales to the size of Google, but most larger sites use Oracle or DB2.
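One common way a read-mostly site stretches a single database is to send writes to one primary server and spread reads over replicas. A minimal Python sketch of that routing idea follows; the class, the server names, and the "SELECT means read" rule are all hypothetical simplifications, and nothing here is claimed to be how Yahoo actually does it.

```python
import itertools


class ReadWriteRouter:
    """Pick a database server for a statement: reads rotate over replicas,
    everything else goes to the primary."""

    def __init__(self, primary, replicas):
        self.primary = primary
        # cycle() hands out replicas in turn, forever.
        self._replicas = itertools.cycle(replicas) if replicas else None

    def route(self, sql):
        words = sql.split(None, 1)
        first_word = words[0].upper() if words else ""
        if first_word == "SELECT" and self._replicas is not None:
            return next(self._replicas)
        return self.primary


if __name__ == "__main__":
    router = ReadWriteRouter("db-primary", ["db-replica-1", "db-replica-2"])
    for statement in ("SELECT * FROM news", "UPDATE news SET hits = hits + 1"):
        print(statement, "->", router.route(statement))
```

Note what this does not fix: every write still lands on the one primary, so a write-heavy workload hits the same ceiling no matter how many replicas are added.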

fastduke
02-18-2005, 01:49 AM
In all reality, anything will come to its knees (even well-written code) without some sort of load balancing.

Big biz is just that: it can be the best stuff in the world, but that doesn't equate to cost-effective.

Emil
02-18-2005, 01:50 AM
However, for advanced jobs, with OOP, the job will run much faster on Java.

Come on now, how do you equate "advanced jobs with OOP" to running much faster? That is about as uncorrelated as you can get. PHP5 has brought in many object-oriented features (including static methods, class constants, abstract classes, and a wealth of other features including type hinting), but I doubt it had any bearing on the speed factor.

Before that, you threw in the comment that PHP has limited OO functionality. Don't throw these accusations around unless you actually have some proof to back them up. If you do, I would definitely like to hear how PHP5 has "limited" OO functionality. It is all there, except for one or two points.

JustinH
02-18-2005, 01:54 AM
Originally posted by ilyash
Java does not hog resources.
I would say this is a common myth.
If you run an identical query on a php or jsp, both will have comparable exec times.
However, for advanced jobs, with OOP, the job will run much faster on Java.
Java does not hog resources it does not need, so it all depends on the coding.

Execution times may be correct, but I'm telling you now (having been in this industry for many years) that Java by default uses more memory and processing power than PHP any day of the week.

PHP (and other embedded languages like ASP) encourage people to merge the code and design together into a giant mess, which is how detractors often see it.

I love to hate embedded code for that very reason. It does make it very useful at the presentation layer... in that I can use PHP as a templating engine on its own, rather than develop my own crappy markup language, as is the case with Perl. However, I see SO many programs that fail to even separate the business and presentation logic that it makes me go bonkers.

Hopefully software like Savant2 will make that change... hopefully.
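For anyone unsure what separating business and presentation logic looks like in practice, here is a minimal sketch. It is written in Python with the standard library's string.Template rather than PHP, it is not Savant2's API, and the function names and the order data are made up for illustration.

```python
from string import Template

# Presentation: markup only, with placeholders and no calculations.
ORDER_PAGE = Template("<p>$count items, total $total USD</p>")


def summarize_order(items):
    """Business logic: takes (name, unit_price, quantity) tuples, returns plain data."""
    total = sum(price * quantity for _name, price, quantity in items)
    return {"count": len(items), "total": f"{total:.2f}"}


def render_order(items):
    """Glue: compute the data first, then hand it to the template."""
    return ORDER_PAGE.substitute(summarize_order(items))


if __name__ == "__main__":
    print(render_order([("book", 2.5, 2), ("pen", 1.0, 1)]))
```

The template can be redesigned without touching summarize_order, and the totals can be tested without rendering any HTML.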

pnorilsk
02-18-2005, 05:14 PM
Originally posted by vk101

1) Can the LAMP system (Linux, Apache, MySQL, PHP) theoretically work no matter how big you get?

Linux and Apache - yes, MySQL and PHP - no.

2) Do you have any examples of ENORMOUS sites using LAMP?

Linux and Apache - many, MySQL and PHP - none

3) Would you say LAMP is generally as easily scalable as other systems?

Absolutely not.

4) Does LAMP remain reliable even in totally demanding environments like a Yahoo.com?

Linux and Apache - yes, MySQL and PHP - no

5) Are such huge sites using LAMP therefore saving a HUGE FORTUNE by paying nothing for their systems compared to other sites? Why would rational sites not all use LAMP then? Is it too good to be true? Am I missing something here? Is it really not as 'free' as I think? Discuss...

Yes, there are some sites with "zero-cost" software infrastructure which are built on contemporary, advanced technologies. It doesn't include PHP and/or MySQL

Peter.

sightz
02-18-2005, 05:29 PM
Google is so huge they had to write a special filesystem to deal with the minuscule error rate of IDE drives. The fact that one millionth of one billionth of the data written to a drive may get corrupted comes into play when dealing with petabytes of data.
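The arithmetic behind that claim is easy to check. A minimal sketch, assuming the quoted rate (one millionth of one billionth, i.e. 1 in 10^15) applies per bit written and that a petabyte is 10^15 bytes; both are assumptions made for illustration, not published drive specs.

```python
from fractions import Fraction

# Assumed error rate: 10^-6 * 10^-9 = 1 corrupted bit per 10^15 bits written.
BIT_ERROR_RATE = Fraction(1, 10**6) * Fraction(1, 10**9)
PETABYTE = 10**15  # bytes, decimal definition


def expected_corrupt_bits(bytes_written: int) -> Fraction:
    """Expected number of corrupted bits for a given volume of writes."""
    return bytes_written * 8 * BIT_ERROR_RATE


if __name__ == "__main__":
    print(expected_corrupt_bits(PETABYTE))  # prints 8
```

So under these assumptions every petabyte written carries about eight flipped bits on average: negligible on one desktop drive, routine at that scale.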

If they are running LAMP (or at least the L.A. of LAMP) it is so heavily modified it would be unrecognizable.

JustinH
02-18-2005, 05:29 PM
Originally posted by pnorilsk
Linux and Apache - yes, MySQL and PHP - no.


So you just ignore entire posts? Read the rest of the thread.

Linux and Apache - many, MySQL and PHP - none

See the other posts you ignored, many ENORMOUS sites use PHP and MySQL (Yahoo! being one).

Absolutely not.

MySQL and PHP NOT scalable? Are you kidding me? How about INET Interactive which runs a server cluster using... PHP and MySQL. So what do you define as scalable. MSSQL? :rolleyes:

Linux and Apache - yes, MySQL and PHP - no

See the rest I posted.

Yes, there are some sites with "zero-cost" software infrastructure which are built on contemporary, advanced technologies. It doesn't include PHP and/or MySQL

Peter.

That was arguably the most useless post I've ever read on WHT. Not only did you ignore literally every post before it, which proved EVERY one of your points wrong, you didn't even back up your statements with fact, or provide an alternative.

WHT is becoming a haven for people who pretend they know a lot more than they do.

JustinH
02-18-2005, 05:46 PM
Originally posted by sightz
Google is so huge they had to write a special filesystem to deal with the minuscule error rate of IDE drives. The fact that one millionth of one billionth of the data written to a drive may get corrupted comes into play when dealing with petabytes of data.

If they are running LAMP (or at least the L.A. of LAMP) it is so heavily modified it would be unrecognizable.

Same is true of Windows, which is why Google and Yahoo use Linux and FreeBSD respectively. They have access to the code and can alter it in a way that makes it more suitable for their processing.

No generic software application will run enterprise applications with a default install. Try setting up a default Windows Server, Java and Oracle installation and watch the server die on a high traffic site. If I recall, Mathew (Prohacker) spoke about having a specific PHP installation for WHT.

We can argue to death about PHP/MySQL scalability, but I can assure you that with the proper setup PHP/MySQL is easily as scalable as any Microsoft installation. And PHP/Oracle vs Java/Oracle would win out in overall resource usage. I've tested this time and time again.

But, as mentioned, the software itself is the biggest determining factor. I can write something in PHP with EVERY module installed, using SQLite, and it will execute MUCH faster than an enterprise ASP.NET/Oracle package that is poorly coded.

vk101
02-18-2005, 05:59 PM
Hi, thanks for the great information!

So in terms of Apache, MySQL, and PHP, they are all fairly standardized, right? It's pretty easy to just go to their respective websites, download the latest versions, and get started?

But with free OSs, that's where things get totally confusing for me! There's about a million distributions of Linux, each with a million versions, and different things like FreeBSD, a million other types of BSD, and some other things.

Can somebody please discuss how I can become less confused about this whole OS thing? What is the best in terms of scalability/reliability/functionality/customizability and everything else, in the same context of huge size as a Google.com, Msn.com, or Yahoo.com?

What are the best Linux distributions for this, and what are the other OS options out there like all the BSDs and other things...there's just SO MANY and I know NOTHING about any of them I'm going out of my mind here!

Thanks a lot for your help, I appreciate it!

JustinH
02-18-2005, 06:09 PM
Ultimately, if you are intending to use this for an enterprise site, the default PHP, Apache and MySQL installs aren't going to work. You are going to have to alter the underlying code to do the things you NEED better and the things you don't need to be removed.

To extend this, Google uses a customized Linux kernel running Google Web Server (their own server software). Yahoo runs a customized FreeBSD kernel running their own server software. MSN runs Windows Server 2003/2000 and IIS 5/6 (they run a cluster with both versions running).

I'd imagine they aren't running your typical Server 2003 install...

As for the best distributions? Most sites (Google excluded) run either a customized FreeBSD kernel or one of the Unix systems. I will reiterate one thing: none of these systems are default installs, they are ALL customized for their purpose.

vk101
02-18-2005, 06:29 PM
Oh cool! So just to confirm, all of LAMP is fully customizable right? If you had people with the ability to do it, the source code is available for all of LAMP to be modified according to customized, enterprise needs? (Whereas something like Windows/Unix, Oracle, Java would not be customizable even if you had the people to do it, right? - those programs would be ready for enterprise use straight out of the box, almost?)

My next question is this: how do smaller sites that can't afford all this fancy customization or 'non-free' programs transition from their small nature to growing HUGE if they need to accommodate that? Do they basically ditch their current programs and hire people to customize new programs then integrate everything, or what? Seems like this would be an absolutely, ridiculously complicated BEAST of a task!

For the type of medium-sized site that cannot yet customize, what free OS distributions do you recommend? Any specific names? I keep hearing all these names tossed around like Debian, RedHat, Gentoo for Linux, and some for BSD too. What are your expert suggestions?

Thanks so much for your help!

pnorilsk
02-18-2005, 07:06 PM
vk101.

Let me refer you to another posting on this subject: what it will take to build a HUGE web-centric application.
http://www.webhostingtalk.com/showthread.php?s=&threadid=370778

I hope it will help.

Peter.

JustinH
02-18-2005, 07:12 PM
Yep... keeping in mind that Linux and MySQL are GPL, while Apache and PHP use their own open source licenses (the Apache License and the PHP License). FreeBSD is far less restrictive than Linux in terms of licensing (which is why Apple uses it). However, they are all open source and available to alter.

That's exactly what they do ;). As stated, most big sites use a custom kernel, but the small ones grow slowly to that point. A great example:

WHT Started on a shared host by Matt some years ago, and grew up to its own server. After getting bigger it grew into a server cluster, with a specific PHP install (only used modules enabled etc).

At a certain point, if needed, companies that go through this growth will eventually hire a software engineer and start customizing to their needs. There is no need to START at a cluster at RackSpace, move slowly and let things grow naturally :). Usually, it happens over time, and eventually your FreeBSD install looks a lot more like a MySiteBSD install ;).

As for distribution... FreeBSD and NetBSD are both excellent. I'm a BSD junkie, so I'm partial, but the distribution you choose depends largely on what you are used to. Test them, see what makes you comfortable and go with it. The Linux distributions all rely on the same kernel, and that is really the bottom line, since none of the distributions has something "amazing" the others lack in a server environment.

Burhan
02-19-2005, 02:33 AM
Let's clear the air a bit.

First things first:

scalability != performance

This is a huge myth. Scalability does not mean "it will increase in performance". Scalability has to do with a lot more things (for example, maintainability, fault tolerance, etc.).

scalability != increase the number of servers

You can have the same application running on fifteen servers running behind a load balancer, that does not make the application "scalable". It makes the servers scalable.
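A minimal sketch of that setup in Python, assuming a plain round-robin balancer; the server names and the function are hypothetical, and real load balancers use richer policies.

```python
from itertools import cycle


def round_robin(requests, servers):
    """Hand each request to the next server in turn; return (request, server) pairs."""
    if not servers:
        raise ValueError("need at least one server")
    rotation = cycle(servers)
    return [(request, next(rotation)) for request in requests]


if __name__ == "__main__":
    # Fifteen identical (hypothetical) web servers behind one balancer.
    servers = [f"web{n:02d}" for n in range(1, 16)]
    for request, server in round_robin(range(30), servers)[:5]:
        print(f"request {request} -> {server}")
```

The balancer spreads the load evenly, but it knows nothing about the application: if every copy still fights over one database row or one local file, adding a sixteenth server changes nothing.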

I will not go into further details on that, if you need more fodder, google php scalability.

As far as large sites running PHP -- friendster.com (http://www.friendster.com) runs on PHP -- in fact, they moved from JSP + Tomcat to PHP to increase performance (read more on it here (http://troutgirl.com/blog/index.php?/archives/22_Friendster_goes_PHP.html))

pnorilsk
02-19-2005, 06:04 PM
Since we are talking about additional clarity, here are a few of my thoughts.

Indeed the direct equation of scalability and performance is not appropriate. Let us define scalability since it is one of the issues in our discussion. Scalability is the ability to run your application on whatever size system makes sense, yet be able to move that application to either smaller or larger systems when needed, without effort. Further, the application should continue to work in a cost-effective manner, with both good performance and responsiveness, in relation to the system resources available to it. This definition of scalability is common to the industry and obviously quite different from what was stated in the previous posting.

IMHO, with this definition of scalability, an application developed in PHP cannot compete with an application developed in a J2EE environment. While it has some capability to scale vertically, PHP has no solid mechanism for horizontal scalability (it lacks messaging/middleware support).

I also spent some time on the forum recommended above. A few postings from the thread caught my attention. In one, the author admits that one of the rationales for moving the application from Java/Tomcat to PHP was the convenience of linking PHP with C/C++ code to enhance performance on the backend, with a facade constructed in PHP. In the other, the author called attention to a documented study (I was surprised to find out about it) in which Java was shown to outperform PHP by a factor of 8. I have this document: "A performance comparison of dynamic Web technologies", published in ACM SIGMETRICS Performance Evaluation Review, Volume 31, Issue 3. I can provide it on request. I am the first to mention some shortcomings in this document, but it has factual data, not the "expert" opinions on this thread.

So, I would suggest (if I can) to the "experts" on this thread to use hard core data, not their opinions in support of one or another point of view.

Peter Kinev.

JustinH
02-19-2005, 11:25 PM
Correct, scalability isn't the exact same thing as performance, but scalability and performance go hand in hand. Basically, scalability is how well a solution performs when the problem consistently gets bigger.

In the world of PHP, that means how well does PHP perform, as the software gets larger and the amount the software is accessed gets larger.

Originally posted by pnorilsk
Indeed the direct equation of scalability and performance is not appropriate. Let us define scalability since it is one of the issues in our discussion. Scalability is the ability to run your application on whatever size system makes sense, yet be able to move that application to either smaller or larger systems when needed, without effort. Further, the application should continue to work in a cost-effective manner, with both good performance and responsiveness, in relation to the system resources available to it. This definition of scalability is common to the industry and obviously quite different from what was stated in the previous posting.

That's overly specific, but yes that is the idea.

IMHO, with this definition of scalability, an application developed in PHP cannot compete with an application developed in a J2EE environment. While it has some capability to scale vertically, PHP has no solid mechanism for horizontal scalability (it lacks messaging/middleware support).

You obviously don't use PHP much. Try doing some research on Satellite and you'll see that PHP is quite capable of handling ORB via CORBA and that's the industry standard. What middleware applications does Java support that PHP doesn't? It supports the most widely known architecture, so that comment is without ground.

In the other, the author called attention to a documented study (I was surprised to find out about it) in which Java was shown to outperform PHP by a factor of 8. I have this document: "A performance comparison of dynamic Web technologies", published in ACM SIGMETRICS Performance Evaluation Review, Volume 31, Issue 3. I can provide it on request. I am the first to mention some shortcomings in this document, but it has factual data, not the "expert" opinions on this thread.

Lance Titchkosky (the primary author) is a Java developer, and EVERY project that he's had a hand in (Timetracker and eCasino being the major ones) is a Java application... and in the world of research there is a LARGE difference between a "study" and a "paper". This wasn't a study.

So, I would suggest (if I can) to the "experts" on this thread to use hard core data, not their opinions in support of one or another point of view.

Peter Kinev.

And I'd suggest that you first read up on PHP, and second, not cite the biased opinions of two Java programmers working on their master's thesis.

pnorilsk
02-20-2005, 10:46 AM
To the public at large: the author of the posting above tries to compensate for a very limited, incomplete knowledge of technologies and programming languages (IMHO, as all his postings on this thread and elsewhere suggest) with what I call “militant ignorance”. And sometimes this rhetoric can be quite confusing and believable. I cannot allow myself to be engaged in a discussion with him on this level. I will choose the second-best option: I will ignore his postings.

Regards,
Peter Kinev.

JustinH
02-20-2005, 02:10 PM
Funny, I've been a PHP programmer for 5 years and prior to that developed exclusively in Perl. I've been a member here for many years as well, giving advice on everything from firewalls to programming. You, on the other hand, simply ignored my post because it presented a logical counter to your claims.

I did my part. The fact of the matter is you couldn't come up with any evidence to the contrary of what I said, both about PHP's middleware support and about the master's thesis that two Java programmers wrote. And instead of admitting you can't, you attack me with name-calling and suggest that I'm "wrong", with yet again, no evidence to back up your statements.

You'll find in the world of debate, this is a fallacy-riddled argument, and if anyone suffers from "militant ignorance", it would be the person that attacks the poster and not the post. Furthermore, all of my "postings" on this thread and elsewhere are backed by one solid fundamental that yours lack: evidence.

Roy@ENHOST
02-20-2005, 02:29 PM
I am confident about the M part. I originally had doubts about MySQL and was pulling out my credit card, getting ready to buy Oracle, when I stumbled upon this article:
mysql.com/press/release_2003_21.html

It gives you so much confidence when you know that the nation's 4th largest cable company is powered by it.
A lesson to be learnt for web hosts: if you host a famous company, you will instantly gain confidence and thus business.

lockbull
02-21-2005, 01:35 AM
I'll state my bias upfront that my company primarily develops in Java at this point, but we still do a bit of PHP work every now and then. I'd like to make a few (long) comments.

1. What type of site are you talking about when you ask about "enormous" sites using LAMP?
Yahoo!, Google, Friendster, etc. have a totally different set of criteria than a transactional website like Amazon, eBay, Orbitz, FedEx or Bank of America. While the former may rely on PHP or something like that, the latter group relies on Java, at least for its backend transactional processing. For a search engine like Google or Yahoo!, the data is skewed heavily towards being read-only, and therefore they don't run into the sort of consistency issues you encounter in transaction-oriented systems. If you look at what goes on in major financial institutions, Java is king and PHP is non-existent in that realm. Now obviously an enterprise customer is going to have a different set of concerns than a dotcom, but the transaction-oriented sites I mentioned (as well as many other large sites) run Java for their transactional backends. PHP's "shared nothing" architecture allows it to scale very well horizontally for something like a search engine or the web/presentation tier, but the lack of a persistence mechanism like Java's works against its usage in high-volume transactional systems. Java's worst problem is that there were way too many over-architected, overblown, under-performing applications put out when Enterprise Java Beans were all the rage for even moderate-scope enterprise applications (it didn't help that Sun's blueprint laid it out that way). Java can certainly be quite "lightweight" when done properly (we extensively use the Spring framework for instance), and that's becoming more the case now after all these failed enterprise projects left the managers asking a lot of tough questions about their Java architectures.
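For readers new to the term, here is a rough sketch of what "shared nothing" means at the web tier: a request handler keeps no state of its own between requests, so any server can take any request. It is written in Python rather than PHP, and the store class, the handler, and the cart example are all hypothetical.

```python
class SessionStore:
    """Stand-in for a store every web server can reach (a database or cache)."""

    def __init__(self):
        self._sessions = {}

    def load(self, session_id):
        return dict(self._sessions.get(session_id, {}))

    def save(self, session_id, session):
        self._sessions[session_id] = dict(session)


def handle_add_to_cart(store, session_id, item):
    """Stateless handler: all it knows comes from the request and the store."""
    session = store.load(session_id)
    cart = session.get("cart", []) + [item]
    session["cart"] = cart
    store.save(session_id, session)
    return cart


if __name__ == "__main__":
    store = SessionStore()
    # Imagine these two calls landing on two different web servers.
    handle_add_to_cart(store, "visitor-1", "book")
    print(handle_add_to_cart(store, "visitor-1", "lamp"))
```

The web tier scales by adding boxes, but all the coordination is pushed down into the shared store, which is where the transactional concerns described above come back in.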

2. While it's usually somewhat obvious what a site uses for its front end web tier, that doesn't tell you anything about what technologies they are using on the backend.
eBay, for instance, migrated to a Java backend that supposedly handles more than 400 million transactions per day (not financial transactions, mind you), though for whatever reason they chose to use IIS for their frontend servers and not Apache. Amazon.com is a large BEA customer that uses WebLogic/Tuxedo & Oracle RAC on Linux extensively for its backend and commodity Linux/Apache boxes for its web tier. Orbitz uses Linux pretty much everywhere with a Java SOA backend (and some custom LISP algorithms); interestingly I read they considered MySQL at one point but that was dropped in favor of Oracle RAC, whom they blamed for some high profile outages in 2003 IIRC. Yahoo! is held up as one of the poster boys of MySQL usage, but everything I've read about pertains to somewhat transitory data like stock prices from Yahoo! Finance and news items for Yahoo! News. I wonder if they use MySQL for their Yahoo! Shopping & Stores, which do transactional processing. I think that's much less likely.

3. Are you talking about usage of the entire LAMP stack or just elements of it?
I think you'll find several large sites that use elements of it, but I'd be hard pressed to name one that uses the entire stack without a ton of modifications, and I doubt there are any large scale transactional systems built on LAMP. I'd say that PHP and MySQL are the weakest links for high volume transactional sites; you're much more likely to see Apache and Linux used in sites like that than PHP and/or MySQL, although a surprising number of heavily-trafficked/Fortune 500 sites use the Sun/iPlanet web server. MySQL Cluster would probably see more acceptance if it wasn't solely an in-memory database; we've actually used it for a few things and so far it's pretty solid and based on code used by some European telcos in the mid to late 90s. And MySQL 5.0 will introduce some much needed high availability features. There are other open source databases that are arguably better for transactional data at this point (PostgreSQL, Ingres, Firebird), though I think the sheer energy and momentum behind MySQL will close the gap and probably overtake the others within the next 12 months.

As far as middleware is concerned, that's pretty much a clean sweep for Java compared to PHP. Surely you can't be arguing that the middleware portfolio of products available in PHP rivals Java in any way, shape or form? Middleware is needed in large enterprises with disparate systems. Java has a large enterprise base, while PHP has pretty much none (aside from a few installs of SugarCRM maybe at the departmental level). CORBA support is not the be-all end-all of middleware, even if you take an extremely narrow definition of middleware. Show me the PHP equivalent of products like BEA's Tuxedo, IBM's CICS, TIBCO, Tangosol's Coherence, the ObjectWeb projects, etc. The momentum behind middleware at this point is in things like enterprise service buses, SOA, etc.; way more of that is getting done in the Java/J2EE world than PHP, simply because there isn't much of a need for it with PHP.

vk101
02-21-2005, 10:41 AM
Thanks for your responses!

1) First of all, what is web-tier, backend, and middleware? Sounds like these three things go together?

2) Secondly, when asking this question I thought that Apache, Linux, or MySQL would be the weakest links, not PHP...isn't PHP just a language like Java? The computer goes through the code and executes whatever you want it to do. I thought code in these two different languages shouldn't make a difference, it's just two different ways of writing the same thing - because you can always get the same result programming in either language, right? Why, then, does PHP seem to suck?

3) I couldn't tell what you meant by MySQL Cluster being an in-memory database? (Also, what is MySQL Cluster?)

4) I couldn't tell whether you meant MySQL 5.0 (in the next 12 months) would make MySQL the best database to use period, or whether it would make it the best of the open-source databases?

pnorilsk
02-21-2005, 12:49 PM
I am impressed with lockbull's posting. Let me add a few more critical points to his explanation in favor of Java as it applies to the development of big sites.

1. Security provisioning in Java and associated topics, such as identity management, are far more advanced compared with PHP.

3. Development time in Java can be reduced by a factor of 5-10 if proper technologies and frameworks are applied. In addition to the frameworks mentioned by lockbull I would add:
- Hibernate, Torque, Castor for O/R mapping and persistence
- Struts, Webwork, Expresso, Cocoon for MVC, MVC+1
- Ant, Eclipse, Maven - for IDE and development
- JSP, Velocity, JAF - for front end templating
This quite loose grouping doesn't include the thousands more OSS frameworks and technologies in the Apache organization and outside. Please pay attention: all the technologies mentioned are distributed under one or another form of OSS licensing. There are thousands more commercial packages available on the market.

I mentioned once somewhere in this forum that I would be able to build an e-commerce application with database support from the ground up in 2-4 hours. Please compare that with months of work in PHP.

4. Code maintenance in a properly designed web application is much easier in Java.

Well, here is a problem. To learn these technologies and frameworks one needs time (a few years) and a solid educational foundation (a master's in CS would be desirable). That cannot be said of the "experts with militant ignorance", so they must find another way to defend the indefensible. BTW, I have a nagging feeling that lockbull, like myself, has experience with more than one programming language and technology. During my 20+ years of development I have worked with the majority of programming languages and technologies, including scripting languages such as Perl (my favorite for scripting if AWK cannot handle it), Tcl/Tk (an unfortunately somewhat forgotten language, but used by AOL for its website), PHP, and Python. Thus, we can compare, and we have the authority to express our opinion.

Regards,
Peter Kinev.

folsom
02-21-2005, 02:36 PM
Originally posted by pnorilsk
Iam impressed with lockbull's posting. Let me add few more critical points to his explanation in favor of Java as it applicable to development of big sites.

1. security provisioning in Java and associated topics, such as identity management are far more advanced as to compare it with PHP.

3. development time in Java could be reduced by the factor 5-10 if proper technologies and frameworks will be applied. In addition to frameworks mentioned by lockbull I add,
- Hibernate, Torque, Castor for O/R mapping and persistence
- Struts, Webwork, Expresso, Cocoon for MVC, MVC+1
- Ant, Eclipse, Maven - for IDE and development
- JSP, Velocity, JAF - for front end templating
This quite loose groping doesn't include thousand more OSS frameworks and technologies in Apache Organization and outside. Please, pay attention - all mentioned technologies are distributed under one or another form of OSS licensing. There are thousands more commercial packages available on the market.

I mentioned once somewhere in this forum that I would be able to build an e-commerce application with database support from the ground up in 2-4 hours. Please compare that with months of work in PHP.

4. Code maintenance in a properly designed web application is much easier in Java.

Well, here is the problem. Learning these technologies and frameworks takes time (a few years) and a solid educational foundation (a master's in CS would be desirable). That does not apply to the "experts with militant ignorance", so they must find another way to defend the indefensible. BTW, I have a nagging feeling that lockbull, like me, has experience with more than one programming language and technology. During my 20+ years of development I have worked with the majority of programming languages and technologies, including scripting languages such as Perl (my favorite for scripting if AWK cannot handle it), Tcl/Tk (unfortunately a slightly forgotten language, but used by AOL for its website), PHP, and Python. Thus, we can compare, and we have the authority to express our opinion.

Regards,
Peter Kinev.

Most places, you would be locked up for stroking yourself off in public.

pnorilsk
02-21-2005, 02:49 PM
Originally posted by folsom
Most places, you would be locked up for stroking yourself off in public.
I hope you understand the meaning of my posting. I don't expect to gain any advantage from it - I couldn't care less. Too often on this forum, people with limited knowledge and experience try to pontificate on subjects they know little or nothing about. I am not against people who are trying to learn and profess. I am against "militant ignorance" in my business. Would you allow open heart surgery to be performed by a doctor with no medical education who is doing it for the first time? Please think about it in this way, and I hope you will then understand my posting.

Roy@ENHOST
02-21-2005, 02:49 PM
folsom, that was totally unnecessary.

Anyway I admire you guys out there who are proficient in many programming languages.

I've been with PHP for a while now and I think that PHP has very limited OOP capability, but it is very popular due to its ubiquity.
I mean, most servers don't come equipped with Java capability.

pnorilsk
02-21-2005, 03:22 PM
I have to disagree with your last statement - the majority of servers come with Java support (RH, for instance). If not, you can get it freely from http://java.sun.com (one of the places). If you choose to download, select J2SDK, not J2RE - the former has the complete set of Java technologies, including development support. If you need to develop a web application you will need one or another "container" - a few of them are available free of charge. I would advise getting Tomcat from Apache at this time. You will also need a development/build tool - get Ant. That's it - you are in business to develop and run your first application. BTW, Tomcat comes with a set of JSP/servlet samples.

I agree with the rest of your posting. PHP is the language of choice and an appropriate solution for the majority of people on this forum. It provides everything you need to get on the Web with a small web-centric application. But the thread originator was asking about the ability of PHP to support large and/or extremely large applications. And, as we believe (IMHO), PHP at the present time has very serious limitations (see this and other threads on this topic).

Regards,
Peter Kinev.

hiryuu
02-21-2005, 09:04 PM
1) First of all, what is web-tier, backend, and middleware? Sounds like these three things go together?
Larger server structures are not just wide (load balanced pools), but deep, as well. The web-tier talks with the middleware, and the middleware talks with the backend (storage layer). I'm sure pnorilsk could give you pages on the topic. It gives you a level of abstraction (web-tier doesn't really care how a given thing is stored) for easier maintenance. It also filters tasks as cheaply as possible -- you can easily add new web servers, so static stuff is handled there. The storage layer is very difficult to expand, so nothing gets down that low unless it really, really needs to.
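
To make the "deep" part a bit more concrete, here is a rough Python sketch of the three layers. Every class and name in it is made up for illustration (none of it comes from a real site or product): the web tier answers static requests itself, the middleware caches, and the storage layer is only touched when nothing higher up can answer.

```python
# Hypothetical three-tier layout; all names here are illustrative assumptions.

class StorageLayer:
    """Backend: the only tier that knows how the data is stored."""
    def __init__(self):
        self._rows = {"42": {"name": "widget", "price_cents": 1999}}
        self.reads = 0  # counts how often a request got down this low

    def fetch(self, product_id):
        self.reads += 1
        return self._rows.get(product_id)


class Middleware:
    """Middle tier: application logic plus a cache that shields the backend."""
    def __init__(self, storage):
        self._storage = storage
        self._cache = {}

    def product(self, product_id):
        if product_id not in self._cache:
            self._cache[product_id] = self._storage.fetch(product_id)
        return self._cache[product_id]


class WebTier:
    """Front tier: serves static content itself and delegates the rest."""
    STATIC = {"/logo.txt": "static logo"}

    def __init__(self, middleware):
        self._middleware = middleware

    def handle(self, path):
        if path in self.STATIC:
            return self.STATIC[path]  # never leaves the web tier
        if path.startswith("/product/"):
            item = self._middleware.product(path.rsplit("/", 1)[-1])
            if item is None:
                return "404 not found"
            return f"{item['name']}: {item['price_cents'] / 100:.2f}"
        return "404 not found"


storage = StorageLayer()
web = WebTier(Middleware(storage))
print(web.handle("/logo.txt"))
print(web.handle("/product/42"))
print(web.handle("/product/42"))
print("backend reads:", storage.reads)
```

The web tier never sees how a product is stored, and the repeated product request is answered from the middleware cache, so the storage layer is read only once.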

2) Secondly, when asking this question I thought that Apache, Linux, or MySQL would be the weakest links, not PHP...isn't PHP just a language like Java?
It's a language but, in this discussion, it's also an implementation. If you ran Java, PHP, and Perl all as CGIs (fork per request), they would all run like a wounded elephant. PHP is usually embedded in Apache, but each request starts from scratch, separate from any other request, so anything that requires cross-client communication or complex objects has a lot of setup work to do. Java (my knowledge is a little weak on the specifics here) is run as a separate multi-threaded process, so most of the setup work is done once and just hangs around, and threads can share data. Amazon is in a transition right now, but most of the site uses a massive FastCGI process called obidos. Like Java, fastcgi uses separate persistent processes to retain data and objects between requests. Amazon probably modified their copy, but the public version doesn't do multi-threading, so the processes don't share data except where you specifically program it in.
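
A toy Python sketch of that difference, not a model of how PHP, Java or FastCGI are actually implemented: `expensive_setup` is a hypothetical stand-in for whatever a request needs before it can do real work. In the first style every request repeats the setup; in the second a long-lived worker does it once and its threads share the result.

```python
import threading


def expensive_setup():
    """Stand-in for loading config, opening connections, building objects."""
    expensive_setup.calls += 1
    return {"greeting": "hello"}


expensive_setup.calls = 0


def handle_from_scratch(name):
    """Per-request style: state is rebuilt for the request, then thrown away."""
    state = expensive_setup()
    return f"{state['greeting']} {name}"


class PersistentWorker:
    """Long-lived style: setup runs once; threads share state behind a lock."""

    def __init__(self):
        self._state = expensive_setup()
        self._lock = threading.Lock()
        self.hits = 0  # shared across all request threads

    def handle(self, name):
        with self._lock:
            self.hits += 1
        return f"{self._state['greeting']} {name}"


# Three requests, each starting from scratch.
for n in ("a", "b", "c"):
    handle_from_scratch(n)
scratch_setups = expensive_setup.calls

# Three requests served by threads of one persistent worker.
expensive_setup.calls = 0
worker = PersistentWorker()
threads = [threading.Thread(target=worker.handle, args=(n,)) for n in ("a", "b", "c")]
for t in threads:
    t.start()
for t in threads:
    t.join()
persistent_setups = expensive_setup.calls

print(scratch_setups, persistent_setups, worker.hits)
```

The same three requests cost three setups in the first style and one in the second, and only the persistent worker has anywhere to keep a counter that all requests can see.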

3) I couldn't tell what you meant by MySQL Cluster is an inmemory database? (Also, what is MySQL Cluster?)
Just like it sounds: in-memory. That does limit its versatility and reliability quite a bit. You can catch more here:
http://dev.mysql.com/doc/mysql/en/ndbcluster.html

4) I couldn't tell whether you meant MySQL 5.0 (in the next 12 months) would make MySQL the best database to use period, or whether it would make it the best of the open-source databases?
Over the other OSS databases (they're all a far cry from Oracle, if you really need those features). I also think he was describing the relative development pace, rather than MySQL 5.0 specifically. MySQL is so ubiquitous that people would rather shape MySQL into what they need than build their application to use something else. I suspect, as PHP-oriented developers move into larger projects, we'll see a similar transition in PHP. Having seen the growth from 3.x->4.x->5.0, I don't think there's anything inherently broken enough to stop it.

pnorilsk
02-21-2005, 10:25 PM
One more very thoughtful and technically sound posting (no patronizing or anything like that). IMHO, hiryuu managed to illuminate, in a very concise way, a point that I tried to put in so many words.

Here is what he said: "It's a language but, in this discussion, it's also an implementation." This statement goes to the core of our discussion.

Regards,
Peter Kinev.

vk101
02-21-2005, 11:16 PM
Thanks for everybody's information.

So what would the best thing to do be for somebody that wants to develop a website but doesn't want to change which software programs are used if and when it becomes a huge site?

Earlier, it was mentioned that things like PHP, Apache, and Linux used in huge websites have to be modified if the site becomes huge. If you just start off with things like Java, Unix and Oracle, do you still have to make any changes apart from the number of servers? Because the code you need would be the same, it would just be applied to a lot more people, right?

Anyways, if one decides to use Java and Oracle, how can you practice with these things? Oracle isn't free, is it? And with Java, I never even knew Java could do web things...do you mean JSP, the equivalent of PHP and ASP? Because if you use Java, isn't it either an applet or application? I'm not too sure, if somebody could lay all this out for me in LAYMAN's terms, I'd totally appreciate it.

Thanks everyone for all your help!

unlucky1
02-21-2005, 11:38 PM
Google was originally built (http://www-db.stanford.edu/~backrub/google.html) using C++ and Python. Didn't see if they stated a DB they used, though it's a lengthy article and I might have missed it.

hiryuu
02-22-2005, 01:36 AM
Originally posted by vk101
So what would the best thing to do be for somebody that wants to develop a website but doesn't want to change which software programs are used if and when it becomes a huge site?
The best thing would be to design some abstraction into the system, so you can replace pieces later, and not bite off more than you can chew. You can run a couple million pageviews on a fairly pedestrian setup so, unless you know you're going to move some serious traffic, you'll be better off with a shorter time to market and lower up-front costs.
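
One way to read "design some abstraction into the system", sketched in Python. The `PageStore` interface and both implementations are hypothetical, and SQLite is used only because it ships with Python; it stands in for whatever database you might move to later. The page code talks to the interface, so the storage piece can be swapped without touching it.

```python
import sqlite3
from abc import ABC, abstractmethod


class PageStore(ABC):
    """The only thing the rest of the site is allowed to know about storage."""

    @abstractmethod
    def save(self, slug, body): ...

    @abstractmethod
    def load(self, slug): ...


class MemoryStore(PageStore):
    """Good enough for day one: everything lives in a dict."""

    def __init__(self):
        self._pages = {}

    def save(self, slug, body):
        self._pages[slug] = body

    def load(self, slug):
        return self._pages.get(slug)


class SqliteStore(PageStore):
    """A later replacement backed by a real database."""

    def __init__(self, path=":memory:"):
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS pages (slug TEXT PRIMARY KEY, body TEXT)"
        )

    def save(self, slug, body):
        self._db.execute(
            "INSERT OR REPLACE INTO pages (slug, body) VALUES (?, ?)", (slug, body)
        )
        self._db.commit()

    def load(self, slug):
        row = self._db.execute(
            "SELECT body FROM pages WHERE slug = ?", (slug,)
        ).fetchone()
        return row[0] if row else None


def render(store, slug):
    """Site code: unchanged no matter which store is plugged in."""
    body = store.load(slug)
    return body if body is not None else "not found"


for store in (MemoryStore(), SqliteStore()):
    store.save("home", "welcome")
    print(type(store).__name__, render(store, "home"))
```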

With a few exceptions (MSN), the large sites you know today evolved into their current design; they didn't start there. Our main network uses three tiers, like most of the larger setups we've covered, but it started as a single PII/233 (now our primary DNS). We added the other layers as our needs grew. Most of the customizations and tweaks happen in that 'scratch an itch' manner.

This ties into a previous topic a bit: scalability describes cost relative to size. Most of this discussion has been toward massive upward scalability but, when starting off, you need to also consider downward scalability. I've only seen one single-proc Sun box -- it was the 'console' for the twenty or so giant Suns stretched along the wall. Solaris runs beautifully on dozens of processors, but to run it or Oracle for a couple thousand visitors is a complete waste of money.

Also keep in mind that the solutions available to you will change as time goes on. Today, it seems natural that Yahoo would use PHP or Java to drive their site. Back in '94, nothing like that was around -- it was CGI or you wrote your own solution. Right now, Oracle is your best option along the high end. Over the next few years, MySQL and Postgres will eat most of that market, just like Linux did with entry-level servers.

Anyways, if one decides to use Java and Oracle, how can you practice with these things? Oracle isn't free, is it? And with Java, I never even knew Java could do web things...do you mean JSP, the equivalent of PHP and ASP?
Oracle has a Personal Edition, and most database books will include a copy, if they don't offer it some other way. JSP is one approach, but the Tomcat web site goes into more details about the options.

banner
02-22-2005, 02:15 AM
Originally posted by vk101
So what would the best thing to do be for somebody that wants to develop a website but doesn't want to change which software programs are used if and when it becomes a huge site?


My recommendation would be not to worry too much about this. No matter what happens with your site it will be constantly changing. You can't know all of the features you'll want to implement in advance and you also will ALWAYS find something that you want to change. As a result of this you will likely find that your site will undergo at least one major overhaul as it grows. Personally, I would focus on building your site into something functional that attracts and keeps visitors before I would focus on building something that will scale out effectively unchanged.

As for the various Java/PHP/MySQL/Oracle/etc options I can't help much. I do all my work in a Windows 2003/ASP.NET/MSSQL world and haven't really kept up with the other technologies.

Burhan
02-22-2005, 03:32 AM
I mentioned once somewhere in this forum that I would be able to build an e-commerce application with database support from the ground up in 2-4 hours. Please compare that with months of work in PHP.

I'm sorry, but I wouldn't trust anyone who can build an e-commerce application in 2-4 hours. I wouldn't trust them, and I probably wouldn't trust the application with my business.

As far as your arguments for frameworks, IDEs and so on -- you are correct -- Java does excel when it comes to application and third party support. PHP is lacking here, primarily because PHP (a) is new and (b) doesn't have corporate backing as large as Sun (and other major Java players). However, this is changing with Zend as they are trying to promote PHP as an enterprise development platform.

In all honesty, it's really not practical to compare Java and PHP because they are designed for two very different purposes. PHP will never come close to the functionality of Java, and Java will probably never come close to the simplicity of PHP.

pnorilsk
02-22-2005, 10:31 AM
Originally posted by fyrestrtr
I'm sorry, but I wouldn't trust anyone who can build an e-commerce application in 2-4 hours. I wouldn't trust them, and I probably wouldn't trust the application with my business.
I wouldn't either, if my experience in software engineering were limited to bottom-up methodology and procedural programming. In all honesty, it's ridiculous to think that a good-quality e-commerce solution could be done in 2-4 hrs. I wrote that expecting to get this kind of response. But I will be able to build a fully functional site in 2-4 hrs for one and only one reason: I have a suite of prefabricated components and frameworks. We do not program sites as you do in PHP; we build them from blocks. Again, this is quite a simplistic way to present the difference, but I hope it illustrates the point hiryuu made earlier - "difference in implementation". To build an e-commerce site in 2-4 hrs I need to take your data - product names, groupings and sub-groupings, costs, descriptions and pictures/images. I will place them in an XML file (using the easily navigated XML editor in my Eclipse GUI IDE) and invoke one single command, something like "ant all". It will build the database in PostgreSQL (btw, my Eclipse has a very good plug-in for dealing with the majority of databases), then build all the associated objects in Java and compile them. In this specific case, I already have some business logic in place for e-commerce-type businesses in the immutable part of my environment - otherwise I would need to build that business logic. After that my back-end is ready, with complete database and transactional support; the security provisioning is built in as well. Now I need to do some facade. Guess what, I am using the Velocity framework instead of JSP as the templating engine. Some changes in XML-like files and voilà, my front-end is ready as well.
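
For readers who want to picture the "data in an XML file, one command builds the database" step, here is a much-reduced Python sketch. It is not the Ant/Java/PostgreSQL tool chain described above: the XML layout, the `build_store` function and the table schema are all invented for illustration, and SQLite stands in for PostgreSQL so the example is self-contained.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical catalog format: groups containing products.
CATALOG_XML = """
<catalog>
  <group name="books">
    <product name="PHP Primer" price="29.95">Intro text</product>
    <product name="Java Handbook" price="39.50">Reference</product>
  </group>
  <group name="music">
    <product name="Blue Album" price="12.00">CD</product>
  </group>
</catalog>
"""


def build_store(xml_text):
    """Create the product table and load it from the catalog in one step."""
    root = ET.fromstring(xml_text.strip())
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE product ("
        "grp TEXT, name TEXT, price_cents INTEGER, description TEXT)"
    )
    for group in root.findall("group"):
        for product in group.findall("product"):
            # Store money as integer cents to avoid float rounding in queries.
            cents = round(float(product.get("price")) * 100)
            db.execute(
                "INSERT INTO product VALUES (?, ?, ?, ?)",
                (
                    group.get("name"),
                    product.get("name"),
                    cents,
                    (product.text or "").strip(),
                ),
            )
    db.commit()
    return db


db = build_store(CATALOG_XML)
for row in db.execute("SELECT grp, name, price_cents FROM product ORDER BY name"):
    print(row)
```

The point of the sketch is only that the site-specific input is data, while the code that turns it into a working back-end is written once and reused.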

At this point I have a very solid prototype in place - I would not yet call it a finished product. But in some places this work would be considered good enough to put into production.

Well, I should admit that I reuse many thousands of man-hours of my own work and of many contributors' work to achieve this level of simplicity. And this is a qualitative difference between PHP and Java. And this is the topic of our discussion.

Regards,
Peter Kinev.

P.S. I hope this will not be construed as self-promotion. How else can I explain it?

Burhan
02-23-2005, 04:02 AM
Well, I should admit that I reuse many thousands of man-hours of my own work and of many contributors' work to achieve this level of simplicity. And this is a qualitative difference between PHP and Java. And this is the topic of our discussion.

If you mean by this statement that PHP doesn't allow the reuse of code, then you really don't know much about PHP. There are many free components and frameworks that are available for PHP that allow for the RAD-type approach that you are talking about.

Again, the differences are quite visible because of the differences in the maturity of both languages. Java will have more components, PHP not as many, however, PHP is catching up.

A few examples :

http://www.blueshoes.org/
http://www.xisc.com/ (Prado)
http://pear.php.net/
http://propel.phpdb.org/wiki/
http://phing.info/wiki/index.php
http://creole.phpdb.org/wiki/

You will note that most of these projects are inspired by Java frameworks.

m-b
02-23-2005, 07:14 AM
Actually, you can build applications of any size with any programming language!
But you will also face several problems when "scaling up" your hardware and software ...

A good book that covers this topic for PHP 5 is Advanced PHP Programming (http://www.samspublishing.com/title/0672325616) by George Schlossnagle (Sams Publishing)!
Most parts of the book describe general problems with examples in PHP, so you can read it even if you finally decide to use another programming language ...


Michael