Web Hosting Talk







View Full Version : Running Massive Databases


MaB
07-02-2003, 06:28 PM
Hi - we've developed a solution for a client - the website runs data feeds from merchants and then allows visitors to search for these items (in categories) and the results are show in order of the merchants bid per click (ie a bid per click search engine but using datafeeds)

The problme is that they are growing fast and its starting to get a tad slow. The machien is a dual xeon (looks like 4 cpu's therefore) and 1 G ram (up to 4 G possible)

The database runs on MySQL4 - however we're afraid the database will have millions of listings in a quick order of time and we're looking into how to let this grow without running into the slowdowns -

how do huge databases (such as this site which I can imagine has millions upon millions of database entries) run efficiently?

We've thought of solutions that involve load balancing / clustering but more details are needed. I know clustering on MySQL is one way (replication) but it may have to do.

What can we do to speed up the search through millions of entries? Load the whole db into ram? Split the database so that different products/merchants are on different servers?

Any real suggestion (maybe some from the WHT staff) as to how this can be done would be greatly appreciated.

Thanks.

jimt-bocacom
07-02-2003, 07:52 PM
One trick we use for our backend DBs is to run them on RAID 5 IDE arrays. We use 3ware 7850 RAID controllers. These are parallel IDE controllers (one per drive) with a transfer speed of 120Mb/s. With an 8 drive array, we can get a very high trasfer rate (reads at 175 Mb/s). If you combine that with a four way XEON box (we haven't yet, but will be soon), you should have one fast fault tolerant server.

You can check out the specs here:

http://www.3ware.com/products/benchmarks.asp

We are running MSSQL here, so I cannot really give you any advice on MySQL.

Jim

MaB
07-02-2003, 07:56 PM
It's running on a 3ware hardware raid card but its only 2 port raid 0 which means writing is 2x as slow but reading is still 2x as fast...

however, we arent hosting this server in one of our racks at IDT (for reasons I will not get into now) so its sort of beyond my control what they can / cant put in there :( I may look into asking them to convert it into raid 5 (there will be alot of downtime tho:()

Thanks for your reply.

genlee
07-02-2003, 08:53 PM
Optimizing your queries will probably give you the most improvement. You can use vmstat to see where your server is lagging while it is under load and go from there.

Daijoubu
07-02-2003, 08:58 PM
IPB use mysql FULLTEXTE index for search

jimt-bocacom
07-02-2003, 09:31 PM
Thanks for your reply
No Problem. Gook Luck! :)

Jim

DenRomano
07-03-2003, 01:07 AM
In this case you are going to need to both spend money on hardware and peak the software.

I would have 2 servers. 1 for app and 1 for mysql. Load up on the RAM

I would create several DBs to do the work.

MaB
07-03-2003, 08:16 PM
What do you mean peak software? You mean other database (like oracle?)?

boxman
07-03-2003, 10:41 PM
With such a huge project, you probably need to abandon the free database and spring for an Oracle license. Ouch -- it might cost you five figures for all those processors.

MaB
07-03-2003, 10:47 PM
:(

Well, i'm not the one paying for the project, im just the programmer :) Ill try to run it by em tho

How do sites such as WHT that have 1,000,000's of entries do it? or other large sites like yahoo financial or OSDN that run mysql? (i beleive hearing google ran mysql but i may be mistaken)

LiveRack
07-03-2003, 11:11 PM
I have a site that has around 3,000,000 records in it and recieves a little over 12-14 hits a second average and it's doing fine on a Dual Xeon system with 512MB RAM (apache 1.3 and mysql 4.0). The load average is around 0.5.

I'm sure mysql 4.0 can handle the kind of loads you describe. However, if you really need the best support possible then consider Oracle.

MaB
07-03-2003, 11:45 PM
Well its rather complicated because to display a single product it has to go through 1,000,000's of entries to get just a product match, then the same amount of records in a different DB to find a list of merchants then a differetn db to get the product info, then get the pid per click from each of the 100's of merchants (which have 1000's of products) and compare and sort em in order :(

I guess I'll have to bunch it all together to make it quicker....

Daijoubu
07-04-2003, 12:03 AM
Will this help o.O
http://www.mysql.com/doc/en/Fulltext_Search.html

LiveRack
07-04-2003, 12:09 AM
Full text searching wouldn't be ideal for these kind of queries. I think indexes would be the way to go. Good indexes can dramatically improve performance.

MaB
07-04-2003, 12:26 AM
Would you be so kind as to explain the differenfce between the 2?