Web Hosting Talk







View Full Version : Which one is more resources intensive?


NE-Andy
05-14-2005, 10:25 PM
I'm writing a statistic tallying script right now which needs to find running totals for 10 tables with several thousand indices (as of right now, all 2000 indices or more on each table, MySQL dump gives a 54MB .sql file and will continuously grow as time proceeds). Right now, I see two viable ways to do it, but I need to know which one is more resources intensive (IE: uses more CPU/RAM) so I can avoid doing that:

First method:
Run a CREATE TABLE with TYPE=MERGE UNION with the 10 tables, and then do a SELECT * FROM the newly created table with ORDER Value1 Desc.

Second method:
Run 10 SELECT * FROM the 10 old tables with LIMIT 100 (the amount I need) and then run a selection sort through the 1000 indices.

After either of the above process is done, data is then serialized into a long string which is dumped to another mysql server for frontend presentation. So the above can be setup into a crontab and put to execute on the lowest preformance required hour of the day.

I'm looking at it now and I realized that regardless of what I do, this process will be fairly resources intensive for small bursts when its processing (~3 seconds for the second method, have not tried the first method due to certain restrictions which I can over come if proven it'd be faster/more efficient), but optimally I would like to make it least intensive as possible. Can someone tell me which one of the two above outlined methods are more efficient? Additionally, I wouldn't mind coding anything extra for as long as it works in the most efficient manner. So if you have any other methods that I've overlooked, I'm also open for suggestions.

Thanks in advance!

error404
05-15-2005, 12:52 AM
I suspect that the first method is probably faster, though merging the 10 tables into one would probably be significantly faster, and if they all contain the same data (and I presume they do from your description), I don't know why you didn't do this in the first place. There really isn't much reason not to.

In any case, I suspect that the first method is probably slower, as it needs to copy the entirety of all of your tables into a new one - if both copies can fit in RAM it might be faster, but you'd have to analyze it yourself (use the EXPLAIN statement extensively and analyze performance logs).

The only way to know for sure is for you to try it with your real data set.

xelav
05-15-2005, 03:29 AM
what is the structure of tables - similar or not. what dependencies are between tables? I think, this information needed to get right solution

NE-Andy
05-15-2005, 01:20 PM
The 10 tables are character profiles for an online game I'm working on the side for resume filler. They were not done in one table in the first place for two reasons: 1) table locking while they're being read/written into from the queries of game server actively saving+reading profiles. This will reduce the chance of two people having to access the same table and resulting one having to wait for the other. 2) reduction in overhead, having it done like this, the chance of all ten tables having overhead is lesser than having one table with massive overhead.

Each table contains several fields with the exact same structure, more specifially:
29 int(10) unsigned
26 smallint(5) unsigned
6 smallint(6)
12 tinyint(3) unsigned

They have the same fields, but the tables have different amount of records and records are different; and I expect them to grow over time as we get more players and what not.

My original impression of doing this is that if I do merge everything, it will require A) the resources to store all values in all 10 tables + B) the resources to write out into one large table + C) resources to select the top 100 records. Where as if I were to do it the second method, I'd only need A) resources to run select queries 10 times for top 100 of each of the 10 tables (creating an array of 1000 values) and B) use php to sort through the records and C) truncate anything after the initial 100. However, being fairly new to all these optimization stuff, I am not entirely certain which method would be quicker, so I figured I'd come by and ask. Thanks for the quick responses :)

error404
05-15-2005, 06:52 PM
If the data is similar, store it in the same table. Locking and memory caching etc. is the DBMS' problem and you shouldn't concern yourself with it. That's the entire reason you use a DBMS rather than text files. It'll do a much better job of optimizing queries on one large table than a dozen small ones containing similar data. It then has far more indexes to maintain in RAM, and will be pulling stuff onto and off of the disk all the time because it can't optimize memory allocation between tables easily. The DBMS programmers have spent man-years of time working on optimizing these things, I don't think using the database in a wonky, difficult to use fashion is going to help you. Certainly, in this situation, it will probably hurt performance 10 fold at least over just doing a regular query.

What you're saying is 'I did it this odd way for performance, and now I have to kludge a solution to this problem, and I want to know which of my two grossly inefficient solutions is going to be faster.' Well the answer is that you shouldn't try to 'beat' the database at it's own game. Design your stuff sanely and don't think about what's going on at the lower level; it's not your concern. What you're doing defeats the purpose of using the database. Let it do it's job: making yours easier.

If you really insist on such a whack database design, use the second solution, and select into a temporary table, not a PHP array. Less IPC and more optimization can be done by the database, as long as the creation overhead isn't too great.