  1. #1
    Join Date
    Mar 2001
    Location
    NY
    Posts
    741

    Database Experts Opinions Wanted

    Hey guys,
    I'm running a server that handles logging and reporting, and it's pretty processor intensive. I have a list of numbers that I'd like to update the server with - approximately 100,000-1,000,000 numbers that have a URL attached to them in a separate field. I'm being told that it is taking too long for the database to be queried and spit out an answer... what can I implement to make the query go faster?

    So far, my thoughts are:

    #1) Move the keyword part of the schema to a separate server, run the queries there, and report the answer back.

    #2) Update and improve the technical aspects of the server (more RAM, processors, etc)

    Anything else? Any help is very welcomed.

  2. #2
    Join Date
    Aug 2002
    Location
    Superior, CO, USA
    Posts
    633
    What indexes do you have on your tables? That is, is your where clause correctly using an index? A million rows is not very many for a modest sized machine.

    Can you provide the DDL for your table along with the query you are running? PM/email me if you don't want to do this publicly.

    I'd look at this before you look at more hardware.
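    For anyone following along, here is a minimal sketch of how to check whether your where clause is actually using an index. The table and index names are invented for illustration, and SQLite (via Python's sqlite3) stands in for whatever database is actually in use:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Hypothetical table: a number with a URL attached, as described above
cur.execute("CREATE TABLE numbers (id INTEGER, url TEXT)")
cur.executemany("INSERT INTO numbers VALUES (?, ?)",
                ((i, f"http://example.com/{i}") for i in range(10000)))

# Without an index, the planner must scan every row to satisfy the where clause
plan_before = cur.execute(
    "EXPLAIN QUERY PLAN SELECT url FROM numbers WHERE id = 1234").fetchall()
print(plan_before[0][3])   # a SCAN of the whole table

# Add an index on the column used in the where clause and check again
cur.execute("CREATE INDEX idx_numbers_id ON numbers (id)")
plan_after = cur.execute(
    "EXPLAIN QUERY PLAN SELECT url FROM numbers WHERE id = 1234").fetchall()
print(plan_after[0][3])    # a SEARCH using the new index
```

    Most databases have an equivalent (EXPLAIN in MySQL, EXPLAIN ANALYZE in PostgreSQL) that will show the same scan-versus-index distinction.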

  3. #3
    Originally posted by dherman76
    that i'd like to update the server with

    Here you would be inserting.


    Originally posted by dherman76
    too long for the database to be query'ed and spit out an answer

    Here you would be retrieving.


    Do you want to insert the information into the db?
    How are you using your indexes?

    Try inserting when the db is at a lower logging / reporting load.

    What does your query look like?
    Datums Internet Solutions, LLC
    Systems Engineering & Managed Hosting Services
    Complex Hosting Consultants
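    On the insert side, one generic thing worth checking (a sketch only - table names are invented, and SQLite stands in for the real database) is whether the rows are being loaded one statement at a time or batched inside a single transaction:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE numbers (id INTEGER, url TEXT)")

rows = [(i, f"http://example.com/{i}") for i in range(100000)]

# One transaction for the whole batch: the database commits once
# instead of once per row, which matters at 100,000-1,000,000 rows.
with conn:
    cur.executemany("INSERT INTO numbers VALUES (?, ?)", rows)

total = cur.execute("SELECT COUNT(*) FROM numbers").fetchone()[0]
print(total)  # 100000
```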

  4. #4
    Join Date
    Mar 2001
    Location
    NY
    Posts
    741
    I'm the strategy guy - and the technology guys don't want to add the keyword parts because they say it will load too slowly. How many rows can a normal db hold? Thanks for all your responses.

  5. #5
    Join Date
    Aug 2002
    Location
    Superior, CO, USA
    Posts
    633
    Yep, an index will slow things down a little bit on insert. If the insert is done as a batch type operation then you can disable indexes during the insert and re-enable them afterward.

    A million rows, especially given the small amount of data you're looking at (an int and a varchar() of some length), is not a lot of data. In my "day job" I deal with tens of millions of rows with a select time on the order of 20ms for a particular row. This is on a pretty lowly machine - a 450MHz UltraSparc II. But our data is primarily read-only so we've got indexes tuned pretty well.

    Is your data heavily updated or is it pretty static (say < 10% changed in an hour)? That should help you figure out the way to improve database performance. If you have no indexes on the table then, for every read, you will average reading half the table to find your data. This is called a full table scan and it is the worst possible thing you can do for performance - especially if the database has to read and ignore 500,000 rows.
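    The disable-indexes-during-a-batch-load idea can be sketched like this. SQLite (used here only for illustration) has no "disable index" switch, so dropping and recreating the index stands in for it; all names are invented:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE numbers (id INTEGER, url TEXT)")
cur.execute("CREATE INDEX idx_numbers_id ON numbers (id)")

batch = [(i, f"http://example.com/{i}") for i in range(50000)]

# 1. Drop the index so the load doesn't pay index-maintenance cost per row
cur.execute("DROP INDEX idx_numbers_id")

# 2. Load the whole batch in one transaction
with conn:
    cur.executemany("INSERT INTO numbers VALUES (?, ?)", batch)

# 3. Rebuild the index once, after the load
cur.execute("CREATE INDEX idx_numbers_id ON numbers (id)")

plan = cur.execute(
    "EXPLAIN QUERY PLAN SELECT url FROM numbers WHERE id = 42").fetchall()
print(plan[0][3])  # lookups use the rebuilt index again
```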


    Originally posted by dherman76
    I'm the strategy guy - and the technology guys dont want to add the keyword parts because they say it will load too slow. How many rows can a normal db hold? Thanks for all your responses.

  6. #6
    Join Date
    Mar 2001
    Location
    NY
    Posts
    741
    Well, I certainly did receive some expert opinions. I will find out how our db is operated and see if all of this fits what we need to do.

  7. #7
    Join Date
    Mar 2001
    Location
    NY
    Posts
    741
    So, the inefficiencies come from either:

    1) a poorly coded query script
    2) slow hardware
    3) different functions not being separated out onto different servers

  8. #8
    Join Date
    Aug 2002
    Location
    Superior, CO, USA
    Posts
    633
    Not necessarily - a particular query, say
    "select id, url from my_table where id = 12345" (which is kinda what you said you were doing), can run in a few milliseconds on a well tuned database or in tens of seconds on a poorly tuned one. In this case it all depends on the indexes. In my query above, if there is an index/primary key constraint on the id column then the query should be very fast. If, instead, the table is not created with something like that, the dataserver (MySQL, PostgreSQL, etc.) must read each row from the table, comparing the value in the id column until it finds a match. Depending on the distribution of your data, you would likely average reading half of the table on a set of queries.

    So in this case the query could be exactly the same and it is the table in the database that needs to be tuned.

    Throwing more hardware at a bad software design, while popular, is not something I would recommend. If your developers cannot tell you what first normal form is then it may be time to look for new developers. Sorry to be so harsh but the problem you're trying to solve is really not that tough for anybody who has done real database design.
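    To make the "same query, different table" point concrete, here is a sketch (using SQLite for illustration, with invented table names modeled on the my_table query above). The query text is identical; only the table definition differs:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Same data, two designs: one with a primary key on id, one with nothing
cur.execute("CREATE TABLE tuned (id INTEGER PRIMARY KEY, url TEXT)")
cur.execute("CREATE TABLE untuned (id INTEGER, url TEXT)")
rows = [(i, f"http://example.com/{i}") for i in range(10000)]
cur.executemany("INSERT INTO tuned VALUES (?, ?)", rows)
cur.executemany("INSERT INTO untuned VALUES (?, ?)", rows)

details = {}
for table in ("tuned", "untuned"):
    query = f"EXPLAIN QUERY PLAN SELECT id, url FROM {table} WHERE id = 12345"
    details[table] = cur.execute(query).fetchall()[0][3]
    print(table, "->", details[table])
# tuned resolves via its primary key (a SEARCH);
# untuned must read every row (a SCAN)
```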





    Originally posted by dherman76
    So, the inefficiencies come from either:

    1) a poorly coded query script
    2) slow hardware
    3) different functions not being separated out onto different servers

  9. #9
    Join Date
    Mar 2001
    Location
    NY
    Posts
    741
    Gotcha - so what questions should I be asking?

  10. #10
    Join Date
    Aug 2002
    Location
    Superior, CO, USA
    Posts
    633
    My biggest question would be how is the data indexed? Secondly, what tuning has been done with the queries? If I were to help you here I'd need:

    1) The DDL (data definition language) for the table - i.e. the create table script, including any indexes that have been added.
    2) The queries that are slow. That is, regardless of the programming environment you are using (PHP, Java, etc.), at the database interface layer there are queries running that are independent from the PHP/Java/etc. We would need to see the query or queries that are slow to get a better feel for why they are slow.
    3) One row of sample data, with the assumption that the rest of the data is conceptually the same.

    If this cannot be provided because of IP type issues, I'd encourage you to bring in a consultant under an NDA to help. Based on what you've described it does not sound like a lot of work, though there certainly may be something I'm missing.
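    As an illustration only - every name below is invented, and the real versions would come from the poster's actual schema - the kind of bundle being requested looks like:

```python
# Item 1: the DDL, exactly as the table was created (hypothetical example)
ddl = """
CREATE TABLE numbers (
    id  INTEGER,
    url VARCHAR(255)
);
CREATE INDEX idx_numbers_id ON numbers (id);
"""

# Item 2: the slow query as it reaches the database,
# stripped of the PHP/Java/etc. around it (hypothetical example)
slow_query = "SELECT id, url FROM numbers WHERE id = ?"

# Item 3: one representative row (hypothetical example)
sample_row = (12345, "http://example.com/page")
```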

    Originally posted by dherman76
    Gotcha - so what questions should I be asking?
