  1. #1
    Join Date
    Jul 2009
    Posts
    5

    Unanswered: Parallel Select Queries

    So I have a program where I read data from the database programmatically. I have an array, and for each element in the array I call SELECT. So if my array has 20,000 elements, that's 20,000 calls.
    I want to make my program multithreaded, so each
    thread will run over a subset of the array, and then instead of calling
    SELECT 20,000 times one call after another, I will have many SELECT calls running in parallel, which should be much faster.
    The question is: with only one connection to the database, can I execute my plan with no problems? I know that a database can be queried simultaneously, I just don't know how many connections I need to have.
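    For concreteness, my current loop looks roughly like this (a sketch in Python using sqlite3 as a stand-in for the real database; the table, column, and key names are made up):

```python
import sqlite3

# Toy setup: in the real program the database already exists and is large.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, value TEXT)")
conn.executemany("INSERT INTO items VALUES (?, ?)",
                 [(i, f"v{i}") for i in range(100)])

keys = list(range(100))  # in the real program this array has ~20,000 elements

# One SELECT per array element -- this is the part I want to parallelize.
results = []
for k in keys:
    row = conn.execute("SELECT value FROM items WHERE id = ?", (k,)).fetchone()
    results.append(row[0])
```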

    Thanks!

  2. #2
    Join Date
    Nov 2004
    Location
    out on a limb
    Posts
    13,692
    Provided Answers: 59
    sounds like you need to rethink your SQL

    rather than iterate through a loop issuing single SQL commands, you'd be far better off issuing a single SQL command to return all relevant rows, then iterating through the results. if you are consolidating data, then usually it's far better to do that consolidation within SQL than it is elsewhere.
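    to illustrate, something like this -- one round trip for all the keys instead of one per key (Python with sqlite3 shown purely as an example; the same pattern works with any database driver, and the table/column names are invented):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, value TEXT)")
conn.executemany("INSERT INTO items VALUES (?, ?)",
                 [(i, f"v{i}") for i in range(100)])

keys = [3, 7, 42]

# Build one parameterized IN (...) query covering every key at once.
placeholders = ",".join("?" * len(keys))
rows = conn.execute(
    f"SELECT id, value FROM items WHERE id IN ({placeholders})", keys
).fetchall()
```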
    I'd rather be riding on the Tiger 800 or the Norton

  3. #3
    Join Date
    Jul 2009
    Posts
    5
    I see your point, but I'm doing it for various reasons. I can run one query that will have the same results, but then it will take many hours to get the results back. What I am trying to do can be faster (with parallel execution) and memory efficient (each result sent back is small and the return time is less than a second). Also, it's an offline process and will be executed once a day.

  4. #4
    Join Date
    May 2002
    Location
    General Deheza, Cba, Arg.
    Posts
    276
    hi ravivg, do you show all the results on the screen, or only a summary of the data? why do you need to make 20,000 selects? that's very poor performance.
    maybe a stored procedure (SP) can help you, and maybe by using the EXPLAIN command and indexing your table the query will be much faster.
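    for example, here is the index idea in miniature (sqlite3's EXPLAIN QUERY PLAN is used just for illustration; MySQL's EXPLAIN output looks different, but the principle is the same -- all names here are made up):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (id INTEGER, value TEXT)")  # no index yet

# Without an index, the planner has to scan the whole table.
plan_before = conn.execute(
    "EXPLAIN QUERY PLAN SELECT value FROM items WHERE id = 5").fetchone()[-1]

conn.execute("CREATE INDEX idx_items_id ON items (id)")

# With the index, the planner can do a direct lookup instead.
plan_after = conn.execute(
    "EXPLAIN QUERY PLAN SELECT value FROM items WHERE id = 5").fetchone()[-1]

print(plan_before)  # e.g. "SCAN items"
print(plan_after)   # e.g. "SEARCH items USING INDEX idx_items_id (id=?)"
```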

    Bye.
    Abel.

  5. #5
    Join Date
    Jul 2009
    Posts
    5
    Thanks Abel, these are good ideas -- I didn't know about EXPLAIN before, and after reading about it, it's a great feature that might help me.
    I will also take a look at stored procedures. My queries are dynamic, but they are all based on data I get from a single static SELECT, so maybe it can work (I haven't used SPs before, so I can't tell yet).
    I don't write anything to the screen. I use a data structure to store all the results and then do some processing on it. In fact, right now it doesn't save any memory, because I store all the returned results in memory, and only when all the select queries are done do I do some work on them and eventually write the output to a file (or the db). But once I have memory problems, I can run a number of select queries, write to memory, process, write to file/db, free the memory, and then repeat the process until I'm done with all the select queries. Even better, use multiple threads that handle different select queries.
    I was also thinking of one thread that works on the select queries and a second that does all the data processing on the data retrieved so far.

    So many options; I just need to find one that is fast and also doesn't use too much memory. One big select (with subqueries) might be the best option -- I will see if I can work it out.
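    Roughly what I have in mind for the two-thread version (a sketch only: a bounded queue between a reader thread and a processing thread, with sqlite3 standing in for the real database and `.upper()` standing in for the real processing):

```python
import queue
import sqlite3
import threading

conn = sqlite3.connect(":memory:", check_same_thread=False)
conn.execute("CREATE TABLE items (id INTEGER PRIMARY KEY, value TEXT)")
conn.executemany("INSERT INTO items VALUES (?, ?)",
                 [(i, f"v{i}") for i in range(100)])

keys = list(range(100))
q = queue.Queue(maxsize=10)   # bounds memory: the reader blocks when full
processed = []

def reader():
    # One thread runs the SELECTs and hands each result off.
    for k in keys:
        row = conn.execute("SELECT value FROM items WHERE id = ?",
                           (k,)).fetchone()
        q.put(row[0])
    q.put(None)  # sentinel: no more results

def processor():
    # The second thread processes results as they arrive,
    # so the full result set is never held in memory at once.
    while True:
        item = q.get()
        if item is None:
            break
        processed.append(item.upper())  # placeholder for the real processing

t1 = threading.Thread(target=reader)
t2 = threading.Thread(target=processor)
t1.start(); t2.start()
t1.join(); t2.join()
```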

    Thanks!

  6. #6
    Join Date
    Nov 2004
    Location
    out on a limb
    Posts
    13,692
    Provided Answers: 59
    a lot of the "cost" in querying a SQL database is in the resources used to make the connection, the delay inherent in sending a query, waiting for it to be processed, and waiting for the results to be returned.

    spawning 20,000-odd queries in parallel is going to require multiple connections. very few DBs will allow that number of connections, so attempting to send 20,000 queries in parallel will not necessarily do what you think. I'd expect the server to throttle the incoming connections to a level that it's happy with, or set up to handle.

    without knowing more details about what you are doing, it's hard to say how you could move forward. you say you move the data into a structure and then do processing on it. is there any way you can do some of that processing in the SQL itself?

    do you really need to do the data extract, then stuff the data into a structure, then apply some processing?

    I'm not an expert in SQL, but I suspect you may well find it useful to explain here what you are doing. sometimes it's very easy to go down a specific pathway and think it's the right/only approach. I've often found that a lack of understanding of SQL, or inappropriate use of SQL, is the real problem.

    it may be that your proposed 20K solution is the only one, but I think it's worth trying to explain what you are doing in as much detail as you can.
    I'd rather be riding on the Tiger 800 or the Norton

  7. #7
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    retrieving 20,000 rows does ~not~ take "many hours to get the results back"

    if you've never heard of EXPLAIN, maybe you don't know about indexes either

    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  8. #8
    Join Date
    Jul 2009
    Posts
    5
    r937 - I've heard about indexes and I use them. I never said that it takes many hours to retrieve 20,000 rows.
    Last edited by ravivg; 07-13-09 at 14:10.
