If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

 
Go Back  dBforums > Database Server Software > MySQL > 5 scripts accessing one table simultaneously without selecting same rows

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1 (permalink)  
Old 05-22-10, 17:50
nzo nzo is offline
Registered User
 
Join Date: Jan 2010
Posts: 16
5 scripts accessing one table simultaneously without selecting same rows

I will have a table of rows which need processing (5000+ rows).

The PHP fork function will be used to run multiple scripts to process the rows.

The PHP script:
1. select one unprocessed row to process.
2. update row as processed.
3. process (about 1 second each row).
4. back to 1.

How do I make sure that between steps 1 and 2 another script will not select the same row?

Is it possible to use transactions/innodb to lock the row to prevent any other script reading the same row?

Cheers.
Reply With Quote
  #2 (permalink)  
Old 06-01-10, 07:12
ProphetX ProphetX is offline
Registered User
 
Join Date: Jun 2010
Location: New Zealand
Posts: 15
Firstly, out of curiosity, is there much of a performance gain to be had by running many scripts at once? Is it worth the problem caused by the possibility of multiple scripts accessing the same row?

But to answer your question, InnoDB locks rows that are being accessed so you shouldn't need to manually lock them or check if they are locked already. I might not fully understand what you're trying to do but what I would do is first select the ID/Primary Key from every row and store them in a global array. Each script can then pick a row from the array, delete that row from the array so no other scripts pick it, and then do the processing.

Example:
PHP Code:
global $records = array();
$result mysql_query("SELECT id FROM table");

while (
$row mysql_fetch_array($result)) {
      
$records[] = $row['id'];
}

//Run the multiple scripts now 
And then in each script...
PHP Code:
$my_id $records[count($records) - 1];
unset(
$records[$my_id]); //This row's ID is no longer in the array for other scripts to access
$query 'SELECT ... FROM table WHERE id = '.$my_id;
//Do your processing. 
...Is that vaguely what you're after or am I on the wrong track?
Reply With Quote
  #3 (permalink)  
Old 06-01-10, 14:02
nzo nzo is offline
Registered User
 
Join Date: Jan 2010
Posts: 16
Hi, yes it is important performance wise, each row may take up to 3 seconds, which means it could take multiple hours for the script to complete.

Your example is exactly what I need, but I need to absolutely guarantee that between these two lines:
Code:
$my_id = $records[count($records) - 1];
unset($records[$my_id]);
Two of the same scripts will not get to the same ID at the same time, and process one row twice.

I am not sure if this is possible or not. It seems possible but unlikely.
Reply With Quote
  #4 (permalink)  
Old 06-01-10, 17:08
ProphetX ProphetX is offline
Registered User
 
Join Date: Jun 2010
Location: New Zealand
Posts: 15
Those two statements sould only take several miliseconds to execute so the chances of two scripts grabbing the same row are very low. Could you add an 'updated' field (with default value as 0) to the table? When a script updates the row it sets the 'updated' value to 1 (true), and the queries to update rows can use ...WHERE updated = 0

...a little crude but you would absolutely guarantee that a record will not be updated twice.
Reply With Quote
  #5 (permalink)  
Old 06-02-10, 13:43
dav1mo dav1mo is offline
Registered User
 
Join Date: Dec 2007
Location: Richmond, VA
Posts: 782
Another way to ensure you do not process same rows by the scripts is to break them into ranges. At a prior engagement we had a client that ran some batch processing based on the last 2 digits of their person ID. In job1 they processed partial ids 0 - 24, then 25 - 49.... This allowed them to cut a 10 hour process down to 2.5 hours.
Dave
Reply With Quote
Reply

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On