If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

 
Go Back  dBforums > Database Server Software > MySQL > Needs Performance in bulk updates

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1 (permalink)  
Old 11-18-03, 23:50
syed1 syed1 is offline
Registered User
 
Join Date: Nov 2003
Posts: 3
Needs Performance in bulk updates

Hi,

Im using musql on sun system. i have a stand alone java based application running on the same system, which reads the text file and update the database. I need 100 rows/sec updates but mysql is taking too much time. I have used batch update of java with 1000 rows at once. the rows are read by the appication in 10 sec and it takes 40secs by mysql to update the table. I have indexes on the required fields.

the data is of the form:

coulmns: key1, data1, data2, data3, data4, data5

row1: 1,a,b,c,,,
row2: 1,,,c2,d,e
row3: 1,a2,,,d2,e2
row4: 1,,,,,e3

ofcourse in the table i should have the latest information as :

1,a2,b,c2,d2,e3

Thanks for ur time and response.

S
Reply With Quote
  #2 (permalink)  
Old 11-19-03, 09:48
bstjean bstjean is offline
Registered User
 
Join Date: Sep 2002
Location: Montreal, Canada
Posts: 219
Re: Needs Performance in bulk updates

Quote:
Originally posted by syed1
Hi,

Im using musql on sun system. i have a stand alone java based application running on the same system, which reads the text file and update the database. I need 100 rows/sec updates but mysql is taking too much time. I have used batch update of java with 1000 rows at once. the rows are read by the appication in 10 sec and it takes 40secs by mysql to update the table. I have indexes on the required fields.

the data is of the form:

coulmns: key1, data1, data2, data3, data4, data5

row1: 1,a,b,c,,,
row2: 1,,,c2,d,e
row3: 1,a2,,,d2,e2
row4: 1,,,,,e3

ofcourse in the table i should have the latest information as :

1,a2,b,c2,d2,e3

Thanks for ur time and response.

S
Can you be more precise and explain your problem a little bit more? Can you post a DESCRIBE of your table and SHOW INDEX as well as sample data?

Have you tried fillinp up the "missing" data with NULL? Like:

1, a2, NULL, NULL, NULL, d2, e2

Have you tried unsing LOAD DATA or redirecting an input text file into mysql??
Reply With Quote
  #3 (permalink)  
Old 11-19-03, 11:56
vanekl vanekl is offline
Registered User
 
Join Date: Nov 2003
Posts: 91
MySQL is blazenly fast, so it's something else, either
your algorithm or another one of your tools.
The word that stands out to me the most is 'Java.' It is a dog
when it comes to speed.

Second, it sounds like you are doing mulitple updates
on the same record over and over and over and over
again, when if you were concerned with speed you would
preprocess all the text lines that belong to the same
MySQL record first and just do one database update per record.
That change to your algorithm would cut about 75% of
the time down, in this example, because you are only
accessing the database 1/4 as much. (If, on the other hand,
it's Java that is causing the bottleneck then you wont see
any increase in speed.)

Third, adding indexes on fields that don't require indexes
actually slows updates and inserts down. You should only
have indexes on fields that you search on quite a bit, or
are used for referential integrity with other tables. And, BTW,
indexes are automatically created for your key fields so
don't duplicate the indexes.

Fourth, make sure you aren't opening a database connection,
performing the update, and then closing the database
connection every time you process one text line.

Fifth, if you are using MySQL 4.0+ then you should
consider using ALTER TABLE ... DISABLE KEYS and
ALTER TABLE ... ENABLE KEYS during the batch operation.

Sixth, you may want to turn loggin off temporarily:
SQL_LOG_OFF = 0

Seventh, review MySQL Manual sections 5.2.10
and 5.2.11 for additional optimization options.

-lv
Reply With Quote
  #4 (permalink)  
Old 11-19-03, 12:25
bstjean bstjean is offline
Registered User
 
Join Date: Sep 2002
Location: Montreal, Canada
Posts: 219
Quote:
Originally posted by vanekl
MySQL is blazenly fast, so it's something else, either
your algorithm or another one of your tools.
The word that stands out to me the most is 'Java.' It is a dog
when it comes to speed.
I agree with that opinion... Must be the algorithm, you're not using persistent connections or Java...

Just to compare, I'm able to do 4000 inserts/second on a Pentium 1.7 Ghz via ODBC with Smalltalk VAST !!!
Reply With Quote
  #5 (permalink)  
Old 11-19-03, 18:47
syed1 syed1 is offline
Registered User
 
Join Date: Nov 2003
Posts: 3
Thanks a lot lv


I realy appreciate ur suggestions. Let me reply one by one.

1) MySQL is very fast, thats true.. and im sure something is wrong at my end thats y im asking for help from u guys...

2) u r right im doing multiple update again and again on the same record. But thats the requirement.
Its the stock markete application, and im getting the rates in a text file. the text file starts growing from 900 am till 500 pm and in total it has approximately 10 million messages. The total no of companies r around 4000, so there r 4000 inserts per day which r the initial rate of that company and the subsequent messages r regarding the change in the rate, which should be updated in the database. now i have to update the database at the same time im getting the messages. because i need to show the current rate to the users.

In the peak time the frequency is around 100 messages per sec.

3) I have re examined the indexes and having bit speed gain but not upto the requirement. i have only one index other than primary key which is composite key containing the datetime filed.

4) i have only one db connection which opened at the start of the program and closes at the end of the program.

5) as this process is for the full day so i cant afford to disable key and enable keys as other applications r applying select on this table.

6) i have to see that can i offord to turn the log off

7) sure i ll review the optimization and see wts missing.

The where of the update contains two conditions:

company_name = 'abc' and date_format(col_timestamp,'%Y-%m-%d') = curdate()



thanks and regards,
S
Reply With Quote
  #6 (permalink)  
Old 11-19-03, 19:45
vanekl vanekl is offline
Registered User
 
Join Date: Nov 2003
Posts: 91
Your app sounds like a program that could be
easily done in C, instead of Java. C is _much_ faster.
MySQL has a C API that works well.

-lv
Reply With Quote
  #7 (permalink)  
Old 11-19-03, 22:08
syed1 syed1 is offline
Registered User
 
Join Date: Nov 2003
Posts: 3
Thanks buddy,

Actually using java i can run it on win and sun at the sametime and i dont have enough time to rewrite the whole application in c on sun. The reviewing indexes has given some performance. Now im trying to update 1000 rows together and getting better response. Ofcourse i ll have to review the java parsing code and hopefully will get it fixed.

thanks for ur valuable suggestions.

best regards,
syed
Reply With Quote
  #8 (permalink)  
Old 12-14-03, 15:01
vanekl vanekl is offline
Registered User
 
Join Date: Nov 2003
Posts: 91
Something else occured to me.

If you are working with a slow hard disk you will be constrained
by its slow seek times.
For example, if your DB is on a disk with 15 millisecond
seek times you will only get 67 DB operations second
(1/.015 = 66.7).
Reply With Quote
  #9 (permalink)  
Old 12-14-03, 17:33
sundialsvcs sundialsvcs is offline
Registered User
 
Join Date: Oct 2003
Posts: 706
Quote:
[i]
2) u r right im doing multiple update again and again on the same record. But thats the requirement.
Its the stock markete application, and im getting the rates in a text file. the text file starts growing from 900 am till 500 pm and in total it has approximately 10 million messages. The total no of companies r around 4000, so there r 4000 inserts per day which r the initial rate of that company and the subsequent messages r regarding the change in the rate, which should be updated in the database. now i have to update the database at the same time im getting the messages. because i need to show the current rate to the users.

In the peak time the frequency is around 100 messages per sec.
I think you will find that "batching" will help a lot. And you can take a previous day's data-stream (if it has a timestamp) to see if and to what extent this is so.

Let's hypothesize that you can get away with updating the data every 10 seconds; about 1,000 updates. Look at past data to see just how many times more than one update is applied to the same key value in the same 10-second interval, and what the typical number of updates are.

Then your program could slurp up data every 10 seconds, coalesce them into a set of UPDATE statements for each key such that only one UPDATE exists for each key, and execute them. (If there's the slightest chance that the ID might not be in the table, build INSERT statements at the same time (i.e. in advance) and use one or the other.

This very slight algorithm change might be just the ticket.
__________________
ChimneySweep(R): fast, automatic
table repair at a click of the
mouse! http://www.sundialservices.com
Reply With Quote
Reply

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On