| |
|
If this is your first visit, be sure to check out the FAQ by clicking the link above.
You may have to register before you can post: click the register link above to proceed.
To start viewing messages, select the forum that you want to visit from the selection below.
|
 |

11-18-03, 23:50
|
|
Registered User
|
|
Join Date: Nov 2003
Posts: 3
|
|
|
Needs Performance in bulk updates
|
|
Hi,
Im using musql on sun system. i have a stand alone java based application running on the same system, which reads the text file and update the database. I need 100 rows/sec updates but mysql is taking too much time. I have used batch update of java with 1000 rows at once. the rows are read by the appication in 10 sec and it takes 40secs by mysql to update the table. I have indexes on the required fields.
the data is of the form:
coulmns: key1, data1, data2, data3, data4, data5
row1: 1,a,b,c,,,
row2: 1,,,c2,d,e
row3: 1,a2,,,d2,e2
row4: 1,,,,,e3
ofcourse in the table i should have the latest information as :
1,a2,b,c2,d2,e3
Thanks for ur time and response.
S
|
|

11-19-03, 09:48
|
|
Registered User
|
|
Join Date: Sep 2002
Location: Montreal, Canada
Posts: 219
|
|
|
Re: Needs Performance in bulk updates
Quote:
Originally posted by syed1
Hi,
Im using musql on sun system. i have a stand alone java based application running on the same system, which reads the text file and update the database. I need 100 rows/sec updates but mysql is taking too much time. I have used batch update of java with 1000 rows at once. the rows are read by the appication in 10 sec and it takes 40secs by mysql to update the table. I have indexes on the required fields.
the data is of the form:
coulmns: key1, data1, data2, data3, data4, data5
row1: 1,a,b,c,,,
row2: 1,,,c2,d,e
row3: 1,a2,,,d2,e2
row4: 1,,,,,e3
ofcourse in the table i should have the latest information as :
1,a2,b,c2,d2,e3
Thanks for ur time and response.
S
|
Can you be more precise and explain your problem a little bit more? Can you post a DESCRIBE of your table and SHOW INDEX as well as sample data?
Have you tried fillinp up the "missing" data with NULL? Like:
1, a2, NULL, NULL, NULL, d2, e2
Have you tried unsing LOAD DATA or redirecting an input text file into mysql??
|
|

11-19-03, 11:56
|
|
Registered User
|
|
Join Date: Nov 2003
Posts: 91
|
|
|
|
MySQL is blazenly fast, so it's something else, either
your algorithm or another one of your tools.
The word that stands out to me the most is 'Java.' It is a dog
when it comes to speed.
Second, it sounds like you are doing mulitple updates
on the same record over and over and over and over
again, when if you were concerned with speed you would
preprocess all the text lines that belong to the same
MySQL record first and just do one database update per record.
That change to your algorithm would cut about 75% of
the time down, in this example, because you are only
accessing the database 1/4 as much. (If, on the other hand,
it's Java that is causing the bottleneck then you wont see
any increase in speed.)
Third, adding indexes on fields that don't require indexes
actually slows updates and inserts down. You should only
have indexes on fields that you search on quite a bit, or
are used for referential integrity with other tables. And, BTW,
indexes are automatically created for your key fields so
don't duplicate the indexes.
Fourth, make sure you aren't opening a database connection,
performing the update, and then closing the database
connection every time you process one text line.
Fifth, if you are using MySQL 4.0+ then you should
consider using ALTER TABLE ... DISABLE KEYS and
ALTER TABLE ... ENABLE KEYS during the batch operation.
Sixth, you may want to turn loggin off temporarily:
SQL_LOG_OFF = 0
Seventh, review MySQL Manual sections 5.2.10
and 5.2.11 for additional optimization options.
-lv
|
|

11-19-03, 12:25
|
|
Registered User
|
|
Join Date: Sep 2002
Location: Montreal, Canada
Posts: 219
|
|
Quote:
Originally posted by vanekl
MySQL is blazenly fast, so it's something else, either
your algorithm or another one of your tools.
The word that stands out to me the most is 'Java.' It is a dog
when it comes to speed.
|
I agree with that opinion... Must be the algorithm, you're not using persistent connections or Java...
Just to compare, I'm able to do 4000 inserts/second on a Pentium 1.7 Ghz via ODBC with Smalltalk VAST !!!
|
|

11-19-03, 18:47
|
|
Registered User
|
|
Join Date: Nov 2003
Posts: 3
|
|
Thanks a lot lv
I realy appreciate ur suggestions. Let me reply one by one.
1) MySQL is very fast, thats true.. and im sure something is wrong at my end thats y im asking for help from u guys...
2) u r right im doing multiple update again and again on the same record. But thats the requirement.
Its the stock markete application, and im getting the rates in a text file. the text file starts growing from 900 am till 500 pm and in total it has approximately 10 million messages. The total no of companies r around 4000, so there r 4000 inserts per day which r the initial rate of that company and the subsequent messages r regarding the change in the rate, which should be updated in the database. now i have to update the database at the same time im getting the messages. because i need to show the current rate to the users.
In the peak time the frequency is around 100 messages per sec.
3) I have re examined the indexes and having bit speed gain but not upto the requirement. i have only one index other than primary key which is composite key containing the datetime filed.
4) i have only one db connection which opened at the start of the program and closes at the end of the program.
5) as this process is for the full day so i cant afford to disable key and enable keys as other applications r applying select on this table.
6) i have to see that can i offord to turn the log off
7) sure i ll review the optimization and see wts missing.
The where of the update contains two conditions:
company_name = 'abc' and date_format(col_timestamp,'%Y-%m-%d') = curdate()
thanks and regards,
S
|
|

11-19-03, 19:45
|
|
Registered User
|
|
Join Date: Nov 2003
Posts: 91
|
|
Your app sounds like a program that could be
easily done in C, instead of Java. C is _much_ faster.
MySQL has a C API that works well.
-lv
|
|

11-19-03, 22:08
|
|
Registered User
|
|
Join Date: Nov 2003
Posts: 3
|
|
Thanks buddy,
Actually using java i can run it on win and sun at the sametime and i dont have enough time to rewrite the whole application in c on sun. The reviewing indexes has given some performance. Now im trying to update 1000 rows together and getting better response. Ofcourse i ll have to review the java parsing code and hopefully will get it fixed.
thanks for ur valuable suggestions.
best regards,
syed
|
|

12-14-03, 15:01
|
|
Registered User
|
|
Join Date: Nov 2003
Posts: 91
|
|
Something else occured to me.
If you are working with a slow hard disk you will be constrained
by its slow seek times.
For example, if your DB is on a disk with 15 millisecond
seek times you will only get 67 DB operations second
(1/.015 = 66.7).
|
|

12-14-03, 17:33
|
|
Registered User
|
|
Join Date: Oct 2003
Posts: 706
|
|
Quote:
[i]
2) u r right im doing multiple update again and again on the same record. But thats the requirement.
Its the stock markete application, and im getting the rates in a text file. the text file starts growing from 900 am till 500 pm and in total it has approximately 10 million messages. The total no of companies r around 4000, so there r 4000 inserts per day which r the initial rate of that company and the subsequent messages r regarding the change in the rate, which should be updated in the database. now i have to update the database at the same time im getting the messages. because i need to show the current rate to the users.
In the peak time the frequency is around 100 messages per sec.
|
I think you will find that "batching" will help a lot. And you can take a previous day's data-stream (if it has a timestamp) to see if and to what extent this is so.
Let's hypothesize that you can get away with updating the data every 10 seconds; about 1,000 updates. Look at past data to see just how many times more than one update is applied to the same key value in the same 10-second interval, and what the typical number of updates are.
Then your program could slurp up data every 10 seconds, coalesce them into a set of UPDATE statements for each key such that only one UPDATE exists for each key, and execute them. (If there's the slightest chance that the ID might not be in the table, build INSERT statements at the same time (i.e. in advance) and use one or the other.
This very slight algorithm change might be just the ticket.
|
|
| Thread Tools |
Search this Thread |
|
|
|
| Display Modes |
Linear Mode
|
Posting Rules
|
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts
HTML code is Off
|
|
|
|
|