Results 1 to 9 of 9
  1. #1
    Join Date
    Nov 2003
    Posts
    3

    Unanswered: Need better performance in bulk updates

    Hi,

    I'm using MySQL on a Sun system. I have a standalone Java application running on the same machine which reads a text file and updates the database. I need 100 row updates per second, but MySQL is taking too much time. I am using JDBC batch updates of 1000 rows at a time; the application reads the rows in 10 seconds, but MySQL takes 40 seconds to update the table. I have indexes on the required fields.

    The data is of the form:

    columns: key1, data1, data2, data3, data4, data5

    row1: 1,a,b,c,,
    row2: 1,,,c2,d,e
    row3: 1,a2,,,d2,e2
    row4: 1,,,,,e3

    Of course, the table should end up holding the latest information:

    1,a2,b,c2,d2,e3

    Thanks for your time and response.

    S

  2. #2
    Join Date
    Sep 2002
    Location
    Montreal, Canada
    Posts
    224
    Provided Answers: 1

    Re: Need better performance in bulk updates

    Originally posted by syed1
    Hi,

    I'm using MySQL on a Sun system. I have a standalone Java application running on the same machine which reads a text file and updates the database. I need 100 row updates per second, but MySQL is taking too much time. I am using JDBC batch updates of 1000 rows at a time; the application reads the rows in 10 seconds, but MySQL takes 40 seconds to update the table. I have indexes on the required fields.

    The data is of the form:

    columns: key1, data1, data2, data3, data4, data5

    row1: 1,a,b,c,,
    row2: 1,,,c2,d,e
    row3: 1,a2,,,d2,e2
    row4: 1,,,,,e3

    Of course, the table should end up holding the latest information:

    1,a2,b,c2,d2,e3

    Thanks for your time and response.

    S
    Can you be more precise and explain your problem a little more? Can you post a DESCRIBE of your table and SHOW INDEX output, as well as some sample data?

    Have you tried filling in the "missing" data with NULL? Like:

    1, a2, NULL, NULL, d2, e2

    Have you tried using LOAD DATA, or redirecting an input text file into mysql?

  3. #3
    Join Date
    Nov 2003
    Posts
    91
    MySQL is blazingly fast, so it's something else: either
    your algorithm or another one of your tools.
    The word that stands out to me the most is 'Java.' It is a dog
    when it comes to speed.

    Second, it sounds like you are doing multiple updates
    on the same record over and over again, when, if you were
    concerned with speed, you would preprocess all the text
    lines that belong to the same MySQL record first and do
    just one database update per record.
    That change to your algorithm would cut about 75% of
    the time in this example, because you would be
    accessing the database only 1/4 as much. (If, on the other hand,
    it's Java that is causing the bottleneck, then you won't see
    any increase in speed.)
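    The preprocessing idea above can be sketched in Java. This follows the CSV layout from the sample rows in the first post, under the stated "latest non-empty value wins" merge rule; the class and method names are illustrative:

```java
import java.util.*;

public class CoalesceUpdates {
    // Merge CSV lines that share a key, keeping the latest non-empty
    // value for each column, so only one database update per key is
    // needed instead of one per text line.
    static String[] coalesce(List<String> lines, int numCols) {
        String[] merged = new String[numCols];
        Arrays.fill(merged, "");
        for (String line : lines) {
            String[] fields = line.split(",", -1);   // keep trailing empties
            for (int i = 0; i < numCols && i < fields.length; i++) {
                if (!fields[i].isEmpty()) {
                    merged[i] = fields[i];           // later non-empty value wins
                }
            }
        }
        return merged;
    }

    public static void main(String[] args) {
        // The four sample rows from the first post
        List<String> rows = Arrays.asList(
            "1,a,b,c,,",
            "1,,,c2,d,e",
            "1,a2,,,d2,e2",
            "1,,,,,e3");
        System.out.println(String.join(",", coalesce(rows, 6)));
        // prints 1,a2,b,c2,d2,e3
    }
}
```

    Four text lines become a single update, which is where the ~75% reduction in database accesses comes from.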

    Third, adding indexes on fields that don't require them
    actually slows updates and inserts down. You should only
    have indexes on fields that you search on quite a bit, or
    that are used for referential integrity with other tables. And, BTW,
    an index is automatically created for your primary key, so
    don't duplicate it.

    Fourth, make sure you aren't opening a database connection,
    performing the update, and then closing the database
    connection every time you process one text line.

    Fifth, if you are using MySQL 4.0+ then you should
    consider using ALTER TABLE ... DISABLE KEYS and
    ALTER TABLE ... ENABLE KEYS during the batch operation.

    Sixth, you may want to turn logging off temporarily:
    SET SQL_LOG_OFF = 1;

    Seventh, review MySQL Manual sections 5.2.10
    and 5.2.11 for additional optimization options.

    -lv

  4. #4
    Join Date
    Sep 2002
    Location
    Montreal, Canada
    Posts
    224
    Provided Answers: 1
    Originally posted by vanekl
    MySQL is blazingly fast, so it's something else: either
    your algorithm or another one of your tools.
    The word that stands out to me the most is 'Java.' It is a dog
    when it comes to speed.
    I agree with that opinion... it must be the algorithm, a lack of persistent connections, or Java itself...

    Just to compare, I'm able to do 4000 inserts per second on a 1.7 GHz Pentium via ODBC with Smalltalk VAST!

  5. #5
    Join Date
    Nov 2003
    Posts
    3
    Thanks a lot, lv.


    I really appreciate your suggestions. Let me reply one by one.

    1) MySQL is very fast, that's true, and I'm sure something is wrong at my end; that's why I'm asking you guys for help.

    2) You're right, I'm doing multiple updates on the same record again and again, but that's the requirement.
    It's a stock market application, and I'm getting the rates in a text file. The file grows from 9:00 AM till 5:00 PM and in total holds approximately 10 million messages. There are around 4000 companies, so there are 4000 inserts per day (the initial rate for each company); the subsequent messages report changes in the rate, which must be updated in the database. I have to update the database as the messages arrive, because I need to show users the current rate.

    At peak time the frequency is around 100 messages per second.

    3) I have re-examined the indexes and gained a bit of speed, but not up to the requirement. I have only one index other than the primary key: a composite index containing the datetime field.

    4) I have only one DB connection, opened at the start of the program and closed at the end.

    5) Since this process runs for the full day, I can't afford to disable and re-enable keys, because other applications are running SELECTs on this table.

    6) I have to see whether I can afford to turn the log off.

    7) Sure, I'll review the optimization sections and see what's missing.

    The WHERE clause of the UPDATE contains two conditions:

    company_name = 'abc' and date_format(col_timestamp,'%Y-%m-%d') = curdate()
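    A side note on that WHERE clause: wrapping col_timestamp in DATE_FORMAT() forces MySQL to evaluate the function for every candidate row, which prevents it from using the composite index on the datetime field. A range condition on the bare column is equivalent and index-friendly. A minimal sketch of building such a statement; the table name `rates` and the SET clause are hypothetical (only company_name and col_timestamp come from the post):

```java
public class SargableWhere {
    // Build an UPDATE whose WHERE clause compares the bare timestamp
    // column against a one-day range, so an index covering
    // (company_name, col_timestamp) can be used. The table name "rates"
    // and the SET clause are illustrative assumptions.
    static String updateForToday(String setClause) {
        return "UPDATE rates SET " + setClause
             + " WHERE company_name = ?"
             + " AND col_timestamp >= CURDATE()"
             + " AND col_timestamp < CURDATE() + INTERVAL 1 DAY";
    }

    public static void main(String[] args) {
        System.out.println(updateForToday("rate = ?"));
    }
}
```

    Both forms select the same rows for today, but only the range form lets the optimizer seek into the index instead of scanning.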



    Thanks and regards,
    S

  6. #6
    Join Date
    Nov 2003
    Posts
    91
    Your app sounds like a program that could easily be
    written in C instead of Java. C is _much_ faster,
    and MySQL has a C API that works well.

    -lv

  7. #7
    Join Date
    Nov 2003
    Posts
    3
    Thanks, buddy.

    Actually, with Java I can run it on Windows and Sun at the same time, and I don't have enough time to rewrite the whole application in C on Sun. Reviewing the indexes has gained some performance. Now I'm updating 1000 rows together and getting better response. Of course, I'll have to review the Java parsing code too, and hopefully I will get it fixed.

    Thanks for your valuable suggestions.

    Best regards,
    syed

  8. #8
    Join Date
    Nov 2003
    Posts
    91
    Something else occurred to me.

    If you are working with a slow hard disk, you will be constrained
    by its seek times.
    For example, if your DB is on a disk with 15-millisecond
    seek times, you will only get about 67 DB operations per second
    (1/0.015 = 66.7).
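    The arithmetic above can be checked directly; the bound is simply the reciprocal of the average seek time, assuming each random access costs at least one full seek:

```java
public class SeekTimeBound {
    // Upper bound on random-I/O operations per second for a disk whose
    // average seek time (in seconds) dominates each access.
    static double maxOpsPerSecond(double seekTimeSeconds) {
        return 1.0 / seekTimeSeconds;
    }

    public static void main(String[] args) {
        // 15 ms average seek time, as in the example above
        System.out.printf("%.1f ops/sec%n", maxOpsPerSecond(0.015));
        // prints 66.7 ops/sec
    }
}
```

    This is another reason coalescing many text lines into one database operation pays off: the per-operation seek cost is amortized over more data.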

  9. #9
    Join Date
    Oct 2003
    Posts
    706
    Originally posted by syed1
    2) You're right, I'm doing multiple updates on the same record again and again, but that's the requirement.
    It's a stock market application, and I'm getting the rates in a text file. The file grows from 9:00 AM till 5:00 PM and in total holds approximately 10 million messages. There are around 4000 companies, so there are 4000 inserts per day (the initial rate for each company); the subsequent messages report changes in the rate, which must be updated in the database. I have to update the database as the messages arrive, because I need to show users the current rate.

    At peak time the frequency is around 100 messages per second.
    I think you will find that "batching" will help a lot, and you can use a previous day's data stream (if it has timestamps) to see whether, and to what extent, this is so.

    Let's hypothesize that you can get away with updating the data every 10 seconds: about 1,000 updates per interval. Look at past data to see how often more than one update is applied to the same key value within the same 10-second interval, and what the typical number of updates is.

    Then your program could slurp up the data every 10 seconds, coalesce it into a set of UPDATE statements such that only one UPDATE exists per key, and execute them. (If there's the slightest chance that a key might not be in the table yet, build INSERT statements at the same time, i.e. in advance, and use one or the other.)

    This very slight algorithm change might be just the ticket.
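    The slurp-and-coalesce loop above can be sketched as follows. The `rates` table, `rate` column, and the `company,rate` message format are hypothetical; only the one-statement-per-key idea and the INSERT-or-UPDATE choice come from the post:

```java
import java.util.*;

public class WindowedBatch {
    // Coalesce one window's worth of rate messages so at most one
    // statement is issued per company: later rates for the same company
    // overwrite earlier ones within the window, and companies not yet
    // inserted today get an INSERT instead of an UPDATE.
    static List<String> coalesceWindow(List<String> messages, Set<String> insertedToday) {
        Map<String, String> latest = new LinkedHashMap<>();
        for (String msg : messages) {
            String[] parts = msg.split(",", 2);    // "company,rate"
            latest.put(parts[0], parts[1]);        // last rate in window wins
        }
        List<String> stmts = new ArrayList<>();
        for (Map.Entry<String, String> e : latest.entrySet()) {
            if (insertedToday.add(e.getKey())) {   // first sighting today
                stmts.add("INSERT INTO rates (company_name, rate) VALUES ('"
                        + e.getKey() + "', " + e.getValue() + ")");
            } else {
                stmts.add("UPDATE rates SET rate = " + e.getValue()
                        + " WHERE company_name = '" + e.getKey() + "'");
            }
        }
        return stmts;
    }

    public static void main(String[] args) {
        Set<String> seen = new HashSet<>(Arrays.asList("abc"));
        List<String> stmts = coalesceWindow(
            Arrays.asList("abc,10.5", "xyz,3.2", "abc,10.7"), seen);
        stmts.forEach(System.out::println);
        // Three messages collapse to two statements: an UPDATE for "abc"
        // (its last rate, 10.7) and an INSERT for the new key "xyz".
    }
}
```

    In a real loop the statements would go through one JDBC batch per window rather than being printed, and parameter binding would replace the string concatenation shown here.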
    ChimneySweep(R): fast, automatic
    table repair at a click of the
    mouse! http://www.sundialservices.com
