Results 1 to 8 of 8
  1. #1
    Join Date
    Feb 2005
    Posts
    116

    Question Unanswered: Best way of updating table records

    Hi Guys,
    I have written a script to update table records based on a join.

    Code:
    update tableA A
    set (field1, field2) = (select field3, field4
                            from   tableB B
                            where  A.TabKey = B.TabKey)

    The query works, but there is an issue with its performance.

    TableA has around 9 million records and tableB has around two hundred thousand. This update statement has to run daily, when new data gets loaded into both tables.

    The issue is that this statement is taking extremely long to execute.

    I came across a solution where you create a new table from the existing table with the updated data,
    then re-create the indexes and constraints, drop the old table, and rename the new one. But this solution
    is not applicable because we have reports and users accessing these tables, and this process would make them
    inaccessible to certain users.

    I'm just looking for suggestions on how I can reduce the time
    taken to update this table.

    thanks for any input
    Last edited by a1jit; 05-01-07 at 07:27.

  2. #2
    Join Date
    Mar 2002
    Location
    Reading, UK
    Posts
    1,137
    If it is possible can you change the insert statement into tableA to select the two columns from tableB. You will get much better performance this way than using an update statement.
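    A sketch of what Alan is suggesting, assuming the daily load currently inserts into tableA from some staging source (staging_daily and its columns are hypothetical names, not from the thread): join to tableB at load time so field1/field2 arrive already populated and no follow-up update is needed.

    Code:
    insert into tableA (TabKey, field1, field2)
    select s.TabKey, b.field3, b.field4
    from   staging_daily s
           left join tableB b on b.TabKey = s.TabKey;

    The outer join keeps rows whose key has no match in tableB; their field1/field2 load as NULL, mirroring what the original update would have produced.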

    Alan

  3. #3
    Join Date
    Oct 2002
    Location
    Cape Town, South Africa
    Posts
    253
    Quick question, why do you need to rewrite the values on table A with the values from table B? If you have a join that can be applied, why not just use the join instead of duplicating the data?

  4. #4
    Join Date
    Dec 2003
    Posts
    1,074
    I agree with dayneo. If you have to move the data over for some reason, maybe you're better off with something like:

    Code:
    update tableA A 
    set(field1,field2) = (select field3,field4 
                          from   tableB B
                          where  A.TabKey = B.TabKey)
    where exists (select *
                  from   tableB B
                  where  A.TabKey = B.TabKey and
                         not (A.field1 = B.field3 and A.field2 = B.field4))
    This query will only update when there's a match on the keys AND where the 2 fields don't already match.

    Right now you're updating all 9 million records whether or not they need updating, with the added side-effect of setting (field1,field2) to NULL if there's no match found in tableB.

    ---=Chuck

  5. #5
    Join Date
    May 2004
    Location
    Dominican Republic
    Posts
    721
    Another possible way is updating a join..
    Code:
    update (
    select b.field3 tableb_field3,
           b.field4 tableb_field4,
           a.field1 tablea_field1,
           a.field2 tablea_field2
      from tablea a, tableb b
     where a.tabkey = b.tabkey
           )
       set tablea_field1 = tableb_field3,
           tablea_field2 = tableb_field4
    Note that for Oracle to accept this, tableb.tabkey must have a unique or primary key constraint, so that the join view is key-preserved.

  6. #6
    Join Date
    Jun 2004
    Location
    Liverpool, NY USA
    Posts
    2,509
    The tables are not that big; the big question is whether you have an index on tabkey in both tablea and tableb. If you don't, the subquery has to do a full table scan of tableB for every row it updates. If they are indexed, it will simply do an index lookup and update the rows.
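    A minimal sketch of the indexes Bill describes (the index names are illustrative; if tabkey is already a primary key on either table, that table needs no new index):

    Code:
    create index tablea_tabkey_ix on tableA (TabKey);
    create index tableb_tabkey_ix on tableB (TabKey);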
    Bill
    You do not need a parachute to skydive. You only need a parachute to skydive twice.

  7. #7
    Join Date
    Feb 2005
    Posts
    116
    Thanks a million for the suggestions and solutions... really appreciate it.

    Yeah, maybe I should take a step back and look at why this has to be implemented, because if users wish to view the data we can always do a join, as dayneo said. I will also consider indexing and the queries proposed by all of you.

    Thanks again guys, appreciate it.

  8. #8
    Join Date
    Oct 2002
    Location
    Cape Town, South Africa
    Posts
    253
    Just one last thing... if you are copying data to get around a join that takes too long, or trying to get aggregate values (e.g. sum/count/avg), consider a materialized view with query rewrite (if you are on Enterprise Edition). This will precompute the values and redirect any user SQL to the answer without any change to the table joins.
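    A sketch of the materialized-view approach (the view name and refresh options are illustrative; query rewrite also requires the QUERY_REWRITE_ENABLED session or instance parameter to be set):

    Code:
    create materialized view tablea_b_mv
      build immediate
      refresh on demand
      enable query rewrite
    as
    select a.TabKey, a.field1, a.field2, b.field3, b.field4
    from   tableA a, tableB b
    where  a.TabKey = b.TabKey;

    User queries that join tableA to tableB on TabKey can then be rewritten by the optimizer to read the precomputed view instead, with no application changes.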

    Also consider partitioning the 9-million-row table. Tables over a million records become cumbersome to work with and are candidates for partitioning (alas, Enterprise Edition only).
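    A sketch of range partitioning, assuming tableA has a date column tied to the daily load to partition on (load_date and the column types here are hypothetical, not from the thread):

    Code:
    create table tableA_part (
      TabKey    number,
      field1    varchar2(100),
      field2    varchar2(100),
      load_date date
    )
    partition by range (load_date) (
      partition p2006 values less than (date '2007-01-01'),
      partition p2007 values less than (date '2008-01-01'),
      partition pmax  values less than (maxvalue)
    );

    Daily maintenance and the update itself can then touch only the current partition instead of the whole table.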

    These two things should get rid of any performance problems without changing any application code.
