Results 1 to 8 of 8
  1. #1
    Join Date
    Feb 2012
    Posts
    3

    Unanswered: delete ... where not exists ... performance

    Hi there!

    I've got some performance issues with an "delete ... where not exists" statement. The statement is executed via JDBC.

    The statement looks like this:
    Code:
    DELETE FROM table_inputjob
    WHERE  inputfilestate =? 
           AND indexfilestate =? 
           AND NOT EXISTS(SELECT * 
                          FROM   table_job 
                          WHERE  table_job.inputid = table_inputjob.inputid)
    Each table contains about 4500000 rows. The above statement took 115 seconds to delete 1098 rows from TABLE_INPUTJOB. A smaller run with 50 rows deleted took about 46 seconds. Another run was about to delete 2000000 rows, which I've canceled after several hours have passed. I need to delete this 2000000 rows in an acceptable amount of time.

    Do you have any ideas to boost this statement?
    Thanks in advance!


    Table definition looks like this:
    Code:
    CREATE TABLE TABLE_INPUTJOB (
      TASKNAMES VARCHAR2(4000),
      INPUTID VARCHAR2(100) NOT NULL,
      INPUTFILESTATE NUMBER(10),
      EXECUTIONQUEUED DATE,
      EXECUTIONSTART DATE,
      EXECUTIONEND DATE,
      PRIORITY NUMBER(10),
      SERVERNAME VARCHAR2(100),
      INDEXFILESTATE NUMBER(10),
      OBSERVER VARCHAR2(50),
      ALLJOBS NUMBER(10),
        CONSTRAINT PK_TABLE_INPUTJOB PRIMARY KEY (INPUTID)
    );
    
    CREATE TABLE "TABLE_JOB" (
      "JOBSTATE" NUMBER(10),
      "DATETIME" DATE,
      "USERNAME" VARCHAR2(50),
      "DDONAME" VARCHAR2(1024) NOT NULL,
      "TEXTSEARCH" VARCHAR2(4000),
      "PROCESSED" NUMBER(10),
      "ARCHIVED" NUMBER(1),
      "ENABLED" NUMBER(1) NOT NULL,
      "QUEUENAME" VARCHAR2(100),
      "DOCTYPE" VARCHAR2(100),
      "PAGECOUNT" NUMBER(10) NOT NULL,
      "SHEETCOUNT" NUMBER(10) NOT NULL,
      "COPY" NUMBER(10),
      "ERRORDESCRIPTION" VARCHAR2(255),
      "JOB_ID" VARCHAR2(36) UNIQUE NOT NULL,
      "INPUTID" VARCHAR2(100) NOT NULL,
      "JOBID" VARCHAR2(100) UNIQUE,
      "LOCKED" NUMBER(10),
      "LAST_OUTPUT" DATE,
      "DOCTYPENR" NUMBER(10),
      "EVENT_ID" VARCHAR2(36),
        CONSTRAINT "PK_TABLE_JOB" PRIMARY KEY ("JOB_ID", "INPUTID")
    );
    
    ALTER TABLE "TABLE_JOB" ADD CONSTRAINT "FK_JOB_INPUTJOB" FOREIGN KEY (
      "INPUTID"
    )
    REFERENCES "TABLE_INPUTJOB" (
      "INPUTID"
    ) ON DELETE CASCADE;
    
    CREATE INDEX IDX_TABLE_JOB 
      ON TABLE_JOB (INPUTID, JOB_ID) NOLOGGING;
    
    CREATE INDEX IDX_TJ_IIDJS 
      ON TABLE_JOB (INPUTID, JOBSTATE) NOLOGGING;
    Last edited by PPJava; 02-28-12 at 12:43.

  2. #2
    Join Date
    Aug 2003
    Location
    Where the Surf Meets the Turf @Del Mar, CA
    Posts
    7,776
    Provided Answers: 1
    Code:
    DELETE FROM table_inputjob 
    WHERE  inputfilestate =? 
           AND indexfilestate =? 
           AND NOT EXISTS(SELECT * 
                          FROM   table_job 
                          WHERE  table_job.inputid = table_inputjob.inputid)
    post formatted EXPLAIN PLAN including Predicate details
    You can lead some folks to knowledge, but you can not make them think.
    The average person thinks he's above average!
    For most folks, they don't know, what they don't know.
    Good judgement comes from experience. Experience comes from bad judgement.

  3. #3
    Join Date
    Feb 2012
    Posts
    3
    Thanks for your quick response!

    Here is the result from EXPLAIN PLAN (when the statement should delete 1036 rows):

    Code:
    Plan hash value: 3615700051
     
    -------------------------------------------------------------------------------------------------
    | Id  | Operation              | Name           | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
    -------------------------------------------------------------------------------------------------
    |   0 | DELETE STATEMENT       |                |  7872 |   615K|       | 56776   (1)| 00:11:22 |
    |   1 |  DELETE                | TABLE_INPUTJOB |       |       |       |            |          |
    |*  2 |   HASH JOIN ANTI       |                |  7872 |   615K|    41M| 56776   (1)| 00:11:22 |
    |*  3 |    TABLE ACCESS FULL   | TABLE_INPUTJOB |   787K|    32M|       | 25398   (1)| 00:05:05 |
    |   4 |    INDEX FAST FULL SCAN| IDX_TJ_IIDJS   |  4651K|   164M|       | 18477   (1)| 00:03:42 |
    -------------------------------------------------------------------------------------------------
     
    Predicate Information (identified by operation id):
    ---------------------------------------------------
     
       2 - access("TABLE_JOB"."INPUTID"="TABLE_INPUTJOB"."INPUTID")
       3 - filter("INDEXFILESTATE"=TO_NUMBER(:2) AND "INPUTFILESTATE"=TO_NUMBER(:1))

  4. #4
    Join Date
    Nov 2003
    Posts
    2,935
    Provided Answers: 12
    Try to create an index on tableinput_job(inputid)

    Depending on how many rows are selected by the condition on inputfilestate and indexfilestate an index on those columns might help as well.

  5. #5
    Join Date
    Jan 2012
    Posts
    84
    You want to delete 2 milion rows from a table with 4.5 million rows - it's almost a half of the table.
    Instead of deleting rows, consider saving rows that have to remain to a temporary table,
    then drop the original table and rename temp to the original table name.
    This will be much faster than deleting rows.
    Code:
    CREATE TABLE temp AS
    SELECT * FROM table_inputjob 
    WHERE  inputfilestate <> ? 
           AND indexfilestate <> ? 
           AND EXISTS(SELECT * 
                          FROM   table_job 
                          WHERE  table_job.inputid = table_inputjob.inputid)
    ;
    DROP TABLE table_inputjob PURGE;
    ALTER TABLE temp RENAME TO table_inputjob;

  6. #6
    Join Date
    Mar 2002
    Location
    Reading, UK
    Posts
    1,137
    To delete 2 million rows do you call the delete multiple times. If you are then altering it to do it in one delete statment execution would be quicker


    Alan
    Last edited by AlanP; 03-02-12 at 13:15.

  7. #7
    Join Date
    Jun 2004
    Location
    Liverpool, NY USA
    Posts
    2,509
    The problem with deletes is that to maintain data integrity the database must make a copy of the row in the redo log before it can remove it. It is the slowest action on a database. The fastest is an insert.
    Bill
    You do not need a parachute to skydive. You only need a parachute to skydive twice.

  8. #8
    Join Date
    Feb 2012
    Posts
    3
    Thanks for all your help!

    Finally I'm using kordirko's suggestion. It completes the job in about 30 seconds!
    Thanks!!

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •