Results 1 to 2 of 2
  1. #1
    Join Date
    Oct 2016
    Posts
    1

    Unanswered: Query Optimization Help

    I'm trying to optimize the following query and I'm not sure how to do it. Any help would be appreciated

    This is really trimmed down data I'm doing now but I believe it shows the overall problem. For some reason Seq scans are getting done with Delete that aren't being done with Select. Both of these queries currently don't retrieve anything since the DELETED rows have already been done.

    This query takes 5 seconds:

    EXPLAIN DELETE FROM "RequestNew" WHERE id IN (select rn.id from "RequestNew" rn LEFT JOIN "Request" r ON r.id = rn.request_id where r.id IS NULL)

    "Delete on "RequestNew" (cost=1170350.15..1170355.63 rows=1 width=18)"
    " -> Nested Loop (cost=1170350.15..1170355.63 rows=1 width=18)"
    " -> HashAggregate (cost=1170349.60..1170349.61 rows=1 width=16)"
    " Group Key: rn.id"
    " -> Hash Anti Join (cost=474049.68..1170349.60 rows=1 width=16)"
    " Hash Cond: (rn.request_id = r.id)"
    " -> Seq Scan on "RequestNew" rn (cost=0.00..620147.96 rows=2383396 width=14)"
    " -> Hash (cost=370661.19..370661.19 rows=5947719 width=10)"
    " -> Seq Scan on "Request" r (cost=0.00..370661.19 rows=5947719 width=10)"
    " -> Index Scan using "RequestNew_pkey" on "RequestNew" (cost=0.56..6.01 rows=1 width=10)"
    " Index Cond: (id = rn.id)"

    This query takes 1 second:

    EXPLAIN SELECT FROM "RequestNew" WHERE id IN (select rn.id from "RequestNew" rn LEFT JOIN "Request" r ON r.id = rn.request_id where r.id IS NULL)

    "Nested Loop (cost=1123840.92..1123843.69 rows=1 width=0)"
    " -> HashAggregate (cost=1123840.36..1123840.37 rows=1 width=4)"
    " Group Key: rn.id"
    " -> Merge Anti Join (cost=71086.71..1123840.36 rows=1 width=4)"
    " Merge Cond: (rn.request_id = r.id)"
    " -> Index Scan using request_id_index on "RequestNew" rn (cost=0.43..658236.37 rows=2383396 width=8)"
    " -> Index Only Scan using "Request_pkey" on "Request" r (cost=0.56..421372.34 rows=5947719 width=4)"
    " -> Index Only Scan using "RequestNew_pkey" on "RequestNew" (cost=0.56..3.30 rows=1 width=4)"
    " Index Cond: (id = rn.i

  2. #2
    Join Date
    Nov 2003
    Posts
    2,933
    Provided Answers: 12
    You are over-complicating the query and the sub-query you use forces the database to go through RequestNew twice.

    The following should be faster (as a SELECT as well as the DELETE)

    Code:
    DELETE FROM "RequestNew" rn
    WHERE NOT EXISTS (select * 
                      FROM "Request" r 
                      where r.id = rn.request_id)
    Another version to try is:
    Code:
    DELETE FROM "RequestNew" 
    WHERE id NOT IN (select r.id from "Request" r);
    Please use [code] tags in the future to properly format code (see: http://www.dbforums.com/misc.php?do=bbcode#code on how to do it)

    Also: you should really avoid those dreaded quoted identifiers. They are much more trouble in the long run then they are worth it.

    You should also provide the (formatted) output of explain (analyze, verbose) instead of the plain explain output
    Last edited by shammat; 10-19-16 at 06:08.
    I will not read nor answer questions where the SQL code is messy and not formatted properly using [code] tags: http://www.dbforums.com/misc.php?do=bbcode#code

    Tips for good questions:

    http://tkyte.blogspot.de/2005/06/how...questions.html
    http://wiki.postgresql.org/wiki/SlowQueryQuestions
    http://catb.org/esr/faqs/smart-questions.html

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •