Results 1 to 5 of 5
  1. #1
    Join Date
    May 2009
    Posts
    18

    Unanswered: Bizarre performance threshhold: 37 numbers vs. 38 numbers??

    Hi all, i'm encountering a bizarre problem in PG8.3 for Windows. The following query runs consistently in about 1.2 seconds:

    Code:
    SELECT Availabilities.Availability_ID
    FROM Availabilities INNER JOIN Availability_Periods ON Availabilities.Availability_ID = Availability_Periods.Availability_ID 
    WHERE Business_ID IN (1413,1508,8226,60309,65743,1354,1373,1455,1504,1516,1719,1730,1802,1803,1825,1829,7988,23652,25216,30990,33059,33108,39055,39413,41471,42452,42561,45522,46322,50245,52600,54266,54587,59146,60006,63684,69667)
    There are 37 numbers in the IN() clause. Then, when I add any number to the IN() clause, making it 38 numbers:

    Code:
    SELECT Availabilities.Availability_ID
    FROM Availabilities 
    INNER JOIN Availability_Periods ON Availabilities.Availability_ID = Availability_Periods.Availability_ID 
    WHERE Business_ID IN (1402,1413,1508,8226,60309,65743,1354,1373,1455,1504,1516,1719,1730,1802,1803,1825,1829,7988,23652,25216,30990,33059,33108,39055,39413,41471,42452,42561,45522,46322,50245,52600,54266,54587,59146,60006,63684,69667)
    Suddenly, the query consistently takes around 5-6 seconds to run. It doesn't matter which number I add or remove, I consistently get this difference. I ran EXPLAIN on them and they show the same execution plan.

    Unfortunately I can't use temporary tables or subqueries to dynamically generate the list of numbers to begin with. However that aside, something is clearly wrong here in the first place.

    Any ideas on the difference? Is there some threshold measure in PG causing this difference, which I could tweak to a more suitable level? (Eg. if the slowness kicked in at around 50 numbers, this would easily meet our needs)

  2. #2
    Join Date
    Nov 2003
    Posts
    2,935
    Provided Answers: 12
    What does the explain plan for those queries show?

    I could image that adding the number makes the planner choose a different (slower) execution plan e.g. because of wrong or not sufficient statistics.

    Maybe increasing the default_statistics_target parameter and analyzing the tables could help.

  3. #3
    Join Date
    May 2009
    Posts
    18
    Quote Originally Posted by brendan.hill
    Hi all, i'm encountering a bizarre problem in PG8.3 for Windows. The following query runs consistently in about 1.2 seconds:

    Code:
    SELECT Availabilities.Availability_ID
    FROM Availabilities INNER JOIN Availability_Periods ON Availabilities.Availability_ID = Availability_Periods.Availability_ID 
    WHERE Business_ID IN (1413,1508,8226,60309,65743,1354,1373,1455,1504,1516,1719,1730,1802,1803,1825,1829,7988,23652,25216,30990,33059,33108,39055,39413,41471,42452,42561,45522,46322,50245,52600,54266,54587,59146,60006,63684,69667)
    There are 37 numbers in the IN() clause. Then, when I add any number to the IN() clause, making it 38 numbers:

    Code:
    SELECT Availabilities.Availability_ID
    FROM Availabilities 
    INNER JOIN Availability_Periods ON Availabilities.Availability_ID = Availability_Periods.Availability_ID 
    WHERE Business_ID IN (1402,1413,1508,8226,60309,65743,1354,1373,1455,1504,1516,1719,1730,1802,1803,1825,1829,7988,23652,25216,30990,33059,33108,39055,39413,41471,42452,42561,45522,46322,50245,52600,54266,54587,59146,60006,63684,69667)
    Suddenly, the query consistently takes around 5-6 seconds to run. It doesn't matter which number I add or remove, I consistently get this difference. I ran EXPLAIN on them and they show the same execution plan.

    Unfortunately I can't use temporary tables or subqueries to dynamically generate the list of numbers to begin with. However that aside, something is clearly wrong here in the first place.

    Any ideas on the difference? Is there some threshold measure in PG causing this difference, which I could tweak to a more suitable level? (Eg. if the slowness kicked in at around 50 numbers, this would easily meet our needs)
    Ah, I think I solved it - ran a FULL VACUUM on both tables and now it runs a treat. Bloody strange problem though!

  4. #4
    Join Date
    Nov 2003
    Posts
    2,935
    Provided Answers: 12
    Quote Originally Posted by brendan.hill
    Ah, I think I solved it - ran a FULL VACUUM on both tables and now it runs a treat. Bloody strange problem though!
    This seems to indicate that your auto_vacuum settings are not optimal.

    You didn't post the execution plans, but my guess would be that using 37 numbers generates an index scan whereas with 38 numbers Postgres used a seq scan. If you have a lot of dead tuples in the table, the seq scan can take really long because the dead tuples are scanned as well.

  5. #5
    Join Date
    May 2009
    Posts
    18
    Quote Originally Posted by shammat
    This seems to indicate that your auto_vacuum settings are not optimal.

    You didn't post the execution plans, but my guess would be that using 37 numbers generates an index scan whereas with 38 numbers Postgres used a seq scan. If you have a lot of dead tuples in the table, the seq scan can take really long because the dead tuples are scanned as well.
    Hi Shammat, yes - that's pretty much what the execution plans showed. I will be revisiting the auto_vacuum settings (it's a recently implemented installation) to make sure it's better handled in future - thanks for the advice.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •