Unanswered: Simple query over large dataset taking too much time.
I have been having a hard time with this query. I have a webpage that is supposed to run it a dozen times, and with an execution time of up to 40-50 seconds per query, you can see how this is becoming quite problematic as the database keeps growing.
select distinct(scanid) from misspelled where word = ?
Knowing that the table 'misspelled' can grow to a dozen million rows, is there any way to make this faster?
Thanks, here are more details:
Query: select distinct(scanid) from misspelled where word = ?
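To see where the time actually goes, the plan can be captured with EXPLAIN. A minimal sketch, assuming a placeholder literal stands in for the bound parameter:

```sql
-- Capture the actual execution plan with timing and buffer statistics.
-- 'someword' is a placeholder for the real parameter value.
EXPLAIN (ANALYZE, BUFFERS)
SELECT DISTINCT scanid FROM misspelled WHERE word = 'someword';
```

The BUFFERS option shows how many blocks were read from cache versus disk, which is exactly the distinction discussed below.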
This looks strange. The second plan, at 820 ms, looks quite reasonable. That is probably caused by everything being in the cache, which in turn means your hard disk seems to be extremely slow; otherwise I can't explain why scanning 250 MB should take 38 seconds.
The number of blocks scanned relative to the number of rows is also quite high; maybe you are suffering from table (and index) bloat.
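One rough way to check for bloat, assuming PostgreSQL's standard statistics views, is to compare the live and dead tuple counts with the on-disk size:

```sql
-- Rough bloat indicators: dead tuples and on-disk size.
-- A large n_dead_tup, or a size far bigger than the live data
-- warrants, suggests the table needs vacuuming.
SELECT n_live_tup,
       n_dead_tup,
       pg_size_pretty(pg_total_relation_size('misspelled')) AS total_size
FROM pg_stat_user_tables
WHERE relname = 'misspelled';
```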
Can you run vacuum full analyze misspelled and see if that changes anything?
The output of that statement might also be interesting.
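The statement in question, with VERBOSE added so it prints the per-table output mentioned above:

```sql
-- Rewrites the table and its indexes from scratch, then refreshes
-- planner statistics. Note: VACUUM FULL takes an exclusive lock on
-- the table for the duration, so avoid running it during peak load.
VACUUM FULL VERBOSE ANALYZE misspelled;
```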
9.4.0 is not the most recent 9.4 release, though I doubt that has anything to do with it.
Btw (unrelated to your problem): distinct is NOT a function. Writing select distinct(scanid) is useless; distinct always operates on all columns of the select list. You should remove those unnecessary parentheses. It's a bad habit that will bite you in the foot if you do it for more than one column, because (column_1, column_2) is something different from column_1, column_2.
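A small illustration of the difference (the two-column names and table t are hypothetical):

```sql
-- These two are equivalent: DISTINCT applies to the whole select list,
-- so the parentheses around scanid do nothing.
SELECT DISTINCT scanid FROM misspelled;
SELECT DISTINCT(scanid) FROM misspelled;

-- With two columns, however, the parentheses change the meaning:
SELECT DISTINCT col_a, col_b FROM t;    -- two output columns
SELECT DISTINCT (col_a, col_b) FROM t;  -- ONE column: an anonymous row value
```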