We are faced with a critical business problem and stuggling to find a suitable solution. Hope some of you will be able to help.
Below is a summary of the problem we are facing.
A table has around 30 fields and around 20 million entries. There are 10 secondary index' in addition to the primary index.
A process involves the search on 5 of the fields, any combination of which should match the specified number of input parameters. A secondary index exists on each of the 5 fields.
So if we assume that the fields are A1, A2, A3, A4 and A5, the user can input all 5 parameters and state that atleast 3 of the parameters should match. The report needs to output data matching the following combinations.
To optimize the search, currently the program performs 5 individual selects for each of the input parameters, and then combines the data in the application logic.
We have been examining different possibilities to optimize this search. Some possible approaches are
1. Use a single select statement with an OR clause for each of the parameters. The data transported to the application server should be significantly less but we are not sure about the implications on the database server. In our initial tests, the performance does not seem to be much better.
2. Use a large number of specific selects for each of the possible combinations. For example, find records that match all 5 selection criteria, then each of the 4 combinations and so on.......
Any ideas about what might help us to do the trick?
You could try the following to restrict the number of returned rows but the indexing is a bit more troublesome. If you have 9i you can probably use the index skip scan feature (and probably reduce the number of indexes on the table) or the new data compression feature in 9ir2. Otherwise you need to probably do a statitical analysis of which columns are most often queried and do indexes on those common columns. Also if you have that many columns in the table is it normalised (you may be able to reduce the size of the table and thus improve performance). Also look at the optimizer parameters like multiblock read count etc as this may improve performance or help it choose a better plan. Depending on the cardinality of the columns also look at bitmap indexes.
select * from
select d.*, decode(A1,2,1,0)+decode(A2,200305,1,0)+decode(A3,' A',1,0) match
from tablename d
where A1 = 2 or A2=200305 or A3='A'
) where match>1