Now, I want search for records with employee id 1 in a way that if column emp1 contains 1 then I want the data in column emp2, and if column emp2 contains 1, then I want data in column emp1.
The output has to be a single column with no duplicate values. In the above example, for employee id 1, the output would be, 2,5 and 6 only.
This table has lakhs of records. I have to scan both columns for a given employee id. What will be the most optimized way to retrieve the data faster. Also, do I need to restructure this table for faster data retrieval?
I have indexes on emp1 and emp2 columns. Do we need union here and if yes, what is the best optimized query for the same?
yes..lakhs of records. A stored procedure puts lots of date daily in the table.
I have another concernabout how to manage this data. All the data from the table must be available all the time. Nad yes,it MS SQL 2005.
Were the queries fast enough or do you need further optimisation? Which performed best? I suspect it will be Rudy's (with the correct indexes). Also remember that adding indexes will slow down insert and (probably) update speed....
LOTS is basically I can say around 5000 records per day. And yes, Rudy's query is a little faster. Also, indexes is not a problem as the stored procedure runs midnight and does inserts only and the time it takes to execute is not a constraint...
I am trying to use partitions. I cannot use date to filter my rows in the where clause so cannot have partition on date column....is there anyway I can partition based on emp1 or emp2 for better data retrieval and storage.
btw, pootle flump- thanks for all the help so far..
Don't use partitions. 5k per day is a very small number of records.
For this query (and considering this query only) the fastest index choice would be to create a clustered index on emp1 and a nonclustered index on emp2. emp2 would be in the leaf level of the clustered index, and in 2005 the optimiser can access the clustered index value in the non clustered index without visiting the clustered index so, in effect, emp1 would be in the leaf level of the nonclustered index. Obviously, you could switch this the other way round - it is immaterial which is clustered or not.
Basically this would mean Rudy's query would do two index seeks. The distinct requirement will hold things up a bit but no alternative with your data structure.
Please also note this is not necessarily optimal for this table - it is only optimal for this one query. Index tuning can only be done properly considering all the factors of all the operations on the table. Normally, for example, you would want the clustered index to be unique and you may still do so in this instance.