I developing a system which allow customer to purchase data records such as company info.
The system has about 1 million data records(2GB) and 1000+ customers. Each customer purchase about 5000 data records each time. The data record is updated at the rate of 2000 records per day.
When a customer puchase the data records again, only the lastest modified records is sold to the customer. I put a flag and version column on the sales table so that when the product table is updated, records in sales table will be updated as well. The product and sales table will then be matched to produce the lastest update data records.
Using the above way i encounter a performace issue whereby the search for the lastest record is extremely slow. Moreover, the sales table is taking a huge amount of storage space.
Can anyone suggest me one better way to solve this problem.
Thanks a lot.
I assume flag and version are not your PK sequence. For the query you are talking about you probably are using a WHERE clause whose predicates are these two columns, and then you do some forther row processing.
If the table(s) are cluster-indexed on some other fields whose distribution is widely different from the flag+version sequence (very likely), even is you have an alternate index built on the latter two fields, you quite probably will end up performing a tablespace scan. :-(
Of course, if the flag and version fields are buried toward the end of a long record structure, you are talking about a giant amount of work poor DB2 will have to perform to get your results.
Create an index that has all the fields required for you primary query so you can perform an index-only query.
Rearrange the columns so falg and version are positioned closer to the start of the structure.
Publish the SQL statement so we can better reflect on the situation.