The application I am currently analysing, which is having, serious performance issues, is using semantic partitioning (range) on one of the main tables.
In order to be able to archive data easily, each partition contains the data of one month. Hence all inserts happen on the same partition. I do not really see the benefit of it.
In addition, each month, they obviously start with an empty partition and due to the high percentage of new data, the statistics are wrong. Therefore they run update stats every hour, in order to get the indexes used.
Main index is on date (which is the partitioning key) and an id-column, which is ascending.
My assumption is, that this index on (partitioning key, id) degenerates and is not really useful.
In addition I think that the performance problems could be solved, by partitioning by a completely different column.
Question: Has someone got experience with this sort of partitioning? Would it not be better to partition by a completely different column that the date of insertion, in order to distribute the load among all partitions.
The disadvantage would be, that the archiving would suffer, but finally the main purpose of the application is to get data in and out and not the archiving.
Need some more information.
Size of table and its indexes.
What do most of the queries use to search into the table? Assuming its partioned by range of dates ( i deduct from what you wrote ) the info might be needed for some months and being selected per month.
But it all comes down to how its used. Archiving is being done on that table? So a previous month is kept just for archiving purposes or is it still used?
Why do you say a different column is better?
The advantage they have is to have all the information on 1 table, even though every month they write on the new partition only. They can go back in time and not have a huge table to go through, partition segments the data that they look for.
If you have all the data you want on 1 partition is better than to spread it. Otherwise the queries have to go to multiple partitions to get the same data, its faster if the data is on 1 partition. Its used to segment data into smaller parts so its faster to access it. Otherwise you would have a huge table to go through.
Also update stats every hour?? That seems excessive....
Now it would be healthier to a have a normal table that gets archived to a historic table, that one being partitioned by whatever they want... but most of the times the customer can't change all their procs and processes to work in this way.