Unanswered: Tips to optimize my table for a complicated query
My query looks something like this:
select cat1, cat3, cat4,
sum(case when cat2='x' then val end) as sumValForX,
sum(case when cat2='y' then val end) as sumValForY
from your_table
where date between somedate and someotherdate and mainname=somename
group by cat1, cat3, cat4
Basically this groups my table by cat1, cat3 and cat4, and for each resulting group shows the total of val *separately* for cat2='x' and for cat2='y'.
I'm working on a very large table (about 5 million records), so the query takes a long time. It's workable, but I wanted to ask the MySQL pros here what optimizations I can make to my table to speed it up.
What would be the best technique to speed up each of these?
1. The GROUP BY
2. The WHERE (indexing? partitioning?)
3. The CASE WHEN (which I think is the main slowdown)
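For point 2, the kind of index I have in mind is a composite one on the two WHERE columns, with the equality column (`mainname`) first and the range column (`date`) second. This is only a rough sketch: the index name is made up, and I haven't confirmed that it actually helps on my data.

```sql
-- Hypothetical index name; your_table, mainname and date are the names from my query.
-- Equality column first, range column second, so the BETWEEN can use the index too.
ALTER TABLE your_table
  ADD INDEX idx_mainname_date (`mainname`, `date`);
```

Running EXPLAIN on the query before and after should show whether MySQL picks the index up.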
Some insight about the data:
Across the roughly 5 million records:
`mainname` is the column with the most repeated values: there are 5 distinct `mainname` values with maybe 1 million rows each.
Within each of those 1 million rows, there are around 90 distinct dates, and those are what the BETWEEN filters on.
Also within each of those 1 million rows, there are at most 4-5 distinct `cat2` values. So there might be 'x', 'y', or maybe 'z' and 'z1' too, but not more. I have additional sum(case when... expressions, one for each `cat2` value.
To put it differently, I need the grouped sum for each type of `cat2`.
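Written out for the four `cat2` values mentioned above, the full query would look roughly like this. The aliases sumValForZ and sumValForZ1 and the user variables @somedate, @someotherdate and @somename are placeholders I made up for illustration; the real values go in their place.

```sql
-- One conditional SUM per cat2 value; rows with a different cat2 contribute NULL,
-- which SUM ignores.
SELECT cat1, cat3, cat4,
       SUM(CASE WHEN cat2 = 'x'  THEN val END) AS sumValForX,
       SUM(CASE WHEN cat2 = 'y'  THEN val END) AS sumValForY,
       SUM(CASE WHEN cat2 = 'z'  THEN val END) AS sumValForZ,   -- hypothetical alias
       SUM(CASE WHEN cat2 = 'z1' THEN val END) AS sumValForZ1   -- hypothetical alias
FROM your_table
WHERE `date` BETWEEN @somedate AND @someotherdate  -- placeholder variables
  AND mainname = @somename                         -- placeholder variable
GROUP BY cat1, cat3, cat4;
```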