I've come a long way learning about databases and SQL. I can write basic queries now, even some with subqueries. What I need to learn is how to approach data mining. Can someone suggest the best path to follow to learn how to do data mining on a very large database?
I don't just need to produce reports of acquired data. I need to go in and grab data and look for patterns against known result sets. I hope that makes sense.
Well, do you have a basic grounding in statistics?
T-SQL lends itself fairly well to exception reporting, but unfortunately it does not have a rich function set for correlational studies. I've written my own code for simple linear regression, but I haven't attempted true multiple linear regression.
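To give a feel for how simple linear regression can be done with nothing but plain aggregates, here is a rough sketch in Python. It is not the code referred to above; it uses the standard-library sqlite3 module as a stand-in for SQL Server, and the table name `sales`, the columns `ad_spend` and `revenue`, and the helper `linear_regression_from_sql` are all made up for illustration. The database only has to return five aggregates (COUNT, SUM(x), SUM(y), SUM(x*y), SUM(x*x)); the slope and intercept follow from the usual least-squares formulas.

```python
import sqlite3


def linear_regression_from_sql(conn, table, x_col, y_col):
    """Fit y = intercept + slope * x using only SQL aggregate functions."""
    # Identifiers are interpolated into the query text, so pass trusted names only.
    query = (
        f"SELECT COUNT(*), SUM({x_col}), SUM({y_col}), "
        f"SUM({x_col} * {y_col}), SUM({x_col} * {x_col}) "
        f"FROM {table} "
        f"WHERE {x_col} IS NOT NULL AND {y_col} IS NOT NULL"
    )
    n, sx, sy, sxy, sxx = conn.execute(query).fetchone()
    if n < 2:
        raise ValueError("need at least two rows to fit a line")
    denom = n * sxx - sx * sx
    if denom == 0:
        raise ValueError("x has no variance, so the slope is undefined")
    slope = (n * sxy - sx * sy) / denom
    intercept = (sy - slope * sx) / n
    return slope, intercept


# Hypothetical sample data that follows revenue = 2 * ad_spend + 1 exactly.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (ad_spend REAL, revenue REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [(1.0, 3.0), (2.0, 5.0), (3.0, 7.0), (4.0, 9.0)],
)
slope, intercept = linear_regression_from_sql(conn, "sales", "ad_spend", "revenue")
print(f"slope={slope:.3f} intercept={intercept:.3f}")
```

The same SELECT should translate to T-SQL with little change, since it only relies on COUNT and SUM. With more than one predictor the sums turn into a matrix problem, which is where a dedicated statistics tool starts to pay off.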
SQL Server 2005 has CLR integration, which lets you build custom aggregate functions, so it may prove more useful for data analysis, and I wouldn't be surprised to see third-party developers marketing more advanced statistical functions. In the meantime, you will need to spin off subsets of data for further analysis in a stronger statistical tool, such as Excel or SAS.
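Spinning off a subset can be as simple as running a filtering query and writing the result to a CSV file, which both Excel and SAS can import. Below is a minimal sketch of that idea in Python, again using sqlite3 as a stand-in for SQL Server. The `orders` table, its columns, and the `export_subset` helper are invented for the example; against a real server you would swap in your own connection and query.

```python
import csv
import io
import sqlite3


def export_subset(conn, query, params, out):
    """Run a filtering query and write its rows to `out` as CSV with a header."""
    cursor = conn.execute(query, params)
    writer = csv.writer(out)
    # cursor.description holds one tuple per result column; item 0 is the name.
    writer.writerow([col[0] for col in cursor.description])
    count = 0
    for row in cursor:
        writer.writerow(row)
        count += 1
    return count


# Hypothetical sample table standing in for a much larger one.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (region TEXT, units INTEGER, revenue REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [("east", 10, 250.0), ("west", 4, 90.0), ("east", 7, 180.0)],
)

# An in-memory buffer keeps the example self-contained; for a real export,
# pass a file opened with open(path, "w", newline="") instead.
buffer = io.StringIO()
rows_written = export_subset(
    conn,
    "SELECT units, revenue FROM orders WHERE region = ? ORDER BY units",
    ("east",),
    buffer,
)
print(rows_written)
print(buffer.getvalue())
```

Doing the filtering in the WHERE clause keeps the export small, so the statistics tool only ever sees the slice you actually want to study.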
If it's not practically useful, then it's practically useless.
Thanks. I actually oversimplified my project. I have a fairly extensive database that I need to work with. I need to select certain data sets and see which factors I can add to make them more predictive. I'm test-driving a software package called Purple Mineset at present.