Confusion regarding strategies for datawarehouse creation
I am in process of searching for different dataware house creation and maintenance strategies for my project.
I came across these concepts which seem contradictory.
1) Data in the warehouse is never updated, but is populated on the basis of a timestamp.(This came from a general book on warehousing)
2) The data in the table to be used as a fact table is subject to additions and the dimension table data is subject to both additions as well as updations. (This came from a topic headed as 'Building a warehouse with OLAP solution in mind' which was built keeping in mind Analysis Services)
I have a few doubts...
1)Is the existing warehouse record never to be updated if there is a change in the corresponding OLTP record(but it should be appended with a timestamp)? Which would be the correct strategy to follow while loading data into the d/w?
2)Is it necessary to bring in all columns from the OLTP source to the warehouse destination , or can some fields be skipped?
Re: Confusion regarding strategies for datawarehouse creation
I'm not a DW person, but surely the answer to question 2 is: it is only necessary to copy columns from the source if anyone is interested in reporting on them in the DW. That might be all columns, but might not be. The answer is driven by requirements, not by theory.
Thanks for that!
You see, the project is for membership management- so I have, let us say, two tables- Leads who are to made into members and consultants who do this job. Now the leads may change their contact numbers,the consultants who track the leads may change their wage rates and the leads may be assigned to a different consultant if the consultant leaves his job.
So I'll append the refreshed data into the warehouse table with a timestamp.
The report I might generate would be to know how many leads are assigned to a particular consultant. I don't know how to add a time dimension to the cube so that the calculations are done as per the current most data only and the information cannot be misinterpreted in any way? How should I create the keys for such tables that include the time column(the field should be the insert date/ update date etc) ?