quick question on data warehouse analysis and design?
Sorry for this very newbie question but:
I want to analyse a warehouse system for a factory. I want to keep the info about the items that get into the warehouse and leave it every day. The structures I need to use must be optimal for storage space and time for different searches and report making. The reports are to be made for the items entering and leaving the warehouse at random dates( random access) and weekly/monthly(range query).
Well the solution seems to be using a B*Tree. But I'm still in doubt if I'm taking the right approach. I googled but I mostly came across stuff about data warehousing (which wasn't very useful to me at first look) and some stuff about database design (RDMS, ODMS) which I'm very new to.
Any pointers and directions or direct links(I'm on a slow connection) on what to use (what indexing?) and that help me with the analysis would be mostly appreciated.
There are TONS of sources for data warehousing. The concept behind a data warehouse is to gather data from systems used to run your day to day operations. Combine this data together from these different systems in a manner that lets you see a picture of your business across multiple systems over time.
That's a very vague concept I know(Inmon would be proud).
There's a good chance you might have a separate inventory and purchase system. So you might use your data warehouse to relate these together and provide snapshots of this data for executive decions.
A snapshot is a picture or information at a certain point in time, like what is the status of our inventory for each product or product line 'monthly' (one snapshot), 'weekly' (a different snapshot) or 'daily' (yet another point in time).
OK.. It's midnight in Columbus and I'm drunk so that's all the info you're getting.
index intersection with btree for desgning a warehouse system
I'm sorry that I made my question in a vague way but what I exactly want is as follows:
I have a warehouse (not a data warehouse I think!) to which some items enter and leave everyday.
I want to design an optimal index for it so that I can run different queries on it including random and range queries(snapshotting)
Here's an example table: (Let me know if you have a better design for the table too!)
(5 sample queries)
On (1)Day X which items have (2,3)left/entered the warehouse (4) how many
(5)On Day X which item has left/entered the warehouse MOST/least?
In Week/Month X which item has MOST/least left/entered the warehouse?
Well again after a lot of googling I came across this as the solution:
index intersection with B-trees (B*Trees)(Which in turn seems relevant to Composite B*Trees/ Multiple secondary B*Trees).
Might a B+Tree be of help ( I doubt!)
BTW I'm not after its implementation (SQL, C++ and such) but just the Analysis on paper.
Yes, right it's a homework but I'm not asking for the complete solution but just pointers and after all I have done my best so far.
As for your answer:
It is very reasonable but what is beyond this piece of SQL code, matters to me. As I mentioned no implementation of any sort is required and I think for a course called Data Storage and Retrieval with Michael J. Folk's book on File Structures as the reference which hasn't mentioned a word on the Composite B*Tree it's not very easy.
Getting to the B+ tree level is far beyond the scope that would normally be used by a data modeler or developer. This is actually at a lower level than even a true DBA would be likely to operate, more on the level of a database engine support engineer. I'd be very surprised if an assignment really wanted to know that kind of information.
Are you looking for a proposed schema? Do you want an overview of the queries needed to answer specific questions (which would depend completely on the schema)? Did I miss your point completely?