
Thread: Warehouse?

  1. #1
    Join Date: Nov 2002
    Location: Jersey
    Posts: 10,322

    Warehouse?

    ok...ok...let me ask you...I never built a cube...or a cylinder or a tetrahedron...

    So what's the big deal?

    Denormalization is denormalisation (for my Continental friends), right?

    It's all about performance...aren't these all just new buzzwords?

    Did the optimizer change?

    Or do extensions enhance performance....

    Any comments appreciated....
    Brett
    8-)

    It's a Great Day for America everybody!

    I'm Good Once as I ever was

    The physical order of data in a database has no meaning.

  2. #2
    Join Date: Feb 2004
    Location: In front of the computer
    Posts: 15,579
    The biggest thing that I see for the user in the switch from tables to cubes is that you can look at your data in a different way. Instead of thinking about the procedure for extracting your data, you can think about the data itself and which dimensions are interesting.
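    To make the contrast concrete, here is a tiny sketch in plain Python (the fact rows and column names are made up for illustration). The caller states which dimensions are interesting; the grouping procedure is the engine's problem:

        from collections import defaultdict

        # Hypothetical fact rows: one sale per row, with its dimension values.
        facts = [
            {"region": "East", "year": 2003, "product": "Widget", "sales": 100.0},
            {"region": "East", "year": 2004, "product": "Gadget", "sales": 250.0},
            {"region": "West", "year": 2004, "product": "Widget", "sales": 175.0},
        ]

        def rollup(rows, dims, measure):
            """Total a measure over whichever dimensions are interesting."""
            totals = defaultdict(float)
            for row in rows:
                key = tuple(row[d] for d in dims)
                totals[key] += row[measure]
            return dict(totals)

        # The user says *what* they want (sales by region and year),
        # not *how* to fetch and summarize it.
        print(rollup(facts, ["region", "year"], "sales"))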

    From the engine's perspective, the whole ballgame changes. Instead of simply managing tables and indices, the engine manages the rollups themselves. The designer specifies the underlying fact tables and the dimensions they find interesting, and sets a limit on how much space the engine can consume. The engine then rolls up the fact data in ways that make it easier (faster) to answer the current user queries.

    In the same way that the optimizer sometimes makes surprising choices that benefit performance based on the indices available, the cube manager can do the same thing with rollups. It can watch incoming queries to see what is currently interesting to the users, and it can jettison some old rollups to make room for new ones that are more useful for answering the users' current queries.

    The net effect is that for a given amount of hardware, you get better performance. For a given amount of human effort, you get better queries, because the people writing them can think in terms of the data instead of the procedure needed to get the answer. The result is a win-win situation in my book.
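    To illustrate the "watch and jettison" idea, here is a toy sketch (my own invention, not any vendor's actual engine) of a manager that keeps whichever rollups answer recent queries, within a fixed budget. It reuses the facts rows and rollup logic from the sketch above:

        from collections import defaultdict

        class RollupManager:
            """Toy adaptive rollup cache with a fixed space budget
            (counted in rollups here, rather than bytes)."""

            def __init__(self, facts, measure, max_rollups=2):
                self.facts = facts
                self.measure = measure
                self.max_rollups = max_rollups
                self.rollups = {}             # dims tuple -> aggregated totals
                self.hits = defaultdict(int)  # how often each rollup is used

            def _build(self, dims):
                totals = defaultdict(float)
                for row in self.facts:
                    totals[tuple(row[d] for d in dims)] += row[self.measure]
                return dict(totals)

            def query(self, dims):
                dims = tuple(dims)
                if dims not in self.rollups:
                    # Over budget? Jettison the least-used rollup first.
                    if len(self.rollups) >= self.max_rollups:
                        coldest = min(self.rollups, key=lambda d: self.hits[d])
                        del self.rollups[coldest]
                    self.rollups[dims] = self._build(dims)
                self.hits[dims] += 1
                return self.rollups[dims]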

    -PatP

  3. #3
    Join Date: Nov 2002
    Location: Jersey
    Posts: 10,322
    Wait a sec...

    You're saying that as a request comes in, a process then rolls the data up, at request time, and generates the new dimensions?

    Doesn't that take a long time?

    I thought the cubes were built by the DBA, in a batch process, so the request can be honored quickly...

    Damn, now I gotta get a book....
    Brett
    8-)

    It's a Great Day for America everybody!

    I'm Good Once as I ever was

    The physical order of data in a database has no meaning.

  4. #4
    Join Date: Jan 2003
    Location: Massachusetts
    Posts: 5,800
    Hmm. My first reply never made it in...

    I am not sure about other engines, but I know Analysis Services will keep a bunch of top-level aggregate information lying around within easy reach. This is the first stuff that is produced by "Processing the Cube". As the cube is further processed, more and more combinations of aggregations are computed and stored. Depending on how many measures and dimensions you have, as well as how many rows are in the fact table, completely processing a cube can take a while. So they give you a "stop" button in the GUI (sorry, Brett; did you think MS was going to give you a batch script interface?). When you hit the stop button, the processing stops with whatever combinations of data it has gathered so far.
    When queries start coming into the cube, AS tries to get the answer from these saved answers. If AS cannot figure it out from those, then and only then will it go back to the original data source.
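    As a sketch of that lookup order only (the names are hypothetical, and AS's real storage engine is far more involved): try an exact saved aggregation first, then roll up a finer-grained one, and only then scan the source data.

        from collections import defaultdict

        def answer(dims, aggregates, facts, measure="sales"):
            """`aggregates` maps a dims tuple -> {dim values: total}."""
            dims = tuple(dims)
            # 1. An exact precomputed answer.
            if dims in aggregates:
                return aggregates[dims]
            # 2. A finer-grained aggregate can be rolled up further.
            for agg_dims, table in aggregates.items():
                if set(dims) <= set(agg_dims):
                    idx = [agg_dims.index(d) for d in dims]
                    totals = defaultdict(float)
                    for key, value in table.items():
                        totals[tuple(key[i] for i in idx)] += value
                    return dict(totals)
            # 3. Then and only then: back to the original data source.
            totals = defaultdict(float)
            for row in facts:
                totals[tuple(row[d] for d in dims)] += row[measure]
            return dict(totals)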

    Anyway, that is the theory. It looks nice, but it could be wildly different from reality. As Bill Gates always says, "Pay no mind to that process behind the curtain".

  5. #5
    Join Date: Feb 2004
    Location: In front of the computer
    Posts: 15,579
    The DBA defines the basic snowflake schema: the dimensions of interest, the measures (values) for those dimensions, and the basic facts (sales, hours, margaritas, etc.).
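    As a made-up example of what the DBA hands the engine (every name here is hypothetical), the whole declaration can be as small as:

        # Hypothetical cube definition: one fact table, its measures,
        # and the dimensions (with their snowflaked levels) around it.
        cube = {
            "fact_table": "fact_sales",
            "measures": ["sales_amount", "units_sold"],
            "dimensions": {
                "time":      {"levels": ["year", "quarter", "month"]},
                "geography": {"levels": ["country", "region", "city"]},
                "product":   {"levels": ["category", "subcategory", "sku"]},
            },
            "space_budget_mb": 512,  # limit on what the engine may consume
        }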

    Depending on whose cube manager you are using, there are varying degrees of automation. Using a relatively automated manager, the DBA might just specify the initial rollups (the DBA's best guess at what the users will want to know), then allow the cube manager to "watch" incoming queries. The manager first answers the query, then considers the query itself to see whether a different rollup strategy might have made it easier to answer. Over time, the manager might decide to retire a rollup that doesn't perform as well at answering current queries and use the space for a new rollup that works better.
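    Continuing the toy manager sketched in post #2 (again, an illustration rather than any real product), that workflow might look like:

        # The DBA seeds the manager with a best guess, then the manager
        # adapts as real queries arrive.
        mgr = RollupManager(facts, measure="sales", max_rollups=2)
        mgr.query(["region", "year"])   # DBA's initial rollup, built eagerly
        mgr.query(["product"])          # users turn out to ask by product...
        mgr.query(["product"])
        mgr.query(["product", "year"])  # ...so the coldest rollup is evicted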

    Granted, there are many purely manual cube managers still out there. The more automated ones generally cost the company LOTS less to run, so they are a lot more common and becoming more common all the time.

    -PatP
