Results 1 to 7 of 7
  1. #1
    Join Date
    Sep 2009
    Posts
    11

    ATTN: Blindman & others

    Quote Originally Posted by blindman
    G@dd$mn F#&$ing Star Schemas And Snowflakes Are Not Data Warehouses!
    Believe me, I'm completely sold that it is essential for both an ODS and a warehouse to be normalized. I think it is quite ridiculous that such a question is even contested.

    So normalized it is for this sake of this discussion.

    But a snowflake schema can be 3NF, so might it be reasonable to construct a warehouse using a 3NF snowflake?

    If not, I'm curious as to your approach for creating and maintaining the dimension tables for the datamarts.

  2. #2
    Join Date
    Mar 2009
    Location
    Sydney, Australia
    Posts
    258
    Finally, someone with a deeper understanding of the value of normlisation. Halelujah !

    Yeah, a word of warning, be careful, there are a few here who have their own private definitions of standard terms.

    For starters, I would say 4NF or BCNF is minimal these days, 3NF is no longer the minimum acceptable for large corporations, DoD, govt departments. Most of them demand IDEF1X as well.

    Yes, I have implemented a few pure data warehouses (sourcing the data from OLTP, so there is no OLTP requirement) as well as a few DSS on top of OLTP; one example here being discussed in a recent thread. Star or Snowflake on top of OLTP. Strictly BCNF. The standout benefit was ease of reporting, the users number in the thousands, and use many different report tools. I personally would not find an unnormalised design acceptable for a data warehouse (I have worked on more than a few, and the pain is unforgettable). It is not only reasonable, I would say it is a commercial minimum (supplied for a fee by a qualified outside professional, as opposed to supplied by a salaried internal person).

    Maintaining the dimension tables (and table clusters in a Snowflake) are dead easy. Simple and straight-forward, as are reference tables in a normalised OLTP database.

    One more point. The absence of natural relational keys in a BCNF database cripples it; it cripples a warehouse as well, the data volumes are just larger, and the effort required to correct it thusly larger. If there are more than a few Fact tables, and they are not related by migrated relational keys (they are related by surrogate keys alone), then Dimension-Fact reporting is also crippled, limited to the single parent Fact table. Migrated relational keys allow the Fact table to have children; and those child Fact tables can be reported on directly from the Dimensions, without being forced to go through the parent Fact table. The linked example does demonstrate that capability, in simple terms. The alternative (and I have seen warehouses undergo this correction), is massive data duplication: every Fact table has to have every Dimesion possible as an FK.
    Regards
    Derek Asirvadem (Formerly DerekA)
    Information Architect / Senior Sybase DBA
    Copyright 2009 Software Gems Pty Ltd

    I answer questions from the Original Poster only. If you have a genuine question, as the moderators have requested, start a new thread.

    http://www.softwaregems.com.au

  3. #3
    Join Date
    Dec 2007
    Location
    London, UK
    Posts
    741
    Quote Originally Posted by iambk
    But a snowflake schema can be 3NF, so might it be reasonable to construct a warehouse using a 3NF snowflake?
    Yes, any pattern of tables and relationships could arise in a normalized schema. That may include star and snowflake patterns. When designing Normal Form models the goal isn't to achieve some design pattern but to create a schema based on dependencies and minimising the potential for anomalies that could arise.

    As far as I know (please correct me if you know otherwise) the "snowflake" term originates with Kimball. In my opinion it isn't very important or interesting as a concept (is every model a snowflake if it is more than just a star?)

    A "star" is much more significant because it describes a common scenario in data models of all kinds: a table with multiple foreign keys which is the target of joins with a selection criteria on one or more of the "parent" tables. Optimising such queries is an important problem in DBMS design.

  4. #4
    Join Date
    Jun 2003
    Location
    Ohio
    Posts
    12,592
    Quote Originally Posted by iambk
    But a snowflake schema can be 3NF, so might it be reasonable to construct a warehouse using a 3NF snowflake?
    Heck, even a star schema can be "normalized", for very simple database designs. Obviously a snowflake schema can handle more complex designs and still be normalized. But in my experience no design with a single centralized fact table can accomodate even moderately complex normalized schemas.
    If it's not practically useful, then it's practically useless.

    blindman
    www.chess.com: "sqlblindman"
    www.LobsterShot.blogspot.com

  5. #5
    Join Date
    Sep 2009
    Posts
    11
    Quote Originally Posted by Derek Asirvadem
    The absence of natural relational keys in a BCNF database cripples it; it cripples a warehouse as well, the data volumes are just larger, and the effort required to correct it thusly larger. If there are more than a few Fact tables, and they are not related by migrated relational keys (they are related by surrogate keys alone), then Dimension-Fact reporting is also crippled, limited to the single parent Fact table. Migrated relational keys allow the Fact table to have children; and those child Fact tables can be reported on directly from the Dimensions, without being forced to go through the parent Fact table. The linked example does demonstrate that capability, in simple terms. The alternative (and I have seen warehouses undergo this correction), is massive data duplication: every Fact table has to have every Dimesion possible as an FK.
    Are you suggesting that natural keys from the OLTP database be included in the fact tables?

  6. #6
    Join Date
    Jun 2003
    Location
    Ohio
    Posts
    12,592
    Not as a part of the data warehouse, though it may be required for data synchronization.
    The data warehouse should in NO way be tied to or dependent upon the schema or conventions of the source database(s). Sources may change as applications are phased out and new ones are brought online, but theoretically the data warehouse should exist for the life of the business.
    The data warehouse should have its own independent set of keys, natural or surrogate as you choose.
    If it's not practically useful, then it's practically useless.

    blindman
    www.chess.com: "sqlblindman"
    www.LobsterShot.blogspot.com

  7. #7
    Join Date
    Sep 2009
    Posts
    11
    Quote Originally Posted by blindman
    Not as a part of the data warehouse, though it may be required for data synchronization.
    The data warehouse should in NO way be tied to or dependent upon the schema or conventions of the source database(s). Sources may change as applications are phased out and new ones are brought online, but theoretically the data warehouse should exist for the life of the business.
    The data warehouse should have its own independent set of keys, natural or surrogate as you choose.
    I certainly agree, I was concerned otherwise.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •