Results 1 to 13 of 13
  1. #1
    Join Date
    Mar 2004
    Posts
    30

    Highly Normalised Table Structures, how to load whilst assigning/allocating keys ?

    Hi, I have a problem in that Im not sure of the best way to load a highly normalised table structure, in that a number of tables are all related for a given situation / entity, but keeping tabs on primary keys, relationships between them all is very difficult seeing I need to consider the entire set when loading... First, a description of the business rules / design :

    Interest Rates for a given account may be structured in a way that depending on the balance you have, you may get a different rate, and furthermore, each part of that balance may have a seperate rate applied to it... Refer to the below diagram for a data example... The entire rate set is what we call the 'Matrix Rate Structure', within this we have a number of 'Condition Rules' ( lines )... Within each condition rule, there are two sets of 'Condition Numeric Ranges', one being Eligibility ( ie. if your balance is $x ), and the other being Applicability ( ie. the interest-rate applies only to this part of the balance ), and of course, the Interest Rate being applied... It should be pretty straight-forward to see how this fits into the data-model below...

    So working from the more granular level upwards, lets look at loading 'Condition Numeric Range', this table shows an upper and lower bound, therefore can be re-used in multiple 'Condition Rules', as either Applicable or Eligibile, it doesnt matter. It is like populated like a cartesian dimension ( ie. if this upper/lower bound exists already, get that key, if not, create a new one ). Then look at all the 'Interest Rates', get the distinct set of those, and populate RATE and INT_RATE tables ( supertype/subtype ). I have all the ranges and interest rates loaded ( with keys returned ), therefore I can populate 'Condition Rule' using these 3 values, once again, treating this table as a cartesian dimension ( does a rule exist with this eligability / applicability range with this interest rate attached, if so, use this key, otherwise, create a new one )...

    Im fine up to here, I can load these 3 tables using a denormalised 'Staging table' which has all attributes for all 3 parts of the rule, by using each attribute group ( interest rate, and the 2 numeric ranges ), looking up the associated table, and returning a primary key for each... Then using the 3 returned keys, I can create a row in 'Condition Rule', no worries... As seen, the 'Numeric Range', 'Interest Rate' and 'Rule' are all reusable elements, which if another 'Matrix Structure' comes along with the same rule, I can reuse it...

    Now my problem is loading the single 'Matrix' element ( into the RATE table, with links to each Rule elemt via the 'Matrix Rule Structure' table )... When loading this, I need a handle on all rules within the Matrix at once, to make sure I am either using an existing Matrix, or a new one... I cant seem to figure out how to load that last step, because I need to know all the Rules in my matrix at once, to see whether I should make a new Matrix, or re-use an existing one ( with exact same rules )...

    This is my problem with loading such a highly normalised structure, where I need a handle on all associated elements across the tables to ensure I am reusing existing primary keys, etc... Is the problem that my 'Matrix' does not have an identifier, with the only way to identify each Matrix ( which should result in a single row in the 'Rate' table ), is by looking at all the rules within...

    If anyone could help me on this I would be eternally grateful, it is causing me alot of grief... By the way, I am using an ETL tool to load / reference these tables ( therefore plain SQL ), so I cant do anything funky like store RULE_IDs in an array or anything when loading...Is my problem with my loading philosophies or with my DB design... Thanks heaps in advance
    Attached Thumbnails Attached Thumbnails stepped_1.jpg   stepped_2.jpg  

  2. #2
    Join Date
    Apr 2004
    Location
    Toronto, Canada
    Posts
    249
    Hi,

    Would it be possible to delete all data from the Matrix table and then just reload the table with all possible combinations of ConditionRuleId and RateId?

    That way, you do not have to worry about old and new data.

    By the way, there seems to be redundant information in the "Matrix Rate Structure" diagram that you sent with examples of data.

    Ravi

  3. #3
    Join Date
    Mar 2004
    Posts
    30
    Thanks for the reply rajiravi...

    1. There is an infinite number of combinations regarding matrix structures, so preloading all combinations is not an option
    2. Deleting all old data out is not viable, as once loaded, this 'Matrix' subject area is considered link reference data ( ie. bank accounts would then just link to a matrix type rate ).
    3. That still doesnt solve my problem, as Im having trouble keeping a handle on an entire matrix so when I am attempting to load it ( and either load the data as a new matrix type, or returning the primary key of an existing matrix ), Im finding it impossible to compare my current matrix against each existing one in the DB, as they are stretched out over multiple tables and can have variable number of rules.
    4. Im interested to know where you see the redundant data being ? Remember that once i have loaded these matrices ( which in the real world an account is linked to ), I just return the primary key of the matrix ( from the RATE table ), and link the account to that ( in another table, ie. AR_RATE_RLTSHP )

    Thanks,
    Adam

  4. #4
    Join Date
    Apr 2004
    Location
    Toronto, Canada
    Posts
    249
    The redundancy occurs, for example, in these lines:

    If Balance Applies to Rate
    ---------- ------------------------
    <1000 All 5%
    1000-2000 0-1000 5%
    1000-2000 1001-2000 4.5%

    The second row is redundant and effectively a copy of the first.
    In essence, for every interval, you have a formula that says

    Interest Amount = fixed_amt + (actual balance - lower limit) * rate

    Or, using symbols:

    A = F + (Bal - Lower) * Rate

    For row 1, the values will be: F=0; lower = 0; rate = 5%
    For row 3, the values will be: F=50 (5 of 1000), lower = 1000, rate = 4.5%.
    Row 2 is not needed.

    This assumes that, in any slab, the interest rate is a linear function. Which it should be.
    Anyway, rereading your mail has left me a bit confused. You talk about a rule table but I do not see that table in the ERD you had attached. Also, what is the MatrixRateId column? Is it a FK from Rate table? Then there is also a rateId in Condition Rule table. So what exactly is happening here?

  5. #5
    Join Date
    Mar 2004
    Posts
    30
    I believe your formula is only correct where there is only two tiering for each balance amount ( ie. your Fixed value ). There can be any number of combinations of balance amount Eligibilities, and within these, any number of tiering Applicabilities ( ie. x% applies to this part of the balance ). Row 1 and Row 2 are therefore not related, as the '<1000' eligibility could have any number of tierings, maybe it looks like this because Row 1 was the only % rate in its eligability range... Pretty sure thats right..

    To answer your other questions :

    The entire matrix is refered to in the 'Rate' table
    The link between a matrix and its rules ( any number of ) is in 'Matrix Rate Structure'
    Each rule lives in 'Condition Rule'
    The Interest Rate that applies to that rule lives in the supertype/subtype relationship of 'Rate' and 'Interest Rate'
    The Eligibility and Applicability ranges live in 'Condition Numeric Range'

    The account is linked to the matrix ( say, via an ACCOUNT_RATE_RLTSHP table ).. Remember, rules, numeric ranges, and interest rates are generic objects and can be reused within multiple parent entities over again... Its the matrix which is the unique object, and that is where I am having trouble keeping a handle on the entire object ( including rules, ranges, and rates within ), to see if that exact matrix already exists or not ( ie. define a new matrix with new primary key, or use an old one ? )

    Thanks,
    Adam

  6. #6
    Join Date
    Dec 2003
    Location
    Canada
    Posts
    710
    This is a very good problem.

    This is one case where using natural keys instead of surrogate keys would be the better choice.

    Eliminate your surrogate key ids and use the natural keys of rate, eligibility and applicability and create a unique composite key on all of these values.

    This will be your matrix:

    Matrix(RateName, EligibilityUpper, EligibilityLower, ApplicabilityUpper, ApplicabilityLower) Unique Index

    Forget the "condition" tables for now.

    Once you have populated this new matrix table think about normalizing further.
    Last edited by certus; 12-23-04 at 22:39.

  7. #7
    Join Date
    Mar 2004
    Posts
    30
    thanks certus, i think we are along the right track now.. your suggestion of creating a denormalised, natural key version of a rule ( ie. interest rate, eligibility upper/lower, applicability upper/lower ), was the path I was heading down to populate initially in a 'staging' area, from where I could populate my numerous normalised tables from, but the original problem remains, and your logic was slightly flawed, in that the above denormalised/natural table would hold a 'Rule', but a Matrix can contain any number of 'Rules'...

    Using the above method, it is fairly easy for my to populate any given 'rule' with the denormalised version of the data, but I have to look at the Matrix in its entirerity ( ie. look at all the rules with in the matrix at once ), to see if the Matrix already exists, or if I have to create a new version...

    So i would populate / lookup from the lowest granular level ( ie. condition, interest rate, rule ), and have a handle on a single rule because it contains defined components ( 2 conditions + interest rate ), but I cant seem to get a handle on the entire matrix ( number of rules )... Can you see anyway I can populate a denormalised structure as per above so I can determine if the entire matrix combination already exists ?

    Maybe the point is that when I am loading, I have a handle on each rule independant on each other, but I do know which account each of these belong too ( as I have no concept of a 'matrix' when Im first loading ), and then try and do a group by on both sides ( ie. when loading, group account/rule combos against existing matrix/rule combos ) ?

    Thanks,
    Adam

  8. #8
    Join Date
    Dec 2003
    Location
    Canada
    Posts
    710

    Post

    Is the problem that my 'Matrix' does not have an identifier, with the only way to identify each Matrix ( which should result in a single row in the 'Rate' table ), is by looking at all the rules within...

    This is your problem.

    The denormalized table I suggested could hold as many Rules as you want. The unique index is on all the columns (a composite key) and would make sure that you have a set of unique rules. But there is no way to differentiate which matrix each rule belongs to.

    You need a Matrix Rate Structure natural key. That would make your staging table complete.

    StageTable(MatrixRateStructureKey, RateName, EligibilityUpper, EligiblityLower, ApplicabilityUpper, ApplicabilityLower)

    Unique Composite Index on all columns.

    You seem to be confusing matrix rate structure identification with interest rate identification.

    You need an interest rate table related to the condition table

    Drop the relationship between the Matrix Rate Structure table and the condition table.
    Last edited by certus; 12-24-04 at 03:37.

  9. #9
    Join Date
    Apr 2004
    Location
    Toronto, Canada
    Posts
    249
    arnzie said:
    I believe your formula is only correct where there is only two tiering for each balance amount ( ie. your Fixed value ).
    That is not quite true. It is true for any number of tiers. Think for a moment and figure it out. If not, I'll post the answer in a day or two.

    And I am still not clear about the problem. Could you please give examples of what a small matrix rate set would look like, and how you would check if the incoming rate structure was possibly a new one? Give example of where the incoming is actually new, and one for the case where it is a copy of an already existing one.

  10. #10
    Join Date
    Dec 2003
    Location
    Canada
    Posts
    710
    Go get em raj!

  11. #11
    Join Date
    Mar 2004
    Posts
    30
    An example data-set :

    If Balance Applies To Rate
    =======================
    <1000 All 5%
    1000-2000 0-500 5%
    1000-2000 501-1000 4.7%
    1000-2000 1001-2000 4.5%
    2000+ 0-500 4.9%
    2000+ 501-1000 4.6%
    2000+ 1001-2000 4.4%
    2000+ 2000+ 4%

    I have no trouble with the concept of loading a Rule ( ie. each line above ) independantly.. Take the 2nd line... I will load NUMERIC_RANGE with the row meaning "0-500", and again for "1000-2000". Say, the first range does not exist already, so I create a surrogate key, and just return the other existing range, so I end up with two surrogate keys, being 1 and 3... I take the interest rate ( 5% ), and load that into RATE and INT_RATE ( supertype/subtype tables ), if it doesnt already exist ( lookup key in this case is just the % plus maybe a description 'Interest Rate' ), otherwise, I return a matching value, say it ends up being RATE_ID = 5

    All this is done from a denormalised table containing all the natural attribute information, but now also has :

    ELG_NUM_RANGE_ID = 1, APP_NUM_RANGE_ID = 3, (INT_)RATE_ID = 5

    No problems... So I do this for each row in this particular matrix independantly of each other, an insert/refer to the RULE table, populating the above combination... Now, I know that the above 8 rows belong together logically, but how do I tie them together in the DB, seeing a "matrix" has no identifier, it is just the logical tying of rows... The problem is that my source data ties the matrix components to each account ( therefore, no source of actual matrices, rather just data saying each account has these rules ), so effectively I am trying to create a system-of-record for matrices ( ie. take each accounts rules, create a matrix out of it, and reuse that matrix for other accounts with same rules )... Therefore, the concept of a defined "matrix" doesnt really exist in the source, we are creating a defined list of matrices as they come in, tied to accounts )...

    Anyway, my point is, maybe if I tie an account to each rule/line in this matrix I am trying to create/refer to, then there is my grouping, and I can create matrices from that ? Basically after I have created each line ( as per above ), I will want to look at all 8 rows at once ( tied together by account-id I suppose ), and look for an existing "matrices", or sets of rows...

    By the way, I still dont see how your formula fits fits for multi-tiered rates and which rows in the above example are obsolete, as I see row 1, rows 2-4 and rows 5-8, being totally independant of each other, as they are depedant on the balance amount...Or do you think they are obsolete within each set ? Anyway, I would really appreciate your help..

    Thanks,
    Adam

  12. #12
    Join Date
    Apr 2004
    Location
    Toronto, Canada
    Posts
    249
    Adam (arnzie) wrote:
    If Balance Applies To Rate
    =======================
    <1000 All 5%
    1000-2000 0-500 5%
    1000-2000 501-1000 4.7%
    1000-2000 1001-2000 4.5%
    2000+ 0-500 4.9%
    2000+ 501-1000 4.6%
    2000+ 1001-2000 4.4%
    2000+ 2000+ 4%
    This helps me understand the data better. Just to see if I have understood this properly, this rate structure is available to one or more customers, another set of customers might have another rate structure. The question is to identify if an incoming rate structure already exists. And you need a fast algorithm to do this.

    One way to do this is to assign an ID to this set of 8 rules. Then build a hash map with the key values being the rates and levels.

    For example, with this given data, you could create a key for each row like this:
    Code:
     0-1000:0-1000:0.05
    where the first range is the balance range, the second range is the balance amount to which the rate is to be applied, and the third number is the rate. Now, instead of just using this row, combine the keys for the 8 related rows like this:

    HTML Code:
     0-1000:0-1000:0.05::1000-2000:0-500:0.05::501-1000:0.047::1000-2000:1001-2000:0.044::2000-*:0-500:0.049:2000-*:501-1000:0.046::2000-*:1001:2000:0.044::2000-*:2001-*:0.04
    Map this to an ID.
    As you get data, you create this composite key. Be sure to preserve the order (ascending or descending) within the eligibility range and within the applicability range.

    Once the key is built, check if it already exists, and proceed accordingly.

    Your matrix rate structure table may have to be changed. I would have it thus:

    HTML Code:
     Table Matrix_Rate_Structure_Detail( 
    	 matrix_rate_structure_id, 
    	 elig_low_value,	 elig_high_value, 
    	 applicable_low_value, applicable_high_value, 
    	 rate) 
    with the whole row being the primary key.

    This way, the intent of the design seems clearer.

    You may also add a matrix_rate_structure table with just the id and some description and link it to the matrix_rate_structure_detail table.

    The advantage of the rate structure table is that if you have any rules about overlapping values, either within the applicability range or within the eligibility range, it is possible to implement them using data from this table.

    I see that you already have the condition_rule table, which is almost like the matrix_raate_structure_detail table that I mentioned, except that we have the matrix_rate_structure_id in the above design. And I think in your ER diagram, the matrix_Rate_structure table should not be linked to the Rate table, only to the condition_rule table.

    Hope that helps.

    Ravi
    Last edited by rajiravi; 12-29-04 at 10:34.

  13. #13
    Join Date
    Mar 2004
    Posts
    30
    thanks for the detailed reply rajiravi, pretty sure youve put me on the right track, ill have a look at this over the weekend and see if i can apply your solution to my exact situation... for the record though, the reason why it is modelled in the fashion that Matrix Rate Structure is linked to Rate is because, Rules can belong to many matrices, therefore Matrix Rate Structure ( realising now its is probably badly named ), does not hold 1 record per matrix, but the relationship between Matrix ( stored in RATE ), and Rules ( stored in CONDITION_RULE )..

    Therefore, it just resolves the many to many relationship between matrix and rule, whats why there is a link back to RATE ( the Matrix ).. And we are then linking the Matrix to the arrangement...

    Anyway thanks for your help...
    Adam

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •