Results 1 to 14 of 14
  1. #1
    Join Date
    Jun 2015
    Posts
    8
    Provided Answers: 1

    Answered: Permutation & Combination between source & lookup table

    Hi,

    For the below scenario, can someone suggest on how this can be achieved by writing a procedure in SQL Server DB.

    Remember that if there is a non-null value in the benchmark that is not equal to the equivalent criteria in the Hosting input then there is no match. Therefore those with region = IN would not match any of the given benchmarks. It is best fit as long as the non-matching values are to null value benchmark criteria.

    Hosting_Site (Source File)
    YEAR MONTH HOSTING CLASS REGION HOSTING COST
    2015 1 A UK 75
    2015 1 A IN 80
    2015 1 B IN 60
    2015 1 C US 100

    BENCHMARK (Lookup Table)
    Row Year Month Hosting class Region BENCHMARK COST
    1 2015 1 A US 100
    2 2015 1 B US 200
    3 2015 1 A UK 50
    4 2015 1 A null 70

    STANDARD CLASS COST (If the criteria between source file and the lookup file doesn't match then have to load this cost)
    Row Year Hosting class BENCHMARK COST
    1 2015 A 22
    2 2015 B 33

    Then result should be:

    FACT LOAD:

    YEAR MONTH HOSTING CLASS REGION HOSTING COST BENCHMARK COST
    2015 1 A UK 75 50
    2015 1 A IN 80 70
    2015 1 B IN 60 33 (Since for B class there is no region = IN or = null the Standard Class Cost must be used)
    2015 1 C US 100 ??? (Reject record since there is no default value in Standard Class Cost or in Benchmark)

    Thanks,
    Abdulrasheed.

  2. Best Answer
    Posted by shaikaba

    "Somehow, have modified the logic by adding few more conditions and it's working for the given data.

    SELECT DISTINCT h.YEAR, h.MONTH, h.[HOSTING CLASS], h.REGION, h.[HOSTING COST] AS 'HOSTING COST'
    , Coalesce(b1.[BENCHMARK COST], b2.[BENCHMARK COST], s.[BENCHMARK COST]) AS 'BENCHMARK COST'
    FROM DR_DW_BODS40.dbo.PROJ_TRACK_STG_SAMPLE AS h
    LEFT OUTER JOIN DR_DW_BODS40.dbo.PROD_BM_STG_SAMPLE AS b1
    ON (b1.Region = h.REGION
    AND b1.[HOSTING CLASS] = h.[HOSTING CLASS]
    AND b1.Year = h.YEAR
    AND b1.Month = h.MONTH
    AND b1.PROJECT_TYPE = h.PROJECT_TYPE
    AND b1.CHANNEL_NAME = h.CHANNEL_NAME)
    LEFT OUTER JOIN DR_DW_BODS40.dbo.PROD_BM_STG_SAMPLE AS b2
    ON (b2.Region IS NULL or b2.REGION=h.REGION)
    and(b2.[HOSTING CLASS]is null or b2.[HOSTING CLASS] = h.[HOSTING CLASS])
    and (b2.YEAR is null or b2.Year= h.YEAR)
    and (b2.Month is null or b2.MONTH = h.MONTH)
    AND (b2.PROJECT_TYPE = h.PROJECT_TYPE or b2.PROJECT_TYPE is NULL)
    AND (b2.CHANNEL_NAME = h.CHANNEL_NAME OR b2.CHANNEL_NAME IS NULL)
    LEFT OUTER JOIN DR_DW_BODS40.dbo.STANDING_CLASS_COST_SAMPLE AS s
    ON (s.[Hosting class] = h.[HOSTING CLASS]
    AND s.Year = h.YEAR)
    ORDER BY h.YEAR, h.MONTH, h.[HOSTING CLASS], h.[HOSTING COST];

    But for the actual data where the source data is of 2334 and the lookup data is of 2863 records, it's trying to fetch too many records.
    Still working on it.

    Thanks,
    Abdulrasheed."


  3. #2
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    Code:
    CREATE PROCEDURE shaikaba
    AS
    WITH Hosting_Site AS (
    SELECT 2015 AS 'YEAR', 1 AS 'MONTH', 'A' AS 'HOSTING CLASS', 'UK' AS 'REGION', 75 AS 'COST'
    UNION SELECT 2015, 1, 'A', 'UK',  75
    UNION SELECT 2015, 1, 'A', 'IN',  80
    UNION SELECT 2015, 1, 'B', 'IN',  60
    UNION SELECT 2015, 1, 'C', 'US', 100
    ), BENCHMARK AS (
    SELECT 1 AS 'Row', 2015 AS 'Year', 1 AS 'Month', 'A' AS 'Hosting class', 'US' AS 'Region', 100 AS 'BENCHMARK COST'
    UNION SELECT 2, 2015, 1, 'B', 'US', 200
    UNION SELECT 3, 2015, 1, 'A', 'UK',  50
    UNION SELECT 4, 2015, 1, 'A', null,  70
    ), STANDARD_CLASS_COST AS (
    SELECT 1 AS 'Row', 2015 AS 'Year', 'A' AS 'Hosting class', 22 AS 'BENCHMARK COST'
    UNION SELECT 2, 2015, 'B', 33
    )
    SELECT h.YEAR, h.MONTH, h.[HOSTING CLASS], h.REGION, h.COST AS 'HOSTING COST'
    ,  Coalesce(b1.[BENCHMARK COST], b2.[BENCHMARK COST], s.[BENCHMARK COST]) AS 'BENCHMARK COST'
       FROM Hosting_Site AS h
       LEFT JOIN BENCHMARK AS b1
          ON (b1.Region          = h.REGION
    	  AND b1.[Hosting class] = h.[HOSTING CLASS]
    	  AND b1.Year            = h.YEAR
    	  AND b1.Month           = h.MONTH)
       LEFT JOIN BENCHMARK AS b2
          ON (b2.Region IS NULL
    	  AND b2.[Hosting class] = h.[HOSTING CLASS]
    	  AND b2.Year            = h.YEAR
    	  AND b2.Month           = h.MONTH)
       LEFT JOIN STANDARD_CLASS_COST AS s
          ON (s.[Hosting class]  = h.[HOSTING CLASS]
    	  AND s.Year             = h.YEAR)
       ORDER BY h.YEAR, h.MONTH, h.[HOSTING CLASS], h.COST
    GO
    
    EXECUTE shaikaba;
    GO
    
    DROP PROCEDURE shaikaba;
    GO
    -PatP
    In theory, theory and practice are identical. In practice, theory and practice are unrelated.

  4. #3
    Join Date
    Jun 2015
    Posts
    8
    Provided Answers: 1
    Thanks for your reply Pat.
    Have one more question here. If there is no consistency in the column values of BENCHMARK, then how those records can be compared with HOSTING SITES.
    Can we have a loop which can compare values and also in future if any new column got added to BENCHMARK then those values should also get compared.

    Please advise on how this can be achieved.

    Regards,
    Abdulrasheed.

  5. #4
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    If there is no consistency, I'm not sure how that would work. You mean that you want the year 1763 and 2055 and/or regions 'XX' and 'YY" to be considered equivalent under some circumstances???

    -PatP
    In theory, theory and practice are identical. In practice, theory and practice are unrelated.

  6. #5
    Join Date
    Jun 2015
    Posts
    8
    Provided Answers: 1
    Only if there are columns with a value like NULL then, that we should consider and should get compared with each column of HOSTING SITES(This table will never have NULL values).
    BENCHMARK may get NULL values. Basically, except BENCHMARK COST all columns should compared with HOSTING SITES table column there we should consider NULL of BENCHMARK with HOSTING SITES but not others. When no records are matching it should go to STANDARD COST. Even if it is not matching here then it should populate as NULL.

    Your query work good but it should look for combinations of source by considering NULL but avoiding NON MATCHING records.

    Thanks,
    Abdulrasheed.
    Last edited by shaikaba; 06-23-15 at 08:28.

  7. #6
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    Quote Originally Posted by shaikaba View Post
    Your query work good but it should look for combinations of source by considering NULL but avoiding NON MATCHING records.
    I think that it does that. If you disagree, please supply some sample data and the results that you expect from that sample data.

    -PatP
    In theory, theory and practice are identical. In practice, theory and practice are unrelated.

  8. #7
    Join Date
    Jun 2015
    Posts
    8
    Provided Answers: 1
    Hi Pat,

    Here is the scenario with data:

    1) All column values should match; Fetch COST from BENCHMARK
    2) Any one column is not matching then fetch COST from STANDDARD CLASS COST
    3) Any one column is not matching & also if it is not matching in STANDDARD CLASS COST then reject
    4) All column values of HOSTING SITES are matching with BENCHMARK table except one column value which is NULL in BENCHMARK table. In this case the cost should come from BENCHMARK (Consider the 4th record from BENCHMARK with 2nd record of HOSTING SITES)
    5) From the point 4, we may have multiple columns have NULL values and even in that case we should get the cost from BENCHMARK table(If the other column values are matching) - (Consider the 1st record from BENCHMARK with 5th record of HOSTING SITES)


    FYI:
    HOSTING SITEs table will never have NULL records
    BENCHMARK table columns might come with NULL records

    HOSTING SITES DATA:

    YEAR MONTH HOSTING_CLASS REGION HOSTING_COST CHANNEL_NAME PROJECT_TYPE
    2015 1 A UK 75 ABC 10-SUBMITTED
    2015 1 A IN 80 PQR 10-CANCELED
    2015 1 B IN 60 XYZ 10-PENDING
    2015 1 C US 100 DEF 10-SUBMITTED
    2015 1 A US 210 GHI 10-COMPLETED

    BENCHMARK DATA:

    YEAR MONTH HOSTING_CLASS REGION HOSTING_COST CHANNEL_NAME PROJECT_TYPE
    2015 1 A US 100 NULL NULL
    2015 1 B US 200 XYZ 10-PENDING
    2015 1 A UK 50 ABC 10-SUBMITTED
    2015 1 A NULL 70 PQR 10-CANCELED

    Hope it is clear.
    Ignore if there are any * in the given data.

    Thanks,
    Abdulrasheed.

  9. #8
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    What did you expect to get as results? My results are:
    Code:
    YEAR        MONTH       HOSTING_CLASS REGION HOSTING COST BENCHMARK COST
    ----------- ----------- ------------- ------ ------------ --------------
    2015        1           A             UK     75           50
    2015        1           A             IN     80           70
    2015        1           A             US     210          100
    2015        1           B             IN     60           33
    2015        1           C             US     100          NULL
    When I run:
    Code:
    CREATE PROCEDURE shaikaba
    AS
    WITH Hosting_Site AS (
    SELECT *
       FROM (VALUES
          (2015, 1, 'A', 'UK',  75, 'ABC', '10-SUBMITTED'),
          (2015, 1, 'A', 'IN',  80, 'PQR', '10-CANCELED'),
          (2015, 1, 'B', 'IN',  60, 'XYZ', '10-PENDING'),
          (2015, 1, 'C', 'US', 100, 'DEF', '10-SUBMITTED'),
          (2015, 1, 'A', 'US', 210, 'GHI', '10-COMPLETED')
       ) AS z (YEAR, MONTH, HOSTING_CLASS, REGION, HOSTING_COST, CHANNEL_NAME, PROJECT_TYPE)
    ), BENCHMARK AS (
    SELECT *
       FROM (VALUES
       (2015, 1, 'A', 'US', 100, NULL, NULL),
       (2015, 1, 'B', 'US', 200, 'XYZ', '10-PENDING'),
       (2015, 1, 'A', 'UK',  50, 'ABC', '10-SUBMITTED'),
       (2015, 1, 'A', NULL,  70, 'PQR', '10-CANCELED')
       ) AS z (YEAR, MONTH, HOSTING_CLASS, REGION, HOSTING_COST, CHANNEL_NAME, PROJECT_TYPE)
    ), STANDARD_CLASS_COST AS (
    SELECT 1 AS 'Row', 2015 AS 'Year', 'A' AS 'Hosting class', 22 AS 'BENCHMARK COST'
    UNION SELECT 2, 2015, 'B', 33
    )
    SELECT h.YEAR, h.MONTH, h.HOSTING_CLASS, h.REGION, h.HOSTING_COST AS 'HOSTING COST'
    ,  Coalesce(b1.HOSTING_COST, b2.HOSTING_COST, s.[BENCHMARK COST]) AS 'BENCHMARK COST'
       FROM Hosting_Site AS h
       LEFT JOIN BENCHMARK AS b1
          ON (b1.Region          = h.REGION
    	  AND b1.HOSTING_CLASS = h.HOSTING_CLASS
    	  AND b1.Year            = h.YEAR
    	  AND b1.Month           = h.MONTH)
       LEFT JOIN BENCHMARK AS b2
          ON (b2.Region IS NULL
    	  AND b2.HOSTING_CLASS = h.HOSTING_CLASS
    	  AND b2.Year            = h.YEAR
    	  AND b2.Month           = h.MONTH)
       LEFT JOIN STANDARD_CLASS_COST AS s
          ON (s.[Hosting class]  = h.HOSTING_CLASS
    	  AND s.Year             = h.YEAR)
       ORDER BY h.YEAR, h.MONTH, h.HOSTING_CLASS, h.HOSTING_COST
    GO
    
    EXECUTE shaikaba;
    GO
    
    DROP PROCEDURE shaikaba;
    GO
    -PatP
    In theory, theory and practice are identical. In practice, theory and practice are unrelated.

  10. #9
    Join Date
    Jun 2015
    Posts
    8
    Provided Answers: 1
    Hi,

    Concern is on the 5th scenario(From the list of points) which is with below entry from

    HOSTING SITES:

    YEAR MONTH HOSTING CLASS REGION HOSTING COST CHANNEL_NAME PROJECT_TYPE
    2015 1 A US 210 GHI 10-COMPLETED

    with

    BENCHMARK:

    YEAR MONTH HOSTING CLASS REGION BENCHMARK COST CHANNEL_NAME PROJECT_TYPE
    2015 1 A US 100 NULL NULL

    In this case, we should get BENCHMARK COST as 100 instead of 210 even if CHANNEL_NAME & PROJECT_TYPE are NULL.

    Thanks,
    Abdulrasheed.

  11. #10
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    Quote Originally Posted by shaikaba View Post
    In this case, we should get BENCHMARK COST as 100 instead of 210 even if CHANNEL_NAME & PROJECT_TYPE are NULL.
    Please recheck the results. The line with a HOSTING_COST of 210 has a BENCHMARK_COST of 100.

    -PatP
    In theory, theory and practice are identical. In practice, theory and practice are unrelated.

  12. #11
    Join Date
    Jun 2015
    Posts
    8
    Provided Answers: 1
    Thanks for this Pat. I ran the query at my end and i could see the result now.
    My only concern is on this condition b2.Region IS NULL. I thought i have to use loop to compare the data.

    in BENCHMARK table there is no consistency in values. Anytime and in any column there will be a NULL record then in that case, it should compare it with HOSTING SITES as a valid record and in remaining cases it should go to STANDARD CLASS.

    Will apply your logic on real time data and will let you know.
    Hopefully it should work.

    Thanks again for your help.

  13. #12
    Join Date
    Jun 2015
    Posts
    8
    Provided Answers: 1
    Hi Pat,

    what if we have data like this? Have tried for different scenarios using your query but we didn't got the result expected.

    HOSTING SITES DATA: (Have added few records where we see a complexity in writing a query)

    YEAR MONTH HOSTING CLASS REGION CHANNEL_NAME PROJECT_TYPE HOSTING COST
    2015 1 A UK ABC 10-SUBMITTED 75
    2015 1 A IN PQR 10-CANCELED 80
    2015 1 B IN XYZ 10-PENDING 60
    2015 1 C US DEF 10-SUBMITTED 100
    2015 1 A US GHI 10-COMPLETED 210
    2015 1 A GB FGH 10-CANCELED 56

    BENCHMARK DATA:

    YEAR MONTH HOSTING CLASS REGION CHANNEL_NAME PROJECT_TYPE BENCHMARK COST
    2015 1 A US NULL NULL 100
    2015 1 B US XYZ 10-PENDING 200
    2015 1 A UK ABC 10-SUBMITTED 50
    2015 1 NULL GB PQR 10-CANCELED 70
    2015 1 A NULL FGH 10-CANCELED 122

    Please advise.

    Thanks,
    Abdulrasheed.

  14. #13
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    What results did you expect?

    -PatP
    In theory, theory and practice are identical. In practice, theory and practice are unrelated.

  15. #14
    Join Date
    Jun 2015
    Posts
    8
    Provided Answers: 1
    Somehow, have modified the logic by adding few more conditions and it's working for the given data.

    SELECT DISTINCT h.YEAR, h.MONTH, h.[HOSTING CLASS], h.REGION, h.[HOSTING COST] AS 'HOSTING COST'
    , Coalesce(b1.[BENCHMARK COST], b2.[BENCHMARK COST], s.[BENCHMARK COST]) AS 'BENCHMARK COST'
    FROM DR_DW_BODS40.dbo.PROJ_TRACK_STG_SAMPLE AS h
    LEFT OUTER JOIN DR_DW_BODS40.dbo.PROD_BM_STG_SAMPLE AS b1
    ON (b1.Region = h.REGION
    AND b1.[HOSTING CLASS] = h.[HOSTING CLASS]
    AND b1.Year = h.YEAR
    AND b1.Month = h.MONTH
    AND b1.PROJECT_TYPE = h.PROJECT_TYPE
    AND b1.CHANNEL_NAME = h.CHANNEL_NAME)
    LEFT OUTER JOIN DR_DW_BODS40.dbo.PROD_BM_STG_SAMPLE AS b2
    ON (b2.Region IS NULL or b2.REGION=h.REGION)
    and(b2.[HOSTING CLASS]is null or b2.[HOSTING CLASS] = h.[HOSTING CLASS])
    and (b2.YEAR is null or b2.Year= h.YEAR)
    and (b2.Month is null or b2.MONTH = h.MONTH)
    AND (b2.PROJECT_TYPE = h.PROJECT_TYPE or b2.PROJECT_TYPE is NULL)
    AND (b2.CHANNEL_NAME = h.CHANNEL_NAME OR b2.CHANNEL_NAME IS NULL)
    LEFT OUTER JOIN DR_DW_BODS40.dbo.STANDING_CLASS_COST_SAMPLE AS s
    ON (s.[Hosting class] = h.[HOSTING CLASS]
    AND s.Year = h.YEAR)
    ORDER BY h.YEAR, h.MONTH, h.[HOSTING CLASS], h.[HOSTING COST];

    But for the actual data where the source data is of 2334 and the lookup data is of 2863 records, it's trying to fetch too many records.
    Still working on it.

    Thanks,
    Abdulrasheed.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •