Results 1 to 14 of 14
  1. #1
    Join Date
    Oct 2010
    Location
    Atlanta, GA
    Posts
    211
    Provided Answers: 1

    Question Unanswered: Classifying Customers' Order Method Preference

    I am trying to classify a customer's order method preference based on their history. My source table has every order, order method, date, etc.

    The logic is:
    • If only one order method, that is their preference
    • >1 order method, the majority is their preference
    • if 50/50 split, the order method with the mose recent order is the preference


    I then created a query to group by the customer, order method, and max(date):
    http://i.imgur.com/GKAWMTo.png

    I am having trouble though, creating a query that applies the above logic and outputs this: http://i.imgur.com/eYsCKXX.png

  2. #2
    Join Date
    Jan 2013
    Posts
    354
    Provided Answers: 1

    Please follow Netiquette

    Please post DDL, so that people do not have to guess what the keys, constraints, Declarative Referential Integrity, data types, etc. in your schema are. Learn how to follow ISO-11179 data element naming conventions and formatting rules. Temporal data should use ISO-8601 formats. Code should be in Standard SQL as much as possible and not local dialect.

    This is minimal polite behavior on SQL forums. You have posted before and know better!

  3. #3
    Join Date
    Oct 2010
    Location
    Atlanta, GA
    Posts
    211
    Provided Answers: 1
    Quote Originally Posted by Celko View Post
    Please post DDL, so that people do not have to guess what the keys, constraints, Declarative Referential Integrity, data types, etc. in your schema are. Learn how to follow ISO-11179 data element naming conventions and formatting rules. Temporal data should use ISO-8601 formats. Code should be in Standard SQL as much as possible and not local dialect.

    This is minimal polite behavior on SQL forums. You have posted before and know better!
    I was trying to keep it simple and get a conceptual answer rather than spend paragraphs explaining our complex data layout. In addition, I am an analyst and have no say on the data structure, schema, etc.

  4. #4
    Join Date
    Jan 2007
    Location
    UK
    Posts
    11,434
    Provided Answers: 10
    Code:
    CREATE TABLE #t (
       customer_number int
     , order_date      datetime
     , order_method    char(5)
     , CONSTRAINT xyz UNIQUE (customer_number, order_date)
    );
    
    
    INSERT INTO #t (customer_number, order_date, order_method)
      VALUES (9, '20130101', 'Phone') -- only one order
           , (3, '20130202', 'Web')
           , (3, '20130303', 'Phone')
           , (3, '20130404', 'Web')   -- majority web
           , (7, '20130505', 'Phone')
           , (7, '20130606', 'Web')   -- tie-break, latest method
    ;
    
    ; WITH x AS (
      SELECT customer_number
           , order_method
           , Max(order_date) As latest_order_date
           , Count(*) As number_of_orders
           , Dense_Rank() OVER (PARTITION BY customer_number ORDER BY Count(*) DESC) As most_orders
           , Row_Number() OVER (PARTITION BY customer_number ORDER BY Max(order_date) DESC) As latest_order
      FROM   #t
      GROUP
          BY customer_number
           , order_method
    )
    SELECT customer_number
         , order_method
         , latest_order_date
         , number_of_orders
         , most_orders
         , latest_order
    FROM   x
    WHERE  most_orders = 1
    AND    latest_order = 1
    
    GO
    DROP TABLE #t
    Any good?
    George
    Home | Blog

  5. #5
    Join Date
    Oct 2010
    Location
    Atlanta, GA
    Posts
    211
    Provided Answers: 1
    Thanks gvee. i have been playing around with those window functions trying to figure out which was the best way to go. Thanks for the direction. I'll let you know how it works...

  6. #6
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,483
    gvee,

    If order_date were slightly different like
    Code:
    INSERT INTO #t (customer_number, order_date, order_method)
      VALUES (9, '20130101', 'Phone') -- only one order
           , (3, '20130202', 'Web')
           , (3, '20130403', 'Phone')
           , (3, '20130304', 'Web')   -- majority web
           , (7, '20130505', 'Phone')
           , (7, '20130606', 'Web')   -- tie-break, latest method
    ;
    Note: The inserted data were changed to make sense the following issue.

    In customer_number = 3,
    "most_orders = 1" might be Web
    and "latest_order = 1" might be Phone.

    So, no matched order_method of customer_number = 3 which satisfied
    Code:
    WHERE  most_orders = 1
    AND    latest_order = 1

    By the way,
    I think that Orders column(number of order) should be added in the source table.
    Last edited by tonkuma; 11-06-13 at 12:28. Reason: Modify the data.

  7. #7
    Join Date
    Jan 2007
    Location
    UK
    Posts
    11,434
    Provided Answers: 10
    Hey Tonkuma,

    Just given that a try and I am still getting what I consider to be the expected results?
    George
    Home | Blog

  8. #8
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,483
    Sorry,

    the data should be
    Code:
    INSERT INTO #t (customer_number, order_date, order_method)
      VALUES (9, '20130101', 'Phone') -- only one order
           , (3, '20130202', 'Web')
           , (3, '20130403', 'Phone')
           , (3, '20130304', 'Web')   -- majority web
           , (7, '20130505', 'Phone')
           , (7, '20130606', 'Web')   -- tie-break, latest method
    ;

  9. #9
    Join Date
    Jan 2007
    Location
    UK
    Posts
    11,434
    Provided Answers: 10
    Bingo!

    Okay, so we have to stagger the logic:
    Code:
    ; WITH x AS (
      SELECT customer_number
           , order_method
           , Max(order_date) As latest_order_date
           , Count(*) As number_of_orders
           , Dense_Rank() OVER (PARTITION BY customer_number ORDER BY Count(*) DESC) As most_orders
      FROM   #t
      GROUP
          BY customer_number
           , order_method
    )
    , y AS (
      SELECT customer_number
           , order_method
           , latest_order_date
           , number_of_orders
           , most_orders
           , Row_Number() OVER (PARTITION BY customer_number ORDER BY latest_order_date DESC) As latest_order
      FROM   x
      WHERE  most_orders = 1
    )
    SELECT customer_number
         , order_method
         , latest_order_date
         , number_of_orders
         , most_orders
         , latest_order
    FROM   y
    WHERE  latest_order = 1
    ;
    Phew!
    George
    Home | Blog

  10. #10
    Join Date
    Oct 2010
    Location
    Atlanta, GA
    Posts
    211
    Provided Answers: 1
    Ok, I thought of something. Can't I just sort the table by orders DESC and the recent order date DESC, and use Row_number() and grab the value where row_number=1?

    Code:
    SELECT CustKey
       ,Orders
       ,OrdMethDesc
       ,RecentOrdDtKey
       , row_number() over (partition by custkey order by orders DESC, Recentorddtkey DESC) as ordidal
    FROM myTable
    this way, the order method with the most orders will be ordinal 1, if there is a tie, then the tied order method with most recent date will be ordinal 1.
    Last edited by clawlan; 11-06-13 at 18:09.

  11. #11
    Join Date
    Jan 2013
    Posts
    354
    Provided Answers: 1
    Since you will not post DDL and want to make us guess, how can we be sure that what we invent will help you? Even a skeleton would have helped. Here is my guess.

    CREATE TABLE Orders
    (order_nbr INTEGER NOT NULL PRIMARY KEY,
    customer_nbr INTEGER NOT NULL
    REFERENCES Customers(customer_nbr),
    order_date DATE DEFAULT CURRENT_TIMESTAMP NOT NULL,
    order_method CHAR(1) NOT NULL
    CHECK (order_method IN ('I', 'P')) - is this all?
    );


    INSERT INTO Orders
    VALUES (12, 9, '2013-01-01', 'P'), -- only one order
    (13, 3, '2013-02-02', 'I'),
    (14, 3, '2013-03-03', 'P'),
    (15, 3, '2013-04-04', 'I'), -- majority web
    (16, 7, '2013-05-05', 'P'),
    (17, 7, '2013-06-06', 'I');-- tie-break, latest method

    What we would like is a VIEW using the FIRST() aggregate. But T-SQL does not have this yet. Strangely, ACCESS does!

    CREATE VIEW Order_Preferences (customer_nbr, order_method)
    AS
    WITH Method_Tallies
    AS
    (SELECT customer_nbr, order_method,
    COUNT(*) AS method_cnt,
    MAX(order_date) AS last_order_date
    FROM Orders
    GROUP BY customer_nbr, order_method)

    SELECT customer_nbr,
    FIRST(order_method)
    OVER (PARTITION BY customer_nbr
    ORDER BY method_cnt DESC, last_order_date ASC)
    FROM Method_Tallies;

    To fake it, we need more work:

    WITH Method_Tallies
    AS
    (SELECT customer_nbr, order_method,
    COUNT(*) AS method_cnt,
    MAX(order_date) AS last_order_date
    FROM Orders
    GROUP BY customer_nbr, order_method),

    Order_Methods_Frequency
    AS
    (SELECT customer_nbr, order_method, method_cnt, last_order_date,
    ROW_NUMBER()
    OVER (PARTITION BY customer_nbr
    ORDER BY method_cnt DESC, last_order_date DESC)
    AS freq_order_method
    FROM Method_Tallies)

    SELECT customer_nbr, order_method
    FROM Order_methods_Frequency
    WHERE freq_order_method = 1;

  12. #12
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,483
    Quote Originally Posted by clawlan View Post
    Ok, I thought of something. Can't I just sort the table by orders DESC and the recent order date DESC, and use Row_number() and grab the value where row_number=1?

    Code:
    SELECT CustKey
       ,Orders
       ,OrdMethDesc
       ,RecentOrdDtKey
       , row_number() over (partition by custkey order by orders DESC, Recentorddtkey DESC) as ordidal
    FROM myTable
    this way, the order method with the most orders will be ordinal 1, if there is a tie, then the tied order method with most recent date will be ordinal 1.
    I agree with you.
    "partition by custkey order by orders DESC, Recentorddtkey DESC" must be a key point.

    With the precondition,
    it might be neccesary to use subquery, to specify "row_number() over(...) = 1", like ...
    Code:
    SELECT customer
         , order_method
         , order_number
         , order_date
     FROM  (SELECT t.*
                 , ROW_NUMBER()
                      OVER( PARTITION BY customer
                                ORDER BY order_number DESC
                                       , order_date   DESC
                          ) AS r_num
             FROM  source_table AS t
           )
     WHERE r_num = 1
    ;
    Note: column names were somewhat different,
    but I thought that you might be able to understand what was the intention of the query example.

  13. #13
    Join Date
    Oct 2010
    Location
    Atlanta, GA
    Posts
    211
    Provided Answers: 1
    Quote Originally Posted by tonkuma View Post
    I agree with you.
    "partition by custkey order by orders DESC, Recentorddtkey DESC" must be a key point.

    With the precondition,
    it might be neccesary to use subquery, to specify "row_number() over(...) = 1", like ...
    Code:
    SELECT customer
         , order_method
         , order_number
         , order_date
     FROM  (SELECT t.*
                 , ROW_NUMBER()
                      OVER( PARTITION BY customer
                                ORDER BY order_number DESC
                                       , order_date   DESC
                          ) AS r_num
             FROM  source_table AS t
           )
     WHERE r_num = 1
    ;
    Yes, Instead of a subquery, I used a CTE

    Code:
    ;with x as(
          SELECT CustKey,Orders,OrdMethDesc as Preference,RecentOrdDtKey, row_number() over (partition by custkey order by orders DESC, Recentorddtkey DESC) as ordinal
          FROM myTable1
                )
        SELECT * 
        INTO myTable2
        FROM x WHERE ordinal = 1

  14. #14
    Join Date
    Oct 2010
    Location
    Atlanta, GA
    Posts
    211
    Provided Answers: 1
    Quote Originally Posted by Celko View Post
    Since you will not post DDL and want to make us guess, how can we be sure that what we invent will help you? Even a skeleton would have helped.
    i post DDL next time. thanks for your help.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •