Results 1 to 3 of 3
  1. #1
    Join Date
    Jul 2012
    Posts
    5

    Unanswered: Self-join to return latest record, eliminating duplicates?

    Hello

    Here's what I'm working with:

    Code:
    CREATE TABLE IF NOT EXISTS `rate` (
      `id` int(11) NOT NULL AUTO_INCREMENT,
      `client_company` int(11) DEFAULT NULL,
      `client_group` int(11) DEFAULT NULL,
      `client_contact` int(11) DEFAULT NULL,
      `role` int(11) DEFAULT NULL,
      `date_from` datetime DEFAULT NULL,
      `hourly_rate` decimal(18,2) DEFAULT NULL,
      PRIMARY KEY (`id`)
    ) ENGINE=MyISAM  DEFAULT CHARSET=utf8 COLLATE=utf8_unicode_ci;
    
    INSERT INTO `rate` (`id`, `client_company`, `client_group`, 
                        `client_contact`, `role`, `date_from`, `hourly_rate`) 
    VALUES
    (4, NULL, NULL, NULL, 3, '2012-07-30 14:48:16', 115.00),
    (5, 3, NULL, NULL, 3, '2012-07-30 14:51:38', 110.00),
    (6, 3, NULL, NULL, 3, '2012-07-30 14:59:20', 112.00);
    This table stores chargeout rates for clients; the idea being that, when looking for the correct rate for a job role, we'd first look for a rate matching the given role and client contact, then if no rate was found, would try to match the role and the client group (or 'department'), then the client company, and finally looking for a global rate for just the role itself. Fine.

    Rates can change over time, so the table may contain multiple entries matching any given combination of role, company, group and client contact: I want a query that will only return me the latest one for each distinct combination.

    Could someone please explain why the query below is returning all three of the records above and not, as I want it to, only the records with IDs 4 and 6. It shouldn't return row ID 5, as that's a duplicate of row 6 with an earlier date.

    Is it something to do with my trying to join based on columns containing NULL?

    Code:
    SELECT
        rate.*,
        newest.id
    FROM rate
        LEFT JOIN rate AS newest ON(
            rate.client_company = newest.client_company
            AND rate.client_contact = newest.client_contact
            AND rate.client_group = newest.client_group
            AND rate.role= newest.role
            AND newest.date_from > rate.date_from
        )
    WHERE newest.id IS NULL

  2. #2
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    Quote Originally Posted by Winter View Post
    Is it something to do with my trying to join based on columns containing NULL?
    yup

    try using COALESCE on those columns
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  3. #3
    Join Date
    Jul 2012
    Posts
    5
    I owe you a beer. Thank you.

    Code:
    SELECT
        rate.*,
        newest.id
    FROM rate
        LEFT JOIN rate AS newest ON(
            COALESCE(rate.client_company,1) = COALESCE(newest.client_company,1)
            AND COALESCE(rate.client_contact,1) = COALESCE(newest.client_contact,1)
            AND COALESCE(rate.client_group,1) = COALESCE(newest.client_group,1)
            AND COALESCE(rate.role,1) = COALESCE(newest.role,1)
            AND newest.date_from > rate.date_from
        )
    WHERE newest.id IS NULL

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •