Results 1 to 4 of 4
  1. #1
    Join Date
    Jan 2010
    Posts
    2

    Unanswered: Avoiding Distinct!

    The SQL:

    Code:
    SELECT c.cloc_name, Count(app_id) as "Applications",
    
    --PASS
    SUM(
      CASE a.app_decision 
        WHEN 'Pass' THEN 1
        ELSE 0
      END
    ) as "Pass"
    
    FROM reports r
    INNER JOIN applications a ON r.report_appid = a.app_id
    INNER JOIN companylocations c ON c.cloc_id = r.report_clocid 
    WHERE c.cloc_companyid = 116 
    AND
    date(r.report_timestamp) >= '10/1/2009' 
    AND 
    date(r.report_timestamp) <= '10/31/2009'
     
    GROUP BY c.cloc_name
    ORDER BY c.cloc_name
    Here my issue:
    My Count(app_id) is coming back with duplicate values, as is the "Pass" column. If I make "Applications" = Count(DISTINCT app_id), that fixes that issue. But how do I make the "Pass" variable distinct? I'm confused when I start to use aggragate functions with distinct. Any help would be greatly apprciated. My table builds are below. Cheers, and thank you.

    Application table
    Code:
    CREATE TABLE applications
    (
      app_id serial NOT NULL,
      app_timestamp timestamp with time zone DEFAULT now(),
      app_userid integer,
      app_decision character varying(200),
      app_decisiontree text,
      app_scorecard text,
      CONSTRAINT "applicationsPrimary" PRIMARY KEY (app_id)
    Companylocations
    Code:
    CREATE TABLE companylocations
    (
      cloc_id serial NOT NULL,
      cloc_companyid integer,
      cloc_name character varying(255),
      cloc_address character varying(100),
      cloc_suite character varying(100),
      cloc_city character varying(100),
      cloc_stateprovince character varying(100),
      cloc_postal character varying(50),
      cloc_phone character varying(50),
      cloc_decisionid integer,
      cloc_code character varying(100),
      cloc_contactemail character varying(100),
      CONSTRAINT "companylocationsPrimary" PRIMARY KEY (cloc_id)
    Reports table:
    Code:
    CREATE TABLE reports
    (
      report_id serial NOT NULL,
      report_timestamp timestamp with time zone DEFAULT now(),
      report_statusid integer,
      report_fname character varying(100),
      report_mname character varying(50),
      report_lname character varying(50),
      report_clocid integer,
      report_appid integer, 
      CONSTRAINT "reportsPrimary" PRIMARY KEY (report_id)

  2. #2
    Join Date
    Jun 2008
    Location
    Chennai
    Posts
    21
    it is not clear on which table is having the duplicates ...

    looking into the table structure, i assume the reports table is having duplicates ...

    the following code might work ... pls try it


    SELECT c.cloc_name, Count(app_id) as "Applications",

    --PASS
    SUM(
    CASE a.app_decision
    WHEN 'Pass' THEN 1
    ELSE 0
    END
    ) as "Pass"

    FROM (SELECT DISTINCT report_appid, report_clocid, report_timestamp FROM reports) r
    INNER JOIN applications a ON r.report_appid = a.app_id
    INNER JOIN companylocations c ON c.cloc_id = r.report_clocid
    WHERE c.cloc_companyid = 116
    AND
    date(r.report_timestamp) >= '10/1/2009'
    AND
    date(r.report_timestamp) <= '10/31/2009'

    GROUP BY c.cloc_name
    ORDER BY c.cloc_name

  3. #3
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    Quote Originally Posted by kandrusatish View Post
    i assume the reports table is having duplicates ...
    that's sort of my guess, too, but putting the DISTINCT on the reports table alone will not solve it, because each report by itself is unique, the problem is that there are multiple reports per company location, and yet these reports are for several identical applications

    so you have to make the applications distinct before you count them
    Code:
    SELECT c.cloc_name
         , COUNT(nodupes.app_id) as "Applications"
         , SUM(CASE nodupes.app_decision 
               WHEN 'Pass' THEN 1 END) as "Pass"
      FROM companylocations AS c
    INNER
      JOIN ( SELECT r.report_clocid
                  , a.app_id
                  , a.app_decision
               FROM reports AS r
             INNER 
               JOIN applications AS a 
                 ON r.report_appid = a.app_id
              WHERE r.report_timestamp >= '10/1/2009'  
                AND r.report_timestamp  < '11/1/2009'  
             GROUP
                 BY r.report_clocid
                  , a.app_decision ) AS nodupes
        ON nodupes.report_clocid = c.cloc_id 
     WHERE c.cloc_companyid = 116 
    GROUP 
        BY c.cloc_name
    ORDER 
        BY c.cloc_name
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  4. #4
    Join Date
    Jan 2010
    Posts
    2

    Solved

    Ah, I had a feeling the solution would lie in furthur nesting and getting the correct sequence on those inner joins. Thank you very much R937, your solution was right on target! Thank you both for taking the time to help me with the solution. Cheers.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •