Results 1 to 14 of 14
  1. #1
    Join Date
    Aug 2010
    Posts
    9

    Unanswered: problem with count in join

    Hi all,

    I'm trying to count the records in table t1 that also exist in t2 and meet certain condition in t2 (i.e that the record in t2 also has certain value in another column).

    The thing is that I only want to count the records in t1 not only that matches those in t2 but also that are within a certain period of time (there's another column that stores the date of the record).

    My code looks as follows:

    Code:
    SELECT t2.a AS wwa,
    	  COUNT(t1.b) AS wwb,
    	  max(t1.date),
    FROM	  table t1 INNER JOIN
    	  table t2 ON t1.b = t2.b
    WHERE (date >= '2009-08-10 00:00:00.000') 
              AND (date <= '2010-08-10 00:00:00.000') AND (t2.c='myString')
    GROUP BY t2.a
    ORDER BY wwa ASC
    That count is counting something different than the records in "t1.b = t2.b" between that period of time because in the database I filter the results and it is something different than that count.

    Can anyone please tell me why? or how to fix this?

    Thanks,
    Last edited by db2Noob; 08-10-10 at 18:32.

  2. #2
    Join Date
    Jun 2003
    Location
    Toronto, Canada
    Posts
    5,516
    Provided Answers: 1
    Quote Originally Posted by db2Noob View Post
    Can anyone please tell me why? or how to fix this?
    In order for someone to do this, you would need to explain what you mean by

    Quote Originally Posted by db2Noob View Post
    in the database I filter the results and it is something different than that count.

  3. #3
    Join Date
    Dec 2007
    Location
    Richmond, VA
    Posts
    1,328
    Provided Answers: 5
    try using an exists subselect for t2 rather than a join.
    Dave

  4. #4
    Join Date
    Aug 2010
    Posts
    9

    Thumbs up

    Yes sorry what I meant by:

    in the database I filter the results and it is something different than that count.
    Is that I login into Control Center, open the table t1 and use the filter button to filter the results by those conditions, i.e. (date >= '2009-08-10 00:00:00.000')
    AND (date <= '2010-08-10 00:00:00.000') AND t1.b = <the value in t2 I want>, I do this for several records and they do not match the results of the query above.

    Lets say between those dates I have 2 records according to the filtering I did in the gui, the query however throws 30 for the same record (which is odd because I have a total of 175 records where t1.b = t2.b in t1 and only 2 of them are within that time range so I don't know how that 30 is getting in there).

    I tried the EXIST (subselect) as follows but it throws an error:

    Code:
    SELECT t2.a AS wwa,
    	  COUNT(t1.b) AS wwb,
    	  max(t1.date),
    FROM	  table t1
    WHERE EXIST( SELECT *
                         FROM table t2 
                        WHERE (t1.b = t2.b) 
                        AND (date >= '2009-08-10 00:00:00.000') 
                        AND (date <= '2010-08-10 00:00:00.000') 
                        AND (t2.c='myString') )
    GROUP BY t2.a
    ORDER BY wwa ASC
    I get the following error:
    SQL0104N An unexpected token "table t2 WHERE (t1.b = t2" was found following "EXIST (SELECT * FROM". Expected tokens may include: "<space> "

    How can I fix this query or what else can I do to get the right count?

    EDITED: Just wanted to add that I changed line 3 from "max(t1.date)" to "t1.date" in the select statement in the original query (post #1) and the result query listed the value I wanted twice and a count of 15 for each, hence the 30. What I want is a count of 2 not 30, why is that not counting that there are 2 records meeting that condition! - Thanks
    Last edited by db2Noob; 08-11-10 at 11:48.

  5. #5
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,483
    the query however throws 30 for the same record.
    No "record" in DB2.
    What mean with 30?
    Do you want to say "the query returns 30 result rows"?

    If t2.b was not unique,
    then you would get more than the number of rows of t1 by "t1 INNER JOIN t2 ON t1.b = t2.b".

  6. #6
    Join Date
    Dec 2007
    Location
    Richmond, VA
    Posts
    1,328
    Provided Answers: 5
    the missing token is an S, the keyword is EXISTS, not EXIST. What table does the DATE column belong to t1 or t2? I did not notice in your first post that you are, also, attempting to select a column from T2. So, the exists will not work due to that. Also, get rid of the unneccessary parens they just hurt readability and introduce problems for maintenance as Tonkuma likes to say. the join condition of the two tables, is that the complete join condition between the tables? It might help if you gave us some ddl for the 2 tables and input data and expected results. Lastly, what is the deal with the max date?

    Dave

  7. #7
    Join Date
    Aug 2010
    Posts
    9
    Thanks for the help. the date is in t1 and the MAX is there because I want to just display the t1.b value once with the most recent date.

    Ok so I'm going to give an example

    Code:
    t1    
    dateCol   b col
    Jan13      4
    Jan14      2
    Jan14      4
    Jan14      1
    Jan15      5
    Jan15      4
    Jan16      3
    Code:
    t2
    a col    b col    c col
    val1     1         stringa
    val2     2         stringb
    val3     3         stringa
    val4     4         stringa
    val5     5         stringc
    Condition: rows in t1 between the dates Jan 14 to Jan 15 (inclusive) where t1.b=t2.b as long as t2.c=stringa

    Code:
    Resulting Table I want:
    t2.a      count(that meet condition)  mostRecentDate
    val4      2                                     Jan15
    val1      1                                     Jan14

    What I'm getting now is that table I want except the count column is giving me a number that's definitely not the one I want, so I'm not sure what that COUNT is doing, I guess I should be adding something to the "COUNT(t1.b)"

    I hope that's a bit clearer, thanks for the help.
    Last edited by db2Noob; 08-11-10 at 14:32.

  8. #8
    Join Date
    Aug 2010
    Posts
    40
    Quote Originally Posted by db2Noob View Post
    Thanks for the help. the date is in t1 and the MAX is there because I want to just display the t1.b value once with the most recent date.

    Ok so I'm going to give an example

    Code:
    t1    
    dateCol   b col
    Jan13      4
    Jan14      2
    Jan14      4
    Jan14      1
    Jan15      5
    Jan15      4
    Jan16      3
    Code:
    t2
    a col    b col    c col
    val1     1         stringa
    val2     2         stringb
    val3     3         stringa
    val4     4         stringa
    val5     5         stringc
    Condition: rows in t1 between the dates Jan 14 to Jan 15 (inclusive) where t1.b=t2.b as long as t2.c=stringa

    Code:
    Resulting Table I want:
    t2.a      count(that meet condition)  mostRecentDate
    val4      2                                     Jan15
    val1      1                                     Jan14

    What I'm getting now is that table I want except the count column is giving me a number that's definitely not the one I want, so I'm not sure what that COUNT is doing, I guess I should be adding something to the "COUNT(t1.b)"

    I hope that's a bit clearer, thanks for the help.
    Try this::
    **Replace Dates with the format you want.
    Select T2.a col,b_count,datecol
    from t2 t2
    JOIN (Select max(Datecol),count(b col) b_Count, b col from t1 where datecol between date('01/14/2010') and date(01/15/2010') group by b col) T1 on T1.b col = t2.b col
    where T2.c col = 'Stringa'
    Last edited by Rajesh1203; 08-11-10 at 15:50.

  9. #9
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,483
    Your original query would give your wanted result, like this:
    (I modified the query slightly to meet exactly to your newly described conditions and ORDER BY wwa DESC.)
    Condition: rows in t1 between the dates Jan 14 to Jan 15 (inclusive) where t1.b=t2.b as long as t2.c=stringa

    Code:
    Resulting Table I want:
    t2.a      count(that meet condition)  mostRecentDate
    val4      2                                     Jan15
    val1      1                                     Jan14
    Code:
    ------------------------------ Commands Entered ------------------------------
    WITH
     t1(dateCol , b) AS (
    VALUES
      (DATE('2010-01-13') , 4)
    , (DATE('2010-01-14') , 2)
    , (DATE('2010-01-14') , 4)
    , (DATE('2010-01-14') , 1)
    , (DATE('2010-01-15') , 5)
    , (DATE('2010-01-15') , 4)
    , (DATE('2010-01-16') , 3)
    )
    , t2(a , b , c) AS (
    VALUES
      ('val1' , 1 , 'stringa')
    , ('val2' , 2 , 'stringb')
    , ('val3' , 3 , 'stringa')
    , ('val4' , 4 , 'stringa')
    , ('val5' , 5 , 'stringc')
    )
    SELECT t2.a            AS wwa
         , COUNT(t1.b)     AS wwb
         , max(t1.datecol) AS "mostRecentDate"
      FROM t1
     INNER JOIN
           t2
       ON  t1.b = t2.b
     WHERE datecol BETWEEN '2010-01-14'
                       AND '2010-01-15'
       AND t2.c = 'stringa'
     GROUP BY
           t2.a
     ORDER BY
           wwa DESC
    ;
    ------------------------------------------------------------------------------
    
    WWA  WWB         mostRecentDate
    ---- ----------- --------------
    val4           2 2010-01-15    
    val1           1 2010-01-14    
    
      2 record(s) selected.

  10. #10
    Join Date
    Aug 2010
    Posts
    9
    Thanks for the responses,

    Rajesh: I want to group by "T2.a col" not "b col", so when I change "group by b col" to "group by T2.a col" it throws an error of the type. If I leave your code as it is it also throws an error.
    SQL0119N An expression starting with "b col" specified in a SELECT
    clause, HAVING clause, or ORDER BY clause is not specified in the GROUP BY
    clause or it is in a SELECT clause, HAVING clause, or ORDER BY clause with a
    column function and no GROUP BY clause is specified. SQLSTATE=42803

    tonkuma: Let me change the example a little maybe it's going to throw something different, the example is too short to yield anything lets add more , let me change t2 and put val4 with a string other than stringa:

    Code:
    t1    
    dateCol   b col
    Jan13      4
    Jan14      2
    Jan14      4
    Jan14      4
    Jan14      1
    Jan15      5
    Jan15      4
    Jan15      4
    Jan16      4
    Jan16      3
    Code:
    t2
    a col    b col    c col
    val4     4         stringa
    val1     1         stringa
    val4     4         stringc
    val2     2         stringb
    val4     4         stringa
    val3     3         stringa
    val4     4         stringc
    val5     5         stringc
    val4     4         stringb
    val4     4         stringa
    Thanks,

    EDITED: OK I made an interesting discovery maybe this will help solve the issue. OK so I go to control Center and open T2 to use a filter, I filter for all rows that have "b col = 4", a bunch of rows show. I filter once more to add that "string=stringa" and I get a total of 15 records!
    I went to T1 and filter "b col = 4" between the time range 'x' and 'y', only 2 rows showed.
    This goes back to post#4, where I said if I use max date the count will give me (2 from T1 multiplied by 15 in T2) 30 and if I do not use MAX it will show 2 entries with 15 counts each.

    OK so now I know what that count is doing but I don't know how to make it count what I really want, I want it to return a total count of 2 (the 2 records in T1 that meet this condition) Please help.

    Thanks so much for your efforts.
    Last edited by db2Noob; 08-12-10 at 14:55.

  11. #11
    Join Date
    Aug 2010
    Posts
    40
    Quote Originally Posted by db2Noob View Post
    Thanks for the responses,

    Rajesh: I want to group by "T2.a col" not "b col", so when I change "group by b col" to "group by T2.a col" it throws an error of the type. If I leave your code as it is it also throws an error.



    tonkuma: Let me change the example a little maybe it's going to throw something different, the example is too short to yield anything lets add more , let me change t2 and put val4 with a string other than stringa:

    Code:
    t1    
    dateCol   b col
    Jan13      4
    Jan14      2
    Jan14      4
    Jan14      4
    Jan14      1
    Jan15      5
    Jan15      4
    Jan15      4
    Jan16      4
    Jan16      3
    Code:
    t2
    a col    b col    c col
    val4     4         stringa
    val1     1         stringa
    val4     4         stringc
    val2     2         stringb
    val4     4         stringa
    val3     3         stringa
    val4     4         stringc
    val5     5         stringc
    val4     4         stringb
    val4     4         stringa
    Thanks,
    What are the results you are expecting with the new table data?

  12. #12
    Join Date
    Aug 2010
    Posts
    9
    I would expect:

    Resulting Table I want:

    Code:
    t2.a      count(that meet condition)  mostRecentDate
    val4      4                                     Jan15
    val1      1                                     Jan14

    Also please note the edit I did to my last post, i think that's where the problem lies, according to that post the result of this code would give:

    Code:
    t2.a      count(that meet condition)  mostRecentDate
    val4      12                                   Jan15
    val1      1                                     Jan14

  13. #13
    Join Date
    Feb 2008
    Location
    Japan
    Posts
    3,483
    I wrote:
    If t2.b was not unique,
    then you would get more than the number of rows of t1 by "t1 INNER JOIN t2 ON t1.b = t2.b".
    Here is an example removing duplicated t2.b:
    Code:
    ------------------------------ Commands Entered ------------------------------
    WITH
     t1(dateCol , b) AS (
    VALUES
      (DATE('2010-01-13') , 4)
    , (DATE('2010-01-14') , 2)
    , (DATE('2010-01-14') , 4)
    , (DATE('2010-01-14') , 4)
    , (DATE('2010-01-14') , 1)
    , (DATE('2010-01-15') , 5)
    , (DATE('2010-01-15') , 4)
    , (DATE('2010-01-15') , 4)
    , (DATE('2010-01-16') , 4)
    , (DATE('2010-01-16') , 3)
    )
    , t2(a , b , c) AS (
    VALUES
      ('val4' , 4 , 'stringa')
    , ('val1' , 1 , 'stringa')
    , ('val4' , 4 , 'stringc')
    , ('val2' , 2 , 'stringb')
    , ('val4' , 4 , 'stringa')
    , ('val3' , 3 , 'stringa')
    , ('val4' , 4 , 'stringc')
    , ('val5' , 5 , 'stringc')
    , ('val4' , 4 , 'stringb')
    , ('val4' , 4 , 'stringa')
    )
    SELECT t2.a            AS "t2.a"
         , COUNT(t1.b)     AS "count(t1 meet condition)"
         , max(t1.datecol) AS "mostRecentDate"
      FROM t1
     INNER JOIN
           (SELECT DISTINCT
                   a , b
              FROM t2
             WHERE c = 'stringa'
           ) t2
       ON  t1.b = t2.b
     WHERE t1.datecol BETWEEN '2010-01-14'
                          AND '2010-01-15'
     GROUP BY
           t2.a
     ORDER BY
           t2.a DESC
    ;
    ------------------------------------------------------------------------------
    
    t2.a count(t1 meet condition) mostRecentDate
    ---- ------------------------ --------------
    val4                        4 2010-01-15    
    val1                        1 2010-01-14    
    
      2 record(s) selected.

  14. #14
    Join Date
    Aug 2010
    Posts
    9

    Talking

    tonkuma it worked!! thanks thanks thanks!!!!

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •