If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

 
Go Back  dBforums > Database Server Software > DB2 > problem with count in join

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1 (permalink)  
Old 08-10-10, 17:29
db2Noob db2Noob is offline
Registered User
 
Join Date: Aug 2010
Posts: 9
problem with count in join

Hi all,

I'm trying to count the records in table t1 that also exist in t2 and meet certain condition in t2 (i.e that the record in t2 also has certain value in another column).

The thing is that I only want to count the records in t1 not only that matches those in t2 but also that are within a certain period of time (there's another column that stores the date of the record).

My code looks as follows:

Code:
SELECT t2.a AS wwa,
	  COUNT(t1.b) AS wwb,
	  max(t1.date),
FROM	  table t1 INNER JOIN
	  table t2 ON t1.b = t2.b
WHERE (date >= '2009-08-10 00:00:00.000') 
          AND (date <= '2010-08-10 00:00:00.000') AND (t2.c='myString')
GROUP BY t2.a
ORDER BY wwa ASC
That count is counting something different than the records in "t1.b = t2.b" between that period of time because in the database I filter the results and it is something different than that count.

Can anyone please tell me why? or how to fix this?

Thanks,

Last edited by db2Noob; 08-10-10 at 17:32.
Reply With Quote
  #2 (permalink)  
Old 08-10-10, 17:38
n_i n_i is online now
:-)
 
Join Date: Jun 2003
Location: Toronto, Canada
Posts: 4,449
Quote:
Originally Posted by db2Noob View Post
Can anyone please tell me why? or how to fix this?
In order for someone to do this, you would need to explain what you mean by

Quote:
Originally Posted by db2Noob View Post
in the database I filter the results and it is something different than that count.
Reply With Quote
  #3 (permalink)  
Old 08-11-10, 08:08
dav1mo dav1mo is offline
Registered User
 
Join Date: Dec 2007
Location: Richmond, VA
Posts: 782
try using an exists subselect for t2 rather than a join.
Dave
Reply With Quote
  #4 (permalink)  
Old 08-11-10, 10:14
db2Noob db2Noob is offline
Registered User
 
Join Date: Aug 2010
Posts: 9
Thumbs up

Yes sorry what I meant by:

Quote:
in the database I filter the results and it is something different than that count.
Is that I login into Control Center, open the table t1 and use the filter button to filter the results by those conditions, i.e. (date >= '2009-08-10 00:00:00.000')
AND (date <= '2010-08-10 00:00:00.000') AND t1.b = <the value in t2 I want>, I do this for several records and they do not match the results of the query above.

Lets say between those dates I have 2 records according to the filtering I did in the gui, the query however throws 30 for the same record (which is odd because I have a total of 175 records where t1.b = t2.b in t1 and only 2 of them are within that time range so I don't know how that 30 is getting in there).

I tried the EXIST (subselect) as follows but it throws an error:

Code:
SELECT t2.a AS wwa,
	  COUNT(t1.b) AS wwb,
	  max(t1.date),
FROM	  table t1
WHERE EXIST( SELECT *
                     FROM table t2 
                    WHERE (t1.b = t2.b) 
                    AND (date >= '2009-08-10 00:00:00.000') 
                    AND (date <= '2010-08-10 00:00:00.000') 
                    AND (t2.c='myString') )
GROUP BY t2.a
ORDER BY wwa ASC
I get the following error:
SQL0104N An unexpected token "table t2 WHERE (t1.b = t2" was found following "EXIST (SELECT * FROM". Expected tokens may include: "<space> "

How can I fix this query or what else can I do to get the right count?

EDITED: Just wanted to add that I changed line 3 from "max(t1.date)" to "t1.date" in the select statement in the original query (post #1) and the result query listed the value I wanted twice and a count of 15 for each, hence the 30. What I want is a count of 2 not 30, why is that not counting that there are 2 records meeting that condition! - Thanks

Last edited by db2Noob; 08-11-10 at 10:48.
Reply With Quote
  #5 (permalink)  
Old 08-11-10, 10:37
tonkuma tonkuma is online now
Registered User
 
Join Date: Feb 2008
Location: Japan
Posts: 2,195
Quote:
the query however throws 30 for the same record.
No "record" in DB2.
What mean with 30?
Do you want to say "the query returns 30 result rows"?

If t2.b was not unique,
then you would get more than the number of rows of t1 by "t1 INNER JOIN t2 ON t1.b = t2.b".
Reply With Quote
  #6 (permalink)  
Old 08-11-10, 10:46
dav1mo dav1mo is offline
Registered User
 
Join Date: Dec 2007
Location: Richmond, VA
Posts: 782
the missing token is an S, the keyword is EXISTS, not EXIST. What table does the DATE column belong to t1 or t2? I did not notice in your first post that you are, also, attempting to select a column from T2. So, the exists will not work due to that. Also, get rid of the unneccessary parens they just hurt readability and introduce problems for maintenance as Tonkuma likes to say. the join condition of the two tables, is that the complete join condition between the tables? It might help if you gave us some ddl for the 2 tables and input data and expected results. Lastly, what is the deal with the max date?

Dave
Reply With Quote
  #7 (permalink)  
Old 08-11-10, 13:12
db2Noob db2Noob is offline
Registered User
 
Join Date: Aug 2010
Posts: 9
Thanks for the help. the date is in t1 and the MAX is there because I want to just display the t1.b value once with the most recent date.

Ok so I'm going to give an example

Code:
t1    
dateCol   b col
Jan13      4
Jan14      2
Jan14      4
Jan14      1
Jan15      5
Jan15      4
Jan16      3
Code:
t2
a col    b col    c col
val1     1         stringa
val2     2         stringb
val3     3         stringa
val4     4         stringa
val5     5         stringc
Condition: rows in t1 between the dates Jan 14 to Jan 15 (inclusive) where t1.b=t2.b as long as t2.c=stringa

Code:
Resulting Table I want:
t2.a      count(that meet condition)  mostRecentDate
val4      2                                     Jan15
val1      1                                     Jan14

What I'm getting now is that table I want except the count column is giving me a number that's definitely not the one I want, so I'm not sure what that COUNT is doing, I guess I should be adding something to the "COUNT(t1.b)"

I hope that's a bit clearer, thanks for the help.

Last edited by db2Noob; 08-11-10 at 13:32.
Reply With Quote
  #8 (permalink)  
Old 08-11-10, 14:38
Rajesh1203 Rajesh1203 is offline
Registered User
 
Join Date: Aug 2010
Posts: 38
Quote:
Originally Posted by db2Noob View Post
Thanks for the help. the date is in t1 and the MAX is there because I want to just display the t1.b value once with the most recent date.

Ok so I'm going to give an example

Code:
t1    
dateCol   b col
Jan13      4
Jan14      2
Jan14      4
Jan14      1
Jan15      5
Jan15      4
Jan16      3
Code:
t2
a col    b col    c col
val1     1         stringa
val2     2         stringb
val3     3         stringa
val4     4         stringa
val5     5         stringc
Condition: rows in t1 between the dates Jan 14 to Jan 15 (inclusive) where t1.b=t2.b as long as t2.c=stringa

Code:
Resulting Table I want:
t2.a      count(that meet condition)  mostRecentDate
val4      2                                     Jan15
val1      1                                     Jan14

What I'm getting now is that table I want except the count column is giving me a number that's definitely not the one I want, so I'm not sure what that COUNT is doing, I guess I should be adding something to the "COUNT(t1.b)"

I hope that's a bit clearer, thanks for the help.
Try this::
**Replace Dates with the format you want.
Select T2.a col,b_count,datecol
from t2 t2
JOIN (Select max(Datecol),count(b col) b_Count, b col from t1 where datecol between date('01/14/2010') and date(01/15/2010') group by b col) T1 on T1.b col = t2.b col
where T2.c col = 'Stringa'

Last edited by Rajesh1203; 08-11-10 at 14:50.
Reply With Quote
  #9 (permalink)  
Old 08-11-10, 15:40
tonkuma tonkuma is online now
Registered User
 
Join Date: Feb 2008
Location: Japan
Posts: 2,195
Your original query would give your wanted result, like this:
(I modified the query slightly to meet exactly to your newly described conditions and ORDER BY wwa DESC.)
Quote:
Condition: rows in t1 between the dates Jan 14 to Jan 15 (inclusive) where t1.b=t2.b as long as t2.c=stringa

Code:
Resulting Table I want:
t2.a      count(that meet condition)  mostRecentDate
val4      2                                     Jan15
val1      1                                     Jan14
Code:
------------------------------ Commands Entered ------------------------------
WITH
 t1(dateCol , b) AS (
VALUES
  (DATE('2010-01-13') , 4)
, (DATE('2010-01-14') , 2)
, (DATE('2010-01-14') , 4)
, (DATE('2010-01-14') , 1)
, (DATE('2010-01-15') , 5)
, (DATE('2010-01-15') , 4)
, (DATE('2010-01-16') , 3)
)
, t2(a , b , c) AS (
VALUES
  ('val1' , 1 , 'stringa')
, ('val2' , 2 , 'stringb')
, ('val3' , 3 , 'stringa')
, ('val4' , 4 , 'stringa')
, ('val5' , 5 , 'stringc')
)
SELECT t2.a            AS wwa
     , COUNT(t1.b)     AS wwb
     , max(t1.datecol) AS "mostRecentDate"
  FROM t1
 INNER JOIN
       t2
   ON  t1.b = t2.b
 WHERE datecol BETWEEN '2010-01-14'
                   AND '2010-01-15'
   AND t2.c = 'stringa'
 GROUP BY
       t2.a
 ORDER BY
       wwa DESC
;
------------------------------------------------------------------------------

WWA  WWB         mostRecentDate
---- ----------- --------------
val4           2 2010-01-15    
val1           1 2010-01-14    

  2 record(s) selected.
Reply With Quote
  #10 (permalink)  
Old 08-12-10, 13:04
db2Noob db2Noob is offline
Registered User
 
Join Date: Aug 2010
Posts: 9
Thanks for the responses,

Rajesh: I want to group by "T2.a col" not "b col", so when I change "group by b col" to "group by T2.a col" it throws an error of the type. If I leave your code as it is it also throws an error.
Quote:
SQL0119N An expression starting with "b col" specified in a SELECT
clause, HAVING clause, or ORDER BY clause is not specified in the GROUP BY
clause or it is in a SELECT clause, HAVING clause, or ORDER BY clause with a
column function and no GROUP BY clause is specified. SQLSTATE=42803

tonkuma: Let me change the example a little maybe it's going to throw something different, the example is too short to yield anything lets add more , let me change t2 and put val4 with a string other than stringa:

Code:
t1    
dateCol   b col
Jan13      4
Jan14      2
Jan14      4
Jan14      4
Jan14      1
Jan15      5
Jan15      4
Jan15      4
Jan16      4
Jan16      3
Code:
t2
a col    b col    c col
val4     4         stringa
val1     1         stringa
val4     4         stringc
val2     2         stringb
val4     4         stringa
val3     3         stringa
val4     4         stringc
val5     5         stringc
val4     4         stringb
val4     4         stringa
Thanks,

EDITED: OK I made an interesting discovery maybe this will help solve the issue. OK so I go to control Center and open T2 to use a filter, I filter for all rows that have "b col = 4", a bunch of rows show. I filter once more to add that "string=stringa" and I get a total of 15 records!
I went to T1 and filter "b col = 4" between the time range 'x' and 'y', only 2 rows showed.
This goes back to post#4, where I said if I use max date the count will give me (2 from T1 multiplied by 15 in T2) 30 and if I do not use MAX it will show 2 entries with 15 counts each.

OK so now I know what that count is doing but I don't know how to make it count what I really want, I want it to return a total count of 2 (the 2 records in T1 that meet this condition) Please help.

Thanks so much for your efforts.

Last edited by db2Noob; 08-12-10 at 13:55.
Reply With Quote
  #11 (permalink)  
Old 08-12-10, 13:49
Rajesh1203 Rajesh1203 is offline
Registered User
 
Join Date: Aug 2010
Posts: 38
Quote:
Originally Posted by db2Noob View Post
Thanks for the responses,

Rajesh: I want to group by "T2.a col" not "b col", so when I change "group by b col" to "group by T2.a col" it throws an error of the type. If I leave your code as it is it also throws an error.



tonkuma: Let me change the example a little maybe it's going to throw something different, the example is too short to yield anything lets add more , let me change t2 and put val4 with a string other than stringa:

Code:
t1    
dateCol   b col
Jan13      4
Jan14      2
Jan14      4
Jan14      4
Jan14      1
Jan15      5
Jan15      4
Jan15      4
Jan16      4
Jan16      3
Code:
t2
a col    b col    c col
val4     4         stringa
val1     1         stringa
val4     4         stringc
val2     2         stringb
val4     4         stringa
val3     3         stringa
val4     4         stringc
val5     5         stringc
val4     4         stringb
val4     4         stringa
Thanks,
What are the results you are expecting with the new table data?
Reply With Quote
  #12 (permalink)  
Old 08-12-10, 13:58
db2Noob db2Noob is offline
Registered User
 
Join Date: Aug 2010
Posts: 9
I would expect:

Resulting Table I want:

Code:
t2.a      count(that meet condition)  mostRecentDate
val4      4                                     Jan15
val1      1                                     Jan14

Also please note the edit I did to my last post, i think that's where the problem lies, according to that post the result of this code would give:

Code:
t2.a      count(that meet condition)  mostRecentDate
val4      12                                   Jan15
val1      1                                     Jan14
Reply With Quote
  #13 (permalink)  
Old 08-12-10, 14:47
tonkuma tonkuma is online now
Registered User
 
Join Date: Feb 2008
Location: Japan
Posts: 2,195
I wrote:
Quote:
If t2.b was not unique,
then you would get more than the number of rows of t1 by "t1 INNER JOIN t2 ON t1.b = t2.b".
Here is an example removing duplicated t2.b:
Code:
------------------------------ Commands Entered ------------------------------
WITH
 t1(dateCol , b) AS (
VALUES
  (DATE('2010-01-13') , 4)
, (DATE('2010-01-14') , 2)
, (DATE('2010-01-14') , 4)
, (DATE('2010-01-14') , 4)
, (DATE('2010-01-14') , 1)
, (DATE('2010-01-15') , 5)
, (DATE('2010-01-15') , 4)
, (DATE('2010-01-15') , 4)
, (DATE('2010-01-16') , 4)
, (DATE('2010-01-16') , 3)
)
, t2(a , b , c) AS (
VALUES
  ('val4' , 4 , 'stringa')
, ('val1' , 1 , 'stringa')
, ('val4' , 4 , 'stringc')
, ('val2' , 2 , 'stringb')
, ('val4' , 4 , 'stringa')
, ('val3' , 3 , 'stringa')
, ('val4' , 4 , 'stringc')
, ('val5' , 5 , 'stringc')
, ('val4' , 4 , 'stringb')
, ('val4' , 4 , 'stringa')
)
SELECT t2.a            AS "t2.a"
     , COUNT(t1.b)     AS "count(t1 meet condition)"
     , max(t1.datecol) AS "mostRecentDate"
  FROM t1
 INNER JOIN
       (SELECT DISTINCT
               a , b
          FROM t2
         WHERE c = 'stringa'
       ) t2
   ON  t1.b = t2.b
 WHERE t1.datecol BETWEEN '2010-01-14'
                      AND '2010-01-15'
 GROUP BY
       t2.a
 ORDER BY
       t2.a DESC
;
------------------------------------------------------------------------------

t2.a count(t1 meet condition) mostRecentDate
---- ------------------------ --------------
val4                        4 2010-01-15    
val1                        1 2010-01-14    

  2 record(s) selected.
Reply With Quote
  #14 (permalink)  
Old 08-12-10, 16:24
db2Noob db2Noob is offline
Registered User
 
Join Date: Aug 2010
Posts: 9
Talking

tonkuma it worked!! thanks thanks thanks!!!!
Reply With Quote
Reply

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On