| |
|
If this is your first visit, be sure to check out the FAQ by clicking the link above.
You may have to register before you can post: click the register link above to proceed.
To start viewing messages, select the forum that you want to visit from the selection below.
|
 |

01-15-09, 20:58
|
|
Registered User
|
|
Join Date: Apr 2007
Posts: 127
|
|
|
how to find out values shared by some members
|
|
Hello ALL,
I have a table contains words extracted from documents which are belonging to a number of categories. Each category contains a number of documents. For each category, I want to find out those words which occur at least in N documents belonging to this category.
Code:
Category DocID Word Freq Tag
Art A12 Paint 3 verb
Art A12 Exhibition 5 noun
Art A12 Show 2 verb
Art A12 Ticket 4 noun
Art B34 Exhibition 2 noun
Art B34 Musum 3 noun
Art B34 Contemporary 2 adj
Art B34 Show 1 noun
Art B34 Visitor 2 noun
Art E97 gallery 3 noun
Art E97 Show 3 noun
Art E97 Exhibition 2 noun
Electronics A32 Circuit 2 noun
Electronics A32 IC 2 noun
Electronics A32 Factory 2 noun
Electronics A32 Electronics 3 noun
Electronics F11 HiFi 2 noun
Electronics F11 IC 2 noun
Electronics F11 Circuit 2 noun
Electronics C55 IC 3 noun
Electronics C55 Circuit 3 noun
Electronics C55 Transmit 2 verb
Electronics C55 Receive 5 verb
I wonder how to do this query.
Thanks
|
|

01-15-09, 21:08
|
|
SQL Consultant
|
|
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,534
|
|
Quote:
|
Originally Posted by cy163
I wonder how to do this query.
|
oh, come on, it's not that hard
you've been writing SQL on this schema for months and months
surely you have learned enough by now to give it a try on your own
otherwise, we're going to start asking you for a part of your salary...

|
|

01-16-09, 09:43
|
|
Registered User
|
|
Join Date: Apr 2007
Posts: 127
|
|
|
|
sorry.
Actually, I have tried to sovle the problem by myself, but i failed.
|
|

01-16-09, 09:46
|
|
SQL Consultant
|
|
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,534
|
|
show your attempt, and explain what results it returned and why those results were incorrect
|
|

01-16-09, 11:05
|
|
Registered User
|
|
Join Date: Apr 2007
Posts: 127
|
|
Is this one right?
Code:
SELECT
a.Category, a.DocID, a.Word
FROM
cateData as a
INNER JOIN cateData as b
ON
(a.Category = b.Category and a.Word = b.Word)
where
a.DocId <> b.DocID;
|
|

01-16-09, 11:14
|
|
SQL Consultant
|
|
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,534
|
|
Code:
SELECT Category
, DocID
, Word
FROM cateData
GROUP
BY Category
, DocID
, Word
HAVING COUNT(*) >= N
|
|

01-17-09, 03:50
|
|
Registered User
|
|
Join Date: Apr 2007
Posts: 127
|
|
I never know GROUP BY can be followed by more than one field names before. So, It is no doubt that I could not figure out this solution by myself.
Thanks r937.
Forget to mention, I also need the average value of occurrence frequency of a selected word. So, I adapt the solution. Is it right?
Code:
SELECT Category
, DocID
, Word
, AVG(Freq) as Avg_Freq
FROM cateData
GROUP
BY Category
, DocID
, Word
HAVING COUNT(*) >= N
|
|

01-17-09, 04:11
|
|
SQL Consultant
|
|
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,534
|
|
Quote:
|
Originally Posted by cy163
Is it right?
|
yes, it is 
|
|

01-19-09, 09:27
|
|
Registered User
|
|
Join Date: Apr 2007
Posts: 127
|
|
Today, I test the query with real data and find out the right statement is
Code:
SELECT Category
, DocID
, Word
, AVG(Freq) as Avg_Freq
FROM cateData
GROUP
BY Category
, Word
HAVING COUNT(*) >= N
That is, the GROUP BY clause should be
Code:
GROUP BY
Category
, Word
DocID should not appear in the GROUP BY clause.
|
|

01-19-09, 09:35
|
|
SQL Consultant
|
|
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,534
|
|
you are certainly making good progress in learning SQL
|
|
| Thread Tools |
Search this Thread |
|
|
|
| Display Modes |
Linear Mode
|
Posting Rules
|
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts
HTML code is Off
|
|
|
|
|