If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

 
Go Back  dBforums > Database Server Software > Microsoft SQL Server > Listing words with no of occurances in sql server

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1 (permalink)  
Old 12-18-06, 06:31
Harris Harris is offline
Registered User
 
Join Date: Apr 2006
Posts: 5
Question Listing words with no of occurances in sql server

Hi,

Is it possible to get words from text columns in a sql server database in the order of their occurances in that column with full text search or by any other method.


Thanks in advance.
Reply With Quote
  #2 (permalink)  
Old 12-18-06, 07:03
r937 r937 is offline
SQL Consultant
 
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 20,000
try this approach -- Simple Keyword Relevance
__________________
r937.com | rudy.ca
please visit Simply SQL and buy my book
Reply With Quote
  #3 (permalink)  
Old 12-18-06, 07:37
Harris Harris is offline
Registered User
 
Join Date: Apr 2006
Posts: 5
Cool My problem is some thing different



My query is regarding retreiving words from table on the bases of its maximum occurances in column irrespective of a row.

means a word that appears maximum number of times in a column, suppose if i have 10 rows in a table and word "Globe" appears maximum no of times say in 5 rows i want Globe word on the top.


Thanks in advance.
Reply With Quote
  #4 (permalink)  
Old 12-18-06, 07:58
r937 r937 is offline
SQL Consultant
 
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 20,000
Code:
select wordcolumn
     , count(*) as occurrences
  from daTable
group
    by wordcolumn
order
    by occurrences desc
__________________
r937.com | rudy.ca
please visit Simply SQL and buy my book
Reply With Quote
  #5 (permalink)  
Old 12-18-06, 08:11
Harris Harris is offline
Registered User
 
Join Date: Apr 2006
Posts: 5
Hi,
Thanks for quick reply....

This could solve my problem if reffered column contain only single word at a time.
But this column is varchar column containing text of multiple words.

Any suggestion?

Thanks
Reply With Quote
  #6 (permalink)  
Old 12-18-06, 08:21
r937 r937 is offline
SQL Consultant
 
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 20,000
ok, that's going to require that you split out every word in the column individually

this may require a cross join with a numbers table

what's this for? how often are you going to use it?

the query will definitely not be easy to write, nor efficient to run
__________________
r937.com | rudy.ca
please visit Simply SQL and buy my book
Reply With Quote
  #7 (permalink)  
Old 12-18-06, 10:00
Brett Kaiser Brett Kaiser is offline
Window Washer
 
Join Date: Nov 2002
Location: Jersey
Posts: 10,322
Please read the sticky at the top of the board and post what it asks for...we need some examples
__________________
Brett
8-)

It's a Great Day for America everybody!

dbforums Yak CorralRadio 'Rita
dbForums Member List
I'm Good Once as I ever was

The physical order of data in a database has no meaning.
Reply With Quote
  #8 (permalink)  
Old 12-18-06, 10:13
r937 r937 is offline
SQL Consultant
 
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 20,000
you gonna write the query, brett?

here's an example --
Code:
create table somewords
( id integer not null primary key identity
, blah text not null
);
insert into somewords (blah) 
values ('a word that appears maximum number of times in a column')
insert into somewords (blah) 
values ('Is it possible to get words from text columns in a sql server database')
insert into somewords (blah) 
values ('This could solve my problem if reffered column contain only single word')
insert into somewords (blah) 
values ('that''s going to require that you split out every word in the column individually')
insert into somewords (blah) 
values ('the query will definitely not be easy to write')
insert into somewords (blah) 
values ('Please read the sticky at the top of the board')
insert into somewords (blah) 
values ('The physical order of data in a database has no meaning')
i'm sure you understand the requirement -- which word occurs the most frequent?
__________________
r937.com | rudy.ca
please visit Simply SQL and buy my book
Reply With Quote
  #9 (permalink)  
Old 12-20-06, 07:55
Harris Harris is offline
Registered User
 
Join Date: Apr 2006
Posts: 5
Hi,

r937 quoted the right example and i am sure he understood the problem.

I need it often in my application.

If you say to split words in multiple columns, the problem here is we are not sure about maximum number of words in field.
Reply With Quote
  #10 (permalink)  
Old 12-20-06, 13:05
Brett Kaiser Brett Kaiser is offline
Window Washer
 
Join Date: Nov 2002
Location: Jersey
Posts: 10,322
Is this for doing a search and ranking the results based on multiple search criteria...if it is then Rudy did not understan...because what you are asking makes no sense

Is it really show me for all rows in a table, the rows with the most occuring types?

If it is, and you need to "do that alot", what in heavens name for?
__________________
Brett
8-)

It's a Great Day for America everybody!

dbforums Yak CorralRadio 'Rita
dbForums Member List
I'm Good Once as I ever was

The physical order of data in a database has no meaning.
Reply With Quote
  #11 (permalink)  
Old 12-20-06, 13:08
Brett Kaiser Brett Kaiser is offline
Window Washer
 
Join Date: Nov 2002
Location: Jersey
Posts: 10,322
Oh, and why would you need a query

http://www.google.com/search?hl=en&l...=a&btnG=Search

About 8 billion hits

I think that would "win"
__________________
Brett
8-)

It's a Great Day for America everybody!

dbforums Yak CorralRadio 'Rita
dbForums Member List
I'm Good Once as I ever was

The physical order of data in a database has no meaning.
Reply With Quote
  #12 (permalink)  
Old 12-20-06, 13:14
r937 r937 is offline
SQL Consultant
 
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 20,000
the question is really very simple, brett

refer to the sample data which i posted (which you wanted to see), and answer the question: which word occurs most frequently in the blah column? hint: it isn't the word "data" and it isn't the word "word"
__________________
r937.com | rudy.ca
please visit Simply SQL and buy my book
Reply With Quote
  #13 (permalink)  
Old 12-20-06, 15:04
Brett Kaiser Brett Kaiser is offline
Window Washer
 
Join Date: Nov 2002
Location: Jersey
Posts: 10,322
OK, well, you would have to start be decontructig the data...but I still don't know what for or why there would be a need for this

Code:

create table somewords
( id integer not null primary key identity
, blah text not null
);
insert into somewords (blah) 
values ('a word that appears maximum number of times in a column')
insert into somewords (blah) 
values ('Is it possible to get words from text columns in a sql server database')
insert into somewords (blah) 
values ('This could solve my problem if reffered column contain only single word')
insert into somewords (blah) 
values ('that''s going to require that you split out every word in the column individually')
insert into somewords (blah) 
values ('the query will definitely not be easy to write')
insert into somewords (blah) 
values ('Please read the sticky at the top of the board')
insert into somewords (blah) 
values ('The physical order of data in a database has no meaning')
GO

CREATE TABLE UniqueWords (
	  Word varchar(256)
	, WordId int IDENTITY(1,1)
	, Add_Dt datetime DEFAULT (GetDate()))
 GO

CREATE UNIQUE INDEX UnqueWords_PK ON UniqueWords(Word)
GO

CREATE PROC isp_INS_UNIQUE_WORDS
AS
BEGIN
	SET NOCOUNT ON
	DECLARE @Words INT, @Pos INT, @x Int, @str varchar(256)
		  , @word varchar(256), @start int, @end int, @exitstart int
	SELECT @Words = 0, @Pos = 1, @x = -1, @Word = '', @start = 1

	DECLARE myCursor CURSOR FOR SELECT Blah FROM SomeWords
	OPEN myCursor
	FETCH NEXT FROM myCursor INTO @str

	WHILE @@FETCH_STATUS = 0
	  BEGIN
		WHILE (@x <> 0)
			BEGIN
				SET @x     = CHARINDEX(' ', @str, @Pos)
				IF @x <> 0
				  BEGIN 
					SET @end   = @x - @start
					SET @word  = SUBSTRING(@str,@start,@end)
					IF NOT EXISTS (SELECT * FROM UniqueWords WHERE Word = @Word)
						INSERT INTO UniqueWords(Word) SELECT @word
					-- SELECT @Word, @@ROWCOUNT,@@ERROR
					-- SELECT @x, @Word, @start, @end, @str
					SET @exitstart = @start + @end + 1
					SET @Pos   = @x + 1
					SET @start = @x + 1
					SET @Words = @Words + 1
				  END
				IF @x = 0
				  BEGIN
					SET @word  = SUBSTRING(@str,@exitstart,LEN(@str)-@exitstart+1)
					IF NOT EXISTS (SELECT * FROM UniqueWords WHERE Word = @Word)
						INSERT INTO UniqueWords(Word) SELECT @word
					-- SELECT @Word, @@ROWCOUNT,@@ERROR
					-- SELECT @x, @Word, @exitstart, LEN(@str)-@exitstart, @str
				  END
			END
		FETCH NEXT FROM myCursor INTO @str
		SELECT @Words = 0, @Pos = 1, @x = -1, @Word = '', @start = 1
	  END	

	  CLOSE myCursor
	  DEALLOCATE myCursor
	  SET NOCOUNT OFF
	RETURN @Words
END
GO

EXEC isp_INS_UNIQUE_WORDS
GO

SELECT * FROM UniqueWords ORDER BY Word
GO

DROP PROC isp_INS_UNIQUE_WORDS
DROP TABLE UniqueWords, somewords
GO
I'm gonna blog this
__________________
Brett
8-)

It's a Great Day for America everybody!

dbforums Yak CorralRadio 'Rita
dbForums Member List
I'm Good Once as I ever was

The physical order of data in a database has no meaning.
Reply With Quote
  #14 (permalink)  
Old 12-20-06, 15:14
RedNeckGeek RedNeckGeek is offline
Village Idiot
 
Join Date: Jul 2003
Location: Michigan
Posts: 1,941
Quote:
...but I still don't know what for or why there would be a need for this
You don't think there's a use for data mining on simple text?

let's see....

military intel
or
political speech analysis
or
most occurences in scientific notation
...
__________________
Inspiration Through Fermentation
Reply With Quote
  #15 (permalink)  
Old 12-20-06, 15:15
r937 r937 is offline
SQL Consultant
 
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 20,000
brett! dude!! that's wonderful! and the cursors, so pretty!!

now, any chance you could actually address the problem as initially posted?

which word occurred the most often?

boy it's taking you a long time to catch on

__________________
r937.com | rudy.ca
please visit Simply SQL and buy my book
Reply With Quote
Reply

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On