Hi everyone .. this is my first post here ( just found this forum )
the following is about MS Access
if it is not a good choice please suggest another
but keep in mind that i will be making a cd-rom out of this
and i cannot have the client install sql server or something like that
i have a few thousand files with a few hundred words in each..
lets say 3000 files with 500 words in each ..
i need to be able to search for specific words ( with OR or AND )
and i also must be able to search for phrases.
no need for wildcards..
the way i have gone so far ( and it might be completely off ) is
i have made a table containing the filenames, a table with all different words, and a table that links the previous two, having links like
this approach has made the linking table to have around 1.5 million records ( 3000 files x 500 words )
the different words are 80.000
Question: How should i search this data ?
my way: I nest sql statements like this
select DISTINCT(filetable.file) where
wordtable.word ='1st word here' and filetable.file in ( select for second word where file in (select for third word ....))
it is slow, and i optimized it by arranging the search words by number of times found ( i made a table containing word counts), but still it is slow for more than 3 words ...
i also keep in the linking table the position in the file that the word is found ( integer position of word in file 1-500)
A bit of airware here, I think the search could be improved by converting the words to ascii in the tables and then converting the search phrase to ascii. I think the speed is a direct result of access doing the conversions under the covers.