Hi,
We have an user registration system, which handles:
the registration of new users
the update of existing ones
and the search for already existing ones.
In total there are already around 2million registered users, and the search( like %searchterm%) among the existing customers is very slow(10-15 min).
I would like to have your opinion on the proper design and implementation ideeas for such a situation, and not necessarily on the existing implementation which might be flawed.
The customers have:
1) a name (textual, up to 300 characters, however most of the records have around 30-100 caracters)
2) an andress.
I'll be happy if we could search for already existing customers based on name only within seconds.
Currently we have an index on the name field, but the index , as expected doesn't work on searches that start with a wildacard '%searchterm'
I assume this is a classical issue, and I would like to know if there are any ideeas on how to provide for a fast search of a search such as
user_name like %new% with the end goal to avoid having twice the same user.
I did my homework and started to search the web, thinking mainly at the google like solution, and I came across a few ideeas
1) the
inverted indexing .
However this will solve the issue of multiple words as search criteria, and as in our case we are talking about user names which can be very unique, I assume that it may not map exactly to the model of a search engine, that searches for words(out of the dictionary mainly) without using wildcards.
2) Then I thought of oracle search text
3) and or a reverse index.
I do not have that much experience with Oracle search text though, but I'm willing to invest time and study/learn about the solution that you will recommend.
I have posted here because we are currently using Oracle:
Operating system Sun Solaris 9.
RDBMS Oracle 10.2.0.3.
Application server BEA WebLogic Server 8.1
I have as well checked a bit
Apache Lucene, however I couldn't find any db information about the core design, the only info were the javadocs.
Or maybe is it too much to ask for seconds while searching for username LIKE %searchterm% ? Is it feasible?
Thank you!