#1
11-13-10, 12:28
nzo
Registered User
 
Join Date: Jan 2010
Posts: 16
string similarity algorithm.

I have a database table of product titles. For each product, I want to link 10-20 similar products, ordered by relevance.

Does anybody have suggestions on how to go about this? The database design is OK; what I need is the actual algorithm/process for linking similar products.

The main concern is quality/relevance and accuracy.

Any help appreciated, thanks.
#2
11-13-10, 12:31
r937
SQL Consultant
 
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,534
possible approaches:

- similar titles based on soundex

- purchase patterns (customers who bought this also bought...)

- taxonomy (all titles in same category)

- related by vendor

- ...
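
To make the soundex idea concrete, here is a minimal Python sketch, assuming the titles are available as a plain list of strings. It implements standard four-character American Soundex; MySQL's built-in SOUNDEX() may return different (for example, longer) codes, so treat this as an illustration of the approach rather than a drop-in equivalent. The helper names `title_codes` and `similar_by_soundex` are hypothetical.

```python
# Standard American Soundex letter-to-digit table.
_CODES = {}
for _letters, _digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                         ("L", "4"), ("MN", "5"), ("R", "6")):
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(word):
    """Return the 4-character American Soundex code of a word ('' if no letters)."""
    letters = [c for c in word.upper() if c.isalpha()]
    if not letters:
        return ""
    code = letters[0]
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        digit = _CODES.get(ch, "")
        if digit and digit != prev:
            code += digit
        if ch not in "HW":  # H and W do not separate letters with equal codes
            prev = digit
    return (code + "000")[:4]


def title_codes(title):
    """Set of Soundex codes for the words of a title (hypothetical helper)."""
    return {soundex(w) for w in title.split()} - {""}


def similar_by_soundex(titles, target, limit=20):
    """Titles sharing at least one Soundex code with target, best match first."""
    target_codes = title_codes(target)
    scored = []
    for title in titles:
        if title == target:
            continue
        shared = len(target_codes & title_codes(title))
        if shared:
            scored.append((shared, title))
    # Most shared codes first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [title for _, title in scored[:limit]]
```

Soundex only captures words that sound alike, so it mostly helps with spelling variants of the same words; it says nothing about whether two differently named products are related.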
#3
11-14-10, 12:43
nzo
Registered User
 
Join Date: Jan 2010
Posts: 16
- soundex.
will give this a try.

- purchase patterns
- taxonomy
do not have this data to use.

- related by vendor.
will also give this a try, but I would first have to extract an index of vendors, as I do not currently have the vendor brand separated from the titles.

cheers
#4
11-14-10, 06:56
bchanan
Registered User
 
Join Date: Dec 2009
Posts: 27
another idea

why not build a family tree of products:

idea 1:
- add a product_parent_id as the product it's related to
- if the product_parent_id is null, no relation is found
- if the product_parent_id is not null, then it's related to the parent product
- now you can produce a full parent-child relation between products

idea 2:
- add a new table: product_relations_type (relation_type_id, relation, desc)
- add a new table product_2_product_relation (product_id1, product_id2, relation_type_id)
- this way you will have multiple products related to other products
using multiple relation types.
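
A rough sketch of both ideas, using Python's built-in sqlite3 module and an in-memory database so it runs anywhere. The relation tables and their columns follow the names above, except that `desc` is renamed to `description` because DESC is a reserved word in SQL. The `product` table, its `title` column, the sample rows, and the two helper functions are assumptions made up for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE product (
    product_id        INTEGER PRIMARY KEY,
    title             TEXT NOT NULL,
    -- idea 1: NULL means no parent, otherwise the related parent product
    product_parent_id INTEGER REFERENCES product(product_id)
);
-- idea 2: typed many-to-many relations between products
CREATE TABLE product_relations_type (
    relation_type_id INTEGER PRIMARY KEY,
    relation         TEXT NOT NULL,
    description      TEXT
);
CREATE TABLE product_2_product_relation (
    product_id1      INTEGER NOT NULL REFERENCES product(product_id),
    product_id2      INTEGER NOT NULL REFERENCES product(product_id),
    relation_type_id INTEGER NOT NULL
                     REFERENCES product_relations_type(relation_type_id),
    PRIMARY KEY (product_id1, product_id2, relation_type_id)
);
""")

# Made-up sample data, only to show the queries working.
conn.executemany(
    "INSERT INTO product (product_id, title, product_parent_id) VALUES (?, ?, ?)",
    [(1, "Camera", None), (2, "Camera case", 1), (3, "Tripod", None)],
)
conn.execute(
    "INSERT INTO product_relations_type VALUES (1, 'accessory', 'fits the other product')"
)
conn.executemany(
    "INSERT INTO product_2_product_relation VALUES (?, ?, ?)",
    [(1, 2, 1), (1, 3, 1)],
)


def children_of(conn, parent_id):
    """Idea 1: titles of products whose parent is parent_id."""
    rows = conn.execute(
        "SELECT title FROM product WHERE product_parent_id = ? ORDER BY title",
        (parent_id,),
    )
    return [title for (title,) in rows]


def related_titles(conn, product_id):
    """Idea 2: titles of products linked from product_id, any relation type."""
    rows = conn.execute(
        """SELECT p.title
             FROM product_2_product_relation r
             JOIN product p ON p.product_id = r.product_id2
            WHERE r.product_id1 = ?
            ORDER BY p.title""",
        (product_id,),
    )
    return [title for (title,) in rows]
```

Note that this only stores relations; something else still has to decide which products are related and fill the tables.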

thanks
Chanan
#5
11-14-10, 07:12
nzo
Registered User
 
Join Date: Jan 2010
Posts: 16
thanks for your suggestion, but as I said the database design is not the problem!

the problem is how do I go about finding and saving the relations between products (actually populating the tables you have mentioned)
#6
11-14-10, 10:05
Pat Phelan
Resident Curmudgeon
 
Join Date: Feb 2004
Location: In front of the computer
Posts: 12,609
The first thing that I'd do is to come up with definitions for "Quality", "Relevance", and "Accuracy" that I could program.

In other words, if I can't express what "Quality" means using code I can write along with the data that I have, then the term doesn't mean anything relevant to the problem that I'm trying to solve.

I'm pretty sure that the database schema is at least part of your problem. If it wasn't the problem, you'd be able to pick the results that you want from an existing table or view.

We can help you fix this problem, but right now you're kind of like the patient who goes to the doctor, says "I'm sick", then expects the doctor to write a prescription. I can't speak for everyone, but we need to know more about your problem. Once you explain what those three words mean in your context (preferably as code we can examine), then I think we can help you a lot more.
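
As an example of what a programmable definition could look like, the sketch below defines "relevance" between two titles as the Jaccard overlap of their word sets and returns the best matches for one product. This is only one possible definition, chosen for illustration and not taken from this thread; the function names and the `limit` and `min_score` parameters are hypothetical.

```python
def tokens(title):
    """Lower-cased set of whitespace-separated words in a title."""
    return set(title.lower().split())


def relevance(title_a, title_b):
    """Jaccard similarity of the two word sets: shared words / all words."""
    words_a, words_b = tokens(title_a), tokens(title_b)
    if not words_a or not words_b:
        return 0.0
    return len(words_a & words_b) / len(words_a | words_b)


def top_similar(titles, target, limit=20, min_score=0.0):
    """Up to `limit` (score, title) pairs scoring above min_score, best first."""
    scored = [(relevance(target, t), t) for t in titles if t != target]
    scored = [pair for pair in scored if pair[0] > min_score]
    # Highest score first; ties broken alphabetically for stable output.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return scored[:limit]
```

With a definition like this written down, "quality" and "accuracy" become things that can be checked, for instance by comparing the returned list against a hand-picked set of related products.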

-PatP
#7
11-15-10, 14:24
nzo
Registered User
 
Join Date: Jan 2010
Posts: 16
"I'm pretty sure that the database schema is at least part of your problem."
-the database is part of the solution.

I was probably being too general...

I got good results using the Sphinx search server (match-all mode).
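
For anyone unfamiliar with the idea, the rule behind a match-all search (a title matches only if it contains every word of the query) can be illustrated in a few lines of plain Python. This does not use Sphinx and does not reproduce its tokenizing, indexing, or ranking; the function names are hypothetical.

```python
def matches_all(query, title):
    """True if every word of the query appears as a word in the title."""
    query_words = set(query.lower().split())
    title_words = set(title.lower().split())
    # An empty query matches nothing rather than everything.
    return bool(query_words) and query_words <= title_words


def search_all(titles, query):
    """Titles that contain all query words, in their original order."""
    return [title for title in titles if matches_all(query, title)]
```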