If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

 
Go Back  dBforums > Database Server Software > MySQL > MySQL MATCH returns no relevance score

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1 (permalink)  
Old 02-22-10, 13:32
JohannShunt JohannShunt is offline
Registered User
 
Join Date: Feb 2010
Posts: 4
MySQL MATCH returns no relevance score

Hi all,

I'm stuck on a weird bug in some search code for my site. This is a simplified version that I've boiled the problem down to:

This statement correctly returns an ID and a score, as you would expect:

Code:
SELECT id, match (title) against ('search term') as score FROM ArticleDetails ORDER BY score DESC
But this formally identical statement just returns zeros for all scores:

Code:
SELECT id, match (xml) against ('search term') as score FROM ArticlesXML ORDER BY score DESC
It thus seems to me that the problem must be outside of my queries, but where is not obvious to me. The column for the first example is varchar(255) , and the second is longtext. Could the answer be somewhere here or in the indexing?

I've read through the relevant documentation and searched around, but no answers as yet. Any help would be much appreciated.

Last edited by JohannShunt; 02-26-10 at 10:50.
Reply With Quote
  #2 (permalink)  
Old 02-26-10, 10:56
JohannShunt JohannShunt is offline
Registered User
 
Join Date: Feb 2010
Posts: 4
Fulltext search problem with XML content stored in longtext fields?

After a lot of fiddling - testing one variable at a time in the structure of these tables, the character of their data etc - I think I've found the answer to this problem. There doesn't seem to be anything else out there on this, which makes me suspicious of the solution, but in case this is of use to anyone else:

The fulltext search seems to trip up on XML content that is stored in the longtext column, and always return zeros for match results. To remedy this, I created another column, and inserted into this a version of the content with all of the XML tags stripped out. I indexed this column as fulltext. Now, a search performed on this table, but matching against the stripped version rather than the XML returns sensible (non-zero) results.

If anyone out there can enlighten me on exactly what is going on here, it'd be appreciated.
Reply With Quote
  #3 (permalink)  
Old 02-26-10, 12:06
JohannShunt JohannShunt is offline
Registered User
 
Join Date: Feb 2010
Posts: 4
Hmm. The bug has returned without any obvious cause, and using my previous "solution". This is really weird. I've spent days of work on this now.
Reply With Quote
  #4 (permalink)  
Old 03-04-10, 13:12
JohannShunt JohannShunt is offline
Registered User
 
Join Date: Feb 2010
Posts: 4
Well in case it is of use to anyone else, the end of the story is that I discovered the problem was down to MySQL's policy of considering any word that appears in more than 50% of rows to be a stopword.

To specify: the apparent unpredictability in tests was down to the fact that it was only search phrases in which every individual word was in more than 50% of rows that had the problem: only these returned a relevance of zero, since all words were treated as de facto "stopwords". Exacerbating the confusion, the query that I was running returned all rows rather than none every time it encountered one of these failing searches, meaning that I failed to see that all words were being treated as stopwords. Other very generic searches were fine as long as there was at least one "narrowing" term in less than 50% of rows. Although I was aware of the 50% policy, since I was thinking (incorrectly) on the level of phrases rather than individual words, I assumed such searches as "third world" could not possibly be in more than 50% of rows. As it turned out however, "third" and "world" both individually appear in more than 50% of rows.

This wasted a lot of my time.... Hopefully others can avoid this problem.

Last edited by JohannShunt; 03-04-10 at 13:19.
Reply With Quote
Reply

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On