Unanswered: Query: Top 3 Results grouped by Category
I am working on a search result query in which the user types a keyword
and the top 3 results for each category are displayed with 24 or so
total results on the page. The database is large (8 GB) and the table
in which the fulltext search occurs has 3 million rows.
The category table contains many categories but I want to output only
top level "general" categories. Each product has a link to its general
category in the category_link table.
The fulltext search runs fine if I don't attempt to order and group the
output by categories. Currently I have 1 query that gets the top level
categories (about 20 in all) then it loops through in PHP executing a
fulltext search on EACH of those results. This probably isn't the best
solution since it requires executing a fulltext search 20 different
times and in some cases (especially general words or multiword
searches) takes much too long. (anywhere from 2 - 45 seconds)
Is there a way to combine this into 1 FAST query? I'm using MySQL
4.0.17 and PHP 4.3.4 so I don't think subqueries are an option... I
suppose I could upgrade if it was totally necessary.
The desired results for searching for "Caffeine" would look like this:
Search results for "Caffeine"
After each category I have a link to "See all Matches in Category XYZ"
but that query runs fine since the dataset to search is reduced in size
to just that particular category.
The simplified pseudo-code of what I am currently using is this:
GET TOP LEVEL CATEGORIES AND PLACE IN ARRAYS ( I left this code out)
// the resulting arrays might look like this:
$category_id = array("1","2","3");
$category_name = array("One","Two","Three");
$query = "SELECT P.product_id, P.product_name,...other stuff...
FROM product AS P, category AS C, category_link AS CL
INNER JOIN ... join necessary tables for images etc...
WHERE MATCH (P.product_name) AGAINST ('" . $search_term . "') AND
CL.product_id = P.product_id AND CL.category_id = '" . $category_id[$i]
LIMIT 0, 3 ";
$result = mysql_query($query);
print(....) // output the 3 results
print("See all matches in " . $category_name[$i]);
To reduce the overall time delay, you have to switch from doing 20 or so fulltext searches to doing one fulltext search. It's the only way -- although a practical test on your database is the only way to know.
For example, include the category id field in the fields returned by the first query and do not limit by category id. Consider using "INSERT INTO... SELECT..." so that potentially many rows found are put into a temp table on the server instead of returning everything across the network. Then run a second query against the temp table to do the group/limit by stuff.
Before MySQL 4.1, what I call a temp table is referred to as a HEAP table. MySQL then refers to them as MEMORY tables.
Thats exactly what I wanted, 1 query because (20 fulltext searches) * (4.5 million records) = slow.
Is creating a temp table FAST? I always shy-ed away fromm making temp tables because of performance issues and concurrent user issues.
I'll look into this idea.
Is creating a temp table FAST? Yes it's fast. The table and its contents are stored in memory. Such tables are shared between all clients so you might dynamically generate a unique table name, create it then drop it after use.