Alright, I'm working on a ruby on rails web application and I have one query in it that just kills the entire site. The basic concept is that I have a medias table and I can tag it (a tags table and a taggings table that holds the relationship). What I want to do is find similar items out of the media table by pulling the medias that have the most similar tags. This is the current query I have:
Code:
SELECT
medias.*, COUNT(acts_as_taggable_tags.name) AS acts_as_taggable_tag_count
FROM medias
LEFT OUTER JOIN taggings AS acts_as_taggable_taggings ON acts_as_taggable_taggings.taggable_id = medias.id
LEFT OUTER JOIN tags AS acts_as_taggable_tags ON acts_as_taggable_tags.id = acts_as_taggable_taggings.tag_id AND acts_as_taggable_tags.name IN ('video','funny','kitten')
WHERE
(medias.id != 849 AND (medias.status = 1 AND medias.publish_on <= '2006-07-25' AND (medias.expire_on > '2006-07-25' OR medias.expire_on IS NULL)) AND type != 'PluggedMedia')
GROUP BY
medias.id, medias.type, medias.user_id, medias.status, medias.publish_on, medias.expire_on,
medias.position, medias.title, medias.description, medias.views, medias.comments_count,
medias.url, medias.ip_address, medias.total_views, medias.legacy_url, medias.created_at
ORDER BY acts_as_taggable_tag_count DESC
LIMIT 4;
These are the indexes I have on the tables:
medias:
Code:
+--------+------------+-------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+--------+------------+-------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| medias | 0 | PRIMARY | 1 | id | A | 875 | NULL | NULL | | BTREE | NULL |
| medias | 1 | medias_publish_on_index | 1 | publish_on | A | 291 | NULL | NULL | YES | BTREE | NULL |
| medias | 1 | medias_publish_on_index | 2 | position | A | 875 | NULL | NULL | YES | BTREE | NULL |
| medias | 1 | medias_legacy_url_index | 1 | legacy_url | A | 875 | NULL | NULL | YES | BTREE | NULL |
+--------+------------+-------------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
Code:
+----------+------------+-----------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+----------+------------+-----------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| taggings | 0 | PRIMARY | 1 | id | A | 4873 | NULL | NULL | | BTREE | NULL |
| taggings | 1 | taggings_tag_id_index | 1 | tag_id | A | 4873 | NULL | NULL | YES | BTREE | NULL |
| taggings | 1 | taggings_tag_id_index | 2 | taggable_id | A | 4873 | NULL | NULL | YES | BTREE | NULL |
+----------+------------+-----------------------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
tags:
Code:
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| Table | Non_unique | Key_name | Seq_in_index | Column_name | Collation | Cardinality | Sub_part | Packed | Null | Index_type | Comment |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
| tags | 0 | PRIMARY | 1 | id | A | 1249 | NULL | NULL | | BTREE | NULL |
+-------+------------+----------+--------------+-------------+-----------+-------------+----------+--------+------+------------+---------+
The query as is can take over 100 seconds. An extract from the slow query log:
# Query_time: 111 Lock_time: 0 Rows_sent: 4 Rows_examined: 3881445
Any tips?