| |
|
If this is your first visit, be sure to check out the FAQ by clicking the link above.
You may have to register before you can post: click the register link above to proceed.
To start viewing messages, select the forum that you want to visit from the selection below.
|
 |

04-14-08, 02:18
|
|
Registered User
|
|
Join Date: Mar 2008
Posts: 11
|
|
|
subquery question
|
|
here is a pretty complicated query...
Code:
SELECT *
FROM table_categories
JOIN table_files
ON table_categories.file_id = table_files.file_id
WHERE table_categories.unique_id IN (
( SELECT i2t1.object_id
FROM table_tags t0
CROSS JOIN table_tags t1
CROSS JOIN table_tags t2
INNER JOIN table_tagged_objects i2t0
ON t0.id = i2t0.tag_id
INNER JOIN table_tagged_objects i2t1
ON i2t0.object_id = i2t1.object_id
AND i2t1.tag_id = t1.id
LEFT JOIN table_tagged_objects i2t2
ON i2t0.object_id = i2t2.object_id
AND i2t1.object_id = i2t2.object_id
AND i2t2.tag_id= t2.id WHERE t0.tag = 'fruit'
AND t1.tag = 'appl'
AND t2.tag = 'green'
AND i2t2.object_id IS NULL)
)
AND table_categories.display_type='1'
AND (table_categories.visible=1 OR table_categories.author_id='26')
ORDER BY table_categories.title ASC
LIMIT 0,24
the subquery is a search for apple+fruit, exclude green. The outer query is selecting from a table of records and joining onto it (the pictures).
Is it really inefficient to use a subquery in IN () ? I am not sure if I have too many options since the search query is so complicated. Any insight would be extremely helpful-
|
|

04-14-08, 08:02
|
|
SQL Consultant
|
|
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,534
|
|
Quote:
|
Originally Posted by morguefile
Is it really inefficient to use a subquery in IN () ?
|
in general, no, not at all
however, your CROSS CROSS INNER INNER LEFT looks hellish inefficient
|
|

04-14-08, 09:17
|
|
Registered User
|
|
Join Date: Mar 2008
Posts: 11
|
|
|
|
I don't think it is (maybe I am wrong), I am using and example from mysql. (slide 21)
http://www.slideshare.net/jonathanba...em-with-mysql/
The idea is using a group / count method would have been inefficient because it has to use a temporary table, while this method does not.
|
|

04-14-08, 09:29
|
|
SQL Consultant
|
|
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,534
|
|
there is one way to see if it is inefficient or not: run an EXPLAIN on it
|
|

04-14-08, 09:37
|
|
Registered User
|
|
Join Date: Mar 2008
Posts: 11
|
|
I did- it doesn't use temporary file
http://screencast.com/t/k5Z3MkIdLN
But what I was reading was that since the subquery is in an IN(...) it would be doing the subquery (A*B) rather than (A+B). Not sure if that is whats happening
|
|

04-14-08, 09:44
|
|
SQL Consultant
|
|
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,534
|
|
what about this:
Code:
SELECT table_tagged_objects.object_id
FROM table_tags
INNER
JOIN table_tagged_objects
ON table_tagged_objects.tag_id = table_tags.id
WHERE table_tags.tag IN ( 'fruit','apple','green' )
GROUP
BY table_tagged_objects.object_id
HAVING COUNT(CASE WHEN table_tags.tag = 'fruit'
THEN 1 END ) > 0
AND COUNT(CASE WHEN table_tags.tag = 'apple'
THEN 1 END ) > 0
AND COUNT(CASE WHEN table_tags.tag = 'green'
THEN 1 END ) = 0
|
|

04-27-08, 21:34
|
|
Registered User
|
|
Join Date: Mar 2008
Posts: 11
|
|
sorry went on vacation. I tried the above statement but it doesn't work since it uses temporary file. Anytime it has to use count it becomes very inefficient. I thought I had it but after some testing, this still doesn't work for unison searches. (OR)
SELECT i2t1.object_id
FROM table_tags t0
CROSS JOIN table_tags t1
CROSS JOIN table_tags t2
INNER JOIN table_tagged_objects i2t0
ON t0.id = i2t0.tag_id
INNER JOIN table_tagged_objects i2t1
ON i2t0.object_id = i2t1.object_id
OR i2t1.tag_id = t1.id LEFT
JOIN table_tagged_objects i2t2
ON i2t0.object_id = i2t2.object_id
AND i2t1.object_id = i2t2.object_id
AND i2t2.tag_id= t2.id WHERE t0.tag = 'fruit'
AND t1.tag = 'appl'
AND t2.tag = 'green'
AND i2t2.object_id IS NULL
|
|

04-27-08, 22:30
|
|
SQL Consultant
|
|
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,534
|
|
Quote:
|
Originally Posted by morguefile
... but it doesn't work since it uses temporary file. Anytime it has to use count it becomes very inefficient.
|
"doesn't work" <> "becomes inefficient"
besides being slow, does it actually run? does it actually produce the correct results?
if so, then optimizing it is a secondary step -- the first step is getting correct results
what indexes have you defined on your tables?
|
|

04-27-08, 23:19
|
|
Registered User
|
|
Join Date: Mar 2008
Posts: 11
|
|
it did work, using explain produced...
Using where; Using temporary; Using filesort
The results showed results of apples that were green. Rather than all fruits. (not that that is wrong)
|
|

04-27-08, 23:29
|
|
SQL Consultant
|
|
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,534
|
|
actually, it is wrong
the results should not have any green objects -- check the HAVING clause again
and what indexes have you defined on your tables?
|
|

04-28-08, 09:44
|
|
Registered User
|
|
Join Date: Mar 2008
Posts: 11
|
|
oops - sorry you are right it returned only red apples.
CREATE TABLE IF NOT EXISTS `table_tagged_objects` (
`tag_id` int(10) unsigned NOT NULL default '0',
`tagger_id` int(10) unsigned NOT NULL default '0',
`object_id` int(10) unsigned NOT NULL default '0',
`tagged_on` datetime NOT NULL default '0000-00-00 00:00:00',
`file_type` tinyint(1) NOT NULL default '0',
`display_type` int(3) NOT NULL,
PRIMARY KEY (`tag_id`,`tagger_id`,`object_id`),
KEY `tag_id_index` (`tag_id`),
KEY `tagger_id_index` (`tagger_id`),
KEY `object_id_index` (`object_id`),
KEY `display_type` (`display_type`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_bin;
CREATE TABLE IF NOT EXISTS `table_tags` (
`id` int(10) unsigned NOT NULL auto_increment,
`tag` varchar(30) character set latin1 NOT NULL default '',
`raw_tag` varchar(50) character set latin1 NOT NULL default '',
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_bin AUTO_INCREMENT=13 ;
|
|

04-28-08, 14:14
|
|
SQL Consultant
|
|
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,534
|
|
i'm not the best guy to discuss optimization strategy, but it seems to me that the tagger_id in the middle of the 3-part index is what's keeping that index from being used properly
you could also use an index on (tag,id)
|
|
| Thread Tools |
Search this Thread |
|
|
|
| Display Modes |
Linear Mode
|
Posting Rules
|
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts
HTML code is Off
|
|
|
|
|