If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

 
Go Back  dBforums > Database Server Software > MySQL > subquery question

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1 (permalink)  
Old 04-14-08, 02:18
morguefile morguefile is offline
Registered User
 
Join Date: Mar 2008
Posts: 11
subquery question

here is a pretty complicated query...


Code:

SELECT *
FROM table_categories 
JOIN table_files 
ON table_categories.file_id = table_files.file_id 

WHERE table_categories.unique_id IN ( 
	( SELECT i2t1.object_id 
		FROM table_tags t0 
		CROSS JOIN table_tags t1 
		CROSS JOIN table_tags t2 
		INNER JOIN table_tagged_objects i2t0 
		ON t0.id = i2t0.tag_id 
		INNER JOIN table_tagged_objects i2t1 
		ON i2t0.object_id = i2t1.object_id 
		AND i2t1.tag_id = t1.id 
		LEFT JOIN table_tagged_objects i2t2 
		ON i2t0.object_id = i2t2.object_id 
		AND i2t1.object_id = i2t2.object_id 
		AND i2t2.tag_id= t2.id WHERE t0.tag = 'fruit' 
		AND t1.tag = 'appl' 
		AND t2.tag = 'green' 
		AND i2t2.object_id IS NULL) 
	) 
		
AND table_categories.display_type='1' 
AND (table_categories.visible=1 OR table_categories.author_id='26') 
ORDER BY table_categories.title ASC 
LIMIT 0,24
the subquery is a search for apple+fruit, exclude green. The outer query is selecting from a table of records and joining onto it (the pictures).

Is it really inefficient to use a subquery in IN () ? I am not sure if I have too many options since the search query is so complicated. Any insight would be extremely helpful-
Reply With Quote
  #2 (permalink)  
Old 04-14-08, 08:02
r937 r937 is offline
SQL Consultant
 
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,534
Quote:
Originally Posted by morguefile
Is it really inefficient to use a subquery in IN () ?
in general, no, not at all

however, your CROSS CROSS INNER INNER LEFT looks hellish inefficient
__________________
r937.com | rudy.ca
please visit Simply SQL and buy my book
Reply With Quote
  #3 (permalink)  
Old 04-14-08, 09:17
morguefile morguefile is offline
Registered User
 
Join Date: Mar 2008
Posts: 11
I don't think it is (maybe I am wrong), I am using and example from mysql. (slide 21)

http://www.slideshare.net/jonathanba...em-with-mysql/

The idea is using a group / count method would have been inefficient because it has to use a temporary table, while this method does not.
Reply With Quote
  #4 (permalink)  
Old 04-14-08, 09:29
r937 r937 is offline
SQL Consultant
 
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,534
there is one way to see if it is inefficient or not: run an EXPLAIN on it
__________________
r937.com | rudy.ca
please visit Simply SQL and buy my book
Reply With Quote
  #5 (permalink)  
Old 04-14-08, 09:37
morguefile morguefile is offline
Registered User
 
Join Date: Mar 2008
Posts: 11
I did- it doesn't use temporary file

http://screencast.com/t/k5Z3MkIdLN

But what I was reading was that since the subquery is in an IN(...) it would be doing the subquery (A*B) rather than (A+B). Not sure if that is whats happening
Reply With Quote
  #6 (permalink)  
Old 04-14-08, 09:44
r937 r937 is offline
SQL Consultant
 
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,534
what about this:
Code:
SELECT table_tagged_objects.object_id 
  FROM table_tags 
INNER
  JOIN table_tagged_objects
    ON table_tagged_objects.tag_id = table_tags.id
 WHERE table_tags.tag IN ( 'fruit','apple','green' )
GROUP
    BY table_tagged_objects.object_id
HAVING COUNT(CASE WHEN table_tags.tag = 'fruit'
                  THEN 1 END ) > 0
   AND COUNT(CASE WHEN table_tags.tag = 'apple'
                  THEN 1 END ) > 0
   AND COUNT(CASE WHEN table_tags.tag = 'green'
                  THEN 1 END ) = 0
__________________
r937.com | rudy.ca
please visit Simply SQL and buy my book
Reply With Quote
  #7 (permalink)  
Old 04-27-08, 21:34
morguefile morguefile is offline
Registered User
 
Join Date: Mar 2008
Posts: 11
sorry went on vacation. I tried the above statement but it doesn't work since it uses temporary file. Anytime it has to use count it becomes very inefficient. I thought I had it but after some testing, this still doesn't work for unison searches. (OR)


SELECT i2t1.object_id
FROM table_tags t0
CROSS JOIN table_tags t1
CROSS JOIN table_tags t2
INNER JOIN table_tagged_objects i2t0
ON t0.id = i2t0.tag_id
INNER JOIN table_tagged_objects i2t1
ON i2t0.object_id = i2t1.object_id
OR i2t1.tag_id = t1.id LEFT
JOIN table_tagged_objects i2t2
ON i2t0.object_id = i2t2.object_id
AND i2t1.object_id = i2t2.object_id
AND i2t2.tag_id= t2.id WHERE t0.tag = 'fruit'
AND t1.tag = 'appl'
AND t2.tag = 'green'
AND i2t2.object_id IS NULL
Reply With Quote
  #8 (permalink)  
Old 04-27-08, 22:30
r937 r937 is offline
SQL Consultant
 
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,534
Quote:
Originally Posted by morguefile
... but it doesn't work since it uses temporary file. Anytime it has to use count it becomes very inefficient.
"doesn't work" <> "becomes inefficient"

besides being slow, does it actually run? does it actually produce the correct results?

if so, then optimizing it is a secondary step -- the first step is getting correct results

what indexes have you defined on your tables?
__________________
r937.com | rudy.ca
please visit Simply SQL and buy my book
Reply With Quote
  #9 (permalink)  
Old 04-27-08, 23:19
morguefile morguefile is offline
Registered User
 
Join Date: Mar 2008
Posts: 11
it did work, using explain produced...

Using where; Using temporary; Using filesort

The results showed results of apples that were green. Rather than all fruits. (not that that is wrong)
Reply With Quote
  #10 (permalink)  
Old 04-27-08, 23:29
r937 r937 is offline
SQL Consultant
 
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,534
actually, it is wrong

the results should not have any green objects -- check the HAVING clause again

and what indexes have you defined on your tables?
__________________
r937.com | rudy.ca
please visit Simply SQL and buy my book
Reply With Quote
  #11 (permalink)  
Old 04-28-08, 09:44
morguefile morguefile is offline
Registered User
 
Join Date: Mar 2008
Posts: 11
oops - sorry you are right it returned only red apples.


CREATE TABLE IF NOT EXISTS `table_tagged_objects` (
`tag_id` int(10) unsigned NOT NULL default '0',
`tagger_id` int(10) unsigned NOT NULL default '0',
`object_id` int(10) unsigned NOT NULL default '0',
`tagged_on` datetime NOT NULL default '0000-00-00 00:00:00',
`file_type` tinyint(1) NOT NULL default '0',
`display_type` int(3) NOT NULL,
PRIMARY KEY (`tag_id`,`tagger_id`,`object_id`),
KEY `tag_id_index` (`tag_id`),
KEY `tagger_id_index` (`tagger_id`),
KEY `object_id_index` (`object_id`),
KEY `display_type` (`display_type`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_bin;


CREATE TABLE IF NOT EXISTS `table_tags` (
`id` int(10) unsigned NOT NULL auto_increment,
`tag` varchar(30) character set latin1 NOT NULL default '',
`raw_tag` varchar(50) character set latin1 NOT NULL default '',
PRIMARY KEY (`id`)
) ENGINE=MyISAM DEFAULT CHARSET=utf8 COLLATE=utf8_bin AUTO_INCREMENT=13 ;
Reply With Quote
  #12 (permalink)  
Old 04-28-08, 14:14
r937 r937 is offline
SQL Consultant
 
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,534
i'm not the best guy to discuss optimization strategy, but it seems to me that the tagger_id in the middle of the 3-part index is what's keeping that index from being used properly

you could also use an index on (tag,id)
__________________
r937.com | rudy.ca
please visit Simply SQL and buy my book
Reply With Quote
Reply

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On