Results 1 to 5 of 5
  1. #1
    Join Date
    Nov 2004
    Posts
    35

    Question Unanswered: Preventing an inner MAX subquery from searching all table rows

    Hello,

    Thanks in advance for your help - these forums are always very helpful those few times I've lost some hairs on SQL syntax.

    I currently have a query that treats on the following tables:
    Code:
    CREATE TABLE `authorization` (
      `username` varchar(40) NOT NULL default '',
      `password` varchar(12) NOT NULL default '',
      `validated` int(1) NOT NULL default '0',
      `customer_id` bigint(20) unsigned NOT NULL default '0',
      `rights` int(32) NOT NULL default '0',
      `vkey` tinytext NOT NULL,
      PRIMARY KEY  (`username`),
      KEY `customer_id` (`customer_id`)
    ) ENGINE=MyISAM DEFAULT CHARSET=utf8
    
    CREATE TABLE `topics` (
      `id` int(32) unsigned NOT NULL auto_increment,
      `category` varchar(12) NOT NULL default '0',
      `subject` varchar(64) NOT NULL default '',
      `priority` varchar(128) NOT NULL default '',
      `initiator` bigint(20) unsigned NOT NULL default '0',
      `date` datetime NOT NULL default '0000-00-00 00:00:00',
      `status` int(1) unsigned NOT NULL default '1',
      `emailed` int(1) NOT NULL default '0',
      `extra` text NOT NULL,
      `flags` int(32) unsigned NOT NULL default '0',
      `public` int(1) unsigned NOT NULL default '0',
      PRIMARY KEY  (`id`),
      KEY `initiator` (`initiator`)
    ) ENGINE=MyISAM DEFAULT CHARSET=utf8 COMMENT='Topic Index'
    
    CREATE TABLE `messages` (
      `id` int(32) unsigned NOT NULL auto_increment,
      `topic_id` int(32) unsigned NOT NULL default '0',
      `user` bigint(20) unsigned NOT NULL default '0',
      `date` datetime NOT NULL default '0000-00-00 00:00:00',
      `message` text NOT NULL,
      PRIMARY KEY  (`id`),
      KEY `topic_id` (`topic_id`),
      KEY `date` (`date`),
      KEY `user` (`user`),
      FULLTEXT KEY `search` (`message`)
    ) ENGINE=MyISAM DEFAULT CHARSET=utf8 COMMENT='Message Container'
    On these three tables, I run as example the following query, where the topics.initiator value is dynamically inserted :

    Code:
    SELECT topics.*, X.mdate, authorization.username
    FROM (topics LEFT JOIN
                 ( SELECT MAX( date ) mdate, topic_id, user, id as messageid
                   FROM messages
                   GROUP BY topic_id
                 ) AS X ON topics.id = X.topic_id )
                 LEFT JOIN authorization ON authorization.customer_id = X.user
    WHERE topics.initiator='3433' AND topics.status != '4'
    GROUP BY id ORDER BY topics.category, X.mdate;
    Which far from optimized...does the trick. Time comes to make it efficient, since the data set it queries is becoming rather large.

    If I run an explain query, it shows what follows:
    "id","select_type","table","type","possible_keys", "key","key_len","ref","rows","Extra"
    1,"PRIMARY","topics","ref","initiator","initiator" ,8,"const",26,"Using where; Using temporary; Using filesort"
    1,"PRIMARY","<derived2>","ALL","","",(null),"",416 6,""
    1,"PRIMARY","authorization","ref","customer_id","c ustomer_id",8,"X.user",1,""
    2,"DERIVED","messages","ALL","","",(null),"",23685 ,"Using temporary; Using filesort"
    So I can see that the inner query that tries to fetch the maximum date for each message associated to each topic, is scanning the whole damned table. I also find myself with two indexless references, which I know should be avoided.

    My initial assumption was that I could solve this problem by first selecting the topics that match the query, which are found very quickly, and then left joining to this a result that locates the message with newest date for each of these topics.

    Something which in my rusty SQL gearing, works something like this
    Code:
    SELECT * FROM
    (SELECT * FROM topics WHERE topics.initiator='3433' AND topics.status != '4' ) AS X
     INNER JOIN
    (SELECT MAX(date) FROM messages WHERE messages.topic_id = X.id) AS Y
    ON ??
    Which miserably fails, yet I hope it conveys the idea. Trying to avoid a full table scan to search select the message with the topic ID we are left joining on.

    Help appreciated!
    Last edited by Atari; 03-21-06 at 13:14.

  2. #2
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    Code:
    SELECT topics.* 
      FROM topics 
    INNER 
      JOIN (
           SELECT topic_id
                , MAX(date) 
             FROM messages 
           group
               by topic_id
           ) AS Y
        on Y.topic_id = topics.id     
    WHERE topics.initiator='3433' 
      AND topics.status != '4'
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  3. #3
    Join Date
    Nov 2004
    Posts
    35
    Thanks for the reply -

    That query still scans all of the message table rows unfortunately

    "id","select_type","table","type","possible_keys", "key","key_len","ref","rows","Extra"
    1,"PRIMARY","<derived2>","ALL","","",(null),"",416 7,""
    1,"PRIMARY","topics","eq_ref","PRIMARY,initiator,s tatus","PRIMARY",4,"Y.topic_id",1,"Using where"
    2,"DERIVED","messages","ALL","","",(null),"",23693 ,"Using temporary; Using filesort"
    Is it possible to left join two select statements, with a WHERE condition related to the first join table present in the second select?

    ex:
    Code:
    SELECT * FROM
      ( SELECT X FROM A ) AS T1
      LEFT JOIN
      ( SELECT Y FROM B WHERE Z = T1.X  ) AS T2
    )

  4. #4
    Join Date
    Nov 2004
    Posts
    35
    Just as additionnal info, here's the total count of message rows related to the WHERE clause used:

    Code:
    SELECT COUNT(*) AS total
    FROM messages LEFT JOIN topics ON messages.topic_id = topics.id
    WHERE topics.initiator='3433' AND topics.status != '4'
    -->341

    Code:
    SELECT COUNT(*) AS total FROM messages
    -->23693

  5. #5
    Join Date
    Nov 2004
    Posts
    35
    No responses - impossible I gather? If this is a limitation of mySQL, I'll have to do it programmatically I suppose.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •