Page 1 of 2 12 LastLast
Results 1 to 15 of 26
  1. #1
    Join Date
    Nov 2012
    Posts
    13

    Alternative for IN+Subselect in this case

    Good morning everyone,

    iam having some problems with the following statement. The Subselect is moderately quick, but the whole statment takes about 70-75 seconds, which is unacceptable. Here are some facts you might need to know:
    The statement is partly hardcoded and cant be changed (until the first WHERE). So i cant use a join rather than the IN+Subselect.
    "EXISTS" only saves me a few seconds.
    table consists of 1 million lines
    idxmail consists of 2 million lines
    matching consists of 10 000 lines
    @var1 is set by the program


    SELECT col1, col2
    FROM table b
    WHERE b.id IN (
    SELECT id
    FROM idxmail i
    JOIN matching m ON i.proxyaddress = m.proxyaddress
    WHERE m.mailnickname = '@var1')

    The profile of this query has about 2 million lines aswell. One for each line in idxmail.


    Explain (There is an index on id in table b):

    "id" "select_type" "table" "type" "possible_keys" "key" "key_len" "ref" "rows" "Extra"
    "1" "PRIMARY" "b" "ALL" NULL NULL NULL NULL "756831" "Using where"
    "2" "DEPENDENT SUBQUERY" "i" "ref" "ID,Proxyadresse" "ID" "5" "func" "1" "Using where"
    "2" "DEPENDENT SUBQUERY" "m" "ALL" NULL NULL NULL NULL "10" "Using where"

    Thanks in advance.

  2. #2
    Join Date
    Sep 2009
    Location
    San Sebastian, Spain
    Posts
    860
    You would need to get some information based on the number of entries returned from each of the tables. In your case above, the estimated rows returned from table b is 756831, idxmail returns 1 row and matching returns 10 rows. As the number of rows returned from the subquery are relatively small, we can create indexes referencing back to the data blocks. If the number of rows returned were extensive then I would recommend including all information in the indexes.

    To improve the performance of this query I would make sure I have the following indexes on each of the tables:

    idxmail (proxyaddress)
    matching (mailnickname)
    table b (id)

    With these this will speed up the processing.
    Ronan Cashell
    Certified Oracle DBA/Certified MySQL Expert (DBA & Cluster DBA)
    http://www.it-iss.com

  3. #3
    Join Date
    Nov 2012
    Posts
    13
    These indexes aready existst.
    Unfortunately it still takes more than a minute for the mysql server to return 3500 rows.

  4. #4
    Join Date
    Sep 2009
    Location
    San Sebastian, Spain
    Posts
    860
    This does not match the explain plan that you have provided:
    table b (no possible keys or key is being used)
    table idxmail (possible keys: ID and Proxyaddresse but is using ID)
    table matching (possible keys: no keys are used)

    This might happen if you have not analyzed statistics on the tables. This allows the cost based optimizer to determine the best access plan. I would do that but at the moment it is not selecting the correct indexes.
    Ronan Cashell
    Certified Oracle DBA/Certified MySQL Expert (DBA & Cluster DBA)
    http://www.it-iss.com

  5. #5
    Join Date
    Nov 2012
    Posts
    13
    sorry, you where right, i posted an older explain result.
    The problem remains though.

    "id" "select_type" "table" "type" "possible_keys" "key" "key_len" "ref" "rows" "Extra"
    "1" "PRIMARY" "b" "ALL" NULL NULL NULL NULL "813796" "Using where"
    "2" "DEPENDENT SUBQUERY" "i" "ref" "ID,Proxyadresse" "ID" "5" "func" "1" "Using where"
    "2" "DEPENDENT SUBQUERY" "m" "ref" "Mailnickname,Proxyadresse" "Proxyadresse" "768" "dwdata.i.Proxyadresse" "1" "Using where"

  6. #6
    Join Date
    Sep 2009
    Location
    San Sebastian, Spain
    Posts
    860
    Can you show use the create index statements for the indexes

    Table idxmail (ID, Proxyaddresse)
    Table matching (Mailnickname, Proxyaddresse)
    Table b (ID)
    Ronan Cashell
    Certified Oracle DBA/Certified MySQL Expert (DBA & Cluster DBA)
    http://www.it-iss.com

  7. #7
    Join Date
    Nov 2012
    Posts
    13
    Iam not quite sure, whether this is what you wanted to see (SHOW CREATE INDEX doesnt exist.):


    "Table" "Non_unique" "Key_name" "Seq_in_index" "Column_name" "Collation" "Cardinality" "Sub_part" "Packed" "Null" "Index_type" "Comment"
    "belegarchiv" "0" "PRIMARY" "1" "DWDOCID" "A" "727851" NULL NULL "" "BTREE" ""
    "belegarchiv" "0" "IX13dde7782d48449fa2218904eeba" "1" "DWVERID" "A" "727851" NULL NULL "" "BTREE" ""
    "belegarchiv" "0" "IX13dde7782d48449fa2218904eeba" "2" "DWSYSVERSION" "A" "727851" NULL NULL "" "BTREE" ""
    "belegarchiv" "1" "IX6f2201ba8d8f469785525393227d" "1" "DWDOCID" "A" "727851" NULL NULL "" "BTREE" ""
    "belegarchiv" "1" "IX6f2201ba8d8f469785525393227d" "2" "DWDISKNO" "A" "727851" NULL NULL "YES" "BTREE" ""
    "belegarchiv" "1" "IX1fca6b92d1ee4c028998887ddb5f" "1" "DWDOCID" "A" "727851" NULL NULL "" "BTREE" ""
    "belegarchiv" "1" "IX1fca6b92d1ee4c028998887ddb5f" "2" "DWFLAGS" "A" "727851" NULL NULL "YES" "BTREE" ""
    "belegarchiv" "1" "IX1fca6b92d1ee4c028998887ddb5f" "3" "DWFLAGSEX" "A" "727851" NULL NULL "YES" "BTREE" ""

    "idxmail" "1" "ID" "1" "ID" "A" "2149097" NULL NULL "YES" "BTREE" ""
    "idxmail" "1" "Proxyadresse" "1" "Proxyadresse" "A" "85963" "25" NULL "YES" "BTREE" ""

    "matchingtabelle" "1" "Mailnickname" "1" "Mailnickname" "A" "10" NULL NULL "YES" "BTREE" ""
    "matchingtabelle" "1" "Proxyadresse" "1" "Proxyadresse" "A" "10" NULL NULL "YES" "BTREE" ""

  8. #8
    Join Date
    Sep 2009
    Location
    San Sebastian, Spain
    Posts
    860
    Create an index on belegarchiv on the ID column.
    Ronan Cashell
    Certified Oracle DBA/Certified MySQL Expert (DBA & Cluster DBA)
    http://www.it-iss.com

  9. #9
    Join Date
    Nov 2012
    Posts
    13
    there are some already. the field name is DWDOCID actually.

  10. #10
    Join Date
    Sep 2009
    Location
    San Sebastian, Spain
    Posts
    860
    Now I am confused. The query you have provided initially is:

    SELECT col1, col2
    FROM table b
    WHERE b.id IN (
    SELECT id
    FROM idxmail i
    JOIN matching m ON i.proxyaddress = m.proxyaddress
    WHERE m.mailnickname = '@var1')

    On table b which I am assuming is belegarchiv you mention the column ID. Perhaps you should post the actual SQL that is being run. Issue a SHOW CREATE TABLE for each of the tables and this will also give the table layout as well as the indexes.
    Ronan Cashell
    Certified Oracle DBA/Certified MySQL Expert (DBA & Cluster DBA)
    http://www.it-iss.com

  11. #11
    Join Date
    Nov 2012
    Posts
    13
    i changed the query to not confuse you with software specific columns :-)

    explain
    SELECT Betreff, absender
    FROM belege b
    WHERE b.dwdocid IN (
    SELECT id
    FROM idxmail i
    JOIN matchingtabelle m ON i.proxyadresse = m.proxyadresse
    WHERE m.mailnickname = 'a@b.de')

    "Table" "Create Table"
    "belegarchiv" "CREATE TABLE `belegarchiv` (
    `DWDOCID` int(11) NOT NULL default '0',
    `DWPAGECOUNT` int(11) default NULL,
    `DWDISKNO` int(11) default NULL,
    `DWFLAGS` int(11) default NULL,
    `DWOFFSET` int(11) default NULL,
    `DWSTOREDATETIME` datetime default NULL,
    `DWSTOREUSER` varchar(20) default NULL,
    `DWMODDATETIME` datetime default NULL,
    `DWMODUSER` varchar(20) default NULL,
    `DWLASTACCESSDATETIME` datetime default NULL,
    `DWLASTACCESSUSER` varchar(20) default NULL,
    `DWSTORAGEACCESS` longtext,
    `DWEXTENSION` varchar(20) default NULL,
    `DWHEADERSIZE` bigint(20) default NULL,
    `DWDOCSIZE` bigint(20) default NULL,
    `DWVERID` int(11) NOT NULL,
    `DWSYSVERSION` int(11) NOT NULL default '0',
    `DWFLAGSEX` bigint(20) default NULL,
    `BELEGART` varchar(40) default NULL,
    `BELEGNUMMER` varchar(20) default NULL,
    `ABTEILUNG` varchar(40) default NULL,
    `KOSTENSTELLE` varchar(40) default NULL,
    `VORGANSNUMMER` varchar(40) default NULL,
    `BETREFF` varchar(255) default NULL,
    `DATUM` datetime default NULL,
    `UHRZEIT` varchar(20) default NULL,
    `EMAIL_INHALT` text,
    `NACHRICHTEN_ID` varchar(255) default NULL,
    `STATUS` varchar(255) default NULL,
    `PROJEKTNUMMER` varchar(40) default NULL,
    `KUNDENNUMMER` varchar(40) default NULL,
    `FIRMA` varchar(60) default NULL,
    `DATEITYP` varchar(20) default NULL,
    `DATEIGROESSE` int(11) default NULL,
    `DOKUMENTENTYP` varchar(20) default NULL,
    `STATUS2` varchar(60) default NULL,
    `MITARBEITER` varchar(40) default NULL,
    `ABGELEGTVON` varchar(40) default NULL,
    `ABSENDER` varchar(40) default NULL,
    `EMAIL_INTERN_` varchar(255) default NULL,
    `EMAIL_EXTERN` varchar(255) default NULL,
    `DSB` varchar(20) default NULL,
    `EBENE1` varchar(255) default NULL,
    `PST` varchar(255) default NULL,
    `UID` varchar(255) default NULL,
    `ORDNER` varchar(255) default NULL,
    `EMPFANGEN` varchar(255) default NULL,
    `EMPFAENGER` varchar(255) default NULL,
    PRIMARY KEY (`DWDOCID`),
    UNIQUE KEY `IX13dde7782d48449fa2218904eeba` (`DWVERID`,`DWSYSVERSION`),
    KEY `IX6f2201ba8d8f469785525393227d` (`DWDOCID`,`DWDISKNO`),
    KEY `IX1fca6b92d1ee4c028998887ddb5f` (`DWDOCID`,`DWFLAGS`,`DWFLAGSEX`),
    KEY `DWDOCID` (`DWDOCID`)
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8"


    "Table" "Create Table"
    "idxmail" "CREATE TABLE `idxmail` (
    `ID` int(10) default NULL,
    `Proxyadresse` text,
    `rematched` bit(1) default NULL,
    KEY `ID` (`ID`),
    KEY `Proxyadresse` (`Proxyadresse`(25))
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8"


    "Table" "Create Table"
    "matchingtabelle" "CREATE TABLE `matchingtabelle` (
    `Mailnickname` varchar(255) default NULL,
    `Proxyadresse` varchar(255) default NULL,
    KEY `Mailnickname` (`Mailnickname`),
    KEY `Proxyadresse` (`Proxyadresse`)
    ) ENGINE=InnoDB DEFAULT CHARSET=utf8"

  12. #12
    Join Date
    Sep 2009
    Location
    San Sebastian, Spain
    Posts
    860
    I take is belege is belegarchiv?

    I also see that Proxyaddresse in one table as a TEXT data type and VARCHAR(255) in the other. I think you should change both of these to become the same data type (preferably VARCHAR(255)). You should create the indexes on both Proxyaddresse the same i.e. KEY Proxyaddress (Proxyaddresse(25)).

    The matchingtabelle table is mailnickname supposed to be primary key or unique?
    What about idxmail is ID a primary key or unique?
    Ronan Cashell
    Certified Oracle DBA/Certified MySQL Expert (DBA & Cluster DBA)
    http://www.it-iss.com

  13. #13
    Join Date
    Nov 2012
    Posts
    13
    Good morning,

    thats the same table, right.

    Changed the data type to varchar(255) and the key length to 25.
    No change at all.

    The entries in idxmail.id and matchingtabelle.mailnickname are not unique (there are duplicate entries) and therefor cannot have a PRIMARY OR UNIQUE index.

  14. #14
    Join Date
    Sep 2009
    Location
    San Sebastian, Spain
    Posts
    860
    Did you analyze the tables after changing the data type?
    Also how many rows do you get returned?
    Can you also resend the explain plan after reanalyzing the table?
    Ronan Cashell
    Certified Oracle DBA/Certified MySQL Expert (DBA & Cluster DBA)
    http://www.it-iss.com

  15. #15
    Join Date
    Nov 2012
    Posts
    13
    I analyzed the table now.
    Here is the explain plan:

    "id" "select_type" "table" "type" "possible_keys" "key" "key_len" "ref" "rows" "Extra"
    "1" "PRIMARY" "b" "ALL" NULL NULL NULL NULL "774367" "Using where"
    "2" "DEPENDENT SUBQUERY" "i" "ref" "ID,Proxyadresse" "ID" "5" "func" "1" "Using where"
    "2" "DEPENDENT SUBQUERY" "m" "ref" "Mailnickname,Proxyadresse" "Proxyadresse" "78" "dwdata.i.Proxyadresse" "1" "Using where"

    The query returns 3565 rows.

    Due to hardware changes i got an significant improvement down to 26 seconds. But still way to expencive.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •