Page 1 of 2 12 LastLast
Results 1 to 15 of 25
  1. #1
    Join Date
    Sep 2004
    Posts
    30

    Unanswered: optimize a UNION query with LIKE (was "Query help")

    Hi,

    SQL 2K
    Table loaded using BULK INSERT with ~5 000 000 records.
    CREATE TABLE [MyTable] (
    [SeqNumber] int NOT NULL , -- sequence unique number
    [ObjId] int NOT NULL ,
    [HierarchyLevel] varchar(255) NULL , -- highest level = 0
    [NameText] varchar(255) NULL
    .
    .
    )

    insert into MyTable
    select 1, 1, 0, 'text-10'
    union all select 2, 1, 1, 'text-11 state-1'
    union all select 3, 1, 2, 'text-12 social-1'
    union all select 4, 1, 3, 'text-13 abc social-1.1'
    union all select 5, 1, 1, 'text-11 123'
    union all select 6, 1, 1, 'text-11 ABCDEF'
    union all select 7, 1, 2, 'text-12 bbb'
    union all select 8, 1, 3, 'text-13 social-2'
    union all select 9, 1, 4, 'text-14 aaa'
    union all select 10, 2, 1, 'text-21 state-3'
    union all select 11, 3, 5, 'text-31 state-2 social-3'

    - Unique Index on SeqNumber

    Query:
    Select top 100 TI.*
    from [dbo].[MyTable] TI
    inner join [dbo].[MyTable] TL
    on TL.ObjId = TI.ObjId
    where (TL.[NameText] like '%state%')
    and TI.[NameText] like '%social%'
    and TI.HierarchyLevel > TL.HierarchyLevel
    and TI.SeqNumber > TL.SeqNumber
    and not exists (select *
    from [dbo].[MyTable] T3
    where T3.ObjId = TL.ObjId
    and T3.SeqNumber > TL.SeqNumber
    and T3.SeqNumber < TI.SeqNumber
    and T3.HierarchyLevel <= TL.HierarchyLevel)
    Union
    Select T4.*
    from [dbo].[MyTable] T4
    where T4.[NameText] like '%state%'
    and T4.[NameText] like '%social%'
    order by Ti.SeqNumber

    Result:
    SeqNumber ObjId HierarchyLevel NameText
    3 1 2 text-12 social-1
    4 1 3 text-13 abc social-1.1
    11 3 5 text-31 state-2 social-3

    Could somebody help me please to optimize this query?

  2. #2
    Join Date
    Dec 2004
    Location
    Kharkov, Ukraine
    Posts
    40
    -1 table scan in such case:

    Select top 100 TI.*
    from [dbo].[MyTable] TI
    inner join [dbo].[MyTable] TL
    on TL.ObjId = TI.ObjId
    where
    (
    (TL.[NameText] like '%state%')
    and TI.[NameText] like '%social%'
    and TI.HierarchyLevel > TL.HierarchyLevel
    and TI.SeqNumber > TL.SeqNumber
    and not exists (select *
    from [dbo].[MyTable] T3
    where T3.ObjId = TL.ObjId
    and T3.SeqNumber > TL.SeqNumber
    and T3.SeqNumber < TI.SeqNumber
    and T3.HierarchyLevel <= TL.HierarchyLevel)
    )

    OR(
    where TI.[NameText] like '%state%'
    and TI.[NameText] like '%social%')

    order by Ti.SeqNumber

  3. #3
    Join Date
    Sep 2004
    Posts
    30

    optimize a UNION query with LIKE

    Thanks for reply

    I run the suggested solution but the performance is worse.
    Curently I have 1.3 million records in the table and the response times are:
    old = 10 sec
    new = 50 sec

    The execution plan for the new query shows 3 table scans.
    No index suggestions from ‘Index Tuning Wizard’.

    Any other suggestions how to improve performance ?
    Helena

  4. #4
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    what is the purpose of TOP 100?

    are you getting the top 100 of the (unsorted) first SELECT in the union plus all the rows of the second?

    would a subquery that performs the TOP 100 help the first SELECT?

    have you tried a left outer join instead of the NOT EXISTS subquery?
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  5. #5
    Join Date
    Jun 2003
    Location
    Ohio
    Posts
    12,592
    Provided Answers: 1
    Code:
    Select	TI.*
    from	[dbo].[MyTable] TI
    	inner join [dbo].[MyTable] TL on TL.ObjId = TI.ObjId
    	left outer join [dbo].[MyTable] T3
    		on T3.ObjId = TL.ObjId
    		and T3.SeqNumber > TL.SeqNumber
    		and T3.SeqNumber < TI.SeqNumber
    		and T3.HierarchyLevel <= TL.HierarchyLevel
    where	(TL.[NameText] like '%state%'
    	and TI.[NameText] like '%social%'
    	and TI.HierarchyLevel > TL.HierarchyLevel
    	and TI.SeqNumber > TL.SeqNumber
    	and T3.ObjID is null)
    	or
    	(TI.[NameText] like '%state%'
    	and TI.[NameText] like '%social%')
    order by TI.SeqNumber
    But you can't expect this query to ever run lickety-split, with all the LIKE and > comparisons.
    If it's not practically useful, then it's practically useless.

    blindman
    www.chess.com: "sqlblindman"
    www.LobsterShot.blogspot.com

  6. #6
    Join Date
    Dec 2004
    Location
    Kharkov, Ukraine
    Posts
    40
    Show you execution plan, please. I am interested.
    This is my (for 11 rows of course):
    |--Top(100)
    |--Nested Loops(Left Anti Semi Join, WHERE(([T3].[ObjId]=[TL].[ObjId] AND [T3].[SeqNumber]>[TL].[SeqNumber]) AND [T3].[SeqNumber]<[TI].[SeqNumber]) AND [T3].[HierarchyLevel]<=[TL].[HierarchyLevel]))
    |--Nested Loops(Inner Join, OUTER REFERENCES[TL].[SeqNumber], [TL].[HierarchyLevel], [TL].[ObjId]))
    | |--Table Scan(OBJECT[MWolf_Mail].[dbo].[MyTable] AS [TL]), WHERElike([TL].[NameText], '%state%', NULL)))
    | |--Filter(WHERE[TI].[SeqNumber]>[TL].[SeqNumber]))
    | |--Index Spool(SEEK[TI].[ObjId]=[TL].[ObjId] AND [TI].[HierarchyLevel] > [TL].[HierarchyLevel]))
    | |--Table Scan(OBJECT[MWolf_Mail].[dbo].[MyTable] AS [TI]), WHERElike([TI].[NameText], '%social%', NULL)))
    |--Table Scan(OBJECT[MWolf_Mail].[dbo].[MyTable] AS [T3]))

    Apropos, something prompts me that table is not normalized

    And also, try to create index (ObjId, SeqNumber)

  7. #7
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    blindman, where's your TOP 100 logic?

    i'd still like to know what that was all about

    without testing, i'm guess the TOP 100 was applied only to the first SELECT in the UNION...
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  8. #8
    Join Date
    Jun 2003
    Location
    Ohio
    Posts
    12,592
    Provided Answers: 1
    Silly me. I was assuming she wanted TOP 100% just for sorting purposes.

    I guess we only see what we want to see, eh?

    I think Helena needs to explain what the heck she is trying to do with this messy query.
    Last edited by blindman; 02-01-05 at 13:31.
    If it's not practically useful, then it's practically useless.

    blindman
    www.chess.com: "sqlblindman"
    www.LobsterShot.blogspot.com

  9. #9
    Join Date
    Sep 2004
    Posts
    30
    Thank You all for your response.

    Sorry for the late response but this is due to difference in working hours
    I apologize for the poor explanation but my English isn’t very good . I shell do my best to explain it
    There is ONE and only one table holding the data.
    The data is collection of hierarchy :

    SeqNumber ObjId HierarchyLevel NameText
    1 1 0 name1
    2 1 1 name1_SynonymLevel1
    3 1 2 name1_SynonymLevel1_ SynonymLevel2
    4 1 3 name1_SynonymLevel1_ SynonymLevel2_ SynonymLevel3
    5 1 1 name1_SynonymLevel1 -- second main synonym for the same name, no more synonym levels fot it
    6 1 1 name1_SynonymLevel1 -- third main synonym for the same name
    7 1 2 name1_SynonymLevel1_ SynonymLevel2
    8 1 3 name1_SynonymLevel1_ SynonymLevel2_ SynonymLevel3
    9 1 4 name1_SynonymLevel1_ SynonymLevel2_ SynonymLevel3_ SynonymLevel4
    10 2 1 name2 -- new name with only one synonym level
    11 3 5 name3 -- new name with only one synonym level

    The client created & populated the table & now he asked me to improve the performance. This query is repeated frequently and needs to be performed very quickly.

    The query returns name information that is further checked by the user
    TOP 100 limits the num of records returned to 100 total for the user sake
    The records returned having all the key words in its name field or belong to level the main key was found eg if ‘state’ was found in rec 2 only record 3 belong to it. If ‘state’ was found in rec 1 then rec 2-9 belong to it.

    And NO, I can’t change the design of the table !!!

    Rudy & Blindman:
    I shell try today the OUTER JOIN

    Mwolf:
    -Currently, the best performance for this query is where exist only one index : SeqNumber.
    -Changing the index (ObjId, SeqNumber) making the performance much worse (~ 110 sec).
    -EVEN ADDING additional index (ObjId, HierarchyLevel) making the performance much worse


    THANK U ALL
    Any suggestions ?????????????????

    Helena

  10. #10
    Join Date
    Sep 2004
    Posts
    30
    Rudy & Blindman:
    I tested the OUTER JOIN – it worked over 7 min

    Mwolf:
    Execution plan for the query u suggested:
    ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
    |--Top(100)
    |--Filter(WHERE(like([TI].[NameText], '%state%', NULL) OR (((like([TL].[NameText], '%state%', NULL) AND [TI].[HierarchyLevel]>[TL].[HierarchyLevel]) AND [TI].[SeqNumber]>[TL].[SeqNumber]) AND [Expr1004]=NULL)))
    |--Nested Loops(Left Semi Join, WHERE(((like([TI].[NameText], '%state%', NULL) OR IsFalseOrNull(like([TL].[NameText], '%state%', NULL))) OR IsFalseOrNull([TI].[HierarchyLevel]>[TL].[HierarchyLevel])) OR IsFalseOrNull([TI].[SeqNumber]>[TL].[SeqNumber]))OUTER REFERENCES[TI].[SeqNumber], [TL].[ObjId], [TL].[SeqNumber], [TL].[HierarchyLevel]), DEFINE[Expr1004] = [PROBE VALUE]))
    |--Nested Loops(Inner Join, WHERE([TL].[ObjId]=[TI].[ObjId]))
    | |--Sort(ORDER BY[TI].[SeqNumber] ASC))
    | | |--Table Scan(OBJECT[helena_tests].[dbo].[MyTable] AS [TI]), WHERE(like([TI].[NameText], '%social%', NULL)))
    | |--Table Scan(OBJECT[helena_tests].[dbo].[MyTable] AS [TL]))
    |--Row Count Spool
    |--Table Scan(OBJECT[helena_tests].[dbo].[MyTable] AS [T3]), WHERE((([T3].[ObjId]=[TL].[ObjId] AND [T3].[SeqNumber]>[TL].[SeqNumber]) AND [T3].[SeqNumber]<[TI].[SeqNumber]) AND [T3].[HierarchyLevel]<=[TL].[HierarchyLevel]))

    Thanks for help
    Helena

  11. #11
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    helen your english is excellent

    i'm having trouble understanding your sample data, but that's okay, it is not unusual for me to have trouble understanding something

    for example, you say "if ‘state’ was found in rec 2 only record 3 belong to it If ‘state’ was found in rec 1 then rec 2-9 belong to it" but it looks like 4 belongs to 3 so right away i am lost

    just for interest, what is the performance of your original query without the NOT EXISTS subquery?
    Last edited by r937; 02-02-05 at 07:00.
    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  12. #12
    Join Date
    Jun 2003
    Location
    Ohio
    Posts
    12,592
    Provided Answers: 1
    "helen your english is excellent"

    How would you know? Your native tongue is Canadian.

    Helen, I'm stumped too on the logic you are trying to implement here. It looks like you have record IDs and the level where the records exists in the hierarchy, but I don't see where you have any Parent/Child relationships between your records. Without that, I don't get how you can figure out what records belong to what records. Unless you are depending upon the order of the data to indicated relationships, and that is a very BAD idea.
    If it's not practically useful, then it's practically useless.

    blindman
    www.chess.com: "sqlblindman"
    www.LobsterShot.blogspot.com

  13. #13
    Join Date
    Apr 2002
    Location
    Toronto, Canada
    Posts
    20,002
    use a smiley when you say that, otherwise it's pistols at dawn, my good man

    besides, my native language is german

    my family didn't move to canada until 1953, when i was 4 years old

    rudy.ca | @rudydotca
    Buy my SitePoint book: Simply SQL

  14. #14
    Join Date
    Nov 2002
    Location
    Jersey
    Posts
    10,322
    You can't.

    It will ALWAYS cause a scan.

    You need to normalize the data.

    You need additional columns for text, social, ect

    English? They speak English in Toronto? Since when?
    Brett
    8-)

    It's a Great Day for America everybody!

    dbforums Yak CorralRadio 'Rita
    dbForums Member List
    I'm Good Once as I ever was

    The physical order of data in a database has no meaning.

  15. #15
    Join Date
    Jun 2003
    Location
    Ohio
    Posts
    12,592
    Provided Answers: 1
    Pistols it is then!

    But can we meet at noon? I am not much of a morning person.
    If it's not practically useful, then it's practically useless.

    blindman
    www.chess.com: "sqlblindman"
    www.LobsterShot.blogspot.com

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •