Results 1 to 14 of 14
  1. #1
    Join Date
    Nov 2002
    Posts
    8

    Post Unanswered: BETWEEN in fields type datetime

    Hello, I have a table with a field with type datetime with name tr_fecha. Exist a index with this field.

    When I execute following query uses index

    select * from tc_tdatosanterior_tarjeta where tr_fecha between '2003-03-15 00:00:00' and '2003-03-15 23:59:59'

    But, when i use variables, the query uses table scan. What is the reason?. Thanks

    declare @fch1 datetime
    declare @fch2 datetime

    select @fch1 = '2003-03-15 00:00:00'
    select @fch2 = '2003-03-15 23:59:59'

    select * from tc_tdatosanterior_tarjeta where tr_fecha between @fch1 and @fch2

  2. #2
    Join Date
    Nov 2002
    Location
    Switzerland
    Posts
    524
    Between ... and ... is replaced by < and > by the optimizer.

    How many pages are in your table ? How many pages solve your condition ? Use dbcc traceon(3604,302) to understand the strategy.

  3. #3
    Join Date
    Jan 2004
    Posts
    19
    The problem seems to be happening because the optimiser is not able to use histogram values in second case(where you are using variables).The values of the variables are not known at compile time leading to optimiser assuming the closed range match(25% of table count).Have you declared a non clustered index on tr_fecha ?(in this case optimiser will most probably use a table scan).Anyways you can always find the strategy by running dbcc as suggested by fadace.
    Thnks
    Andy

  4. #4
    Join Date
    Nov 2002
    Posts
    8
    Hi, thanks for your help.

    Important: The lock scheme in the table is datapages. Clustered index not exist.

    I used the dbcc comands with the following results:

    =====================================

    For the query:

    select * from tc_tdatosanterior_tarjeta where tr_fecha between '2003-03-15 00:00:00' and '2003-03-15 23:59:59'

    *************************************
    Results for the dbcc command:

    Beginning selection of qualifying indexes for table 'tc_tdatosanterior_tarjeta',
    varno = 0, objectid 2073055390.
    The table (Datapages) has 325475 rows, 4160 pages,
    Data Page Cluster Ratio 0.997527

    Table scan cost is 325475 rows, 4461 pages,
    using data prefetch (size 16K I/O),
    in data cache 'default data cache' (cacheid 0) with MRU replacement


    Selecting best index for the SEARCH CLAUSE:
    tc_tdatosanterior_tarjeta.tr_fecha <= Mar 15 2003 11:59:59:000PM
    tc_tdatosanterior_tarjeta.tr_fecha >= Mar 15 2003 12:00:00:000AM


    Estimated selectivity for tr_fecha,
    selectivity = 0.001470, upper limit = 0.052631.

    Estimating selectivity of index 'ix_tdatosanterior_tarjeta_01', indid 2
    scan selectivity 0.001470, filter selectivity 0.001470
    479 rows, 78 pages, index height 2,
    Data Row Cluster Ratio 0.858357,
    Index Page Cluster Ratio 0.999336,
    Data Page Cluster Ratio 0.607890


    The best qualifying index is 'ix_tdatosanterior_tarjeta_01' (indid 2)
    costing 78 pages,
    with an estimate of 479 rows to be returned per scan of the table,
    using index prefetch (size 16K I/O) on leaf pages,
    in index cache 'default data cache' (cacheid 0) with LRU replacement
    using no data prefetch (size 8K I/O),
    in data cache 'default data cache' (cacheid 0) with LR


    ==========================================

    For the query:

    declare @fch1 datetime
    declare @fch2 datetime

    select @fch1 = '2003-03-15 00:00:00'
    select @fch2 = '2003-03-15 23:59:59'

    select * from tc_tdatosanterior_tarjeta where tr_fecha between @fch1 and @fch2

    ************************************
    Results for the dbcc command:

    Beginning selection of qualifying indexes for table 'tc_tdatosanterior_tarjeta',
    varno = 0, objectid 2073055390.
    The table (Datapages) has 325475 rows, 4160 pages,
    Data Page Cluster Ratio 0.997527

    Table scan cost is 325475 rows, 4461 pages,
    using data prefetch (size 16K I/O),
    in data cache 'default data cache' (cacheid 0) with MRU replacement

    Selecting best index for the search clause:
    tc_tdatosanterior_tarjeta.tr_fecha <= unknown-value
    tc_tdatosanterior_tarjeta.tr_fecha >= unknown-value

    SARG is a local variable or the result of a function or an expresion,
    usign the default in between selectivity to estimate selectivity.

    Estimated selectivity for tr_fecha,
    selectivity = 0.250000.

    Estimating selectivity of index 'ix_tdatosanterior_tarjeta_01', indid 2
    scan selectivity 0.250000, filter selectivity 0.250000
    81369 rows, 12851 pages, index height 2,
    Data Row Cluster Ratio 0.858357,
    Index Page Cluster Ratio 0.999336,
    Data Page Cluster Ratio 0.607890

    The best qualifying index is 'ix_tdatosanterior_tarjeta_01' (indid 2)
    costing 12851 pages,
    with an estimate of 81369 rows to be returned per scan of the table,
    using index prefetch (size 16 K I/O) of leaf pages,
    in index cache 'default data cache' (cacheid 0) with MRU replacement
    using no data prefetch (size 8K I/O),
    in data cache 'default data cache' (cache id 0) with MRU replacement
    Search argument selectivity is 0.250000

    =================================

    When i used the variables for the query, the optimizer choice's is table scan because the cost for use the index is 12851 pages, and the cost for table scan is 4461 pages.

    The Data Page Cluster Ratio is 0.607890, this measurement is the cause of the election of table scan. Because, I created a clustered index with the field tr_fecha, which after its creation was eliminated.

    =================================

    I executed the query following after eliminating clustered

    declare @fch1 datetime
    declare @fch2 datetime

    select @fch1 = '2003-03-15 00:00:00'
    select @fch2 = '2003-03-15 23:59:59'

    select * from tc_tdatosanterior_tarjeta where tr_fecha between @fch1 and @fch2

    ************************************
    Results for the dbcc command:

    Beginning selection of qualifying indexes for table 'tc_tdatosanterior_tarjeta',
    varno = 0, objectid 2073055390.
    The table (Datapages) has 325475 rows, 4128 pages,
    Data Page Cluster Ratio 0.996401

    Table scan cost is 325475 rows, 4228 pages,
    using data prefetch (size 16K I/O),
    in data cache 'default data cache' (cacheid 0) with MRU replacement


    Selecting best index for the SEARCH CLAUSE:
    tc_tdatosanterior_tarjeta.tr_fecha <= unknown-value
    tc_tdatosanterior_tarjeta.tr_fecha >= unknown-value

    SARG is a local variable or the result of a function or an expression,
    using the default in between selectivity to estimate selectivity.

    Estimated selectivity for tr_fecha,
    selectivity = 0.250000.

    Estimating selectivity of index 'ix_tdatosanterior_tarjeta_01', indid 2
    scan selectivity 0.250000, filter selectivity 0.250000
    81369 rows, 1513 pages, index height 2,
    Data Row Cluster Ratio 0.999782,
    Index Page Cluster Ratio 0.999380,
    Data Page Cluster Ratio 0.994568


    The best qualifying index is 'ix_tdatosanterior_tarjeta_01' (indid 2)
    costing 1513 pages,
    with an estimate of 81369 rows to be returned per scan of the table,
    using index prefetch (size 16K I/O) on leaf pages,
    in index cache 'default data cache' (cacheid 0) with LRU replacement
    using data prefetch (size 16K I/O),
    in data cache 'default data cache' (cacheid 0) with LRU replacement
    Search argument selectivity is 0.250000.

    ==================================

    The data page cluster ratio is 0.994568. Why this measurement changed the results and why the clustered was the solution?

    Regards,

  5. #5
    Join Date
    Jan 2004
    Posts
    19
    Guilleve,
    I can tell you why the clustered index was solution but i am a little baffled by the page costs calculated by the optimiser.Hopefully someone else in this forum will be able to solve my query.

    For the second case(using variables and non clustered index) :

    Since the optimiser is not able to get the variable values at compile time it is using built in percentage for closed range queries(25% of total rowcount i.e total no of rows effected is assumed to be 325475*.25 ~ 81369 ) hence selectivity = 0.250000 .Now in a non clustered index for every matching row ,a data page have to be scanned(if the query is not covered which is in the above case) so i expected the I/O cost to be at least more than 81369(even if i exclude index pages) .But i do not understand how the optimiser reached a count of 12851 pages

    For the third case(using variables with clustered index):

    Here everything remains the same except the fact that data is stored in sorted manner in clustered index.Since the no of rows/page is 78 ~ 325475 /4128 .So the no. of datapages is 1044 ~ 81369/78 .Now after taking into consideration the index pages I/O should be 1048*3(pages scanned to reach data page)= 3144 but again i am not able to understand how the count of 1513 was achieved.

    Thanks,
    Andy
    Last edited by Andy2004; 03-16-04 at 23:34.

  6. #6
    Join Date
    Jun 2002
    Location
    Argentina
    Posts
    78
    Guilleve,
    I'm agree with Andy.
    If you want to use the index "ix_tdatosanterior_tarjeta_01" for this select you can "Specifying an Index for a Query":

    <B>
    select *
    from tc_tdatosanterior_tarjeta ( index ix_tdatosanterior_tarjeta_01 )
    where tr_fecha between @fch1 and @fch2
    </B>

    I think this is well just for an exception. If you need to tuning all the query with the table "tc_tdatosanterior_tarjeta", then you will have to change the cluster and index structure for a better one.
    Never forget to run update statistics !!
    And, check periodically if your tuning select run fine o not !!
    For more information read the chapter "Advanced Optimizing Tools" into the "Performance and Tuning Guide"
    http://manuals.sybase.com/onlinebook...TextView/35283
    Good luck

    Sebastian

  7. #7
    Join Date
    Jan 2004
    Posts
    19
    Hi,
    Can someone please explain how optimiser arrived at the counts in dbcc output mentioned by guilleve.
    TIA

  8. #8
    Join Date
    Aug 2002
    Location
    Omaha, Nebraska
    Posts
    21
    Originally posted by Andy2004
    Hi,
    Can someone please explain how optimiser arrived at the counts in dbcc output mentioned by guilleve.
    TIA
    To explain that, one would need to know the average size of a row in the index, and the total number of pages in the index.

  9. #9
    Join Date
    Jan 2004
    Posts
    19
    Thanks kevin for replying
    The count which is puzzling me is for the second case of dbcc(non clustered with variables)
    **
    81369 rows, 12851 pages, index height 2
    **
    Since for a non clustered index the leaf page points to the data row so the optimiser should estimate the I/O count to be =81369 + index pages scanned.But it is taking the number of pages scanned to be 12851 which is < 81369 .So even if i don't take into consideration index pages to be scanned the I/O count should always be > 81369 .

    TIA

  10. #10
    Join Date
    Nov 2002
    Posts
    8
    Hi, everybody

    Thanks for your help in this case.

  11. #11
    Join Date
    Aug 2002
    Location
    Omaha, Nebraska
    Posts
    21
    Originally posted by Andy2004
    Thanks kevin for replying
    The count which is puzzling me is for the second case of dbcc(non clustered with variables)
    **
    81369 rows, 12851 pages, index height 2
    **
    Since for a non clustered index the leaf page points to the data row so the optimiser should estimate the I/O count to be =81369 + index pages scanned.But it is taking the number of pages scanned to be 12851 which is < 81369 .So even if i don't take into consideration index pages to be scanned the I/O count should always be > 81369 .

    TIA
    I don't follow your reasoning above. Just because there are an estimated 81369 (25% of total rows in table) rows estimated to be returned doesn't mean that translates into 81369 IO's. Pages contain multiple rows. The number of pages in the index is what you need to focus on.

    Do me a favor and post the optdiag output for the 'tc_tdatosanterior_tarjeta' table and all of this can be explained so that you can understand it. It's important to know how many pages the index contains, the cluster ratios, etc.

  12. #12
    Join Date
    Nov 2002
    Posts
    8
    Kevin, thanks for your email.

    The optdiag for the tc_tdatosanterior_tarjeta table is:

    Server name: "SYBASE3"

    Specified database: "nts_tarjcred"
    Specified table owner: not specified
    Specified table: "tc_tdatosanterior_tarjeta"
    Specified column: not specified

    Table owner: "dbo"
    Table name: "tc_tdatosanterior_tarjeta"

    Statistics for table: "tc_tdatosanterior_tarjeta"

    Data page count: 5375
    Empty data page count: 0
    Data row count: 420981.0000000000000000
    Forwarded row count: 0.0000000000000000
    Deleted row count: 0.0000000000000000
    Data page CR count: 692.0000000000000000
    OAM + allocation page count: 400
    First extent data pages: 167
    Data row size: 95.6382488590190430

    Derived statistics:
    Data page cluster ratio: 0.9957473952796088
    Space utilization: 0.9195412933251577
    Large I/O efficiency: 0.9957654033453314

    Statistics for index: "ix_tdatosanterior_tarjeta_01" (nonclustered)
    Index column list: "tr_fecha", "tr_usuario", "tr_tarjeta", "tr_empresa"
    Leaf count: 2224
    Empty leaf page count: 0
    Data page CR count: 26041.0000000000000000
    Index page CR count: 280.0000000000000000
    Data row CR count: 62827.0000000000000000
    First extent leaf pages: 0
    Leaf row size: 39.6896363303741640
    Index height: 2

    Derived statistics:
    Data page cluster ratio: 0.5918429732121310
    Index page cluster ratio: 0.9989722507708120
    Data row cluster ratio: 0.8617633046683638
    Space utilization: 0.9222749784948325
    Large I/O efficiency: 0.9989733059548254

    Statistics for index: "ix_tdatosanterior_tarjeta_02" (nonclustered)
    Index column list: "tr_tarjeta", "tr_tipo"
    Leaf count: 1506
    Empty leaf page count: 0
    Data page CR count: 376209.0000000000000000
    Index page CR count: 1317.0000000000000000
    Data row CR count: 384068.0000000000000000
    First extent leaf pages: 0
    Leaf row size: 17.8283208804328980
    Index height: 2

    Derived statistics:
    Data page cluster ratio: 0.0204983880895993
    Index page cluster ratio: 0.1435079726651481
    Data row cluster ratio: 0.0888172933018291
    Space utilization: 0.6117916705846653
    Large I/O efficiency: 0.5386503067484663

    Statistics for index: "ix_tdatosanterior_tarjeta_03" (nonclustered)
    Index column list: "tr_fechaproc"
    Leaf count: 2138
    Empty leaf page count: 0
    Data page CR count: 24593.0000000000000000
    Index page CR count: 309.0000000000000000
    Data row CR count: 58793.0000000000000000
    First extent leaf pages: 0
    Leaf row size: 12.0059408037897892
    Index height: 2

    Derived statistics:
    Data page cluster ratio: 0.5884275907159202
    Index page cluster ratio: 0.9780748663101604
    Data row cluster ratio: 0.8714696130469723
    Space utilization: 0.2902061331541412
    Large I/O efficiency: 0.9785452642595499

    Statistics for column: "tr_fecha"
    Last update of column statistics: Mar 3 2004 6:17:14:536AM

    Range cell density: 0.0000026921538620
    Total density: 0.0000026921538620
    Range selectivity: default used (0.33)
    In between selectivity: default used (0.25)

    Histogram for column: "tr_fecha"
    Column datatype: datetime
    Requested step count: 20
    Actual step count: 20

    Step Weight Value

    1 0.00000000 <= "Jun 3 2002 12:27:52:560PM"
    2 0.05262955 <= "Sep 9 2002 5:40:23:610PM"
    3 0.05262955 <= "Nov 4 2002 1:08:07:876PM"
    4 0.05262955 <= "Jan 7 2003 3:45:42:433PM"
    5 0.05262955 <= "Feb 26 2003 5:34:22:870PM"
    6 0.05262955 <= "Apr 11 2003 11:42:39:000PM"
    7 0.05262955 <= "May 31 2003 8:01:18:273AM"
    8 0.05262955 <= "Jul 18 2003 11:04:43:666AM"
    9 0.05262955 <= "Aug 21 2003 2:49:37:736PM"
    10 0.05262955 <= "Sep 19 2003 2:21:37:820PM"
    11 0.05262955 <= "Oct 21 2003 1:24:01:693AM"
    12 0.05262955 <= "Nov 5 2003 4:52:04:186PM"
    13 0.05262955 <= "Nov 21 2003 12:04:25:540PM"
    14 0.05262955 <= "Dec 2 2003 11:38:11:303PM"
    15 0.05262955 <= "Dec 17 2003 11:44:30:676PM"
    16 0.05262955 <= "Jan 6 2004 5:26:56:823PM"
    17 0.05262955 <= "Jan 20 2004 11:27:29:433PM"
    18 0.05262955 <= "Feb 4 2004 11:45:21:540PM"
    19 0.05262955 <= "Feb 20 2004 9:54:36:120AM"
    20 0.05266806 <= "Mar 2 2004 11:49:02:290PM"

    Statistics for column group: "tr_fecha", "tr_usuario"
    Last update of column statistics: Mar 3 2004 6:17:14:536AM

    Range cell density: 0.0000026920495649
    Total density: 0.0000026920495649
    Range selectivity: default used (0.33)
    In between selectivity: default used (0.25)

    Statistics for column group: "tr_fecha", "tr_usuario", "tr_tarjeta"
    Last update of column statistics: Mar 3 2004 6:17:14:536AM

    Range cell density: 0.0000025335294230
    Total density: 0.0000025335294230
    Range selectivity: default used (0.33)
    In between selectivity: default used (0.25)

    Statistics for column group: "tr_fecha", "tr_usuario", "tr_tarjeta", "tr_empresa"
    Last update of column statistics: Mar 3 2004 6:17:14:536AM

    Range cell density: 0.0000025335294230
    Total density: 0.0000025335294230
    Range selectivity: default used (0.33)
    In between selectivity: default used (0.25)

    Statistics for column: "tr_fechaproc"
    Last update of column statistics: Mar 3 2004 6:17:29:536AM

    Range cell density: 0.0023657370638124
    Total density: 0.0023657370638124
    Range selectivity: default used (0.33)
    In between selectivity: default used (0.25)

    Histogram for column: "tr_fechaproc"
    Column datatype: datetime
    Requested step count: 20
    Actual step count: 20

    Step Weight Value

    1 0.00000000 <= "Jun 3 2002 12:27:52:560PM"
    2 0.05262955 <= "Sep 9 2002 5:40:23:610PM"
    3 0.05262955 <= "Nov 4 2002 1:07:53:500PM"
    4 0.05262955 <= "Jan 7 2003 3:41:08:740PM"
    5 0.05262955 <= "Feb 26 2003 11:19:33:060AM"
    6 0.05262955 <= "Apr 11 2003 6:37:51:243PM"
    7 0.05262955 <= "May 30 2003 7:16:38:676PM"
    8 0.05307246 <= "Jul 18 2003 12:00:00:000AM"
    9 0.05262955 <= "Aug 21 2003 2:33:59:736PM"
    10 0.05262955 <= "Sep 19 2003 12:08:50:370PM"
    11 0.05262955 <= "Oct 20 2003 5:35:25:750PM"
    12 0.05262955 <= "Nov 5 2003 11:50:51:116AM"
    13 0.05279805 <= "Nov 21 2003 12:00:00:000AM"
    14 0.05401365 <= "Dec 3 2003 12:00:00:000AM"
    15 0.05262955 <= "Dec 17 2003 5:05:16:390PM"
    16 0.05408105 <= "Jan 7 2004 12:00:00:000AM"
    17 0.05262955 <= "Jan 21 2004 12:21:54:920PM"
    18 0.05262955 <= "Feb 5 2004 5:34:05:630PM"
    19 0.05262955 <= "Feb 20 2004 4:55:31:276PM"
    20 0.04922105 <= "Mar 2 2004 9:24:59:840PM"

    Statistics for column: "tr_tarjeta"
    Last update of column statistics: Mar 3 2004 6:17:21:536AM

    Range cell density: 0.0000244818187266
    Total density: 0.0000244818187266
    Range selectivity: default used (0.33)
    In between selectivity: default used (0.25)

    Optdiag succeeded.

    Regards,

  13. #13
    Join Date
    Aug 2002
    Location
    Omaha, Nebraska
    Posts
    21
    Thanks, but I was hoping for the optdiag output which matched the dbcc 302 output posted in this thread. The optdiag output looks like the table has changed significantly since the posting of the 302 output. In order to match the numbers up, we need both the 302 and optdiag output to be done at the same time.

    If not, I could go on to guess what the 302 info might say about index io estimates. The point I was going to make with the optdiag output is that the index has a number of pages. There is an estimate to how many of those pages will be read for the SARG criteria, and that is what you utlimately see in the 302 output.

    Andy was assuming that each row read would equate to one IO which is not a valid assumption since each IO reads a page, and each page contains many rows.

  14. #14
    Join Date
    Jan 2004
    Posts
    19
    Quote Originally Posted by Kevin Sherlock
    I don't follow your reasoning above. Just because there are an estimated 81369 (25% of total rows in table) rows estimated to be returned doesn't mean that translates into 81369 IO's. Pages contain multiple rows. The number of pages in the index is what you need to focus on.

    Do me a favor and post the optdiag output for the 'tc_tdatosanterior_tarjeta' table and all of this can be explained so that you can understand it. It's important to know how many pages the index contains, the cluster ratios, etc.
    Kevin,
    The reason i was saying that it will result in >81369 IOs is because cost is being calculated for a non clustered index .Since in a non clustered index for every row fetched, server should read the full data page(if the query is not index covered).But it appears optimiser is taking into consideration cache and IO strategy as well.I am not aware how to take them into consideration while doing the calculations.Again thanks for taking interest in my doubts.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •