Results 1 to 6 of 6
  1. #1
    Join Date
    Sep 2001
    Location
    Chicago, Illinois, USA
    Posts
    601

    Unanswered: Performance and the size of tables

    A very general question about performance and number of records in tables . . .

    I have a system that I have written for a client that has, over the years, seen its primary table grow as shown below, broken-out by records per year:

    Code:
    Year    Count      Percent  Running %
    1998    20,475     0%       0%
    1999    620,617    3%       3%
    2000    790,049    4%       7%
    2001    1,011,374  5%       12%
    2002    1,397,457  7%       19%
    2003    1,583,289  8%       27%
    2004    1,332,138  7%       33%
    2005    1,517,167  8%       41%
    2006    2,012,506  10%      51%
    2007    2,496,265  12%      63%
    2008    3,103,463  15%      79%
    2009    2,824,669  14%      93%
    2010    1,459,737  7%       100%
    			
    Total   20,169,206
    The system has never been divided between history tables and live tables, so all 20 million records are live. On a fairly frequent basis there is a need to see the last year's worth of data. On a less frequent basis there is a need to see the last two, or three, years of data. It is rarely necessary to see beyond five years.

    My question is, is there a benefit to archiving, say, the records from 2006 and older, eliminating 51% of the data from the system?

    One could presume that the smaller the tables, the better, but would there be a "significant" improvement in performance between a system with 10 million records versus a system with 20 million records?

    Is there a way to quantify the presumed benefit?

    Thanks.
    Ken

    Maverick Software Design

    (847) 864-3600 x2

  2. #2
    Join Date
    Jan 2003
    Location
    Massachusetts
    Posts
    5,799
    Provided Answers: 11
    So long as most of the requests on this table go through the index to get their data, the increasing volume should not be a big deal. Run this, and see what you get for index_depth. index_depth is the number of pages an index needs to read through before it gets to the table:
    Code:
    select index_id, index_depth
    from sys.dm_db_index_physical_stats(db_id(), object_id('Your Table Here'), -1, 0, default)

  3. #3
    Join Date
    Jul 2003
    Location
    San Antonio, TX
    Posts
    3,662
    If you introduce data partitioning (both table and index), and effectively select partitioning key (be warn, it's not easy, because it may be hard/impossible to satisfy the requirements of every query), then the benefit can be multi-fold. But it is also possible (seen it plenty of times) to kill your server if partitioning strategy was not thought through all the way. For excercises like this you need to have an adequately configured test environment for both partitioned and "partitionless" versions of your database, where you'd be able to run tests and be able to draw conclusions.
    "The data in a record depends on the Key to the record, the Whole Key, and
    nothing but the Key, so help me Codd."

  4. #4
    Join Date
    Sep 2001
    Location
    Chicago, Illinois, USA
    Posts
    601
    MCrowley, ran the SQL and the result for all indexes on the main table is 4 (four). Is that excessive, or normal?

    rdjabarov, by "partitioning" do you mean possibly placing the data on one device and indexes on other devices?
    Ken

    Maverick Software Design

    (847) 864-3600 x2

  5. #5
    Join Date
    Jan 2003
    Location
    Massachusetts
    Posts
    5,799
    Provided Answers: 11
    It means to read a row in the table via an index, it will take 4 page reads (lower is better). 8, if it is a non-clustered index.

    Are you seeing an actual performance problem, or is this all theoretical? There are systems out there that hold on the order of billions of rows of data. The hardware involved in those systems may be a bit more powerful than what most people get to play with, of course.

    Table/Index Partitioning is available in Enterprise edition of SQL Server. It is useful in certain situations (like rdjabarov pointed out), where you have an attribute in most queries that will server as a partion key. You may want to start here.

  6. #6
    Join Date
    Sep 2001
    Location
    Chicago, Illinois, USA
    Posts
    601
    More a theoretical question.

    Customer is very happy with current system performance, but I would rather stay ahead of any problems than wait until the customer starts complaining.
    Ken

    Maverick Software Design

    (847) 864-3600 x2

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •