Results 1 to 10 of 10
  1. #1
    Join Date
    Aug 2003
    Posts
    13

    Unanswered: To partition or not to partition

    I need a bit of theory help, as I'm unfamiliar with partitioning performance in SQL Server and I'm no super-DBA.

    When I worked with DB2 at a big corporation, the rule was always partition the big tables regardless. For the last 2 years however, I've been working with SQL Server 2005 and now are we starting to run into issues with system load and queries timing out on one-partition, appropriate clustered tables in a heavily normalized database using, as far as we know, appropriately optimized queries.

    Now, we have several options. Our disk queues are high in Performance Monitor, which means are reads/writes are stacking up, so we could throw some amount of hardware at it like more spindles, faster drives, etc, although the client is hesitant to throw a ton of money at the problem. Compounding that problem is that this is the peak part of the year for their business so they're hesitant to have any real downtime. Frankly I don't see any way to get around downtime with any option we go with.

    Aside from hardware, I've been considering partitioning some of the bigger tables, which are 10 million + records each. I've done a lot of reading on MSS partitioning, but all of the examples I've seen always mention partitioning by date in respect to performance, but never anything else.

    So my question is this, would a table that is partitioned by a unique id gain any performance benefit from partitioning?

    The relevant part of the Company_Employee table, for example, looks something like this, where there are many companies with many employees:

    Company_EmployeeId uniqueidentifier
    CompanyId uniqueidentifier
    EmployeeId uniqueidentifier

    The primary key is Company_EmployeeId, non-clustered.
    The clustered index is on CompanyId and EmployeeId.

    The query we're often seeing timeout is on pulling a list of employees by CompanyId, although in the queries we're seeing it do a clustered index seek.

    If we were to partition on CompanyId, would we see any real value and if so, why (I need to explain to the higher-ups why to push for it)? Or, should we just concentrate on throwing hardware at it?

    I appreciate any help in advance.

  2. #2
    Join Date
    Jun 2004
    Location
    Long Island
    Posts
    696
    Do you have more than 1024 Company IDs ? or is there a chance of having ? I believe your only allowed this many partitions with SQL2005.

    If you're not going to use multiple filegroups across separate physical disks, you won't really get any performance gains, but logically it will look nice. I've used parititioning for loading data (with aligned indexes), where I can switch out the respective partition, load/update data, rebuild indexes, and switch it back in, instead of having to worry about rebuilding/reorging indexes on a 1 billion, unpartitioned, row table.

  3. #3
    Join Date
    Aug 2003
    Posts
    13
    We don't have that many companies yet, but there is always potential.

    However, I should have mentioned this: I was actually thinking of partitioning on ranges of CompanyId using the first 2 bytes of the uniqueidentifier.

    I'm wondering if I do that, and use different filegroups on different disks, if that would help.

  4. #4
    Join Date
    Nov 2002
    Location
    Jersey
    Posts
    10,322
    10 million rows is not a lot of data btw

    What version of SQL Server are you using?
    Brett
    8-)

    It's a Great Day for America everybody!

    dbforums Yak CorralRadio 'Rita
    dbForums Member List
    I'm Good Once as I ever was

    The physical order of data in a database has no meaning.

  5. #5
    Join Date
    Aug 2003
    Posts
    13
    Quote Originally Posted by Brett Kaiser
    10 million rows is not a lot of data btw

    What version of SQL Server are you using?
    We're on 2005. While not a statistically insignificant number, I know it's not a lot of data compared to some of the people with 1 billion row tables. With high disk queues, makes me think it's more of a hardware issue but I'd like to explore all avenues.

    Another question. Our DB is on a SAN, so if we spread filegroups over the SAN and partition over the filegroups, will we really receive parallel I/O since they all act as one?

  6. #6
    Join Date
    Nov 2002
    Location
    Jersey
    Posts
    10,322
    did you look in BOL?

    Partitioned Tables
    Partitioned tables are tables whose data is horizontally divided into units which may be spread across more than one filegroup in a database. Partitioning makes large tables or indexes more manageable by letting you access or manage subsets of data quickly and efficiently, while maintaining the integrity of the overall collection. Under a partitioning scenario, operations such as loading data from an OLTP to an OLAP system, for example, can take seconds instead of minutes or hours in previous releases. Maintenance operations that are performed on subsets of data are also performed more efficiently because they target only the data that is required, instead of the whole table.

    Partitioning a table makes sense if the table is very large or is expected to become very large, and either of the following is true:

    The table contains, or is expected to contain, lots of data that is used in different ways.


    Queries or updates against the table are not performing as you expect, or maintenance costs exceed predefined maintenance periods.


    Partitioned tables support all the properties and features associated with designing and querying standard tables, including constraints, defaults, identity and timestamp values, triggers, and indexes. Therefore, if you want to implement a partitioned view that is local to one server, you should implement a partitioned table instead. For information to help you understand, design, and implement partitioned tables, see Partitioned Tables and Indexes.
    Brett
    8-)

    It's a Great Day for America everybody!

    dbforums Yak CorralRadio 'Rita
    dbForums Member List
    I'm Good Once as I ever was

    The physical order of data in a database has no meaning.

  7. #7
    Join Date
    Jun 2004
    Location
    Long Island
    Posts
    696
    Quote Originally Posted by mferrell
    We're on 2005. While not a statistically insignificant number, I know it's not a lot of data compared to some of the people with 1 billion row tables. With high disk queues, makes me think it's more of a hardware issue but I'd like to explore all avenues.

    Another question. Our DB is on a SAN, so if we spread filegroups over the SAN and partition over the filegroups, will we really receive parallel I/O since they all act as one?
    I think it depends on the number of LUNS in your SAN. Also, make sure you reorg and rebuild your indexes regularly and manually update your statistics. I've noticed SQL 2005 will scan tables a heck more often than 2000 did, even on an indexed column if those stats are stale. Also, if you notice what looks to be self blocking (I/O waits), you may want to adjust your MAXDOP setting for the respective query(ies).

  8. #8
    Join Date
    Aug 2003
    Posts
    13
    Quote Originally Posted by Brett Kaiser
    did you look in BOL?
    I have read that bit yes, but was there something in particular you were pointing out? If so, I'm missing it.

  9. #9
    Join Date
    Aug 2003
    Posts
    13
    Quote Originally Posted by PMASchmed
    I think it depends on the number of LUNS in your SAN. Also, make sure you reorg and rebuild your indexes regularly and manually update your statistics. I've noticed SQL 2005 will scan tables a heck more often than 2000 did, even on an indexed column if those stats are stale. Also, if you notice what looks to be self blocking (I/O waits), you may want to adjust your MAXDOP setting for the respective query(ies).
    Thanks PMA! I'll look into that. Much appreciated for all the help, guys. When I find a solution I'll try to remember to post here.

  10. #10
    Join Date
    Sep 2005
    Posts
    161
    I smell a ram issue.

    As mentioned, ten million rows is not a lot of data. Also reindex was mentioned, which would help if you had index fragmentation.

    Since your disk queues are high, I'm inclined to ask about your ram. How much do you have, and how much is SQL Server using? Do you have 32 bit or 64 bit version of SQL Server? If you have the 32 bit standard version, then you are using at most 2G of ram, which is not enough for tables that have millions of rows, which would lead to a high disk read queue on your select queries. If your write queue is high, check that you have clustered indexes on the tables in question, preferably on a column that increments sequentially with new records.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •