  1. #1
    Join Date
    Jun 2008
    Posts
    2

    Unanswered: Large OLTP table, Index common field, Slow INSERT performance

    Hi All,

    I'm trying to design an efficient OLTP database which can handle a high INSERT rate (at least 100 per second), but still be searched relatively efficiently.

    My primary OLTP table looks like:

    CREATE TABLE Transactions ( Time BIGINT NOT NULL, CustId VARCHAR(20), TermId VARCHAR(12), Amt FLOAT, ... )

    There are roughly 800,000 unique customer ids, 100,000 unique terminal ids.

    So for every customer transaction an entry goes into this table (BIGINT is a high resolution timestamp). The clustered index is defined on Time, which is normally increasing. As you'd expect, the INSERT performance on the table as it stands is very good.
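
    For concreteness, a minimal sketch of that clustered index, assuming the hypothetical name ci_trans_time (the actual DDL isn't shown here):

    -- Clustered index on the (almost always) increasing timestamp;
    -- new rows land on the last page, so inserts rarely split pages
    CREATE CLUSTERED INDEX ci_trans_time ON Transactions (Time);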

    However, there are two common search cases:
    - Most recent transactions by customer - SELECT TOP 100 * from Transactions where CustId = '010101037' order by Time desc
    - Most recent transactions at terminal - SELECT TOP 100 * from Transactions where TermId = '475794' order by Time desc

    In order to satisfy these requests, I basically need an index on both TermId and CustId. As soon as I do this and the table grows past 20 million rows, INSERT performance consistently decreases.
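
    For illustration, the two extra indexes might look something like this (index names and exact key columns are a sketch, not the real DDL):

    -- Nonclustered indexes to support the per-customer and per-terminal lookups;
    -- every INSERT now has to maintain three b-trees instead of one
    CREATE INDEX i_custid ON Transactions (CustId, Time);
    CREATE INDEX i_termid ON Transactions (TermId, Time);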

    I assume this is because of the increasing fragmentation on the TermId and CustId indexes?

    Still, I'm convinced this must be an incredibly common type of application (e.g. customer service at a bank looking up a customer's recent transactions). Am I missing some approach that will improve performance?

    Thanks,
    Shaun

  2. #2
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    I hate to say this, because it borders on cheating.

    Think about what the teacher has covered in class recently. I'm sure that they've recently explained indexing and index performance, and if they are teaching your class from a book I'd be willing to bet that there is a sidebar or even a section discussing index operations and performance.

    There's more than one way to solve this problem, and the solution that will get you the most credit will depend on what the teacher is trying to get you to learn. You're going to have to guide us a bit before we can help you get inside the teacher's head to find the solution that will get you the most credit.

    -PatP

  3. #3
    Join Date
    Jun 2008
    Posts
    2
    Hi Pat,

    Unfortunately this isn't a university assignment, it's a real world problem I'm trying to solve.

    As I understand it, my performance problems arise from the fact that my index columns are ordered such that the left-most column repeats. This means SQL Server is constantly page splitting to fit the new index entries (with their high resolution timestamps) in.

    So, if the index is 'create index i_custid on Transactions ( CustId, Time )', I end up with a b-tree that has leaf nodes that constantly need to be split.

    This is all well and good, but it's not at all obvious to me how to avoid it. If I change the index to ( Time, CustId ), INSERT performance is drastically better, but now the index cannot be used for the most common query (i.e. SELECT TOP 100 * from Transactions where CustId = '9444' order by Time desc) without doing an index scan.
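
    For what it's worth, the ( CustId, Time ) shape can satisfy the TOP 100 query with a seek followed by a backward range scan; declaring the Time key descending makes the match to the ORDER BY explicit. A sketch, not the production DDL:

    -- Seek to the CustId, then read the newest 100 entries in index order;
    -- no sort is needed because the key order matches ORDER BY Time DESC
    CREATE INDEX i_custid ON Transactions (CustId, Time DESC);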

    I'm sorry if I'm missing something totally obvious, but most discussions of this seem to indicate that you simply have to expect the performance problems, or apply the index only after loading all of the data. Adding the index after the fact is not really appropriate for an online system.

    Cheers,
    Shaun

  4. #4
    Join Date
    Apr 2008
    Location
    Iasi, Romania
    Posts
    561
    Provided Answers: 2
    Why don't you try:

    create index i_custid on Transactions(CustId) ?

    I don't think that sorting the rows with one customer's transactions is so costly.
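
    One way to test whether the residual sort matters is to run the lookup with I/O and timing statistics switched on and compare the two index shapes, e.g.:

    -- Report page reads and elapsed time for the per-customer lookup
    SET STATISTICS IO ON;
    SET STATISTICS TIME ON;
    SELECT TOP 100 * FROM Transactions WHERE CustId = '010101037' ORDER BY Time DESC;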

  5. #5
    Join Date
    Feb 2004
    Location
    One Flump in One Place
    Posts
    14,912
    Quote Originally Posted by shaun4477
    I assume this is because of the increasing fragmentation on the TermId and CustId indexes?
    Actually, (leaf level) fragmentation will improve the insert of "random" key values - it is the process of fragmenting (the page splits themselves) that slows things down.

    Please can you show the ENTIRE DDL for the table (including all index definitions)? And when you say that the time is usually increasing - in what proportion of rows?
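
    A common way to blunt those page splits - a general technique, not something prescribed in this thread - is to rebuild the affected indexes with free space reserved on each leaf page:

    -- Leave 20% free space per leaf page so 'random' CustId values
    -- can be inserted without splitting; the trade-off is a larger index
    ALTER INDEX i_custid ON Transactions REBUILD WITH (FILLFACTOR = 80);

    The free space gets consumed over time, so this only helps if the rebuild is repeated periodically.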

  6. #6
    Join Date
    Feb 2004
    Location
    One Flump in One Place
    Posts
    14,912
    Also - why SELECT *? You know this is not good practice, right?
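
    Listing only the columns the screen actually needs also opens the door to a covering index, so the lookup never has to touch the base table at all. A sketch, assuming the column list from the CREATE TABLE above:

    -- INCLUDE stores the extra columns in the leaf level only,
    -- keeping the b-tree keys narrow while still covering the query
    CREATE INDEX i_custid_cover ON Transactions (CustId, Time DESC) INCLUDE (TermId, Amt);

    SELECT TOP 100 Time, TermId, Amt
    FROM Transactions
    WHERE CustId = '010101037'
    ORDER BY Time DESC;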

  7. #7
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    I'd recommend that you start here. You've got step #1 pretty well handled, so you can just pick up from there.

    -PatP

  8. #8
    Join Date
    Jun 2004
    Location
    Long Island
    Posts
    696
    Periodic running of sp_updatestats has worked wonders for me, especially following index rebuilds and reorgs.
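
    For example, as a scheduled job (sp_updatestats is the built-in procedure; the targeted FULLSCAN below is just one option):

    -- Refresh statistics on every table whose data has changed
    EXEC sp_updatestats;

    -- Or target the hot table with a full scan for a more accurate histogram
    UPDATE STATISTICS dbo.Transactions WITH FULLSCAN;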
