  1. #1
    Join Date
    Jun 2005
    Posts
    319

    Unanswered: what method will give the best performance?

    Let's say we have 100 web servers load balanced, and for every 5 web servers we have a DB Server running SQL 2008; for simplicity's sake there is only 1 table on each DB Server, and the table on each DB Server is identical in schema. We want to aggregate the data from the table on the 20 DB Servers onto a Master DB Server so that the data can be fed into a Data Warehouse. Performance here is key and volume is high; I'll have to get back to you on exact volume, but let's assume about 15k records a second to each of the 20 DB Servers. Is that outside the bounds of Transactional Replication? Does Database Mirroring offer better performance? I am not even sure DB Mirroring is an option, since I think the subscriber would be offline for periods. I also assume Merge Replication would offer worse performance than Transactional Replication, correct? Assume all 21 DB Servers are in the same room, so geographic considerations are not an issue. The key is getting the data into the DW as fast as possible. Currently a job runs every 30 seconds, and this might be the best solution to avoid issues on the subscriber.

    I think ideally I'd like to set up table partitioning on the Master DB Server and then populate the 20 different partitions individually, but I feel like this is doing table partitioning backwards, and I am not even sure table partitioning works well across different DBs. Even then, I'd still have to think about how the data gets from the table on the Master Server into the Data Warehouse.
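
    To make that concrete, here is roughly the kind of setup I'm imagining on the Master DB Server (all names are invented, only a few boundary values are shown, and this may well be the "backwards" usage I'm worried about):

        -- One partition per source DB Server, keyed by a SourceServerId column
        CREATE PARTITION FUNCTION pfBySourceServer (tinyint)
            AS RANGE LEFT FOR VALUES (1, 2, 3, 4);            -- extend through 19 for 20 partitions

        CREATE PARTITION SCHEME psBySourceServer
            AS PARTITION pfBySourceServer ALL TO ([PRIMARY]); -- or map each partition to its own filegroup

        CREATE TABLE dbo.MasterEvents
        (
            SourceServerId tinyint      NOT NULL,  -- which of the 20 DB Servers the row came from
            EventId        bigint       NOT NULL,
            EventDate      datetime     NOT NULL,
            Payload        varchar(400) NULL
        ) ON psBySourceServer (SourceServerId);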

    Sorry, one last question: what is the upper bound on the volume that Transactional Replication can handle?

  2. #2
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    300,000 rows per second is WELL within the performance limitations of either transactional or merge replication. Merge replication offers you much better fault tolerance, especially if you can tweak the schema to use GUIDs in order to support the distributed design.
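
    As a minimal example of the schema tweak involved (the table and constraint names here are only illustrative): merge replication needs a uniqueidentifier ROWGUIDCOL on every published table, and GUID keys also sidestep identity collisions across the 20 OLTP servers.

        -- Add the rowguid up front so the snapshot agent doesn't have to create one for you
        ALTER TABLE dbo.Events
            ADD rowguid uniqueidentifier ROWGUIDCOL NOT NULL
                CONSTRAINT DF_Events_rowguid DEFAULT NEWSEQUENTIALID();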

    I would make the OLTP databases do merge replication to a "central" copy of the database. I would use log shipping to clone that central database to a read-only staging database. Pause the application of log shipping as needed to allow aggregation into your DataWarehouse tables, then resume the log shipping once the data is aggregated.
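
    Schematically, the pause-aggregate-resume cycle on the staging server looks something like this (the job, database, and table names are placeholders; the restore job name only mimics the default log shipping naming):

        -- Pause the log shipping restore job so the staging database stays still
        EXEC msdb.dbo.sp_stop_job @job_name = N'LSRestore_Central_Staging';

        -- Aggregate from the quiesced staging copy into the warehouse
        INSERT INTO DW.dbo.FactActivity (ActivityDate, SourceServerId, RowCnt)
        SELECT ActivityDate, SourceServerId, COUNT(*)
        FROM   Staging.dbo.Activity
        GROUP  BY ActivityDate, SourceServerId;

        -- Let log shipping catch back up
        EXEC msdb.dbo.sp_start_job @job_name = N'LSRestore_Central_Staging';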

    Any given OLTP database can be sacrificed and only the data that hadn't been merged would be lost. The central database can process merges at will since it has no direct connection to the DW. The staging image can be quiesced at will to allow easy aggregation.

    This isn't theoretical, it is a synopsis of an actual Microsoft client implementation.

    -PatP
    In theory, theory and practice are identical. In practice, theory and practice are unrelated.

  3. #3
    Join Date
    Jun 2005
    Posts
    319
    That's good stuff, Pat. Are you able to release the name of the client (anyone recognizable)?

    This article is about 3 years old; would you say its approach is dated compared to what you described?

    Distributed Partitioned Views / Federated Databases: Lessons Learned - Technical Notes

    I believe I understand why you are decoupling the aggregation from the DW population: they are separate processes, and you wouldn't want one to bottleneck the other.

    Would we be able to take an approach similar to what is described in the article above and go directly from the 20 different servers into the DW via a partitioned view? I think the main drawback might be that if one of the servers goes down, it would kill the partitioned view, right? Which brings us back to why you decoupled the two processes in the first place.
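
    Something like the view below is what I'm picturing (only 3 of the 20 servers shown; the linked server, database, and table names are hypothetical):

        -- Each member table carries a CHECK constraint on ServerId so the
        -- optimizer can prune the view down to the right member server.
        CREATE VIEW dbo.AllEvents
        AS
        SELECT * FROM SRV01.TrackDB.dbo.Events   -- CHECK (ServerId = 1)
        UNION ALL
        SELECT * FROM SRV02.TrackDB.dbo.Events   -- CHECK (ServerId = 2)
        UNION ALL
        SELECT * FROM SRV03.TrackDB.dbo.Events;  -- ...and so on through SRV20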

    We also have the issue of knowing what to delete from the tracking servers in this scenario.
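
    The simplest thing I can think of there is a high-water mark per load, something along these lines (the watermark table and column names are made up):

        -- On each tracking server, once a load has been confirmed in the DW
        DECLARE @LastLoadedId bigint;

        SELECT @LastLoadedId = LastLoadedEventId
        FROM   dbo.LoadWatermark;          -- maintained by the load process

        DELETE FROM dbo.Events
        WHERE  EventId <= @LastLoadedId;
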
    Last edited by Gagnon; 09-29-10 at 18:11.

  4. #4
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    No, I can't give you the name of the client, but it wouldn't take rocket science to work it out (I'm pretty sure it was specifically cited in Microsoft SQL whitepapers circa 2007). I can't read enough of the URL you provided to give any decent comment now, but I should be able to eyeball it tonight.

    The design was to decouple the data at every stage possible... Things break, that's life, but the business must go on! If a web session crashes, that web session is lost. If an ASP application server crashes (heaven forbid), any active sessions are lost. If an OLTP database crashes, then unmerged data gets lost. The "Central" server is where all the eggs hit one basket, and that is where the hardware is unholy in its redundancy... Only outright destruction of the building could take that out. The staging database is simply a clone of the Central database, and it really only serves to isolate the OLTP side of the house from the Analytics side of the house.

    Since they can quiesce the staging database at the end of any transaction log, it can logically "flip" from OLTP to Analytics at whim, which makes the aggregation process quick and simple.
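
    In case it helps picture it: a log shipping secondary is kept readable by restoring each log WITH STANDBY, something along these lines (the paths and database names are placeholders):

        -- STANDBY leaves the staging database read-only and queryable between
        -- restores, which is what allows the "flip" to the Analytics side.
        RESTORE LOG Staging
            FROM DISK = N'\\backupshare\Central\Central_tlog.trn'
            WITH STANDBY = N'D:\SQLData\Staging_undo.dat';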

    The DataWarehouse has a very Inmon-esque design, but the presentation is almost exclusively handled by a mix of SSAS, SSRS, and, for the hardcore analytic types, Excel spreadsheets. It is an intriguing environment, and one in which I'm personally very interested.

    -PatP
    In theory, theory and practice are identical. In practice, theory and practice are unrelated.

  5. #5
    Join Date
    Jun 2005
    Posts
    319
    I understand. The link (from 2007) essentially talks about Federated Servers and partitioned views; here's the SQL 2008 documentation on it (sorry if you are already well versed in it):

    Understanding Federated Database Servers

    From reading the SQL 2008 documentation above, it actually sounds like one of the SQL Servers can crash without killing the partitioned view.

    I think if we were to go this route, we could schedule the DW uploads every 30 seconds. The first step in the process would be to have the web farm point to Table2 so that Table1 could be pulled into the DW and then truncated; after the next iteration is complete we swap Table1 with Table2, rinse and repeat. It sounds like a simpler solution, with the only drawback being that we are not decoupling the aggregation from the DW upload. Does it sound like it has much risk? The more I think about this approach the more I like it, since (at least to me) it seems like a simpler design that takes merge replication, log shipping, and the read-only DB out of the equation.
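
    One way I can picture doing the "point the web farm at Table2" step without touching the web code is a synonym swap; this is just my own sketch with invented names:

        -- The web servers always write to dbo.CurrentEvents (a synonym).
        -- Repoint it at Table2, and Table1 is free to be drained into the DW.
        BEGIN TRANSACTION;
            DROP SYNONYM dbo.CurrentEvents;
            CREATE SYNONYM dbo.CurrentEvents FOR dbo.Table2;
        COMMIT;

        -- Pull Table1 into the DW (directly or via the partitioned view), then:
        TRUNCATE TABLE dbo.Table1;
        -- On the next interval the swap runs in the other direction.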

    Ideally we would want to make this happen faster than every 30 seconds, maybe every 10 seconds or even every 5 seconds. I think it will get to a point, though, where we would need to extend the interval to take into account the time required to:
    1) execute the SELECT on the partitioned view across 20 (federated) servers
    2) cut over from Table1 to Table2
    3) complete the upload to the DW
    Last edited by Gagnon; 09-29-10 at 18:45.

  6. #6
    Join Date
    Feb 2004
    Location
    In front of the computer
    Posts
    15,579
    Provided Answers: 54
    As far as I know, if any server in a federated farm goes down, then the whole farm becomes inoperative until that server is brought back online. With clustering or mirroring underneath the federation, this can be made tolerable but the cost can get out of hand pretty quickly. If anyone else knows differently, please speak up because I'd love to have someone show me that I'm wrong on this point.

    I'm very much "old school" and I deeply believe that complicated things that work are just collections of simple things that work. The design that I described was "bleeding edge" in 2006 when it was conceived, and was still "freaky new" in 2007 when it went into production. Telecom has failed, a couple of web servers have failed, but never has a brick failed and never has the system as a whole so much as hiccuped. That, to me, is reliable.

    This sounds like a big job. If it were mine, I'd engage Microsoft's PSS (Professional Support Services) group to ensure that your design is supportable and "bleeding edge" current. It costs quite a bit to engage MS-PSS, but the cost is a fraction of what even a small system design error could cost.

    -PatP
    In theory, theory and practice are identical. In practice, theory and practice are unrelated.
