  1. #1
    Join Date
    Oct 2010
    Posts
    94

    Unanswered: DB2 - not as good??

    Hello Gurus,

    Our BI shop is running into significant performance issues - ETL loads running for several days, reporting queries running slow, yadda yadda... Overview of our environment in one line - an AIX ETL application server running Informatica, a 48-partition (6-node) DB2 cluster on AIX, and 2 AIX servers hosting the BO and SAS reporting servers respectively. I would estimate the warehouse at 3TB.

    On review, the problem seems to be, in large part, due to sloppy ETL code and some design deficiencies in the database schema (missing indexes, sub-optimal partitioning keys, etc.). These problems, as one would expect, require sizeable time and effort to correct. However, management seems to be in a rush and apparently feels DB2 is the bottleneck and is looking at Teradata as an alternative.

    While there may be more to these 'decisions' than meets the eye, I am only interested in knowing your feedback on 3 technical items -


    1) One thing that is known about our cluster is that all individual partitions have their data and log files residing on the same disks (RAID). I am considering pursuing this with the storage team to see if segregating these speeds things up - especially since I know most of our ETL performs logged inserts and updates in our environment. What's your experience been with performance gains from such a segregation?


    2) Also, some reporting applications like SAS access the administrator node for everything - I understand SAS has options like DBSLICEPARM that make it issue DB2 DPF-aware queries by accessing the logical nodes directly - that should reduce network hops and speed up reports, right? Again, there I am running into a mindset where accessing all partitions at the same time is somehow perceived as a 'bad practice'.


    3) How do you think Teradata compares with DB2 in terms of cost and performance, assuming the same supporting hardware?


    Apologies for the redundancy with (1) above - I know I have asked the question here before - but I needed confirmation, and if you know of IBM articles out there that demonstrate this, that would be truly appreciated!!!!!

    AS ALWAYS, THANKS FOR YOUR GUIDANCE!!!!

    - G

  2. #2
    Join Date
    Oct 2010
    Posts
    94
    Also, I wanted to make sure my understanding of why the log segregation from data would help is correct. As I see it, sharing the same disks for both data and logs means the disk spindle is most likely going to be 'out of position' for the next log write, since data accesses (read/write) are random while log writes are sequential - so separating them should essentially reduce delays by eliminating/minimizing actuator seeks, correct?

    Also, this would apply regardless of the physical device used, right? Including RAID arrays?
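
    For reference, I believe the current layout can be confirmed (and later changed) with something like the following from a CLP session - WHSE is just a placeholder database name, and on DPF each command would be run per partition, e.g. via db2_all:

        db2 connect to WHSE

        # where the active transaction logs live on this partition
        db2 get db cfg for WHSE | grep -i "Path to log files"

        # table space containers, to compare against the log path
        db2 list tablespaces show detail
        db2 list tablespace containers for 4    # repeat per table space ID of interest

        # relocating the logs to dedicated disks (takes effect per partition)
        db2 update db cfg for WHSE using NEWLOGPATH /db2logs/WHSE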

  3. #3
    Join Date
    Jun 2003
    Location
    Toronto, Canada
    Posts
    5,516
    Provided Answers: 1
    Quote Originally Posted by getback0 View Post
    management seems to be in a rush and apparently feels DB2 is the bottleneck and is looking at Teradata as an alternative.
    So, your management thinks that migrating the entire data warehouse to a new platform is faster than tuning the existing environment? That's sad. Do you mind telling us the company name - we may need to dump their stock...

    Quote Originally Posted by getback0 View Post
    What's your experience been with performance gains from such a segregation?
    Depending on circumstances, they could be anywhere from negligible to immense. You'll need to monitor DB2 logging performance and AIX I/O statistics.
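
    For example, something along these lines would give a first indication (a rough sketch only - monitor element names vary between DB2 versions, and WHSE is a placeholder database name):

        # DB2 side: log write counters and times while the ETL runs
        db2 "get snapshot for database on WHSE" | grep -i log

        # AIX side: per-disk service times and utilization over the same window
        iostat -D 5 12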


    Quote Originally Posted by getback0 View Post
    that should reduce network hops and speed up reports, right? Again, there I am running into a mindset where accessing all partitions at the same time is somehow perceived as a 'bad practice'.
    I'd say it is quite the opposite. Also, DB2 is practically guaranteed to parallelise the database workload better than a third-party application.

    Quote Originally Posted by getback0 View Post
    3) How do you think Teradata compares with DB2 in terms of cost and performance, assuming the same supporting hardware?
    Teradata, being an appliance, may perform some tasks better than DB2 DPF. And, it being an appliance, you cannot assume "the same supporting hardware", because it comes with its own hardware.

  4. #4
    Join Date
    Oct 2010
    Posts
    94
    Do you mind telling us the company name - we may need to dump their stock...
    Lol! They are not publicly traded - so you are covered there!



    I'd say it is quite the opposite. Also, DB2 is practically guaranteed to parallelise the database workload better than a third-party application.
    Not sure I follow - isn't the default sequence of events for an application trying to run a query -

    1) submit query to admin node
    2) admin node broadcasts as needed
    3) worker nodes process and return results to admin node
    4) admin node retransmits to application

    and wouldn't this be slower than when -

    1) application consults the admin node to query the catalog and determine the hash key for the table in question
    2) application then establishes a direct one-to-one connection with the data nodes
    3) application issues UOW requests directly to the worker nodes and gets responses back directly from them

    I ask because I see Informatica has a database partitioning feature which is documented to accomplish the second sequence of events above. And it clearly outperforms all other forms of application partitioning.
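
    As an illustration, I believe the catalog lookup in step (1) boils down to something like this (just a sketch - the database, schema, table and column names are made-up placeholders):

        db2 connect to WHSE

        # columns that make up the table's distribution (hash) key
        db2 "select colname, partkeyseq from syscat.columns
             where tabschema = 'EDW' and tabname = 'SALES_FACT' and partkeyseq > 0
             order by partkeyseq"

        # how the table's rows are spread across the database partitions
        db2 "select dbpartitionnum(order_id) as part_num, count(*) as row_cnt
             from edw.sales_fact group by dbpartitionnum(order_id)"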

    Thoughts?

    Teradata, being an appliance, may perform some tasks better than DB2 DPF. And, it being an appliance, you cannot assume "the same supporting hardware", because it comes with its own hardware.
    Thanks - that's quite an eye opener!

  5. #5
    Join Date
    Jun 2003
    Location
    Toronto, Canada
    Posts
    5,516
    Provided Answers: 1
    Quote Originally Posted by getback0 View Post

    and wouldn't this be slower than when -

    1) application consults the admin node to query the catalog and determine the hash key for the table in question
    2) application then establishes a direct one-to-one connection with the data nodes
    3) application issues UOW requests directly to the worker nodes and gets responses back directly from them
    It's more like this:
    1) application connects to the catalog (I guess that's what you call admin) node and queries the catalog tables (involves standard SQL query processing)
    2) the catalog node sends query results back
    3) the application performs its "query optimization"
    4) the application initiates a few parallel processes
    5) each parallel process connects to its designated partition and issues a query against its data
    6) each data partition queries the catalog partition for catalog information, compiles and optimizes the query
    7) each query executes and results are sent back to the application
    8) the application combines results from all parallel processes

    I don't see how that would be faster than DB2 doing what it is specifically designed and optimized to do.

  6. #6
    Join Date
    May 2003
    Location
    USA
    Posts
    5,737
    At various times, I have worked for both Teradata and IBM. In my opinion, the architectures of Teradata and DB2 DPF are very similar, and there should not be a big difference between them (although one may excel at some tasks, and the other at other tasks). Unfortunately for your comparison, Teradata dropped out of the TPC-H benchmark tests (TPC - Homepage) long ago (you can draw your own conclusion as to why they withdrew).

    At one time Teradata did run on proprietary hardware (originally Teradata hardware, and then later NCR), but I am not sure that is always true any more. I can tell you that Teradata on their own hardware is fairly expensive, especially when compared to DB2 DPF on Intel Linux hardware. Of course, AIX on P-Series is pretty expensive also.

    I suspect that the problems you are having are related to the configuration and tuning of the DB2 DPF system. As you stated, putting transaction logs and data on the same disk is not good for performance, and that is not how the DB2 BCU configurations are shipped. Given that rather elementary mistake, I am sure there are other mis-configurations in your DPF system.
    M. A. Feldman
    IBM Certified DBA on DB2 for Linux, UNIX, and Windows
    IBM Certified DBA on DB2 for z/OS and OS/390

  7. #7
    Join Date
    Oct 2010
    Posts
    94
    Given that rather elementary mistake, I am sure there are other mis-configurations in your DPF system.
    Oh, I like the sound of that - that has been my suspicion for quite a while - are there others that come to mind, Marcus?

  8. #8
    Join Date
    Oct 2010
    Posts
    94
    And actually that is where I am hitting a wall - the DBA staff has locked down access to the point where I can't directly access the individual partitions to run utilities like iostat or nmon to get utilization stats. And they keep shooting down the log separation idea as one having marginal potential for performance gain with significant effort involved. According to them, there is no I/O bottleneck and hence no need for log separation.


    So, lacking access, I am basically trying to build a case through arguments. And here is one that I think substantiates that there is IN FACT an I/O problem (perhaps among others) -


    The DBAs mention we use WLM (Workload Management) to control resource consumption of DB2 UOWs. We have been told on occasions that a higher-priority job (let's say J1) is contending with lower-priority jobs (let's say J2, J3 and J4) for resources.

    This does not make sense to me if WLM is working correctly. That said, we did notice that those were the only 4 jobs running on the server, and that J1's performance improved once J2, J3 and J4 completed. My feeling is that WLM was working correctly, and that J1 did enjoy all the CPU and memory benefits on the servers due to its higher priority.

    The only thing that could have hampered J1's performance would be something out on the storage, where WLM likely has no control - e.g., if the log and data files are on the same disks, the commits from jobs J2/J3/J4 could contend with the commits from J1.
    Am I overthinking this, or is this reasonable evidence that keeping logs and data on the same disks is hurting us?
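
    If I can get even read access to the catalog, I think a query like this would show what WLM actually controls - CPU/agent and prefetch priority - and that nothing there governs where log and data I/O lands on disk (a sketch, assuming DB2 9.5+ WLM; the exact columns vary by version):

        db2 "select serviceclassname, agentpriority, prefetchpriority
             from syscat.serviceclasses"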

    Thanks again!

  9. #9
    Join Date
    May 2003
    Location
    USA
    Posts
    5,737
    Quote Originally Posted by getback0 View Post


    Oh, I like the sound of that - that has been my suspicion for quite a while - are there others that come to mind, Marcus?
    DPF configuration is a bit complicated and unusual (oftentimes counter-intuitive to those who don't understand it), so there are many areas where they may have configured it poorly. I am not going to speculate about all the areas where there may be problems.

    The issue of putting logs on the same disks as data is not just a performance issue, but also a data integrity and recovery issue if an array failure occurs. In fact, if only non-logging LOADs are done to populate the warehouse (no inserts, updates, or imports to speak of), then it may not even be a performance issue.
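
    For clarity, by non-logging LOADs I mean something along these lines (a sketch - the file and table names are placeholders):

        db2 "load from /staging/sales.del of del
             insert into edw.sales_fact nonrecoverable"

    Rows loaded this way never pass through the transaction logs, which is why data/log placement would matter far less in that scenario.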
    M. A. Feldman
    IBM Certified DBA on DB2 for Linux, UNIX, and Windows
    IBM Certified DBA on DB2 for z/OS and OS/390

  10. #10
    Join Date
    Oct 2010
    Posts
    94
    Marcus,

    Like I mentioned, most ETL in our shop performs logged transactions (logged inserts, updates & deletes).

  11. #11
    Join Date
    Nov 2011
    Posts
    334
    We have used Teradata as our data warehouse for over 5 years.
    It is also a shared-nothing architecture, like DB2 DPF.
    There is plenty of optimization work that needs to be done on it, too.
    I think just replacing your system with another database will not improve your performance at all.

  12. #12
    Join Date
    Jan 2007
    Location
    Jena, Germany
    Posts
    2,721
    Quote Originally Posted by getback0 View Post
    Marcus,

    Like I mentioned, most ETL in our shop performs logged transactions (logged inserts, updates & deletes).
    You're sure that the inserts/updates/deletes are really logged? Maybe logging has been turned off for the ETL?

    3TB of data is not unusually big. (I would just call it a more or less normally sized data warehouse.) A properly configured and tuned DB2 can cope with that amount quite easily.

    As Marcus said, using separate file systems and underlying hardware for logs and data is a must-have if the logs are truly important for recovery. Performance is an important, but secondary, concern in this respect. From what I read, the main issue seems to be organizational in nature: your management, operations people, database administrators and application developers don't seem to be in sync on the issues and what to do about them.

    As for the source of the poor performance: maybe the sysops are right and there is no I/O bottleneck because they do the proper monitoring, but if you don't know the details, you cannot confirm or refute this. Maybe you are right that indexes are missing and the data model has flaws. Then a look at the access plans of the queries that take so long would give some indication of what's happening. Furthermore, DB2 monitoring should be done (if it is not already in place). Maybe you have badly configured buffer pools. Maybe the 8 partitions you have on each node exceed the available hardware resources. Maybe the ETL jobs take so long because access to the data sources is limited or the transformations are extremely complex. Maybe something else, or all of the above. We can't really tell...
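
    For the access plans, the DBAs could capture them for the worst statements with something like this (a sketch - it assumes the explain tables have already been created, and the database name and the query are placeholders):

        db2 connect to WHSE
        db2 "explain plan for select count(*) from edw.sales_fact"    # substitute one of the long-running queries
        db2exfmt -d WHSE -1 -o plan.txt    # formats the most recent explain output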

    First, you should get the various groups to talk to each other and decide whether the problem shall be solved within the existing environment (starting from the premise that this can surely be done). Then work together to identify the problems and resolve them. You may also want to consider external support in case the necessary skills are missing in any of the involved teams. But you'll need management commitment for all of this to follow through.
    Last edited by stolze; 02-17-12 at 12:16.
    Knut Stolze
    IBM DB2 Analytics Accelerator
    IBM Germany Research & Development

  13. #13
    Join Date
    Oct 2010
    Posts
    94
    First up, thanks to you all for taking the time to review! I agree that this problem is not purely technical - a lot of it has to do with teamwork and managing expectations across the board. But, as always, it's not all under my control! I will keep these pointers in mind.

    One question though -
    You're sure that the inserts/updates/deletes are really logged? Maybe logging has been turned off for the ETL?
    How can logging be turned off? None of the updates and deletes can bypass the database's transaction log, right? Also, most of our inserts don't use db2 LOAD - in which case, I believe they get logged too.

    Am I missing something?

  14. #14
    Join Date
    Jan 2007
    Location
    Jena, Germany
    Posts
    2,721
    Just two straightforward examples: on DB2 for z/OS, you can switch a table space to NOT LOGGED. In DB2 LUW, you can set the table attribute NOT LOGGED INITIALLY. If you do a LOAD, the records will not be logged. An IMPORT in DB2 LUW usually causes logs to be written.
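
    A minimal sketch of the LUW case (the database, table and column names are placeholders; the ALTER has to be the first statement of the unit of work it applies to):

        db2 connect to WHSE
        db2 "create table etl.stage_sales (id integer, amount decimal(10,2)) not logged initially"

        # later, suspend logging for one unit of work while repopulating the table
        db2 +c "alter table etl.stage_sales activate not logged initially"
        db2 +c "insert into etl.stage_sales select id, amount from etl.src_sales"
        db2 "commit"

    The +c option switches off the CLP's autocommit so that the ALTER and the INSERT stay in the same unit of work.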
    Knut Stolze
    IBM DB2 Analytics Accelerator
    IBM Germany Research & Development

  15. #15
    Join Date
    Oct 2010
    Posts
    94
    stolze,

    In the case of Informatica, I think it runs in one of the following 2 modes -

    1. Logged Insert (called 'Normal' insert)
    It compiles the 'INSERT INTO' SQL once, perhaps as a stored proc., and from then on just passes values to parameter markers for each column value (see the sketch below).

    2. Bulk Load
    This mode bypasses the log as it uses the db2 LOAD APIs - so clearly there will be no logging with this option.
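
    Roughly, the SQL text Informatica prepares in mode (1) would look like this (the table and column names are made up) - the statement is compiled once and then executed repeatedly with new parameter values, and every such row goes through the transaction log:

        INSERT INTO edw.sales_fact (order_id, store_id, amount) VALUES (?, ?, ?)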


    Although, based on our review of the ETL code overall, category (1) constitutes the majority of the inserts issued.

    Also, for the cases you mentioned (in both z/OS and LUW), would updates and deletes bypass the transaction log, even if the table space or the table is set to not logged?

    Thanks in advance!
