Unanswered: Making Data Available for Reporting Purposes
I am currently using an environment where we have three databases:
SERVER1 - DB2INSTANCE1 - ETLDB - This is the database which the ETL populates on a nightly basis using DataStage.
SERVER1 - DB2INSTANCE1 - REPDB - This is the database that Cognos points at for the majority of our end users. Once the ETL is finished populating ETLDB, a LOAD operation is performed from ETLDB into this REPDB.
SERVER2 - DB2INSTANCE1 - ANALYSIS - This is the database used for data-intensive analysis queries. This is populated from REPDB via a backup\restore once the above LOAD operation is completed.
This feels like a very inefficient way of replicating the data from the ETL database into reporting environments as it takes many hours and results in tables and entire databases becoming unavailable while LOAD\RESTORE operations are completing. I was wondering what setups others are using? Can anyone point me in the right direction for the technologies we should be using in this situation?
Our main requirement is that we have control over when the switch-over\update happens, as we need the full ETL to complete before the reporting databases are completed to ensure we don't have orphaned records in the reporting database.
Sites at which I've worked use the range-partitioned (and sometimes additionally hash-partitioned) facts and dimensions tables. So the most recent ETL only inserts(or loads) into the relevant range-paritions, and only when all such ranges are populated are the relevant partitions made available for reporting.