Results 1 to 3 of 3
  1. #1
    Join Date
    Nov 2002
    Posts
    10

    Unanswered: db2 ETL (extract, transform, load)

    I have a couple of tables on udb that need to be populated periodically. The data source cames from mainframe db2. Before I use load utitlity in udb, the data is extracted and transformed on mainframe, and then ftp from mainframe to unix box. This is a very typical ETL process, currently, we don't want to buy any third party ETL tools. What we want to do is writing shell scripts, mainframe JCL remote invoke those shell scripts to automate load process after ftp data to unix box. My questions are,

    1. how to deal with situation if load or import failed or partly successful? how do I delete those records partly imported? ( I want the ETL process as automatic as possible)
    2. how to deal with log full? (use NONRECOVERABLE when load and NOT LOG INITIALLY when import?)
    3. how to validate if the load is success? (use COUNT(*) before load and after load? is there any other way ?)

    Thanks in advance.
    regards,
    BBG

  2. #2
    Join Date
    Aug 2001
    Location
    UK
    Posts
    4,650

    Re: db2 ETL (extract, transform, load)

    1. May be you can have an extra column in the target which contains the update timestamp .. I'm sure this will work with IMPORT ... For LOAD, I don't have any technique on top of my head ...

    2. NLI with import is OK , but LOAD, in the DELETE phase will use logs even if NONRECOVERABLE option is used ..

    3. Script to check the message file of the LOAD or IMPORT ...

    HTH

    Cheers

    Sathyaram
    Originally posted by bbg
    I have a couple of tables on udb that need to be populated periodically. The data source cames from mainframe db2. Before I use load utitlity in udb, the data is extracted and transformed on mainframe, and then ftp from mainframe to unix box. This is a very typical ETL process, currently, we don't want to buy any third party ETL tools. What we want to do is writing shell scripts, mainframe JCL remote invoke those shell scripts to automate load process after ftp data to unix box. My questions are,

    1. how to deal with situation if load or import failed or partly successful? how do I delete those records partly imported? ( I want the ETL process as automatic as possible)
    2. how to deal with log full? (use NONRECOVERABLE when load and NOT LOG INITIALLY when import?)
    3. how to validate if the load is success? (use COUNT(*) before load and after load? is there any other way ?)

    Thanks in advance.

  3. #3
    Join Date
    Jan 2002
    Posts
    11
    We do similar process and this is how we handle the load.
    Our environment.
    Flat files are created on the mainframe and we transfer it to our DB2 Database server (Windows 2000) using HFT (High speed File Transfer for moving data in a SAN storage network)
    For each file transfered we also send an Audit file containing information about the file transfer consisting of Table Name, Record count etc.
    We invove a batch job on our Windows 2000 from mainframe. We only invoke the subsequent batch jobs when the previous batch job is succesfull.
    On a monthly basis we do the full data load of the current months data. Our Data Warehouse is Star Schema with confirmed dimension. We load the Monthly Data into a Staging Table (Same Schema as the Production Table) and once the data is validated promote it to production table. Each month the staging tables are initialized, after the data is promoted to production.
    The Batch job on our Windows 2000 calls Perl script which invokes db2 script to load the table and captures the Load Statistics generated by DB2 into a table.

    1. how to deal with situation if load or import failed or partly successful? how do I delete those records partly imported? ( I want the ETL process as automatic as possible)
    If a load fails, all subsequent loads jobs are deffered until the database problem is fixed.
    2. how to deal with log full? (use NONRECOVERABLE when load and NOT LOG INITIALLY when import?)
    The Perl script checks for the Log full message in the .mes file.
    Usually for full load there will occur only when there are duplicate records and will happend during index build process.
    3. how to validate if the load is success? (use COUNT(*) before load and after load? is there any other way ?)
    We check the Db2 load stats record count (Stored in our meta data table) with the audit file being sent from the mainframe.

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •