  1. #1
    Join Date
    May 2012
    Posts
    1

    Unanswered: ETL performance improvement

    We extract around 40 million rows daily from an Oracle table over the network. The rows are fetched through views.
    Our Informatica process takes almost 6-8 hours, depending on network traffic.

    We are thinking of designing an architecture that extracts the data into separate files, split on the row IDs that are defined
    internally in the Oracle database.

    Suppose our Informatica process produces 20 GB of data every day. Our idea is to create, say, 40 files of 500 MB each so that the process runs much faster.
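
    The split described above can be sketched roughly as follows. This is only an illustration of the arithmetic, assuming the rows can be addressed by a dense position from 1 to 40 million; real Oracle row IDs are not plain integers, and the function name `plan_chunks` is hypothetical.

```python
def plan_chunks(total_rows, num_files):
    """Split row positions 1..total_rows into num_files contiguous ranges."""
    base, extra = divmod(total_rows, num_files)
    chunks = []
    start = 1
    for i in range(num_files):
        # Spread any remainder over the first `extra` chunks.
        size = base + (1 if i < extra else 0)
        end = start + size - 1
        chunks.append((start, end))
        start = end + 1
    return chunks


# Figures from the post: 40 million rows, 20 GB per day, 40 files.
chunks = plan_chunks(40_000_000, 40)
mb_per_file = 20 * 1000 / 40  # 20 GB expressed in MB, divided by 40 files

print(len(chunks), chunks[0], chunks[-1])
# 40 (1, 1000000) (39000001, 40000000)
print(mb_per_file)
# 500.0
```

    Each range could then be handed to a separate extract job. Whether that actually shortens the run depends on where the time is really going.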

  2. #2
    Join Date
    Aug 2003
    Location
    Where the Surf Meets the Turf @Del Mar, CA
    Posts
    7,776
    Provided Answers: 1
    ETL usually stands for Extract, Transform, and Load: moving data from an OLTP DB into a data warehouse DB.
    I don't understand how creating intermediate OS files will reduce the elapsed time of the overall process.

    What exactly is the final destination of the 40 million extracted rows?
    Last edited by anacedent; 05-30-12 at 19:10.
    You can lead some folks to knowledge, but you can not make them think.
    The average person thinks he's above average!
    For most folks, they don't know, what they don't know.
    Good judgement comes from experience. Experience comes from bad judgement.

  3. #3
    Join Date
    Jun 2003
    Location
    West Palm Beach, FL
    Posts
    2,713

    I concur with anacedent. What is the purpose of extracting 40 million rows every day?
    Perhaps you need to find a way to extract only the new and changed rows.
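
    One common way to pick up only new and changed rows is a "high-water mark" on a modification timestamp. The sketch below is an assumption-heavy illustration: it uses an in-memory SQLite database as a stand-in for Oracle, and the table `source_rows` and column `last_modified` are hypothetical. Nothing in this thread says the source table has such a column.

```python
import sqlite3


def extract_changes(conn, watermark):
    """Return rows modified after `watermark`, plus the new watermark."""
    cur = conn.execute(
        "SELECT id, payload, last_modified FROM source_rows "
        "WHERE last_modified > ? ORDER BY last_modified",
        (watermark,),
    )
    rows = cur.fetchall()
    # Advance the watermark to the newest timestamp seen, if any.
    new_watermark = rows[-1][2] if rows else watermark
    return rows, new_watermark


# Hypothetical sample data standing in for the real source table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE source_rows "
    "(id INTEGER PRIMARY KEY, payload TEXT, last_modified TEXT)"
)
conn.executemany(
    "INSERT INTO source_rows VALUES (?, ?, ?)",
    [
        (1, "a", "2012-05-28 10:00:00"),
        (2, "b", "2012-05-29 09:30:00"),
        (3, "c", "2012-05-30 18:45:00"),
    ],
)

rows, watermark = extract_changes(conn, "2012-05-29 00:00:00")
print([r[0] for r in rows], watermark)
```

    Note that a timestamp filter like this does not see deleted rows, so deletes would need separate handling.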
    The person who says it can't be done should not interrupt the person doing it. -- Chinese proverb

  4. #4
    Join Date
    Jun 2004
    Location
    Liverpool, NY USA
    Posts
    2,509
    And there is no way that extracting 40 million rows should take 6 hours. I can move a gig of data in less than a minute. Find out why it takes so long to extract. Moving the data is not your choke point.
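
    A rough back-of-the-envelope check of that claim, using the 20 GB per day and 6-8 hour figures from the first post and taking 1 GB as 1000 MB. The function names are just for illustration.

```python
def throughput_mb_per_s(size_gb, hours):
    """Average throughput in MB/s for size_gb moved in `hours`."""
    return size_gb * 1000 / (hours * 3600)


def transfer_minutes(size_gb, gb_per_minute):
    """Minutes needed to move size_gb at a steady gb_per_minute."""
    return size_gb / gb_per_minute


# Observed: 20 GB in 6 to 8 hours.
print(round(throughput_mb_per_s(20, 6), 2))  # 0.93
print(round(throughput_mb_per_s(20, 8), 2))  # 0.69

# At one gig per minute, the same 20 GB would need:
print(transfer_minutes(20, 1.0))  # 20.0
```

    Under 1 MB/s on average, against roughly 20 minutes at a gig per minute, is what points away from the data movement itself.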
    Bill
    You do not need a parachute to skydive. You only need a parachute to skydive twice.

  5. #5
    Join Date
    Jun 2003
    Location
    West Palm Beach, FL
    Posts
    2,713

    Originally posted by beilstwh:
    And there is no way that extracting 40 million rows should take 6 hours. I can move a gig of data in less than a minute. Find out why it takes so long to extract. Moving the data is not your choke point.
    Bullseye... exactly on target!

    The choke point: "...our informatica process take almost 6-8 hrs".
    If I remember correctly, the Informatica extract uses SQL queries to pull the data ...
