If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

 
Go Back  dBforums > General > Database Concepts & Design > Performance of loading a Data Warehouse

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1 (permalink)  
Old 05-17-04, 17:41
barryw barryw is offline
Registered User
 
Join Date: Apr 2003
Location: London, England
Posts: 42
Performance of loading a Data Warehouse

I wonder whether anyone has any experience or advice on my problem.

I am designing an analysis tool to monitor the performance of about 200 data extract files as they are loaded into a Data Warehouse.
The key feature is to adjust the loading sequence automatically if specific extract files arrive early or late by identifying the dependencies on the missing files.

I will be very grateful for any suggestions or advice.

Barry
Reply With Quote
  #2 (permalink)  
Old 05-18-04, 06:43
nik_bg nik_bg is offline
Registered User
 
Join Date: May 2004
Posts: 1
The best solution is to define a small database (you may use the warehouse) with tables that describe the required files and the dependencies. Then you just have to write several queries.
Reply With Quote
  #3 (permalink)  
Old 05-19-04, 18:05
barryw barryw is offline
Registered User
 
Join Date: Apr 2003
Location: London, England
Posts: 42
Thanks but ...

I have already done what you suggested - what's your next suggestion ?

Barry
Reply With Quote
  #4 (permalink)  
Old 05-20-04, 01:55
rajiravi rajiravi is offline
Registered User
 
Join Date: Apr 2004
Location: Toronto, Canada
Posts: 249
Hi,

Let me see if I understand the problem.

You have many extract files that arrive at various points in times. These files are to be processed, sometimes in specified order.For example if FileA has to be processed before FileB can be processed, then, if FileB arrives first, it must not be processed.

The tool that you are creating should wait until FileA arrives and is processed successfully.

This is essentially a workflow problem. Many vendors provide software solutions for these types of situations.

If you are developing in-house solutions, then the process that runs FileB must check if FileA has been processed before starting work on FileB.

In a Unix environment, you could create a shell script that does this. In a loop, it could check if FileA has been processed. If yes, process FileB. If not, "sleep" for some time and then repeat until FileA has been processed.

More complicated situations can be handled in a similar manner. Alternatively, you could use your database management system's scheduler to accomplish the same task.

Was that what you were looking for? Or did I misunderstand your problem?

Ravi
Reply With Quote
  #5 (permalink)  
Old 05-20-04, 17:28
barryw barryw is offline
Registered User
 
Join Date: Apr 2003
Location: London, England
Posts: 42
This is my DW Data Loading problem ...

You have understood the problem very well.
My current situation is that I have a Data Dictionary in an Access Database.
I use this to generate UNIX Shell and SQL Scripts to control the Loading into a Data Warehouse, using Informatica and Oracle.
Now I am getting hit with many more files than I had been told to expect so I am having trouble loading all the files in the overnight window.
Therefore I would like to add the ability to evaluate several options for sequencing and loading files, based on performance stats taken from actual log files.
Then I would like to automatically generate the Scripts to load the files in the sequence I have decided is optimal.
Of course, this might change every night.
I now want to add functionality for this kind of Analysis Engine on top of my Scripting Engine.

I would appreciate any advice or suggestions.

Barry
Reply With Quote
Reply

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On