If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

 
Go Back  dBforums > General > Database Concepts & Design > multiple vendor connections vs. centralized/ redundant db

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1 (permalink)  
Old 09-22-09, 13:07
yekibud yekibud is offline
Registered User
 
Join Date: Sep 2009
Posts: 6
multiple vendor connections vs. centralized/ redundant db

I have a project to create a dashboard that will connect to existing systems as well as create new features based on combining data from the existing systems. For example, the dashboard will be able to generate "orders" containing data merged from "members" (MS Access DB), "employees" (MySQL DB) and "products" (flat file), and there will also be new attributes particular to "orders."

At first I thought it would be most efficient to have my application connect to each of the systems separately and perform cross-vendor joins between the different databases. But then I thought that creating a centralized/redundant db (built with scripts pushing and pulling data between the systems) might also be useful because it would empower some semi-technical staff to use products like OOBase, which can only make a single connection.

Are there any other advantages to creating a centralized/redundant DB like the one I'm talking about? Or are multiple direct connections the best approach?

Thanks in advance for any tips.
Reply With Quote
  #2 (permalink)  
Old 09-22-09, 13:29
dportas dportas is offline
Registered User
 
Join Date: Dec 2007
Location: London, UK
Posts: 732
A key advantage of creating a Data Warehouse (the centralised, redundant database) is that it can be a conformed single version of the "truth". For instance you can perform any necessary conversion of units, data types and encoding schemes to make sure that all the data can be reconciled and summarised in useful ways.

Also the warehouse approach will makes it easier to manage availability, performance and security issues.

If you haven't built a data warehouse before then get some help. The failure rate for such projects is notoriously high and you'll need someone experienced to help you avoid the pitfalls.
Reply With Quote
  #3 (permalink)  
Old 09-22-09, 17:01
yekibud yekibud is offline
Registered User
 
Join Date: Sep 2009
Posts: 6
Thanks for your reply, dportas. I'm beginning to think a data warehouse is the way to go.

However, I have never needed a consultant ("someone experienced") to help me build an information system. Can you recommend any books or tutorials? Is "The Data Warehouse Toolkit" still the standard?

I would be writing the ETL in Python.
Reply With Quote
  #4 (permalink)  
Old 09-22-09, 17:12
dportas dportas is offline
Registered User
 
Join Date: Dec 2007
Location: London, UK
Posts: 732
Start with Bill Inmon's book "Building the Data Warehouse". Stay away from Kimball.
Reply With Quote
  #5 (permalink)  
Old 09-22-09, 17:16
yekibud yekibud is offline
Registered User
 
Join Date: Sep 2009
Posts: 6
Wow - that book is 17 years old! And "oldie but a goodie", huh?
Reply With Quote
  #6 (permalink)  
Old 09-23-09, 10:45
Derek Asirvadem Derek Asirvadem is offline
Registered User
 
Join Date: Mar 2009
Location: Sydney, Australia
Posts: 258
The miracle of flight is 100 years old. The size and complexity of aircraft has come a long way in that time, but there have been no advances in the theory of flight.

If you do not understand the theory, you will build an albatross, or a grey goose. Marvellous ideas, from very clever people, but they did not fly. Success or aptitude in one field does not guarantee success in another field.
__________________
Regards
Derek Asirvadem (Formerly DerekA)
Information Architect / Senior Sybase DBA
Copyright © 2009 Software Gems Pty Ltd

I answer questions from the Original Poster only. If you have a genuine question, as the moderators have requested, start a new thread.

http://www.softwaregems.com.au

Last edited by Derek Asirvadem; 09-23-09 at 10:49.
Reply With Quote
  #7 (permalink)  
Old 09-23-09, 15:23
yekibud yekibud is offline
Registered User
 
Join Date: Sep 2009
Posts: 6
I've been reading about the differences between Inmon and Kimball, and all these complexities about data warehouses vs data marts, dimensions, star schemas, EAV tables and so forth, and I can't help feeling that it's more complex than it needs to be. Why can't I just look at my data dictionaries, come up with a nice, unified RD schema (integrating the new tables with the redundant/ updated ones), and schedule my ETL update scripts on a daily or event-driven basis?

To give some perspective, we're a mid-sized retail business (160 employees, 600 members, $500K gross weekly sales).
Reply With Quote
  #8 (permalink)  
Old 09-23-09, 15:58
dportas dportas is offline
Registered User
 
Join Date: Dec 2007
Location: London, UK
Posts: 732
Quote:
Originally Posted by yekibud
Why can't I just look at my data dictionaries, come up with a nice, unified RD schema (integrating the new tables with the redundant/ updated ones), and schedule my ETL update scripts on a daily or event-driven basis?
In a nutshell that is what you do. The devil is in the detail however. Dealing with problems such a late arriving data, non-conforming data, surrogate key assignment and representing changes over time are potentially complex problems in data warehouses.

As I said before, I recommend you avoid the Kimball "dimensional" stuff. At best it is just jargon and at worst it can cripple your data model with inflexible dogmas.
Reply With Quote
  #9 (permalink)  
Old 09-23-09, 16:28
yekibud yekibud is offline
Registered User
 
Join Date: Sep 2009
Posts: 6
Thanks for the tips, dportas. At the very least, it sounds like Inman's ideas are something I should read up on, anyway - sort of general IT/DB knowledge. And hopefully it will help me out when I run into some practical issues.
Reply With Quote
  #10 (permalink)  
Old 09-23-09, 16:35
mike_bike_kite mike_bike_kite is offline
vaguely human
 
Join Date: Jun 2007
Location: London
Posts: 2,519
Quote:
Originally Posted by yekibud
I have never needed a consultant ("someone experienced") to help me build an information system. Can you recommend any books or tutorials? Is "The Data Warehouse Toolkit" still the standard?

I would be writing the ETL in Python.

...

To give some perspective, we're a mid-sized retail business (160 employees, 600 members, $500K gross weekly sales).
Why not get someone in to advise you how to go about things - it's great to learn on a fun new project but getting things wrong will cost you long term. Once you have a few recommendations then you can consider how to go about building the system. It doesn't sound complex but why not do things right from the off. Would it help to have a more central system rather than lots of separate systems? What else can you do with your data when you get it all in one place?
Reply With Quote
  #11 (permalink)  
Old 09-24-09, 16:17
yekibud yekibud is offline
Registered User
 
Join Date: Sep 2009
Posts: 6
@dportas: I'm beginning to question whether what I really need is properly called a "data warehouse" - my original term "centralized / redundant db" is starting to sound better. My main issue is with Inmon's "non-volatility" requirement. Why would I want to duplicate an employee record every time an attribute changed? (e.g. change of address, status changed to 'inacitive', etc.) Is non-volatility an essential feature of a data warehouse?

@mkie_bike_kite: Yeah, you're right. I wish I knew someone I trusted who had experience in these things.
Reply With Quote
  #12 (permalink)  
Old 09-24-09, 16:45
dportas dportas is offline
Registered User
 
Join Date: Dec 2007
Location: London, UK
Posts: 732
One of the desirable goals of a warehouse is to capture data to the greatest possible level of detail so you can support both current and future requirements. If you don't capture changes when they happen then that data is probably lost forever. You don't necessarily have to capture every change to every attribute. Part of the design process is deciding what history needs to be captured and to what level of detail. If you are reliant on taking daily "snapshots" of your data then you inevitably lose some of that day's changes anyway unless the history is always preserved at source as well.

Quote:
Originally Posted by yekibud
I wish I knew someone I trusted who had experience in these things.
In the IT business, working alone on anything sucks. It means you have limited opportunities to learn and some of what you learn by experience will probably be defective knowledge. Speaking for myself, if I found I was working alone, particularly on something unfamiliar, for any period of time then that would be a good enough reason for me to look for employment elsewhere. That's just my 0.02 though.
Reply With Quote
Reply

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On