If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

 
Go Back  dBforums > General > Database Concepts & Design > Normalization Vs. Overdesign

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #16 (permalink)  
Old 03-05-10, 05:48
mike_bike_kite mike_bike_kite is offline
vaguely human
 
Join Date: Jun 2007
Location: London
Posts: 2,519
Quote:
Originally Posted by zeroinsum
Normalization Vs. Overdesign
Normalisation is usually the key to any database design but whether it's sensible to redesign your entire database and all the code that goes around it without any clear idea of what benefits to expect is another question. Normalisation in itself isn't a benefit, it's just a process that will hopefully make your database easier to maintain. Have you looked into how much your new design will cost to implement and what the ROI might be? Asking us to agree with your design while not telling us what the database is to hold seems unbelievable to me but here's a link that provides lot's of useful normalised databases you could perhaps use.

You make a fair point regards the stilettos but I suppose it wouldn't be difficult counting all the descriptions that include stiletto or stileto. I guess I could counter by asking what you'd have to do to find how many of your shoes have steel toe caps (let's say this is a new requirement) - you'd have to add a new type table for toe caps, then add a field into your shoe table, then update your input screens to use this field, then populate the data for all your shoes and then finally write some SQL to join the tables and produce a total. Using the simple description field alone means you just have to write the SQL. Sure it's not rigorous but then nor is the world you're modelling.

Normalisation or overdesign? if the existing database works and your new design will be both more expensive and more complicated while providing no real benefits other than being normalised then I'm afraid I'd say it's over designed

Just my 2c
__________________
Mike
Reply With Quote
  #17 (permalink)  
Old 03-05-10, 07:07
r937 r937 is offline
SQL Consultant
 
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,524
nice one, mike

and i got shit for asking "why?" in post #2
__________________
r937.com | rudy.ca
please visit Simply SQL and buy my book
Reply With Quote
  #18 (permalink)  
Old 03-05-10, 14:50
zeroinsum zeroinsum is offline
Registered User
 
Join Date: Mar 2010
Posts: 10
Quote:
Normalisation is usually the key to any database design but whether it's sensible to redesign your entire database and all the code that goes around it without any clear idea of what benefits to expect is another question.
And I've explained the benefits that I'm quite certain I will receive... A couple of times now, actually.

Quote:
Normalisation in itself isn't a benefit, it's just a process that will hopefully make your database easier to maintain.
This is nothing I don't already know.

Quote:
Have you looked into how much your new design will cost to implement and what the ROI might be?
I'm the only person who will be working on this. I'm salaried. I'm also underworked. I have no additional hardware or software costs to do this. Front end code, I'm quite familiar with and can write without difficulty. Thus, the cost can pretty easily be calculated as MyPayRate*TheTimeISpendWorkingOnIt, and then only if it's time I would have spent doing something else anyway (See: underworked).

The return will be that it will be vastly easier for me to maintain it. It will be easier for me to make additions, it will be easier for me to generate accurate reports. I won't have to spend a lot of time wondering if anyone in data entry flubbed something.

It will also shave a vast amount of time off of the data entry process. As a rough estimate, I would say something between 5 and 10 man-hours... daily. We have half a dozen employees who do nothing but take orders and enter them all day. It's a grossly time consuming process at present.

So, here's my problem:

This database sucks. It "works", but the data is unreliable and the data entry is prohibitively time consuming. There is no metric by which fixing it is not a genuine need. Having made that determination, I may as well truly fix the thing instead of putting in a bunch of shortcuts that will likely need more fixing two years down the road. As I said before, I could fairly trivially redesign it to yield the benefits I want, but if I'm going to take the "redesign" step, why half-ass it if I don't need to?

Quote:
Asking us to agree with your design while not telling us what the database is to hold seems unbelievable to me but here's a link that provides lot's of useful normalised databases you could perhaps use.
I'm not asking anyone to agree with a design. I'm providing an example and asking for your opinion about how you would handle that hypothetical information. I'm not sure what you find offensive about that. I'm not saying, "Hey, design my database for me.... but you can't know what's going to be in it, so put on that blindfold and throw some darts at it!"

I know the examples I've given to be logically analogous to the problem I'm working on. I am also capable of translating a solution that works for a hypothetical dataset to work for another, analogous dataset. Fear not, if I fumble that process somehow, or discover that my analogy was flawed, I won't blame anyone for giving me a bad design.

It's actually very important to me that I do understand these design processes in the abstract, and not just as they relate to my current dilemma.

Quote:
I guess I could counter by asking what you'd have to do to find how many of your shoes have steel toe caps (let's say this is a new requirement) - you'd have to add a new type table for toe caps, then add a field into your shoe table, then update your input screens to use this field, then populate the data for all your shoes and then finally write some SQL to join the tables and produce a total. Using the simple description field alone means you just have to write the SQL. Sure it's not rigorous but then nor is the world you're modelling.
Er... you're making an unfair comparison. You're saying I'd have to create new tables and fields and SQL and do a bunch of data entry on the one side, but not on the other. You're right about the fields and tables and SQL...

...but, if at some point in the future I need to know how many steel-toe caps there were, when I never needed to before... why would that information be contained within a description field? It wouldn't. If I don't care about that information right now, I don't care about it right now regardless of whether dataentry consists of a description, or perfectly delimited option sets.

I would not, then, have to *just* write the SQL - it's still new information that we never cared about before, but now we do. We'd STILL have to update every description of every specification - a data entry process that will itself be very much more arduous and error-prone than creating a new table, populating it, and linking it.

Quote:
Normalisation or overdesign? if the existing database works and your new design will be both more expensive and more complicated while providing no real benefits other than being normalised then I'm afraid I'd say it's over designed
Am I missing something? Is data entry that is both fast, efficient, and reliable not a real benefit?

Trust me: I am a card-carrying member of the, "If it ain't broke, don't fix it" school of thought.

To use another analogy, though, let's say you had a car. A run down, beat up, POS that gets 5 miles to a gallon and burns through oil like it is its job. It has no radio, it has no AC, and one of the taillights is broken. It still runs, sure, and it will get you from Point A to Point B, which is its primary purpose... but it is clearly at a point where even "If it isn't broken, don't fix it" is no longer applicable.

I've been maintaining this database for about 5 years now - it's at that point.

I'm not entirely certain how extreme I have to be with the redesign - that's what this is really about. Whether or not it needs a redesign isn't really a question. It does, and has for some time and, honestly, I'd have done it sooner if it wasn't a rather daunting task. It has been annoying, frustrating, and otherwise wasting my time (and the time of everyone else who works here) for the last 5 years.

I assure you, the motivation for doing a redesign is very worthwhile, and it's not simply an academic concern. This is about the degree of redesign necessary - not whether or not to do one.

I would, however, like to thank you for the link. As I said, it is important to me that I understand this in the abstract, and I reckon that link will help substantially.

Last edited by zeroinsum; 03-05-10 at 15:13.
Reply With Quote
  #19 (permalink)  
Old 03-05-10, 15:15
dportas dportas is offline
Registered User
 
Join Date: Dec 2007
Location: London, UK
Posts: 732
Quote:
Originally Posted by zeroinsum View Post
I'm the only person who will be working on this. I'm salaried. I'm also underworked. I have no additional hardware or software costs to do this. Front end code, I'm quite familiar with and can write without difficulty. Thus, the cost can pretty easily be calculated as MyPayRate*TheTimeISpendWorkingOnIt, and then only if it's time I would have spent doing something else anyway (See: underworked).
You mean your company doesn't employ testers or have a QA process, you intend do all the application regression testing yourself and that won't cost your organisation anything either? And the cost or risk of something hitting production that isn't captured during testing is also negligible? Then I'd say go ahead, absolutely.

Personally (and this is just my personal outlook on work, not necessarily yours) I would also be tempted to start looking for a new job. If I'm not in a position where my work is in demand because it is aligned with the priorities of the business then I would think one of A) I'm wasting my time B) I'm working for an organisation that doesn't have a great future anyway.

Good luck.
Reply With Quote
  #20 (permalink)  
Old 03-05-10, 15:39
zeroinsum zeroinsum is offline
Registered User
 
Join Date: Mar 2010
Posts: 10
Quote:
Originally Posted by dportas View Post
You mean your company doesn't employ testers or have a QA process, you intend do all the application regression testing yourself and that won't cost your organisation anything either? And the cost or risk of something hitting production that isn't captured during testing is also negligible? Then I'd say go ahead, absolutely.
Not negligible, perhaps, but definitely much smaller than the cost of continuing to use it as-is. These portions of the database do not, for the most part, interact with the portion I am interested in changing, which limits much of the risk.

It's also worth noting that this isn't a full redesign of the database in its entirety. A lot of it is remaining as is. This is a fleshing out of what previously exists as a single OrderDetails table. Much of the database - customers, employees, raw material sourcing and the like are remaining as-is.

This is not a huge amount of work, either. The level of complexity is a bit greater than I am personally used to dealing with in database design, but we're not talking about a wildly complicated and intricate project, either. On the front end side of this, it will require that changes be made to two data input forms. I can live with that. And, as I said before, I'm not too worried about that aspect of it, as I'm substantially more familiar with that than I am with dealing with the backend.


Quote:
Personally (and this is just my personal outlook on work, not necessarily yours) I would also be tempted to start looking for a new job. If I'm not in a position where my work is in demand because it is aligned with the priorities of the business then I would think one of A) I'm wasting my time B) I'm working for an organisation that doesn't have a great future anyway.

Good luck.
A: You're probably right about that, and I don't, in fact, intend to remain here forever, however...
B: This is a small company with limited IT needs. However, it has been a continually successful company, even through the recession. It is not especially glamorous or challenging work, but it is reliable and pays fairly well for what is asked of me.

While I appreciate your input, this is not a career advice forum, and I find it baffling that so few of the responses have anything to do with what was actually asked.

I really do not need anyone's opinion on job prospects, dating, family, child-rearing, exercise, nutrition, or lifestyle. I am interested solely in your opinions and expertise regarding specifically that which this forum is ostensibly dedicated to - database design.

Maybe I should rephrase my initial question? I'll try that:

Given the hypothetical scenario that you were setting about building a database to store specifications for a customizable product (as previously described), how meticulously would you design it, and why? Every bit accounted for and itemized, or a few shortcuts here and there? And what is your reasoning behind doing it that way? I recognize that there is probably more than one way to skin a cat when it comes to database design. I haven't really settled on away that I personally prefer, so I'm inquiring about other people - again, ostensibly with more experience and expertise in the subject - and how they tend to approach a similar problem.

This hypothetical presupposes that you've already decided to design such a database, so there's no need to say, "Well, I'd just leave it alone because it's not worth it to mess with it."

Last edited by zeroinsum; 03-05-10 at 15:52.
Reply With Quote
  #21 (permalink)  
Old 03-05-10, 15:47
dportas dportas is offline
Registered User
 
Join Date: Dec 2007
Location: London, UK
Posts: 732
For the design in general, your goal should be to make a design to at least Fifth Normal Form.

For your generalisation / specialisation hierarchy, take a look at the following example. However, I recommend you study some of the extensive database design literature on that kind of problem. Halpin for example (cited in the following article) devotes a whole chapter in his book on how to model supertypes and subtypes.

Data Based : Distributed Keys and Disjoint Subtypes
Reply With Quote
  #22 (permalink)  
Old 03-05-10, 15:50
mike_bike_kite mike_bike_kite is offline
vaguely human
 
Join Date: Jun 2007
Location: London
Posts: 2,519
It sounds like there's nothing to stop you going ahead with the project then. Would it be worth discussing the project with the boss - say you'd like to investigate in your free time how the current system could be improved - and then that way they won't feel like they've had the rug pulled from under them.

I'd pay particular attention to making sure that everyone concerned (from data entry people to the bosses) all have something to gain from the new system and that nobody will want to veto it. Remember people hate change. You could try creating the new data entry screens in Access and seeing what those entering the data think. I'd be careful to emphasise the accuracy of the data rather than the time saving otherwise they'll assume you're going to just fire one of them when the system's finished. Also bear in mind that someone (you?) will now have to maintain all the data in the look up tables.

Sorry to sound negative above but I was trying to show the other side of the argument and also trying to get a little more info on what your system held in order that folks could make better recommendations (I failed there). Good luck with it all!
__________________
Mike
Reply With Quote
  #23 (permalink)  
Old 03-05-10, 17:08
MarkATrombley MarkATrombley is offline
Registered User
 
Join Date: Jul 2009
Location: Michigan
Posts: 125
Quote:
Originally Posted by zeroinsum View Post

Maybe I should rephrase my initial question? I'll try that:

Given the hypothetical scenario that you were setting about building a database to store specifications for a customizable product (as previously described), how meticulously would you design it, and why? Every bit accounted for and itemized, or a few shortcuts here and there? And what is your reasoning behind doing it that way?
I will answer the revised question, although you probably have a fairly good idea of my answer by now

I design things very meticulously because almost every application or data source gets reused or expanded on, and building on a shaky foundation is much harder than building on a strong foundation. I can't count the number of times I have been asked to turn out a quick and dirty design because "this is only going to be used by a few people", then two weeks after implementation the same people come back and want to expand the usage to be international and the information to be integrated into several other systems. Over my 30 year career in IT I have found that doing things right once is far better than doing things quick and having to fix it forever.
Reply With Quote
  #24 (permalink)  
Old 03-29-10, 19:23
zeroinsum zeroinsum is offline
Registered User
 
Join Date: Mar 2010
Posts: 10
Quote:
Originally Posted by MarkATrombley View Post
When selecting an order you would select from Order and use an INNER JOIN to OrderDetail on Order.OrderID - OrderDetail.OrderID. You would also use a LEFT OUTER JOIN to BucketDetail on OrderDetail.OrderDetailID = BucketDetail.OrderDetailID and another LEFT OUTER JOIN to BoxDetail on OrderDetail.OrderDetailID = BoxDetail.OrderDetailID. Depending on the product type one of the LEFT OUTER JOINs will fail and one will succeed. The OrderDetail.ProductType column will tell you which of the joins should have worked, which in turn tells you which columns you are expecting to find information in, either BucketDetail columns or BoxDetail columns.

Depending on your application you might do that as all one join as above, or just go as far as the OrderDetail table until the need for more information comes up - in which case you can use the ProductType column to know which table to go look for that information.

So instead of having mutually exclusive columns you would have mutually exclusive tables. A specific OrderDetailID would only be found in one of the sub tables and not the others.

Hopefully that is clearer.
Mark,

First of all, I want to thank you again - you've been exceedingly helpful. Had to step away from the project for a bit to prepare for a relocation, but I've returned to it and I've since gone with something very similar to the above and, so far, it's working pretty well.

One question which, I suspect, will be my last - It appears that the design leaves it technically possible for both of the outer joins to succeed, if it somehow happens, on the data entry side, that an entry is made in each table for one orderdetailID.

I'd like that to be impossible. I am reasonably certain that I can definitely prevent it from happening through the front end side through a variety of checks and data validation techniques, but I'm just curious if there's a way to make it so, on the back end, it's completely impossible for the same orderdetailID to exist in both the bucketdetail table and boxdetail table at the same time.

Is that possible?
Reply With Quote
  #25 (permalink)  
Old 03-29-10, 19:34
MarkATrombley MarkATrombley is offline
Registered User
 
Join Date: Jul 2009
Location: Michigan
Posts: 125
You can use triggers to make it impossible to have rows in both tables. Each table would have a trigger that checked the other table for a row with the orderdetailID in it, and disallow the insertion if a match is found.

It would be better to catch this on the front end, but having the triggers would be good insurance.
Reply With Quote
Reply

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On