If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

 
Go Back  dBforums > General > Database Concepts & Design > Why is data integrity so important

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1 (permalink)  
Old 07-28-09, 02:45
joshfolgado joshfolgado is offline
Registered User
 
Join Date: Jul 2009
Posts: 7
Why is data integrity so important

Hi,

We currently have DB's where the constraints are enforced through the API but the DB itself has no constraints.

The problem is that because the DB does not enforce its own integrity, the data is of a low quality with plenty of unexpected issues.

My boss has tasked me to explain why we should design our DB's with constraints instead of being enforced through the API.

Can you help?

Thanks

Josh
Reply With Quote
  #2 (permalink)  
Old 07-28-09, 02:49
r937 r937 is offline
SQL Consultant
 
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,524
Quote:
Originally Posted by joshfolgado
My boss has tasked me to explain why we should design our DB's with constraints instead of being enforced through the API.
inform your boss that because the database does not enforce its own integrity, the data is of a low quality with plenty of unexpected issues

__________________
r937.com | rudy.ca
please visit Simply SQL and buy my book
Reply With Quote
  #3 (permalink)  
Old 07-28-09, 03:35
pootle flump pootle flump is offline
King of Understatement
 
Join Date: Feb 2004
Location: One Flump in One Place
Posts: 14,905
This always gets my goat - why is enforcing relational integrity a "decision" that should be justified? The "courageous" decision was leaving it out in the first place.
__________________
Testimonial:
Quote:
pootle flump
ur codings are working excelent.
Reply With Quote
  #4 (permalink)  
Old 07-28-09, 04:03
joshfolgado joshfolgado is offline
Registered User
 
Join Date: Jul 2009
Posts: 7
Thanks for your comments

I know and agree that data integrity is totally essential and it is crazy that a person has to justify this.

Not having data integrity does indeed mean that companies have unexpected errors that most times cannot be duplicated.

But unfortunately I do have to justify that we need data integrity where database constraints enforce data quality instead of using API driven constraints.

To give you an example, we even have fields called "sequence" to save at which order the records are inserted instead of using an auto-incrementing column that is also a primary key.

I am looking for a document that proves that data integrity is very important and preferably this document follows some international standard.

Thanks to all for your help

Josh
Reply With Quote
  #5 (permalink)  
Old 07-28-09, 04:15
pootle flump pootle flump is offline
King of Understatement
 
Join Date: Feb 2004
Location: One Flump in One Place
Posts: 14,905
Your sequence example is not related to integrity. Integrity ensures the data is valid, not that it is correct.

Take your pick of pretty well any and all relational database design books and articles written in the last thirty years.
__________________
Testimonial:
Quote:
pootle flump
ur codings are working excelent.
Reply With Quote
  #6 (permalink)  
Old 07-28-09, 04:18
pootle flump pootle flump is offline
King of Understatement
 
Join Date: Feb 2004
Location: One Flump in One Place
Posts: 14,905
In fact, I wouldn't even bother with articles and standards.

Pull out some of the crappy data and ask them "do you want nonsense like this in your database or do you want valid data you can use?". A semi competent business aware person isn't going to need to see a dry, academic article by Chris Date to know that garbage in the database is not going to support his business.
__________________
Testimonial:
Quote:
pootle flump
ur codings are working excelent.
Reply With Quote
  #7 (permalink)  
Old 07-28-09, 09:42
blindman blindman is offline
World Class Flame Warrior
 
Join Date: Jun 2003
Location: Ohio
Posts: 11,726
If you do not enforce relational integrity in the database (not in the middle tier or application layer), then it is only a matter of "when", and not "if", your data becomes corrupted.
__________________
If it's not practically useful, then it's practically useless.

blindman
www.chess.com: "sqlblindman"
Reply With Quote
  #8 (permalink)  
Old 07-28-09, 12:31
dportas dportas is offline
Registered User
 
Join Date: Dec 2007
Location: London, UK
Posts: 732
You've asked two different questions. The one in the title of this thread and the one in the actual post.

The first one is easy and you've answered it yourself: Integrity is important because without it you will get inconsistent answers from your data.

The second question is why enforce integrity constraints in the database rather than the application. Most SQL DBMSs can only enforce a limited range of constraints. Even the most basic referential integrity support (foreign keys) is very limited in most SQL systems. So it's a fact that many integrity constraints are often enforced in application code or in other places outside the DBMS such as rules engines or a data integration tier.

However, where it is possible to use the DBMS for constraint enforcement then there are important advantages to doing so:

- Integrity logic only has to be implemented and maintained in one place (the database) rather than in many different applications.

- By decoupling integrity constraints from the application code change control becomes more straightforward.

- Integrity is (usually) guaranteed when implemented in the database, ie. it cannot be bypassed as easily as the application can.

- Constraint enforcement in the database is often more efficient than is possible in the application because it doesn't have to be done over the network layer.

- Constraints in the database are metadata available to the query optimiser, which may take advantage of them to make queries and other code more efficient.

- Constraints in the database are also self-documenting for developers and are available to other tools such as data modelling software or integration software.

But as I already said, it's almost inevitable that some or even most integrity constraints be enforced only through application code. Regretably that's the state of DBMS software in the industry today.

Last edited by dportas; 07-28-09 at 12:35.
Reply With Quote
  #9 (permalink)  
Old 07-28-09, 13:54
r937 r937 is offline
SQL Consultant
 
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,524
Quote:
Originally Posted by dportas
But as I already said, it's almost inevitable that some or even most integrity constraints be enforced only through application code. Regretably that's the state of DBMS software in the industry today.
you made a number of very valuable and informative points

and then you really harshed my mellow with that bummer of a conclusion

it's downright depressing

you're as much as saying that data integrity can't be completely accomplished without application integrity checks, leaving the inexperienced reader to conclude that he might as well do it all in the application

the rule of thumb should be: do as much as you can with the database, and do only the bits that the database cannot do in the application (and these bits are usually obscure and rare)

your tirade against the inadequacies of current relational database products is getting tiresome

no offence intended towards you personally, of course

__________________
r937.com | rudy.ca
please visit Simply SQL and buy my book

Last edited by r937; 07-28-09 at 13:59.
Reply With Quote
  #10 (permalink)  
Old 07-28-09, 14:56
dportas dportas is offline
Registered User
 
Join Date: Dec 2007
Location: London, UK
Posts: 732
Rudy,

My conclusion is indeed depressing, I agree. But it is a widely recognised problem and one which the industry has spent billions fixing and continues to try to solve. It's not just me who thinks so, Bill Gates agrees (even though he wrongly thinks the relational model is to blame). It's a plain fact that businesses spend $millions on rules engines and data quality software because of the limitations of the DBMSs they use. So it's clearly a hot and very practical issue for lots of people.

Gates talks up declarative modeling language effort | Developer World - InfoWorld
Business Rules Community (BRCommunity.com): The world's most trusted resource for business rule professionals

Of course I agree that constraints should be enforced in the database whenever possible but in practice many of them can't be. It isn't an obscure or rare problem at all. In fact the commonest such limitation occurs in most databases I know (even yours I suspect). A generic example would be to enforce the rule that an Order must include at least one Product, or an Invoice must have at least one detail line. That's a very basic business rule but many SQL systems can't do it at all. The ones that can require lots of messy procedural code and inefficient compromises.
Reply With Quote
  #11 (permalink)  
Old 07-28-09, 15:57
r937 r937 is offline
SQL Consultant
 
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,524
Quote:
Originally Posted by dportas
... BRCommunity.com
omg, ron ross, i met him back in the '80s

i have a ton of his newsletters in a box somewhere

*now* i know where you're coming from


__________________
r937.com | rudy.ca
please visit Simply SQL and buy my book
Reply With Quote
  #12 (permalink)  
Old 07-28-09, 16:07
r937 r937 is offline
SQL Consultant
 
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,524
Quote:
Originally Posted by dportas
...messy procedural code and inefficient compromises.
i got yer compromise right here...

use an INNER JOIN on all data retrievals

(declaring a VIEW is neither messy nor inefficient)

you will never see an order until it has a product, nor an invoice until it has a detail line

vwalah

__________________
r937.com | rudy.ca
please visit Simply SQL and buy my book
Reply With Quote
  #13 (permalink)  
Old 07-28-09, 16:41
dportas dportas is offline
Registered User
 
Join Date: Dec 2007
Location: London, UK
Posts: 732
Quote:
Originally Posted by r937
i got yer compromise right here...

use an INNER JOIN on all data retrievals
I hope you are joking but I can't help stating the obvious in case anyone thinks you are serious. Not everyone here will recognise your sense of irony Rudy!

Using INNER JOINs as a substitute for referential integrity means that users can enter inconsistent data but then it disappears and cannot be retrieved. Most good testers would consider that a bug. At least I hope so. If not, then you should probably hire some better testers.

If it ever did make it into production then you would be multiplying the problems many-fold because the joins will exclude rows from every table that references the inconsistent rows you are hiding from users. Then maybe the data model or the queries change slightly, maybe some rows reappear and some don't. One day of course someone has to come along and clear up the mess and anyone who has done large scale data integration or migration projects will know what kind of pain is caused by unenforced constraints.

Referential constraints and other business rules should always be enforced as far as is practical - either in the database or in application code or via some other enforcement mechanism.

Don't rely on application code, or views, or queries just to conceal the problems in your data. That's definitely a worse option than putting constraints into application code.

Last edited by dportas; 07-28-09 at 16:46.
Reply With Quote
  #14 (permalink)  
Old 07-28-09, 20:11
r937 r937 is offline
SQL Consultant
 
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,524
Quote:
Originally Posted by dportas
Referential constraints and other business rules should always be enforced as far as is practical
[emphasis added]

you're the man now, dog

__________________
r937.com | rudy.ca
please visit Simply SQL and buy my book
Reply With Quote
Reply

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On