| |
|
If this is your first visit, be sure to check out the FAQ by clicking the link above.
You may have to register before you can post: click the register link above to proceed.
To start viewing messages, select the forum that you want to visit from the selection below.
|
 |

07-28-09, 02:45
|
|
Registered User
|
|
Join Date: Jul 2009
Posts: 7
|
|
|
Why is data integrity so important
|
|
Hi,
We currently have DB's where the constraints are enforced through the API but the DB itself has no constraints.
The problem is that because the DB does not enforce its own integrity, the data is of a low quality with plenty of unexpected issues.
My boss has tasked me to explain why we should design our DB's with constraints instead of being enforced through the API.
Can you help?
Thanks
Josh
|
|

07-28-09, 02:49
|
|
SQL Consultant
|
|
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,524
|
|
Quote:
|
Originally Posted by joshfolgado
My boss has tasked me to explain why we should design our DB's with constraints instead of being enforced through the API.
|
inform your boss that because the database does not enforce its own integrity, the data is of a low quality with plenty of unexpected issues

|
|

07-28-09, 03:35
|
|
King of Understatement
|
|
Join Date: Feb 2004
Location: One Flump in One Place
Posts: 14,905
|
|
|
|
This always gets my goat - why is enforcing relational integrity a "decision" that should be justified? The "courageous" decision was leaving it out in the first place.
__________________
Testimonial:
Quote:
pootle flump
ur codings are working excelent.
|
|
|

07-28-09, 04:03
|
|
Registered User
|
|
Join Date: Jul 2009
Posts: 7
|
|
|
Thanks for your comments
I know and agree that data integrity is totally essential and it is crazy that a person has to justify this.
Not having data integrity does indeed mean that companies have unexpected errors that most times cannot be duplicated.
But unfortunately I do have to justify that we need data integrity where database constraints enforce data quality instead of using API driven constraints.
To give you an example, we even have fields called "sequence" to save at which order the records are inserted instead of using an auto-incrementing column that is also a primary key.
I am looking for a document that proves that data integrity is very important and preferably this document follows some international standard.
Thanks to all for your help
Josh
|
|

07-28-09, 04:15
|
|
King of Understatement
|
|
Join Date: Feb 2004
Location: One Flump in One Place
Posts: 14,905
|
|
Your sequence example is not related to integrity. Integrity ensures the data is valid, not that it is correct.
Take your pick of pretty well any and all relational database design books and articles written in the last thirty years.
__________________
Testimonial:
Quote:
pootle flump
ur codings are working excelent.
|
|
|

07-28-09, 04:18
|
|
King of Understatement
|
|
Join Date: Feb 2004
Location: One Flump in One Place
Posts: 14,905
|
|
In fact, I wouldn't even bother with articles and standards.
Pull out some of the crappy data and ask them "do you want nonsense like this in your database or do you want valid data you can use?". A semi competent business aware person isn't going to need to see a dry, academic article by Chris Date to know that garbage in the database is not going to support his business.
__________________
Testimonial:
Quote:
pootle flump
ur codings are working excelent.
|
|
|

07-28-09, 09:42
|
|
World Class Flame Warrior
|
|
Join Date: Jun 2003
Location: Ohio
Posts: 11,726
|
|
If you do not enforce relational integrity in the database (not in the middle tier or application layer), then it is only a matter of "when", and not "if", your data becomes corrupted.
__________________
If it's not practically useful, then it's practically useless.
blindman
www.chess.com: "sqlblindman"
|
|

07-28-09, 12:31
|
|
Registered User
|
|
Join Date: Dec 2007
Location: London, UK
Posts: 732
|
|
You've asked two different questions. The one in the title of this thread and the one in the actual post.
The first one is easy and you've answered it yourself: Integrity is important because without it you will get inconsistent answers from your data.
The second question is why enforce integrity constraints in the database rather than the application. Most SQL DBMSs can only enforce a limited range of constraints. Even the most basic referential integrity support (foreign keys) is very limited in most SQL systems. So it's a fact that many integrity constraints are often enforced in application code or in other places outside the DBMS such as rules engines or a data integration tier.
However, where it is possible to use the DBMS for constraint enforcement then there are important advantages to doing so:
- Integrity logic only has to be implemented and maintained in one place (the database) rather than in many different applications.
- By decoupling integrity constraints from the application code change control becomes more straightforward.
- Integrity is (usually) guaranteed when implemented in the database, ie. it cannot be bypassed as easily as the application can.
- Constraint enforcement in the database is often more efficient than is possible in the application because it doesn't have to be done over the network layer.
- Constraints in the database are metadata available to the query optimiser, which may take advantage of them to make queries and other code more efficient.
- Constraints in the database are also self-documenting for developers and are available to other tools such as data modelling software or integration software.
But as I already said, it's almost inevitable that some or even most integrity constraints be enforced only through application code. Regretably that's the state of DBMS software in the industry today.
|
Last edited by dportas; 07-28-09 at 12:35.
|

07-28-09, 13:54
|
|
SQL Consultant
|
|
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,524
|
|
Quote:
|
Originally Posted by dportas
But as I already said, it's almost inevitable that some or even most integrity constraints be enforced only through application code. Regretably that's the state of DBMS software in the industry today.
|
you made a number of very valuable and informative points
and then you really harshed my mellow with that bummer of a conclusion
it's downright depressing
you're as much as saying that data integrity can't be completely accomplished without application integrity checks, leaving the inexperienced reader to conclude that he might as well do it all in the application
the rule of thumb should be: do as much as you can with the database, and do only the bits that the database cannot do in the application (and these bits are usually obscure and rare)
your tirade against the inadequacies of current relational database products is getting tiresome
no offence intended towards you personally, of course

|
Last edited by r937; 07-28-09 at 13:59.
|

07-28-09, 14:56
|
|
Registered User
|
|
Join Date: Dec 2007
Location: London, UK
Posts: 732
|
|
Rudy,
My conclusion is indeed depressing, I agree. But it is a widely recognised problem and one which the industry has spent billions fixing and continues to try to solve. It's not just me who thinks so, Bill Gates agrees (even though he wrongly thinks the relational model is to blame). It's a plain fact that businesses spend $millions on rules engines and data quality software because of the limitations of the DBMSs they use. So it's clearly a hot and very practical issue for lots of people.
Gates talks up declarative modeling language effort | Developer World - InfoWorld
Business Rules Community (BRCommunity.com): The world's most trusted resource for business rule professionals
Of course I agree that constraints should be enforced in the database whenever possible but in practice many of them can't be. It isn't an obscure or rare problem at all. In fact the commonest such limitation occurs in most databases I know (even yours I suspect). A generic example would be to enforce the rule that an Order must include at least one Product, or an Invoice must have at least one detail line. That's a very basic business rule but many SQL systems can't do it at all. The ones that can require lots of messy procedural code and inefficient compromises.
|
|

07-28-09, 15:57
|
|
SQL Consultant
|
|
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,524
|
|
Quote:
|
Originally Posted by dportas
... BRCommunity.com
|
omg, ron ross, i met him back in the '80s
i have a ton of his newsletters in a box somewhere
*now* i know where you're coming from

|
|

07-28-09, 16:07
|
|
SQL Consultant
|
|
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,524
|
|
Quote:
|
Originally Posted by dportas
...messy procedural code and inefficient compromises.
|
i got yer compromise right here...
use an INNER JOIN on all data retrievals
(declaring a VIEW is neither messy nor inefficient)
you will never see an order until it has a product, nor an invoice until it has a detail line
vwalah

|
|

07-28-09, 16:41
|
|
Registered User
|
|
Join Date: Dec 2007
Location: London, UK
Posts: 732
|
|
Quote:
|
Originally Posted by r937
i got yer compromise right here...
use an INNER JOIN on all data retrievals
|
I hope you are joking but I can't help stating the obvious in case anyone thinks you are serious. Not everyone here will recognise your sense of irony Rudy!
Using INNER JOINs as a substitute for referential integrity means that users can enter inconsistent data but then it disappears and cannot be retrieved. Most good testers would consider that a bug. At least I hope so. If not, then you should probably hire some better testers.
If it ever did make it into production then you would be multiplying the problems many-fold because the joins will exclude rows from every table that references the inconsistent rows you are hiding from users. Then maybe the data model or the queries change slightly, maybe some rows reappear and some don't. One day of course someone has to come along and clear up the mess and anyone who has done large scale data integration or migration projects will know what kind of pain is caused by unenforced constraints.
Referential constraints and other business rules should always be enforced as far as is practical - either in the database or in application code or via some other enforcement mechanism.
Don't rely on application code, or views, or queries just to conceal the problems in your data. That's definitely a worse option than putting constraints into application code.
|
Last edited by dportas; 07-28-09 at 16:46.
|

07-28-09, 20:11
|
|
SQL Consultant
|
|
Join Date: Apr 2002
Location: Toronto, Canada
Posts: 19,524
|
|
Quote:
|
Originally Posted by dportas
Referential constraints and other business rules should always be enforced as far as is practical
|
[emphasis added]
you're the man now, dog

|
|
| Thread Tools |
Search this Thread |
|
|
|
| Display Modes |
Linear Mode
|
Posting Rules
|
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts
HTML code is Off
|
|
|
|
|