If this is your first visit, be sure to check out the FAQ by clicking the link above. You may have to register before you can post: click the register link above to proceed. To start viewing messages, select the forum that you want to visit from the selection below.

 
Go Back  dBforums > General > Database Concepts & Design > When to use XML instead of database tables for detailed structures

Reply
 
LinkBack Thread Tools Search this Thread Display Modes
  #1 (permalink)  
Old 06-03-06, 20:57
bitstacker bitstacker is offline
Registered User
 
Join Date: May 2006
Posts: 6
When to use XML instead of database tables for detailed structures

I'm looking for links that discuss using XML instead of multiple tables for detailed, key-value-specific data structures.

I'm seeing certain control structures in my DB design that form little hierarchies of values. These could be rendered in the DB using several "small" tables, or in an XML document that's stored in a single field in the parent records of these structures.

I'd like to know if there are any general guidelines or lessons learned about when to stop building smaller table structures and when to use XML in these cases.

Any guidance on this would be appreciated.

Thanks!
Reply With Quote
  #2 (permalink)  
Old 06-05-06, 05:57
r123456 r123456 is offline
Registered User
 
Join Date: Sep 2003
Location: The extremely Royal borough of Kensington, London
Posts: 778
Yuck!

http://thedailywtf.com/forums/60879/ShowPost.aspx

Not only do you encounter the problems that are discussed in the article to which the above link points, but most likely your database will not have a single unique constraint nor any foreign key constraints.
__________________
Bessie Braddock: Winston, you are drunk!
Churchill: And Madam, you are ugly. And tomorrow, I'll be sober, and you will still be ugly.
Reply With Quote
  #3 (permalink)  
Old 06-05-06, 12:16
bitstacker bitstacker is offline
Registered User
 
Join Date: May 2006
Posts: 6
Thanks for the input. The url definitely landed right on the problem (though it seems to have revealed some "religious" sentiments on the subject).

I've concluded that the question is largely motivated by my inexperience (and thus trepidation) about the relational model, and that in the detailed use cases it's going to fall out in favor of tables most of the time.

From pure SQL processing there's no question that XPATH is a headache. I'm planning to use a lot of DataTables in memory, though (perhaps another theological discussion :0), where I could encapsulate the XPATH processing in subroutines....but even then, I'd be making the DB harder to access if there are other uses in the future.

Maybe that's a strong determining factor. If XML looks "easier" now, would it be "easier" for extensions of the application in the future? Sure, "those programmers" could learn and use XPATH, but isn't that just another barrier to extensibility of the app?

I'd say that insight right there weighs heavily against the XML alternative, unless there's a partcularly strong reason to choose it.

Thanks again for the prompting on this.
Reply With Quote
  #4 (permalink)  
Old 06-05-06, 15:05
Pat Phelan Pat Phelan is offline
Resident Curmudgeon
 
Join Date: Feb 2004
Location: In front of the computer
Posts: 12,605
I've been down that path many times. Sometimes things go in favor of the database, sometimes in favor of application based processing (where the code "insures" the data integrity).

I actually did a comparative study three years ago, against several systems that had similar lifespans but drastically different data volumes. The system with the highest data volume used tables almost exclusively, and had referential integrity. We estimate that it cost about 13% more per function to code using RI, but there was almost zero cost for maintaining data integrity (there was some cost due to having to figure out how to match other systems that did not have DRI, so they had "bad data"). The applications that had application enforced integrity cost 300 to 5000 times as much for ongoing data maintenance (note that is times, not percent!).

The short answer is that if you're going to have very small numbers of transactions (a few hundred per year), by a very small group of users (mid-teens), then having the application handle the integrity can be worthwhile to save coding costs. If you are going to have any significant number of transactions (a thousand or more per year) or a significant number of users (more than twenty), you will probably save money in the long run by using tables with declared referential integrity.

-PatP
Reply With Quote
  #5 (permalink)  
Old 06-05-06, 19:02
bitstacker bitstacker is offline
Registered User
 
Join Date: May 2006
Posts: 6
Thanks for the input. That's a very useful perspective. I'm going to be doing the "application processing" using VS2005 Datasets, which offer to provide data integrity...I'll be taking that offer much more seriously, based on your comments.
Reply With Quote
  #6 (permalink)  
Old 06-06-06, 04:40
r123456 r123456 is offline
Registered User
 
Join Date: Sep 2003
Location: The extremely Royal borough of Kensington, London
Posts: 778
A very wise move would be to develop all of the data processes in the database, given that data processes interact with data that would be stored in the database.

It still amazes me that people believe applications should handle data processes. Funny, I've seen this happen in reality and the resulting systems are horrendous - extremely slow, extremely difficult to maintain, and full of corrupt data.

Further a change to a data process as defined from someone non-technical, which is how it should work, can be compared in terms of development costs by considering the following:

1 line amendment to a stored procedure

vs

1000 line amendment to some C# code

The stored procedure can be tested very easily and quickly, where in contrast the application change would take days to test at minimum.

Regarding DataSets, I fail to see how these can add any value to your application at all. Simply return a cursor to the result set in the database and use the strengths of a web language like ASP.NET correctly, which are certainly not data processing.
__________________
Bessie Braddock: Winston, you are drunk!
Churchill: And Madam, you are ugly. And tomorrow, I'll be sober, and you will still be ugly.
Reply With Quote
  #7 (permalink)  
Old 06-06-06, 13:40
bitstacker bitstacker is offline
Registered User
 
Join Date: May 2006
Posts: 6
Re: Datasets

I'm a newbie to contemporary DB design. Perhaps the biggest uncertainty I face is actual performance metrics (response time to the user), which always seem to get down to "it depends".

The overall scenario for my app is described here:

General DB Design

Datasets are presented in the literature on ASP.NET 2.0 as being a representation of SQL Server tables, with relationships, in memory, in a tool set that includes TableAdapters (DataAdapters, at the DataSet level only, in ASP.NET 1.x) and relationships (which again can offer referential integrity). Their purpose (again from the literature, and this is my interpretation) is to provide an "intelligent" caching layer. TableAdapters also (for me) provide an "interface" mechanism between code and stored procedures. For me, this means that the SQL commands can be encapsulated, and that the application writing can focus on business logic. TableAdapters also automagically generate update, delete and insert commands, which is an unqualified plus for me, not being that familiar with SQL.

The I/O requirements for my app are mostly 1) present a selection of rows in a Gridview, and 2) insert rows back into the table. In the first release there won't be a lot of need for unions, joins, distincts and so on ("and so on" = "I don't know this stuff very well"). However, I do want to allow for complex operations on the DB in the future...that was the deciding reason to adopt the principle of "relational" over "xml" for the control tables.

I've made two major assumptions here about what Microsoft is doing with Datasets. 1) They've anticipated the problems of referential integrity and "data processing" code maintenance, and 2) they think that there is some performance benefit to using cached datatables (which is a strong enough benefit to outweigh any lost performance due to the TableAdapter interface). I haven't been able to net-find anything about performance issues with TableAdapters...maybe they're too new, but I would think that would have come out.

Since (from the overal scenario description above) the app is going to serve "slices" of the overall SQL table set to groups of users, with lots of refreshing of the row set views (in the ASP.NET 2.0 Gridview UI control), using DataTables seemed like a natural fit. From an ease of use standpoint, being a relative newbie to this type of coding, it's an attractive choice.

Going back to performance, one point I haven't validated one way or the other is whether I could get the caching provided by the DataTable/TableAdapter approach through built in, internal features of SQL Server. I'm assuming the DB engine is doing a lot of intelligent internal caching and indexing based on "what the users are asking for", so there might be some redundancy in using DataTables. On the other hand, since DataTables and TableAdapters are very closely integrated with SQL Server 2005, I'm assuming that MS has anticipated that and offered DataTables etc. as a genuine additional benefit on top of internal SQL Server caching.

There are lots of reasons to question Microsoft technology, and plenty of examples of MS products that have issues, but MS put a whole lot of work into the DataTable/TableAdapter/SQL Server techonology (and I'm sure I've barely scratched the surface in terms of using it), and they know they're competing head to head with JSEE + every other DB on the planet. I assume they tried to make it work as well as they could.

So, when an internet distributed app requires HTML, JS, CSS, VB or C-something, XML, SQL, and what-did-I-leave-out, for a non-professional programmer like myself, datasets (meaning DataTables etc.) and the integration they provide look like a pretty good deal.
Reply With Quote
Reply

Thread Tools Search this Thread
Search this Thread:

Advanced Search
Display Modes

Posting Rules
You may not post new threads
You may not post replies
You may not post attachments
You may not edit your posts

BB code is On
Smilies are On
[IMG] code is Off
HTML code is Off
Trackbacks are On
Pingbacks are On
Refbacks are On