Re: Datasets
I'm a newbie to contemporary DB design. Perhaps the biggest uncertainty I face is actual performance metrics (response time to the user), which always seem to get down to "it depends".
The overall scenario for my app is described here:
General DB Design
Datasets are presented in the literature on ASP.NET 2.0 as being a representation of SQL Server tables, with relationships, in memory, in a tool set that includes TableAdapters (DataAdapters, at the DataSet level only, in ASP.NET 1.x) and relationships (which again can offer referential integrity). Their purpose (again from the literature, and this is my interpretation) is to provide an "intelligent" caching layer. TableAdapters also (for me) provide an "interface" mechanism between code and stored procedures. For me, this means that the SQL commands can be encapsulated, and that the application writing can focus on business logic. TableAdapters also automagically generate update, delete and insert commands, which is an unqualified plus for me, not being that familiar with SQL.
The I/O requirements for my app are mostly 1) present a selection of rows in a Gridview, and 2) insert rows back into the table. In the first release there won't be a lot of need for unions, joins, distincts and so on ("and so on" = "I don't know this stuff very well"). However, I do want to allow for complex operations on the DB in the future...that was the deciding reason to adopt the principle of "relational" over "xml" for the control tables.
I've made two major assumptions here about what Microsoft is doing with Datasets. 1) They've anticipated the problems of referential integrity and "data processing" code maintenance, and 2) they think that there is some performance benefit to using cached datatables (which is a strong enough benefit to outweigh any lost performance due to the TableAdapter interface). I haven't been able to net-find anything about performance issues with TableAdapters...maybe they're too new, but I would think that would have come out.
Since (from the overal scenario description above) the app is going to serve "slices" of the overall SQL table set to groups of users, with lots of refreshing of the row set views (in the ASP.NET 2.0 Gridview UI control), using DataTables seemed like a natural fit. From an ease of use standpoint, being a relative newbie to this type of coding, it's an attractive choice.
Going back to performance, one point I haven't validated one way or the other is whether I could get the caching provided by the DataTable/TableAdapter approach through built in, internal features of SQL Server. I'm assuming the DB engine is doing a lot of intelligent internal caching and indexing based on "what the users are asking for", so there might be some redundancy in using DataTables. On the other hand, since DataTables and TableAdapters are very closely integrated with SQL Server 2005, I'm assuming that MS has anticipated that and offered DataTables etc. as a genuine additional benefit on top of internal SQL Server caching.
There are lots of reasons to question Microsoft technology, and plenty of examples of MS products that have issues, but MS put a whole lot of work into the DataTable/TableAdapter/SQL Server techonology (and I'm sure I've barely scratched the surface in terms of using it), and they know they're competing head to head with JSEE + every other DB on the planet. I assume they tried to make it work as well as they could.
So, when an internet distributed app requires HTML,
JS, CSS,
VB or C-something, XML, SQL, and what-did-I-leave-out, for a non-professional programmer like myself, datasets (meaning DataTables etc.) and the integration they provide look like a pretty good deal.