(newbie alert...at least with this level of DB design)
I'm about to commit to a major DB design for SQL Server Express and ASP.NET, and, having never done this before (except piecemeal in Access), I thought I'd see if I could get some views and suggestions about my approach.
I'm hoping the general structure of my data organization is not unique and that people have seen things like this and that there are "best practices" that might apply.
Due to various constraints (my ability, time and budget being foremost), I'm currently building for a one-server operation, hopefully leaving design capacity so I can expand to a server farm environment later (after wild success on the single server
I'm building an app that is a "service" in which "projects" are conducted around "sessions" (interactive group sessions) that produce "workproducts".
In the design, here are several major tables required to run the service, and about a dozen tables required to run a project...call this second group the "Project Files Set".
A "large" project (80% percentile) might have 2MB of data, with maybe a few thousand records in all the different "Project File Set" tables.
Projects can be "active" (being worked on now), "open" (worked on maybe yesterday and tommorrow, but not right now) or "closed" (no longer worked on, only being archived.)
The primary question here is about moving projects between "active" and "open" status, in order to get better I/O performance for the "active" projects.
In the "full server" scenario, there may be 100 active projects and 1000 open projects.
I'm planning to use a cache memory (dataset/datatable objects) to keep most the data from an "active project" in memory. I plan to persist any change to this data immediately back to disk. 90%+ of the changes will be row inserts.
Maybe 1000 row inserts per minute = total load from 100 active projects (I suspect between memory use and CPU use I'll use up capacity on a single server well before I run out of disk, in terms of increasing active projects per server...but that's just a guess.)
Given this background, I am (or was) looking at four major alternatives for the overall data structure:
Alternative A): Make a separate DB for the "Active" projects and the "Open" projects.
>>>> When a project is "active" (someone has logged on and wants to use it), the data is selected by project GUID from the "Open" projects database and copied into the "Active" projects database, and marked as "in use" in the "Open" database. When the project becomes "inactive", it is copied from "Active" back to "Open" database, and deleted from the "Active" database.
>>> A1) In this alt, each database contains one "Project File Set"...that is, the "Projects" file contains a "Project Header Record" for each open project; the "Sessions" file contains a "Session Header Record" for each session (linked to its project), and so on...all the data for all "active" or all "open" projects is in the same file set, separated by different project GUID values.
>>> A2) The purpose of separating the "Active" from the "Open" is to reduce disk I/O time for i) loading memory from "Active" files and ii) inserting and updating records back to "Active" files.
>>>> The "Active" files might represent 100 projects where the main "workproduct" files are ~100,000 records long (including all active projects) where the "Open" files might represent 1000 projects and the workproduct files are ~1M records long.
>>>> Maybe easier to develop...just copy the "Active" to the "Open" when the detailed design has settled down? Use Bulk Copy routines for moving the data between the two?
Alternative B): In this alternative the "Active" and "Open" files are in the same database, with different names: "Projects_Active" & "Projects_Open"; "Sessions_Active" & "Sessions_Open", and so on for all the other files in the "Project File Set".
>>>> May a long file list, but easier to copy records between files?
Alternative C): Use some kind of blob repository to store the "open" projects tables when they're not "active"
>>> I looked at this but then couldn't see that it would save any space or be more secure, and would just create a "serialize/deserialize" function that wouldn't be any simpler than the queries required to move multiple files between "active" and "open".
Alternative D): Make a separate DB for each Project. Use an "template" DB and copy it, rename it, and initialize it for each Project.
>>> I've gradually learned this is "bad design" (I suspect it's "terrible design"); plus I couldn't figure out how to point a TableAdapter in ASP.NET at multiple connection strings. This seemed so far off the beaten track that I decided it's out.
So, I'm hoping to get some feedback to help:
1) confirm that alternatives C) and D) don't make sense, and
2) understand why one or the other of A) or B) would be better.
3) or, rethink the whole thing with another general approach.