Design Wizz's Banana Splits - Should I split our database?
We have a large database for our main product which collects location data from about 10,000 devices (Device Type 1). The database also contains the device configuration and firmware data, which is used to manage the devices. We have one large customer who generates most of our business and who is responsible for 90% of the data. The location and journey tables in the database currently contain 100 million records, and we are inserting into this single database at a rate of 10 location records per second.
This year we are launching a new type of device (Device Type 2) which uses a different front-end application and will be used by a number of different customers across Europe and the US.
There is a bit of a divide in the office as to whether or not we should split the database into its component parts:

Option A) Split into three databases:
Database 1) Device Manager - a central repository of all devices
Database 2) Device Type 1 data
Database 3) Device Type 2 data

Option B) Keep all the data for both device types in the same database.
As I mentioned, the device types do share commonality: they all have a device record, they send locations, they have firmware, and they have contacts and customers.
My colleague is worried that if we keep the data in one database and something goes wrong, both products go down. His argument is that if the data is split into several databases, a new version of one database can be deployed without affecting the others.
Personally, I like having them in one database because the maintenance is manageable and because the devices share commonality in terms of data.
Does anyone have a view on, or experience of, this type of design problem? That is, what are the design rules for when to create a new database? If you want to know any more detail I'm more than happy to elaborate. I've searched the interweb for design guidance on this problem but I've not seen any concrete views on it.
Your question is really about availability - how to ensure the desired up-time of your database system. The answer will depend on what software, hardware and processes you have as part of your availability planning. Avoiding a single point of failure is a good strategy, but that doesn't mean you need more than one database, because there are systems that will allow any database to fail over to a mirror copy or to a standby server.
By itself, just partitioning the problem into two is not much of a high-availability strategy at all.
Thanks for the responses pootle flump and dportas.
I suppose the question is in part about availability. Taking the design part of the question further: when is it prudent to create a new database? I mean, what, in your experience, drives the decision to create a new database? Does it all boil down to the data - if the applications share any data commonality or entities, should the data be stored in one large database?
There's probably no hard and fast rule, just looking for your opinions.
The gist is that a single data model should typically map to a single database (if you maintain multiple identical copies of a database for availability purposes, this is considered a single database). Anytime you are considering N databases of the same structure containing different data, it is time to stop and think about what you are about to do.
Have a google, and look in this forum, at sub types and super types for modelling entities that share characteristics.
This is a decent article too (it covers this topic by kind of mapping the concept to OOP): Implementing Table Inheritance with SQL Server
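To make the supertype/subtype idea concrete, here is a minimal sketch in Python using SQLite (table and column names such as `journey_profile` are hypothetical illustrations, not taken from the thread): a single Device supertype table holds the attributes common to both device types (identity, firmware, customer), while a per-type subtype table keyed by the same `device_id` holds only what differs. Both device types then live in one database, and shared feeds like locations write to a single table.

```python
import sqlite3

# In-memory database for the sketch; in practice this would be your product DB.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

# Supertype: one row per device, holding the common attributes
# shared by both device types.
conn.execute("""
    CREATE TABLE Device (
        device_id   INTEGER PRIMARY KEY,
        device_type TEXT NOT NULL CHECK (device_type IN ('TYPE1', 'TYPE2')),
        firmware    TEXT,
        customer    TEXT
    )
""")

# Subtypes: one row per device of that type, keyed by the same device_id,
# holding only the attributes unique to each device type.
conn.execute("""
    CREATE TABLE DeviceType1 (
        device_id       INTEGER PRIMARY KEY REFERENCES Device(device_id),
        journey_profile TEXT
    )
""")
conn.execute("""
    CREATE TABLE DeviceType2 (
        device_id INTEGER PRIMARY KEY REFERENCES Device(device_id),
        region    TEXT
    )
""")

# The common location feed writes to one table regardless of device type.
conn.execute("""
    CREATE TABLE Location (
        device_id   INTEGER NOT NULL REFERENCES Device(device_id),
        recorded_at TEXT NOT NULL,
        lat REAL,
        lon REAL
    )
""")

conn.execute("INSERT INTO Device VALUES (1, 'TYPE1', 'fw-1.0', 'BigCustomer')")
conn.execute("INSERT INTO DeviceType1 VALUES (1, 'default')")
conn.execute("INSERT INTO Location VALUES (1, '2012-01-01T00:00:00', 51.5, -0.1)")

# Queries over the common data work identically for both device types.
row = conn.execute("""
    SELECT d.device_type, l.lat, l.lon
    FROM Device d JOIN Location l ON l.device_id = d.device_id
""").fetchone()
print(row)  # ('TYPE1', 51.5, -0.1)
```

The design choice here mirrors the advice above: the shared entities (device, location, firmware, customer) are modelled once in the supertype, so a second device type means adding a subtype table, not a second database.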