Results 1 to 4 of 4
  1. #1
    Join Date
    Jul 2009
    Posts
    2

    Question Data model for master data selectively overridden per customer

    Hi folks, I'm hoping for some input on this problem and my solution. I've not found a solution that doesn't have some drawback, but I'm sure someone has faced a modeling problem like this before.

    Thanks in advance for any help.
    ------------------------------------

    For this question, consider an application using a multi-tennant database with modeling of manufacturers and models. If we're talking about cars then the manufacturers would be Ford, Chevrolet, BMW, etc. and the models would be F-150, Camaro, and M3.

    Relationship of model to manufacturer is many to one. Data for each tenant is separated using a customer_id.

    Requirements for the data model:
    • Manufacturers and models can be defined at the master level to make them available to all customers
    • Customer select which of the master entities they would like to use
    • Customers may override attributes of a master model or manufacturer
    • Customers may create their own manufacturers
    • Customers may create their own models for their own or a master manufacturer
    • Other entities in the model will be related to these entities so it is desirable that there be one main table for each that a foreign key can be made to. The manufacturers and models tables fill that role in the example.


    In this example:
    • Customer 1 uses Ford as-is, overrides Chevrolet, and adds two custom manufacturers
    • Customer 1 uses Chevrolet and BMW as-is and adds one custom manufacturer
    • Models are created as per the comments in the script


    Below is an annotated sample implementation that meets all of the requirements. Where the syntax "attribute_1...attribute_n" is used I am simply leaving out the rest of the columns for brevity. It is not meant to imply a mapping of an arbitrary number of attributes.

    • How can this be improved?
    • In what other ways could these relationships be modeled?


    Code:
    	/*
    	 * Master manufacturers shared between all customers
    	 */
    	CREATE TABLE master_manufacturers (
    		master_manufacturer_id INTEGER NOT NULL,
    		name VARCHAR(100) NOT NULL,
    		attribute_1 VARCHAR(50),
    		/* ... */
    		attribute_n VARCHAR(50),
    		PRIMARY KEY (master_manufacturer_id)
    	);
    
    	INSERT INTO
    		master_manufacturers (master_manufacturer_id, name)
    	VALUES
    		(1, 'Ford'),
    		(2, 'Chevrolet'),
    		(3, 'BMW');
    
    	/*
    	 * A Customer's manufacturer.  
    	 *   If master_manufacturer_id IS NULL, then it is a custom manufacturer and manufacturer_custom contains the data
    	 *   If master_manufacturer_id IS NOT NULL and manufacturer_custom does not exist, then the master is used without modification
    	 *   If master_manufacturer_id IS NOT NULL and manufacturer_custom exists, then the master is overridden
    	 */
    	CREATE TABLE manufacturers (
    		manufacturer_id INTEGER NOT NULL,
    		customer_id INTEGER NOT NULL,
    		master_manufacturer_id INTEGER,
    		PRIMARY KEY (manufacturer_id),
    		FOREIGN KEY (master_manufacturer_id) REFERENCES master_manufacturers (master_manufacturer_id),
    		UNIQUE (customer_id, master_manufacturer_id)
    	);
    
    	INSERT INTO
    		manufacturers (manufacturer_id, customer_id, master_manufacturer_id)
    	VALUES
    		(1, 1, 1),
    		(2, 1, 2),
    		(3, 1, NULL),
    		(4, 1, NULL),
    		(5, 2, 2),
    		(6, 2, 3),	
    		(7, 2, NULL);	
    
    	CREATE TABLE manufacturer_custom (
    		manufacturer_id INTEGER NOT NULL,
    		name VARCHAR(100) NOT NULL,
    		attribute_1 VARCHAR(50),
    		/* ... */
    		attribute_n VARCHAR(50),
    		PRIMARY KEY (manufacturer_id),
    		FOREIGN KEY (manufacturer_id) REFERENCES manufacturers (manufacturer_id)
    	);
    
    	INSERT INTO
    		manufacturer_custom (manufacturer_id, name)
    	VALUES
    		(2, 'Chevy'),
    		(3, 'Cust 1 Custom 1'),
    		(4, 'Cust 1 Custom 2'),
    		(7, 'Cust 2 Custom 1');
    
    	/*
    	 * Master models shared between all customers
    	 */
    	CREATE TABLE master_models (
    		master_model_id INTEGER NOT NULL,
    		master_manufacturer_id INTEGER NOT NULL,
    		name VARCHAR(100) NOT NULL,
    		attribute_1 VARCHAR(50),
    		/* ... */
    		attribute_n VARCHAR(50),
    		PRIMARY KEY (master_model_id),
    		FOREIGN KEY (master_manufacturer_id) REFERENCES master_manufacturers (master_manufacturer_id)
    	);
    
    	INSERT INTO
    		master_models (master_model_id, master_manufacturer_id, name)
    	VALUES
    		(1, 1, 'F-150'),
    		(2, 1, 'F-250'),
    		(3, 1, 'Falcon'),
    		(4, 2, 'Camaro'),
    		(5, 2, 'Corvette'),
    		(6, 3, 'M3'),
    		(7, 3, '135i');
    
    	/*
    	 * A Customer's model.  
    	 *   If master_model_id IS NULL, then it is a custom model and model_custom contains the data
    	 *   If master_model_id IS NOT NULL and model_custom does not exist, then the master is used without modification
    	 *   If master_model_id IS NOT NULL and model_custom exists, then the master is overridden
    	 */
    	CREATE TABLE models (
    		model_id INTEGER NOT NULL,
    		master_model_id INTEGER,
    		manufacturer_id INTEGER NOT NULL,
    		attribute_1 VARCHAR(50),
    		/* ... */
    		attribute_n VARCHAR(50),
    		PRIMARY KEY (model_id),
    		FOREIGN KEY (master_model_id) REFERENCES master_models (master_model_id)
    	);
    
    	INSERT INTO
    		models (model_id, master_model_id, manufacturer_id)
    	VALUES
    		(1, 1, 1), -- F-150 for customer_1's Ford
    		(2, 2, 1), -- F-250 for customer_1's Ford
    		(3, 4, 2), -- Camaro for customer_1's Chevy
    		(4, 4, 5), -- Camaro for customer_2's Chevrolet
    		(5, 5, 5), -- Corvette for customer_2's Chevrolet
    		(6, 6, 6), -- M3 for customer_2's BMW
    		(7, NULL, 1), -- F-350 (custom) for customer_1's Ford
    		(8, NULL, 6), -- M7 (custom) for customer_2's BMW
    		(9, NULL, 7); -- Custom Model (custom) for customer_2's Custom Mfg
    
    
    	CREATE TABLE model_custom (
    		model_id INTEGER NOT NULL,
    		name VARCHAR(100) NOT NULL,
    		attribute_1 VARCHAR(50),
    		/* ... */
    		attribute_n VARCHAR(50),
    		PRIMARY KEY (model_id),
    		FOREIGN KEY (model_id) REFERENCES models (model_id)
    	);
    
    	INSERT INTO
    		model_custom (model_id, name)
    	VALUES
    		(7, 'F-350'), -- F-350 for customer_1's Ford
    		(8, 'M7'), -- M7 for customer_2's BMW
    		(9, 'Custom Model'); -- Custom Model for customer_2's Custom Mfg
    
    	/*
    	 * View for a customer's manufacturers
    	 */
    	CREATE VIEW vw_manufacturers AS
    		SELECT
    			m.customer_id,
    			m.manufacturer_id, 
    			COALESCE(cm.name, mm.name) AS name,
    			COALESCE(cm.attribute_1, mm.attribute_1) AS attribute_1,
    			/* ... */
    			COALESCE(cm.attribute_n, mm.attribute_n) AS attribute_n
    		FROM
    			manufacturers m
    		LEFT JOIN
    			master_manufacturers mm
    		USING
    			(master_manufacturer_id)
    		LEFT JOIN
    			manufacturer_custom cm
    		USING
    			(manufacturer_id);
    
    	/*
    	 * Manufacturers for customer_id 1
    	 */
    
    	SELECT manufacturer_id, name FROM vw_manufacturers WHERE customer_id = 1;
    
    	/*
    	 manufacturer_id |      name       
    	-----------------+-----------------
    		       1 | Ford
    		       2 | Chevy
    		       3 | Cust 1 Custom 1
    		       4 | Cust 1 Custom 2
    	*/
    
    	/*
    	 * Manufacturers for customer_id 2
    	 */
    	SELECT manufacturer_id, name FROM vw_manufacturers WHERE customer_id = 2;
    
    	/*
    	 manufacturer_id |      name       
    	-----------------+-----------------
    		       5 | Chevrolet
    		       6 | BMW
    		       7 | Cust 2 Custom 1
    	*/
    
    	/*
    	 * View for a customer's models
    	 */
    	CREATE VIEW vw_models AS
    		SELECT
    			mfg.customer_id,
    			mfg.manufacturer_id,
    			mfg.name AS manufacturers_name,
    			m.model_id,
    			COALESCE(cm.name, mm.name) AS name,
    			COALESCE(cm.attribute_1, mm.attribute_1) AS attribute_1,
    			/* ... */
    			COALESCE(cm.attribute_n, mm.attribute_n) AS attribute_n
    		FROM
    			vw_manufacturers mfg,
    			models m
    		LEFT JOIN
    			master_models mm
    		USING
    			(master_model_id)
    		LEFT JOIN
    			model_custom cm
    		USING
    			(model_id)
    		WHERE
    			mfg.manufacturer_id = m.manufacturer_id;
    
    	/*
    	 * Models for customer_id 1
    	 */
    	SELECT * FROM vw_models WHERE customer_id = 1;
    	/*
    	 customer_id | manufacturer_id | manufacturers_name | model_id |  name  
    	-------------+-----------------+--------------------+----------+--------
    		   1 |               1 | Ford               |        1 | F-150
    		   1 |               1 | Ford               |        2 | F-250
    		   1 |               2 | Chevy              |        3 | Camaro
    		   1 |               1 | Ford               |        7 | F-350
    	*/
    
    	/*
    	 * Models for customer_id 2
    	 */
    	SELECT * FROM vw_models WHERE customer_id = 2;
    	/*
    	 customer_id | manufacturer_id | manufacturers_name | model_id |     name     
    	-------------+-----------------+--------------------+----------+--------------
    		   2 |               5 | Chevrolet          |        4 | Camaro
    		   2 |               5 | Chevrolet          |        5 | Corvette
    		   2 |               6 | BMW                |        6 | M3
    		   2 |               6 | BMW                |        8 | M7
    		   2 |               7 | Cust 2 Custom 1    |        9 | Custom Model
    	*/

  2. #2
    Join Date
    Jun 2007
    Location
    London
    Posts
    2,527
    Quote Originally Posted by cope360
    I'm hoping for some input on this problem and my solution. I've not found a solution that doesn't have some drawback, but I'm sure someone has faced a modeling problem like this before.
    I'd just have the following tables and no views:
    • MasterManufacturers: name, other data ...
    • MasterModels: name, manufacturer_name, other data ...
    • CustomerManufacturers: name, alteredYN, showYN, other data ...
    • CustomerModels: name, manufacturer_name, alteredYN, other data ...

    Master data would be loaded into the master tables each time. From there it would be transfered to the customer tables. If you're creating a new customer record then set the showYN field to Y and the alteredYN field to N. You'd transfer in the master data each day into the master tables. You'd then update the customer tables from the master tables if the data is different (or new) and as long as the alteredYN flag is N in the customer table. The moment a user alters a manufacturer or a model then change that records altered flag to Y.

    If a customer doesn't want to see the data for a manufacturer then just set show flag to N. If a customer later wants to go back to the original master data then just set the altered back to N for all the records you want to revert.

    The advantages of this method might be much simpler code - you're just pulling all the data from one table with no fancy logic. You don't need to create fancy views for each new manufacturer. It will obviously be faster. The only downside is a slightly more complex upload of master data as you selectively decide which customer data to overwrite but it's not difficult. I'm not sure about using id's in this set up as the id for a given model (or manufacturer) will be different from customer database to customer database depending on what models they've added themselves - I think you'd be safer keeping with names.

    Mike

  3. #3
    Join Date
    Jul 2009
    Posts
    2
    Thanks for the reply, Mike. A few comments:

    Quote Originally Posted by mike_bike_kite

    The advantages of this method might be much simpler code - you're just pulling all the data from one table with no fancy logic. You don't need to create fancy views for each new manufacturer.
    I agree your suggestion is simpler since the application code will always access just one table. In my example, there would be no need to create a view per manufacturer -- just the one view that reconciles if the data comes from the master or the custom.

    Quote Originally Posted by mike_bike_kite

    I'm not sure about using id's in this set up as the id for a given model (or manufacturer) will be different from customer database to customer database depending on what models they've added themselves - I think you'd be safer keeping with names.
    In my case this is a multi-tenant SaaS application so I have control over all the ids. I'd use a sequence to assign them so that they are unique across all customers.

  4. #4
    Join Date
    Jun 2007
    Location
    London
    Posts
    2,527
    Quote Originally Posted by cope360
    I agree your suggestion is simpler since the application code will always access just one table. In my example, there would be no need to create a view per manufacturer -- just the one view that reconciles if the data comes from the master or the custom.
    I just read "CREATE VIEW vw_manufacturers" and assumed you were creating views ad hoc. In any case surely it is better to just grab the data from the customer table - if it's in there then that's what the customer wants to see assuming the manufacturer show flag is Y.


    Quote Originally Posted by cope360
    In my case this is a multi-tenant SaaS application so I have control over all the ids. I'd use a sequence to assign them so that they are unique across all customers.
    I have no idea what a multi-tenant SaaS application is but let's say there are 20 models and the ids go in sequence from 1 to 20. Now customer A creates a new model just for himself and gets the id 21. Then a new master model now comes out and customer A has the id 22 for this model while other customers get 21. Obviously there are many ways round this but all are more complicated than not using an id at all.

    In any case I'm sure that all manner of models can be made to work - some simply require more effort than others. You obviously have the advantage in that you've been thinking about the problem for a lot longer than I have (only 10 mins so far) and you also know your data (I don't know it all). I still suspect my method will be easier to implement and cause less headaches over time. It might be worth just thinking up some typical scenarios (and some nasty ones to) then seeing what happens when you use these methods on paper. If one looks more difficult on paper then I'll guarantee it will be a pain in real life.

    Mike

Posting Permissions

  • You may not post new threads
  • You may not post replies
  • You may not post attachments
  • You may not edit your posts
  •