After reading articles on multi-tenancy, specifically shared db/schema designs, it is recommended that a tenant ID be included in every table. Is this really true? For example, if I wanted to offer forums in my (hypothetical) SaaS app to my tenants, I might have something that looks like the following tables and fields:
Categories (Fields: Category ID, Tenant ID, Name)
Forums (Fields: Forum ID, Category ID, Tenant ID, Name)
Posts (Fields: Post ID, Forum ID, User ID, Tenant ID, Title, Text)
Here is my question: why would I need the Tenant ID in the Posts table? Couldnt I just do a join on the Tenant, Users, and TenantXrefUsers tables? These latter tables might look like:
Tenant (Fields: Tenant ID, Name)
Users (Fields: User ID, First Name , Last Name)
TenantXrefUsers (Fields: Tenant ID, User ID)
Is it because performing joins is costly?
Thanks in advance for your helping me with my curiosity.
It depends on your business rules, what you're trying to accomplishing, and how much of this you want to enforce in your database.
For example, if you wish to enforce that users can only post in forums to which their company belongs, it's pretty easy to do with a referential integrity constraint:
create table post (
tenant_id int not null,
forum_id int not null,
user_id int not null,
foreign key (tenant_id, forum_id) references forum (tenant_id, forum_id),
foreign key (tenant_id, user_id) references tenant_user (tenant_id, user_id),
Without carrying the tenant id, you must enforce this in your application logic.
also, with having tenant id in every table, you could access the tables via views. the views would be defined as select * from table where tenant id = @tenant. You would set the @tenant, based on the customer that signed in.
You can access data through views regardless of whether you include TenantID in every table. That's what views are for, as a matter of fact. And the SQL example you gave would work as well on a table as on a view.
Also, "SELECT *..." is not the best-practices method for creating views. Enumerate your fields.
If it's not practically useful, then it's practically useless.
All...thanks for your replies. From reading your comments, I gather that the ultimate answer is: yes, a tenant_id should probably be used in every table - but it could be left out, depending on business rules and what I'm trying to accomplish.
As per your specification for Tenant, Users, and TenantXrefUsers tables; there seems to be an n-n relationship between the tenants and users. A user can be part of many tenants.
Now consider you didn't have a Tenant ID in your POSTS table,
Lets say a tenant decides that a particular user has been violating code of conducts and decided to delete all its post.
Your delete query by joining Posts, Tenant, Users, and TenantXrefUsers shall delete all posts of that users across all tenants.
Its a good practice to have Tentant ID for all tables, unless some specific business logic.