Common Schema Design Patterns for SaaS — and How TiDB Helps
When you’re building a SaaS app, the first big question about your database usually isn’t SQL or NoSQL, it’s — “Where should I put all my tenants’ data?” And that simple question opens a deep rabbit hole of schema design choices, trade-offs, and a few “oh no” moments you’ll only discover at scale. I’ve been there. Here’s how the journey usually goes — and how TiDB helps you survive it.
Notice, what I will talk below is just in one database instances. I will share how to build a SaaS application with multi database instances later. Everything in One Table — “Let’s Just Start Simple” At the beginning, life is easy.
You make one table, add a tenant_id column, and boom — you have a multi-tenant system, like this:
CREATE TABLE orders ( tenant_id BIGINT, order_id BIGINT, customer_name VARCHAR(255), ... );
It feels clean. Straightforward. Every query is simple:
SELECT * FROM orders WHERE tenant_id = 123;
Pros:
- Super easy to build.
- No complicated routing or management.
- Great for early-stage SaaS with a handful of customers.
But as time goes on… Problems start showing up:
- Some tenants want extra columns that others don’t. You start adding columns, and suddenly you have a 200-column monster.
- TiDB allows up to 4096 columns, but trust me — if you ever get close to that number, you’re in trouble.
- Then come indexes — you add one for each new field, and soon it’s a nightmare. TiDB supports up to 512 indexes per table, I have to say that maybe we are the only distributed OLTP database that can handle this case, but I have still want to say that don’t test that limit mostly.
- And the worst part? Every ALTER TABLE affects all tenants. You just want to add a column for one customer, but everyone pays the price.
You can probably imagine the other pain points too — noisy neighbors, hot tenants overloading others, or the nightmare of managing data isolation at scale. What started as “simple” quickly turns into “fragile.”
Possible fixes:
- Build a related index table — one small, tenant-specific side table to handle special queries. Some of our SaaS customers use this approach effectively.
- Wait for partial index support (coming soon in TiDB). It’ll make this model much cleaner by letting you define tenant-specific indexes directly in the main table.
This approach works fine for small-to-medium scale. But as you grow, you’ll soon start craving more flexibility. JSON Columns — “Flexibility, Please!”
Eventually, you get tired of schema changes and say: “What if I just throw all tenant-specific fields into JSON?” And it works — at least at first.
CREATE TABLE tenant_data ( tenant_id BIGINT, data JSON );
You can now store everything in that JSON blob. Add a field? Just put it in there. No DDL. No schema migration. Instant happiness. Pros:
- No more DDL pain.
- Great flexibility — each tenant can evolve at their own pace.
But then you notice… Performance starts slipping. That innocent data->’$.user.email’ query suddenly scans the entire table. And because JSON fields aren’t automatically indexed, things get slower as data grows.
That’s when TiDB’s multi-valued index becomes your new best friend. It lets you index JSON keys and even arrays, so your queries stay fast even when the schema gets flexible. Perfect for storing tenant-specific settings, tags, configurations, or custom attributes. If you’re coming from MongoDB or another document database, this model will feel familiar.
But once your business scales, you’ll hit MongoDB’s limits — scalability bottlenecks, sharding pain, or inconsistent performance.
That’s why some high-traffic systems (like the one discussed in this FOSDEM talk Migrating 3B rows to TiDB for a high-traffic application are now migrating billions of JSON rows to TiDB — because it gives them the flexibility and scalability they need.
JSON isn’t a silver bullet. But if you want the document model plus strong SQL and scale, TiDB is the safer road. Each Tenant Has Its Own Database — “Millions of Databases”
Then one day, all customers asks: “Can we have our own database? We don’t want our data mixed with others.” So you do it.
Each tenant gets their own database, in each database, the tenant has the same table definitions or not. Pros:
- Great isolation — backup, restore, and deletion are all easier per tenant.
- Straightforward management — schema consistency is easier to enforce.
- Works well for compliance or custom SLAs.
But then… You wake up one day with millions of databases. Now you’ve got:
- DDL overhead — schema updates must propagate everywhere.
- Metadata explosion — system tables become massive.
TiDB was built for this kind of scale. It can literally manage millions of tables and databases with distributed metadata. And TiDB’s online DDL means you can roll out schema changes safely across thousands of schemas — without downtime, without fear.
You can also see how TiDB help Atlassian to solve this challenge How Atlassian Scales 3M+ Tenants: Multi‑Tenant SaaS on TiDB
The Real World: You’ll End Up with a Hybrid
In reality, most SaaS systems mix these patterns: | **Tenant Type** | **Pattern** | **Why** | |------------------|--------------|--------------------------| | Small tenants | Shared table | Cheap, simple, efficient | | Large tenants | Separate DB | Performance & compliance | | Custom tenants | JSON fields | Schema flexibility |
TiDB fits this hybrid world naturally. It scales horizontally — so you can start small, then grow into whatever model makes sense without ever rewriting your schema. What I Love About TiDB for SaaS
If I had to summarize why TiDB feels “right” for multi-tenant SaaS:
- It scales horizontally. You can have millions of tenants without breaking a sweat.
- It’s Serverless-ready. In TiDB Cloud, you only pay for what your tenants actually use — perfect for the long tail of inactive users.
- Online DDL. No more midnight schema updates or downtime anxiety.
- JSON + Multi-Valued Index. Schema flexibility meets performance.
- HTAP built-in. Analyze tenant data in real time — no ETL, no data warehouse needed.
Final Thoughts Schema design for SaaS is like parenting — everyone starts with simple rules, and then reality hits.
There’s no perfect model, just trade-offs that fit your stage of growth. But if your database can scale and stay flexible, you’re already ahead of the game. And that’s where TiDB really shines — it lets you grow from “one table for everyone” to “millions of tenants worldwide” without burning down your schema every six months.
And as always… If you’re curious what TiDB can do today, you can try it yourself in minutes with a free
Read the full article here: https://medium.com/@siddontang/common-schema-design-patterns-for-saas-and-how-tidb-helps-c9a5ec49aa53