5 Database Design Mistakes That Will Haunt You Later
These five database design decisions seem harmless at first. They become expensive problems at scale. Here's what to do instead from day one.
The Decisions That Don’t Seem Like Decisions
Database design is where shortcuts come to die. The schema you sketch on a whiteboard in week one becomes the foundation that every feature, every query, and every migration has to work with for years. Some design choices feel inconsequential when your database has 500 rows. They become existential when it has 5 million.
We’ve inherited, audited, and rebuilt enough databases to see the same patterns over and over. Here are the five mistakes that cause the most pain — and the fixes that would have taken minutes to implement at the start.
Mistake 1: Not Using UUIDs from the Start
Why It Seems Fine
Auto-incrementing integers are the default in almost every database tutorial. They’re simple, sequential, and human-readable. User #1042 is easier to reference in conversation than User 7f3a9b2c-4d1e-4f8a-b6c7-8d9e0f1a2b3c. When you’re building a single application with a single database, integer IDs work perfectly.
When It Bites You
The problems start when your system grows beyond a single database. If you ever need to merge databases, replicate data across regions, sync with a mobile client, or expose IDs in a public API, integer IDs create collisions and security risks.
Sequential integers also leak information. If a competitor creates an account on your platform and gets user ID 14,200, they know you have roughly 14,200 users. If they check back in a month and get ID 14,350, they know your growth rate. In APIs, sequential IDs enable enumeration attacks — an attacker can iterate through /api/users/1, /api/users/2, /api/users/3 to scrape your entire user base.
The Fix
Use UUIDs (v4 or v7) as primary keys from day one. In PostgreSQL, use the gen_random_uuid() function or the uuid-ossp extension. In SQL Server, use NEWID() for random UUIDs or NEWSEQUENTIALID() for index-friendly sequential ones.
If you’re concerned about index performance with random UUIDs (a valid concern for B-tree indexes), use UUIDv7, which embeds a timestamp component that keeps inserts roughly sequential. PostgreSQL 17+ supports UUIDv7 natively, and SQL Server’s NEWSEQUENTIALID() achieves a similar result.
The migration from integer to UUID primary keys on a table with millions of rows and foreign key relationships is one of the most painful database migrations that exists. Doing it from the start costs nothing.
Mistake 2: Skipping Soft Deletes
Why It Seems Fine
Hard deletes are clean. DELETE FROM users WHERE id = ? removes the row. The database is smaller. There’s no ambiguity about what’s active and what isn’t. Your queries don’t need WHERE deleted_at IS NULL on every statement.
When It Bites You
A customer calls: “I accidentally deleted my account, can you restore it?” With hard deletes, the answer is “only if we have a backup, and restoring it will take hours.” With soft deletes, the answer is “done.”
Beyond accidental deletion, soft deletes are often a compliance requirement. GDPR requires that you can demonstrate what data you held and when you removed it. Financial regulations may require audit trails that include deleted records. Healthcare applications under HIPAA can’t simply destroy records because someone clicked a button.
Even without regulatory pressure, hard deletes break referential integrity. If a user is deleted but their orders, comments, or transactions still reference their ID, you have orphaned records pointing to a ghost. Foreign key constraints will either block the delete entirely or cascade it in ways you didn’t intend.
The Fix
Add a deleted_at timestamp column (nullable) to every table that represents a business entity. NULL means active; a timestamp means deleted. Create a database view for each table that filters out deleted records, and use that view for all application queries. This gives you clean queries without remembering the WHERE clause every time.
In PostgreSQL, a partial index on WHERE deleted_at IS NULL ensures that the soft delete column doesn’t degrade query performance for active records. In SQL Server, filtered indexes achieve the same result.
Mistake 3: Over-Normalizing or Under-Normalizing
Why It Seems Fine
Database normalization — the process of organizing data to reduce redundancy — is taught as a virtue. And it is, to a point. Third normal form (3NF) is a solid default for most applications. The trouble is that developers treat normalization as a binary: either you normalize everything into a dozen joined tables or you stuff everything into one flat table.
When It Bites You
Over-normalization turns simple queries into multi-table joins that crawl as data grows. We’ve seen schemas where retrieving a user’s profile required joining seven tables — user, address, phone, email, preferences, roles, and permissions — for data that is always accessed together. Every page load hit the database with a query that the optimizer struggled to plan efficiently.
Under-normalization creates update anomalies. If you store a customer’s city name directly on the order table (instead of referencing a normalized address), updating that customer’s city means updating every order they’ve ever placed. Miss one and your data is inconsistent.
The Fix
Normalize for correctness, denormalize for performance, and document why. Start in 3NF. When a query becomes a bottleneck, denormalize strategically: add a computed column, create a materialized view, or duplicate a frequently accessed field. The key is that every denormalization is a deliberate decision with a comment explaining the trade-off — not an accident born from a rushed schema.
In PostgreSQL, materialized views with REFRESH MATERIALIZED VIEW CONCURRENTLY give you denormalized read performance without sacrificing source-of-truth integrity. In SQL Server, indexed views serve a similar role.
Mistake 4: No Audit Trail
Why It Seems Fine
When you’re building fast, tracking who changed what and when feels like overhead. The application works. Users can create, update, and delete records. Why add complexity for something nobody has asked for?
When It Bites You
The first time someone asks “who changed this record and when?” — and you can’t answer — is the moment audit trails stop being optional. It might be a customer disputing a charge, a manager investigating a data discrepancy, or a compliance officer asking for evidence of access controls.
Without an audit trail, your only option is combing through application logs — if they exist, if they’re detailed enough, and if they haven’t been rotated. That’s not an audit trail. That’s a prayer.
The Fix
Implement a changes or audit_log table that records the table name, record ID, field name, old value, new value, user ID, and timestamp for every update and delete. In PostgreSQL, this can be automated with trigger functions that fire on INSERT, UPDATE, and DELETE and write to the audit table without any application code changes.
In SQL Server, temporal tables (system-versioned tables) provide built-in audit functionality. Enable them with ALTER TABLE ... ADD PERIOD FOR SYSTEM_TIME and SQL Server automatically tracks the full history of every row.
The storage cost of audit tables is trivial compared to the cost of not having them. Even at scale, audit data compresses well and can be partitioned by date for efficient archival.
Mistake 5: Ignoring Indexes Until Performance Tanks
Why It Seems Fine
Indexes aren’t visible. The application works the same whether a query uses an index or performs a full table scan — at least when the table has 1,000 rows. And index management adds complexity: you need to choose the right columns, understand composite index ordering, and monitor for unused indexes that slow down writes.
When It Bites You
A query that takes 2ms on 10,000 rows can take 15 seconds on 10 million rows without the right index. We’ve seen production databases where a single missing index turned a 50ms API response into a 12-second timeout. The fix took five minutes. Finding the problem took a weekend of panic.
The Fix
Index every foreign key column. This is non-negotiable. Without an index on a foreign key, every join involving that relationship is a sequential scan. PostgreSQL does not automatically create indexes on foreign key columns (unlike some other databases), so this is a step you must do manually.
Index columns that appear in WHERE, ORDER BY, and GROUP BY clauses. Use EXPLAIN ANALYZE in PostgreSQL or execution plans in SQL Server Management Studio to identify sequential scans on large tables.
Monitor for missing indexes proactively. PostgreSQL’s pg_stat_user_tables shows sequential scan counts by table. SQL Server’s sys.dm_db_missing_index_details DMV explicitly tells you which indexes would improve query performance. Set up alerts for queries that exceed a time threshold and review the execution plans monthly.
Build indexing into your development workflow. Every migration that adds a column used in queries should include the index. Don’t treat it as a performance optimization to do later. Do it now, when the cost is zero and the table is empty.
Conclusion
None of these mistakes are difficult to avoid. UUIDs, soft deletes, sensible normalization, audit trails, and indexes are all straightforward to implement at the start of a project. The cost of getting them right on day one is measured in minutes. The cost of fixing them later is measured in weeks and downtime.
If you’re building a new application and want the database designed right from the start — or if you’re dealing with the consequences of one that wasn’t — we’d like to help.