Database Design Best Practices for Modern Applications

Learn essential database design principles, normalization techniques, indexing strategies, and how to build scalable data architectures for modern applications.

Database

PostgreSQL

Design

Backend

Kanyingidickson

Fullstack dev

Published on October 3, 2024•Last updated on February 5, 20264 min read

A well-designed database is the foundation of any successful application. This guide covers essential principles and best practices for designing scalable, maintainable databases.

Fundamental Principles

1. Normalization vs Denormalization

Normalization eliminates data redundancy and ensures data integrity:

1NF: Eliminate repeating groups
2NF: Remove partial dependencies
3NF: Remove transitive dependencies

Denormalization can improve read performance but increases complexity:


sql
-- Normalized (3NF)
CREATE TABLE users (id, name, email);
CREATE TABLE orders (id, user_id, product_id, quantity);
CREATE TABLE products (id, name, price);

-- Denormalized (for performance)
CREATE TABLE orders_denormalized (
  id, user_name, user_email,
  product_name, product_price, quantity
);

2. Primary Keys and Constraints

Always use appropriate constraints:


sql
CREATE TABLE users (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  email VARCHAR(255) UNIQUE NOT NULL,
  created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),

  CONSTRAINT email_format CHECK (email ~* '^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$')
);

Indexing Strategies

Types of Indexes

B-Tree Indexes (Default)
- Best for equality and range queries
- Used for PRIMARY KEY and UNIQUE constraints
Hash Indexes
- Best for simple equality comparisons
- Not useful for range queries
GIN Indexes
- For array and full-text search
- Used with @> operator for arrays
BRIN Indexes
- For large tables with naturally ordered data
- Much smaller than B-Tree indexes

Index Best Practices


sql
-- Composite indexes for multi-column queries
CREATE INDEX idx_orders_user_date ON orders (user_id, created_at);

-- Partial indexes for filtered queries
CREATE INDEX idx_active_users ON users (email) WHERE active = true;

-- Expression indexes for computed values
CREATE INDEX idx_users_lower_email ON users (lower(email));

Performance Optimization

Query Optimization

EXPLAIN ANALYZE your queries:


sql
EXPLAIN ANALYZE
SELECT u.name, COUNT(o.id) as order_count
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
WHERE u.created_at > '2024-01-01'
GROUP BY u.id, u.name;

**Avoid SELECT *** in production:


sql
-- Bad
SELECT * FROM users WHERE active = true;

-- Good
SELECT id, name, email FROM users WHERE active = true;

Connection Pooling

Always use connection pooling in production:


javascript
const { Pool } = require('pg');

const pool = new Pool({
  host: process.env.DB_HOST,
  port: process.env.DB_PORT,
  database: process.env.DB_NAME,
  user: process.env.DB_USER,
  password: process.env.DB_PASSWORD,
  max: 20, // Maximum number of clients
  idleTimeoutMillis: 30000,
  connectionTimeoutMillis: 2000,
});

Data Types and Constraints

Choose Appropriate Data Types


sql
-- Use appropriate numeric types
CREATE TABLE products (
  id SERIAL PRIMARY KEY,
  name VARCHAR(255) NOT NULL,
  price DECIMAL(10,2) NOT NULL, -- Avoid FLOAT for money
  stock_quantity INTEGER DEFAULT 0,
  is_active BOOLEAN DEFAULT true,
  created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
);

JSON/JSONB for Flexible Data


sql
-- Use JSONB for dynamic attributes
CREATE TABLE products (
  id SERIAL PRIMARY KEY,
  name VARCHAR(255) NOT NULL,
  attributes JSONB, -- Store flexible product attributes
  metadata JSONB    -- Store additional metadata
);

-- Query JSONB data efficiently
CREATE INDEX idx_products_attributes ON products USING GIN (attributes);
SELECT * FROM products WHERE attributes @> '{"color": "red"}';

Backup and Recovery

Automated Backups


bash
# Daily backup script
#!/bin/bash
DATE=$(date +%Y%m%d_%H%M%S)
pg_dump -h localhost -U username -d mydb > backup_$DATE.sql

# Keep only last 7 days
find /backups -name "backup_*.sql" -mtime +7 -delete

Point-in-Time Recovery

Enable WAL archiving for PITR:


postgresql
# postgresql.conf
wal_level = replica
archive_mode = on
archive_command = 'cp %p /var/lib/postgresql/archive/%f'

Monitoring and Maintenance

Key Metrics to Monitor

Connection count and usage
Query performance (slow queries)
Table sizes and growth
Index usage and bloat
Replication lag (if applicable)

Regular Maintenance


sql
-- Analyze tables for query planner
ANALYZE;

-- Vacuum to reclaim space and update statistics
VACUUM;

-- Reindex if needed
REINDEX INDEX CONCURRENTLY index_name;

Conclusion

Good database design requires balancing performance, maintainability, and scalability. Start with solid fundamentals (normalization, proper indexing), implement monitoring early, and be prepared to evolve your schema as your application grows.

Remember: your database is often the hardest part of your system to change once in production, so invest time in getting it right from the start.

Share your reaction:

Database Design Best Practices for Modern Applications

Learn essential database design principles, normalization techniques, indexing strategies, and how to build scalable data architectures for modern applications.

Database

PostgreSQL

Design

Backend

Kanyingidickson

Fullstack dev

Published on October 3, 2024•Last updated on February 5, 20264 min read

A well-designed database is the foundation of any successful application. This guide covers essential principles and best practices for designing scalable, maintainable databases.

Fundamental Principles

1. Normalization vs Denormalization

Normalization eliminates data redundancy and ensures data integrity:

1NF: Eliminate repeating groups
2NF: Remove partial dependencies
3NF: Remove transitive dependencies

Denormalization can improve read performance but increases complexity:


sql
-- Normalized (3NF)
CREATE TABLE users (id, name, email);
CREATE TABLE orders (id, user_id, product_id, quantity);
CREATE TABLE products (id, name, price);

-- Denormalized (for performance)
CREATE TABLE orders_denormalized (
  id, user_name, user_email,
  product_name, product_price, quantity
);

2. Primary Keys and Constraints

Always use appropriate constraints:


sql
CREATE TABLE users (
  id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  email VARCHAR(255) UNIQUE NOT NULL,
  created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW(),

  CONSTRAINT email_format CHECK (email ~* '^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$')
);

Indexing Strategies

Types of Indexes

B-Tree Indexes (Default)
- Best for equality and range queries
- Used for PRIMARY KEY and UNIQUE constraints
Hash Indexes
- Best for simple equality comparisons
- Not useful for range queries
GIN Indexes
- For array and full-text search
- Used with @> operator for arrays
BRIN Indexes
- For large tables with naturally ordered data
- Much smaller than B-Tree indexes

Index Best Practices


sql
-- Composite indexes for multi-column queries
CREATE INDEX idx_orders_user_date ON orders (user_id, created_at);

-- Partial indexes for filtered queries
CREATE INDEX idx_active_users ON users (email) WHERE active = true;

-- Expression indexes for computed values
CREATE INDEX idx_users_lower_email ON users (lower(email));

Performance Optimization

Query Optimization

EXPLAIN ANALYZE your queries:


sql
EXPLAIN ANALYZE
SELECT u.name, COUNT(o.id) as order_count
FROM users u
LEFT JOIN orders o ON u.id = o.user_id
WHERE u.created_at > '2024-01-01'
GROUP BY u.id, u.name;

**Avoid SELECT *** in production:


sql
-- Bad
SELECT * FROM users WHERE active = true;

-- Good
SELECT id, name, email FROM users WHERE active = true;

Connection Pooling

Always use connection pooling in production:


javascript
const { Pool } = require('pg');

const pool = new Pool({
  host: process.env.DB_HOST,
  port: process.env.DB_PORT,
  database: process.env.DB_NAME,
  user: process.env.DB_USER,
  password: process.env.DB_PASSWORD,
  max: 20, // Maximum number of clients
  idleTimeoutMillis: 30000,
  connectionTimeoutMillis: 2000,
});

Data Types and Constraints

Choose Appropriate Data Types


sql
-- Use appropriate numeric types
CREATE TABLE products (
  id SERIAL PRIMARY KEY,
  name VARCHAR(255) NOT NULL,
  price DECIMAL(10,2) NOT NULL, -- Avoid FLOAT for money
  stock_quantity INTEGER DEFAULT 0,
  is_active BOOLEAN DEFAULT true,
  created_at TIMESTAMP WITH TIME ZONE DEFAULT NOW()
);

JSON/JSONB for Flexible Data


sql
-- Use JSONB for dynamic attributes
CREATE TABLE products (
  id SERIAL PRIMARY KEY,
  name VARCHAR(255) NOT NULL,
  attributes JSONB, -- Store flexible product attributes
  metadata JSONB    -- Store additional metadata
);

-- Query JSONB data efficiently
CREATE INDEX idx_products_attributes ON products USING GIN (attributes);
SELECT * FROM products WHERE attributes @> '{"color": "red"}';

Backup and Recovery

Automated Backups


bash
# Daily backup script
#!/bin/bash
DATE=$(date +%Y%m%d_%H%M%S)
pg_dump -h localhost -U username -d mydb > backup_$DATE.sql

# Keep only last 7 days
find /backups -name "backup_*.sql" -mtime +7 -delete

Point-in-Time Recovery

Enable WAL archiving for PITR:


postgresql
# postgresql.conf
wal_level = replica
archive_mode = on
archive_command = 'cp %p /var/lib/postgresql/archive/%f'

Monitoring and Maintenance

Key Metrics to Monitor

Connection count and usage
Query performance (slow queries)
Table sizes and growth
Index usage and bloat
Replication lag (if applicable)

Regular Maintenance


sql
-- Analyze tables for query planner
ANALYZE;

-- Vacuum to reclaim space and update statistics
VACUUM;

-- Reindex if needed
REINDEX INDEX CONCURRENTLY index_name;

Conclusion

Remember: your database is often the hardest part of your system to change once in production, so invest time in getting it right from the start.

Share your reaction:

Fundamental Principles

1. Normalization vs Denormalization

2. Primary Keys and Constraints

Indexing Strategies

Types of Indexes

Index Best Practices

Performance Optimization

Query Optimization

Connection Pooling

Data Types and Constraints

Choose Appropriate Data Types

JSON/JSONB for Flexible Data

Backup and Recovery

Automated Backups

Point-in-Time Recovery

Monitoring and Maintenance

Key Metrics to Monitor

Regular Maintenance

Conclusion

Comments

Leave a Comment

Related Articles

Building Scalable APIs with Node.js and Express

Designing Scalable Databases

Getting Started with Supabase

Fundamental Principles

1. Normalization vs Denormalization

2. Primary Keys and Constraints

Indexing Strategies

Types of Indexes

Index Best Practices

Performance Optimization

Query Optimization

Connection Pooling

Data Types and Constraints

Choose Appropriate Data Types

JSON/JSONB for Flexible Data

Backup and Recovery

Automated Backups

Point-in-Time Recovery

Monitoring and Maintenance

Key Metrics to Monitor

Regular Maintenance

Conclusion

Comments

Leave a Comment

Related Articles

Building Scalable APIs with Node.js and Express

Designing Scalable Databases

Getting Started with Supabase