04 - DBMS
Why This Matters for Temple
Temple (ex-Zomato founding team) builds food delivery infrastructure where data integrity is non-negotiable. Orders, payments, restaurant menus, delivery tracking -- every piece touches a database. Expect questions on schema design, query optimization, and trade-offs between consistency and performance at scale.
Quick Reference
Scan this table in 5 minutes before your interview.
| Topic | Key Points | Interview Tip |
|---|---|---|
| SQL Joins | INNER (intersection), LEFT (all left + match), RIGHT (all right + match), FULL OUTER (union), CROSS (cartesian), SELF (table with itself) | Draw the Venn diagram; always clarify NULL handling |
| B-Tree Index | Balanced tree, O(log n) lookup, great for range queries and sorting | Default index type in PostgreSQL/MySQL — say this explicitly |
| Hash Index | O(1) exact lookup, useless for ranges | Only mention for equality checks like WHERE id = ? |
| Composite Index | Leftmost prefix rule — (a, b, c) supports queries on a, a,b, a,b,c but NOT b alone | Draw the index tree to explain why prefix matters |
| Normalization | 1NF (atomic) → 2NF (no partial deps) → 3NF (no transitive deps) | Know when to denormalize: read-heavy dashboards, caching |
| ACID | Atomicity (all-or-nothing), Consistency (valid state), Isolation (concurrent txns don't interfere), Durability (committed = permanent) | Use the "place order" flow as your example |
| Isolation Levels | Read Uncommitted → Read Committed → Repeatable Read → Serializable | Know the anomaly each level prevents |
| SQL vs NoSQL | SQL for transactions/relationships; NoSQL for scale/flexibility | Suggest polyglot persistence — use both |
| Optimistic Locking | Version column, no locks held, retry on conflict | Best for low-contention reads (menu browsing) |
| Pessimistic Locking | SELECT FOR UPDATE, holds row lock | Best for high-contention writes (seat/slot booking) |
| Deadlocks | Circular wait between transactions | Prevention: consistent lock ordering, timeouts |
1. SQL Joins
Schema for Examples
CREATE TABLE users (
user_id INT PRIMARY KEY,
name VARCHAR(100),
city VARCHAR(50)
);
CREATE TABLE restaurants (
restaurant_id INT PRIMARY KEY,
name VARCHAR(100),
city VARCHAR(50),
cuisine VARCHAR(50)
);
CREATE TABLE orders (
order_id INT PRIMARY KEY,
user_id INT REFERENCES users(user_id),
restaurant_id INT REFERENCES restaurants(restaurant_id),
amount DECIMAL(10, 2),
status VARCHAR(20),
created_at TIMESTAMP
);
CREATE TABLE delivery_agents (
agent_id INT PRIMARY KEY,
name VARCHAR(100),
order_id INT REFERENCES orders(order_id), -- currently assigned order (NULL if free)
rating DECIMAL(2, 1)
);

INNER JOIN — Only Matching Rows
Returns rows that have a match in both tables. No match = row excluded.
Table A Table B
┌───────┐ ┌───────┐
│ │ │ │
│ ┌───┼─────┼───┐ │
│ │ ██████████ │ │ ← only the overlap
│ └───┼─────┼───┘ │
│ │ │ │
└───────┘ └───────┘

-- Get all orders with the user's name and restaurant name
SELECT
o.order_id,
u.name AS customer,
r.name AS restaurant,
o.amount
FROM orders o
INNER JOIN users u ON o.user_id = u.user_id
INNER JOIN restaurants r ON o.restaurant_id = r.restaurant_id;

When to use: You only want rows where the relationship exists on both sides. Most common join.
LEFT JOIN — All From Left + Matching From Right
Returns all rows from the left table. If no match on the right, those columns are NULL.
Table A Table B
┌───────┐ ┌───────┐
│ │ │ │
│ ██┌███┼█████┼───┐ │
│ ██│ ██████████ │ │ ← all of A, matching B
│ ██└███┼█████┼───┘ │
│ │ │ │
└───────┘ └───────┘

-- Find all users and their orders (including users who never ordered)
SELECT
u.user_id,
u.name,
o.order_id,
o.amount
FROM users u
LEFT JOIN orders o ON u.user_id = o.user_id;
-- To find users who NEVER placed an order:
SELECT u.user_id, u.name
FROM users u
LEFT JOIN orders o ON u.user_id = o.user_id
WHERE o.order_id IS NULL;

When to use: Finding "missing" relationships — users with no orders, restaurants with no reviews.
RIGHT JOIN — All From Right + Matching From Left
Mirror of LEFT JOIN. Returns all rows from the right table.
Table A Table B
┌───────┐ ┌───────┐
│ │ │ │
│ ┌───┼█████┼███┐██ │
│ │ ██████████ │██ │ ← matching A, all of B
│ └───┼█████┼███┘██ │
│ │ │ │
└───────┘ └───────┘

-- Find all restaurants and any orders they received
SELECT
r.restaurant_id,
r.name AS restaurant,
o.order_id,
o.amount
FROM orders o
RIGHT JOIN restaurants r ON o.restaurant_id = r.restaurant_id;

When to use: Rarely in practice — you can always rewrite as a LEFT JOIN by swapping table order. Know it exists for interviews.
FULL OUTER JOIN — All Rows From Both
Returns all rows from both tables. NULLs fill in where there is no match on either side.
Table A Table B
┌───────┐ ┌───────┐
│ │ │ │
│ ██┌███┼█████┼███┐██ │
│ ██│ ██████████ │██ │ ← everything
│ ██└███┼█████┼███┘██ │
│ │ │ │
└───────┘ └───────┘

-- Match delivery agents to orders (show unassigned agents AND undelivered orders)
SELECT
da.agent_id,
da.name AS agent_name,
o.order_id,
o.status
FROM delivery_agents da
FULL OUTER JOIN orders o ON da.order_id = o.order_id;

When to use: Reconciliation reports — find orphaned records on either side.
CROSS JOIN — Cartesian Product
Every row in A paired with every row in B. No ON clause. Result: |A| * |B| rows.
A: [1, 2] B: [x, y, z]
Result: (1,x) (1,y) (1,z) (2,x) (2,y) (2,z) → 2 × 3 = 6 rows

-- Generate all possible restaurant-cuisine pairings for a recommendation matrix
SELECT
r.name AS restaurant,
c.cuisine
FROM restaurants r
CROSS JOIN (
SELECT DISTINCT cuisine FROM restaurants
) c;

When to use: Generating combinations, test data, or pairing every item with every category. Be careful with large tables — row count explodes.
SELF JOIN — Table Joined With Itself
The same table appears twice with different aliases.
-- Find delivery agents in the same city (peer comparison for routing)
-- Assume agents have a city column
SELECT
a1.name AS agent_1,
a2.name AS agent_2,
a1.rating AS rating_1,
a2.rating AS rating_2
FROM delivery_agents a1
INNER JOIN delivery_agents a2
ON a1.city = a2.city           -- peers must be in the same city
AND a1.agent_id < a2.agent_id; -- avoid duplicates and self-pairs
-- Find users who referred other users (users table has referred_by column)
SELECT
referrer.name AS referrer,
referred.name AS referred_user
FROM users referred
INNER JOIN users referrer ON referred.referred_by = referrer.user_id;

When to use: Hierarchical data (manager-employee), finding pairs within the same table, graph-like relationships stored in a single table.
2. Indexing
B-Tree Index
The default index type in PostgreSQL and MySQL (InnoDB). A balanced tree structure where:
- Internal nodes store keys and pointers to child nodes
- Leaf nodes store keys and pointers to actual rows (or the row data in clustered indexes)
- All leaf nodes are at the same depth — guarantees O(log n) lookup
[50]
/ \
[20, 35] [70, 85]
/ | \ / | \
[10] [25] [40] [60] [75] [90] ← leaf nodes (linked list for range scans)

How it supports operations:
- Exact match: WHERE user_id = 42 — traverse tree, O(log n)
- Range query: WHERE created_at BETWEEN '2025-01-01' AND '2025-01-31' — find start leaf, scan linked list
- Sorting: ORDER BY created_at — leaves already sorted, sequential scan
- Prefix matching: WHERE name LIKE 'Pizza%' — tree is ordered, find first match and scan
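All four operations reduce to the same two moves: descend to a position in sorted order, then scan forward. A toy TypeScript sketch of that idea (a sorted array stands in for the leaf level, `lowerBound` plays the role of the O(log n) tree descent; all names are illustrative, not a real index implementation):

```typescript
// Toy model of a B-Tree's leaf level: keys kept in sorted order.
const createdAt: string[] = [
  "2025-01-03", "2025-01-10", "2025-01-17", "2025-02-01", "2025-02-14",
];

// O(log n): find the first index whose key >= target (the "tree descent").
function lowerBound(keys: string[], target: string): number {
  let lo = 0, hi = keys.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (keys[mid] < target) lo = mid + 1; else hi = mid;
  }
  return lo;
}

// Exact match: descend, then check the key at that position.
function exactMatch(keys: string[], key: string): boolean {
  const i = lowerBound(keys, key);
  return i < keys.length && keys[i] === key;
}

// Range query: descend once, then scan sequentially (the "linked-list walk").
function rangeScan(keys: string[], from: string, to: string): string[] {
  const out: string[] = [];
  for (let i = lowerBound(keys, from); i < keys.length && keys[i] < to; i++) {
    out.push(keys[i]);
  }
  return out;
}
```

Prefix matching works the same way: seek to the first key >= 'Pizza', then scan while keys still start with 'Pizza'.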
-- Create a B-Tree index (default, so USING btree is optional)
CREATE INDEX idx_orders_created_at ON orders(created_at);
-- This query benefits from the index:
SELECT * FROM orders
WHERE created_at >= '2025-01-01' AND created_at < '2025-02-01'
ORDER BY created_at DESC;

Hash Index
A hash table: key → bucket → row pointer. O(1) average lookup for exact equality only.
Key: "user_42" → hash("user_42") = 7 → Bucket 7 → Row pointer

Limitations:
- No range queries (hash destroys ordering)
- No sorting
- No prefix matching
- Not crash-safe in some older MySQL versions
-- Create a Hash index (PostgreSQL)
CREATE INDEX idx_users_email_hash ON users USING hash(email);
-- This benefits:
SELECT * FROM users WHERE email = 'aayush@temple.io';
-- This does NOT benefit (range query):
SELECT * FROM users WHERE email > 'a' AND email < 'b'; -- full scan

When to use: High-volume exact lookups — session tokens, API key lookups, cache keys.
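The trade-off is easy to demonstrate with a plain Map as a stand-in for the hash index (an illustrative sketch, not a real index implementation):

```typescript
// Toy hash index: a Map gives O(1) average-case equality lookup, but its
// iteration order bears no relation to key order, so a "range query" has
// no position to seek to and degenerates into scanning every entry.
const emailIndex = new Map<string, number>(); // email -> row id
emailIndex.set("aayush@temple.io", 42);
emailIndex.set("zara@temple.io", 7);

// Equality: one hash computation, one bucket probe.
function findByEmail(email: string): number | undefined {
  return emailIndex.get(email);
}

// Range over a hash index: nothing to seek into — full scan.
function findEmailsBetween(from: string, to: string): number[] {
  const out: number[] = [];
  for (const [email, id] of emailIndex) {
    if (email > from && email < to) out.push(id); // touches every entry
  }
  return out;
}
```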
Composite Index
An index on multiple columns. The leftmost prefix rule determines which queries can use it.
CREATE INDEX idx_orders_user_date ON orders(user_id, created_at);

Think of it as a phone book sorted by (last_name, first_name):
Index sorted as:
(user_1, 2025-01-01)
(user_1, 2025-01-15)
(user_1, 2025-02-03)
(user_2, 2025-01-05)
(user_2, 2025-01-20)
...

Leftmost prefix rule — which queries use this index?
| Query | Uses Index? | Why |
|---|---|---|
| WHERE user_id = 1 | Yes | Leftmost column |
| WHERE user_id = 1 AND created_at > '2025-01-01' | Yes | Full prefix |
| WHERE created_at > '2025-01-01' | No | Skips leftmost column |
| WHERE user_id = 1 ORDER BY created_at | Yes | Prefix + sort on next column |
| WHERE created_at > '2025-01-01' AND user_id = 1 | Yes | Optimizer reorders conditions |
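The rule can be sketched by modeling the index as rows sorted by (user_id, created_at): fixing user_id gives one contiguous block to seek into, while filtering on created_at alone leaves matches scattered across every user's block. A toy sketch (names are illustrative):

```typescript
// Composite index modeled as entries sorted by (user_id, created_at).
type Entry = { userId: number; createdAt: string };
const idx: Entry[] = [
  { userId: 1, createdAt: "2025-01-01" },
  { userId: 1, createdAt: "2025-01-15" },
  { userId: 1, createdAt: "2025-02-03" },
  { userId: 2, createdAt: "2025-01-05" },
  { userId: 2, createdAt: "2025-01-20" },
];

// Leftmost column fixed: matching rows are contiguous, so the scan can
// stop as soon as it leaves that user's block (stand-in for seek + scan).
function seekUserAfter(userId: number, after: string): Entry[] {
  const out: Entry[] = [];
  for (const e of idx) {
    if (e.userId > userId) break; // past the contiguous block: stop early
    if (e.userId === userId && e.createdAt > after) out.push(e);
  }
  return out;
}

// created_at alone: matches are scattered — every row must be touched.
function scanByDateOnly(after: string): Entry[] {
  return idx.filter((e) => e.createdAt > after);
}
```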
Covering Index
An index that contains all columns the query needs. The database serves the query entirely from the index without touching the actual table ("index-only scan").
-- Covering index for a common dashboard query
CREATE INDEX idx_orders_covering
ON orders(user_id, created_at, amount, status);
-- This query is fully covered — no table lookup needed:
SELECT user_id, created_at, amount, status
FROM orders
WHERE user_id = 42 AND created_at > '2025-01-01';
-- EXPLAIN will show "Index Only Scan" (PostgreSQL) or "Using index" (MySQL)

Trade-off: Wider indexes use more disk and slow down writes. Only create covering indexes for hot queries.
When NOT to Index
| Scenario | Why Indexing Hurts |
|---|---|
| Low cardinality columns (status with 3 values) | Index scan not much better than full scan; optimizer may ignore it |
| Frequently updated columns | Every write must update the index too |
| Small tables (< 1000 rows) | Full scan is fast enough; index overhead not worth it |
| Columns used with functions | WHERE UPPER(name) = 'PIZZA' cannot use index on name (use expression index instead) |
| Write-heavy tables with many indexes | Each INSERT/UPDATE touches every index — write amplification |
Practical Example: Optimizing "Find Orders by User in Date Range"
-- The query we want to optimize:
SELECT order_id, amount, status, created_at
FROM orders
WHERE user_id = 42
AND created_at BETWEEN '2025-01-01' AND '2025-01-31'
ORDER BY created_at DESC;
-- Step 1: Check current execution plan
EXPLAIN ANALYZE
SELECT order_id, amount, status, created_at
FROM orders
WHERE user_id = 42
AND created_at BETWEEN '2025-01-01' AND '2025-01-31'
ORDER BY created_at DESC;
-- Result: Seq Scan on orders, execution time: ~15000 ms
-- Step 2: Add composite index (user_id first for equality, created_at for range + sort)
CREATE INDEX idx_orders_user_date ON orders(user_id, created_at);
-- Result: Index Scan, execution time: ~3 ms
-- Step 3: Make it a covering index to avoid table lookup
CREATE INDEX idx_orders_user_date_covering
ON orders(user_id, created_at) INCLUDE (order_id, amount, status);
-- Result: Index Only Scan, execution time: ~1 ms
-- Note: INCLUDE is PostgreSQL syntax; in MySQL, append the extra columns to the index key list instead

Key takeaway: Equality columns first, then range/sort columns, then INCLUDE non-filtered columns.
3. Normalization
1NF — Atomic Values, No Repeating Groups
Every cell holds a single value. No arrays, no comma-separated lists, no nested tables.
Violation:
| order_id | items |
|---|---|
| 1 | "Burger, Fries, Cola" |
| 2 | "Pizza, Garlic Bread" |
1NF Fix: Break items into separate rows or a separate table.
-- Separate order_items table
CREATE TABLE order_items (
item_id INT PRIMARY KEY,
order_id INT REFERENCES orders(order_id),
item_name VARCHAR(100),
quantity INT,
price DECIMAL(10, 2)
);

2NF — No Partial Dependencies
All non-key attributes must depend on the entire primary key, not just part of it. Only relevant when you have a composite primary key.
Violation: Composite key is (order_id, item_id)
| order_id | item_id | item_name | item_price | order_date | customer_name |
|---|---|---|---|---|---|
| 1 | 101 | Burger | 8.99 | 2025-01-15 | Aayush |
Problem: order_date and customer_name depend only on order_id, not on (order_id, item_id).
2NF Fix: Split into two tables.
-- Orders table: attributes that depend on order_id alone
CREATE TABLE orders (
order_id INT PRIMARY KEY,
order_date TIMESTAMP,
customer_name VARCHAR(100)
);
-- Order items: attributes that depend on the full composite key
CREATE TABLE order_items (
order_id INT REFERENCES orders(order_id),
item_id INT,
item_name VARCHAR(100),
item_price DECIMAL(10, 2),
quantity INT,
PRIMARY KEY (order_id, item_id)
);

3NF — No Transitive Dependencies
No non-key attribute should depend on another non-key attribute. Every non-key column depends directly on the primary key.
Violation:
| order_id | restaurant_id | restaurant_name | restaurant_city |
|---|---|---|---|
| 1 | 10 | Pizza Palace | Mumbai |
Problem: restaurant_name and restaurant_city depend on restaurant_id, not on order_id directly. That is a transitive dependency: order_id → restaurant_id → restaurant_name.
3NF Fix:
-- Orders only reference restaurant_id
CREATE TABLE orders (
order_id INT PRIMARY KEY,
user_id INT REFERENCES users(user_id),
restaurant_id INT REFERENCES restaurants(restaurant_id),
amount DECIMAL(10, 2),
created_at TIMESTAMP
);
-- Restaurant details live in their own table
CREATE TABLE restaurants (
restaurant_id INT PRIMARY KEY,
restaurant_name VARCHAR(100),
restaurant_city VARCHAR(50)
);

When to Denormalize
| Scenario | Approach | Example |
|---|---|---|
| Read-heavy dashboards | Materialized views or summary tables | Pre-aggregate daily order counts per restaurant |
| Avoiding expensive joins | Store redundant data | Keep restaurant_name in orders for display queries |
| Caching precomputed data | Denormalized cache table | Store user_total_spent in users table, update on each order |
| Event sourcing / analytics | Flat event tables | Write-once order events with all context embedded |
Rule of thumb: Normalize for writes, denormalize for reads. Start normalized, denormalize when you have a proven performance bottleneck.
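The user_total_spent row from the table above can be sketched as a denormalized read model kept in lockstep with the normalized writes. A toy in-memory version (a real system would update both inside one transaction; all names are illustrative):

```typescript
// Denormalized-counter pattern: keep a precomputed per-user total
// alongside the normalized orders, updating both together so the
// redundant copy never drifts from the source of truth.
type Order = { orderId: number; userId: number; amount: number };

const orders: Order[] = [];                       // normalized source of truth
const userTotalSpent = new Map<number, number>(); // denormalized read model

function placeOrder(order: Order): void {
  // In a real database, both writes would share one transaction.
  orders.push(order);
  userTotalSpent.set(
    order.userId,
    (userTotalSpent.get(order.userId) ?? 0) + order.amount
  );
}

// Reads hit the precomputed value instead of aggregating all orders.
function totalSpent(userId: number): number {
  return userTotalSpent.get(userId) ?? 0;
}
```

This is the write-side cost of denormalization: every order insert now carries an extra update, in exchange for O(1) dashboard reads.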
Full Normalization Example: Food Order
Unnormalized (0NF):
| order_id | customer | phone | items | restaurant | city |
|----------|----------|-----------|------------------------------|------------|--------|
| 1 | Aayush | 999000111 | Burger:8.99, Fries:3.99 | Pizza Hut | Mumbai |
| 2 | Aayush | 999000111 | Margherita:12.99 | Pizza Hut | Mumbai |

1NF — Atomic values:
| order_id | customer | phone | item | price | restaurant | city |
|----------|----------|-----------|------------|-------|------------|--------|
| 1 | Aayush | 999000111 | Burger | 8.99 | Pizza Hut | Mumbai |
| 1 | Aayush | 999000111 | Fries | 3.99 | Pizza Hut | Mumbai |
| 2 | Aayush | 999000111 | Margherita | 12.99 | Pizza Hut | Mumbai |

2NF — Remove partial dependencies (key: order_id, item):
orders: (order_id, customer, phone, restaurant, city)
order_items: (order_id, item, price)

3NF — Remove transitive dependencies:
users: (user_id, customer, phone)
restaurants: (restaurant_id, restaurant, city)
orders: (order_id, user_id, restaurant_id)
order_items: (order_id, item, price)

4. ACID Properties & Isolation Levels
ACID with Food Delivery Examples
Atomicity — All or Nothing
Placing an order involves: deduct wallet balance + create order + notify restaurant. If the restaurant notification fails, everything rolls back — you do not lose money without an order.
BEGIN;
UPDATE wallets SET balance = balance - 299.00 WHERE user_id = 42;
INSERT INTO orders (user_id, restaurant_id, amount, status)
VALUES (42, 10, 299.00, 'CONFIRMED');
INSERT INTO notifications (restaurant_id, message)
VALUES (10, 'New order #1234');
-- If any statement fails, ROLLBACK undoes everything
COMMIT;

Consistency — Valid State Before and After
Database constraints ensure we never end up in an invalid state. A negative wallet balance, an order referencing a non-existent restaurant, or a duplicate payment — constraints catch these.
ALTER TABLE wallets ADD CONSTRAINT positive_balance CHECK (balance >= 0);
-- Now: UPDATE wallets SET balance = balance - 10000 WHERE balance = 50;
-- Fails with CHECK constraint violation — atomicity rolls back the transaction

Isolation — Concurrent Transactions Don't Interfere
Two users ordering the last available "Chef's Special" at the same time should not both succeed.
Durability — Committed = Permanent
Once the user sees "Order Confirmed", that data survives a server crash. Achieved through write-ahead logging (WAL) — changes are written to a durable log before being applied.
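The log-first discipline can be sketched in a few lines: append to the durable log, then apply to the data pages, and recover after a crash by replaying the log. A toy in-memory version (arrays and Maps stand in for disk; names are illustrative):

```typescript
// Toy write-ahead log: every change is appended to the log BEFORE being
// applied to the table, so a crash that wipes the in-memory table can be
// repaired by replaying the log from the start.
type WalRecord = { key: string; value: string };

const wal: WalRecord[] = [];           // stand-in for the on-disk WAL
let table = new Map<string, string>(); // stand-in for data pages in memory

function commit(key: string, value: string): void {
  wal.push({ key, value }); // step 1: log first — this is the durability point
  table.set(key, value);    // step 2: apply to the table
}

// Crash recovery: rebuild the table state purely from the log.
function recover(): Map<string, string> {
  const rebuilt = new Map<string, string>();
  for (const rec of wal) rebuilt.set(rec.key, rec.value);
  return rebuilt;
}
```

Sequential appends to the log are much cheaper than random-access page writes, which is why databases can acknowledge a commit as soon as the log record is flushed.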
Isolation Levels
| Level | Dirty Read | Non-Repeatable Read | Phantom Read | Performance |
|---|---|---|---|---|
| Read Uncommitted | Possible | Possible | Possible | Fastest |
| Read Committed | Prevented | Possible | Possible | Fast |
| Repeatable Read | Prevented | Prevented | Possible | Moderate |
| Serializable | Prevented | Prevented | Prevented | Slowest |
The Three Anomalies Explained
Dirty Read — Reading uncommitted data
Transaction A Transaction B
───────────── ─────────────
BEGIN;
UPDATE orders SET status =
'CANCELLED' WHERE id = 1;
BEGIN;
SELECT status FROM orders
WHERE id = 1;
-- Reads 'CANCELLED' ← DIRTY READ
ROLLBACK;
-- Order is actually still CONFIRMED, but B acted on stale/wrong data

Non-Repeatable Read — Same query, different result
Transaction A Transaction B
───────────── ─────────────
BEGIN;
SELECT amount FROM orders
WHERE id = 1;
-- Returns 299.00
BEGIN;
UPDATE orders SET amount = 399.00
WHERE id = 1;
COMMIT;
SELECT amount FROM orders
WHERE id = 1;
-- Returns 399.00 ← different! NON-REPEATABLE READ
COMMIT;

Phantom Read — New rows appear
Transaction A Transaction B
───────────── ─────────────
BEGIN;
SELECT COUNT(*) FROM orders
WHERE restaurant_id = 10;
-- Returns 5
BEGIN;
INSERT INTO orders (restaurant_id, ...)
VALUES (10, ...);
COMMIT;
SELECT COUNT(*) FROM orders
WHERE restaurant_id = 10;
-- Returns 6 ← PHANTOM ROW
COMMIT;

Which Level to Use?
| Use Case | Recommended Level | Why |
|---|---|---|
| Analytics / reporting queries | Read Committed | Stale data is fine, need performance |
| Order placement & payment | Repeatable Read | Must see consistent snapshot during transaction |
| Inventory / last-item booking | Serializable | Must prevent phantom reads on availability checks |
| Default in PostgreSQL | Read Committed | Good balance for most workloads |
| Default in MySQL InnoDB | Repeatable Read | Stronger default, uses MVCC snapshots |
5. SQL vs NoSQL Decision Framework
Comparison Table
| Dimension | SQL (PostgreSQL, MySQL) | NoSQL (MongoDB, DynamoDB, Redis) |
|---|---|---|
| Schema | Rigid, predefined | Flexible, schema-on-read |
| ACID | Full support | Varies (some support per-document ACID) |
| Scaling | Vertical (read replicas for horizontal) | Horizontal (sharding built-in) |
| Joins | Native, efficient | Application-level or denormalized |
| Consistency | Strong by default | Eventual (tunable in some) |
| Query Language | SQL (standardized) | Varies per database |
| Best For | Relationships, transactions, complex queries | High throughput, flexible schema, massive scale |
When to Use SQL (Food Delivery Context)
| Data | Why SQL |
|---|---|
| Orders & Payments | ACID transactions — cannot lose money or double-charge |
| User Accounts | Relational: user → addresses → payment methods → orders |
| Restaurant Menus | Structured: categories → items → variants → add-ons |
| Promotions & Coupons | Complex rules: "20% off if order > 500 AND first order this month" |
When to Use NoSQL (Food Delivery Context)
| Data | Database | Why |
|---|---|---|
| Real-time agent location | Redis (geospatial) | Sub-millisecond geo queries, high write throughput |
| Session store | Redis | TTL-based expiry, fast reads |
| Activity feed | MongoDB or DynamoDB | Flexible schema, append-heavy, time-sorted |
| Push notification queue | Redis Streams / Kafka | High throughput, ordered, consumer groups |
| Search & discovery | Elasticsearch | Full-text search, fuzzy matching, faceted filters |
Polyglot Persistence for Food Delivery
┌─────────────────────────────────────────────────────────────┐
│ Food Delivery Platform │
├──────────────┬──────────────┬───────────────┬───────────────┤
│ PostgreSQL │ Redis │ MongoDB │ Elasticsearch │
├──────────────┼──────────────┼───────────────┼───────────────┤
│ Users │ Sessions │ Activity feed │ Restaurant │
│ Orders │ Agent GPS │ Chat messages │ search │
│ Payments │ Rate limits │ Notifications │ Menu search │
│ Restaurants │ Cart (temp) │ Audit logs │ Geo search │
│ Coupons │ Cache layer │ │ │
└──────────────┴──────────────┴───────────────┴───────────────┘

Interview tip: When asked "SQL or NoSQL?", the answer is almost always "both — here's how I'd split the data."
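The routing idea above can be sketched as a thin facade that sends each kind of data to the store suited to it. A toy version, with Maps standing in for PostgreSQL and Redis (all names are illustrative):

```typescript
// Toy polyglot-persistence facade: orders go to the transactional store,
// sessions go to the TTL-friendly cache store.
const sqlOrders = new Map<number, { userId: number; amount: number }>();       // "PostgreSQL"
const redisSessions = new Map<string, { userId: number; expiresAt: number }>(); // "Redis"

function routeOrder(orderId: number, userId: number, amount: number): void {
  sqlOrders.set(orderId, { userId, amount }); // relational, durable, transactional
}

function cacheSession(token: string, userId: number, ttlMs: number): void {
  redisSessions.set(token, { userId, expiresAt: Date.now() + ttlMs }); // TTL-style expiry
}

function sessionUser(token: string): number | null {
  const s = redisSessions.get(token);
  if (!s || s.expiresAt < Date.now()) return null; // expired or missing
  return s.userId;
}
```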
6. Transactions & Locking
Optimistic Locking
Assumes conflicts are rare. No locks held during the transaction. Instead, a version number or timestamp is checked at write time.
-- Step 1: Read the current version
SELECT restaurant_id, menu_version, is_available
FROM menu_items
WHERE item_id = 101;
-- Returns: menu_version = 5, is_available = true
-- Step 2: Update only if version hasn't changed
UPDATE menu_items
SET is_available = false, menu_version = menu_version + 1
WHERE item_id = 101 AND menu_version = 5;
-- Step 3: Check rows affected
-- If 0 rows updated → someone else changed it → RETRY
-- If 1 row updated → success

TypeScript implementation pattern:
// Assumes a pg-style `db` client, a `sleep(ms)` helper, and a MenuItem type defined elsewhere.
async function updateMenuItemWithOptimisticLock(
  itemId: number,
  updates: Partial<MenuItem>,
  maxRetries = 3
): Promise<MenuItem> {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const res = await db.query(
      'SELECT * FROM menu_items WHERE item_id = $1',
      [itemId]
    );
    const item = res.rows[0]; // current row, including menu_version
const result = await db.query(
`UPDATE menu_items
SET is_available = $1, menu_version = menu_version + 1
WHERE item_id = $2 AND menu_version = $3
RETURNING *`,
[updates.isAvailable, itemId, item.menu_version]
);
if (result.rowCount === 1) {
return result.rows[0]; // success
}
// Version mismatch — retry after brief delay
await sleep(50 * (attempt + 1));
}
throw new Error('Optimistic lock failed after max retries');
}

Pessimistic Locking
Assumes conflicts are likely. Acquires a lock before reading, holds it until transaction completes.
-- Lock the row so no other transaction can modify it
BEGIN;
SELECT * FROM orders
WHERE order_id = 1234
FOR UPDATE; -- row-level exclusive lock
-- Safely update — no one else can touch this row
UPDATE orders
SET status = 'DELIVERED', delivered_at = NOW()
WHERE order_id = 1234;
COMMIT; -- lock released

Lock types:
- FOR UPDATE — exclusive lock, blocks other FOR UPDATE and FOR SHARE
- FOR SHARE — shared lock, allows other FOR SHARE but blocks FOR UPDATE
- FOR UPDATE NOWAIT — fail immediately instead of waiting if lock is held
- FOR UPDATE SKIP LOCKED — skip rows that are locked (great for job queues)
-- Job queue pattern: pick next available order for delivery assignment
BEGIN;
SELECT * FROM orders
WHERE status = 'READY_FOR_PICKUP' AND agent_id IS NULL
ORDER BY created_at
LIMIT 1
FOR UPDATE SKIP LOCKED; -- skip orders being assigned by other workers
UPDATE orders SET agent_id = 7, status = 'ASSIGNED'
WHERE order_id = <selected_order_id>;
COMMIT;

Comparison: Optimistic vs Pessimistic
| Dimension | Optimistic | Pessimistic |
|---|---|---|
| Lock held during read | No | Yes |
| Conflict detection | At write time (version check) | At read time (lock acquisition) |
| Throughput under low contention | High (no lock overhead) | Lower (lock overhead even when unnecessary) |
| Throughput under high contention | Degrades (many retries) | Stable (queue on lock) |
| Deadlock risk | None | Possible |
| Best for | Read-heavy, low contention (menu browsing) | Write-heavy, high contention (order assignment) |
| Implementation | Application-level version column | Database-level SELECT FOR UPDATE |
MVCC (Multi-Version Concurrency Control)
How PostgreSQL and MySQL InnoDB handle concurrent reads and writes without readers blocking writers.
Core idea: Instead of locking rows for reads, the database keeps multiple versions of each row. Each transaction sees a snapshot of the database as of its start time.
Time Transaction A (Read) Transaction B (Write)
──── ──────────────────── ─────────────────────
T1 BEGIN (snapshot at T1)
T2 BEGIN
T3 UPDATE orders SET amount = 399
WHERE id = 1;
-- Creates new version (amount=399)
-- Old version (amount=299) kept
T4 SELECT amount FROM orders
WHERE id = 1;
-- Sees 299 (snapshot at T1)
-- Does NOT block, does NOT wait
T5 COMMIT;
T6 SELECT amount FROM orders
WHERE id = 1;
-- Still sees 299 (Repeatable Read)
-- OR sees 399 (Read Committed)
T7   COMMIT;

How it works in PostgreSQL:
- Each row has hidden columns: xmin (transaction that created it), xmax (transaction that deleted/updated it)
- A read transaction checks: "Is this row version visible to my snapshot?"
- Old versions are cleaned up by VACUUM
Key benefit: Readers never block writers. Writers never block readers. Only writer-writer conflicts need locking.
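The xmin/xmax visibility check can be sketched directly: a version is visible to a snapshot if it was created at or before the snapshot and not yet superseded. A toy version chain for one order row (real MVCC also checks commit status and hint bits, omitted here; names are illustrative):

```typescript
// Toy MVCC: writers never overwrite a row — they mark the old version
// dead (set xmax) and append a new one. Readers pick the version visible
// to their snapshot without taking any locks.
type Version = { amount: number; xmin: number; xmax: number };

const INF = Number.MAX_SAFE_INTEGER;
const orderVersions: Version[] = [{ amount: 299, xmin: 1, xmax: INF }];

// Writer: supersede the current version, append the new one.
function update(amount: number, txid: number): void {
  const current = orderVersions[orderVersions.length - 1];
  current.xmax = txid;
  orderVersions.push({ amount, xmin: txid, xmax: INF });
}

// Reader: find the version where xmin <= snapshot < xmax. Never blocks.
function readAt(snapshotTxid: number): number | undefined {
  const v = orderVersions.find(
    (ver) => ver.xmin <= snapshotTxid && snapshotTxid < ver.xmax
  );
  return v?.amount;
}
```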
7. Deadlock Detection
What Causes Deadlocks
A deadlock occurs when two or more transactions are each waiting for a lock held by the other, creating a circular wait. Four conditions must all be true (Coffman conditions):
| Condition | Meaning |
|---|---|
| Mutual Exclusion | The resource (row lock) cannot be shared |
| Hold and Wait | A transaction holds one lock while waiting for another |
| No Preemption | Locks cannot be forcibly taken from a transaction |
| Circular Wait | T1 waits for T2, T2 waits for T1 (or longer cycle) |
Deadlock Example: Two Concurrent Order Updates
Transaction A Transaction B
───────────── ─────────────
BEGIN; BEGIN;
-- Lock order #1 -- Lock order #2
UPDATE orders SET status = 'PREPARING' UPDATE orders SET status = 'PREPARING'
WHERE order_id = 1; WHERE order_id = 2;
-- Holds lock on order #1 -- Holds lock on order #2
-- Now needs order #2 -- Now needs order #1
UPDATE orders SET status = 'PREPARING' UPDATE orders SET status = 'PREPARING'
WHERE order_id = 2; WHERE order_id = 1;
-- BLOCKED (B holds lock on #2) -- BLOCKED (A holds lock on #1)
-- ⚠ DEADLOCK: A waits for B, B waits for A

Detection: Wait-For Graph
The database builds a directed graph of "Transaction X waits for Transaction Y". If the graph contains a cycle, a deadlock exists.
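Cycle detection on a wait-for graph is a depth-first search that flags a back edge, roughly what the database's deadlock detector does. A toy sketch (illustrative, not the actual engine code):

```typescript
// Wait-for graph: an edge means "transaction X waits for transaction Y".
// A cycle anywhere in the graph means a deadlock exists.
type WaitFor = Map<string, string[]>;

function hasDeadlock(graph: WaitFor): boolean {
  const visiting = new Set<string>(); // on the current DFS path
  const done = new Set<string>();     // fully explored, known cycle-free

  function dfs(tx: string): boolean {
    if (visiting.has(tx)) return true; // back edge: circular wait found
    if (done.has(tx)) return false;
    visiting.add(tx);
    for (const waitee of graph.get(tx) ?? []) {
      if (dfs(waitee)) return true;
    }
    visiting.delete(tx);
    done.add(tx);
    return false;
  }

  return [...graph.keys()].some((tx) => dfs(tx));
}
```

A real detector would also pick a victim from the cycle and roll it back, which is what produces the "deadlock detected" error shown below.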
Wait-For Graph:

    T_A ──────→ T_B
     ↑           │
     └───────────┘    ← CYCLE DETECTED

Resolution: Kill one transaction (the "victim"), ROLLBACK it,
let the other proceed.

PostgreSQL detects deadlocks automatically (checks every deadlock_timeout, default 1 second). The victim transaction gets:
ERROR: deadlock detected
DETAIL: Process 1234 waits for ShareLock on transaction 5678;
blocked by process 9012.
Process 9012 waits for ShareLock on transaction 5679;
blocked by process 1234.

Prevention Strategies
1. Consistent Lock Ordering
Always acquire locks in the same order (e.g., by ascending order_id).
-- Both transactions lock order #1 first, then #2
-- Transaction A:
BEGIN;
UPDATE orders SET status = 'PREPARING' WHERE order_id = 1; -- lock #1 first
UPDATE orders SET status = 'PREPARING' WHERE order_id = 2; -- then #2
COMMIT;
-- Transaction B:
BEGIN;
UPDATE orders SET status = 'PREPARING' WHERE order_id = 1; -- lock #1 first (waits if A holds it)
UPDATE orders SET status = 'PREPARING' WHERE order_id = 2; -- then #2
COMMIT;
-- No deadlock: B simply waits for A to finish with #1

// Application-level lock ordering
async function updateMultipleOrders(orderIds: number[], status: string) {
// Sort to ensure consistent lock order
const sorted = [...orderIds].sort((a, b) => a - b);
await db.transaction(async (trx) => {
for (const id of sorted) {
await trx.query(
'UPDATE orders SET status = $1 WHERE order_id = $2',
[status, id]
);
}
});
}

2. Lock Timeout
Set a maximum wait time. If a lock cannot be acquired within the timeout, the transaction fails instead of waiting forever.
-- Set lock timeout to 5 seconds for this transaction
BEGIN;
SET LOCAL lock_timeout = '5s'; -- SET LOCAL only takes effect inside a transaction
SELECT * FROM orders WHERE order_id = 1 FOR UPDATE;
-- If lock not acquired in 5 seconds:
-- ERROR: canceling statement due to lock timeout
COMMIT;

3. NOWAIT — Fail Immediately
BEGIN;
SELECT * FROM orders WHERE order_id = 1 FOR UPDATE NOWAIT;
-- If locked: ERROR: could not obtain lock on row in relation "orders"
-- Application catches error and retries with backoff
COMMIT;

4. Reduce Lock Scope and Duration
-- BAD: Lock held while doing slow external call
BEGIN;
SELECT * FROM orders WHERE order_id = 1 FOR UPDATE;
-- ... call payment gateway (2 seconds) ...
UPDATE orders SET status = 'PAID' WHERE order_id = 1;
COMMIT;
-- GOOD: Do external work first, then lock briefly
-- 1. Call payment gateway outside transaction
-- 2. Then:
BEGIN;
SELECT * FROM orders WHERE order_id = 1 FOR UPDATE;
UPDATE orders SET status = 'PAID', payment_ref = 'PAY_123'
WHERE order_id = 1;
COMMIT; -- lock held for milliseconds, not seconds

Summary: Deadlock Prevention Checklist
| Strategy | How | Trade-off |
|---|---|---|
| Consistent lock ordering | Sort resources before locking | Requires discipline across codebase |
| Lock timeout | SET lock_timeout = '5s' | Must handle timeout errors in application |
| NOWAIT | FOR UPDATE NOWAIT | Immediate failure, needs retry logic |
| Reduce lock duration | Move slow operations outside transaction | More complex application flow |
| Use optimistic locking | Version column, no row locks | Higher retry rate under contention |
| Deadlock detection (DB) | Automatic in PostgreSQL/MySQL | One transaction is killed — must retry |
Interview Tips Summary
- Schema design questions: Start with 3NF, then discuss when and why you would denormalize.
- "How would you optimize this query?": Check execution plan → add composite index (equality cols first) → consider covering index → measure.
- "SQL or NoSQL?": Never pick one. Describe polyglot persistence — which data goes where and why.
- "How do you handle concurrent updates?": Start with the use case's contention level, then pick optimistic vs pessimistic.
- "What isolation level would you use?": Default to Read Committed for most things, Serializable for financial operations.
- ACID questions: Use the "place an order" flow — it naturally touches all four properties.
- Deadlock questions: Explain the four Coffman conditions, then lead with prevention (lock ordering) over detection.