
04 - DBMS

Why This Matters for Temple

Temple (ex-Zomato founding team) builds food delivery infrastructure where data integrity is non-negotiable. Orders, payments, restaurant menus, delivery tracking -- every piece touches a database. Expect questions on schema design, query optimization, and trade-offs between consistency and performance at scale.


Quick Reference

Scan this table in 5 minutes before your interview.

| Topic | Key Points | Interview Tip |
|-------|------------|---------------|
| SQL Joins | INNER (intersection), LEFT (all left + match), RIGHT (all right + match), FULL OUTER (union), CROSS (cartesian), SELF (table with itself) | Draw the Venn diagram; always clarify NULL handling |
| B-Tree Index | Balanced tree, O(log n) lookup, great for range queries and sorting | Default index type in PostgreSQL/MySQL — say this explicitly |
| Hash Index | O(1) exact lookup, useless for ranges | Only mention for equality checks like WHERE id = ? |
| Composite Index | Leftmost prefix rule — (a, b, c) supports queries on (a), (a, b), (a, b, c) but NOT b alone | Draw the index tree to explain why prefix matters |
| Normalization | 1NF (atomic) → 2NF (no partial deps) → 3NF (no transitive deps) | Know when to denormalize: read-heavy dashboards, caching |
| ACID | Atomicity (all-or-nothing), Consistency (valid state), Isolation (concurrent txns don't interfere), Durability (committed = permanent) | Use the "place order" flow as your example |
| Isolation Levels | Read Uncommitted → Read Committed → Repeatable Read → Serializable | Know the anomaly each level prevents |
| SQL vs NoSQL | SQL for transactions/relationships; NoSQL for scale/flexibility | Suggest polyglot persistence — use both |
| Optimistic Locking | Version column, no locks held, retry on conflict | Best for low-contention reads (menu browsing) |
| Pessimistic Locking | SELECT FOR UPDATE, holds row lock | Best for high-contention writes (seat/slot booking) |
| Deadlocks | Circular wait between transactions | Prevention: consistent lock ordering, timeouts |

1. SQL Joins

Schema for Examples

sql
CREATE TABLE users (
    user_id   INT PRIMARY KEY,
    name      VARCHAR(100),
    city      VARCHAR(50)
);

CREATE TABLE restaurants (
    restaurant_id INT PRIMARY KEY,
    name          VARCHAR(100),
    city          VARCHAR(50),
    cuisine       VARCHAR(50)
);

CREATE TABLE orders (
    order_id      INT PRIMARY KEY,
    user_id       INT REFERENCES users(user_id),
    restaurant_id INT REFERENCES restaurants(restaurant_id),
    amount        DECIMAL(10, 2),
    status        VARCHAR(20),
    created_at    TIMESTAMP
);

CREATE TABLE delivery_agents (
    agent_id  INT PRIMARY KEY,
    name      VARCHAR(100),
    order_id  INT REFERENCES orders(order_id),  -- currently assigned order (NULL if free)
    rating    DECIMAL(2, 1)
);

INNER JOIN — Only Matching Rows

Returns rows that have a match in both tables. No match = row excluded.

 Table A       Table B
┌───────┐     ┌───────┐
│       │     │       │
│   ┌───┼─────┼───┐   │
│   │ ██████████ │   │    ← only the overlap
│   └───┼─────┼───┘   │
│       │     │       │
└───────┘     └───────┘
sql
-- Get all orders with the user's name and restaurant name
SELECT
    o.order_id,
    u.name   AS customer,
    r.name   AS restaurant,
    o.amount
FROM orders o
INNER JOIN users u       ON o.user_id = u.user_id
INNER JOIN restaurants r ON o.restaurant_id = r.restaurant_id;

When to use: You only want rows where the relationship exists on both sides. Most common join.

LEFT JOIN — All From Left + Matching From Right

Returns all rows from the left table. If no match on the right, those columns are NULL.

 Table A       Table B
┌───────┐     ┌───────┐
│       │     │       │
│ ██┌███┼█████┼───┐   │
│ ██│ ██████████ │   │    ← all of A, matching B
│ ██└███┼█████┼───┘   │
│       │     │       │
└───────┘     └───────┘
sql
-- Find all users and their orders (including users who never ordered)
SELECT
    u.user_id,
    u.name,
    o.order_id,
    o.amount
FROM users u
LEFT JOIN orders o ON u.user_id = o.user_id;

-- To find users who NEVER placed an order:
SELECT u.user_id, u.name
FROM users u
LEFT JOIN orders o ON u.user_id = o.user_id
WHERE o.order_id IS NULL;

When to use: Finding "missing" relationships — users with no orders, restaurants with no reviews.

RIGHT JOIN — All From Right + Matching From Left

Mirror of LEFT JOIN. Returns all rows from the right table.

 Table A       Table B
┌───────┐     ┌───────┐
│       │     │       │
│   ┌───┼█████┼███┐██ │
│   │ ██████████ │██ │    ← matching A, all of B
│   └───┼█████┼███┘██ │
│       │     │       │
└───────┘     └───────┘
sql
-- Find all restaurants and any orders they received
SELECT
    r.restaurant_id,
    r.name AS restaurant,
    o.order_id,
    o.amount
FROM orders o
RIGHT JOIN restaurants r ON o.restaurant_id = r.restaurant_id;

When to use: Rarely in practice — you can always rewrite as a LEFT JOIN by swapping table order. Know it exists for interviews.

FULL OUTER JOIN — All Rows From Both

Returns all rows from both tables. NULLs fill in where there is no match on either side.

 Table A       Table B
┌───────┐     ┌───────┐
│       │     │       │
│ ██┌███┼█████┼███┐██ │
│ ██│ ██████████ │██ │    ← everything
│ ██└███┼█████┼███┘██ │
│       │     │       │
└───────┘     └───────┘
sql
-- Match delivery agents to orders (show unassigned agents AND undelivered orders)
SELECT
    da.agent_id,
    da.name   AS agent_name,
    o.order_id,
    o.status
FROM delivery_agents da
FULL OUTER JOIN orders o ON da.order_id = o.order_id;

When to use: Reconciliation reports — find orphaned records on either side.

CROSS JOIN — Cartesian Product

Every row in A paired with every row in B. No ON clause. Result: |A| * |B| rows.

 A: [1, 2]    B: [x, y, z]
 Result: (1,x) (1,y) (1,z) (2,x) (2,y) (2,z)   →  2 × 3 = 6 rows
sql
-- Generate all possible restaurant-cuisine pairings for a recommendation matrix
SELECT
    r.name AS restaurant,
    c.cuisine
FROM restaurants r
CROSS JOIN (
    SELECT DISTINCT cuisine FROM restaurants
) c;

When to use: Generating combinations, test data, or pairing every item with every category. Be careful with large tables — row count explodes.

SELF JOIN — Table Joined With Itself

The same table appears twice with different aliases.

sql
-- Find delivery agents in the same city (peer comparison for routing)
-- Assume agents have a city column
SELECT
    a1.name AS agent_1,
    a2.name AS agent_2,
    a1.rating AS rating_1,
    a2.rating AS rating_2
FROM delivery_agents a1
INNER JOIN delivery_agents a2
    ON a1.agent_id < a2.agent_id;  -- avoid duplicates and self-pairs

-- Find users who referred other users (users table has referred_by column)
SELECT
    referrer.name AS referrer,
    referred.name AS referred_user
FROM users referred
INNER JOIN users referrer ON referred.referred_by = referrer.user_id;

When to use: Hierarchical data (manager-employee), finding pairs within the same table, graph-like relationships stored in a single table.
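The `ON a1.agent_id < a2.agent_id` condition is worth being able to justify on the spot. A small TypeScript sketch (hypothetical `Agent` type) showing that the strict inequality emits each unordered pair exactly once, with no self-pairs:

```typescript
interface Agent {
  agentId: number;
  name: string;
}

// Each unordered pair appears exactly once because we only keep
// combinations where the first id is strictly smaller. This is the
// same filter as ON a1.agent_id < a2.agent_id in the self join:
// (A,B) survives, (B,A) and (A,A) do not.
function agentPairs(agents: Agent[]): Array<[string, string]> {
  const pairs: Array<[string, string]> = [];
  for (const a1 of agents) {
    for (const a2 of agents) {
      if (a1.agentId < a2.agentId) pairs.push([a1.name, a2.name]);
    }
  }
  return pairs;
}
```

With `<=` you would get self-pairs; with `!=` every pair would appear twice in both orders.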


2. Indexing

B-Tree Index

The default index type in PostgreSQL and MySQL (InnoDB). A balanced tree structure where:

  • Internal nodes store keys and pointers to child nodes
  • Leaf nodes store keys and pointers to actual rows (or the row data in clustered indexes)
  • All leaf nodes are at the same depth — guarantees O(log n) lookup
                    [50]
                   /    \
            [20, 35]    [70, 85]
           /   |   \    /   |   \
        [10] [25] [40] [60] [75] [90]   ← leaf nodes (linked list for range scans)

How it supports operations:

  • Exact match WHERE user_id = 42 — traverse tree, O(log n)
  • Range query WHERE created_at BETWEEN '2025-01-01' AND '2025-01-31' — find start leaf, scan linked list
  • Sorting ORDER BY created_at — leaves already sorted, sequential scan
  • Prefix matching WHERE name LIKE 'Pizza%' — tree is ordered, find first match and scan
sql
-- Create a B-Tree index (default, so USING btree is optional)
CREATE INDEX idx_orders_created_at ON orders(created_at);

-- This query benefits from the index:
SELECT * FROM orders
WHERE created_at >= '2025-01-01' AND created_at < '2025-02-01'
ORDER BY created_at DESC;
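All four access patterns above reduce to "binary-search to a position, then scan forward". A minimal TypeScript sketch of the leaf level, modeled as a plain sorted array (an assumption for illustration; real B-Trees split keys across fixed-size disk pages):

```typescript
// Find the first index whose key is >= target: the O(log n) tree descent.
function lowerBound(keys: string[], target: string): number {
  let lo = 0;
  let hi = keys.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (keys[mid] < target) lo = mid + 1;
    else hi = mid;
  }
  return lo;
}

// Range query: descend to the start leaf, then scan the sorted
// leaves until the end of the range (the "linked list" walk).
function rangeScan(keys: string[], from: string, to: string): string[] {
  const out: string[] = [];
  for (let i = lowerBound(keys, from); i < keys.length && keys[i] < to; i++) {
    out.push(keys[i]);
  }
  return out;
}
```

Exact match, prefix match, and ORDER BY fall out of the same two primitives: locate a position, then read keys in sorted order.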

Hash Index

A hash table: key → bucket → row pointer. O(1) average lookup for exact equality only.

 Key: "user_42"  →  hash("user_42") = 7  →  Bucket 7  →  Row pointer

Limitations:

  • No range queries (hash destroys ordering)
  • No sorting
  • No prefix matching
  • Not crash-safe (not WAL-logged) in PostgreSQL versions before 10
sql
-- Create a Hash index (PostgreSQL)
CREATE INDEX idx_users_email_hash ON users USING hash(email);

-- This benefits:
SELECT * FROM users WHERE email = 'aayush@temple.io';

-- This does NOT benefit (range query):
SELECT * FROM users WHERE email > 'a' AND email < 'b';  -- full scan

When to use: High-volume exact lookups — session tokens, API key lookups, cache keys.

Composite Index

An index on multiple columns. The leftmost prefix rule determines which queries can use it.

sql
CREATE INDEX idx_orders_user_date ON orders(user_id, created_at);

Think of it as a phone book sorted by (last_name, first_name):

 Index sorted as:
 (user_1, 2025-01-01)
 (user_1, 2025-01-15)
 (user_1, 2025-02-03)
 (user_2, 2025-01-05)
 (user_2, 2025-01-20)
 ...

Leftmost prefix rule — which queries use this index?

| Query | Uses Index? | Why |
|-------|-------------|-----|
| WHERE user_id = 1 | Yes | Leftmost column |
| WHERE user_id = 1 AND created_at > '2025-01-01' | Yes | Full prefix |
| WHERE created_at > '2025-01-01' | No | Skips leftmost column |
| WHERE user_id = 1 ORDER BY created_at | Yes | Prefix + sort on next column |
| WHERE created_at > '2025-01-01' AND user_id = 1 | Yes | Optimizer reorders conditions |
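The rule behind those rows can be captured as a small predicate. A sketch with a hypothetical `canUseIndex` helper; it only models WHERE columns (the optimizer's reordering is what makes filter order irrelevant), not ORDER BY:

```typescript
// Leftmost prefix rule: a composite index (c1, c2, ...) is usable
// when the query's filtered columns, in any order, cover an unbroken
// prefix of the index starting at c1.
function canUseIndex(indexCols: string[], queryCols: string[]): boolean {
  const filtered = new Set(queryCols);
  // Count how many leading index columns the query touches.
  let prefix = 0;
  while (prefix < indexCols.length && filtered.has(indexCols[prefix])) {
    prefix++;
  }
  // At minimum the leftmost column must be filtered on.
  return prefix > 0;
}
```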

Covering Index

An index that contains all columns the query needs. The database serves the query entirely from the index without touching the actual table ("index-only scan").

sql
-- Covering index for a common dashboard query
CREATE INDEX idx_orders_covering
ON orders(user_id, created_at, amount, status);

-- This query is fully covered — no table lookup needed:
SELECT user_id, created_at, amount, status
FROM orders
WHERE user_id = 42 AND created_at > '2025-01-01';

-- EXPLAIN will show "Index Only Scan" (PostgreSQL) or "Using index" (MySQL)

Trade-off: Wider indexes use more disk and slow down writes. Only create covering indexes for hot queries.

When NOT to Index

| Scenario | Why Indexing Hurts |
|----------|--------------------|
| Low cardinality columns (status with 3 values) | Index scan not much better than full scan; optimizer may ignore it |
| Frequently updated columns | Every write must update the index too |
| Small tables (< 1000 rows) | Full scan is fast enough; index overhead not worth it |
| Columns used with functions | WHERE UPPER(name) = 'PIZZA' cannot use index on name (use expression index instead) |
| Write-heavy tables with many indexes | Each INSERT/UPDATE touches every index — write amplification |

Practical Example: Optimizing "Find Orders by User in Date Range"

sql
-- The query we want to optimize:
SELECT order_id, amount, status, created_at
FROM orders
WHERE user_id = 42
  AND created_at BETWEEN '2025-01-01' AND '2025-01-31'
ORDER BY created_at DESC;

-- Step 1: Check current execution plan
EXPLAIN ANALYZE
SELECT order_id, amount, status, created_at
FROM orders
WHERE user_id = 42
  AND created_at BETWEEN '2025-01-01' AND '2025-01-31'
ORDER BY created_at DESC;
-- Result: Seq Scan on orders, execution time: 15000 ms

-- Step 2: Add composite index (user_id first for equality, created_at for range + sort)
CREATE INDEX idx_orders_user_date ON orders(user_id, created_at);
-- Result: Index Scan, execution time: 3 ms

-- Step 3: Make it a covering index to avoid table lookup
CREATE INDEX idx_orders_user_date_covering
ON orders(user_id, created_at) INCLUDE (order_id, amount, status);
-- Result: Index Only Scan, execution time: 1 ms

Key takeaway: Equality columns first, then range/sort columns, then INCLUDE non-filtered columns.


3. Normalization

1NF — Atomic Values, No Repeating Groups

Every cell holds a single value. No arrays, no comma-separated lists, no nested tables.

Violation:

| order_id | items |
|----------|-------|
| 1 | "Burger, Fries, Cola" |
| 2 | "Pizza, Garlic Bread" |

1NF Fix: Break items into separate rows or a separate table.

sql
-- Separate order_items table
CREATE TABLE order_items (
    item_id    INT PRIMARY KEY,
    order_id   INT REFERENCES orders(order_id),
    item_name  VARCHAR(100),
    quantity   INT,
    price      DECIMAL(10, 2)
);
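The same fix, seen from application code: splitting the comma-separated `items` column into one row per item. The `RawOrder` and `OrderItem` types are hypothetical stand-ins for the tables above:

```typescript
interface RawOrder {
  orderId: number;
  items: string; // violates 1NF: "Burger, Fries, Cola"
}

interface OrderItem {
  orderId: number;
  itemName: string;
}

// 1NF in code: one output row per atomic value.
function toFirstNormalForm(rows: RawOrder[]): OrderItem[] {
  return rows.flatMap((row) =>
    row.items
      .split(",")
      .map((item) => ({ orderId: row.orderId, itemName: item.trim() }))
  );
}
```

Once each value lives in its own row, the usual relational machinery (joins, GROUP BY, foreign keys per item) applies again.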

2NF — No Partial Dependencies

All non-key attributes must depend on the entire primary key, not just part of it. Only relevant when you have a composite primary key.

Violation: Composite key is (order_id, item_id)

| order_id | item_id | item_name | item_price | order_date | customer_name |
|----------|---------|-----------|------------|------------|---------------|
| 1 | 101 | Burger | 8.99 | 2025-01-15 | Aayush |

Problem: order_date and customer_name depend only on order_id, not on (order_id, item_id).

2NF Fix: Split into two tables.

sql
-- Orders table: attributes that depend on order_id alone
CREATE TABLE orders (
    order_id      INT PRIMARY KEY,
    order_date    TIMESTAMP,
    customer_name VARCHAR(100)
);

-- Order items: attributes that depend on the full composite key
CREATE TABLE order_items (
    order_id   INT REFERENCES orders(order_id),
    item_id    INT,
    item_name  VARCHAR(100),
    item_price DECIMAL(10, 2),
    quantity   INT,
    PRIMARY KEY (order_id, item_id)
);

3NF — No Transitive Dependencies

No non-key attribute should depend on another non-key attribute. Every non-key column depends directly on the primary key.

Violation:

| order_id | restaurant_id | restaurant_name | restaurant_city |
|----------|---------------|-----------------|-----------------|
| 1 | 10 | Pizza Palace | Mumbai |

Problem: restaurant_name and restaurant_city depend on restaurant_id, not on order_id directly. That is a transitive dependency: order_id → restaurant_id → restaurant_name.

3NF Fix:

sql
-- Orders only reference restaurant_id
CREATE TABLE orders (
    order_id      INT PRIMARY KEY,
    user_id       INT REFERENCES users(user_id),
    restaurant_id INT REFERENCES restaurants(restaurant_id),
    amount        DECIMAL(10, 2),
    created_at    TIMESTAMP
);

-- Restaurant details live in their own table
CREATE TABLE restaurants (
    restaurant_id   INT PRIMARY KEY,
    restaurant_name VARCHAR(100),
    restaurant_city VARCHAR(50)
);

When to Denormalize

| Scenario | Approach | Example |
|----------|----------|---------|
| Read-heavy dashboards | Materialized views or summary tables | Pre-aggregate daily order counts per restaurant |
| Avoiding expensive joins | Store redundant data | Keep restaurant_name in orders for display queries |
| Caching precomputed data | Denormalized cache table | Store user_total_spent in users table, update on each order |
| Event sourcing / analytics | Flat event tables | Write-once order events with all context embedded |

Rule of thumb: Normalize for writes, denormalize for reads. Start normalized, denormalize when you have a proven performance bottleneck.

Full Normalization Example: Food Order

Unnormalized (0NF):

| order_id | customer | phone     | items                        | restaurant | city   |
|----------|----------|-----------|------------------------------|------------|--------|
| 1        | Aayush   | 999000111 | Burger:8.99, Fries:3.99      | Pizza Hut  | Mumbai |
| 2        | Aayush   | 999000111 | Margherita:12.99             | Pizza Hut  | Mumbai |

1NF — Atomic values:

| order_id | customer | phone     | item       | price | restaurant | city   |
|----------|----------|-----------|------------|-------|------------|--------|
| 1        | Aayush   | 999000111 | Burger     | 8.99  | Pizza Hut  | Mumbai |
| 1        | Aayush   | 999000111 | Fries      | 3.99  | Pizza Hut  | Mumbai |
| 2        | Aayush   | 999000111 | Margherita | 12.99 | Pizza Hut  | Mumbai |

2NF — Remove partial dependencies (key: order_id, item):

orders: (order_id, customer, phone, restaurant, city)
order_items: (order_id, item, price)

3NF — Remove transitive dependencies:

users: (user_id, customer, phone)
restaurants: (restaurant_id, restaurant, city)
orders: (order_id, user_id, restaurant_id)
order_items: (order_id, item, price)

4. ACID Properties & Isolation Levels

ACID with Food Delivery Examples

Atomicity — All or Nothing

Placing an order involves: deduct wallet balance + create order + notify restaurant. If the restaurant notification fails, everything rolls back — you do not lose money without an order.

sql
BEGIN;

UPDATE wallets SET balance = balance - 299.00 WHERE user_id = 42;
INSERT INTO orders (user_id, restaurant_id, amount, status)
VALUES (42, 10, 299.00, 'CONFIRMED');
INSERT INTO notifications (restaurant_id, message)
VALUES (10, 'New order #1234');

-- If any statement fails, ROLLBACK undoes everything
COMMIT;

Consistency — Valid State Before and After

Database constraints ensure we never end up in an invalid state. A negative wallet balance, an order referencing a non-existent restaurant, or a duplicate payment — constraints catch these.

sql
ALTER TABLE wallets ADD CONSTRAINT positive_balance CHECK (balance >= 0);
-- Now: UPDATE wallets SET balance = balance - 10000 WHERE balance = 50;
-- Fails with CHECK constraint violation — atomicity rolls back the transaction

Isolation — Concurrent Transactions Don't Interfere

Two users ordering the last available "Chef's Special" at the same time should not both succeed.

Durability — Committed = Permanent

Once the user sees "Order Confirmed", that data survives a server crash. Achieved through write-ahead logging (WAL) — changes are written to a durable log before being applied.

Isolation Levels

| Level | Dirty Read | Non-Repeatable Read | Phantom Read | Performance |
|-------|------------|---------------------|--------------|-------------|
| Read Uncommitted | Possible | Possible | Possible | Fastest |
| Read Committed | Prevented | Possible | Possible | Fast |
| Repeatable Read | Prevented | Prevented | Possible | Moderate |
| Serializable | Prevented | Prevented | Prevented | Slowest |

The Three Anomalies Explained

Dirty Read — Reading uncommitted data

 Transaction A                    Transaction B
 ─────────────                    ─────────────
 BEGIN;
 UPDATE orders SET status =
   'CANCELLED' WHERE id = 1;
                                  BEGIN;
                                  SELECT status FROM orders
                                    WHERE id = 1;
                                  -- Reads 'CANCELLED' ← DIRTY READ
 ROLLBACK;
 -- Order is actually still CONFIRMED, but B acted on stale/wrong data

Non-Repeatable Read — Same query, different result

 Transaction A                    Transaction B
 ─────────────                    ─────────────
 BEGIN;
 SELECT amount FROM orders
   WHERE id = 1;
 -- Returns 299.00
                                  BEGIN;
                                  UPDATE orders SET amount = 399.00
                                    WHERE id = 1;
                                  COMMIT;
 SELECT amount FROM orders
   WHERE id = 1;
 -- Returns 399.00 ← different!    NON-REPEATABLE READ
 COMMIT;

Phantom Read — New rows appear

 Transaction A                    Transaction B
 ─────────────                    ─────────────
 BEGIN;
 SELECT COUNT(*) FROM orders
   WHERE restaurant_id = 10;
 -- Returns 5
                                  BEGIN;
                                  INSERT INTO orders (restaurant_id, ...)
                                    VALUES (10, ...);
                                  COMMIT;
 SELECT COUNT(*) FROM orders
   WHERE restaurant_id = 10;
 -- Returns 6 ← PHANTOM ROW
 COMMIT;

Which Level to Use?

| Use Case | Recommended Level | Why |
|----------|-------------------|-----|
| Analytics / reporting queries | Read Committed | Stale data is fine, need performance |
| Order placement & payment | Repeatable Read | Must see consistent snapshot during transaction |
| Inventory / last-item booking | Serializable | Must prevent phantom reads on availability checks |
| Default in PostgreSQL | Read Committed | Good balance for most workloads |
| Default in MySQL InnoDB | Repeatable Read | Stronger default, uses MVCC snapshots |

5. SQL vs NoSQL Decision Framework

Comparison Table

| Dimension | SQL (PostgreSQL, MySQL) | NoSQL (MongoDB, DynamoDB, Redis) |
|-----------|--------------------------|----------------------------------|
| Schema | Rigid, predefined | Flexible, schema-on-read |
| ACID | Full support | Varies (some support per-document ACID) |
| Scaling | Vertical (read replicas for horizontal) | Horizontal (sharding built-in) |
| Joins | Native, efficient | Application-level or denormalized |
| Consistency | Strong by default | Eventual (tunable in some) |
| Query Language | SQL (standardized) | Varies per database |
| Best For | Relationships, transactions, complex queries | High throughput, flexible schema, massive scale |

When to Use SQL (Food Delivery Context)

| Data | Why SQL |
|------|---------|
| Orders & Payments | ACID transactions — cannot lose money or double-charge |
| User Accounts | Relational: user → addresses → payment methods → orders |
| Restaurant Menus | Structured: categories → items → variants → add-ons |
| Promotions & Coupons | Complex rules: "20% off if order > 500 AND first order this month" |

When to Use NoSQL (Food Delivery Context)

| Data | Database | Why |
|------|----------|-----|
| Real-time agent location | Redis (geospatial) | Sub-millisecond geo queries, high write throughput |
| Session store | Redis | TTL-based expiry, fast reads |
| Activity feed | MongoDB or DynamoDB | Flexible schema, append-heavy, time-sorted |
| Push notification queue | Redis Streams / Kafka | High throughput, ordered, consumer groups |
| Search & discovery | Elasticsearch | Full-text search, fuzzy matching, faceted filters |

Polyglot Persistence for Food Delivery

┌─────────────────────────────────────────────────────────────┐
│                     Food Delivery Platform                   │
├──────────────┬──────────────┬───────────────┬───────────────┤
│   PostgreSQL │    Redis     │   MongoDB     │ Elasticsearch │
├──────────────┼──────────────┼───────────────┼───────────────┤
│ Users        │ Sessions     │ Activity feed │ Restaurant    │
│ Orders       │ Agent GPS    │ Chat messages │   search      │
│ Payments     │ Rate limits  │ Notifications │ Menu search   │
│ Restaurants  │ Cart (temp)  │ Audit logs    │ Geo search    │
│ Coupons      │ Cache layer  │               │               │
└──────────────┴──────────────┴───────────────┴───────────────┘

Interview tip: When asked "SQL or NoSQL?", the answer is almost always "both — here's how I'd split the data."


6. Transactions & Locking

Optimistic Locking

Assumes conflicts are rare. No locks held during the transaction. Instead, a version number or timestamp is checked at write time.

sql
-- Step 1: Read the current version
SELECT restaurant_id, menu_version, is_available
FROM menu_items
WHERE item_id = 101;
-- Returns: menu_version = 5, is_available = true

-- Step 2: Update only if version hasn't changed
UPDATE menu_items
SET is_available = false, menu_version = menu_version + 1
WHERE item_id = 101 AND menu_version = 5;

-- Step 3: Check rows affected
-- If 0 rows updated → someone else changed it → RETRY
-- If 1 row updated  → success

TypeScript implementation pattern:

ts
async function updateMenuItemWithOptimisticLock(
  itemId: number,
  updates: Partial<MenuItem>,
  maxRetries = 3
): Promise<MenuItem> {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    // db.query returns a result object; the row itself is in rows[0]
    const { rows } = await db.query(
      'SELECT * FROM menu_items WHERE item_id = $1',
      [itemId]
    );
    const item = rows[0];

    const result = await db.query(
      `UPDATE menu_items
       SET is_available = $1, menu_version = menu_version + 1
       WHERE item_id = $2 AND menu_version = $3
       RETURNING *`,
      [updates.isAvailable, itemId, item.menu_version]
    );

    if (result.rowCount === 1) {
      return result.rows[0]; // success
    }

    // Version mismatch — retry after brief delay
    await sleep(50 * (attempt + 1));
  }

  throw new Error('Optimistic lock failed after max retries');
}

Pessimistic Locking

Assumes conflicts are likely. Acquires a lock before reading, holds it until transaction completes.

sql
-- Lock the row so no other transaction can modify it
BEGIN;

SELECT * FROM orders
WHERE order_id = 1234
FOR UPDATE;  -- row-level exclusive lock

-- Safely update — no one else can touch this row
UPDATE orders
SET status = 'DELIVERED', delivered_at = NOW()
WHERE order_id = 1234;

COMMIT;  -- lock released

Lock types:

  • FOR UPDATE — exclusive lock, blocks other FOR UPDATE and FOR SHARE
  • FOR SHARE — shared lock, allows other FOR SHARE but blocks FOR UPDATE
  • FOR UPDATE NOWAIT — fail immediately instead of waiting if lock is held
  • FOR UPDATE SKIP LOCKED — skip rows that are locked (great for job queues)
sql
-- Job queue pattern: pick next available order for delivery assignment
BEGIN;

SELECT * FROM orders
WHERE status = 'READY_FOR_PICKUP' AND agent_id IS NULL
ORDER BY created_at
LIMIT 1
FOR UPDATE SKIP LOCKED;  -- skip orders being assigned by other workers

UPDATE orders SET agent_id = 7, status = 'ASSIGNED'
WHERE order_id = <selected_order_id>;

COMMIT;

Comparison: Optimistic vs Pessimistic

| Dimension | Optimistic | Pessimistic |
|-----------|------------|-------------|
| Lock held during read | No | Yes |
| Conflict detection | At write time (version check) | At read time (lock acquisition) |
| Throughput under low contention | High (no lock overhead) | Lower (lock overhead even when unnecessary) |
| Throughput under high contention | Degrades (many retries) | Stable (queue on lock) |
| Deadlock risk | None | Possible |
| Best for | Read-heavy, low contention (menu browsing) | Write-heavy, high contention (order assignment) |
| Implementation | Application-level version column | Database-level SELECT FOR UPDATE |

MVCC (Multi-Version Concurrency Control)

How PostgreSQL and MySQL InnoDB handle concurrent reads and writes without readers blocking writers.

Core idea: Instead of locking rows for reads, the database keeps multiple versions of each row. Each transaction sees a snapshot of the database as of its start time.

 Time    Transaction A (Read)         Transaction B (Write)
 ────    ────────────────────         ─────────────────────
 T1      BEGIN (snapshot at T1)
 T2                                   BEGIN
 T3                                   UPDATE orders SET amount = 399
                                        WHERE id = 1;
                                      -- Creates new version (amount=399)
                                      -- Old version (amount=299) kept
 T4      SELECT amount FROM orders
           WHERE id = 1;
         -- Sees 299 (snapshot at T1)
         -- Does NOT block, does NOT wait
 T5                                   COMMIT;
 T6      SELECT amount FROM orders
           WHERE id = 1;
         -- Still sees 299 (Repeatable Read)
         -- OR sees 399 (Read Committed)
 T7      COMMIT;

How it works in PostgreSQL:

  • Each row has hidden columns: xmin (transaction that created it), xmax (transaction that deleted/updated it)
  • A read transaction checks: "Is this row version visible to my snapshot?"
  • Old versions are cleaned up by VACUUM

Key benefit: Readers never block writers. Writers never block readers. Only writer-writer conflicts need locking.
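The visibility check above can be sketched as a predicate over `xmin`/`xmax`. This is a deliberate simplification: it treats every transaction id at or below the snapshot as committed, ignoring the in-progress and aborted transaction lists a real MVCC implementation tracks:

```typescript
interface RowVersion {
  xmin: number;        // txn id that created this version
  xmax: number | null; // txn id that superseded/deleted it (null = live)
  amount: number;
}

// A version is visible to a snapshot if it was created before the
// snapshot and not yet superseded as of the snapshot.
function visibleAt(v: RowVersion, snapshotTxnId: number): boolean {
  return v.xmin <= snapshotTxnId && (v.xmax === null || v.xmax > snapshotTxnId);
}
```

This is why readers never block: deciding visibility requires comparing transaction ids, not acquiring the writer's lock.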


7. Deadlock Detection

What Causes Deadlocks

A deadlock occurs when two or more transactions are each waiting for a lock held by the other, creating a circular wait. Four conditions must all be true (Coffman conditions):

| Condition | Meaning |
|-----------|---------|
| Mutual Exclusion | The resource (row lock) cannot be shared |
| Hold and Wait | A transaction holds one lock while waiting for another |
| No Preemption | Locks cannot be forcibly taken from a transaction |
| Circular Wait | T1 waits for T2, T2 waits for T1 (or longer cycle) |

Deadlock Example: Two Concurrent Order Updates

 Transaction A                           Transaction B
 ─────────────                           ─────────────
 BEGIN;                                  BEGIN;

 -- Lock order #1                        -- Lock order #2
 UPDATE orders SET status = 'PREPARING'  UPDATE orders SET status = 'PREPARING'
   WHERE order_id = 1;                     WHERE order_id = 2;
 -- Holds lock on order #1               -- Holds lock on order #2

 -- Now needs order #2                   -- Now needs order #1
 UPDATE orders SET status = 'PREPARING'  UPDATE orders SET status = 'PREPARING'
   WHERE order_id = 2;                     WHERE order_id = 1;
 -- BLOCKED (B holds lock on #2)         -- BLOCKED (A holds lock on #1)

 -- ⚠ DEADLOCK: A waits for B, B waits for A

Detection: Wait-For Graph

The database builds a directed graph of "Transaction X waits for Transaction Y". If the graph contains a cycle, a deadlock exists.

 Wait-For Graph:

 T_A ──────→ T_B
  ↑           │
  └───────────┘      ← CYCLE DETECTED

 Resolution: Kill one transaction (the "victim"), ROLLBACK it,
             let the other proceed.

PostgreSQL detects deadlocks automatically (checks every deadlock_timeout, default 1 second). The victim transaction gets:

ERROR: deadlock detected
DETAIL: Process 1234 waits for ShareLock on transaction 5678;
        blocked by process 9012.
        Process 9012 waits for ShareLock on transaction 1234;
        blocked by process 1234.
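The cycle check itself is a depth-first search over the wait-for graph. A sketch with a hypothetical `hasDeadlock` helper (a real engine would also pick a victim transaction to roll back):

```typescript
// Wait-for graph: each transaction maps to the transactions it is
// waiting on. A deadlock exists iff the graph contains a cycle.
type WaitForGraph = Map<string, string[]>;

function hasDeadlock(graph: WaitForGraph): boolean {
  const onPath = new Set<string>(); // txns on the current DFS path
  const done = new Set<string>();   // txns proven cycle-free

  function dfs(txn: string): boolean {
    if (onPath.has(txn)) return true; // back edge: we looped to ourselves
    if (done.has(txn)) return false;
    onPath.add(txn);
    for (const next of graph.get(txn) ?? []) {
      if (dfs(next)) return true;
    }
    onPath.delete(txn);
    done.add(txn);
    return false;
  }

  for (const txn of graph.keys()) {
    if (dfs(txn)) return true;
  }
  return false;
}
```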

Prevention Strategies

1. Consistent Lock Ordering

Always acquire locks in the same order (e.g., by ascending order_id).

sql
-- Both transactions lock order #1 first, then #2
-- Transaction A:
BEGIN;
UPDATE orders SET status = 'PREPARING' WHERE order_id = 1;  -- lock #1 first
UPDATE orders SET status = 'PREPARING' WHERE order_id = 2;  -- then #2
COMMIT;

-- Transaction B:
BEGIN;
UPDATE orders SET status = 'PREPARING' WHERE order_id = 1;  -- lock #1 first (waits if A holds it)
UPDATE orders SET status = 'PREPARING' WHERE order_id = 2;  -- then #2
COMMIT;
-- No deadlock: B simply waits for A to finish with #1
ts
// Application-level lock ordering
async function updateMultipleOrders(orderIds: number[], status: string) {
  // Sort to ensure consistent lock order
  const sorted = [...orderIds].sort((a, b) => a - b);

  await db.transaction(async (trx) => {
    for (const id of sorted) {
      await trx.query(
        'UPDATE orders SET status = $1 WHERE order_id = $2',
        [status, id]
      );
    }
  });
}

2. Lock Timeout

Set a maximum wait time. If a lock cannot be acquired within the timeout, the transaction fails instead of waiting forever.

sql
-- Set lock timeout to 5 seconds for this transaction
SET LOCAL lock_timeout = '5s';

BEGIN;
SELECT * FROM orders WHERE order_id = 1 FOR UPDATE;
-- If lock not acquired in 5 seconds:
-- ERROR: canceling statement due to lock timeout
COMMIT;

3. NOWAIT — Fail Immediately

sql
BEGIN;
SELECT * FROM orders WHERE order_id = 1 FOR UPDATE NOWAIT;
-- If locked: ERROR: could not obtain lock on row in relation "orders"
-- Application catches error and retries with backoff
COMMIT;
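The "application catches the error and retries with backoff" half of this pattern can be sketched as a generic wrapper. `withLockRetry` and its parameters are hypothetical names; `op` stands in for any function that throws while the row lock is held (e.g. a transaction using FOR UPDATE NOWAIT):

```typescript
// Retry an operation that fails fast on lock contention, backing off
// exponentially instead of queueing on the lock.
async function withLockRetry<T>(
  op: () => Promise<T>,
  maxRetries = 3,
  baseDelayMs = 50
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    try {
      return await op();
    } catch (err) {
      lastError = err;
      // Exponential backoff: 50 ms, 100 ms, 200 ms, ...
      await new Promise((resolve) => setTimeout(resolve, baseDelayMs * 2 ** attempt));
    }
  }
  throw lastError; // exhausted retries: surface the last lock error
}
```

Compared with plain FOR UPDATE, this trades guaranteed eventual acquisition for bounded waiting, which is usually the right trade for user-facing request handlers.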

4. Reduce Lock Scope and Duration

sql
-- BAD: Lock held while doing slow external call
BEGIN;
SELECT * FROM orders WHERE order_id = 1 FOR UPDATE;
-- ... call payment gateway (2 seconds) ...
UPDATE orders SET status = 'PAID' WHERE order_id = 1;
COMMIT;

-- GOOD: Do external work first, then lock briefly
-- 1. Call payment gateway outside transaction
-- 2. Then:
BEGIN;
SELECT * FROM orders WHERE order_id = 1 FOR UPDATE;
UPDATE orders SET status = 'PAID', payment_ref = 'PAY_123'
  WHERE order_id = 1;
COMMIT;  -- lock held for milliseconds, not seconds

Summary: Deadlock Prevention Checklist

| Strategy | How | Trade-off |
|----------|-----|-----------|
| Consistent lock ordering | Sort resources before locking | Requires discipline across codebase |
| Lock timeout | SET lock_timeout = '5s' | Must handle timeout errors in application |
| NOWAIT | FOR UPDATE NOWAIT | Immediate failure, needs retry logic |
| Reduce lock duration | Move slow operations outside transaction | More complex application flow |
| Use optimistic locking | Version column, no row locks | Higher retry rate under contention |
| Deadlock detection (DB) | Automatic in PostgreSQL/MySQL | One transaction is killed — must retry |

Interview Tips Summary

  1. Schema design questions: Start with 3NF, then discuss when and why you would denormalize.
  2. "How would you optimize this query?": Check execution plan → add composite index (equality cols first) → consider covering index → measure.
  3. "SQL or NoSQL?": Never pick one. Describe polyglot persistence — which data goes where and why.
  4. "How do you handle concurrent updates?": Start with the use case's contention level, then pick optimistic vs pessimistic.
  5. "What isolation level would you use?": Default to Read Committed for most things, Serializable for financial operations.
  6. ACID questions: Use the "place an order" flow — it naturally touches all four properties.
  7. Deadlock questions: Explain the four Coffman conditions, then lead with prevention (lock ordering) over detection.

Frontend interview preparation reference.