04 - DBMS
Why This Matters for Temple
Temple (ex-Zomato founding team) builds food delivery infrastructure where data integrity is non-negotiable. Orders, payments, restaurant menus, delivery tracking -- every piece touches a database. Expect questions on schema design, query optimization, and trade-offs between consistency and performance at scale.
Quick Reference
Scan this table in 5 minutes before your interview.
| Topic | Key Points | Interview Tip |
|---|---|---|
| SQL Joins | INNER (intersection), LEFT (all left + match), RIGHT (all right + match), FULL OUTER (union), CROSS (cartesian), SELF (table with itself) | Draw the Venn diagram; always clarify NULL handling |
| B-Tree Index | Balanced tree, O(log n) lookup, great for range queries and sorting | Default index type in PostgreSQL/MySQL — say this explicitly |
| Hash Index | O(1) exact lookup, useless for ranges | Only mention for equality checks like WHERE id = ? |
| Composite Index | Leftmost prefix rule — (a, b, c) supports queries on a, a,b, a,b,c but NOT b alone | Draw the index tree to explain why prefix matters |
| Normalization | 1NF (atomic) → 2NF (no partial deps) → 3NF (no transitive deps) | Know when to denormalize: read-heavy dashboards, caching |
| ACID | Atomicity (all-or-nothing), Consistency (valid state), Isolation (concurrent txns don't interfere), Durability (committed = permanent) | Use the "place order" flow as your example |
| Isolation Levels | Read Uncommitted → Read Committed → Repeatable Read → Serializable | Know the anomaly each level prevents |
| SQL vs NoSQL | SQL for transactions/relationships; NoSQL for scale/flexibility | Suggest polyglot persistence — use both |
| Optimistic Locking | Version column, no locks held, retry on conflict | Best for low-contention reads (menu browsing) |
| Pessimistic Locking | SELECT FOR UPDATE, holds row lock | Best for high-contention writes (seat/slot booking) |
| Deadlocks | Circular wait between transactions | Prevention: consistent lock ordering, timeouts |
1. SQL Joins
Schema for Examples
CREATE TABLE users (
user_id INT PRIMARY KEY,
name VARCHAR(100),
city VARCHAR(50)
);
CREATE TABLE restaurants (
restaurant_id INT PRIMARY KEY,
name VARCHAR(100),
city VARCHAR(50),
cuisine VARCHAR(50)
);
CREATE TABLE orders (
order_id INT PRIMARY KEY,
user_id INT REFERENCES users(user_id),
restaurant_id INT REFERENCES restaurants(restaurant_id),
amount DECIMAL(10, 2),
status VARCHAR(20),
created_at TIMESTAMP
);
CREATE TABLE delivery_agents (
agent_id INT PRIMARY KEY,
name VARCHAR(100),
order_id INT REFERENCES orders(order_id), -- currently assigned order (NULL if free)
rating DECIMAL(2, 1)
);

INNER JOIN — Only Matching Rows
Returns rows that have a match in both tables. No match = row excluded.
Table A Table B
┌───────┐ ┌───────┐
│ │ │ │
│ ┌───┼─────┼───┐ │
│ │ ██████████ │ │ ← only the overlap
│ └───┼─────┼───┘ │
│ │ │ │
└───────┘ └───────┘

-- Get all orders with the user's name and restaurant name
SELECT
o.order_id,
u.name AS customer,
r.name AS restaurant,
o.amount
FROM orders o
INNER JOIN users u ON o.user_id = u.user_id
INNER JOIN restaurants r ON o.restaurant_id = r.restaurant_id;

When to use: You only want rows where the relationship exists on both sides. Most common join.
LEFT JOIN — All From Left + Matching From Right
Returns all rows from the left table. If no match on the right, those columns are NULL.
Table A Table B
┌───────┐ ┌───────┐
│ │ │ │
│ ██┌███┼█████┼───┐ │
│ ██│ ██████████ │ │ ← all of A, matching B
│ ██└███┼█████┼───┘ │
│ │ │ │
└───────┘ └───────┘

-- Find all users and their orders (including users who never ordered)
SELECT
u.user_id,
u.name,
o.order_id,
o.amount
FROM users u
LEFT JOIN orders o ON u.user_id = o.user_id;
-- To find users who NEVER placed an order:
SELECT u.user_id, u.name
FROM users u
LEFT JOIN orders o ON u.user_id = o.user_id
WHERE o.order_id IS NULL;

When to use: Finding "missing" relationships — users with no orders, restaurants with no reviews.
RIGHT JOIN — All From Right + Matching From Left
Mirror of LEFT JOIN. Returns all rows from the right table.
Table A Table B
┌───────┐ ┌───────┐
│ │ │ │
│ ┌───┼█████┼███┐██ │
│ │ ██████████ │██ │ ← matching A, all of B
│ └───┼█████┼███┘██ │
│ │ │ │
└───────┘ └───────┘

-- Find all restaurants and any orders they received
SELECT
r.restaurant_id,
r.name AS restaurant,
o.order_id,
o.amount
FROM orders o
RIGHT JOIN restaurants r ON o.restaurant_id = r.restaurant_id;

When to use: Rarely in practice — you can always rewrite as a LEFT JOIN by swapping table order. Know it exists for interviews.
FULL OUTER JOIN — All Rows From Both
Returns all rows from both tables. NULLs fill in where there is no match on either side.
Table A Table B
┌───────┐ ┌───────┐
│ │ │ │
│ ██┌███┼█████┼███┐██ │
│ ██│ ██████████ │██ │ ← everything
│ ██└███┼█████┼███┘██ │
│ │ │ │
└───────┘ └───────┘

-- Match delivery agents to orders (show unassigned agents AND undelivered orders)
SELECT
da.agent_id,
da.name AS agent_name,
o.order_id,
o.status
FROM delivery_agents da
FULL OUTER JOIN orders o ON da.order_id = o.order_id;

When to use: Reconciliation reports — find orphaned records on either side.
CROSS JOIN — Cartesian Product
Every row in A paired with every row in B. No ON clause. Result: |A| * |B| rows.
A: [1, 2] B: [x, y, z]
Result: (1,x) (1,y) (1,z) (2,x) (2,y) (2,z) → 2 × 3 = 6 rows

-- Generate all possible restaurant-cuisine pairings for a recommendation matrix
SELECT
r.name AS restaurant,
c.cuisine
FROM restaurants r
CROSS JOIN (
SELECT DISTINCT cuisine FROM restaurants
) c;

When to use: Generating combinations, test data, or pairing every item with every category. Be careful with large tables — row count explodes.
SELF JOIN — Table Joined With Itself
The same table appears twice with different aliases.
-- Find delivery agents in the same city (peer comparison for routing)
-- Assume agents have a city column
SELECT
a1.name AS agent_1,
a2.name AS agent_2,
a1.rating AS rating_1,
a2.rating AS rating_2
FROM delivery_agents a1
INNER JOIN delivery_agents a2
ON a1.city = a2.city           -- peers must be in the same city
AND a1.agent_id < a2.agent_id; -- avoid duplicates and self-pairs
-- Find users who referred other users (users table has referred_by column)
SELECT
referrer.name AS referrer,
referred.name AS referred_user
FROM users referred
INNER JOIN users referrer ON referred.referred_by = referrer.user_id;

When to use: Hierarchical data (manager-employee), finding pairs within the same table, graph-like relationships stored in a single table.
2. Indexing
B-Tree Index
The default index type in PostgreSQL and MySQL (InnoDB). A balanced tree structure where:
- Internal nodes store keys and pointers to child nodes
- Leaf nodes store keys and pointers to actual rows (or the row data in clustered indexes)
- All leaf nodes are at the same depth — guarantees O(log n) lookup
[50]
/ \
[20, 35] [70, 85]
/ | \ / | \
[10] [25] [40] [60] [75] [90] ← leaf nodes (linked list for range scans)

How it supports operations:
- Exact match: WHERE user_id = 42 — traverse tree, O(log n)
- Range query: WHERE created_at BETWEEN '2025-01-01' AND '2025-01-31' — find start leaf, scan linked list
- Sorting: ORDER BY created_at — leaves already sorted, sequential scan
- Prefix matching: WHERE name LIKE 'Pizza%' — tree is ordered, find first match and scan
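All four operations reduce to the same two moves: descend to a position in sorted order, then scan forward. A toy TypeScript sketch of that idea (a sorted array stands in for the leaf level, `lowerBound` plays the role of the O(log n) tree descent; all names are illustrative, not a real index implementation):

```typescript
// Toy model of a B-Tree's leaf level: keys kept in sorted order.
const createdAt: string[] = [
  "2025-01-03", "2025-01-10", "2025-01-17", "2025-02-01", "2025-02-14",
];

// O(log n): find the first index whose key >= target (the "tree descent").
function lowerBound(keys: string[], target: string): number {
  let lo = 0, hi = keys.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (keys[mid] < target) lo = mid + 1; else hi = mid;
  }
  return lo;
}

// Exact match: descend, then check the key at that position.
function exactMatch(keys: string[], key: string): boolean {
  const i = lowerBound(keys, key);
  return i < keys.length && keys[i] === key;
}

// Range query: descend once, then scan sequentially (the "linked-list walk").
function rangeScan(keys: string[], from: string, to: string): string[] {
  const out: string[] = [];
  for (let i = lowerBound(keys, from); i < keys.length && keys[i] < to; i++) {
    out.push(keys[i]);
  }
  return out;
}
```

Prefix matching works the same way: seek to the first key >= 'Pizza', then scan while keys still start with 'Pizza'.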
-- Create a B-Tree index (default, so USING btree is optional)
CREATE INDEX idx_orders_created_at ON orders(created_at);
-- This query benefits from the index:
SELECT * FROM orders
WHERE created_at >= '2025-01-01' AND created_at < '2025-02-01'
ORDER BY created_at DESC;

Hash Index
A hash table: key → bucket → row pointer. O(1) average lookup for exact equality only.
Key: "user_42" → hash("user_42") = 7 → Bucket 7 → Row pointer

Limitations:
- No range queries (hash destroys ordering)
- No sorting
- No prefix matching
- Not crash-safe in some older MySQL versions
-- Create a Hash index (PostgreSQL)
CREATE INDEX idx_users_email_hash ON users USING hash(email);
-- This benefits:
SELECT * FROM users WHERE email = 'aayush@temple.io';
-- This does NOT benefit (range query):
SELECT * FROM users WHERE email > 'a' AND email < 'b'; -- full scan

When to use: High-volume exact lookups — session tokens, API key lookups, cache keys.
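The trade-off is easy to demonstrate with a plain Map as a stand-in for the hash index (an illustrative sketch, not a real index implementation):

```typescript
// Toy hash index: a Map gives O(1) average-case equality lookup, but its
// iteration order bears no relation to key order, so a "range query" has
// no position to seek to and degenerates into scanning every entry.
const emailIndex = new Map<string, number>(); // email -> row id
emailIndex.set("aayush@temple.io", 42);
emailIndex.set("zara@temple.io", 7);

// Equality: one hash computation, one bucket probe.
function findByEmail(email: string): number | undefined {
  return emailIndex.get(email);
}

// Range over a hash index: nothing to seek into — full scan.
function findEmailsBetween(from: string, to: string): number[] {
  const out: number[] = [];
  for (const [email, id] of emailIndex) {
    if (email > from && email < to) out.push(id); // touches every entry
  }
  return out;
}
```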
Composite Index
An index on multiple columns. The leftmost prefix rule determines which queries can use it.
CREATE INDEX idx_orders_user_date ON orders(user_id, created_at);

Think of it as a phone book sorted by (last_name, first_name):
Index sorted as:
(user_1, 2025-01-01)
(user_1, 2025-01-15)
(user_1, 2025-02-03)
(user_2, 2025-01-05)
(user_2, 2025-01-20)
...

Leftmost prefix rule — which queries use this index?
| Query | Uses Index? | Why |
|---|---|---|
| WHERE user_id = 1 | Yes | Leftmost column |
| WHERE user_id = 1 AND created_at > '2025-01-01' | Yes | Full prefix |
| WHERE created_at > '2025-01-01' | No | Skips leftmost column |
| WHERE user_id = 1 ORDER BY created_at | Yes | Prefix + sort on next column |
| WHERE created_at > '2025-01-01' AND user_id = 1 | Yes | Optimizer reorders conditions |
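The rule can be sketched by modeling the index as rows sorted by (user_id, created_at): fixing user_id gives one contiguous block to seek into, while filtering on created_at alone leaves matches scattered across every user's block. A toy sketch (names are illustrative):

```typescript
// Composite index modeled as entries sorted by (user_id, created_at).
type Entry = { userId: number; createdAt: string };
const idx: Entry[] = [
  { userId: 1, createdAt: "2025-01-01" },
  { userId: 1, createdAt: "2025-01-15" },
  { userId: 1, createdAt: "2025-02-03" },
  { userId: 2, createdAt: "2025-01-05" },
  { userId: 2, createdAt: "2025-01-20" },
];

// Leftmost column fixed: matching rows are contiguous, so the scan can
// stop as soon as it leaves that user's block (stand-in for seek + scan).
function seekUserAfter(userId: number, after: string): Entry[] {
  const out: Entry[] = [];
  for (const e of idx) {
    if (e.userId > userId) break; // past the contiguous block: stop early
    if (e.userId === userId && e.createdAt > after) out.push(e);
  }
  return out;
}

// created_at alone: matches are scattered — every row must be touched.
function scanByDateOnly(after: string): Entry[] {
  return idx.filter((e) => e.createdAt > after);
}
```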
Covering Index
An index that contains all columns the query needs. The database serves the query entirely from the index without touching the actual table ("index-only scan").
-- Covering index for a common dashboard query
CREATE INDEX idx_orders_covering
ON orders(user_id, created_at, amount, status);
-- This query is fully covered — no table lookup needed:
SELECT user_id, created_at, amount, status
FROM orders
WHERE user_id = 42 AND created_at > '2025-01-01';
-- EXPLAIN will show "Index Only Scan" (PostgreSQL) or "Using index" (MySQL)

Trade-off: Wider indexes use more disk and slow down writes. Only create covering indexes for hot queries.
When NOT to Index
| Scenario | Why Indexing Hurts |
|---|---|
| Low cardinality columns (status with 3 values) | Index scan not much better than full scan; optimizer may ignore it |
| Frequently updated columns | Every write must update the index too |
| Small tables (< 1000 rows) | Full scan is fast enough; index overhead not worth it |
| Columns used with functions | WHERE UPPER(name) = 'PIZZA' cannot use index on name (use expression index instead) |
| Write-heavy tables with many indexes | Each INSERT/UPDATE touches every index — write amplification |
Practical Example: Optimizing "Find Orders by User in Date Range"
-- The query we want to optimize:
SELECT order_id, amount, status, created_at
FROM orders
WHERE user_id = 42
AND created_at BETWEEN '2025-01-01' AND '2025-01-31'
ORDER BY created_at DESC;
-- Step 1: Check current execution plan
EXPLAIN ANALYZE
SELECT order_id, amount, status, created_at
FROM orders
WHERE user_id = 42
AND created_at BETWEEN '2025-01-01' AND '2025-01-31'
ORDER BY created_at DESC;
-- Result: Seq Scan on orders, execution time: ~15000 ms
-- Step 2: Add composite index (user_id first for equality, created_at for range + sort)
CREATE INDEX idx_orders_user_date ON orders(user_id, created_at);
-- Result: Index Scan, execution time: ~3 ms
-- Step 3: Make it a covering index to avoid table lookup
CREATE INDEX idx_orders_user_date_covering
ON orders(user_id, created_at) INCLUDE (order_id, amount, status);
-- Result: Index Only Scan, execution time: ~1 ms
-- Note: INCLUDE is PostgreSQL syntax; in MySQL, append the extra columns to the index key list instead

Key takeaway: Equality columns first, then range/sort columns, then INCLUDE non-filtered columns.
3. Normalization
1NF — Atomic Values, No Repeating Groups
Every cell holds a single value. No arrays, no comma-separated lists, no nested tables.
Violation:
| order_id | items |
|---|---|
| 1 | "Burger, Fries, Cola" |
| 2 | "Pizza, Garlic Bread" |
1NF Fix: Break items into separate rows or a separate table.
-- Separate order_items table
CREATE TABLE order_items (
item_id INT PRIMARY KEY,
order_id INT REFERENCES orders(order_id),
item_name VARCHAR(100),
quantity INT,
price DECIMAL(10, 2)
);

2NF — No Partial Dependencies
All non-key attributes must depend on the entire primary key, not just part of it. Only relevant when you have a composite primary key.
Violation: Composite key is (order_id, item_id)
| order_id | item_id | item_name | item_price | order_date | customer_name |
|---|---|---|---|---|---|
| 1 | 101 | Burger | 8.99 | 2025-01-15 | Aayush |
Problem: order_date and customer_name depend only on order_id, not on (order_id, item_id).
2NF Fix: Split into two tables.
-- Orders table: attributes that depend on order_id alone
CREATE TABLE orders (
order_id INT PRIMARY KEY,
order_date TIMESTAMP,
customer_name VARCHAR(100)
);
-- Order items: attributes that depend on the full composite key
CREATE TABLE order_items (
order_id INT REFERENCES orders(order_id),
item_id INT,
item_name VARCHAR(100),
item_price DECIMAL(10, 2),
quantity INT,
PRIMARY KEY (order_id, item_id)
);

3NF — No Transitive Dependencies
No non-key attribute should depend on another non-key attribute. Every non-key column depends directly on the primary key.
Violation:
| order_id | restaurant_id | restaurant_name | restaurant_city |
|---|---|---|---|
| 1 | 10 | Pizza Palace | Mumbai |
Problem: restaurant_name and restaurant_city depend on restaurant_id, not on order_id directly. That is a transitive dependency: order_id → restaurant_id → restaurant_name.
3NF Fix:
-- Orders only reference restaurant_id
CREATE TABLE orders (
order_id INT PRIMARY KEY,
user_id INT REFERENCES users(user_id),
restaurant_id INT REFERENCES restaurants(restaurant_id),
amount DECIMAL(10, 2),
created_at TIMESTAMP
);
-- Restaurant details live in their own table
CREATE TABLE restaurants (
restaurant_id INT PRIMARY KEY,
restaurant_name VARCHAR(100),
restaurant_city VARCHAR(50)
);

When to Denormalize
| Scenario | Approach | Example |
|---|---|---|
| Read-heavy dashboards | Materialized views or summary tables | Pre-aggregate daily order counts per restaurant |
| Avoiding expensive joins | Store redundant data | Keep restaurant_name in orders for display queries |
| Caching precomputed data | Denormalized cache table | Store user_total_spent in users table, update on each order |
| Event sourcing / analytics | Flat event tables | Write-once order events with all context embedded |
Rule of thumb: Normalize for writes, denormalize for reads. Start normalized, denormalize when you have a proven performance bottleneck.
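The user_total_spent row from the table above can be sketched as a denormalized read model kept in lockstep with the normalized writes. A toy in-memory version (a real system would update both inside one transaction; all names are illustrative):

```typescript
// Denormalized-counter pattern: keep a precomputed per-user total
// alongside the normalized orders, updating both together so the
// redundant copy never drifts from the source of truth.
type Order = { orderId: number; userId: number; amount: number };

const orders: Order[] = [];                       // normalized source of truth
const userTotalSpent = new Map<number, number>(); // denormalized read model

function placeOrder(order: Order): void {
  // In a real database, both writes would share one transaction.
  orders.push(order);
  userTotalSpent.set(
    order.userId,
    (userTotalSpent.get(order.userId) ?? 0) + order.amount
  );
}

// Reads hit the precomputed value instead of aggregating all orders.
function totalSpent(userId: number): number {
  return userTotalSpent.get(userId) ?? 0;
}
```

This is the write-side cost of denormalization: every order insert now carries an extra update, in exchange for O(1) dashboard reads.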
Full Normalization Example: Food Order
Unnormalized (0NF):
| order_id | customer | phone | items | restaurant | city |
|----------|----------|-----------|------------------------------|------------|--------|
| 1 | Aayush | 999000111 | Burger:8.99, Fries:3.99 | Pizza Hut | Mumbai |
| 2 | Aayush | 999000111 | Margherita:12.99 | Pizza Hut | Mumbai |

1NF — Atomic values:
| order_id | customer | phone | item | price | restaurant | city |
|----------|----------|-----------|------------|-------|------------|--------|
| 1 | Aayush | 999000111 | Burger | 8.99 | Pizza Hut | Mumbai |
| 1 | Aayush | 999000111 | Fries | 3.99 | Pizza Hut | Mumbai |
| 2 | Aayush | 999000111 | Margherita | 12.99 | Pizza Hut | Mumbai |

2NF — Remove partial dependencies (key: order_id, item):
orders: (order_id, customer, phone, restaurant, city)
order_items: (order_id, item, price)

3NF — Remove transitive dependencies:
users: (user_id, customer, phone)
restaurants: (restaurant_id, restaurant, city)
orders: (order_id, user_id, restaurant_id)
order_items: (order_id, item, price)

4. ACID Properties & Isolation Levels
ACID with Food Delivery Examples
Atomicity — All or Nothing
Placing an order involves: deduct wallet balance + create order + notify restaurant. If the restaurant notification fails, everything rolls back — you do not lose money without an order.
BEGIN;
UPDATE wallets SET balance = balance - 299.00 WHERE user_id = 42;
INSERT INTO orders (user_id, restaurant_id, amount, status)
VALUES (42, 10, 299.00, 'CONFIRMED');
INSERT INTO notifications (restaurant_id, message)
VALUES (10, 'New order #1234');
-- If any statement fails, ROLLBACK undoes everything
COMMIT;

Consistency — Valid State Before and After
Database constraints ensure we never end up in an invalid state. A negative wallet balance, an order referencing a non-existent restaurant, or a duplicate payment — constraints catch these.
ALTER TABLE wallets ADD CONSTRAINT positive_balance CHECK (balance >= 0);
-- Now: UPDATE wallets SET balance = balance - 10000 WHERE balance = 50;
-- Fails with CHECK constraint violation — atomicity rolls back the transaction

Isolation — Concurrent Transactions Don't Interfere
Two users ordering the last available "Chef's Special" at the same time should not both succeed.
Durability — Committed = Permanent
Once the user sees "Order Confirmed", that data survives a server crash. Achieved through write-ahead logging (WAL) — changes are written to a durable log before being applied.
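The log-first discipline can be sketched in a few lines: append to the durable log, then apply to the data pages, and recover after a crash by replaying the log. A toy in-memory version (arrays and Maps stand in for disk; names are illustrative):

```typescript
// Toy write-ahead log: every change is appended to the log BEFORE being
// applied to the table, so a crash that wipes the in-memory table can be
// repaired by replaying the log from the start.
type WalRecord = { key: string; value: string };

const wal: WalRecord[] = [];           // stand-in for the on-disk WAL
let table = new Map<string, string>(); // stand-in for data pages in memory

function commit(key: string, value: string): void {
  wal.push({ key, value }); // step 1: log first — this is the durability point
  table.set(key, value);    // step 2: apply to the table
}

// Crash recovery: rebuild the table state purely from the log.
function recover(): Map<string, string> {
  const rebuilt = new Map<string, string>();
  for (const rec of wal) rebuilt.set(rec.key, rec.value);
  return rebuilt;
}
```

Sequential appends to the log are much cheaper than random-access page writes, which is why databases can acknowledge a commit as soon as the log record is flushed.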
Isolation Levels
| Level | Dirty Read | Non-Repeatable Read | Phantom Read | Performance |
|---|---|---|---|---|
| Read Uncommitted | Possible | Possible | Possible | Fastest |
| Read Committed | Prevented | Possible | Possible | Fast |
| Repeatable Read | Prevented | Prevented | Possible | Moderate |
| Serializable | Prevented | Prevented | Prevented | Slowest |
The Three Anomalies Explained
Dirty Read — Reading uncommitted data
Transaction A Transaction B
───────────── ─────────────
BEGIN;
UPDATE orders SET status =
'CANCELLED' WHERE id = 1;
BEGIN;
SELECT status FROM orders
WHERE id = 1;
-- Reads 'CANCELLED' ← DIRTY READ
ROLLBACK;
-- Order is actually still CONFIRMED, but B acted on stale/wrong data

Non-Repeatable Read — Same query, different result
Transaction A Transaction B
───────────── ─────────────
BEGIN;
SELECT amount FROM orders
WHERE id = 1;
-- Returns 299.00
BEGIN;
UPDATE orders SET amount = 399.00
WHERE id = 1;
COMMIT;
SELECT amount FROM orders
WHERE id = 1;
-- Returns 399.00 ← different! NON-REPEATABLE READ
COMMIT;

Phantom Read — New rows appear
Transaction A Transaction B
───────────── ─────────────
BEGIN;
SELECT COUNT(*) FROM orders
WHERE restaurant_id = 10;
-- Returns 5
BEGIN;
INSERT INTO orders (restaurant_id, ...)
VALUES (10, ...);
COMMIT;
SELECT COUNT(*) FROM orders
WHERE restaurant_id = 10;
-- Returns 6 ← PHANTOM ROW
COMMIT;

Which Level to Use?
| Use Case | Recommended Level | Why |
|---|---|---|
| Analytics / reporting queries | Read Committed | Stale data is fine, need performance |
| Order placement & payment | Repeatable Read | Must see consistent snapshot during transaction |
| Inventory / last-item booking | Serializable | Must prevent phantom reads on availability checks |
| Default in PostgreSQL | Read Committed | Good balance for most workloads |
| Default in MySQL InnoDB | Repeatable Read | Stronger default, uses MVCC snapshots |
5. SQL vs NoSQL Decision Framework
Comparison Table
| Dimension | SQL (PostgreSQL, MySQL) | NoSQL (MongoDB, DynamoDB, Redis) |
|---|---|---|
| Schema | Rigid, predefined | Flexible, schema-on-read |
| ACID | Full support | Varies (some support per-document ACID) |
| Scaling | Vertical (read replicas for horizontal) | Horizontal (sharding built-in) |
| Joins | Native, efficient | Application-level or denormalized |
| Consistency | Strong by default | Eventual (tunable in some) |
| Query Language | SQL (standardized) | Varies per database |
| Best For | Relationships, transactions, complex queries | High throughput, flexible schema, massive scale |
When to Use SQL (Food Delivery Context)
| Data | Why SQL |
|---|---|
| Orders & Payments | ACID transactions — cannot lose money or double-charge |
| User Accounts | Relational: user → addresses → payment methods → orders |
| Restaurant Menus | Structured: categories → items → variants → add-ons |
| Promotions & Coupons | Complex rules: "20% off if order > 500 AND first order this month" |
When to Use NoSQL (Food Delivery Context)
| Data | Database | Why |
|---|---|---|
| Real-time agent location | Redis (geospatial) | Sub-millisecond geo queries, high write throughput |
| Session store | Redis | TTL-based expiry, fast reads |
| Activity feed | MongoDB or DynamoDB | Flexible schema, append-heavy, time-sorted |
| Push notification queue | Redis Streams / Kafka | High throughput, ordered, consumer groups |
| Search & discovery | Elasticsearch | Full-text search, fuzzy matching, faceted filters |
Polyglot Persistence for Food Delivery
┌─────────────────────────────────────────────────────────────┐
│ Food Delivery Platform │
├──────────────┬──────────────┬───────────────┬───────────────┤
│ PostgreSQL │ Redis │ MongoDB │ Elasticsearch │
├──────────────┼──────────────┼───────────────┼───────────────┤
│ Users │ Sessions │ Activity feed │ Restaurant │
│ Orders │ Agent GPS │ Chat messages │ search │
│ Payments │ Rate limits │ Notifications │ Menu search │
│ Restaurants │ Cart (temp) │ Audit logs │ Geo search │
│ Coupons │ Cache layer │ │ │
└──────────────┴──────────────┴───────────────┴───────────────┘

Interview tip: When asked "SQL or NoSQL?", the answer is almost always "both — here's how I'd split the data."
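The routing idea above can be sketched as a thin facade that sends each kind of data to the store suited to it. A toy version, with Maps standing in for PostgreSQL and Redis (all names are illustrative):

```typescript
// Toy polyglot-persistence facade: orders go to the transactional store,
// sessions go to the TTL-friendly cache store.
const sqlOrders = new Map<number, { userId: number; amount: number }>();       // "PostgreSQL"
const redisSessions = new Map<string, { userId: number; expiresAt: number }>(); // "Redis"

function routeOrder(orderId: number, userId: number, amount: number): void {
  sqlOrders.set(orderId, { userId, amount }); // relational, durable, transactional
}

function cacheSession(token: string, userId: number, ttlMs: number): void {
  redisSessions.set(token, { userId, expiresAt: Date.now() + ttlMs }); // TTL-style expiry
}

function sessionUser(token: string): number | null {
  const s = redisSessions.get(token);
  if (!s || s.expiresAt < Date.now()) return null; // expired or missing
  return s.userId;
}
```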
6. Transactions & Locking
Optimistic Locking
Assumes conflicts are rare. No locks held during the transaction. Instead, a version number or timestamp is checked at write time.
-- Step 1: Read the current version
SELECT restaurant_id, menu_version, is_available
FROM menu_items
WHERE item_id = 101;
-- Returns: menu_version = 5, is_available = true
-- Step 2: Update only if version hasn't changed
UPDATE menu_items
SET is_available = false, menu_version = menu_version + 1
WHERE item_id = 101 AND menu_version = 5;
-- Step 3: Check rows affected
-- If 0 rows updated → someone else changed it → RETRY
-- If 1 row updated → success

TypeScript implementation pattern:
// Assumes a pg-style `db` client, a `sleep(ms)` helper, and a MenuItem type defined elsewhere.
async function updateMenuItemWithOptimisticLock(
  itemId: number,
  updates: Partial<MenuItem>,
  maxRetries = 3
): Promise<MenuItem> {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const res = await db.query(
      'SELECT * FROM menu_items WHERE item_id = $1',
      [itemId]
    );
    const item = res.rows[0]; // current row, including menu_version
const result = await db.query(
`UPDATE menu_items
SET is_available = $1, menu_version = menu_version + 1
WHERE item_id = $2 AND menu_version = $3
RETURNING *`,
[updates.isAvailable, itemId, item.menu_version]
);
if (result.rowCount === 1) {
return result.rows[0]; // success
}
// Version mismatch — retry after brief delay
await sleep(50 * (attempt + 1));
}
throw new Error('Optimistic lock failed after max retries');
}

Pessimistic Locking
Assumes conflicts are likely. Acquires a lock before reading, holds it until transaction completes.
-- Lock the row so no other transaction can modify it
BEGIN;
SELECT * FROM orders
WHERE order_id = 1234
FOR UPDATE; -- row-level exclusive lock
-- Safely update — no one else can touch this row
UPDATE orders
SET status = 'DELIVERED', delivered_at = NOW()
WHERE order_id = 1234;
COMMIT; -- lock released

Lock types:
- FOR UPDATE — exclusive lock, blocks other FOR UPDATE and FOR SHARE
- FOR SHARE — shared lock, allows other FOR SHARE but blocks FOR UPDATE
- FOR UPDATE NOWAIT — fail immediately instead of waiting if lock is held
- FOR UPDATE SKIP LOCKED — skip rows that are locked (great for job queues)
-- Job queue pattern: pick next available order for delivery assignment
BEGIN;
SELECT * FROM orders
WHERE status = 'READY_FOR_PICKUP' AND agent_id IS NULL
ORDER BY created_at
LIMIT 1
FOR UPDATE SKIP LOCKED; -- skip orders being assigned by other workers
UPDATE orders SET agent_id = 7, status = 'ASSIGNED'
WHERE order_id = <selected_order_id>;
COMMIT;

Comparison: Optimistic vs Pessimistic
| Dimension | Optimistic | Pessimistic |
|---|---|---|
| Lock held during read | No | Yes |
| Conflict detection | At write time (version check) | At read time (lock acquisition) |
| Throughput under low contention | High (no lock overhead) | Lower (lock overhead even when unnecessary) |
| Throughput under high contention | Degrades (many retries) | Stable (queue on lock) |
| Deadlock risk | None | Possible |
| Best for | Read-heavy, low contention (menu browsing) | Write-heavy, high contention (order assignment) |
| Implementation | Application-level version column | Database-level SELECT FOR UPDATE |
MVCC (Multi-Version Concurrency Control)
How PostgreSQL and MySQL InnoDB handle concurrent reads and writes without readers blocking writers.
Core idea: Instead of locking rows for reads, the database keeps multiple versions of each row. Each transaction sees a snapshot of the database as of its start time.
Time Transaction A (Read) Transaction B (Write)
──── ──────────────────── ─────────────────────
T1 BEGIN (snapshot at T1)
T2 BEGIN
T3 UPDATE orders SET amount = 399
WHERE id = 1;
-- Creates new version (amount=399)
-- Old version (amount=299) kept
T4 SELECT amount FROM orders
WHERE id = 1;
-- Sees 299 (snapshot at T1)
-- Does NOT block, does NOT wait
T5 COMMIT;
T6 SELECT amount FROM orders
WHERE id = 1;
-- Still sees 299 (Repeatable Read)
-- OR sees 399 (Read Committed)
T7   COMMIT;

How it works in PostgreSQL:
- Each row has hidden columns: xmin (transaction that created it), xmax (transaction that deleted/updated it)
- A read transaction checks: "Is this row version visible to my snapshot?"
- Old versions are cleaned up by VACUUM
Key benefit: Readers never block writers. Writers never block readers. Only writer-writer conflicts need locking.
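The xmin/xmax visibility check can be sketched directly: a version is visible to a snapshot if it was created at or before the snapshot and not yet superseded. A toy version chain for one order row (real MVCC also checks commit status and hint bits, omitted here; names are illustrative):

```typescript
// Toy MVCC: writers never overwrite a row — they mark the old version
// dead (set xmax) and append a new one. Readers pick the version visible
// to their snapshot without taking any locks.
type Version = { amount: number; xmin: number; xmax: number };

const INF = Number.MAX_SAFE_INTEGER;
const orderVersions: Version[] = [{ amount: 299, xmin: 1, xmax: INF }];

// Writer: supersede the current version, append the new one.
function update(amount: number, txid: number): void {
  const current = orderVersions[orderVersions.length - 1];
  current.xmax = txid;
  orderVersions.push({ amount, xmin: txid, xmax: INF });
}

// Reader: find the version where xmin <= snapshot < xmax. Never blocks.
function readAt(snapshotTxid: number): number | undefined {
  const v = orderVersions.find(
    (ver) => ver.xmin <= snapshotTxid && snapshotTxid < ver.xmax
  );
  return v?.amount;
}
```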
7. Deadlock Detection
What Causes Deadlocks
A deadlock occurs when two or more transactions are each waiting for a lock held by the other, creating a circular wait. Four conditions must all be true (Coffman conditions):
| Condition | Meaning |
|---|---|
| Mutual Exclusion | The resource (row lock) cannot be shared |
| Hold and Wait | A transaction holds one lock while waiting for another |
| No Preemption | Locks cannot be forcibly taken from a transaction |
| Circular Wait | T1 waits for T2, T2 waits for T1 (or longer cycle) |
Deadlock Example: Two Concurrent Order Updates
Transaction A Transaction B
───────────── ─────────────
BEGIN; BEGIN;
-- Lock order #1 -- Lock order #2
UPDATE orders SET status = 'PREPARING' UPDATE orders SET status = 'PREPARING'
WHERE order_id = 1; WHERE order_id = 2;
-- Holds lock on order #1 -- Holds lock on order #2
-- Now needs order #2 -- Now needs order #1
UPDATE orders SET status = 'PREPARING' UPDATE orders SET status = 'PREPARING'
WHERE order_id = 2; WHERE order_id = 1;
-- BLOCKED (B holds lock on #2) -- BLOCKED (A holds lock on #1)
-- ⚠ DEADLOCK: A waits for B, B waits for A

Detection: Wait-For Graph
The database builds a directed graph of "Transaction X waits for Transaction Y". If the graph contains a cycle, a deadlock exists.
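Cycle detection on a wait-for graph is a depth-first search that flags a back edge, roughly what the database's deadlock detector does. A toy sketch (illustrative, not the actual engine code):

```typescript
// Wait-for graph: an edge means "transaction X waits for transaction Y".
// A cycle anywhere in the graph means a deadlock exists.
type WaitFor = Map<string, string[]>;

function hasDeadlock(graph: WaitFor): boolean {
  const visiting = new Set<string>(); // on the current DFS path
  const done = new Set<string>();     // fully explored, known cycle-free

  function dfs(tx: string): boolean {
    if (visiting.has(tx)) return true; // back edge: circular wait found
    if (done.has(tx)) return false;
    visiting.add(tx);
    for (const waitee of graph.get(tx) ?? []) {
      if (dfs(waitee)) return true;
    }
    visiting.delete(tx);
    done.add(tx);
    return false;
  }

  return [...graph.keys()].some((tx) => dfs(tx));
}
```

A real detector would also pick a victim from the cycle and roll it back, which is what produces the "deadlock detected" error shown below.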
Wait-For Graph:

    T_A ──────→ T_B
     ↑           │
     └───────────┘    ← CYCLE DETECTED

Resolution: Kill one transaction (the "victim"), ROLLBACK it,
let the other proceed.

PostgreSQL detects deadlocks automatically (checks every deadlock_timeout, default 1 second). The victim transaction gets:
ERROR: deadlock detected
DETAIL: Process 1234 waits for ShareLock on transaction 5678;
blocked by process 9012.
Process 9012 waits for ShareLock on transaction 5679;
blocked by process 1234.

Prevention Strategies
1. Consistent Lock Ordering
Always acquire locks in the same order (e.g., by ascending order_id).
-- Both transactions lock order #1 first, then #2
-- Transaction A:
BEGIN;
UPDATE orders SET status = 'PREPARING' WHERE order_id = 1; -- lock #1 first
UPDATE orders SET status = 'PREPARING' WHERE order_id = 2; -- then #2
COMMIT;
-- Transaction B:
BEGIN;
UPDATE orders SET status = 'PREPARING' WHERE order_id = 1; -- lock #1 first (waits if A holds it)
UPDATE orders SET status = 'PREPARING' WHERE order_id = 2; -- then #2
COMMIT;
-- No deadlock: B simply waits for A to finish with #1

// Application-level lock ordering
async function updateMultipleOrders(orderIds: number[], status: string) {
// Sort to ensure consistent lock order
const sorted = [...orderIds].sort((a, b) => a - b);
await db.transaction(async (trx) => {
for (const id of sorted) {
await trx.query(
'UPDATE orders SET status = $1 WHERE order_id = $2',
[status, id]
);
}
});
}

2. Lock Timeout
Set a maximum wait time. If a lock cannot be acquired within the timeout, the transaction fails instead of waiting forever.
-- Set lock timeout to 5 seconds for this transaction
BEGIN;
SET LOCAL lock_timeout = '5s'; -- SET LOCAL only takes effect inside a transaction
SELECT * FROM orders WHERE order_id = 1 FOR UPDATE;
-- If lock not acquired in 5 seconds:
-- ERROR: canceling statement due to lock timeout
COMMIT;

3. NOWAIT — Fail Immediately
BEGIN;
SELECT * FROM orders WHERE order_id = 1 FOR UPDATE NOWAIT;
-- If locked: ERROR: could not obtain lock on row in relation "orders"
-- Application catches error and retries with backoff
COMMIT;

4. Reduce Lock Scope and Duration
-- BAD: Lock held while doing slow external call
BEGIN;
SELECT * FROM orders WHERE order_id = 1 FOR UPDATE;
-- ... call payment gateway (2 seconds) ...
UPDATE orders SET status = 'PAID' WHERE order_id = 1;
COMMIT;
-- GOOD: Do external work first, then lock briefly
-- 1. Call payment gateway outside transaction
-- 2. Then:
BEGIN;
SELECT * FROM orders WHERE order_id = 1 FOR UPDATE;
UPDATE orders SET status = 'PAID', payment_ref = 'PAY_123'
WHERE order_id = 1;
COMMIT; -- lock held for milliseconds, not seconds

Summary: Deadlock Prevention Checklist
| Strategy | How | Trade-off |
|---|---|---|
| Consistent lock ordering | Sort resources before locking | Requires discipline across codebase |
| Lock timeout | SET lock_timeout = '5s' | Must handle timeout errors in application |
| NOWAIT | FOR UPDATE NOWAIT | Immediate failure, needs retry logic |
| Reduce lock duration | Move slow operations outside transaction | More complex application flow |
| Use optimistic locking | Version column, no row locks | Higher retry rate under contention |
| Deadlock detection (DB) | Automatic in PostgreSQL/MySQL | One transaction is killed — must retry |
Interview Tips Summary
- Schema design questions: Start with 3NF, then discuss when and why you would denormalize.
- "How would you optimize this query?": Check execution plan → add composite index (equality cols first) → consider covering index → measure.
- "SQL or NoSQL?": Never pick one. Describe polyglot persistence — which data goes where and why.
- "How do you handle concurrent updates?": Start with the use case's contention level, then pick optimistic vs pessimistic.
- "What isolation level would you use?": Default to Read Committed for most things, Serializable for financial operations.
- ACID questions: Use the "place an order" flow — it naturally touches all four properties.
- Deadlock questions: Explain the four Coffman conditions, then lead with prevention (lock ordering) over detection.