Backend Fundamentals (L4 Reference)
Google probes backend fundamentals less aggressively than Uber or Salesforce, but you still need them for:
- The scoped HLD round (if you get Variant B)
- Coding follow-ups like "what if this ran across 100 machines?"
- The Googleyness round when you explain past work
Keep this compact. Each section is a reference — deeper dives live in Uber backend notes and Paytm HLD.
1. Quick Reference Cheat Sheet
| Concept | Rule of thumb | When it matters |
|---|---|---|
| SQL | Strong consistency, ACID, joins | Transactional data (orders, users, billing) |
| NoSQL | Horizontal scale, schema-flex | High-volume append (events, logs, feeds) |
| Cache-aside | App reads cache, falls back to DB | Most read-heavy services |
| LRU / LFU | Evict least recently / frequently used | Caches with bounded memory |
| Redis | In-memory, rich data structures | Cache, session store, rate limiter, distributed lock |
| Memcached | Simple key-value, no persistence | Pure cache, no ops overhead |
| Kafka | Log-based, replay, high throughput | Event pipelines, analytics, fan-out |
| RabbitMQ | Queue-based, flexible routing | Task queues, RPC, low-throughput |
| REST | Stateless HTTP, easy to debug | External APIs, simple CRUD |
| gRPC | Binary, typed, streaming | Internal service-to-service |
| HTTP/2 | Multiplexing, header compression | Modern web, replaces 1.1 |
| HTTP/3 | QUIC over UDP, no head-of-line blocking | Mobile, lossy networks |
| CAP | Under a network partition, choose Consistency or Availability | Distributed data stores |
| Raft | Leader-based consensus | Distributed state (etcd, Consul) |
| LB L4 | TCP-level, faster, no app awareness | High-throughput, simple routing |
| LB L7 | HTTP-aware, routing by path/header | API gateways, ingress |
| Circuit breaker | Stop calling failing service | Service-to-service resilience |
| Backpressure | Slow down upstream when downstream overloaded | Streaming, queues |
2. Databases
SQL vs NoSQL
SQL (Postgres, MySQL, Spanner):
- Strong consistency (typically)
- ACID transactions across rows
- Rich joins, aggregations, secondary indexes
- Vertical scale first; horizontal scale requires sharding
- Use when: data has relationships, transactions matter, schema is stable
NoSQL (varies):
- Document (MongoDB, Firestore): flexible schema, denormalized reads
- Key-Value (DynamoDB, Bigtable): O(1) by key, horizontal scale built-in
- Wide-column (Cassandra, Bigtable): write-heavy, tunable consistency
- Graph (Neo4j, Neptune): relationship-first queries
- Horizontal scale by default; eventual consistency (tunable)
- Use when: data is append-heavy, schema evolves fast, or data is inherently unrelated
Google-specific: Spanner gives you ACID + global scale + SQL. It's NOT a vanilla SQL database — it uses TrueTime and Paxos under the hood. Bigtable is the flagship wide-column store.
ACID (transaction guarantees)
| Letter | Guarantees |
|---|---|
| A Atomicity | All-or-nothing; partial failures roll back |
| C Consistency | Transaction leaves DB in a valid state (constraints hold) |
| I Isolation | Concurrent transactions don't see each other's uncommitted state |
| D Durability | Committed data survives crashes (fsync) |
Isolation levels (weakest to strongest)
| Level | Prevents | Allows |
|---|---|---|
| Read Uncommitted | Nothing | Dirty reads |
| Read Committed | Dirty reads | Non-repeatable reads, phantoms |
| Repeatable Read | Dirty + non-repeatable reads | Phantoms |
| Serializable | All anomalies | Highest lock cost |
| Snapshot Isolation (MVCC) | Dirty + non-repeatable | Write skew |
Default for Postgres: Read Committed. For MySQL InnoDB: Repeatable Read with phantom protection via gap locks.
Indexes
B-tree index: default for most SQL DBs. O(log n) lookup, range queries supported. Covers WHERE, ORDER BY, range scans.
Hash index: O(1) exact lookup, no range. Useful for equality-only lookups.
Composite index: CREATE INDEX ON orders (user_id, created_at). Leftmost prefix rule — can be used for WHERE user_id = ? and WHERE user_id = ? AND created_at > ? but NOT WHERE created_at > ? alone.
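A tiny, purely illustrative checker for the leftmost-prefix rule (the function, its arguments, and the column names are hypothetical, not any database's API):

```python
def can_use_index(index_cols, equality_cols, range_col=None):
    """An index on (c1, c2, ..., cn) serves a query that constrains a
    leftmost prefix of its columns with equality, optionally followed by
    one range predicate on the next column."""
    i = 0
    while i < len(index_cols) and index_cols[i] in equality_cols:
        i += 1
    if not set(equality_cols).issubset(index_cols[:i]):
        return False                      # an equality column fell outside the prefix
    if range_col is not None:
        return i < len(index_cols) and index_cols[i] == range_col
    return i > 0

idx = ["user_id", "created_at"]           # CREATE INDEX ON orders (user_id, created_at)
assert can_use_index(idx, {"user_id"})                    # WHERE user_id = ?
assert can_use_index(idx, {"user_id"}, "created_at")      # WHERE user_id = ? AND created_at > ?
assert not can_use_index(idx, set(), "created_at")        # WHERE created_at > ? alone
```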
Covering index: includes all columns needed by the query, avoiding table lookup. Huge speedup for read-heavy queries.
When to index:
- Columns in WHERE, JOIN ON, ORDER BY, GROUP BY
- Columns with high selectivity (few duplicates)
- Columns read much more than written (indexes slow writes)
When NOT to index:
- Low-selectivity columns (boolean, status with 3 values)
- Tables with high write volume and rare reads
- Columns that are updated frequently
Sharding
Horizontal partitioning — split rows across multiple DB instances by a shard key.
Strategies:
- Range-based: user_id 1-1M on shard A, 1M-2M on shard B. Simple, but hotspots if access is skewed.
- Hash-based: hash(user_id) % N. Uniform distribution, but resharding is painful when N changes.
- Consistent hashing: minimizes data movement on rebalance. Industry standard for elastic clusters (see the sketch after this list).
- Directory-based: central lookup service maps keys to shards. Flexible, but the lookup is an extra hop + SPOF risk.
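A minimal consistent-hash ring sketch in Python; the virtual-node count, MD5 hash, and shard names are illustrative choices, not a specific library:

```python
import bisect
import hashlib

def _hash(key: str) -> int:
    # MD5 used only as a stable, well-spread illustration hash
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class ConsistentHashRing:
    """Each node occupies many points ("virtual nodes") on a ring; a key maps
    to the first node clockwise from its hash. Adding or removing one node
    only moves the keys between that node and its ring neighbors."""

    def __init__(self, nodes, vnodes=100):
        self._ring = sorted((_hash(f"{node}#{i}"), node)
                            for node in nodes for i in range(vnodes))
        self._points = [point for point, _ in self._ring]

    def node_for(self, key: str) -> str:
        idx = bisect.bisect(self._points, _hash(key)) % len(self._ring)
        return self._ring[idx][1]

ring = ConsistentHashRing(["shard-a", "shard-b", "shard-c"])
ring.node_for("user:12345")   # stable assignment until ring membership changes
```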
Trade-offs:
- Cross-shard queries require scatter-gather — slow, expensive
- Cross-shard transactions require 2PC — complex, blocking
- Hot partition detection (monitor per-shard QPS) — rebalance before the shard melts
Replication
Leader-follower (primary-replica):
- Writes go to leader, leader replicates to followers
- Reads can go to followers (read replicas) — reduces leader load
- Replication lag — followers can be seconds behind (read-your-writes inconsistency)
Multi-leader:
- Multiple writable replicas
- Conflict resolution via last-write-wins, vector clocks, CRDTs
- Use case: multi-region writes where latency matters more than strict consistency
Leaderless (Dynamo-style):
- Data lives on N replicas; a write waits for W acknowledgements, a read queries R replicas
- R + W > N → read and write quorums overlap, so a read sees the latest acknowledged write (e.g., N=3, W=2, R=2)
- Used by Cassandra, DynamoDB
Consistency models
| Model | Guarantee |
|---|---|
| Strong (linearizable) | Read sees the latest write, globally |
| Sequential | All clients see operations in the same order, but not necessarily real-time |
| Causal | Operations with causal dependency are ordered; unrelated ones may not be |
| Eventual | All replicas converge eventually; no ordering guarantee |
Most distributed SQL databases (Spanner, CockroachDB) offer linearizability. Most NoSQL stores default to eventual or tunable consistency.
3. Caching
Patterns
Cache-aside (lazy loading):
- read: app → cache; on miss, app reads the DB, writes the result to the cache, returns it
- write: app → DB, then invalidate the cache entry
Most common pattern (sketched below). Resilient (a cache failure doesn't break the app), but cold starts hurt.
Write-through:
- write: app → cache → DB (synchronous); reads always hit the cache
Always-warm cache, but writes pay the cache cost even if the key is never read.
Write-behind (write-back):
- write: app → cache, with an async flush to the DB
Fastest writes, but durability risk if the cache crashes before flushing.
Read-through:
- the cache acts as a DB proxy: the app only talks to the cache, which pulls from the DB on a miss
Simpler app code, but couples cache and DB tightly.
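A cache-aside sketch in Python, assuming a Redis-like client (get / set / delete) and hypothetical db.load_user / db.update_user helpers:

```python
import json

TTL_SECONDS = 300  # bound staleness even if an invalidation is missed

def get_user(cache, db, user_id):
    key = f"user:{user_id}"
    cached = cache.get(key)                       # 1. try the cache
    if cached is not None:
        return json.loads(cached)
    user = db.load_user(user_id)                  # 2. miss: read the source of truth
    if user is not None:
        cache.set(key, json.dumps(user), ex=TTL_SECONDS)   # 3. populate for later readers
    return user

def update_user(cache, db, user_id, fields):
    db.update_user(user_id, fields)               # write to the DB first
    cache.delete(f"user:{user_id}")               # then invalidate; the next read repopulates
```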
Eviction
- LRU (Least Recently Used): evict the entry accessed longest ago. Simple, good for temporal locality.
- LFU (Least Frequently Used): evict the entry accessed least total times. Good for skewed frequency.
- FIFO: evict oldest by insertion, regardless of access. Rare in production.
- TTL (Time-to-Live): expire after N seconds. Combine with LRU for belt-and-suspenders.
LRU implementation: HashMap + Doubly Linked List. O(1) get, O(1) put. See LRU Cache problem.
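A minimal LRU sketch using Python's OrderedDict; same idea as the HashMap + doubly linked list version, with both get and put in O(1):

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self._data = OrderedDict()          # insertion order doubles as recency order

    def get(self, key):
        if key not in self._data:
            return None
        self._data.move_to_end(key)         # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        if key in self._data:
            self._data.move_to_end(key)
        self._data[key] = value
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # evict the least recently used entry

cache = LRUCache(2)
cache.put("a", 1); cache.put("b", 2)
cache.get("a")                              # "a" becomes most recent
cache.put("c", 3)                           # evicts "b"
```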
Redis vs Memcached
| Feature | Redis | Memcached |
|---|---|---|
| Data structures | Strings, lists, sets, hashes, sorted sets, streams | Strings only |
| Persistence | RDB snapshots + AOF | None (pure cache) |
| Replication | Yes (primary-replica) | No native |
| Clustering | Redis Cluster (sharded) | Client-side sharding |
| Use case | Cache + session + queue + counters + pub/sub | Simple cache, minimal ops |
Rule: start with Redis unless you have a specific reason to use Memcached. The feature surface is worth the complexity.
Cache invalidation
"There are only two hard things in computer science: cache invalidation and naming things." — Phil Karlton
Strategies:
- TTL-based: accept stale reads up to N seconds. Simple, no coordination. Use for non-critical data.
- Event-based: DB publishes invalidation events on write. Cache consumes and evicts. Complex, fast.
- Version-based: cache key includes a version (user:123:v7). On write, bump the version → the old key becomes garbage. Clean, but leaks memory until TTL.
4. Concurrency
Threads vs processes
- Threads: share memory, cheaper to spawn, race-condition prone
- Processes: isolated memory, IPC via pipes/sockets, heavier
Most backend services run many concurrent tasks in a single process: threads in Java, goroutines in Go, an async event loop in Python. Use processes for true isolation (e.g., sandboxing untrusted code).
Locks
- Mutex: one thread at a time holds the lock.
- RWLock: many readers OR one writer. Good for read-heavy data.
- Spinlock: busy-wait, no syscall. Only for very short critical sections.
- Semaphore: counter-based, allows N concurrent holders. Good for resource pools.
Atomics
For simple counter updates, atomics (compare-and-swap, fetch-and-add) are faster than locks. std::atomic<int> in C++, AtomicInteger in Java, sync/atomic in Go.
Rule: atomics for single-variable updates, locks for multi-variable critical sections.
Race conditions
Three classic races:
- Check-then-act: if (!map.contains(k)) map.put(k, v); — two threads both check, both put. Fix: putIfAbsent.
- Read-modify-write: counter = counter + 1 is three operations (read, add, write). Fix: an atomic increment (AtomicInteger, fetch-and-add).
- Iteration + mutation: iterating a list while another thread inserts. Fix: snapshot iteration, or a concurrent collection.
The first two fixes are sketched below.
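A small Python illustration of the first two fixes; Python has no standard-library atomic integer, so the read-modify-write fix below uses a lock (the shape of the fix, not the exact API, is the point):

```python
import threading

results = {}
counter = 0
counter_lock = threading.Lock()

def record_first(key, value):
    # Check-then-act fix: setdefault performs the check and the insert as one
    # dict operation, so two threads can't both "win" for the same key.
    return results.setdefault(key, value)

def increment():
    # Read-modify-write fix: counter += 1 is read + add + write, so guard it.
    global counter
    with counter_lock:
        counter += 1

threads = [threading.Thread(target=increment) for _ in range(100)]
for t in threads:
    t.start()
for t in threads:
    t.join()
assert counter == 100
```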
Deadlock prevention
A deadlock needs all 4 of Coffman's conditions. Break any one:
- Mutual exclusion — hard to remove (some resources must be exclusive)
- Hold and wait — acquire all locks upfront
- No preemption — allow lock timeout and release
- Circular wait — always acquire locks in a global order (most practical fix)
Golden rule: establish a total order on all locks (e.g., alphabetical by name, or numerical by ID) and acquire in that order everywhere. No circular wait, no deadlock.
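A sketch of the global-lock-order rule in Python; the Account class and ordering by numeric id are illustrative:

```python
import threading

class Account:
    def __init__(self, acct_id, balance):
        self.id = acct_id
        self.balance = balance
        self.lock = threading.Lock()

def transfer(src, dst, amount):
    # Always acquire in ascending id order, regardless of transfer direction.
    # Concurrent A->B and B->A transfers then lock in the same order, so there
    # is no circular wait and no deadlock.
    first, second = (src, dst) if src.id < dst.id else (dst, src)
    with first.lock:
        with second.lock:
            src.balance -= amount
            dst.balance += amount
```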
5. Messaging
Kafka vs RabbitMQ
| Feature | Kafka | RabbitMQ |
|---|---|---|
| Model | Log (append-only partitioned stream) | Queue (routed message broker) |
| Throughput | Very high (millions msg/s) | Moderate (tens of thousands msg/s) |
| Retention | Days to forever (configurable) | Until consumed |
| Replay | Built-in (consumer offsets) | No native, requires DLQ trick |
| Ordering | Per-partition | Per-queue |
| Use case | Event streaming, analytics, CDC | Task queues, RPC, flexible routing |
Rule: Kafka for high-throughput event pipelines, RabbitMQ for work queues with flexible routing.
Delivery semantics
| Guarantee | Meaning | Cost |
|---|---|---|
| At-most-once | May lose messages, no duplicates | Cheapest, rare in prod |
| At-least-once | May deliver duplicates, no loss | Default, requires idempotent consumers |
| Exactly-once | One and only one | Expensive, requires transactional producer + idempotent consumer |
L4 expectation: know that at-least-once + idempotency is the standard production pattern. True exactly-once is rare and expensive (Kafka exactly-once semantics via transactional producers, or 2PC).
Idempotency
A consumer is idempotent if processing the same message twice has the same effect as once.
Patterns:
- Idempotency key: include a unique key per message; the consumer checks a dedup store (Redis, DB) before acting (see the sketch after this list)
- Upsert semantics: INSERT ... ON CONFLICT DO UPDATE
- Version/sequence numbers: ignore messages carrying an older version than the current state
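A minimal idempotency-key check in Python, assuming a Redis-like client and a hypothetical process(message) handler:

```python
DEDUP_TTL = 24 * 3600  # keep keys long enough to cover realistic redelivery windows

def handle(cache, message):
    dedup_key = f"processed:{message['idempotency_key']}"
    # SET with NX succeeds only for the first writer of this key, so a
    # redelivered message is detected and skipped.
    first_time = cache.set(dedup_key, "1", nx=True, ex=DEDUP_TTL)
    if not first_time:
        return                  # duplicate delivery: already handled
    process(message)            # hypothetical business logic
```

Note the trade-off: marking the key before processing means a message whose first attempt crashes mid-way gets skipped on redelivery; real systems often record the key only after success, or store the handler's result alongside it.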
Dead Letter Queue (DLQ)
When a message fails processing N times, move it to a DLQ for manual inspection or async retry. Prevents one bad message from blocking the queue.
Pattern:
- Main queue → consumer attempts processing → fail → retry up to N times
- After N failures → move to DLQ
- Alerting on DLQ depth; replay tool to re-enqueue fixed messages
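A sketch of the retry-then-DLQ flow; the broker client (consume / ack / publish) and the attempt-count header are hypothetical:

```python
MAX_ATTEMPTS = 5

def consume_loop(broker, handler):
    for msg in broker.consume("main-queue"):
        attempts = msg.headers.get("x-attempts", 0)
        try:
            handler(msg.body)
            broker.ack(msg)
        except Exception:
            if attempts + 1 >= MAX_ATTEMPTS:
                # Park the poison message; alert on DLQ depth, replay after a fix.
                broker.publish("main-queue.dlq", msg.body, headers=msg.headers)
            else:
                broker.publish("main-queue", msg.body,
                               headers={**msg.headers, "x-attempts": attempts + 1})
            broker.ack(msg)     # remove the original either way so it can't block the queue
```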
6. Networking
HTTP versions
| Version | Key features |
|---|---|
| HTTP/1.1 | Text-based, one request at a time per connection (pipelining exists but is rarely used), head-of-line blocking |
| HTTP/2 | Binary framing, multiplexing over one TCP connection, header compression (HPACK), server push |
| HTTP/3 | Built on QUIC (UDP), no head-of-line blocking at transport layer, faster handshake, better on lossy networks |
L4 rule: know the differences conceptually. In practice, web backends use HTTP/2 or HTTP/3; internal service-to-service often uses gRPC (which runs over HTTP/2).
REST vs gRPC
| Feature | REST (JSON) | gRPC (Protobuf) |
|---|---|---|
| Format | JSON (text) | Protobuf (binary) |
| Schema | Open (OpenAPI optional) | Strictly typed (.proto files) |
| Streaming | Limited (SSE, WebSockets) | First-class (unary, server-stream, client-stream, bidirectional) |
| Debuggability | Easy (curl, browser) | Harder (grpcurl, bloomrpc) |
| Size on wire | Larger | ~5-10x smaller |
| Use case | Public APIs, simple CRUD | Internal microservices, high-throughput, streaming |
DNS
DNS resolves hostnames to IPs.
- A record: hostname → IPv4
- AAAA: hostname → IPv6
- CNAME: alias (hostname → hostname)
- TXT: arbitrary text (used for SPF, DKIM, domain verification)
- MX: mail server records
Resolution flow: stub resolver → recursive resolver (often ISP or 8.8.8.8) → root → TLD → authoritative → answer (cached at each step).
TTL: controls how long a record is cached. Low TTL (60s) for agile changes, high TTL (1h+) for stable records.
TLS
TLS provides confidentiality, integrity, authentication for connections (HTTPS, gRPC, internal mTLS).
Handshake (TLS 1.3, simplified):
- Client Hello: supported cipher suites, client random, ECDHE key share
- Server Hello: chosen cipher, server random, its own key share; the certificate follows, already encrypted
- Client verifies the certificate chain against trusted CAs
- Both sides derive session keys from the ECDHE shared secret; encrypted application data begins
mTLS (mutual TLS): client also presents a certificate. Used for service-to-service auth in zero-trust architectures.
Load Balancers
Layer 4 (TCP):
- Operates on IP + port
- No HTTP awareness (cannot route by URL, header)
- Very fast, used for raw TCP workloads (databases, gRPC before L7 support)
- Examples: AWS NLB, Google Network Load Balancer
Layer 7 (HTTP):
- Terminates HTTPS, routes by path, header, cookie
- Can do URL rewriting, header injection, rate limiting
- Slower than L4 but vastly more flexible
- Examples: nginx, HAProxy, AWS ALB, GCP HTTPS LB, Envoy
Algorithms (a small picker sketch follows this list):
- Round-robin: each request to the next server
- Least-connections: route to the server with fewest active connections
- IP-hash / consistent-hash: same client always goes to the same backend (sticky sessions, cache locality)
- Weighted: assign different capacities to different backends
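A toy picker for two of these strategies; a real load balancer tracks live connection counts per backend, here it is just an in-memory dict:

```python
import itertools

class Balancer:
    def __init__(self, backends):
        self.backends = list(backends)
        self._rr = itertools.cycle(self.backends)
        self.active = {b: 0 for b in self.backends}   # open connections per backend

    def round_robin(self):
        return next(self._rr)

    def least_connections(self):
        return min(self.backends, key=lambda b: self.active[b])

lb = Balancer(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
server = lb.least_connections()
lb.active[server] += 1     # the caller tracks connection open/close
```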
7. Distributed Systems
CAP Theorem
"In a distributed system experiencing a network partition, you must choose between Consistency and Availability."
- CP systems: refuse requests during a partition rather than return stale data. Examples: Zookeeper, etcd, Spanner (technically CP, marketed as "effectively CA" because of its availability SLA).
- AP systems: always respond, even with stale data. Examples: Cassandra (tunable), DynamoDB (tunable), Couchbase.
L4 rule: in practice, CAP is a spectrum, not a binary. Most systems are "mostly consistent with eventual fallback" or "mostly available with best-effort consistency." Know this; don't over-cite CAP.
Consistency models recap (from above)
Strong > Sequential > Causal > Eventual
Consensus: Raft
Raft elects a leader; the leader accepts all writes and replicates to followers. If the leader fails, followers elect a new one.
Key mechanisms:
- Leader election: followers wait for heartbeat; if missing, start an election with a random timeout (prevents split votes)
- Log replication: leader appends entry locally, sends to followers, commits once a majority acknowledge
- Safety: a new leader must have all committed entries from the previous term (election restrictions enforce this)
Use cases: etcd (Kubernetes config), Consul (service discovery), CockroachDB (per-range consensus).
L4 rule: know that Raft is the modern pragmatic choice over Paxos (which is notoriously hard to implement correctly). You don't need to implement Raft — you need to know its role.
Distributed locks
Single-node locks don't work across multiple processes. Common options:
Redis (Redlock):
- SET key value NX EX ttl to acquire (plain SETNX can't set a TTL atomically)
- Client must refresh the TTL or release before expiry (see the sketch at the end of this subsection)
- Caveats: clock drift, GC pauses, Redis failover can break correctness. Redlock is debated (Martin Kleppmann's critique).
Zookeeper / etcd:
- Create ephemeral sequential node
- Watch the node just before yours in sequence
- Session death automatically releases (ephemeral)
- Safer than Redis for correctness-critical locks, higher latency
Database row lock:
- SELECT ... FOR UPDATE inside a transaction
- Simple, ACID-safe, but scales poorly
Rule: for most backend use cases, Redis is "good enough" if you can tolerate rare correctness violations. Use Zookeeper/etcd if you need strict correctness.
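A single-node Redis lock sketch (redis-py-style set with nx/ex; the Lua script makes release a compare-and-delete). This is the simple single-instance pattern, not full Redlock:

```python
import uuid

RELEASE_SCRIPT = """
if redis.call('get', KEYS[1]) == ARGV[1] then
    return redis.call('del', KEYS[1])
else
    return 0
end
"""

def acquire(redis_client, name, ttl_seconds=10):
    token = str(uuid.uuid4())
    # NX: only set if absent. EX: auto-expire so a crashed holder can't block forever.
    if redis_client.set(f"lock:{name}", token, nx=True, ex=ttl_seconds):
        return token
    return None

def release(redis_client, name, token):
    # Compare-and-delete must be atomic on the server: deleting unconditionally
    # could free a lock that already expired and was re-acquired by someone else.
    return bool(redis_client.eval(RELEASE_SCRIPT, 1, f"lock:{name}", token))
```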
Circuit breaker
When a downstream service is failing, stop calling it to prevent cascading failure.
States:
- Closed: requests pass through. Count failures.
- Open: after failure threshold, requests fail fast without calling the downstream.
- Half-open: after a cooldown, allow a trial request. Success → close; failure → re-open.
Libraries: Netflix Hystrix (deprecated), Resilience4j, Polly (.NET), Istio (sidecar-level).
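A compact sketch of the closed / open / half-open state machine in Python; the threshold and cooldown values are illustrative:

```python
import time

class CircuitBreaker:
    def __init__(self, failure_threshold=5, cooldown_seconds=30):
        self.failure_threshold = failure_threshold
        self.cooldown = cooldown_seconds
        self.failures = 0
        self.opened_at = None              # None means the circuit is closed

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.cooldown:
                raise RuntimeError("circuit open: failing fast")
            # Cooldown elapsed: half-open, let this trial request through.
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = time.monotonic()   # trip (or re-trip) to open
            raise
        else:
            self.failures = 0
            self.opened_at = None                   # success closes the circuit
            return result
```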
Other resilience patterns
- Retries with exponential backoff + jitter — avoid thundering-herd on recovery (see the sketch after this list)
- Bulkhead — isolate pools of resources (e.g., separate thread pool per downstream) so one slow dep doesn't starve the rest
- Timeout — every remote call MUST have a timeout; no exceptions
- Rate limiting — protect yourself and downstream (token bucket, sliding window — see HLD 51)
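A retry helper with exponential backoff and full jitter; base, cap, and attempt count are illustrative defaults:

```python
import random
import time

def retry(fn, max_attempts=5, base=0.1, cap=5.0):
    """Sleep a random amount in [0, min(cap, base * 2**attempt)] between
    attempts so recovering dependencies aren't hit by synchronized retry waves."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise
            time.sleep(random.uniform(0, min(cap, base * 2 ** attempt)))
```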
8. Scaling Patterns
Vertical vs horizontal
Vertical (scale up): bigger machine. Simple, but a single box has a hard ceiling and the price per unit of capacity climbs steeply.
Horizontal (scale out): more machines. Harder (need to shard, handle node failures) but scales ~linearly if done right.
Rule: vertical first for stateful systems (DBs), horizontal first for stateless (web servers). At Google scale, everything is horizontal.
CDN
Content Delivery Network — edge caches close to users.
Use cases:
- Static assets (images, JS bundles, fonts)
- Cached API responses (GET with short TTL)
- Video streaming
Key concepts:
- Origin shield: single path from CDN edges to origin, reduces origin load
- Cache key: usually URL + query params + selected headers (like Accept-Language)
- Cache-Control headers: public, max-age=3600 for CDN-cacheable; private, no-store for user-specific
- Purge / invalidation: CDN APIs to evict stale content on deploy
Read replicas
Async-replicated copies of the primary DB for read-scaling.
Trade-offs:
- Eventual consistency — replicas lag by ms to seconds
- Read-your-writes issue — user writes to primary, reads from replica, might not see own write
- Solution: sticky reads (route same user to primary for N seconds after write) or read-from-primary for sensitive reads
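One way to implement sticky reads, sketched in Python; in production the last-write timestamps would live in a shared store (e.g., Redis) rather than a local dict, and the function names are hypothetical:

```python
import time

STICKY_WINDOW = 5.0        # seconds a user stays pinned to the primary after a write

last_write_at = {}         # user_id -> timestamp of that user's last write

def record_write(user_id):
    last_write_at[user_id] = time.time()

def choose_db(user_id, primary, replica):
    # Right after this user's own write, route them to the primary so they
    # see it; everyone else (and this user later) reads from a replica.
    if time.time() - last_write_at.get(user_id, 0.0) < STICKY_WINDOW:
        return primary
    return replica
```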
Hot partition detection
When one shard gets disproportionate traffic:
- Detect: per-shard QPS, per-shard latency, per-shard CPU dashboards
- Mitigate: add a cache in front of the hot key, or re-shard to split the hot partition
- Prevent: choose a shard key with high cardinality AND a uniform access pattern. user_id is usually fine; country_code is a hotspot waiting to happen.
Other patterns to know
- Sidecar: co-located process handles cross-cutting concerns (logging, tracing, mTLS). Istio/Envoy pattern.
- Service mesh: network of sidecars providing service discovery, LB, observability, policy enforcement.
- CQRS: separate read model from write model. Writes go to normalized store, reads come from denormalized view. Good for read-heavy with complex queries.
- Event sourcing: store state as a log of events; current state is a fold over events. Pairs well with CQRS. Replay-able, auditable, but higher complexity.
9. L4-Scoped System Design Framework
When you get the Variant B HLD round, compress your thinking into these 8 steps. Budget 45 min total.
| Step | Time | Output |
|---|---|---|
| 1. Clarify requirements | 5 min | Functional + non-functional req (QPS, data size, SLA) |
| 2. API design | 5 min | Endpoint signatures, request/response shapes |
| 3. Data model | 5 min | Tables / entities / keys |
| 4. High-level architecture | 10 min | Boxes-and-arrows: LB, app tier, DB, cache, queue |
| 5. Deep dive one component | 10 min | The interviewer picks; expect scaling or consistency |
| 6. Scaling | 5 min | Sharding, replication, caching, CDN |
| 7. Reliability | 3 min | Retries, circuit breakers, monitoring, alerting |
| 8. Trade-offs | 2 min | Cost vs latency, consistency vs availability |
L4 scale guardrails: millions of users, hundreds-to-thousands QPS, gigabytes-to-terabytes data. DON'T over-design for billions. Simpler = better at this level.
See HLD notes 50-55 for worked examples.
Cross-references
- Uber Backend Fundamentals — deeper dive on same topics
- Paytm HLD fundamentals — CAP / sharding / caching
- 01-interview-loop — where HLD fits in Variant B
- 02-coding-strategy — coding round tactics