Backend Fundamentals (L4 Reference)
Google probes backend fundamentals less aggressively than Uber or Salesforce, but you still need them for:
- The scoped HLD round (if you get Variant B)
- Coding follow-ups like "what if this ran across 100 machines?"
- The Googleyness round when you explain past work
Keep this compact. Each section is a reference — deeper dives live in Uber backend notes and Paytm HLD.
1. Quick Reference Cheat Sheet
| Concept | Rule of thumb | When it matters |
|---|---|---|
| SQL | Strong consistency, ACID, joins | Transactional data (orders, users, billing) |
| NoSQL | Horizontal scale, schema-flex | High-volume append (events, logs, feeds) |
| Cache-aside | App reads cache, falls back to DB | Most read-heavy services |
| LRU / LFU | Evict least recently / frequently used | Caches with bounded memory |
| Redis | In-memory, rich data structures | Cache, session store, rate limiter, distributed lock |
| Memcached | Simple key-value, no persistence | Pure cache, no ops overhead |
| Kafka | Log-based, replay, high throughput | Event pipelines, analytics, fan-out |
| RabbitMQ | Queue-based, flexible routing | Task queues, RPC, low-throughput |
| REST | Stateless HTTP, easy to debug | External APIs, simple CRUD |
| gRPC | Binary, typed, streaming | Internal service-to-service |
| HTTP/2 | Multiplexing, header compression | Modern web, replaces 1.1 |
| HTTP/3 | QUIC over UDP, no head-of-line blocking | Mobile, lossy networks |
| CAP | Under a network partition, choose Consistency or Availability | Distributed data stores |
| Raft | Leader-based consensus | Distributed state (etcd, Consul) |
| LB L4 | TCP-level, faster, no app awareness | High-throughput, simple routing |
| LB L7 | HTTP-aware, routing by path/header | API gateways, ingress |
| Circuit breaker | Stop calling failing service | Service-to-service resilience |
| Backpressure | Slow down upstream when downstream overloaded | Streaming, queues |
2. Databases
SQL vs NoSQL
SQL (Postgres, MySQL, Spanner):
- Strong consistency (typically)
- ACID transactions across rows
- Rich joins, aggregations, secondary indexes
- Vertical scale first; horizontal scale requires sharding
- Use when: data has relationships, transactions matter, schema is stable
NoSQL (varies):
- Document (MongoDB, Firestore): flexible schema, denormalized reads
- Key-Value (DynamoDB, Bigtable): O(1) by key, horizontal scale built-in
- Wide-column (Cassandra, Bigtable): write-heavy, tunable consistency
- Graph (Neo4j, Neptune): relationship-first queries
- Horizontal scale by default; eventual consistency (tunable)
- Use when: data is append-heavy, schema evolves fast, or data is inherently unrelated
Google-specific: Spanner gives you ACID + global scale + SQL. It's NOT a vanilla SQL database — it uses TrueTime and Paxos under the hood. Bigtable is the flagship wide-column store.
ACID (transaction guarantees)
| Letter | Guarantees |
|---|---|
| A Atomicity | All-or-nothing; partial failures roll back |
| C Consistency | Transaction leaves DB in a valid state (constraints hold) |
| I Isolation | Concurrent transactions don't see each other's uncommitted state |
| D Durability | Committed data survives crashes (fsync) |
Isolation levels (weakest to strongest)
| Level | Prevents | Allows |
|---|---|---|
| Read Uncommitted | Nothing | Dirty reads |
| Read Committed | Dirty reads | Non-repeatable reads, phantoms |
| Repeatable Read | Dirty + non-repeatable reads | Phantoms |
| Serializable | All anomalies | Highest lock cost |
| Snapshot Isolation (MVCC) | Dirty + non-repeatable | Write skew |
Default for Postgres: Read Committed. For MySQL InnoDB: Repeatable Read with phantom protection via gap locks.
Indexes
B-tree index: default for most SQL DBs. O(log n) lookup, range queries supported. Covers WHERE, ORDER BY, range scans.
Hash index: O(1) exact lookup, no range. Useful for equality-only lookups.
Composite index: CREATE INDEX ON orders (user_id, created_at). Leftmost prefix rule — can be used for WHERE user_id = ? and WHERE user_id = ? AND created_at > ? but NOT WHERE created_at > ? alone.
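A tiny, purely illustrative checker for the leftmost-prefix rule (the function, its arguments, and the column names are hypothetical, not any database's API):

```python
def can_use_index(index_cols, equality_cols, range_col=None):
    """An index on (c1, c2, ..., cn) serves a query that constrains a
    leftmost prefix of its columns with equality, optionally followed by
    one range predicate on the next column."""
    i = 0
    while i < len(index_cols) and index_cols[i] in equality_cols:
        i += 1
    if not set(equality_cols).issubset(index_cols[:i]):
        return False                      # an equality column fell outside the prefix
    if range_col is not None:
        return i < len(index_cols) and index_cols[i] == range_col
    return i > 0

idx = ["user_id", "created_at"]           # CREATE INDEX ON orders (user_id, created_at)
assert can_use_index(idx, {"user_id"})                    # WHERE user_id = ?
assert can_use_index(idx, {"user_id"}, "created_at")      # WHERE user_id = ? AND created_at > ?
assert not can_use_index(idx, set(), "created_at")        # WHERE created_at > ? alone
```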
Covering index: includes all columns needed by the query, avoiding table lookup. Huge speedup for read-heavy queries.
When to index:
- Columns in WHERE, JOIN ON, ORDER BY, GROUP BY
- Columns with high selectivity (few duplicates)
- Columns read much more than written (indexes slow writes)
When NOT to index:
- Low-selectivity columns (boolean, status with 3 values)
- Tables with high write volume and rare reads
- Columns that are updated frequently
Sharding
Horizontal partitioning — split rows across multiple DB instances by a shard key.
Strategies:
- Range-based: user_id 1-1M on shard A, 1M-2M on shard B. Simple, but hotspots if access is skewed.
- Hash-based: hash(user_id) % N. Uniform distribution, but resharding is painful when N changes.
- Consistent hashing: minimizes data movement on rebalance. Industry standard for elastic clusters (see the sketch after this list).
- Directory-based: central lookup service maps keys to shards. Flexible, but the lookup is an extra hop + SPOF risk.
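A minimal consistent-hash ring sketch in Python; the virtual-node count, MD5 hash, and shard names are illustrative choices, not a specific library:

```python
import bisect
import hashlib

def _hash(key: str) -> int:
    # MD5 used only as a stable, well-spread illustration hash
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class ConsistentHashRing:
    """Each node occupies many points ("virtual nodes") on a ring; a key maps
    to the first node clockwise from its hash. Adding or removing one node
    only moves the keys between that node and its ring neighbors."""

    def __init__(self, nodes, vnodes=100):
        self._ring = sorted((_hash(f"{node}#{i}"), node)
                            for node in nodes for i in range(vnodes))
        self._points = [point for point, _ in self._ring]

    def node_for(self, key: str) -> str:
        idx = bisect.bisect(self._points, _hash(key)) % len(self._ring)
        return self._ring[idx][1]

ring = ConsistentHashRing(["shard-a", "shard-b", "shard-c"])
ring.node_for("user:12345")   # stable assignment until ring membership changes
```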
Trade-offs:
- Cross-shard queries require scatter-gather — slow, expensive
- Cross-shard transactions require 2PC — complex, blocking
- Hot partition detection (monitor per-shard QPS) — rebalance before the shard melts
Replication
Leader-follower (primary-replica):
- Writes go to leader, leader replicates to followers
- Reads can go to followers (read replicas) — reduces leader load
- Replication lag — followers can be seconds behind (read-your-writes inconsistency)
Multi-leader:
- Multiple writable replicas
- Conflict resolution via last-write-wins, vector clocks, CRDTs
- Use case: multi-region writes where latency matters more than strict consistency
Leaderless (Dynamo-style):
- Data lives on N replicas; a write waits for W acknowledgements, a read queries R replicas
- R + W > N → read and write quorums overlap, so a read sees the latest acknowledged write (e.g., N=3, W=2, R=2)
- Used by Cassandra, DynamoDB
Consistency models
| Model | Guarantee |
|---|---|
| Strong (linearizable) | Read sees the latest write, globally |
| Sequential | All clients see operations in the same order, but not necessarily real-time |
| Causal | Operations with causal dependency are ordered; unrelated ones may not be |
| Eventual | All replicas converge eventually; no ordering guarantee |
Most distributed SQL databases (Spanner, CockroachDB) offer linearizability. Most NoSQL stores default to eventual or tunable consistency.
3. Caching
Patterns
Cache-aside (lazy loading):
- read: app → cache; on miss, app reads the DB, writes the result to the cache, returns it
- write: app → DB, then invalidate the cache entry
Most common pattern (sketched below). Resilient (a cache failure doesn't break the app), but cold starts hurt.
Write-through:
- write: app → cache → DB (synchronous); reads always hit the cache
Always-warm cache, but writes pay the cache cost even if the key is never read.
Write-behind (write-back):
- write: app → cache, with an async flush to the DB
Fastest writes, but durability risk if the cache crashes before flushing.
Read-through:
- the cache acts as a DB proxy: the app only talks to the cache, which pulls from the DB on a miss
Simpler app code, but couples cache and DB tightly.
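A cache-aside sketch in Python, assuming a Redis-like client (get / set / delete) and hypothetical db.load_user / db.update_user helpers:

```python
import json

TTL_SECONDS = 300  # bound staleness even if an invalidation is missed

def get_user(cache, db, user_id):
    key = f"user:{user_id}"
    cached = cache.get(key)                       # 1. try the cache
    if cached is not None:
        return json.loads(cached)
    user = db.load_user(user_id)                  # 2. miss: read the source of truth
    if user is not None:
        cache.set(key, json.dumps(user), ex=TTL_SECONDS)   # 3. populate for later readers
    return user

def update_user(cache, db, user_id, fields):
    db.update_user(user_id, fields)               # write to the DB first
    cache.delete(f"user:{user_id}")               # then invalidate; the next read repopulates
```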
Eviction
- LRU (Least Recently Used): evict the entry accessed longest ago. Simple, good for temporal locality.
- LFU (Least Frequently Used): evict the entry accessed least total times. Good for skewed frequency.
- FIFO: evict oldest by insertion, regardless of access. Rare in production.
- TTL (Time-to-Live): expire after N seconds. Combine with LRU for belt-and-suspenders.
LRU implementation: HashMap + Doubly Linked List. O(1) get, O(1) put. See LRU Cache problem.
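A minimal LRU sketch using Python's OrderedDict; same idea as the HashMap + doubly linked list version, with both get and put in O(1):

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self._data = OrderedDict()          # insertion order doubles as recency order

    def get(self, key):
        if key not in self._data:
            return None
        self._data.move_to_end(key)         # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        if key in self._data:
            self._data.move_to_end(key)
        self._data[key] = value
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # evict the least recently used entry

cache = LRUCache(2)
cache.put("a", 1); cache.put("b", 2)
cache.get("a")                              # "a" becomes most recent
cache.put("c", 3)                           # evicts "b"
```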
Redis vs Memcached
| Feature | Redis | Memcached |
|---|---|---|
| Data structures | Strings, lists, sets, hashes, sorted sets, streams | Strings only |
| Persistence | RDB snapshots + AOF | None (pure cache) |
| Replication | Yes (primary-replica) | No native |
| Clustering | Redis Cluster (sharded) | Client-side sharding |
| Use case | Cache + session + queue + counters + pub/sub | Simple cache, minimal ops |
Rule: start with Redis unless you have a specific reason to use Memcached. The feature surface is worth the complexity.
Cache invalidation
"There are only two hard things in computer science: cache invalidation and naming things." — Phil Karlton
Strategies:
- TTL-based: accept stale reads up to N seconds. Simple, no coordination. Use for non-critical data.
- Event-based: DB publishes invalidation events on write. Cache consumes and evicts. Complex, fast.
- Version-based: cache key includes a version (user:123:v7). On write, bump the version → the old key becomes garbage. Clean, but leaks memory until TTL.
4. Concurrency
Threads vs processes
- Threads: share memory, cheaper to spawn, race-condition prone
- Processes: isolated memory, IPC via pipes/sockets, heavier
Most backend services run many concurrent tasks in a single process: threads in Java, goroutines in Go, an async event loop in Python. Use processes for true isolation (e.g., sandboxing untrusted code).
Locks
- Mutex: one thread at a time holds the lock.
- RWLock: many readers OR one writer. Good for read-heavy data.
- Spinlock: busy-wait, no syscall. Only for very short critical sections.
- Semaphore: counter-based, allows N concurrent holders. Good for resource pools.
Atomics
For simple counter updates, atomics (compare-and-swap, fetch-and-add) are faster than locks. std::atomic<int> in C++, AtomicInteger in Java, sync/atomic in Go.
Rule: atomics for single-variable updates, locks for multi-variable critical sections.
Race conditions
Three classic races:
- Check-then-act: if (!map.contains(k)) map.put(k, v); — two threads both check, both put. Fix: putIfAbsent.
- Read-modify-write: counter = counter + 1 is three operations (read, add, write). Fix: an atomic increment (AtomicInteger, fetch-and-add).
- Iteration + mutation: iterating a list while another thread inserts. Fix: snapshot iteration, or a concurrent collection.
The first two fixes are sketched below.
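A small Python illustration of the first two fixes; Python has no standard-library atomic integer, so the read-modify-write fix below uses a lock (the shape of the fix, not the exact API, is the point):

```python
import threading

results = {}
counter = 0
counter_lock = threading.Lock()

def record_first(key, value):
    # Check-then-act fix: setdefault performs the check and the insert as one
    # dict operation, so two threads can't both "win" for the same key.
    return results.setdefault(key, value)

def increment():
    # Read-modify-write fix: counter += 1 is read + add + write, so guard it.
    global counter
    with counter_lock:
        counter += 1

threads = [threading.Thread(target=increment) for _ in range(100)]
for t in threads:
    t.start()
for t in threads:
    t.join()
assert counter == 100
```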
Deadlock prevention
A deadlock needs all 4 of Coffman's conditions. Break any one:
- Mutual exclusion — hard to remove (some resources must be exclusive)
- Hold and wait — acquire all locks upfront
- No preemption — allow lock timeout and release
- Circular wait — always acquire locks in a global order (most practical fix)
Golden rule: establish a total order on all locks (e.g., alphabetical by name, or numerical by ID) and acquire in that order everywhere. No circular wait, no deadlock.
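A sketch of the global-lock-order rule in Python; the Account class and ordering by numeric id are illustrative:

```python
import threading

class Account:
    def __init__(self, acct_id, balance):
        self.id = acct_id
        self.balance = balance
        self.lock = threading.Lock()

def transfer(src, dst, amount):
    # Always acquire in ascending id order, regardless of transfer direction.
    # Concurrent A->B and B->A transfers then lock in the same order, so there
    # is no circular wait and no deadlock.
    first, second = (src, dst) if src.id < dst.id else (dst, src)
    with first.lock:
        with second.lock:
            src.balance -= amount
            dst.balance += amount
```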
5. Messaging
Kafka vs RabbitMQ
| Feature | Kafka | RabbitMQ |
|---|---|---|
| Model | Log (append-only partitioned stream) | Queue (routed message broker) |
| Throughput | Very high (millions msg/s) | Moderate (tens of thousands msg/s) |
| Retention | Days to forever (configurable) | Until consumed |
| Replay | Built-in (consumer offsets) | No native, requires DLQ trick |
| Ordering | Per-partition | Per-queue |
| Use case | Event streaming, analytics, CDC | Task queues, RPC, flexible routing |
Rule: Kafka for high-throughput event pipelines, RabbitMQ for work queues with flexible routing.
Delivery semantics
| Guarantee | Meaning | Cost |
|---|---|---|
| At-most-once | May lose messages, no duplicates | Cheapest, rare in prod |
| At-least-once | May deliver duplicates, no loss | Default, requires idempotent consumers |
| Exactly-once | One and only one | Expensive, requires transactional producer + idempotent consumer |
L4 expectation: know that at-least-once + idempotency is the standard production pattern. True exactly-once is rare and expensive (Kafka exactly-once semantics via transactional producers, or 2PC).
Idempotency
A consumer is idempotent if processing the same message twice has the same effect as once.
Patterns:
- Idempotency key: include a unique key per message; the consumer checks a dedup store (Redis, DB) before acting (see the sketch after this list)
- Upsert semantics: INSERT ... ON CONFLICT DO UPDATE
- Version/sequence numbers: ignore messages carrying an older version than the current state
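A minimal idempotency-key check in Python, assuming a Redis-like client and a hypothetical process(message) handler:

```python
DEDUP_TTL = 24 * 3600  # keep keys long enough to cover realistic redelivery windows

def handle(cache, message):
    dedup_key = f"processed:{message['idempotency_key']}"
    # SET with NX succeeds only for the first writer of this key, so a
    # redelivered message is detected and skipped.
    first_time = cache.set(dedup_key, "1", nx=True, ex=DEDUP_TTL)
    if not first_time:
        return                  # duplicate delivery: already handled
    process(message)            # hypothetical business logic
```

Note the trade-off: marking the key before processing means a message whose first attempt crashes mid-way gets skipped on redelivery; real systems often record the key only after success, or store the handler's result alongside it.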
Dead Letter Queue (DLQ)
When a message fails processing N times, move it to a DLQ for manual inspection or async retry. Prevents one bad message from blocking the queue.
Pattern:
- Main queue → consumer attempts processing → fail → retry up to N times
- After N failures → move to DLQ
- Alerting on DLQ depth; replay tool to re-enqueue fixed messages
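A sketch of the retry-then-DLQ flow; the broker client (consume / ack / publish) and the attempt-count header are hypothetical:

```python
MAX_ATTEMPTS = 5

def consume_loop(broker, handler):
    for msg in broker.consume("main-queue"):
        attempts = msg.headers.get("x-attempts", 0)
        try:
            handler(msg.body)
            broker.ack(msg)
        except Exception:
            if attempts + 1 >= MAX_ATTEMPTS:
                # Park the poison message; alert on DLQ depth, replay after a fix.
                broker.publish("main-queue.dlq", msg.body, headers=msg.headers)
            else:
                broker.publish("main-queue", msg.body,
                               headers={**msg.headers, "x-attempts": attempts + 1})
            broker.ack(msg)     # remove the original either way so it can't block the queue
```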
6. Networking
HTTP versions
| Version | Key features |
|---|---|
| HTTP/1.1 | Text-based, one request at a time per connection (pipelining exists but is rarely used), head-of-line blocking |
| HTTP/2 | Binary framing, multiplexing over one TCP connection, header compression (HPACK), server push |
| HTTP/3 | Built on QUIC (UDP), no head-of-line blocking at transport layer, faster handshake, better on lossy networks |
L4 rule: know the differences conceptually. In practice, web backends use HTTP/2 or HTTP/3; internal service-to-service often uses gRPC (which runs over HTTP/2).
REST vs gRPC
| Feature | REST (JSON) | gRPC (Protobuf) |
|---|---|---|
| Format | JSON (text) | Protobuf (binary) |
| Schema | Open (OpenAPI optional) | Strictly typed (.proto files) |
| Streaming | Limited (SSE, WebSockets) | First-class (unary, server-stream, client-stream, bidirectional) |
| Debuggability | Easy (curl, browser) | Harder (grpcurl, bloomrpc) |
| Size on wire | Larger | ~5-10x smaller |
| Use case | Public APIs, simple CRUD | Internal microservices, high-throughput, streaming |
DNS
DNS resolves hostnames to IPs.
- A record: hostname → IPv4
- AAAA: hostname → IPv6
- CNAME: alias (hostname → hostname)
- TXT: arbitrary text (used for SPF, DKIM, domain verification)
- MX: mail server records
Resolution flow: stub resolver → recursive resolver (often ISP or 8.8.8.8) → root → TLD → authoritative → answer (cached at each step).
TTL: controls how long a record is cached. Low TTL (60s) for agile changes, high TTL (1h+) for stable records.
TLS
TLS provides confidentiality, integrity, authentication for connections (HTTPS, gRPC, internal mTLS).
Handshake (TLS 1.3, simplified):
- Client Hello: supported cipher suites, client random, ECDHE key share
- Server Hello: chosen cipher, server random, its own key share; the certificate follows, already encrypted
- Client verifies the certificate chain against trusted CAs
- Both sides derive session keys from the ECDHE shared secret; encrypted application data begins
mTLS (mutual TLS): client also presents a certificate. Used for service-to-service auth in zero-trust architectures.
Load Balancers
Layer 4 (TCP):
- Operates on IP + port
- No HTTP awareness (cannot route by URL, header)
- Very fast, used for raw TCP workloads (databases, gRPC before L7 support)
- Examples: AWS NLB, Google Network Load Balancer
Layer 7 (HTTP):
- Terminates HTTPS, routes by path, header, cookie
- Can do URL rewriting, header injection, rate limiting
- Slower than L4 but vastly more flexible
- Examples: nginx, HAProxy, AWS ALB, GCP HTTPS LB, Envoy
Algorithms (a small picker sketch follows this list):
- Round-robin: each request to the next server
- Least-connections: route to the server with fewest active connections
- IP-hash / consistent-hash: same client always goes to the same backend (sticky sessions, cache locality)
- Weighted: assign different capacities to different backends
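A toy picker for two of these strategies; a real load balancer tracks live connection counts per backend, here it is just an in-memory dict:

```python
import itertools

class Balancer:
    def __init__(self, backends):
        self.backends = list(backends)
        self._rr = itertools.cycle(self.backends)
        self.active = {b: 0 for b in self.backends}   # open connections per backend

    def round_robin(self):
        return next(self._rr)

    def least_connections(self):
        return min(self.backends, key=lambda b: self.active[b])

lb = Balancer(["10.0.0.1", "10.0.0.2", "10.0.0.3"])
server = lb.least_connections()
lb.active[server] += 1     # the caller tracks connection open/close
```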
7. Distributed Systems
CAP Theorem
"In a distributed system experiencing a network partition, you must choose between Consistency and Availability."
- CP systems: refuse requests during a partition rather than return stale data. Examples: Zookeeper, etcd, Spanner (technically CP, marketed as "effectively CA" because of its availability SLA).
- AP systems: always respond, even with stale data. Examples: Cassandra (tunable), DynamoDB (tunable), Couchbase.
L4 rule: in practice, CAP is a spectrum, not a binary. Most systems are "mostly consistent with eventual fallback" or "mostly available with best-effort consistency." Know this; don't over-cite CAP.
Consistency models recap (from above)
Strong > Sequential > Causal > Eventual
Consensus: Raft
Raft elects a leader; the leader accepts all writes and replicates to followers. If the leader fails, followers elect a new one.
Key mechanisms:
- Leader election: followers wait for heartbeat; if missing, start an election with a random timeout (prevents split votes)
- Log replication: leader appends entry locally, sends to followers, commits once a majority acknowledge
- Safety: a new leader must have all committed entries from the previous term (election restrictions enforce this)
Use cases: etcd (Kubernetes config), Consul (service discovery), CockroachDB (per-range consensus).
L4 rule: know that Raft is the modern pragmatic choice over Paxos (which is notoriously hard to implement correctly). You don't need to implement Raft — you need to know its role.
Distributed locks
Single-node locks don't work across multiple processes. Common options:
Redis (Redlock):
- SET key value NX EX ttl to acquire (plain SETNX can't set a TTL atomically)
- Client must refresh the TTL or release before expiry (see the sketch at the end of this subsection)
- Caveats: clock drift, GC pauses, Redis failover can break correctness. Redlock is debated (Martin Kleppmann's critique).
Zookeeper / etcd:
- Create ephemeral sequential node
- Watch the node just before yours in sequence
- Session death automatically releases (ephemeral)
- Safer than Redis for correctness-critical locks, higher latency
Database row lock:
- SELECT ... FOR UPDATE inside a transaction
- Simple, ACID-safe, but scales poorly
Rule: for most backend use cases, Redis is "good enough" if you can tolerate rare correctness violations. Use Zookeeper/etcd if you need strict correctness.
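A single-node Redis lock sketch (redis-py-style set with nx/ex; the Lua script makes release a compare-and-delete). This is the simple single-instance pattern, not full Redlock:

```python
import uuid

RELEASE_SCRIPT = """
if redis.call('get', KEYS[1]) == ARGV[1] then
    return redis.call('del', KEYS[1])
else
    return 0
end
"""

def acquire(redis_client, name, ttl_seconds=10):
    token = str(uuid.uuid4())
    # NX: only set if absent. EX: auto-expire so a crashed holder can't block forever.
    if redis_client.set(f"lock:{name}", token, nx=True, ex=ttl_seconds):
        return token
    return None

def release(redis_client, name, token):
    # Compare-and-delete must be atomic on the server: deleting unconditionally
    # could free a lock that already expired and was re-acquired by someone else.
    return bool(redis_client.eval(RELEASE_SCRIPT, 1, f"lock:{name}", token))
```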
Circuit breaker
When a downstream service is failing, stop calling it to prevent cascading failure.
States:
- Closed: requests pass through. Count failures.
- Open: after failure threshold, requests fail fast without calling the downstream.
- Half-open: after a cooldown, allow a trial request. Success → close; failure → re-open.
Libraries: Netflix Hystrix (deprecated), Resilience4j, Polly (.NET), Istio (sidecar-level).
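A compact sketch of the closed / open / half-open state machine in Python; the threshold and cooldown values are illustrative:

```python
import time

class CircuitBreaker:
    def __init__(self, failure_threshold=5, cooldown_seconds=30):
        self.failure_threshold = failure_threshold
        self.cooldown = cooldown_seconds
        self.failures = 0
        self.opened_at = None              # None means the circuit is closed

    def call(self, fn, *args, **kwargs):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.cooldown:
                raise RuntimeError("circuit open: failing fast")
            # Cooldown elapsed: half-open, let this trial request through.
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.failure_threshold:
                self.opened_at = time.monotonic()   # trip (or re-trip) to open
            raise
        else:
            self.failures = 0
            self.opened_at = None                   # success closes the circuit
            return result
```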
Other resilience patterns
- Retries with exponential backoff + jitter — avoid thundering-herd on recovery (see the sketch after this list)
- Bulkhead — isolate pools of resources (e.g., separate thread pool per downstream) so one slow dep doesn't starve the rest
- Timeout — every remote call MUST have a timeout; no exceptions
- Rate limiting — protect yourself and downstream (token bucket, sliding window — see HLD 51)
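A retry helper with exponential backoff and full jitter; base, cap, and attempt count are illustrative defaults:

```python
import random
import time

def retry(fn, max_attempts=5, base=0.1, cap=5.0):
    """Sleep a random amount in [0, min(cap, base * 2**attempt)] between
    attempts so recovering dependencies aren't hit by synchronized retry waves."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise
            time.sleep(random.uniform(0, min(cap, base * 2 ** attempt)))
```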
8. Scaling Patterns
Vertical vs horizontal
Vertical (scale up): bigger machine. Simple, but a single box has a hard ceiling and the price per unit of capacity climbs steeply.
Horizontal (scale out): more machines. Harder (need to shard, handle node failures) but scales ~linearly if done right.
Rule: vertical first for stateful systems (DBs), horizontal first for stateless (web servers). At Google scale, everything is horizontal.
CDN
Content Delivery Network — edge caches close to users.
Use cases:
- Static assets (images, JS bundles, fonts)
- Cached API responses (GET with short TTL)
- Video streaming
Key concepts:
- Origin shield: single path from CDN edges to origin, reduces origin load
- Cache key: usually URL + query params + selected headers (like Accept-Language)
- Cache-Control headers: public, max-age=3600 for CDN-cacheable; private, no-store for user-specific
- Purge / invalidation: CDN APIs to evict stale content on deploy
Read replicas
Async-replicated copies of the primary DB for read-scaling.
Trade-offs:
- Eventual consistency — replicas lag by ms to seconds
- Read-your-writes issue — user writes to primary, reads from replica, might not see own write
- Solution: sticky reads (route same user to primary for N seconds after write) or read-from-primary for sensitive reads
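One way to implement sticky reads, sketched in Python; in production the last-write timestamps would live in a shared store (e.g., Redis) rather than a local dict, and the function names are hypothetical:

```python
import time

STICKY_WINDOW = 5.0        # seconds a user stays pinned to the primary after a write

last_write_at = {}         # user_id -> timestamp of that user's last write

def record_write(user_id):
    last_write_at[user_id] = time.time()

def choose_db(user_id, primary, replica):
    # Right after this user's own write, route them to the primary so they
    # see it; everyone else (and this user later) reads from a replica.
    if time.time() - last_write_at.get(user_id, 0.0) < STICKY_WINDOW:
        return primary
    return replica
```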
Hot partition detection
When one shard gets disproportionate traffic:
- Detect: per-shard QPS, per-shard latency, per-shard CPU dashboards
- Mitigate: add a cache in front of the hot key, or re-shard to split the hot partition
- Prevent: choose a shard key with high cardinality AND a uniform access pattern. user_id is usually fine; country_code is a hotspot waiting to happen.
Other patterns to know
- Sidecar: co-located process handles cross-cutting concerns (logging, tracing, mTLS). Istio/Envoy pattern.
- Service mesh: network of sidecars providing service discovery, LB, observability, policy enforcement.
- CQRS: separate read model from write model. Writes go to normalized store, reads come from denormalized view. Good for read-heavy with complex queries.
- Event sourcing: store state as a log of events; current state is a fold over events. Pairs well with CQRS. Replay-able, auditable, but higher complexity.
9. L4-Scoped System Design Framework
When you get the Variant B HLD round, compress your thinking into these 8 steps. Budget 45 min total.
| Step | Time | Output |
|---|---|---|
| 1. Clarify requirements | 5 min | Functional + non-functional req (QPS, data size, SLA) |
| 2. API design | 5 min | Endpoint signatures, request/response shapes |
| 3. Data model | 5 min | Tables / entities / keys |
| 4. High-level architecture | 10 min | Boxes-and-arrows: LB, app tier, DB, cache, queue |
| 5. Deep dive one component | 10 min | The interviewer picks; expect scaling or consistency |
| 6. Scaling | 5 min | Sharding, replication, caching, CDN |
| 7. Reliability | 3 min | Retries, circuit breakers, monitoring, alerting |
| 8. Trade-offs | 2 min | Cost vs latency, consistency vs availability |
L4 scale guardrails: millions of users, hundreds-to-thousands QPS, gigabytes-to-terabytes data. DON'T over-design for billions. Simpler = better at this level.
See HLD notes 50-55 for worked examples.
Cross-references
- Uber Backend Fundamentals — deeper dive on same topics
- Paytm HLD fundamentals — CAP / sharding / caching
- 01-interview-loop — where HLD fits in Variant B
- 02-coding-strategy — coding round tactics