Backend Fundamentals (L4 Reference) ​

Google probes backend fundamentals less aggressively than Uber or Salesforce, but you still need them for:

  • The scoped HLD round (if you get Variant B)
  • Coding follow-ups like "what if this ran across 100 machines?"
  • The Googleyness round when you explain past work

Keep this compact. Each section is a reference — deeper dives live in Uber backend notes and Paytm HLD.


1. Quick Reference Cheat Sheet ​

| Concept | Rule of thumb | When it matters |
| --- | --- | --- |
| SQL | Strong consistency, ACID, joins | Transactional data (orders, users, billing) |
| NoSQL | Horizontal scale, schema-flex | High-volume append (events, logs, feeds) |
| Cache-aside | App reads cache, falls back to DB | Most read-heavy services |
| LRU / LFU | Evict least recently / frequently used | Caches with bounded memory |
| Redis | In-memory, rich data structures | Cache, session store, rate limiter, distributed lock |
| Memcached | Simple key-value, no persistence | Pure cache, no ops overhead |
| Kafka | Log-based, replay, high throughput | Event pipelines, analytics, fan-out |
| RabbitMQ | Queue-based, flexible routing | Task queues, RPC, low-throughput |
| REST | Stateless HTTP, easy to debug | External APIs, simple CRUD |
| gRPC | Binary, typed, streaming | Internal service-to-service |
| HTTP/2 | Multiplexing, header compression | Modern web, replaces 1.1 |
| HTTP/3 | QUIC over UDP, no head-of-line blocking | Mobile, lossy networks |
| CAP | Pick 2 of Consistency, Availability, Partition tolerance | Distributed data stores |
| Raft | Leader-based consensus | Distributed state (etcd, Consul) |
| LB L4 | TCP-level, faster, no app awareness | High-throughput, simple routing |
| LB L7 | HTTP-aware, routing by path/header | API gateways, ingress |
| Circuit breaker | Stop calling failing service | Service-to-service resilience |
| Backpressure | Slow down upstream when downstream overloaded | Streaming, queues |

2. Databases ​

SQL vs NoSQL ​

SQL (Postgres, MySQL, Spanner):

  • Strong consistency (typically)
  • ACID transactions across rows
  • Rich joins, aggregations, secondary indexes
  • Vertical scale first; horizontal scale requires sharding
  • Use when: data has relationships, transactions matter, schema is stable

NoSQL (varies):

  • Document (MongoDB, Firestore): flexible schema, denormalized reads
  • Key-Value (DynamoDB, Bigtable): O(1) by key, horizontal scale built-in
  • Wide-column (Cassandra, Bigtable): write-heavy, tunable consistency
  • Graph (Neo4j, Neptune): relationship-first queries
  • Horizontal scale by default; eventual consistency (tunable)
  • Use when: data is append-heavy, schema evolves fast, or data is inherently unrelated

Google-specific: Spanner gives you ACID + global scale + SQL. It's NOT a vanilla SQL database — it uses TrueTime and Paxos under the hood. Bigtable is the flagship wide-column store.

ACID (transaction guarantees) ​

| Letter | Guarantee |
| --- | --- |
| A (Atomicity) | All-or-nothing; partial failures roll back |
| C (Consistency) | Transaction leaves the DB in a valid state (constraints hold) |
| I (Isolation) | Concurrent transactions don't see each other's uncommitted state |
| D (Durability) | Committed data survives crashes (fsync) |

Isolation levels (weakest to strongest) ​

| Level | Prevents | Allows |
| --- | --- | --- |
| Read Uncommitted | Nothing | Dirty reads |
| Read Committed | Dirty reads | Non-repeatable reads, phantoms |
| Repeatable Read | Dirty + non-repeatable reads | Phantoms |
| Serializable | All anomalies | Nothing (highest lock cost) |
| Snapshot Isolation (MVCC) | Dirty + non-repeatable reads | Write skew |

Default for Postgres: Read Committed. For MySQL InnoDB: Repeatable Read with phantom protection via gap locks.

Indexes ​

B-tree index: default for most SQL DBs. O(log n) lookup, range queries supported. Covers WHERE, ORDER BY, range scans.

Hash index: O(1) exact lookup, no range. Useful for equality-only lookups.

Composite index: CREATE INDEX ON orders (user_id, created_at). Leftmost prefix rule — can be used for WHERE user_id = ? and WHERE user_id = ? AND created_at > ? but NOT WHERE created_at > ? alone.

Covering index: includes all columns needed by the query, avoiding table lookup. Huge speedup for read-heavy queries.

When to index:

  • Columns in WHERE, JOIN ON, ORDER BY, GROUP BY
  • Columns with high selectivity (few duplicates)
  • Columns read much more than written (indexes slow writes)

When NOT to index:

  • Low-selectivity columns (boolean, status with 3 values)
  • Tables with high write volume and rare reads
  • Columns that are updated frequently

Sharding ​

Horizontal partitioning — split rows across multiple DB instances by a shard key.

Strategies:

  • Range-based: user_id 1-1M on shard A, 1M-2M on shard B. Simple, but hotspots if access is skewed.
  • Hash-based: hash(user_id) % N. Uniform distribution, but resharding is painful when N changes.
  • Consistent hashing: minimizes data movement on rebalance. Industry standard for elastic clusters (sketch after this list).
  • Directory-based: central lookup service maps keys to shards. Flexible, but the lookup is an extra hop + SPOF risk.
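
A minimal consistent-hashing sketch in Python; the class, helper names, and use of md5 are illustrative, not a specific library's API:

```python
import bisect
import hashlib

class ConsistentHashRing:
    """Hash ring with virtual nodes; adding/removing a shard remaps only ~1/N of keys."""

    def __init__(self, shards, vnodes=100):
        self.ring = []            # sorted list of (hash, shard) points
        self.vnodes = vnodes
        for shard in shards:
            self.add(shard)

    def _hash(self, key: str) -> int:
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add(self, shard: str):
        for i in range(self.vnodes):
            bisect.insort(self.ring, (self._hash(f"{shard}#{i}"), shard))

    def remove(self, shard: str):
        self.ring = [(h, s) for (h, s) in self.ring if s != shard]

    def get(self, key: str) -> str:
        # First ring point clockwise from the key's hash (wrap around at the end).
        idx = bisect.bisect(self.ring, (self._hash(key),)) % len(self.ring)
        return self.ring[idx][1]

ring = ConsistentHashRing(["shard-a", "shard-b", "shard-c"])
print(ring.get("user:123"))   # stable assignment until the ring membership changes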

Trade-offs:

  • Cross-shard queries require scatter-gather — slow, expensive
  • Cross-shard transactions require 2PC — complex, blocking
  • Hot partition detection (monitor per-shard QPS) — rebalance before the shard melts

Replication ​

Leader-follower (primary-replica):

  • Writes go to leader, leader replicates to followers
  • Reads can go to followers (read replicas) — reduces leader load
  • Replication lag — followers can be seconds behind (read-your-writes inconsistency)

Multi-leader:

  • Multiple writable replicas
  • Conflict resolution via last-write-wins, vector clocks, CRDTs
  • Use case: multi-region writes where latency matters more than strict consistency

Leaderless (Dynamo-style):

  • Each key is replicated to N nodes; writes wait for W acknowledgements, reads query R replicas
  • If R + W > N, read and write quorums overlap, so a read sees the latest acknowledged write
  • Used by Cassandra, DynamoDB

Consistency models ​

| Model | Guarantee |
| --- | --- |
| Strong (linearizable) | Read sees the latest write, globally |
| Sequential | All clients see operations in the same order, but not necessarily in real-time order |
| Causal | Causally related operations are ordered; unrelated ones may not be |
| Eventual | All replicas converge eventually; no ordering guarantee |

Most distributed SQL systems (Spanner, CockroachDB) provide linearizability. Most NoSQL stores give eventual or tunable consistency.


3. Caching ​

Patterns ​

Cache-aside (lazy loading):

read: app → cache. miss → app → db → app writes to cache → app returns
write: app → db. Invalidate cache entry.

Most common pattern. Resilient (cache failure doesn't break app), but cold starts hurt.
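
A minimal cache-aside sketch in Python, assuming a redis-py style client; `db_load_user` and `db_update_user` are hypothetical DB helpers:

```python
import json
import redis  # assumes the redis-py package

r = redis.Redis()
TTL_SECONDS = 300

def get_user(user_id: int) -> dict:
    key = f"user:{user_id}"
    cached = r.get(key)                           # 1. try the cache
    if cached is not None:
        return json.loads(cached)
    user = db_load_user(user_id)                  # 2. miss: fall back to the DB (hypothetical)
    r.set(key, json.dumps(user), ex=TTL_SECONDS)  # 3. populate the cache with a TTL
    return user

def update_user(user_id: int, fields: dict) -> None:
    db_update_user(user_id, fields)               # 1. write to the DB (hypothetical)
    r.delete(f"user:{user_id}")                   # 2. invalidate; next read repopulates
```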

Write-through:

write: app → cache → db (synchronous). Read always hits cache.

Always-warm cache, but writes pay the cache cost even if the key is never read.

Write-behind (write-back):

write: app → cache (async flush to db).

Fastest writes, but durability risk if the cache crashes before flushing.

Read-through:

Cache acts as the DB proxy. App only talks to cache; cache pulls from DB on miss.

Simpler app code, but couples cache and DB tightly.

Eviction ​

  • LRU (Least Recently Used): evict the entry accessed longest ago. Simple, good for temporal locality.
  • LFU (Least Frequently Used): evict the entry accessed least total times. Good for skewed frequency.
  • FIFO: evict oldest by insertion, regardless of access. Rare in production.
  • TTL (Time-to-Live): expire after N seconds. Combine with LRU for belt-and-suspenders.

LRU implementation: HashMap + Doubly Linked List. O(1) get, O(1) put. See LRU Cache problem.
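
A minimal LRU sketch; Python's `OrderedDict` bundles the hash map and the doubly linked list into one structure, keeping both operations O(1):

```python
from collections import OrderedDict

class LRUCache:
    def __init__(self, capacity: int):
        self.capacity = capacity
        self.data = OrderedDict()   # insertion order doubles as recency order

    def get(self, key):
        if key not in self.data:
            return None
        self.data.move_to_end(key)          # mark as most recently used
        return self.data[key]

    def put(self, key, value):
        if key in self.data:
            self.data.move_to_end(key)
        self.data[key] = value
        if len(self.data) > self.capacity:
            self.data.popitem(last=False)   # evict the least recently used entry

cache = LRUCache(2)
cache.put("a", 1); cache.put("b", 2)
cache.get("a")                              # "a" is now most recent
cache.put("c", 3)                           # evicts "b"
```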

Redis vs Memcached ​

| Feature | Redis | Memcached |
| --- | --- | --- |
| Data structures | Strings, lists, sets, hashes, sorted sets, streams | Strings only |
| Persistence | RDB snapshots + AOF | None (pure cache) |
| Replication | Yes (primary-replica) | No native |
| Clustering | Redis Cluster (sharded) | Client-side sharding |
| Use case | Cache + session + queue + counters + pub/sub | Simple cache, minimal ops |

Rule: start with Redis unless you have a specific reason to use Memcached. The feature surface is worth the complexity.

Cache invalidation ​

"There are only two hard things in computer science: cache invalidation and naming things." — Phil Karlton

Strategies:

  • TTL-based: accept stale reads up to N seconds. Simple, no coordination. Use for non-critical data.
  • Event-based: DB publishes invalidation events on write. Cache consumes and evicts. Complex, fast.
  • Version-based: cache key includes a version (user:123:v7). On write, bump version → old key becomes garbage. Clean but leaks memory until TTL.
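
A minimal sketch of the version-based strategy, assuming a redis-py style client (`INCR` bumps the version atomically; the key names are illustrative):

```python
import redis  # assumes the redis-py package

r = redis.Redis()

def profile_cache_key(user_id: int) -> str:
    version = int(r.get(f"user:{user_id}:ver") or 0)
    return f"user:{user_id}:v{version}"   # e.g. user:123:v7; cache reads/writes use this key

def on_profile_write(user_id: int) -> None:
    r.incr(f"user:{user_id}:ver")         # old user:123:v6 key becomes unreachable
                                          # and ages out via its own TTL
```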

4. Concurrency ​

Threads vs processes ​

  • Threads: share memory, cheaper to spawn, race-condition prone
  • Processes: isolated memory, IPC via pipes/sockets, heavier

Most backend services use many threads in a single process (Java, Go, Python with async). Use processes for true isolation (e.g., sandboxing untrusted code).

Locks ​

  • Mutex: one thread at a time holds the lock.
  • RWLock: many readers OR one writer. Good for read-heavy data.
  • Spinlock: busy-wait, no syscall. Only for very short critical sections.
  • Semaphore: counter-based, allows N concurrent holders. Good for resource pools.

Atomics ​

For simple counter updates, atomics (compare-and-swap, fetch-and-add) are faster than locks. std::atomic<int> in C++, AtomicInteger in Java, sync/atomic in Go.

Rule: atomics for single-variable updates, locks for multi-variable critical sections.

Race conditions ​

Three classic races:

  1. Check-then-act: if (!map.contains(k)) map.put(k, v); — two threads both check, both put. Fix: putIfAbsent.
  2. Read-modify-write: counter = counter + 1 is 3 ops. Fix: atomic ++counter.
  3. Iteration + mutation: iterate a list while another thread inserts. Fix: snapshot iteration, or concurrent collection.
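
A minimal Python sketch of the first two races and their fixes (CPython's GIL makes `dict.setdefault` effectively atomic for simple keys; the counter still needs a lock):

```python
import threading

registry = {}
counter = 0
counter_lock = threading.Lock()

# Race 1 (check-then-act), broken:
#   if key not in registry: registry[key] = value
# Fix: a single put-if-absent call instead of a separate check and put.
def register(key, value):
    registry.setdefault(key, value)   # only the first writer wins

# Race 2 (read-modify-write), broken:
#   counter += 1   # really load, add, store; interleavings lose updates
# Fix: wrap the whole read-modify-write in one critical section.
def increment():
    global counter
    with counter_lock:
        counter += 1
```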

Deadlock prevention ​

A deadlock needs all 4 of Coffman's conditions. Break any one:

  1. Mutual exclusion — hard to remove (some resources must be exclusive)
  2. Hold and wait — acquire all locks upfront
  3. No preemption — allow lock timeout and release
  4. Circular wait — always acquire locks in a global order (most practical fix)

Golden rule: establish a total order on all locks (e.g., alphabetical by name, or numerical by ID) and acquire in that order everywhere. No circular wait, no deadlock.
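
A minimal sketch of the global lock order fix, assuming per-account locks keyed by account ID (`debit` and `credit` are hypothetical ledger operations):

```python
import threading

account_locks = {1: threading.Lock(), 2: threading.Lock()}

def transfer(src: int, dst: int, amount: int):
    # Always acquire in ascending account-ID order, regardless of transfer direction.
    first, second = sorted((src, dst))
    with account_locks[first]:
        with account_locks[second]:
            debit(src, amount)     # hypothetical
            credit(dst, amount)    # hypothetical

# transfer(1, 2, 10) and transfer(2, 1, 5) both lock account 1 before account 2,
# so a circular wait (and thus deadlock) is impossible.
```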


5. Messaging ​

Kafka vs RabbitMQ ​

| Feature | Kafka | RabbitMQ |
| --- | --- | --- |
| Model | Log (append-only partitioned stream) | Queue (routed message broker) |
| Throughput | Very high (millions of msg/s) | Moderate (tens of thousands of msg/s) |
| Retention | Days to forever (configurable) | Until consumed |
| Replay | Built-in (consumer offsets) | No native; requires DLQ / re-publish tricks |
| Ordering | Per-partition | Per-queue |
| Use case | Event streaming, analytics, CDC | Task queues, RPC, flexible routing |

Rule: Kafka for high-throughput event pipelines, RabbitMQ for work queues with flexible routing.

Delivery semantics ​

| Guarantee | Meaning | Cost |
| --- | --- | --- |
| At-most-once | May lose messages, no duplicates | Cheapest; rare in prod |
| At-least-once | May deliver duplicates, no loss | Default; requires idempotent consumers |
| Exactly-once | Each message processed once and only once | Expensive; requires transactional producer + idempotent consumer |

L4 expectation: know that at-least-once + idempotency is the standard production pattern. True exactly-once is rare and expensive (Kafka exactly-once semantics via transactional producers, or 2PC).

Idempotency ​

A consumer is idempotent if processing the same message twice has the same effect as once.

Patterns:

  • Idempotency key: include a unique key per message; consumer checks a dedup store (Redis, DB) before acting (sketch after this list)
  • Upsert semantics: INSERT ... ON CONFLICT DO UPDATE
  • Version/sequence numbers: ignore messages with an older version than the current state
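
A minimal idempotency-key sketch, assuming each message carries a unique `message_id` and Redis is the dedup store (`SET ... NX` makes check-and-mark atomic; `apply_side_effects` is hypothetical business logic):

```python
import redis  # assumes the redis-py package

r = redis.Redis()
DEDUP_TTL = 7 * 24 * 3600   # keep dedup keys long enough to cover redeliveries

def handle(message: dict) -> None:
    dedup_key = f"processed:{message['message_id']}"
    # SET ... NX returns True only for the first writer of the key.
    if not r.set(dedup_key, 1, nx=True, ex=DEDUP_TTL):
        return                      # duplicate delivery: drop it
    # In production, mark only after the side effect commits, or make the
    # side effect itself an upsert, so a crash here can't lose the message.
    apply_side_effects(message)     # hypothetical
```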

Dead Letter Queue (DLQ) ​

When a message fails processing N times, move it to a DLQ for manual inspection or async retry. Prevents one bad message from blocking the queue.

Pattern:

  • Main queue → consumer attempts processing → fail → retry up to N times
  • After N failures → move to DLQ
  • Alerting on DLQ depth; replay tool to re-enqueue fixed messages
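
A minimal retry-then-DLQ sketch; `handle`, `publish`, and the queue names are illustrative, not a specific broker's API:

```python
MAX_ATTEMPTS = 3

def process_with_dlq(message: dict) -> None:
    attempts = message.get("attempts", 0)
    try:
        handle(message)                          # hypothetical business logic
    except Exception:
        if attempts + 1 >= MAX_ATTEMPTS:
            publish("orders.dlq", message)       # park it for inspection / replay
        else:
            message["attempts"] = attempts + 1
            publish("orders.retry", message)     # re-enqueue, ideally with a delay
```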

6. Networking ​

HTTP versions ​

| Version | Key features |
| --- | --- |
| HTTP/1.1 | Text-based; one request at a time per connection (or pipelining); head-of-line blocking |
| HTTP/2 | Binary framing; multiplexing over one TCP connection; header compression (HPACK); server push |
| HTTP/3 | Built on QUIC (UDP); no transport-layer head-of-line blocking; faster handshake; better on lossy networks |

L4 rule: know the differences conceptually. In practice, web backends use HTTP/2 or HTTP/3; internal service-to-service often uses gRPC (which runs over HTTP/2).

REST vs gRPC ​

| Feature | REST (JSON) | gRPC (Protobuf) |
| --- | --- | --- |
| Format | JSON (text) | Protobuf (binary) |
| Schema | Open (OpenAPI optional) | Strictly typed (.proto files) |
| Streaming | Limited (SSE, WebSockets) | First-class (unary, server-stream, client-stream, bidirectional) |
| Debuggability | Easy (curl, browser) | Harder (grpcurl, bloomrpc) |
| Size on wire | Larger | ~5-10x smaller |
| Use case | Public APIs, simple CRUD | Internal microservices, high throughput, streaming |

DNS ​

DNS resolves hostnames to IPs.

  • A record: hostname → IPv4
  • AAAA: hostname → IPv6
  • CNAME: alias (hostname → hostname)
  • TXT: arbitrary text (used for SPF, DKIM, domain verification)
  • MX: mail server records

Resolution flow: stub resolver → recursive resolver (often ISP or 8.8.8.8) → root → TLD → authoritative → answer (cached at each step).

TTL: controls how long a record is cached. Low TTL (60s) for agile changes, high TTL (1h+) for stable records.

TLS ​

TLS provides confidentiality, integrity, authentication for connections (HTTPS, gRPC, internal mTLS).

Handshake (TLS 1.3 simplified):

  1. Client Hello (supported cipher suites, ECDHE key share, random)
  2. Server Hello (chosen cipher, its own key share, random), followed by the server certificate
  3. Client verifies the certificate chain against trusted CAs
  4. Both sides derive session keys from the ECDHE key shares; encrypted communication begins

mTLS (mutual TLS): client also presents a certificate. Used for service-to-service auth in zero-trust architectures.

Load Balancers ​

Layer 4 (TCP):

  • Operates on IP + port
  • No HTTP awareness (cannot route by URL, header)
  • Very fast, used for raw TCP workloads (databases, gRPC before L7 support)
  • Examples: AWS NLB, Google Network Load Balancer

Layer 7 (HTTP):

  • Terminates HTTPS, routes by path, header, cookie
  • Can do URL rewriting, header injection, rate limiting
  • Slower than L4 but vastly more flexible
  • Examples: nginx, HAProxy, AWS ALB, GCP HTTPS LB, Envoy

Algorithms:

  • Round-robin: each request to the next server
  • Least-connections: route to the server with fewest active connections
  • IP-hash / consistent-hash: same client always goes to the same backend (sticky sessions, cache locality)
  • Weighted: assign different capacities to different backends

7. Distributed Systems ​

CAP Theorem ​

"In a distributed system experiencing a network partition, you must choose between Consistency and Availability."

  • CP systems: refuse requests during partition rather than return stale data. Examples: Zookeeper, etcd, Spanner (formally CP, though Google describes it as "effectively CA" because partitions are rare on its private network).
  • AP systems: always respond, even with stale data. Examples: Cassandra (tunable), DynamoDB (tunable), Couchbase.

L4 rule: in practice, CAP is a spectrum, not a binary. Most systems are "mostly consistent with eventual fallback" or "mostly available with best-effort consistency." Know this; don't over-cite CAP.

Consistency models recap (from above) ​

Strong > Sequential > Causal > Eventual

Consensus: Raft ​

Raft elects a leader; the leader accepts all writes and replicates to followers. If the leader fails, followers elect a new one.

Key mechanisms:

  • Leader election: followers wait for heartbeat; if missing, start an election with a random timeout (prevents split votes)
  • Log replication: leader appends entry locally, sends to followers, commits once a majority acknowledge
  • Safety: a new leader must have all committed entries from the previous term (election restrictions enforce this)

Use cases: etcd (Kubernetes config), Consul (service discovery), CockroachDB (per-range consensus).

L4 rule: know that Raft is the modern pragmatic choice over Paxos (which is notoriously hard to implement correctly). You don't need to implement Raft — you need to know its role.

Distributed locks ​

Single-node locks don't work across multiple processes. Common options:

Redis (Redlock):

  • Acquire with SET key value NX EX ttl (atomic set-if-absent with an expiry; plain SETNX cannot set a TTL; sketch below)
  • Client must refresh TTL or release before expiry
  • Caveats: clock drift, GC pauses, Redis failover can break correctness. Redlock is debated (Martin Kleppmann's critique).
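
A minimal single-node Redis lock sketch with redis-py; the release path uses a small Lua script so only the token's owner can delete the key. This is the simple lock, not full Redlock:

```python
import uuid
import redis  # assumes the redis-py package

r = redis.Redis()

RELEASE_SCRIPT = """
if redis.call('get', KEYS[1]) == ARGV[1] then
  return redis.call('del', KEYS[1])
else
  return 0
end
"""

def acquire(name: str, ttl_s: int = 10):
    token = str(uuid.uuid4())
    # SET ... NX EX: set only if absent, with an expiry so a crashed holder releases.
    if r.set(f"lock:{name}", token, nx=True, ex=ttl_s):
        return token
    return None

def release(name: str, token: str) -> bool:
    # Compare-and-delete: never release a lock someone else re-acquired after our TTL.
    return bool(r.eval(RELEASE_SCRIPT, 1, f"lock:{name}", token))
```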

Zookeeper / etcd:

  • Create ephemeral sequential node
  • Watch the node just before yours in sequence
  • Session death automatically releases (ephemeral)
  • Safer than Redis for correctness-critical locks, higher latency

Database row lock:

  • SELECT ... FOR UPDATE
  • Simple, ACID-safe, but scales poorly

Rule: for most backend use cases, Redis is "good enough" if you can tolerate rare correctness violations. Use Zookeeper/etcd if you need strict correctness.

Circuit breaker ​

When a downstream service is failing, stop calling it to prevent cascading failure.

States:

  • Closed: requests pass through. Count failures.
  • Open: after failure threshold, requests fail fast without calling the downstream.
  • Half-open: after a cooldown, allow a trial request. Success → close; failure → re-open.
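
A minimal in-process sketch of these three states (thresholds are illustrative; a production breaker also needs locking and a sliding failure window):

```python
import time

class CircuitBreaker:
    def __init__(self, failure_threshold=5, cooldown_s=30):
        self.failure_threshold = failure_threshold
        self.cooldown_s = cooldown_s
        self.failures = 0
        self.state = "closed"
        self.opened_at = 0.0

    def call(self, fn, *args, **kwargs):
        if self.state == "open":
            if time.time() - self.opened_at < self.cooldown_s:
                raise RuntimeError("circuit open: failing fast")
            self.state = "half-open"          # cooldown elapsed: allow one trial call
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.state == "half-open" or self.failures >= self.failure_threshold:
                self.state = "open"           # trip the breaker
                self.opened_at = time.time()
            raise
        else:
            self.failures = 0                 # success in closed or half-open: close
            self.state = "closed"
            return result
```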

Libraries: Netflix Hystrix (deprecated), Resilience4j, Polly (.NET), Istio (sidecar-level).

Other resilience patterns ​

  • Retries with exponential backoff + jitter — avoid thundering-herd on recovery (sketch after this list)
  • Bulkhead — isolate pools of resources (e.g., separate thread pool per downstream) so one slow dep doesn't starve the rest
  • Timeout — every remote call MUST have a timeout; no exceptions
  • Rate limiting — protect yourself and downstream (token bucket, sliding window — see HLD 51)
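
A minimal retries-with-backoff-and-jitter sketch ("full jitter": sleep a random amount up to the exponential cap):

```python
import random
import time

def call_with_retries(fn, max_attempts=5, base_s=0.1, cap_s=10.0):
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise
            # Exponential cap: 0.1s, 0.2s, 0.4s, ...; full jitter spreads retries
            # out so a recovering service isn't hit by a synchronized herd.
            time.sleep(random.uniform(0, min(cap_s, base_s * 2 ** attempt)))
```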

8. Scaling Patterns ​

Vertical vs horizontal ​

Vertical (scale up): bigger machine. Simple, but has a ceiling (~96-128 vCPUs on commodity cloud) and expensive per unit.

Horizontal (scale out): more machines. Harder (need to shard, handle node failures) but scales ~linearly if done right.

Rule: vertical first for stateful systems (DBs), horizontal first for stateless (web servers). At Google scale, everything is horizontal.

CDN ​

Content Delivery Network — edge caches close to users.

Use cases:

  • Static assets (images, JS bundles, fonts)
  • Cached API responses (GET with short TTL)
  • Video streaming

Key concepts:

  • Origin shield: single path from CDN edges to origin, reduces origin load
  • Cache key: usually URL + query params + headers (like Accept-Language)
  • Cache-Control headers: public, max-age=3600 for CDN-cacheable; private, no-store for user-specific
  • Purge / invalidation: CDN APIs to evict stale content on deploy

Read replicas ​

Async-replicated copies of the primary DB for read-scaling.

Trade-offs:

  • Eventual consistency — replicas lag by ms to seconds
  • Read-your-writes issue — user writes to primary, reads from replica, might not see own write
  • Solution: sticky reads (route same user to primary for N seconds after write) or read-from-primary for sensitive reads

Hot partition detection ​

When one shard gets disproportionate traffic:

  • Detect: per-shard QPS, per-shard latency, per-shard CPU dashboards
  • Mitigate: add a cache in front of the hot key, or re-shard to split the hot partition
  • Prevent: choose shard key with high cardinality AND uniform access pattern. user_id is usually fine; country_code is a hotspot waiting to happen.

Other patterns to know ​

  • Sidecar: co-located process handles cross-cutting concerns (logging, tracing, mTLS). Istio/Envoy pattern.
  • Service mesh: network of sidecars providing service discovery, LB, observability, policy enforcement.
  • CQRS: separate read model from write model. Writes go to normalized store, reads come from denormalized view. Good for read-heavy with complex queries.
  • Event sourcing: store state as a log of events; current state is a fold over events. Pairs well with CQRS. Replay-able, auditable, but higher complexity.

9. L4-Scoped System Design Framework ​

When you get the Variant B HLD round, compress your thinking into these 8 steps. Budget 45 min total.

| Step | Time | Output |
| --- | --- | --- |
| 1. Clarify requirements | 5 min | Functional + non-functional requirements (QPS, data size, SLA) |
| 2. API design | 5 min | Endpoint signatures, request/response shapes |
| 3. Data model | 5 min | Tables / entities / keys |
| 4. High-level architecture | 10 min | Boxes-and-arrows: LB, app tier, DB, cache, queue |
| 5. Deep dive one component | 10 min | The interviewer picks; expect scaling or consistency |
| 6. Scaling | 5 min | Sharding, replication, caching, CDN |
| 7. Reliability | 3 min | Retries, circuit breakers, monitoring, alerting |
| 8. Trade-offs | 2 min | Cost vs latency, consistency vs availability |

L4 scale guardrails: millions of users, hundreds-to-thousands QPS, gigabytes-to-terabytes data. DON'T over-design for billions. Simpler = better at this level.

See HLD notes 50-55 for worked examples.


Cross-references ​

Frontend interview preparation reference.