23. Problem: Cache with Eviction Policies

Understanding the Problem

An in-memory cache is one of those systems that's almost trivial to write naively and almost impossible to get right in production. A junior writes a Map with a maxSize and a get/put pair. A mid-level writes an LRU with a doubly-linked list. A senior treats the cache as a framework: a thin facade over storage, with the eviction decision plugged in through an interface, metrics emitted on every mutation, stampede protection on the hot path, and a concurrency model that doesn't serialise every request behind a single mutex.
The senior answer factors eviction as a Strategy
LRU is the default you reach for in every interview, but it's only one of four or five policies a real cache needs to support. LFU wins under Zipfian workloads. FIFO is dead simple and sometimes good enough. TTL is orthogonal; you want it composable on top of any policy. The senior move is to recognise that IEvictionPolicy is the right abstraction, write the cache around it, and then implement LRU as the first plugin. Then LFU, FIFO, and TTL fall out without touching the cache.
The problem also rewards you for naming three things a junior never thinks about: observability (hits, misses, evictions, latency percentiles), thread safety (striped locks, not a single giant mutex), and stampede protection (singleflight / request coalescing so ten threads asking for the same missing key don't all stampede the backing source). If you mention none of these, you are SDE1. If you mention all three and explain the tradeoffs, you are SDE3.
Clarifying Questions
You: Is this an in-process cache only, or do we need to support a tiered setup with a local L1 and a distributed L2 (Redis, Memcached)?
Interviewer: In-process for the core design. Extensibility to tiered is a follow-up.
You: Is TTL per-key, global, or both? And is it required, or only for some entries?
Interviewer: Per-key, optional. Some entries live forever until evicted by capacity; others have an explicit expiry.
You: What's the expected access pattern: read-heavy, write-heavy, or balanced? That affects the concurrency model.
Interviewer: Assume read-heavy, roughly 20:1 reads to writes. But the design should be sane under write-heavy too.
You: Is observability in scope β hit rate, miss rate, eviction count, latency?
Interviewer: Yes. At minimum counters for hits / misses / evictions, exposed for scraping.
You: Stampede protection? If 100 threads miss on the same key, do we want 100 loads or one?
Interviewer: One. Assume the loader is expensive: a DB query or an RPC. Describe how you'd coalesce.
You: Is capacity measured by entry count, or by bytes?
Interviewer: Count for the core design. Byte-based is a follow-up. Discuss the tradeoff.
You: Thread safety: single-threaded for the core, or concurrent from the start?
Interviewer: Concurrent. Describe the lock strategy, but implement in TypeScript where the concurrency is mostly conceptual.
You: Do we need to support eviction callbacks, i.e. "tell me when a key is evicted"?
Interviewer: Yes, via an observer hook.
You: Can the eviction policy be swapped at runtime, or only at construction?
Interviewer: Construction-time. Hot-swapping is a separate problem.
Functional Requirements

- `get(key)`: returns the value if present and not expired, otherwise `undefined`. Updates policy state (e.g. LRU recency, LFU frequency).
- `put(key, value, ttlMs?)`: inserts or updates. If at capacity, triggers exactly one eviction (per the current policy).
- `delete(key)`: removes the entry and cleans up any policy state.
- `has(key)`: presence check without altering policy state (debated: some policies would count it as an access; we treat `has` as read-only).
- `size()`: number of live, unexpired entries.
- `clear()`: drop everything.
- Pluggable policy: LRU, LFU, FIFO, TTL. TTL is composable on top of any of the others.
- Observability: hit / miss / eviction / expiry counters, exposed for read.
- Eviction callback: subscribers notified on every eviction with the key and reason (`"capacity" | "ttl" | "manual"`).
- Stampede protection for a `getOrLoad(key, loader)` path: concurrent misses on the same key coalesce into a single loader invocation.
Non-Functional Requirements

- O(1) `get` and `put`, including the policy's bookkeeping.
- O(1) eviction selection: picking the victim must not be a linear scan. Rules out naive LFU that scans an array of frequencies.
- Bounded memory: capacity is a hard ceiling. No unbounded auxiliary structures.
- Thread-safe under concurrent readers and writers. Target: lock contention that degrades gracefully, not a single mutex that serialises the world.
- Observability: counters updated on the hot path must be cheap (atomic / lock-free counters, not a shared mutex).
- Extensibility: a new eviction policy is a new class implementing `IEvictionPolicy`. No change to the cache body.
Core Entities and Relationships

| Entity | Responsibility |
|---|---|
| `Cache<K, V>` | Facade. Owns storage (a `Map`), delegates eviction decisions to an `IEvictionPolicy`, emits metrics and events. |
| `Entry<V>` | Value object stored in the map: value, `expiresAt?`, `createdAt`. |
| `IEvictionPolicy<K>` | Strategy. Three hooks: `onAccess(key)`, `onPut(key)`, `onDelete(key)`, plus `evict(): K` to pick a victim. |
| `LRUPolicy<K>` / `LFUPolicy<K>` / `FIFOPolicy<K>` | Concrete policies. Each owns its own ordering/scoring structure. |
| `TTLDecorator<K>` | Wraps any `IEvictionPolicy` with a per-key expiry check, backed by a min-heap (or sorted structure) of expiries and a periodic sweeper. |
| `IClock` | Abstraction for `now()`. Injected so tests can advance time deterministically. |
| `MetricsEmitter` | Thin wrapper around counters (hits, misses, evictions, expiries). Thread-safe counter updates. |
| `EvictionListener<K>` | Observer. `onEvict(key, reason)`. |
| `SingleFlight<K, V>` | De-duplicates concurrent loads for the same key. |

The key relationship: Cache has-an IEvictionPolicy, has-an IClock, has-a MetricsEmitter, has-many EvictionListeners. The policy owns its ordering state; the cache never touches it except through the three hooks.
Interfaces

```ts
interface IClock {
now(): number; // epoch millis
}
interface Entry<V> {
readonly value: V;
readonly createdAt: number;
readonly expiresAt?: number; // undefined => never expires
}
type EvictionReason = "capacity" | "ttl" | "manual";
interface EvictionListener<K> {
onEvict(key: K, reason: EvictionReason): void;
}
interface IEvictionPolicy<K> {
/** Called after a successful get. Not called on miss. */
onAccess(key: K): void;
/** Called after a put (new or update). */
onPut(key: K): void;
/** Called when a key is removed (manual delete, eviction, TTL expiry). */
onDelete(key: K): void;
/** Pick a victim. Must be O(1). Caller is responsible for removing from storage. */
evict(): K;
/** Current number of keys tracked. Used for invariant checks. */
size(): number;
}
interface ICache<K, V> {
get(key: K): V | undefined;
put(key: K, value: V, ttlMs?: number): void;
delete(key: K): boolean;
has(key: K): boolean;
size(): number;
clear(): void;
addListener(listener: EvictionListener<K>): void;
metrics(): Readonly<CacheMetrics>;
}
interface CacheMetrics {
hits: number;
misses: number;
evictions: number;
expiries: number;
}
```

Why three hooks instead of one generic `onEvent(type, key)`? Typing. Each hook has a clear contract. `onAccess` on a miss is illegal; the type system shouldn't let us accidentally call it from the miss path. Separate methods also mean the policy body has no switch over event types.
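The `Cache` constructor shown later defaults its clock to a `SystemClock`, which the text references but never defines. A minimal sketch of both production and test clocks; the `ManualClock` name and its `advance` helper are illustrative, not part of the original design:

```ts
// Production clock: wall time.
class SystemClock implements IClock {
  now(): number {
    return Date.now();
  }
}

// Test clock: time moves only when the test says so.
class ManualClock implements IClock {
  private t = 0;
  now(): number {
    return this.t;
  }
  advance(ms: number): void {
    this.t += ms;
  }
}
```

Injecting the clock is what lets a TTL test call `clock.advance(ttlMs + 1)` and assert the entry reports expired, without sleeping.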
Class Diagram

```
            ┌──────────────────────────┐
            │       ICache<K,V>        │
            └────────────▲─────────────┘
                         │
            ┌────────────┴─────────────┐
            │        Cache<K,V>        │
            │ - storage: Map<K,Entry>  │
            │ - policy: IEvictionPolicy│
            │ - clock: IClock          │
            │ - metrics: MetricsEmitter│
            │ - listeners: Listener[]  │
            │ - loader: SingleFlight   │
            └────────┬──────────┬──────┘
                     │          │
        ┌────────────┘          └────────────┐
        ▼                                    ▼
┌──────────────────────────┐   ┌──────────────────────────┐
│    IEvictionPolicy<K>    │   │          IClock          │
└────────────▲─────────────┘   └──────────────────────────┘
             │
  ┌──────────┼──────────┬──────────────────┐
  │          │          │                  │
┌──────────┐┌──────────┐┌──────────┐ ┌────────────────────┐
│LRUPolicy ││LFUPolicy ││FIFOPolicy│ │  TTLDecorator<K>   │
│ DLL+Map  ││  2 maps  ││  queue   │ │  wraps any policy  │
└──────────┘└──────────┘└──────────┘ │  + expiry heap     │
                                     └────────────────────┘
┌──────────────────────────┐  ┌───────────────────────────┐
│      MetricsEmitter      │  │     SingleFlight<K,V>     │
│   hits / misses / ...    │  │ in-flight: Map<K,Promise> │
└──────────────────────────┘  └───────────────────────────┘
```

Class Design

Cache<K, V>: the facade
State:
- `storage: Map<K, Entry<V>>`: the actual key-value table.
- `policy: IEvictionPolicy<K>`: the pluggable strategy.
- `clock: IClock`: injected for TTL and test determinism.
- `metrics: MetricsEmitter`: hit/miss/eviction counters.
- `listeners: EvictionListener<K>[]`: notified on every eviction.
- `capacity: number`: entry count ceiling.
- `singleFlight: SingleFlight<K, V>`: for `getOrLoad`.
The facade never implements eviction logic itself. It calls policy.onPut / onAccess / onDelete and, when storage is full, policy.evict() to get a victim key, then removes that key from storage. All ordering/scoring lives in the policy.
LRUPolicy<K>: doubly-linked list + hash map
The classic O(1) LRU. State:
- `map: Map<K, Node<K>>`: key → DLL node.
- `head, tail: Node<K>`: sentinels. Head side is most-recently-used; tail side is least-recently-used.
On onAccess(k) and onPut(k): move map.get(k) to just after head (create the node first on put). On evict(): unlink the node just before tail, delete it from the map, return its key.
Every operation is O(1). The DLL exists precisely because a JS Map doesn't let you move a key to the front in O(1): you'd have to delete and re-insert, which is O(1) amortised but churns the map and couples your ordering to an iteration-order implementation detail.
LFUPolicy<K>: the O(1) algorithm

Naive LFU is "scan all entries, pick the minimum frequency", which is O(n) per eviction. The O(1) algorithm (Shah et al., 2010; widely adopted) uses two maps:

- `keyToFreq: Map<K, FreqNode>`: key → its frequency bucket.
- `freqList: DoublyLinkedList<FreqNode>`: buckets ordered by frequency, ascending.

Each `FreqNode` holds a `frequency: number` and a `keys: LinkedHashSet<K>`, an insertion-ordered set, so that within a frequency bucket the least-recently-added key is evicted first (that's the tiebreaker).
On onAccess(k):
- Find the current bucket `b = keyToFreq.get(k)`.
- Remove `k` from `b.keys`.
- Find-or-create bucket `b'` with frequency `b.frequency + 1`, inserted right after `b` in `freqList`.
- Add `k` to `b'.keys`; update `keyToFreq.set(k, b')`.
- If `b.keys` is now empty, unlink `b` from `freqList`.
On evict(): take the head of freqList (lowest frequency), remove its oldest key, return it.
All operations are O(1).
FIFOPolicy<K>: plain queue
A doubly-linked list with only add-to-back and remove-from-front. onAccess is a no-op (FIFO doesn't care about access). onPut appends; evict() pops the front.
TTLDecorator<K>: wraps any policy
Composable. Wraps an IEvictionPolicy<K> and adds expiry tracking:
- `expiryHeap: MinHeap<{ key: K; expiresAt: number }>`: lets the sweeper find the earliest-expiring key in O(log n).
- `expiryMap: Map<K, number>`: for lazy expiry on access.
Two expiry strategies, used together:
- Lazy: on `get`, check `expiryMap.get(k)` against `clock.now()`. If expired, delete and report a miss.
- Active: a periodic sweeper (`setInterval`, or a dedicated thread in Java) that pops expired entries from `expiryHeap` and evicts them.
The hybrid avoids the two failure modes of each alone: lazy-only lets cold expired entries squat forever; active-only wastes wakeups and may miss entries that expire between sweeps.
SingleFlight<K, V>: stampede protection
State: inFlight: Map<K, Promise<V>>.
On getOrLoad(k, loader):
- If `inFlight.has(k)`, return the same promise.
- Otherwise, invoke `loader()`, store its promise in `inFlight`, and on settle (success or failure) remove the entry.
All N callers get the same promise. One actual load.
Key Methods

Cache: the facade

```ts
class Cache<K, V> implements ICache<K, V> {
private readonly storage = new Map<K, Entry<V>>();
private readonly listeners: EvictionListener<K>[] = [];
private readonly metricsEmitter = new MetricsEmitter();
private readonly singleFlight = new SingleFlight<K, V>();
constructor(
private readonly capacity: number,
private readonly policy: IEvictionPolicy<K>,
private readonly clock: IClock = new SystemClock(),
) {
if (capacity <= 0) throw new RangeError("capacity must be > 0");
}
get(key: K): V | undefined {
const entry = this.storage.get(key);
if (entry === undefined) {
this.metricsEmitter.miss();
return undefined;
}
if (this.isExpired(entry)) {
this.storage.delete(key);
this.policy.onDelete(key);
this.metricsEmitter.expiry();
this.notify(key, "ttl");
this.metricsEmitter.miss();
return undefined;
}
this.policy.onAccess(key);
this.metricsEmitter.hit();
return entry.value;
}
put(key: K, value: V, ttlMs?: number): void {
const now = this.clock.now();
const entry: Entry<V> = {
value,
createdAt: now,
expiresAt: ttlMs !== undefined ? now + ttlMs : undefined,
};
if (this.storage.has(key)) {
// Update path: no eviction, no capacity check.
this.storage.set(key, entry);
this.policy.onPut(key);
return;
}
// Insert path β may need to evict.
if (this.storage.size >= this.capacity) {
const victim = this.policy.evict();
this.storage.delete(victim);
this.metricsEmitter.eviction();
this.notify(victim, "capacity");
}
this.storage.set(key, entry);
this.policy.onPut(key);
}
delete(key: K): boolean {
if (!this.storage.has(key)) return false;
this.storage.delete(key);
this.policy.onDelete(key);
this.notify(key, "manual");
return true;
}
has(key: K): boolean {
const entry = this.storage.get(key);
if (entry === undefined) return false;
if (this.isExpired(entry)) {
this.storage.delete(key);
this.policy.onDelete(key);
this.metricsEmitter.expiry();
this.notify(key, "ttl");
return false;
}
return true;
}
size(): number {
return this.storage.size;
}
clear(): void {
for (const k of this.storage.keys()) this.policy.onDelete(k);
this.storage.clear();
}
addListener(listener: EvictionListener<K>): void {
this.listeners.push(listener);
}
metrics(): Readonly<CacheMetrics> {
return this.metricsEmitter.snapshot();
}
async getOrLoad(key: K, loader: () => Promise<V>, ttlMs?: number): Promise<V> {
const cached = this.get(key);
if (cached !== undefined) return cached;
// Coalesce concurrent misses.
return this.singleFlight.do(key, async () => {
// Double-check after acquiring the slot: another winner may have populated it.
const again = this.get(key);
if (again !== undefined) return again;
const loaded = await loader();
this.put(key, loaded, ttlMs);
return loaded;
});
}
private isExpired(entry: Entry<V>): boolean {
return entry.expiresAt !== undefined && this.clock.now() >= entry.expiresAt;
}
private notify(key: K, reason: EvictionReason): void {
for (const l of this.listeners) {
try {
l.onEvict(key, reason);
} catch {
// Listener errors must not break the cache. Log in a real system.
}
}
}
}
```

LRUPolicy: full implementation

```ts
class LRUNode<K> {
constructor(
public key: K,
public prev: LRUNode<K> | null = null,
public next: LRUNode<K> | null = null,
) {}
}
class LRUPolicy<K> implements IEvictionPolicy<K> {
private readonly map = new Map<K, LRUNode<K>>();
private readonly head: LRUNode<K>; // sentinel: MRU side
private readonly tail: LRUNode<K>; // sentinel: LRU side
constructor() {
// Sentinels. The key type is a lie for sentinels, but they're never exposed.
this.head = new LRUNode<K>(undefined as unknown as K);
this.tail = new LRUNode<K>(undefined as unknown as K);
this.head.next = this.tail;
this.tail.prev = this.head;
}
onAccess(key: K): void {
const node = this.map.get(key);
if (!node) return; // Defensive: policy state out of sync would be a bug.
this.unlink(node);
this.pushFront(node);
}
onPut(key: K): void {
const existing = this.map.get(key);
if (existing) {
this.unlink(existing);
this.pushFront(existing);
return;
}
const node = new LRUNode(key);
this.map.set(key, node);
this.pushFront(node);
}
onDelete(key: K): void {
const node = this.map.get(key);
if (!node) return;
this.unlink(node);
this.map.delete(key);
}
evict(): K {
const victim = this.tail.prev!;
if (victim === this.head) {
throw new Error("evict() called on empty policy");
}
this.unlink(victim);
this.map.delete(victim.key);
return victim.key;
}
size(): number {
return this.map.size;
}
private pushFront(node: LRUNode<K>): void {
node.prev = this.head;
node.next = this.head.next;
this.head.next!.prev = node;
this.head.next = node;
}
private unlink(node: LRUNode<K>): void {
node.prev!.next = node.next;
node.next!.prev = node.prev;
node.prev = null;
node.next = null;
}
}
```

LFUPolicy: O(1) with two maps and a frequency DLL

```ts
class LinkedHashSet<K> {
// Insertion-ordered set. In JS, a plain Map<K, true> preserves insertion order.
private readonly s = new Map<K, true>();
add(k: K): void { this.s.set(k, true); }
delete(k: K): boolean { return this.s.delete(k); }
has(k: K): boolean { return this.s.has(k); }
size(): number { return this.s.size; }
oldest(): K | undefined {
// First inserted key.
const it = this.s.keys().next();
return it.done ? undefined : it.value;
}
[Symbol.iterator](): IterableIterator<K> { return this.s.keys(); }
}
class FreqNode<K> {
constructor(
public frequency: number,
public keys: LinkedHashSet<K> = new LinkedHashSet<K>(),
public prev: FreqNode<K> | null = null,
public next: FreqNode<K> | null = null,
) {}
}
class LFUPolicy<K> implements IEvictionPolicy<K> {
private readonly keyToFreq = new Map<K, FreqNode<K>>();
private readonly head: FreqNode<K>; // sentinel: lowest frequency side
private readonly tail: FreqNode<K>; // sentinel: highest frequency side
constructor() {
this.head = new FreqNode<K>(-1);
this.tail = new FreqNode<K>(Number.POSITIVE_INFINITY);
this.head.next = this.tail;
this.tail.prev = this.head;
}
onAccess(key: K): void {
const bucket = this.keyToFreq.get(key);
if (!bucket) return;
this.bumpKey(key, bucket);
}
onPut(key: K): void {
const existing = this.keyToFreq.get(key);
if (existing) {
this.bumpKey(key, existing);
return;
}
// New key. Frequency 1. Insert into bucket right after head.
let firstBucket = this.head.next!;
if (firstBucket === this.tail || firstBucket.frequency !== 1) {
firstBucket = this.insertAfter(this.head, new FreqNode<K>(1));
}
firstBucket.keys.add(key);
this.keyToFreq.set(key, firstBucket);
}
onDelete(key: K): void {
const bucket = this.keyToFreq.get(key);
if (!bucket) return;
bucket.keys.delete(key);
this.keyToFreq.delete(key);
if (bucket.keys.size() === 0) this.unlinkBucket(bucket);
}
evict(): K {
const firstBucket = this.head.next!;
if (firstBucket === this.tail) {
throw new Error("evict() called on empty policy");
}
// LRU tiebreak within a frequency bucket.
const victim = firstBucket.keys.oldest();
if (victim === undefined) {
throw new Error("invariant broken: bucket in list but empty");
}
firstBucket.keys.delete(victim);
this.keyToFreq.delete(victim);
if (firstBucket.keys.size() === 0) this.unlinkBucket(firstBucket);
return victim;
}
size(): number {
return this.keyToFreq.size;
}
private bumpKey(key: K, bucket: FreqNode<K>): void {
const nextFreq = bucket.frequency + 1;
let nextBucket = bucket.next!;
if (nextBucket === this.tail || nextBucket.frequency !== nextFreq) {
nextBucket = this.insertAfter(bucket, new FreqNode<K>(nextFreq));
}
bucket.keys.delete(key);
nextBucket.keys.add(key);
this.keyToFreq.set(key, nextBucket);
if (bucket.keys.size() === 0) this.unlinkBucket(bucket);
}
private insertAfter(anchor: FreqNode<K>, node: FreqNode<K>): FreqNode<K> {
node.prev = anchor;
node.next = anchor.next;
anchor.next!.prev = node;
anchor.next = node;
return node;
}
private unlinkBucket(node: FreqNode<K>): void {
node.prev!.next = node.next;
node.next!.prev = node.prev;
node.prev = null;
node.next = null;
}
}
```

FIFOPolicy: one queue, nothing fancy

```ts
class FIFOPolicy<K> implements IEvictionPolicy<K> {
private readonly order = new Set<K>(); // insertion-ordered
onAccess(_key: K): void { /* no-op: FIFO ignores reads */ }
onPut(key: K): void {
if (this.order.has(key)) {
// Policy choice: update-in-place does NOT re-queue. That's classic FIFO.
return;
}
this.order.add(key);
}
onDelete(key: K): void {
this.order.delete(key);
}
evict(): K {
const first = this.order.values().next();
if (first.done) throw new Error("evict() called on empty policy");
this.order.delete(first.value);
return first.value;
}
size(): number {
return this.order.size;
}
}
```

TTLDecorator: hybrid lazy + active expiry

```ts
interface IExpirySink<K> {
deleteExpired(key: K): void;
}
class MinHeap<T> {
private readonly data: T[] = [];
constructor(private readonly less: (a: T, b: T) => boolean) {}
size(): number { return this.data.length; }
peek(): T | undefined { return this.data[0]; }
push(v: T): void {
this.data.push(v);
this.siftUp(this.data.length - 1);
}
pop(): T | undefined {
if (this.data.length === 0) return undefined;
const top = this.data[0];
const last = this.data.pop()!;
if (this.data.length > 0) {
this.data[0] = last;
this.siftDown(0);
}
return top;
}
private siftUp(i: number): void {
while (i > 0) {
const p = (i - 1) >> 1;
if (this.less(this.data[i], this.data[p])) {
[this.data[i], this.data[p]] = [this.data[p], this.data[i]];
i = p;
} else break;
}
}
private siftDown(i: number): void {
const n = this.data.length;
while (true) {
const l = 2 * i + 1, r = 2 * i + 2;
let m = i;
if (l < n && this.less(this.data[l], this.data[m])) m = l;
if (r < n && this.less(this.data[r], this.data[m])) m = r;
if (m === i) break;
[this.data[i], this.data[m]] = [this.data[m], this.data[i]];
i = m;
}
}
}
interface ExpiryRecord<K> { key: K; expiresAt: number; generation: number; }
class TTLManager<K> {
private readonly heap = new MinHeap<ExpiryRecord<K>>((a, b) => a.expiresAt < b.expiresAt);
// Generation counter per key to invalidate stale heap entries after update.
private readonly generation = new Map<K, number>();
private sweepHandle: ReturnType<typeof setInterval> | null = null;
constructor(
private readonly clock: IClock,
private readonly sink: IExpirySink<K>,
private readonly sweepIntervalMs: number = 1000,
) {}
track(key: K, expiresAt: number): void {
const gen = (this.generation.get(key) ?? 0) + 1;
this.generation.set(key, gen);
this.heap.push({ key, expiresAt, generation: gen });
}
untrack(key: K): void {
// Invalidate all heap entries for this key by bumping generation.
const gen = (this.generation.get(key) ?? 0) + 1;
this.generation.set(key, gen);
}
startSweeper(): void {
if (this.sweepHandle !== null) return;
this.sweepHandle = setInterval(() => this.sweep(), this.sweepIntervalMs);
}
stopSweeper(): void {
if (this.sweepHandle !== null) {
clearInterval(this.sweepHandle);
this.sweepHandle = null;
}
}
sweep(): void {
const now = this.clock.now();
while (this.heap.size() > 0) {
const top = this.heap.peek()!;
if (top.expiresAt > now) return;
this.heap.pop();
// Skip stale entries (key was updated or deleted).
const currentGen = this.generation.get(top.key);
if (currentGen !== top.generation) continue;
this.sink.deleteExpired(top.key);
}
}
}
```

The cache uses `TTLManager` as an internal helper: on `put` with a TTL, call `track`; on manual delete or eviction, call `untrack`. The `sweep()` path expires entries proactively; the `isExpired()` check in `get`/`has` is the lazy path. Both are wired through the same `deleteExpired` sink so the metrics and listener notification paths are unified.

Generation counters matter. Without them, overwriting a key with a new TTL leaves a dangling heap entry for the old expiry; when it fires, it would wrongly expire the current (un-expired) value. Generation bumps make those entries harmless ghosts: sweep sees the mismatch and drops them.
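To make the wiring concrete, here is one possible composition of the core `Cache` with a `TTLManager`. The `ExpiringCache` name and pass-through methods are illustrative assumptions; note in particular that delegating `deleteExpired` to `delete()` reports reason "manual" rather than "ttl", so a production version would route it through a dedicated internal path:

```ts
// Sketch: a wrapper that owns both the core cache and a TTLManager,
// and acts as the manager's expiry sink.
class ExpiringCache<K, V> implements IExpirySink<K> {
  private readonly ttl: TTLManager<K>;

  constructor(
    private readonly inner: Cache<K, V>,
    private readonly clock: IClock = new SystemClock(),
  ) {
    this.ttl = new TTLManager<K>(clock, this);
    this.ttl.startSweeper();
  }

  put(key: K, value: V, ttlMs?: number): void {
    this.inner.put(key, value, ttlMs);
    if (ttlMs !== undefined) {
      this.ttl.track(key, this.clock.now() + ttlMs); // active path
    } else {
      this.ttl.untrack(key); // an overwrite may have dropped an old TTL
    }
  }

  get(key: K): V | undefined {
    return this.inner.get(key); // lazy path lives inside Cache.get
  }

  delete(key: K): boolean {
    this.ttl.untrack(key); // invalidate any pending heap entry
    return this.inner.delete(key);
  }

  // IExpirySink: the sweeper calls this for keys whose TTL has passed.
  deleteExpired(key: K): void {
    this.inner.delete(key);
  }
}
```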
SingleFlight: coalescing loader

```ts
class SingleFlight<K, V> {
private readonly inFlight = new Map<K, Promise<V>>();
async do(key: K, fn: () => Promise<V>): Promise<V> {
const existing = this.inFlight.get(key);
if (existing) return existing;
const p = (async () => {
try {
return await fn();
} finally {
this.inFlight.delete(key);
}
})();
this.inFlight.set(key, p);
return p;
}
}
```

If the loader throws, all waiters get the rejection. That's usually what you want: if the DB is down, propagate the failure fast. If you want "one retries, the others fail fast," layer a retry wrapper on top.
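Such a wrapper is just another decorator over the loader function. A minimal sketch; the `withRetry` name, attempt count, and backoff schedule are illustrative assumptions:

```ts
// Retry the loader a few times with exponential backoff before letting
// the (shared) rejection propagate to all coalesced waiters.
function withRetry<V>(
  loader: () => Promise<V>,
  attempts = 3,
  baseDelayMs = 50,
): () => Promise<V> {
  return async () => {
    let lastErr: unknown;
    for (let i = 0; i < attempts; i++) {
      try {
        return await loader();
      } catch (err) {
        lastErr = err;
        if (i < attempts - 1) {
          // Backoff: 50 ms, 100 ms, 200 ms, ...
          await new Promise((r) => setTimeout(r, baseDelayMs * 2 ** i));
        }
      }
    }
    throw lastErr;
  };
}

// Usage: cache.getOrLoad(key, withRetry(() => loadFromDb(key)));
```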
Design Decisions & Tradeoffs

Eviction as Strategy vs fork-per-policy
A naive approach forks the whole cache per policy: LRUCache, LFUCache, FIFOCache, each with its own get/put. That triples the code and forces every bug fix to be made three times. Worse, every new policy is a copy-paste of the cache body.
Strategy inverts the dependency. The cache is one class; a policy is one class; they meet at IEvictionPolicy. A new policy (say, ARC, the Adaptive Replacement Cache) ships as a single class in isolation. Metrics, TTL, observer hooks, stampede protection: all written once.
Cost: one indirection per hot operation. Negligible.
Linked list + map for LRU vs abusing Map insertion order
JavaScript's Map iterates in insertion order, and you can simulate LRU by delete-then-set on access. It's tempting and short.
Three reasons to use the DLL anyway:
- Semantic clarity: the DLL is the ordering. `Map` insertion order is an implementation detail across languages; the moment your interviewer asks "how would this work in Java?", the `LinkedHashMap` answer maps cleanly to the DLL story, and the "delete-and-reinsert" answer doesn't.
- Interview signal: writing the DLL shows you understand why O(1) LRU needs both a hash for lookup and a linked structure for ordering. The `Map` trick hides that understanding.
- Policy decoupling: LFU and ARC also need DLL-like structures, and once you've written the primitive you can share it. Relying on `Map` order means every policy improvises its own storage.
For a throwaway cache, the Map trick is fine. For the design interview and for a library others will extend, the DLL earns its keep.
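For reference, the `Map`-order trick looks like the sketch below; fine for a throwaway, with the caveats above. The `TinyLRU` name is illustrative:

```ts
// LRU via Map insertion order: delete-and-reinsert on access moves the
// key to the back, so the first key in iteration order is the LRU victim.
class TinyLRU<K, V> {
  private readonly m = new Map<K, V>();

  constructor(private readonly capacity: number) {}

  get(key: K): V | undefined {
    if (!this.m.has(key)) return undefined;
    const v = this.m.get(key)!;
    this.m.delete(key); // re-insert to mark as most-recently-used
    this.m.set(key, v);
    return v;
  }

  put(key: K, value: V): void {
    if (this.m.has(key)) this.m.delete(key);
    this.m.set(key, value);
    if (this.m.size > this.capacity) {
      // Oldest insertion = least recently used under this scheme.
      const lru = this.m.keys().next().value as K;
      this.m.delete(lru);
    }
  }
}
```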
LFU O(1) vs naive min-heap
A min-heap keyed on frequency gives O(log n) eviction and O(log n) updates (on access, you'd have to reheapify). The O(1) algorithm above is strictly better for throughput. It's also the "I've read the literature" answer: the O(1) LFU paper (Shah et al., 2010) is one of the handful of papers an interviewer is genuinely hoping you've seen.
The cost is complexity: two maps and a linked list of buckets. If the interview is time-boxed, sketch the algorithm and implement LRU fully; that's the right allocation of minutes.
TTL: lazy vs active vs hybrid
| Strategy | Wins | Loses |
|---|---|---|
| Lazy only (check on access) | Zero background work. Simple. | Cold expired entries linger forever, occupying capacity. |
| Active only (periodic sweep) | Keeps capacity accurate. | Wastes wakeups on idle caches. Misses expiries between sweeps β still need lazy. |
| Hybrid | Bounds staleness (active) and corrects on access (lazy). Expires entries at most sweepInterval + epsilon after TTL. | Two code paths to keep consistent. |
Hybrid is the production answer. The generation counter pattern is the detail that makes it robust under updates.
Capacity: count-based vs byte-based

Count is trivially correct: storage.size >= capacity. Bytes require a Weigher<V> function and careful accounting on update (subtract old, add new). In a managed runtime (JS, Java, Go), "bytes" is a lie: object size is implementation-defined, strings have overhead, references are free-ish. You end up with weigher: (k, v) => v.length and pretending characters are bytes.
Production caches (Caffeine, Guava) support both. The senior answer is: default to count-based, add byte-based via a Weigher when the values are highly variable (e.g. an image cache where a thumbnail and a 4 MB JPEG shouldn't count equally).
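A sketch of the byte-ish variant, assuming a weigher supplied at construction; the `WeightedCache` name and `maxWeight` parameter are illustrative. Note that eviction becomes a loop rather than a single call:

```ts
type Weigher<K, V> = (key: K, value: V) => number;

// Sketch of the put-side accounting only; get/delete would mirror the
// core Cache and keep `weights` in sync.
class WeightedCache<K, V> {
  private totalWeight = 0;
  private readonly weights = new Map<K, number>();
  private readonly storage = new Map<K, V>();

  constructor(
    private readonly maxWeight: number,
    private readonly weigher: Weigher<K, V>,
    private readonly policy: IEvictionPolicy<K> = new LRUPolicy<K>(),
  ) {}

  put(key: K, value: V): void {
    const w = this.weigher(key, value);
    if (w > this.maxWeight) {
      // Oversized entry: reject rather than evict everything to fit.
      throw new RangeError("entry exceeds total capacity");
    }
    // On update, retire the old entry first so the eviction loop can't
    // pick it as a victim and double-count its weight.
    if (this.weights.has(key)) {
      this.totalWeight -= this.weights.get(key)!;
      this.weights.delete(key);
      this.storage.delete(key);
      this.policy.onDelete(key);
    }
    // Eviction is a loop: evict until the new entry fits.
    while (this.totalWeight + w > this.maxWeight && this.storage.size > 0) {
      const victim = this.policy.evict();
      this.totalWeight -= this.weights.get(victim) ?? 0;
      this.weights.delete(victim);
      this.storage.delete(victim);
    }
    this.storage.set(key, value);
    this.weights.set(key, w);
    this.totalWeight += w;
    this.policy.onPut(key);
  }
}
```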
One giant lock vs striped vs RWLock

Covered in the Concurrency section below. Short version: striped locks (16 shards) are the default; RWLock is almost always wrong for LRU because every read mutates the DLL.
Patterns Used

- Strategy: `IEvictionPolicy` is the central pattern. Each policy is a plug-in.
- Decorator: `TTLDecorator` wraps any policy and adds expiry. A `MetricsCache` (a subclass or wrapper over `Cache`) can add more elaborate telemetry without changing the core. Eviction listeners are a decorator-ish pattern around the storage mutation path.
- Observer: `EvictionListener` and its `onEvict` notification. Listeners can publish metrics, emit cloud events, or invalidate downstream caches.
- Builder: a `CacheBuilder<K,V>().capacity(100).policy(new LRUPolicy()).clock(testClock).withStampedeProtection().build()` makes tests readable and lets new options land without breaking constructor signatures (see the sketch after this list).
- Template Method: if many policies shared "maintain an ordering structure, notify on change," a `BaseOrderedPolicy` class with a `protected abstract rankOf(key)` hook would factor out the boilerplate. In practice LRU/FIFO/LFU are different enough that we lean on the interface instead; with more policies, Template Method starts paying.
- Singleton-ish scope for the `MetricsEmitter`: one per cache instance (not a global), so each cache has its own counters.
- Null Object: a `NoopPolicy` that never evicts is useful for tests and for an unbounded cache mode.
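A minimal sketch of the builder over the constructor shown earlier. The `withStampedeProtection()` option is omitted here because the current `Cache` always constructs a `SingleFlight`; everything else maps one-to-one:

```ts
// Fluent builder: defaults are explicit, and new options can land
// without breaking the Cache constructor signature.
class CacheBuilder<K, V> {
  private cap = 100;
  private pol: IEvictionPolicy<K> = new LRUPolicy<K>();
  private clk: IClock = new SystemClock();

  capacity(n: number): this { this.cap = n; return this; }
  policy(p: IEvictionPolicy<K>): this { this.pol = p; return this; }
  clock(c: IClock): this { this.clk = c; return this; }

  build(): Cache<K, V> {
    return new Cache<K, V>(this.cap, this.pol, this.clk);
  }
}

// Usage:
const cache = new CacheBuilder<string, number>()
  .capacity(100)
  .policy(new LFUPolicy<string>())
  .build();
```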
Concurrency Considerations
Concurrency is where the senior/junior gap yawns widest.
Option A: one coarse lock. Wrap every public method in synchronized / a Mutex. Correct, trivial, and a throughput disaster: every get serialises behind every put. In a read-heavy workload, you've turned a cache into a global mutex.
Option B: RWLock. Readers share, writers are exclusive. Tempting for read-heavy. But LRU's onAccess mutates the DLL on every read, so a "read" on the cache is a "write" on the policy, and the RWLock degenerates. It only works for policies where reads are truly read-only (FIFO, which ignores reads, or an LRU variant that samples approximate recency like SIEVE or S3-FIFO).
Option C: striped locks. Partition the keyspace into N shards (commonly 16 or 32). Each shard has its own Map, its own IEvictionPolicy instance, and its own lock. A get(k) picks the shard by hash(k) % N and only contends with other ops on the same shard. With a good hash, contention is ~1/N of the single-lock case.
The catch: capacity is per-shard now, not global. If you hash the key and 20% of your keys land in shard 0, shard 0 will evict more aggressively than a global-capacity scheme would. In practice this is acceptable (a well-distributed hash keeps shards roughly balanced), but you should call it out.
Option D: lock-free. ConcurrentHashMap for storage; per-key CAS loops on the policy state. Very hard to get right, especially for LRU's DLL (see lock-free designs like MemC3, by Fan et al., and CLOCK-Pro variants). Not an interview answer unless asked. Metrics counters, on the other hand, should be lock-free (atomic counters).
Senior take. Striped locks are the default. RWLock is a trap for any policy that writes on read (which is LRU, LFU, and most interesting policies). Counters should be lock-free; the rest can sit under the shard lock. For read-heavy workloads with tiny values, consider approximate LRU variants (sampling, CLOCK) to avoid the DLL-write-on-read problem altogether; that's the frontier answer.
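In TypeScript the locks are conceptual (a single-threaded event loop), but the striping structure still looks like this. A sketch; the `StripedCache` name and the toy FNV-1a hash are illustrative:

```ts
// Striped cache: each shard is an independent Cache with its own policy
// instance. Capacity is per-shard, as discussed above.
class StripedCache<V> {
  private readonly shards: Cache<string, V>[];

  constructor(totalCapacity: number, shardCount = 16) {
    const perShard = Math.max(1, Math.ceil(totalCapacity / shardCount));
    this.shards = Array.from(
      { length: shardCount },
      () => new Cache<string, V>(perShard, new LRUPolicy<string>()),
    );
  }

  private shardFor(key: string): Cache<string, V> {
    // Toy FNV-1a; any well-distributed hash works.
    let h = 2166136261;
    for (let i = 0; i < key.length; i++) {
      h ^= key.charCodeAt(i);
      h = Math.imul(h, 16777619);
    }
    return this.shards[(h >>> 0) % this.shards.length];
  }

  get(key: string): V | undefined {
    return this.shardFor(key).get(key);
  }

  put(key: string, value: V, ttlMs?: number): void {
    this.shardFor(key).put(key, value, ttlMs);
  }
}
```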
Scale & Extensibility

- Tiered cache (L1 in-proc, L2 Redis). Compose: `TieredCache(l1: Cache, l2: RemoteCache)`. `get` checks L1, then L2, populating L1 on an L2 hit. Writes can be write-through (synchronous L1 + L2) or write-behind (L1 now, L2 queued). Tradeoff: write-behind is faster but can lose writes on crash; write-through is safer but slower. (See the sketch after this list.)
- Refresh-ahead. Instead of letting the TTL expire and taking a miss, refresh the entry N ms before expiry. Implement as a TTL decorator variant that queues a background reload for entries nearing expiry on access. Avoids the "everyone stampedes at T+TTL" cliff.
- Cache warming on startup. Accept a `Supplier<Iterable<[K, V]>>` at construction; pre-populate on `start()`. Combined with refresh-ahead, this gives you a cache that's warm from the first request.
- Per-entry callbacks. Already implemented via `EvictionListener`; could extend to per-key listeners (fire on eviction of specifically this key).
- Size-based with Weighers. Add `weigher: (k: K, v: V) => number` to the builder, track total weight, evict until `totalWeight + newWeight <= maxWeight`. Eviction becomes a loop, not a single call.
- New policy = new class. ARC, SIEVE, W-TinyLFU (what Caffeine uses), CLOCK-Pro: each fits the `IEvictionPolicy` interface. W-TinyLFU is notable: it combines LRU recency with LFU frequency via a probabilistic admission filter (a count-min sketch). State of the art for general-purpose caches.
- Stampede protection. Already covered via `SingleFlight`. Extends naturally to negative caching (remember "this key doesn't exist" for a short TTL so we don't stampede on definitively-missing keys).
- Async refresh / load. `getOrLoad` returns `Promise<V>`. Trivially extended to an `AsyncLoadingCache` variant that rejects when the loader fails rather than caching failures.
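A sketch of the tiered composition from the first bullet. The `RemoteCache` interface is an assumption standing in for an async client (Redis, Memcached); it is not part of the core design above:

```ts
// Assumed L2 interface: remote caches are inherently async.
interface RemoteCache<K, V> {
  get(key: K): Promise<V | undefined>;
  put(key: K, value: V, ttlMs?: number): Promise<void>;
}

// Read-through L1/L2 with write-through writes.
class TieredCache<K, V> {
  constructor(
    private readonly l1: Cache<K, V>,
    private readonly l2: RemoteCache<K, V>,
  ) {}

  async get(key: K): Promise<V | undefined> {
    const local = this.l1.get(key);
    if (local !== undefined) return local;
    const remote = await this.l2.get(key);
    // Populate L1 on an L2 hit so the next read is local.
    if (remote !== undefined) this.l1.put(key, remote);
    return remote;
  }

  async put(key: K, value: V, ttlMs?: number): Promise<void> {
    this.l1.put(key, value, ttlMs);
    // Write-through: don't return until L2 has the write.
    // Write-behind would enqueue this call and return immediately.
    await this.l2.put(key, value, ttlMs);
  }
}
```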
Edge Cases

- Zero or negative capacity: constructor throws. A cache that can't hold anything is a bug, not a feature.
- `put` on an existing key: update the entry, call `policy.onPut` (policies treat updates as re-puts, with LRU / LFU moving them accordingly; FIFO explicitly doesn't re-queue). No eviction on an update; size is unchanged.
- Concurrent expire + get: the `get` path's lazy check and the sweeper's active path both delete. If both fire on the same key, under the shard lock only one wins; the loser finds the entry gone and reports a miss. Counters: the one who deletes increments `expiries`; the one who misses also increments `misses`. A slight double-count of misses is acceptable; correct ordering under the lock prevents a double-delete from throwing.
- Eviction during iteration: an interviewer may ask for `entries()` or `keys()`. Contract choice: weakly consistent iteration (reflects state at the start of iteration; concurrent changes may or may not appear). Same contract as `ConcurrentHashMap`: never throws, no total-ordering guarantee.
- `ttlMs === 0`: immediate expiry. Valid; the entry is written and then any subsequent `get` misses. Slightly weird but honest.
- `ttlMs < 0`: reject. A negative TTL is a programming error.
- Very large single value: with count-based capacity, one 1 GB entry fills one slot. With byte-based capacity and a Weigher, if one entry exceeds the total capacity, reject it or evict everything to make room? We choose reject with an explicit error; silently evicting everything to fit an oversized entry is surprising.
- Clock going backward: NTP adjustments, VM pauses. The TTL check `now >= expiresAt` is stable under small backward jumps; a large jump can delay expiry. For hardened code, use a monotonic clock (`process.hrtime.bigint()` in Node, `System.nanoTime()` in Java) for TTL and only use wall-clock time for user-facing timestamps.
- Listener throws: caught and swallowed (logged in real code). A buggy listener must not break the cache.
- `clear()` under concurrent access: either hold all shard locks, or acknowledge weak consistency ("clears what was there at the moment you called; concurrent puts may survive").
- Key type with bad `hashCode`/`equals`: all bets are off. Document that K must have stable equality. In TypeScript, reference equality for objects means you'd better not use `{id: 1}` as a key; use a primitive.
- `undefined` as a value: our `get` returns `undefined` on miss, so storing `undefined` would conflate hit-with-undefined and miss. Fix: return a `{ hit: boolean, value: V }` wrapper from an internal method and collapse to `V | undefined` only at the public API (see the sketch after this list). Or forbid `undefined` values (document it).
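A minimal sketch of that internal wrapper; the `Lookup` type and `getEntry` name are illustrative:

```ts
// Discriminated union: distinguishes a miss from a hit whose value is
// legitimately undefined.
type Lookup<V> = { hit: true; value: V } | { hit: false };

function getEntry<K, V>(storage: Map<K, V>, key: K): Lookup<V> {
  return storage.has(key)
    ? { hit: true, value: storage.get(key) as V }
    : { hit: false };
}

// The public API collapses to V | undefined, as before.
function get<K, V>(storage: Map<K, V>, key: K): V | undefined {
  const r = getEntry(storage, key);
  return r.hit ? r.value : undefined;
}
```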
Follow-up Questions

- How would LRU-K or LFU-DA differ? LRU-K tracks the K-th-most-recent access, not the most recent, so a key hit once and never again falls out faster. LFU-DA (Dynamic Aging) ages frequency counts so old-but-once-popular keys can't squat. Both fit the `IEvictionPolicy` interface; LFU-DA needs a periodic aging pass.
- How to evolve to a 2-tier cache? `TieredCache(l1, l2)` composes two `ICache` instances. L1 is in-process (our implementation); L2 is a `RemoteCache` wrapper over Redis. Reads cascade; writes are either write-through (strong consistency) or write-behind (higher throughput, risk of loss).
- Hot-key stampede? `SingleFlight` handles the "concurrent miss" case. For "all TTLs expire at once" (think: caching user sessions that all got created at 9 AM), add jitter to TTLs: randomise ±10% on `put`. Saves the backing store from a synchronised thundering herd.
- How to observe cache efficacy? Hit rate (`hits / (hits + misses)`), eviction rate (per second), expiry rate, average value lifetime, p50/p99 of `getOrLoad` latency. A 99% hit rate on a 10k-QPS cache saves 9,900 backend calls per second; a 90% hit rate saves 9,000. The difference is whether you need one replica or ten.
- How to invalidate keys by pattern? The cache has no natural secondary index, so `invalidate("user:*")` requires either (a) a linear scan (document it as an O(n) admin operation), (b) a tag-based index: `put(key, value, tags)` maintains a `Map<tag, Set<key>>`, or (c) versioning: each tag has a generation; reads compare generations and treat mismatches as misses.
- How to warm the cache on startup? The constructor accepts `warmer: () => Iterable<[K, V, ttl?]>`; on start, batch-put the entries under a single lock-acquisition cycle. For large warm sets, stream and chunk to avoid blocking.
- LRU vs LFU under a Zipfian workload? LFU wins: Zipfian has a small hot set accessed many times and a long tail accessed once. LRU keeps any recently-touched key for a full LRU cycle, polluting capacity with one-shot keys. LFU promotes the hot set and evicts the long tail fast. W-TinyLFU does even better by admitting new keys only if their sketched frequency exceeds the current victim's.
- Write-through vs write-behind for a 2-tier cache? Write-through: the call doesn't return until L2 is updated. Consistent, slower. Write-behind: L1 is updated synchronously, L2 asynchronously (via a queue). Faster, at the risk of data loss on a crash before the queue drains. Choose per requirement: session cache, write-behind; inventory count, write-through.
- How would you implement negative caching? On a loader miss (the loader returns `undefined` or throws a "not found"), cache a sentinel for a short TTL (say 30s). Subsequent requests return the sentinel fast and don't re-query. Tradeoff: a key that just got created won't appear to callers for up to 30s.
- How do you test concurrency correctness? Property tests with a model (a simple sequential cache), a random op generator, and a stress harness that runs ops concurrently on the real cache and a serial model, then compares externally-observable state (size, hit/miss counts, keys present). Tools: `fast-check` in JS, `jqwik` / JUnit's concurrency tools in Java. (A minimal sketch follows this list.)
- What breaks if the clock is wrong on one shard in a distributed cache? TTLs drift. Entries expire early on fast-clock nodes, late on slow-clock nodes. If the cache fronts a correctness-critical system (e.g. auth tokens), use monotonic time for intra-cache deadlines and rely on the backing source's wall-clock for cross-node consistency.
- How would you implement a size-based cache with a Weigher? Track `totalWeight`; on put, compute `newWeight = weigher(k, v)`; while `totalWeight + newWeight > maxWeight`, evict. Rejection rule: if `newWeight > maxWeight`, either reject or (if you chose "evict to fit") evict everything. Document the choice. (See the `WeightedCache` sketch in the capacity section above.)
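The model-comparison idea from the testing question, without a property-testing library: a deterministic random op generator drives both the real cache and a trivially-correct model, and an invariant is checked at every step. The `modelCheck` harness below is an illustrative sketch:

```ts
// The model is an unbounded Map: every put goes to both structures, so
// any cache *hit* must agree with the model (the cache may hold fewer
// keys than the model because of evictions).
function modelCheck(seed: number, ops = 1000): void {
  const cache = new Cache<number, number>(8, new LRUPolicy<number>());
  const model = new Map<number, number>();
  let rng = seed;
  const next = () => (rng = (rng * 1103515245 + 12345) & 0x7fffffff);

  for (let i = 0; i < ops; i++) {
    const key = next() % 16; // small keyspace forces evictions
    if (next() % 2 === 0) {
      cache.put(key, i);
      model.set(key, i);
    } else {
      const got = cache.get(key);
      if (got !== undefined && got !== model.get(key)) {
        throw new Error(`divergence at op ${i}, key ${key}`);
      }
    }
  }
}
```

`fast-check` replaces the hand-rolled RNG with shrinkable generators; the invariant stays the same.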
SDE2 vs SDE3: How the Bar Rises
| Dimension | SDE2 (expected) | SDE3 (expected) |
|---|---|---|
| Policy pluggability | Writes LRU directly in the cache. May mention "you could abstract eviction." | Designs IEvictionPolicy upfront. Implements LRU first, sketches LFU/FIFO/TTL as plug-ins. Explains why Strategy beats fork-per-policy. |
| LFU awareness | Mentions LFU exists; would implement as min-heap if asked (O(log n)). | Knows the O(1) two-map algorithm. Can at least sketch the frequency DLL. Mentions W-TinyLFU as the production-grade evolution. |
| Lock granularity | "Wrap everything in a mutex." May mention RWLock. | Striped locks as default. Correctly calls out that RWLock fails for LRU because reads write to the DLL. Mentions monotonic clock for TTL. |
| Stampede protection | Not mentioned, or mentions it only when prompted. | Volunteers SingleFlight (singleflight / request coalescing) as part of the core design. Discusses TTL jitter for synchronised expiries. |
| Observability | "We'd add some counters." | Hits, misses, evictions, expiries, latency, stale rate. Counters are lock-free. Decorator pattern to layer richer metrics without polluting the core. |
| TTL strategy | Lazy only, or active only. | Hybrid lazy + active, with generation counters to invalidate stale heap entries after updates. |
| Tiered / extensibility | "We could add Redis in front." | Cleanly composes L1/L2 via an ICache-of-ICache. Discusses write-through vs write-behind, refresh-ahead, cache warming. |
| Edge cases | Empty / full / basic TTL. | Negative TTL, undefined values, clock going backward, listener throwing, eviction during iteration, concurrent expire + get races. |
| Testing | Unit tests per method. | Property tests against a sequential model. Concurrency stress harness. Explicit invariants (policy.size() === storage.size). |
| Scope boundary | "We can add distributed later." | Calls out that distributed cache invalidation is a different problem (CRDT-esque ownership, or versioning). Doesn't conflate the two. |
The difference between SDE2 and SDE3 on this problem is almost entirely about what you mention unprompted. An SDE3 says "singleflight" and "striped locks" and "monotonic clock" before the interviewer has to ask. An SDE2 produces the same answers, beautifully, but only after being nudged toward them. The signal is default posture: do you reach for production concerns without being prompted, or only when reminded?