22. Problem: Cab Booking System
Understanding the Problem
A cab booking platform (Uber, Lyft, Ola) sits at the intersection of four distinct sub-problems: state machines (a ride passes through half a dozen discrete lifecycle steps, and so does every driver and every rider), matching (nearest available driver, filtered by preferences, under a hard latency budget), pricing (base + distance + time + surge + tips, with quote-locking and surge-zone edge cases), and location (a geospatial index that ingests driver pings continuously and answers nearest-neighbour queries in milliseconds). Each piece is a textbook LLD question on its own; together they form one of the archetypal senior system designs.
What this interview is really testing
The trap is diving into matching algorithms before the state machine is drawn. Candidates who start at "I'd use a quadtree" without first nailing down when a ride transitions from REQUESTED to MATCHED, who owns the transition, and what happens on a race end up with a beautiful matcher attached to a ride object that can be mutated from three places at once. The interviewer is watching whether you model the lifecycle first, then layer matching / pricing / location as pluggable strategies on top. Location index and surge algorithm are HLD-ish: acknowledge them, pick a modern default (H3, simple surge formula), and keep moving. The LLD meat is the state machines, the matching race resolution, and the strategy surfaces.
Clarifying Questions
You: What cab types do we need? Pool, Prime / UberX, XL, Auto / Moto, Premier?
Interviewer: Start with three: POOL, STANDARD (UberX equivalent), XL. Design should allow adding a fourth (Electric, Premier) as config, not code.
You: Scheduled rides (booked for a future time): in scope?
Interviewer: Out of scope for the core design. Mention how you'd extend to it.
You: What drives surge pricing? Supply/demand ratio per zone, weather, events? Is the algorithm in scope?
Interviewer: The algorithm is out of scope; that's an ML problem. Design the interface so a surge service can publish multipliers per zone, and the pricing engine consumes them.
You: Cancellation policy: who pays, when does a fee kick in, does the driver get compensated?
Interviewer: Rider cancels after driver accepts and is en route: fee after a 2-minute grace window. Driver cancels: no fee to rider, driver is penalized in a rating system. Rider cancels before match: free.
You: Rating system: is it just a number stored on the ride, or is it a full sub-system?
Interviewer: Rider rates driver, driver rates rider, 1 to 5 stars. Stored on the ride. Rolling average on each user. Don't design the review moderation or fraud detection.
You: Driver acceptance timeout: how long does a driver have to accept before we re-offer?
Interviewer: 15 seconds. If the driver doesn't tap accept, we re-broadcast to the next candidate.
You: Multi-stop rides? Pickup → Stop 1 → Stop 2 → Drop?
Interviewer: Extensibility. The Ride should be able to carry multiple waypoints, but the core flow assumes single source + single dest.
You: Corporate / B2B accounts where the company pays, not the rider?
Interviewer: Extensibility. Show me where it plugs into the payment flow.
You: Pickup radius: do we match drivers within 2 km, 5 km, or does it depend on density?
Interviewer: Configurable per city. Default 3 km, expand to 5 km if no driver found, fail after 5 km.
You: Single-region or multi-region? Is geosharding relevant to the LLD?
Interviewer: LLD-level is single-city. Mention geosharding in the scale section.
Final Requirements
- Functional: Rider requests a ride with source, destination, cab type. Matcher finds the nearest available driver respecting preferences. Driver accepts / declines (15 s timeout). Ride transitions through REQUESTED → MATCHED → ACCEPTED → ARRIVING → PICKED_UP → IN_RIDE → COMPLETED, with CANCELLED reachable from early states. Fare computed on completion. Location updates ingested continuously.
- Non-functional: P99 matching latency < 1 s, correctness under concurrent match races, atomic ride state transitions, fairness to drivers (round-robin within a tie), graceful degradation (a no-match is better than a wrong-match).
- Out of scope for core: Scheduled rides, multi-stop, corporate accounts, multi-region, surge ML, fraud detection. Discussed as extensions.
Functional Requirements
- requestRide(riderId, source, dest, cabType) returns a Ride in REQUESTED state and kicks off matching.
- Matcher picks the nearest available driver of the right cab type within the pickup radius.
- Driver receives an offer (MATCHED state) and has 15 s to accept. On accept, the ride enters ACCEPTED and the driver's state becomes MATCHED. On timeout or decline, the matcher picks the next candidate.
- Driver en route: ride is in ARRIVING. Driver taps "arrived" (no state change yet); rider taps "I'm in the car" or driver taps "start ride" → PICKED_UP → IN_RIDE.
- Driver taps "end ride" → COMPLETED. Fare is computed, payment captured.
- Rider or driver can cancel from any pre-ride state; policy determines the fee.
- Location updates: drivers publish GPS pings at ~4 s intervals while AVAILABLE or IN_RIDE. Index updated in near real time.
- Surge multiplier per zone is consumed from a surge service; quote locked at request time.
- Payment captured on COMPLETED via a pluggable payment gateway.
- Rating recorded on both sides after COMPLETED.
Non-Functional Requirements
| Requirement | Target | Why |
|---|---|---|
| P99 matching latency | < 1 s from request to driver offer | UX: rider sees the map spin; > 1 s feels broken. |
| Matching correctness | A driver is never offered two rides simultaneously | Duplicated offer = driver confusion, ride racing, eventually lost trust. |
| Ride state atomicity | Every transition is all-or-nothing; a CANCELLED ride never accidentally becomes IN_RIDE | Financial and legal consequences. |
| Location index freshness | Driver ping → index visible < 2 s | Stale index = matching a driver who just went offline or drove 3 km away. |
| Fairness to drivers | Within a tie (equally near, same cab type), round-robin, not LIFO | Drivers who've waited longer get the next ride; prevents the "door-rusher" problem. |
| Surge quote lock | The multiplier shown at request time is honored at COMPLETED even if surge rose mid-ride | Trust. "You said $12 now you're charging $18" is a class-action incident. |
| Partial outage tolerance | If payments are down, ride still completes; payment retried async | Core flow must not block on non-critical dependencies. |
Core Entities and Relationships
| Entity | Responsibility |
|---|---|
| Rider | Identity, payment methods, rating, current active ride (if any). |
| Driver | Identity, Vehicle, home city, rating, current state (OFFLINE / AVAILABLE / MATCHED / IN_RIDE), last known Location. |
| Vehicle | Plate, model, CabType (POOL / STANDARD / XL), capacity. |
| Ride | The transaction. Source, dest, RideState, riderId, driverId (nullable pre-match), Fare (nullable pre-complete), timestamps per transition, cancellation reason. |
| Location | { lat, lng, timestamp }. Immutable value object. |
| IMatchingStrategy | Pluggable matcher. Given a Ride, returns the next Driver to offer. |
| IPricingStrategy | Pluggable pricing. Given a Ride + SurgeMultiplier, returns a Fare. |
| ISurgePolicy | Pluggable surge. Given a zone + time, returns a multiplier ≥ 1.0. |
| ILocationIndex | Geospatial index over driver locations. queryNearest(point, radius, filter) returns a ranked list. |
| IPaymentGateway | Pluggable payment. charge(rider, amount) → success / failure / retry-later. |
| Payment | Record of the charge: amount, gateway txn id, status. One per Ride. |
| RideService | Facade. Owns the request → match → state-transition orchestration. |
| DispatchWorker | Per-zone worker that serializes match offers to prevent double-offering. |
Why separate Ride from RideService? Ride is a state machine on a value: the thing being mutated. RideService orchestrates who can mutate it and in what order. Mixing the two makes testing hell: you can't unit-test state transitions without also faking the matcher, the payment gateway, and the clock.
Why is Driver state separate from Ride state? One driver has one state at a time; one ride has one state at a time; they're not the same state. A driver in MATCHED has exactly one Ride in ACCEPTED/ARRIVING. A driver in IN_RIDE has one Ride in IN_RIDE. Keeping them separate lets us reason about each FSM independently.
Interfaces
interface IMatchingStrategy {
// Returns the next driver to offer, or null if no candidate remains.
// Implementations: NearestAvailable, NearestWithPreferences, BroadcastToN.
nextCandidate(ride: Ride, excluded: Set<DriverId>): Driver | null;
}
interface IPricingStrategy {
quote(source: Location, dest: Location, cabType: CabType, surge: number): Fare;
finalize(ride: Ride, actualDistance: number, actualDuration: number): Fare;
}
interface ISurgePolicy {
  multiplierFor(zone: ZoneId, at: Date): number; // >= 1.0
}
interface ILocationIndex {
upsert(driverId: DriverId, location: Location, cabType: CabType): void;
remove(driverId: DriverId): void;
queryNearest(
point: Location,
radiusKm: number,
filter: (d: DriverId) => boolean,
): Array<{ driverId: DriverId; distanceKm: number }>;
}
interface IPaymentGateway {
charge(riderId: RiderId, amount: Money): Promise<PaymentResult>;
}
type PaymentResult =
| { status: "SUCCESS"; txnId: string }
| { status: "FAILED"; reason: string }
  | { status: "RETRY_LATER" };

Each interface is a seam. A new matcher plugs in without touching ride code; a new payment rail plugs in without touching matching. ILocationIndex abstracts over H3 vs geohash vs quadtree; the matcher doesn't care.
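To make the seam concrete, here is a sketch of a second matcher dropping in behind the same interface. The types are simplified local stand-ins for the chapter's Ride and Driver, and FifoListStrategy is a made-up illustration, not part of the design:

```typescript
// Sketch of the strategy seam: a second matcher behind the same interface.
type DriverId = string;
type Ride = { source: { lat: number; lng: number }; cabType: string };
type Driver = { id: DriverId };

interface IMatchingStrategy {
  nextCandidate(ride: Ride, excluded: Set<DriverId>): Driver | null;
}

// Walks a pre-ranked list in order, skipping excluded drivers. Useful as a
// deterministic test double; ride code is untouched when the matcher changes.
class FifoListStrategy implements IMatchingStrategy {
  constructor(private readonly ranked: DriverId[]) {}
  nextCandidate(_ride: Ride, excluded: Set<DriverId>): Driver | null {
    const id = this.ranked.find((d) => !excluded.has(d));
    return id !== undefined ? { id } : null;
  }
}

const matcher: IMatchingStrategy = new FifoListStrategy(["d1", "d2", "d3"]);
const ride: Ride = { source: { lat: 0, lng: 0 }, cabType: "STANDARD" };
const first = matcher.nextCandidate(ride, new Set());
const second = matcher.nextCandidate(ride, new Set(["d1"])); // d1 declined
```

Swapping this for the real nearest-driver strategy changes nothing upstream; that is the whole point of the seam.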
Class Diagram
+----------------+
| RideService |
| (facade) |
+-------+--------+
|
+------------------+------------------+
| | |
v v v
+--------------+ +-----------------+ +-----------------+
| Ride | | DispatchWorker | | Payment |
| (state mach.)| | (per zone) | | (record) |
+------+-------+ +--------+--------+ +--------+--------+
| | |
| v v
| +-----------------+ +-----------------+
| | IMatchingStrat. | | IPaymentGateway |
| +-----------------+ +-----------------+
| |
| v
| +-----------------+
| | ILocationIndex |
| | (H3 / geohash) |
| +-----------------+
v
+--------------+ +-----------------+
| Rider |<----+---->| Driver |
| | | | (state mach.) |
+--------------+ | +--------+--------+
| |
| v
| +-----------------+
| | Vehicle |
| | (CabType) |
| +-----------------+
|
v
+-----------------+ +-----------------+
| IPricingStrat. |<--------| ISurgePolicy |
+-----------------+ +-----------------+

Class Design
Ride: the transaction state machine
requestRide()
|
v
+----------+
| REQUESTED|
+----+-----+
| matcher offers driver
v
+----------+ accept timeout / decline
| MATCHED |--------+
+----+-----+ | (re-enter MATCHED with next driver,
| | or CANCELLED if all candidates exhausted)
| driverAccept |
v |
+----------+ |
| ACCEPTED | |
+----+-----+ |
| driver moves |
v |
+----------+ |
| ARRIVING | |
+----+-----+ |
| driver taps |
| "start" |
v |
+----------+ |
| PICKED_UP| |
+----+-----+ |
| auto |
v |
+----------+ |
| IN_RIDE | |
+----+-----+ |
| driver taps |
| "end" |
v |
+----------+ |
|COMPLETED | |
+----------+ |
v
+-----------+
| CANCELLED | (from REQUESTED | MATCHED | ACCEPTED | ARRIVING)
+-----------+

Why ACCEPTED separate from MATCHED? MATCHED means the offer is on the driver's screen but unacknowledged. ACCEPTED means the driver tapped accept. Between them lies the acceptance-timeout window: a whole protocol that needs its own state to represent cleanly.
Why PICKED_UP distinct from IN_RIDE? The boundary is billing: the meter starts at PICKED_UP (or at IN_RIDE, depending on market rules). Surfacing the distinction lets pricing policies differ per region without touching the FSM.
Driver: the availability state machine
+----------+
| OFFLINE | driver not logged in or on break
+----+-----+
| goOnline()
v
+----------+
|AVAILABLE | visible to matcher, receiving offers
+----+-----+
| offer received
v
+----------+ declines / timeout (back to AVAILABLE)
| MATCHED |-------------------------+
+----+-----+ |
| accepts
v
+----------+
| IN_RIDE | active trip (ARRIVING through IN_RIDE)
+----+-----+
| ride completes / cancels
v
(back to AVAILABLE or OFFLINE)

The driver FSM is simpler than the ride FSM because the driver doesn't care about the fine-grained billing states, only "am I free" vs "am I busy" vs "am I being offered a ride."
Location and the geospatial index
Location is a value object: { lat, lng, timestamp }. The index itself is a one-liner in the LLD; we hand off to a real implementation. Three mainstream choices:
| Index | How it works | Strengths | Weaknesses |
|---|---|---|---|
| Geohash | Base-32 interleaved lat/lng prefix; proximity search = prefix match | Simple; string keys work in any KV store; good for coarse queries | Uneven cell sizes near poles; boundary cells need neighbour lookups |
| Quadtree | Recursive 4-way spatial subdivision | Adaptive density; well-studied | In-memory; rebalancing is a pain under heavy updates |
| H3 (Uber) | Hexagonal hierarchical grid; 16 resolution levels | Uniform neighbour distance; rings of hexagons for radius queries; the modern default | Hex math is less intuitive; library dependency |
For this design we assume H3. Driver pings upsert into H3 cells at resolution 9 (~150 m edge). Radius query = H3 kRing around the pickup point, filtered by cab type and availability. Redis has native GEO commands (geohash-backed) if you want a hot cache. The matcher code doesn't know; it only speaks ILocationIndex.
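To illustrate what any ILocationIndex implementation does under the hood, here is a minimal cell-bucketed index sketch. The naive lat/lng grid and the cell size are stand-ins for H3 (illustrative assumptions, not the H3 API):

```typescript
// Minimal cell-bucketed index sketch. Cells bucket drivers; a radius query
// scans the query point's cell plus its neighbours, kRing-style.
type DriverId = string;
type LatLng = { lat: number; lng: number };

const CELL_DEG = 0.0015; // roughly 150 m of latitude, mimicking H3 res 9

const cellOf = (p: LatLng): string =>
  `${Math.floor(p.lat / CELL_DEG)}:${Math.floor(p.lng / CELL_DEG)}`;

class GridLocationIndex {
  private cells = new Map<string, Map<DriverId, LatLng>>();
  private cellByDriver = new Map<DriverId, string>();

  upsert(id: DriverId, loc: LatLng): void {
    this.remove(id); // a ping moves the driver: drop the old cell entry first
    const cell = cellOf(loc);
    if (!this.cells.has(cell)) this.cells.set(cell, new Map());
    this.cells.get(cell)!.set(id, loc);
    this.cellByDriver.set(id, cell);
  }

  remove(id: DriverId): void {
    const cell = this.cellByDriver.get(id);
    if (cell === undefined) return;
    this.cells.get(cell)?.delete(id);
    this.cellByDriver.delete(id);
  }

  // kRing-style query: scan the point's cell plus its 8 neighbours.
  queryNearest(point: LatLng, filter: (d: DriverId) => boolean) {
    const ci = Math.floor(point.lat / CELL_DEG);
    const cj = Math.floor(point.lng / CELL_DEG);
    const out: Array<{ driverId: DriverId; distDeg: number }> = [];
    for (let di = -1; di <= 1; di++) {
      for (let dj = -1; dj <= 1; dj++) {
        const bucket = this.cells.get(`${ci + di}:${cj + dj}`);
        if (!bucket) continue;
        for (const [id, loc] of bucket) {
          if (!filter(id)) continue;
          out.push({
            driverId: id,
            distDeg: Math.hypot(loc.lat - point.lat, loc.lng - point.lng),
          });
        }
      }
    }
    return out.sort((a, b) => a.distDeg - b.distDeg); // nearest first
  }
}

const idx = new GridLocationIndex();
idx.upsert("near", { lat: 0.0001, lng: 0.0001 });
idx.upsert("far", { lat: 0.5, lng: 0.5 }); // hundreds of cells away
const hits = idx.queryNearest({ lat: 0, lng: 0 }, () => true);
```

Swapping the grid for real H3 means replacing cellOf with a cell-id function and the 3×3 scan with proper hex rings; nothing upstream of ILocationIndex changes.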
Fare and pricing composition
fare = base(cabType)
+ distanceKm * perKmRate(cabType)
+ durationMin * perMinRate(cabType)
+ tip
then multiplied by surge
then + tolls + airport fee
  then - discounts

The quote at request time uses estimated distance and duration. The final fare at COMPLETED uses the actual route. The surge multiplier captured at request time wins, even if surge dropped (the rider keeps a cheap ride they got lucky with) or rose (the rider is protected), as long as the rider accepted the quote within its TTL (usually 60 s).
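A worked instance of the composition above, using the STANDARD rates that appear later in the chapter (base 3.00, 1.20/km, 0.20/min). The tip, toll, and discount figures are made up for illustration, and the tip sits pre-surge only because that is the order the composition lists:

```typescript
// Worked fare: follows the composition above step by step.
const rate = { base: 3.0, perKm: 1.2, perMin: 0.2 };

function fare(
  km: number,
  min: number,
  surge: number,
  tip: number,
  tolls: number,
  discounts: number,
): number {
  const metered = rate.base + rate.perKm * km + rate.perMin * min + tip;
  const amount = metered * surge + tolls - discounts;
  return Math.round(amount * 100) / 100; // round to cents
}

// 10 km, 25 min, 1.5x surge, $2 tip, $1.50 toll, no discount:
// (3 + 12 + 5 + 2) * 1.5 + 1.5 = 34.5
const quoted = fare(10, 25, 1.5, 2, 1.5, 0);
```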
Key Methods
type RiderId = string;
type DriverId = string;
type RideId = string;
type ZoneId = string;
type Money = { amount: number; currency: "USD" | "INR" | "EUR" };
type Location = { lat: number; lng: number; timestamp: number };
type CabType = "POOL" | "STANDARD" | "XL";
type RideState =
| "REQUESTED"
| "MATCHED"
| "ACCEPTED"
| "ARRIVING"
| "PICKED_UP"
| "IN_RIDE"
| "COMPLETED"
| "CANCELLED";
type DriverState = "OFFLINE" | "AVAILABLE" | "MATCHED" | "IN_RIDE";
// Allowed transitions. A map is easier to reason about than nested switch.
const RIDE_TRANSITIONS: Record<RideState, RideState[]> = {
REQUESTED: ["MATCHED", "CANCELLED"],
MATCHED: ["ACCEPTED", "REQUESTED", "CANCELLED"], // back to REQUESTED on decline/timeout
ACCEPTED: ["ARRIVING", "CANCELLED"],
ARRIVING: ["PICKED_UP", "CANCELLED"],
PICKED_UP: ["IN_RIDE"],
IN_RIDE: ["COMPLETED"],
COMPLETED: [],
CANCELLED: [],
};
class Ride {
  readonly id: RideId;
  readonly riderId: RiderId;
  readonly source: Location;
  readonly dest: Location;
  readonly cabType: CabType;
  readonly quotedFare: Fare; // locked at request time
  readonly quotedSurge: number;
  private _state: RideState = "REQUESTED";
  private _driverId: DriverId | null = null;
  private _history: Array<{ from: RideState; to: RideState; at: number }> = [];
  private _finalFare: Fare | null = null;
  private _cancelReason: string | null = null;
  constructor(props: {
    id: RideId; riderId: RiderId; source: Location; dest: Location;
    cabType: CabType; quotedFare: Fare; quotedSurge: number;
  }) {
    this.id = props.id;
    this.riderId = props.riderId;
    this.source = props.source;
    this.dest = props.dest;
    this.cabType = props.cabType;
    this.quotedFare = props.quotedFare;
    this.quotedSurge = props.quotedSurge;
  }
  get state(): RideState { return this._state; }
  get driverId(): DriverId | null { return this._driverId; }
  get finalFare(): Fare | null { return this._finalFare; }
  // Timestamp of the first transition INTO the given state, or null.
  transitionTimestamp(target: RideState): number | null {
    return this._history.find((h) => h.to === target)?.at ?? null;
  }
// Core: validate + transition. Throws on illegal transition.
transitionTo(next: RideState, ctx?: { driverId?: DriverId; reason?: string }): void {
const allowed = RIDE_TRANSITIONS[this._state];
if (!allowed.includes(next)) {
throw new IllegalTransitionError(
`ride ${this.id}: cannot move ${this._state} -> ${next}`,
);
}
this._history.push({ from: this._state, to: next, at: Date.now() });
this._state = next;
if (ctx?.driverId !== undefined) this._driverId = ctx.driverId;
if (next === "CANCELLED") this._cancelReason = ctx?.reason ?? "unspecified";
}
attachFinalFare(fare: Fare): void {
if (this._state !== "COMPLETED") {
throw new Error("cannot attach fare before COMPLETED");
}
this._finalFare = fare;
}
}
const DRIVER_TRANSITIONS: Record<DriverState, DriverState[]> = {
OFFLINE: ["AVAILABLE"],
AVAILABLE: ["OFFLINE", "MATCHED"],
MATCHED: ["AVAILABLE", "IN_RIDE", "OFFLINE"], // AVAILABLE on decline, OFFLINE on lost connection
IN_RIDE: ["AVAILABLE", "OFFLINE"],
};
class Driver {
readonly id: DriverId;
readonly vehicle: Vehicle;
private _state: DriverState = "OFFLINE";
private _location: Location | null = null;
private _offerReservation: { rideId: RideId; reservedAt: number } | null = null;
get state(): DriverState { return this._state; }
get location(): Location | null { return this._location; }
transitionTo(next: DriverState): void {
const allowed = DRIVER_TRANSITIONS[this._state];
if (!allowed.includes(next)) {
throw new IllegalTransitionError(
`driver ${this.id}: cannot move ${this._state} -> ${next}`,
);
}
this._state = next;
}
updateLocation(loc: Location): void {
this._location = loc;
}
}
class RideService {
  constructor(
    private readonly rides: RideRepository,
    private readonly drivers: DriverRepository,
    private readonly matcher: IMatchingStrategy,
    private readonly pricing: IPricingStrategy,
    private readonly surge: ISurgePolicy,
    private readonly locationIndex: ILocationIndex,
    private readonly payments: IPaymentGateway,
    private readonly paymentsRepo: PaymentRepository,      // persists Payment records
    private readonly paymentRetryQueue: PaymentRetryQueue, // async retry for failed charges
    private readonly tripTelemetry: TripTelemetry,         // actual distance/duration per trip
    private readonly clock: Clock,
  ) {}
async requestRide(
riderId: RiderId,
source: Location,
dest: Location,
cabType: CabType,
): Promise<Ride> {
const zone = zoneOf(source);
const multiplier = this.surge.multiplierFor(zone, this.clock.now());
const quote = this.pricing.quote(source, dest, cabType, multiplier);
const ride = new Ride({
id: newId(),
riderId, source, dest, cabType,
quotedFare: quote, quotedSurge: multiplier,
});
await this.rides.save(ride);
// Fire-and-forget the dispatch; the caller gets the ride back immediately.
this.dispatch(ride).catch(err => log.error("dispatch failed", { err, rideId: ride.id }));
return ride;
}
// Walks candidates until one accepts or the list is exhausted.
private async dispatch(ride: Ride): Promise<void> {
const excluded = new Set<DriverId>();
while (true) {
if (ride.state === "CANCELLED") return; // rider bailed
const candidate = this.matcher.nextCandidate(ride, excluded);
if (!candidate) {
ride.transitionTo("CANCELLED", { reason: "no-driver-available" });
await this.rides.save(ride);
return;
}
// Atomically reserve the driver for the offer window.
const reserved = await this.drivers.tryReserveForOffer(candidate.id, ride.id);
if (!reserved) {
// Someone else grabbed this driver between the matcher and here.
excluded.add(candidate.id);
continue;
}
ride.transitionTo("MATCHED", { driverId: candidate.id });
await this.rides.save(ride);
const accepted = await this.waitForAccept(ride.id, candidate.id, 15_000);
if (accepted) {
ride.transitionTo("ACCEPTED", { driverId: candidate.id });
await this.drivers.setState(candidate.id, "MATCHED");
await this.rides.save(ride);
return;
}
// Timeout or decline β release the driver, re-enter REQUESTED, try next.
await this.drivers.releaseReservation(candidate.id, ride.id);
ride.transitionTo("REQUESTED");
excluded.add(candidate.id);
await this.rides.save(ride);
}
}
// First-writer-wins: the first driver to call this for a given ride locks it in.
async driverAccept(rideId: RideId, driverId: DriverId): Promise<boolean> {
const ride = await this.rides.get(rideId);
if (ride.state !== "MATCHED" || ride.driverId !== driverId) {
return false; // offer expired or belongs to another driver
}
// Ack the waitForAccept promise; dispatch() does the actual transition.
return this.drivers.ackOffer(driverId, rideId);
}
async updateRideState(rideId: RideId, next: RideState, actor: Actor): Promise<void> {
const ride = await this.rides.get(rideId);
this.authorize(ride, actor, next); // throws if actor can't trigger this transition
ride.transitionTo(next);
await this.rides.save(ride);
if (next === "IN_RIDE") {
await this.drivers.setState(ride.driverId!, "IN_RIDE");
}
if (next === "COMPLETED") {
await this.onCompleted(ride);
}
if (next === "CANCELLED") {
await this.onCancelled(ride, actor);
}
}
private async onCompleted(ride: Ride): Promise<void> {
const actual = await this.tripTelemetry.summarize(ride.id);
const fare = this.pricing.finalize(ride, actual.distanceKm, actual.durationMin);
ride.attachFinalFare(fare);
await this.rides.save(ride);
await this.drivers.setState(ride.driverId!, "AVAILABLE");
// Payment is best-effort; retried by a worker if it fails.
this.capturePayment(ride, fare).catch(err =>
this.paymentRetryQueue.enqueue({ rideId: ride.id, err })
);
}
  computeFare(ride: Ride): Fare {
    if (ride.state !== "COMPLETED") return ride.quotedFare;
    return ride.finalFare ?? ride.quotedFare;
  }
private async onCancelled(ride: Ride, actor: Actor): Promise<void> {
// Release driver (if one was assigned).
if (ride.driverId) {
await this.drivers.releaseReservation(ride.driverId, ride.id);
await this.drivers.setState(ride.driverId, "AVAILABLE");
}
// Cancellation fee logic lives in the pricing strategy.
const fee = this.pricing.cancellationFee(ride, actor, this.clock.now());
if (fee.amount > 0) {
this.capturePayment(ride, fee).catch(err =>
this.paymentRetryQueue.enqueue({ rideId: ride.id, err })
);
}
}
private async capturePayment(ride: Ride, fare: Fare): Promise<void> {
const result = await this.payments.charge(ride.riderId, fare);
if (result.status === "FAILED" || result.status === "RETRY_LATER") {
throw new PaymentFailedError(ride.id, result);
}
await this.paymentsRepo.save({
rideId: ride.id,
amount: fare,
txnId: result.txnId,
status: "SUCCESS",
});
}
}
// Concrete matching strategy β nearest available within radius, tie-broken by wait time.
class NearestAvailableStrategy implements IMatchingStrategy {
private static readonly INITIAL_RADIUS_KM = 3;
private static readonly EXPANDED_RADIUS_KM = 5;
constructor(
private readonly locationIndex: ILocationIndex,
private readonly drivers: DriverRepository,
) {}
nextCandidate(ride: Ride, excluded: Set<DriverId>): Driver | null {
const attempt = (radiusKm: number) => {
const results = this.locationIndex.queryNearest(
ride.source,
radiusKm,
(id) => !excluded.has(id) && this.drivers.isAvailableOfType(id, ride.cabType),
);
// Tie-breaker: equally near (within 50 m), pick the driver who has been
// AVAILABLE the longest. Fair to drivers who've been waiting.
results.sort((a, b) => {
if (Math.abs(a.distanceKm - b.distanceKm) < 0.05) {
return this.drivers.availableSince(a.driverId)
- this.drivers.availableSince(b.driverId);
}
return a.distanceKm - b.distanceKm;
});
return results[0]?.driverId ?? null;
};
const first = attempt(NearestAvailableStrategy.INITIAL_RADIUS_KM);
if (first) return this.drivers.get(first);
const second = attempt(NearestAvailableStrategy.EXPANDED_RADIUS_KM);
return second ? this.drivers.get(second) : null;
}
}
// Concrete pricing strategy β base + distance + time, multiplied by surge.
class StandardPricingStrategy implements IPricingStrategy {
private static readonly RATES: Record<CabType, { base: number; perKm: number; perMin: number }> = {
POOL: { base: 2.00, perKm: 0.80, perMin: 0.15 },
STANDARD: { base: 3.00, perKm: 1.20, perMin: 0.20 },
XL: { base: 5.00, perKm: 1.80, perMin: 0.30 },
};
private static readonly CANCEL_GRACE_MS = 2 * 60 * 1000;
quote(source: Location, dest: Location, cabType: CabType, surge: number): Fare {
const distanceKm = haversine(source, dest);
const durationMin = estimateDuration(source, dest);
return this.compute(cabType, distanceKm, durationMin, surge);
}
finalize(ride: Ride, actualDistance: number, actualDuration: number): Fare {
// Honor the quoted surge β rider is protected from mid-ride spikes.
const base = this.compute(ride.cabType, actualDistance, actualDuration, ride.quotedSurge);
// Guard: if actual is wildly higher than quoted (traffic detour), cap at quote + 20%.
const ceiling = { ...ride.quotedFare, amount: ride.quotedFare.amount * 1.2 };
return base.amount > ceiling.amount ? ceiling : base;
}
cancellationFee(ride: Ride, actor: Actor, now: number): Fare {
if (actor.role === "DRIVER") return zeroMoney(ride.quotedFare.currency);
const acceptedAt = ride.transitionTimestamp("ACCEPTED");
if (!acceptedAt || now - acceptedAt < StandardPricingStrategy.CANCEL_GRACE_MS) {
return zeroMoney(ride.quotedFare.currency);
}
// Flat fee equal to the base fare of the cab type.
return {
amount: StandardPricingStrategy.RATES[ride.cabType].base,
currency: ride.quotedFare.currency,
};
}
private compute(cabType: CabType, km: number, min: number, surge: number): Fare {
const r = StandardPricingStrategy.RATES[cabType];
const amount = (r.base + r.perKm * km + r.perMin * min) * surge;
return { amount: roundToCents(amount), currency: "USD" };
}
}

Why does requestRide return before matching completes? UX. The rider hits "book" and sees "searching…" immediately; the server isn't holding the HTTP connection for 2 s while the matcher works. The dispatch runs async; the client polls or subscribes via websocket for state changes.
Why does dispatch re-enter REQUESTED on decline? The ride is between drivers: not matched, not cancelled. Modeling this as REQUESTED keeps the state machine small; a separate RE_OFFERING state would add complexity without new semantics.
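waitForAccept is referenced in dispatch but never shown. A minimal sketch of the offer window it implies (all names assumed): a promise that resolves true when the driver acks in time and false when the timer fires:

```typescript
// Offer window sketch: wait() parks a resolver keyed by ride; ack() settles
// it if the window is still open, otherwise reports the window as closed.
type RideId = string;

class OfferWindow {
  private pending = new Map<RideId, (accepted: boolean) => void>();

  wait(rideId: RideId, timeoutMs: number): Promise<boolean> {
    return new Promise<boolean>((resolve) => {
      const timer = setTimeout(() => {
        this.pending.delete(rideId);
        resolve(false); // driver never tapped accept
      }, timeoutMs);
      this.pending.set(rideId, (accepted) => {
        clearTimeout(timer);
        this.pending.delete(rideId);
        resolve(accepted);
      });
    });
  }

  // Called from driverAccept. Returns false if the window already closed.
  ack(rideId: RideId): boolean {
    const settle = this.pending.get(rideId);
    if (!settle) return false;
    settle(true);
    return true;
  }
}

const offers = new OfferWindow();
const outcome = offers.wait("ride-1", 15_000); // dispatch awaits this
const acked = offers.ack("ride-1");            // driver tapped accept in time
const late = offers.ack("ride-1");             // window already settled
```

In a multi-node deployment the pending map would live in a shared store; this in-process version shows the shape of the protocol.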
Design Decisions & Tradeoffs
Matching: single-best-pick vs radius-ring broadcast
Two canonical approaches:
| Approach | How it works | Pros | Cons |
|---|---|---|---|
| Single-best-pick | Offer the nearest driver; wait 15 s; on decline, offer the next. | Driver sees one offer at a time: no confusion. Fair queue. | Rider-side latency on declines (each decline adds up to 15 s). |
| Broadcast-to-N | Offer to the top 3 drivers simultaneously; first to tap accept wins. | Lower end-to-end latency. Better on driver-sparse nights. | Losing drivers feel cheated ("I accepted but didn't get the ride"). Harder to prevent double-acceptance. |
| Hybrid (what Uber actually does) | Single-best-pick under normal supply; switch to broadcast when matching has failed once. | Best of both. | More config, more A/B testing. |
For the interview, single-best-pick with offer reservation is the defensible default. It composes well with the DispatchWorker pattern (see concurrency) and the tryReserveForOffer primitive generalizes to broadcast later.
Location index: geohash vs quadtree vs H3
Already covered in the table above. The short version: H3 is the modern default. Uber built it; Lyft and DoorDash both use it now. Geohash is fine for a greenfield prototype but has edge-cell pain. Quadtree is elegant but is rarely the right choice for a write-heavy workload with millions of pings per minute.
State machine storage: DB row vs event log
| Approach | Pros | Cons |
|---|---|---|
| Single mutable row (rides table, state column) | Simple reads. ACID transitions via WHERE state = $expected. | History is lost unless you also write an audit log. |
| Event log (ride_events, append-only) | Full auditability, easy to replay, temporal queries are free. | Reads are expensive (fold the log). Need a materialized view for the current state. |
| Hybrid (mutable row + event log) | Fast reads, full history. | Two writes per transition; consistency between them matters. |
The hybrid is what every real system ends up at. The interview answer: start with the row + append a row to an events side table in the same transaction. Use a CDC pipeline to feed the event log into an analytical store.
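A sketch of the hybrid write in miniature: a state-guarded update on the current-state row plus an event append. An in-memory map stands in for the rides table, and a real DB transaction would wrap the two writes; all names are illustrative:

```typescript
// Hybrid store sketch: one mutable current-state row + an append-only event
// list. The guard plays the role of `WHERE id=$id AND state=$expected`.
type State = string;
type RideRow = { id: string; state: State; version: number };
type RideEvent = { rideId: string; from: State; to: State; at: number };

const rows = new Map<string, RideRow>();
const events: RideEvent[] = [];

function transition(id: string, expected: State, next: State): boolean {
  const row = rows.get(id);
  if (!row || row.state !== expected) return false; // lost the race; caller re-reads
  row.state = next;   // write 1: the fast-read row
  row.version += 1;
  events.push({ rideId: id, from: expected, to: next, at: Date.now() }); // write 2: audit trail
  return true;
}

rows.set("r1", { id: "r1", state: "REQUESTED", version: 0 });
const won = transition("r1", "REQUESTED", "MATCHED");
const lost = transition("r1", "REQUESTED", "CANCELLED"); // expected state no longer holds
```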
Surge: real-time per-zone vs batched
Surge is ML territory (not LLD), but the LLD question is how frequently the pricing engine reads it. Two options:
- Read-through on every quote: Fresh but costly; every request hits the surge service.
- Zone-level cache with 30-s TTL: Cheap but may be stale during a sudden spike (e.g., concert just ended).
In practice, a push model is used: the surge service publishes updates to a pub/sub topic; the pricing engines subscribe and keep an in-memory map. Reads are O(1) map lookups. This is why ISurgePolicy.multiplierFor is synchronous and fast β the async piece lives in the publisher.
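A sketch of the consumer side of that push model, assuming the pub/sub subscription simply calls onSurgeUpdate; names are illustrative:

```typescript
// Push-model consumer: the subscription feeds an in-memory map, so
// multiplierFor stays a synchronous O(1) lookup.
type ZoneId = string;

class InMemorySurgePolicy {
  private multipliers = new Map<ZoneId, number>();

  // Invoked by the subscription whenever the surge service publishes a zone.
  onSurgeUpdate(zone: ZoneId, multiplier: number): void {
    this.multipliers.set(zone, Math.max(1.0, multiplier)); // clamp to >= 1.0
  }

  // Unknown zones read as "no surge".
  multiplierFor(zone: ZoneId): number {
    return this.multipliers.get(zone) ?? 1.0;
  }
}

const surge = new InMemorySurgePolicy();
surge.onSurgeUpdate("downtown", 1.8);
surge.onSurgeUpdate("suburb", 0.9); // below 1.0, clamped up
```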
Patterns Used
| Pattern | Where | Why |
|---|---|---|
| State | Ride, Driver | Lifecycle is a graph with forbidden edges. A state field + transition map is the canonical shape. |
| Strategy | IMatchingStrategy, IPricingStrategy, ISurgePolicy | Each is a policy knob the business owns; swapping them must not ripple through ride code. |
| Observer | Location pings → ILocationIndex; ride state → rider UI + driver UI + analytics | Multiple consumers of the same event. Publishing decouples the producer from the count of consumers. |
| Factory | Ride creation inside RideService.requestRide | Centralizes the invariants (initial state, quote lock, timestamp) in one place. |
| Saga | match → accept → in_ride → capture_payment across ride service, driver service, payments | Long-running multi-step flow with compensating actions (cancel payment, release driver) on failure. |
| Command | Each ride event (Request, Accept, Start, End, Cancel) | Events are serializable records that can be replayed or audited. Fits the event-log storage option. |
| Facade | RideService | The HTTP handlers, websocket layer, and worker pool all talk to one surface instead of eight internal collaborators. |
Concurrency Considerations
1. The matching race
The nastiest concurrency problem in the system: two riders request a ride and the matcher picks the same driver for both. Without coordination, both rides transition to MATCHED, both send the driver an offer, and the driver accepts one; the other ride is stuck in MATCHED forever.
Three resolution strategies:
A. Offer-one-driver-at-a-time (reservation lock). Before transitioning REQUESTED → MATCHED, atomically reserve the driver: UPDATE drivers SET offer_reservation = $rideId WHERE id = $driverId AND offer_reservation IS NULL. If the row count is zero, the driver is already offered to someone else; pick the next candidate. Release on decline / timeout / accept. This is what tryReserveForOffer does in the code above.
async tryReserveForOffer(driverId: DriverId, rideId: RideId): Promise<boolean> {
const res = await this.db.exec(
`UPDATE drivers
SET offer_reservation = $1, reserved_at = NOW()
WHERE id = $2 AND offer_reservation IS NULL AND state = 'AVAILABLE'`,
[rideId, driverId],
);
return res.rowCount === 1;
}

Pros: simple, race-free, works with any transactional store. Cons: a small latency hit (one DB roundtrip per candidate).
B. Central dispatcher per zone (serialize within zone). One DispatchWorker per geographic zone, single-threaded, processes matching for that zone in order. Races are impossible by construction: only one thread touches zone state. This is what Uber's internal dispatch does. Pros: wicked fast (all in-memory). Cons: the worker is a SPOF for the zone; needs leader election.
C. Optimistic CAS. Let the race happen, then detect it at accept time. If rides A and B were both offered to the same driver, the two driverAccept calls race on a conditional update that claims the driver (UPDATE drivers SET active_ride = $rideId WHERE id = $driverId AND active_ride IS NULL); only one succeeds. Pros: no upfront locking. Cons: the loser's ride is stuck; the dispatch worker has to detect and re-dispatch.
The canonical answer is (A) + (B) together: per-zone worker for throughput, reservation lock as the correctness backstop in case anything escapes the worker (cross-zone traffic, replica lag, etc.).
2. Ride state transitions are atomic
Every transition is ride.transitionTo(next) followed by rides.save(ride). Two callers racing on the same ride (e.g., rider cancels while driver accepts) must serialize. Options:
- Row-level lock:
SELECT ... FOR UPDATEbefore the transition. - CAS on state:
UPDATE rides SET state=$new WHERE id=$id AND state=$expected. If zero rows updated, someone transitioned first β re-read and decide.
CAS is preferable: no lock held across application logic, no deadlock risk.
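A minimal sketch of the CAS approach, using an in-memory map where production code would issue the guarded SQL `UPDATE`; the store and method names are illustrative:

```typescript
// CAS-style transition: the write applies only if the ride is still in the
// expected state. A false return means someone raced us; re-read and decide.
type RideState = "REQUESTED" | "MATCHED" | "ACCEPTED" | "CANCELLED";

class InMemoryRideStore {
  private states = new Map<string, RideState>();
  create(id: string) { this.states.set(id, "REQUESTED"); }
  get(id: string): RideState | undefined { return this.states.get(id); }
  compareAndSetState(id: string, expected: RideState, next: RideState): boolean {
    if (this.states.get(id) !== expected) return false; // lost the race
    this.states.set(id, next);
    return true;
  }
}
```

For example, a rider cancel (REQUESTED → CANCELLED) racing the matcher (REQUESTED → MATCHED): exactly one CAS succeeds, and the loser re-reads the ride and backs off.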
3. Cancel-during-match race
Rider taps cancel exactly as the matcher is transitioning REQUESTED → MATCHED. Two things must be true afterwards: the ride ends in CANCELLED (the rider's intent always wins), and the driver who was about to be offered is released; they should not see a phantom offer.
Solution: the dispatch loop checks `ride.state === "CANCELLED"` before every side effect. Cancellation writes to the ride first; dispatch observes it on the next poll. The reservation is released when dispatch notices the cancel. If the MATCHED transition happened to win the race, the driver sees the offer for a moment and then it is withdrawn: an acceptable UX degradation given how rare the window is.
4. Location ping storm
A driver sends GPS updates every 4 seconds. Across 100k active drivers, that is 25k writes/s to the index. The writes don't need transactional guarantees: a lost ping is fine, the next one arrives in 4 s. Use a write-optimized path: a sharded in-memory index (H3 cell → driver set), with batched persistence to Redis every 1 s.
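A sketch of that write path, assuming a cell-to-driver-set map. `cellOf` here uses a crude fixed grid purely as a stand-in for a real H3 binding (e.g. `latLngToCell`); the class and method names are illustrative:

```typescript
// Ingest path for driver pings: update the in-memory cell → driver-set index,
// skipping the write entirely when the driver hasn't changed cell (the common
// case at 4 s ping intervals). Persistence to Redis would batch off this map.
class PingIndex {
  private cells = new Map<string, Set<string>>();
  private lastCell = new Map<string, string>();

  cellOf(lat: number, lng: number): string {
    // Crude ~1 km fixed grid, NOT H3 — stands in for h3.latLngToCell(lat, lng, res).
    return `${Math.floor(lat * 100)}:${Math.floor(lng * 100)}`;
  }

  ingest(driverId: string, lat: number, lng: number): void {
    const cell = this.cellOf(lat, lng);
    const prev = this.lastCell.get(driverId);
    if (prev === cell) return; // same cell: no index mutation needed
    if (prev) this.cells.get(prev)?.delete(driverId);
    if (!this.cells.has(cell)) this.cells.set(cell, new Set());
    this.cells.get(cell)!.add(driverId);
    this.lastCell.set(driverId, cell);
  }

  driversIn(cell: string): string[] {
    return [...(this.cells.get(cell) ?? [])];
  }
}
```

Losing this map on a crash is acceptable by design: the next round of pings rebuilds it within seconds.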
Scale & Extensibility
| Dimension | Approach |
|---|---|
| Geosharding | One cluster per city; requests routed by source zone. A rider crossing a city boundary mid-ride is a rare edge case: either the origin city owns the ride to completion, or we hand off at a zone boundary (complex; usually not done). |
| Location index hot-cache | Redis GEO commands (GEOADD / GEOSEARCH) as an L1; persistent H3 index as L2. Pings write to L1; L2 is refreshed async. |
| Matching worker pool | One DispatchWorker per zone, each single-threaded for correctness, fronted by a consistent-hash router on zoneId. Adding capacity = adding workers per zone. |
| New cab type | Config addition: add "ELECTRIC" to CabType enum, add pricing table entry, add filter to ILocationIndex.queryNearest. No core code change. |
| B2B / corporate accounts | Rider gains a corporateAccountId. IPaymentGateway becomes PaymentRouter that routes to an invoice-based gateway when the ride is tagged corporate. |
| Scheduled rides | Separate pipeline: ScheduledRide entity, a scheduler worker that wakes N minutes before pickup, creates a Ride, and dispatches. Core flow is unchanged once dispatch starts. |
| Fleet / dispatch mode | For taxi partners who dispatch from their own fleet, add a FleetDispatchStrategy implementing IMatchingStrategy. Fleet operator's driver list replaces city-wide pool. |
| Surge ML signal input | ISurgePolicy is the seam. A new implementation consumes ML model outputs (e.g., predicted demand vs supply per zone); pricing engine doesn't change. |
| Multi-region / cross-border | Out of scope for LLD. The primary change is payment currency handling and regulatory routing, which touches the pricing strategy, not matching. |
Edge Cases
- No driver available in radius. Expand radius (3 km → 5 km) once; if still no candidate, fail with `CANCELLED` + reason `no-driver-available`. Do not loop indefinitely: a stuck "searching…" is worse than an honest failure.
- Driver declines. Release reservation, exclude from the candidate list for this ride, re-enter `REQUESTED`, try the next candidate. Driver's rating is unchanged (declines are free); too many declines in a rolling window → driver flagged.
- Driver accepts but cancels before arrival. Ride is in `ACCEPTED` or `ARRIVING`. Ride transitions to `REQUESTED`, matcher re-dispatches. Driver takes a penalty (rating hit, small platform fee). Rider is not charged; rider sees a "finding a new driver" screen.
- Rider cancels while driver en route. Ride is in `ARRIVING`. If time since `ACCEPTED` > 2 min, rider pays a cancellation fee (typically base fare plus a proportion of distance traveled); if < 2 min, free. Fee is charged via `IPaymentGateway`; on failure, it is recorded as owed and retried.
- Driver goes offline mid-match. Driver's device loses connectivity after `MATCHED` but before accepting. Offer times out at 15 s, exactly as a decline would. Driver's state on reconnect: check whether their current reservation is still valid; if not, transition to `AVAILABLE` or remain `OFFLINE`.
- GPS drift. Rider taps "driver isn't here" but the driver's GPS is within 50 m. A classic source of disputes. Mitigation: never auto-cancel on the rider's claim alone; require driver confirmation too. For the LLD, this is a product flow, not a state-machine change.
- Multi-rider pool conflicts. Pool rides share the car; if rider A is in `PICKED_UP` and rider B's pickup is 2 min away, rider A's meter runs at the pool rate, not the individual rate. This is a pricing-strategy concern (`PoolPricingStrategy`), not a ride-state one. The ride object for a pool trip is actually a `PoolRide` with multiple rider segments.
- Surge changes between quote and accept. A quote is valid for a 60 s TTL. If the rider accepts within the TTL, the quoted surge is honored in `finalize`; if not, re-quote at the current surge before dispatch. Stored on the ride as `quotedSurge`.
- Payment fails post-ride. Ride is `COMPLETED`, `onCompleted` enqueues payment, gateway returns `FAILED`. The ride is not un-completed; payment is retried via a worker queue; after 3 attempts, it is escalated to the rider (add-to-next-trip or direct collection). The driver is paid their share regardless (the platform absorbs the risk).
- Driver marks "arrived" but is 500 m away. Don't prevent the action: drivers have local knowledge the GPS lacks (building entry on the other side, apartment complex). Instead, log the discrepancy; if the rider disputes, we have the evidence.
- Duplicate `driverAccept` calls. Driver taps accept twice (network retry, double-tap). The second call sees `ride.state !== "MATCHED"` (the first call already transitioned) and returns idempotently. Never a throw.
- Ride completed but GPS says they're still moving. Driver ended the trip early. Final fare uses the actual recorded distance up to the "end" tap; the rider can dispute in-app. Not a state-machine concern.
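The duplicate-accept case reduces to a single guarded transition; a minimal sketch, with illustrative types:

```typescript
// Idempotent accept: a retried or double-tapped accept observes that the ride
// is no longer MATCHED (or is matched to someone else) and returns quietly
// instead of throwing.
type AcceptResult = "accepted" | "already-handled";

function driverAccept(ride: { state: string; driverId?: string }, driverId: string): AcceptResult {
  if (ride.state !== "MATCHED" || ride.driverId !== driverId) {
    return "already-handled"; // duplicate tap, network retry, or lost race — never a throw
  }
  ride.state = "ACCEPTED";
  return "accepted";
}
```

The same guard also absorbs the option-C race: the driver who lost the CAS simply gets "already-handled".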
Follow-up Questions
- How do you implement ride-pool matching (two strangers, same car, overlapping routes)? Answer: detour budget (max % extra distance), compatible-route graph, greedy or bipartite matching. Not LLD-grade; HLD + optimization.
- How do you handle surge-zone boundaries? Answer: a ride's surge is determined by the pickup zone, not the drop-off. Smooth the boundary with hex-neighbour averaging to avoid cliff pricing.
- How would you scale matching to 100k rides/min in a megacity? Answer: zone sharding + in-memory dispatch workers + H3 index sharded by top-level cell + Redis GEO hot cache + async persistence.
- How do you prevent driver fraud (GPS spoofing)? Answer: cross-validate GPS against cell tower, WiFi SSID, accelerometer; ML model flags anomalies; suspect rides auto-routed to human review.
- How do you prioritize loyal riders during surge? Answer: a surge discount for high-tier riders is a `PricingStrategy` decorator on top of the base engine. A business rule, not a core model change.
- How do you A/B test pricing? Answer: a `PricingStrategy` factory returns a variant based on the rider's bucket (a hash of `riderId`). Logs include the variant id for attribution.
- How do you evolve from a city-based to a region-based architecture? Answer: introduce a regional router above city clusters, handle cross-city rides via a "handoff" state at zone boundaries, and ensure ride ids are globally unique.
- How do you implement "preferred driver" (rider wants a specific driver)? Answer: `IMatchingStrategy` composes: `PreferredDriverStrategy` tries the requested driver first, then falls back to `NearestAvailableStrategy` after a timeout. The rider is informed if fallback occurs.
- How do you handle a driver's ride getting cut off by the app crashing? Answer: the driver app reconnects, fetches the active ride state from the server, and resumes. The server tolerates missed pings for up to 60 s before marking the driver unreachable.
- How do you implement "upfront pricing" that quotes the final fare, not a range? Answer: `finalize` uses the quote unless actual distance/duration exceeds a threshold (e.g., +20% detour due to an unexpected road closure). Threshold and reconciliation are a pricing-strategy choice.
- How do you handle tolls and airport fees? Answer: `FareBreakdown` includes line items; tolls are added by a `TollStrategy` component keyed off the GPS trail versus known toll zones. Surfaced to the rider as a breakdown, not a surprise.
- How do you design ride tipping? Answer: a post-completion add-on, captured via a separate `IPaymentGateway.charge` call against the rider's payment method. Stored on the `Payment` record as `tip`.
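The A/B-testing answer hinges on a stable rider-to-bucket assignment: the same rider must always land in the same variant. A minimal sketch, where the hash and function names are illustrative:

```typescript
// Stable bucketing for pricing experiments: hash the riderId so assignment is
// deterministic across requests, with no per-rider state to store.
function bucketOf(riderId: string, variants: number): number {
  let h = 0;
  for (const ch of riderId) {
    h = (h * 31 + ch.charCodeAt(0)) >>> 0; // simple 32-bit rolling hash
  }
  return h % variants;
}

function pricingVariant(riderId: string): "control" | "treatment" {
  return bucketOf(riderId, 2) === 0 ? "control" : "treatment";
}
```

A `PricingStrategy` factory would switch on `pricingVariant(riderId)` and stamp the variant id onto the quote for later attribution.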
SDE2 vs SDE3: How the Bar Rises
| Dimension | SDE2 (mid-level) | SDE3 (senior) |
|---|---|---|
| State machine completeness | Lists the happy-path states; handles cancel from the obvious states. | All states, all transitions, including re-entrant MATCHED → REQUESTED on decline. Explicit transition map. Illegal transitions throw rather than silently no-op. |
| Matching race handling | Acknowledges the race, suggests "use a lock." | Articulates three strategies (reservation, per-zone worker, optimistic CAS), picks one with a stated tradeoff, and shows the concrete tryReserveForOffer primitive. |
| Location index awareness | "Use a quadtree" or "use Redis GEO." | Names H3, explains why (uniform neighbor distance), explains the read path (kRing + filter) and the write path (batched pings), and keeps it behind ILocationIndex so matcher doesn't care. |
| Surge reasoning | Surge is a multiplier. Done. | Quote-lock TTL, surge publisher vs synchronous read, zone boundary smoothing, and decoupling via ISurgePolicy. Acknowledges ML is out of scope but designs the seam cleanly. |
| Observability | "Log everything." | Structured events per transition, a metric per state (`ride.state.transitions{from, to}`), per-zone match-latency histograms, an alert on a spike in the REQUESTED → CANCELLED no-driver rate. Debuggability is a design property, not an afterthought. |
| Multi-regional extensibility | Handwaves "we'd shard per city." | Names the seams: regional router above city clusters, consistent-hash on zone id, handoff semantics at boundaries, regulatory routing at the payment layer. |
| Failure handling | "Retry on failure." | Saga model: which step's failure compensates which prior step. Payment failure doesn't reverse ride completion; no-driver failure does cancel. Idempotency keys on mutative endpoints. |
| Testing posture | Happy path tests. | Property-based tests over the state machine (no illegal transitions reachable from any state), deterministic clock injection for timeout tests, chaos-injection on the matcher to validate reservation release on every failure path. |
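The "explicit transition map" and property-based-testing rows can be made concrete together. The sketch below uses the lifecycle states named in this chapter; the exact edge set is assumed from the edge-case discussion and would be adjusted to match the final design:

```typescript
// Explicit transition map: anything not listed is illegal. Exhaustively walking
// every (state, state) pair is the degenerate form of the property test
// "no illegal transition is reachable from any state".
const TRANSITIONS: Record<string, string[]> = {
  REQUESTED: ["MATCHED", "CANCELLED"],
  MATCHED: ["ACCEPTED", "REQUESTED", "CANCELLED"], // decline/timeout re-enters REQUESTED
  ACCEPTED: ["ARRIVING", "REQUESTED", "CANCELLED"], // driver cancel re-dispatches
  ARRIVING: ["PICKED_UP", "REQUESTED", "CANCELLED"],
  PICKED_UP: ["COMPLETED"],
  COMPLETED: [], // terminal
  CANCELLED: [], // terminal
};

function canTransition(from: string, to: string): boolean {
  return (TRANSITIONS[from] ?? []).includes(to);
}

// Cheap invariant check: enumerate every pair and count the legal edges, so a
// test fails loudly if someone accidentally widens or narrows the map.
function legalEdgeCount(): number {
  const states = Object.keys(TRANSITIONS);
  let n = 0;
  for (const from of states) {
    for (const to of states) {
      if (canTransition(from, to)) n++;
    }
  }
  return n;
}
```

A property-based test would drive random sequences of events against the ride object and assert that every observed transition satisfies `canTransition`, with a deterministic injected clock for the timeout paths.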