Find the closest available driver among millions of moving cars, in under two seconds, while those cars report their location every few seconds. Two problems hide inside one prompt: a write-heavy location firehose and a latency-critical proximity search. The answer is never a distance scan over a table — it’s a spatial index. Say that in the first sentence and you’ve already passed the part most candidates fail.
Every FAANG company runs a rubric. The dimensions are roughly the same; the weights differ by company and level. At senior+ the boxes-and-arrows are table stakes — what gets graded hardest is the quality of your decisions: the questions you asked first, the trade-offs you surfaced and defended, and the production reality you volunteered without being asked.
| Dimension | Weight | What earns the signal |
|---|---|---|
| Requirements & scoping | 10–15% | You scoped before drawing, asked enough to bound the problem, pinned the scale number, and stated assumptions out loud. |
| High-level architecture | 20–25% | The right components, a clear data flow, and a reason every box exists. The design satisfies each functional requirement. |
| Technical depth / deep dives | ~30% | You go three questions deep on the hard part without being rescued. This is where staff is won or lost. |
| Trade-offs & judgment | highest effective | Two viable options, what each costs, and a committed pick for this system. Simplicity over flash when flash isn't warranted. |
| Communication / driving | cross-cutting | You drive the 45 minutes; the interviewer never has to rescue you. You narrate, checkpoint, and narrow when the design sprawls. |
| Operational maturity | ↑ in 2026 | The newest weight: observability, rollout, failure modes, on-call reality — volunteered, not pried out. |
A solid design with reasonable trade-offs is a strong score for a mid-level candidate and a downlevel flag for staff. The questions can be identical; the depth expectation is not. As you climb, the balance tips from breadth toward depth, proactivity, and production reality.
You don't recite AWS — you anchor each decision to one of these. It signals you evaluate systems across competing concerns rather than optimizing one axis. Each pillar below is mapped to a move you can make in this exact design.
Observe matching and ingestion.
Protect rider/driver PII.
Survive a node or GPS loss.
Turn a global scan into a few cell lookups.
Don’t durably store every ping.
Right-size the firehose.
Two problems hide in one prompt: a write-heavy location firehose (every car reporting GPS every few seconds) and a latency-critical proximity search (find the nearest driver in under two seconds). The answer to both is a spatial index — never a distance scan over a table. Draw the two paths and the design defends itself.
The simulation. Framing: city-scale ride-hailing — ~5M active drivers publishing location every ~4s (on the order of 1M+ writes/sec), riders expecting a match in < 2s, location allowed to be slightly stale, payments strongly consistent (noted, scoped out).
1–2s.“There are really two systems here: a write-heavy pipeline ingesting location pings, and a read-latency-critical matching service. I’ll design them separately because their scaling pressures are opposite.”
“I’ll set consistency per subsystem: slightly stale driver locations are acceptable, but assignment must be atomic — double-dispatching one driver to two riders is a correctness bug, not a UX nit.”
“We’ll query the database for drivers within X kilometers.” A WHERE distance < r scan over all drivers can’t use an index and signals you’ve never handled spatial data.
Entities: Driver (id, location, status: available/on-trip/offline), Rider, Trip (state machine: requested → matched → ongoing → complete). Interface:
5M drivers × a ping every 4s ≈ 1.25M writes/sec. Persisting that to a durable transactional DB is hopeless and pointless. State it plainly: locations live in an in-memory geo store, sampled asynchronously to durable storage only if history is needed.
“At over a million location writes a second, I won’t durably persist every ping. Current position lives in memory; I sample to cold storage for analytics. The freshest data being in RAM is exactly what the match needs anyway.”
Encode every location into a cell. Geohash turns lat/long into a string whose prefix length controls precision (a 6-char cell is roughly a kilometer; 7-char is ~150m) and nearby points share prefixes — simple and the right first answer. Quadtree subdivides denser regions into finer cells, adapting to driver density. S2 / H3 (what Uber actually uses) give uniform cells on the sphere with cleaner geometry. Store drivers indexed by cell in an in-memory store (e.g. Redis with geo commands).
Start with the rider’s cell and immediate neighbors. If too few candidates (common in suburbs), expand to a coarser precision / wider radius and retry until enough drivers are found or a max radius is hit — then return “no drivers available.”
“Matching reads the rider’s cell plus neighbors, turning a global scan into a handful of cell lookups. If the cell is sparse I widen the radius progressively rather than scanning everything from the start.”
Quadtree or H3 as your first answer. They’re correct for density-adaptive surge cells, but they’re harder to implement; lead with geohash and bring up quadtree/H3 when density or surge granularity demands it.
A single matching server holding all drivers in one process. It can’t survive a hot city and it’s a single point of failure — the index must shard by region/cell.
Two matches must never assign the same driver. Make the assignment atomic: a compare-and-set on the driver’s status (available → reserved) or a short-lived distributed lock per driver. The first match wins the CAS; the second sees the driver is no longer available and moves to the next candidate. Pair it with the WebSocket offer flow: offer to the top candidate with a ~15s timeout; on accept, confirm the reservation; on decline/timeout, release and try the next.
Dense areas (a stadium, midtown Manhattan) pack thousands of drivers into one cell, creating a read/write hot shard. Use a finer precision there (or a quadtree that subdivides automatically) so dense regions get more, smaller cells while rural areas stay coarse. Shard the index by region so one busy city doesn’t overload a single node.
Mark a driver’s position stale after a missed-ping threshold and exclude stale drivers from matching; use last-known position with light interpolation for display only. Don’t offer rides based on a position that’s minutes old.
Compute demand:supply per cell on a rolling window; a coarser cell (geohash-6) is the right granularity for pricing. This is a separate read of the same cell structure — a clean place to mention cross-team seams.
“Assignment is the one strongly-consistent operation in an otherwise eventually-consistent system. I’d guard it with an atomic CAS on driver status so concurrent matches can’t both win the same driver — everything else, including location, can be stale.”
“We’ll just lock the whole driver table.” A global lock kills throughput. The contention is per-driver; the lock must be too.
Two riders, same instant, both matched to driver D. What happens?
Both matching attempts try to flip D’s status from available to reserved with an atomic compare-and-set. Exactly one succeeds. The winner sends D the ride offer over WebSocket; the loser immediately sees D is no longer available and moves to the next candidate in its ranked list. No double-dispatch, and neither rider waits on a global lock.
D doesn’t respond within 15 seconds.
The offer times out, I release D’s reservation back to available, and offer the next-nearest candidate. If everyone in the radius declines, I expand the radius and retry, and eventually tell the rider no drivers are available rather than spinning forever.
Multi-region with regional failover; a city is independently operable. Roll out index or matching changes per region as canaries. Keep a coarse fallback matcher if the primary index degrades.
“With more time I’d detail ETA prediction and the payments path — payments need the strong consistency I deliberately kept out of the location layer. I scoped them out, I didn’t miss them.”
Interviewers push on the index choice and the consistency seams. Commit, name the trade-off, and stop.
Geohash is the right first answer: simple, prefix-based, and supported directly by Redis geo commands, so a match is a cell lookup plus neighbors. I reach for a quadtree or H3 when density varies wildly and I need adaptive cell sizes — for example, surge pricing where one Manhattan cell shouldn't hold 10,000 drivers. Lead with geohash, escalate when density demands it.
Yes — a hot cell, both for writes (pings) and reads (matches). Fix it with finer precision in dense regions (or a quadtree that subdivides automatically), and shard the index by region so one city's load doesn't fall on a single node.
Make assignment atomic: a compare-and-set on the driver's status, or a short per-driver lock. First match wins, second sees the driver is reserved and tries the next candidate. The lock is per-driver, never global, so throughput stays high.
Dynamic radius expansion: start with the rider's cell and neighbors, and if too few candidates, step to a coarser precision / wider radius and retry until enough drivers appear or a max radius is reached — then return 'no drivers available' rather than hanging.
I never persist every ping durably. Current position lives in an in-memory geo store that all matching nodes read; I sample asynchronously to cold storage only for analytics or history. The freshest data being in RAM is exactly what matching wants.
I mark the position stale after a missed-ping threshold and exclude stale drivers from matching — offering a ride based on a minutes-old position is worse than not offering. For display I show last-known with light interpolation, but I don't dispatch on it.
A clean design with one of these undercurrents still scores below the bar at senior+. None are about getting an answer wrong — they're about how you operate.
Jumping to architecture without bounding the problem or confirming scale. Reads as template-matching.
"It depends" with no decision behind it. Name the trade-off, then pick.
Stalling at the core deep dive until the interviewer feeds you the answer. Depth is the senior+ bar.
Proposing WHERE distance < r without a spatial index. The single fastest way to signal you've never indexed location data.
Designing the happy path and forgetting that two concurrent matches can grab the same driver. It's the correctness crux.
No observability, no rollout, no failure-mode plan. In 2026 this reads as "has never carried a pager."
Confident wrong answers when pushed. Far worse than an honest "here's what I'd verify."
Waiting to be asked the next question. At staff you own the 45 minutes.
Run a mock and score yourself honestly against the dimensions the interviewer uses. If you can't hit "strong" on depth and operability, that's your signal on where to drill.
| Dimension | Weak (downlevel) | Strong (at level) |
|---|---|---|
| Scoping | One undifferentiated system. | Split write-heavy ingestion from latency-critical matching; set consistency per subsystem. |
| Spatial index | Distance scan or vague 'use a map API'. | Committed to geohash, named quadtree/H3 and when they win, used cell + neighbor lookups. |
| Path separation | Pings and matches share one store/path. | In-memory geo store for live position; durable sampling separate; matching reads the index. |
| Double-dispatch | Needed a hint; global lock. | Raised it unprompted; per-driver atomic CAS with WebSocket offer + timeout + next-candidate. |
| Hot cells | Single node holds all drivers. | Finer precision in dense areas, region sharding, hot-cell alerting. |
| Operability | Never mentioned it. | Match-latency and dispatch-success SLOs, ingestion lag, regional failover, canary rollout. |