A few thousand seats, a million people hitting “buy” the instant the on-sale opens, and one rule that cannot bend: no two people get the same seat. This is the one cluster where you trade availability for consistency on purpose. The interview turns on two things — how you hold a seat during checkout, and how you keep the backend standing during a Taylor-Swift-scale stampede.
Every FAANG company runs a rubric. The dimensions are roughly the same; the weights differ by company and level. At senior+ the boxes-and-arrows are table stakes — what gets graded hardest is the quality of your decisions: the questions you asked first, the trade-offs you surfaced and defended, and the production reality you volunteered without being asked.
| Dimension | Weight | What earns the signal |
|---|---|---|
| Requirements & scoping | 10–15% | You scoped before drawing, asked enough to bound the problem, pinned the scale number, and stated assumptions out loud. |
| High-level architecture | 20–25% | The right components, a clear data flow, and a reason every box exists. The design satisfies each functional requirement. |
| Technical depth / deep dives | ~30% | You go three questions deep on the hard part without being rescued. This is where staff is won or lost. |
| Trade-offs & judgment | highest effective | Two viable options, what each costs, and a committed pick for this system. Simplicity over flash when flash isn't warranted. |
| Communication / driving | cross-cutting | You drive the 45 minutes; the interviewer never has to rescue you. You narrate, checkpoint, and narrow when the design sprawls. |
| Operational maturity | ↑ in 2026 | The newest weight: observability, rollout, failure modes, on-call reality — volunteered, not pried out. |
A solid design with reasonable trade-offs is a strong score for a mid-level candidate and a downlevel flag for staff. The questions can be identical; the depth expectation is not. As you climb, the balance tips from breadth toward depth, proactivity, and production reality.
You don't recite the AWS pillars; you anchor each decision to one of them. It signals you evaluate systems across competing concerns rather than optimizing one axis. Each pillar below is mapped to a move you can make in this exact design.
Watch the invariant.
Never touch raw card data.
Degrade without overselling.
Hold cheaply.
Right consistency, right place.
Shed load early.
One rule cannot bend: no two people get the same seat. That makes this the rare design where you trade availability for consistency on purpose. The interview turns on two things: how you hold a seat during checkout, and how you keep the backend standing during a Taylor-Swift-scale stampede.
The simulation. Framing: a high-demand ticketing platform. A hot on-sale brings ~1M concurrent users competing for a few thousand seats, with a checkout hold window of ~10 min, zero tolerance for double-booking, and eventual consistency acceptable for browsing.
“This is a consistency problem, not a throughput problem. The invariant is that two people never get the same seat, so for booking I’ll choose CP — I’ll reject under partition rather than risk a double sale. Browsing the seat map, though, can be eventually consistent.”
“Two questions: how long do we hold a seat during checkout, and do we want a waiting room for big on-sales? Those shape the locking and the spike strategy more than raw QPS does.”
“We’ll keep it eventually consistent for speed.” For seat inventory that’s an oversell waiting to happen. Naming the wrong consistency model here is an instant flag.
Design Ticketmaster.
The defining constraint is correctness under contention: no seat is ever sold twice. So booking is strongly consistent — CP — while browsing and the seat map can be eventually consistent and cached. The two hard parts are how I hold a seat during checkout and how I survive a million people arriving at the same second. Let me confirm the hold window and whether we want a waiting room.
10-minute hold. Assume a massive on-sale.
Then I’ll use an expiring lock for the hold and a virtual waiting room to throttle entry into the booking path. Let me build the core booking flow first, then layer the spike handling.
Entities: Event, Seat / Ticket (the state machine is the design: available → held → booked), Booking (id, userId, status). Interface:
At on-sale, contention is wildly uneven: a million users converge on a few thousand seats in seconds. You don’t need a precise QPS — you need to recognize that most requests will lose, so the system must reject losers cheaply and protect the booking path. That recognition is what justifies the waiting room.
“The seat state machine is the whole model: available, held, booked. Every concurrency question is really ‘who is allowed to transition this seat, and atomically.’”
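To make the "who is allowed to transition this seat" framing concrete, here is a minimal sketch of the three-state machine. The names `SeatState`, `ALLOWED`, and `transition` are illustrative, not from any particular codebase, and the held → available edge is assumed from the hold-expiry behavior described in this guide. The sketch covers only which moves are legal; making each move atomic is the job of the lock or conditional update.

```python
from enum import Enum


class SeatState(Enum):
    AVAILABLE = "available"
    HELD = "held"
    BOOKED = "booked"


# Allowed transitions; anything not listed here is rejected.
ALLOWED = {
    (SeatState.AVAILABLE, SeatState.HELD),   # a user starts checkout
    (SeatState.HELD, SeatState.BOOKED),      # payment succeeded
    (SeatState.HELD, SeatState.AVAILABLE),   # hold expired or was abandoned
}


def transition(current: SeatState, target: SeatState) -> SeatState:
    """Return the new state, or raise ValueError if the move is not allowed."""
    if (current, target) not in ALLOWED:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target
```

Note that there is deliberately no available → booked edge: a seat has to pass through a hold before it can be sold.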
User taps a seat → POST /bookings hits the Booking Service. It acquires a distributed lock on that seat in Redis with a 10-minute TTL in a single atomic command (SET with the NX and EX options; SETNX on its own does not set an expiry, and a separate EXPIRE call would leave a window where a crash strands the lock forever), writes a booking row with status in-progress, and returns a bookingId, routing the user to payment. On successful payment, the seat flips to booked and the lock releases. If the user abandons checkout, the TTL expires and the seat returns to available automatically, with no cleanup job required.
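The hold semantics can be sketched without a Redis server. The class below is an in-memory stand-in, not a Redis client: `HoldStore`, `try_hold`, `holder`, and `release` are hypothetical names, and the injectable `clock` exists only so expiry can be simulated. In Redis itself the equivalent acquire is a single `SET key value NX EX 600` command.

```python
import threading
import time
from typing import Callable, Dict, Optional, Tuple


class HoldStore:
    """In-memory stand-in for a 'set if absent, with expiry' seat lock."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._mutex = threading.Lock()
        # seat_id -> (owner, expires_at)
        self._holds: Dict[str, Tuple[str, float]] = {}

    def try_hold(self, seat_id: str, owner: str, ttl_seconds: float) -> bool:
        """Atomically take the hold if no live hold exists; True on success."""
        with self._mutex:
            now = self._clock()
            current = self._holds.get(seat_id)
            if current is not None and current[1] > now:
                return False  # someone else has an unexpired hold
            self._holds[seat_id] = (owner, now + ttl_seconds)
            return True

    def holder(self, seat_id: str) -> Optional[str]:
        """Return the owner of a live hold, or None if free or expired."""
        with self._mutex:
            current = self._holds.get(seat_id)
            if current is None or current[1] <= self._clock():
                return None
            return current[0]

    def release(self, seat_id: str, owner: str) -> bool:
        """Release only if the caller still owns a live hold."""
        with self._mutex:
            current = self._holds.get(seat_id)
            if current is None or current[0] != owner or current[1] <= self._clock():
                return False
            del self._holds[seat_id]
            return True
```

With a fake clock, a first `try_hold("seat-12A", "alice", 600)` succeeds, a second attempt by another user fails, and once the clock passes 600 seconds the seat can be held again with no cleanup step. The owner check in `release` is a design choice worth naming: it stops a slow checkout from releasing a hold that has already expired and been taken by someone else.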
Push availability changes to clients with Server-Sent Events or long-polling so a seat turning red appears in near real time — this is the eventually-consistent read path, deliberately separate from the strongly-consistent booking path.
“The hold is a Redis lock with a TTL. The TTL is the elegant part — an abandoned checkout self-heals when the key expires, so I never need a sweeper job hunting for stale holds.”
Pure optimistic concurrency (a version check at write time) is efficient under low contention but produces a storm of failed checkouts under a hot on-sale, so name the contention assumption when you raise it. It is best used as a final-confirm backstop, not the primary hold.
“We’ll hold a database transaction open while the user pays.” Holding a transaction across a multi-minute human checkout pins connections and locks rows for the entire on-sale, so the connection pool is exhausted almost as soon as the sale opens.
A million users against a few thousand seats will flatten any backend if you let them all in. Put a virtual waiting room in front: admit users into the seat-selection / booking path at a controlled rate, hold the rest in a fair FIFO queue, and show their position. This converts an uncontrollable stampede into a steady, provisioned flow — you size the backend for the admission rate, not the mob.
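The admission mechanics can be sketched as a FIFO queue drained at a fixed rate. This is a single-process illustration under assumed names (`WaitingRoom`, `join`, `position`, `admit_tick`); a real waiting room would be a distributed service, and the "tick" would be driven by a timer sized to the rate the backend is provisioned for.

```python
from collections import deque
from typing import Deque, List, Optional, Set


class WaitingRoom:
    """FIFO queue that admits at most a fixed number of users per tick."""

    def __init__(self, admits_per_tick: int) -> None:
        if admits_per_tick <= 0:
            raise ValueError("admits_per_tick must be positive")
        self._admits_per_tick = admits_per_tick
        self._queue: Deque[str] = deque()
        self._waiting: Set[str] = set()

    def join(self, user_id: str) -> int:
        """Queue the user (once) and return their 1-based position."""
        if user_id not in self._waiting:
            self._queue.append(user_id)
            self._waiting.add(user_id)
        return self._queue.index(user_id) + 1

    def position(self, user_id: str) -> Optional[int]:
        """1-based position in line, or None if the user is not waiting."""
        if user_id not in self._waiting:
            return None
        return self._queue.index(user_id) + 1  # O(n); fine for a sketch

    def admit_tick(self) -> List[str]:
        """Admit the next batch, in arrival order, up to the provisioned rate."""
        admitted: List[str] = []
        while self._queue and len(admitted) < self._admits_per_tick:
            user_id = self._queue.popleft()
            self._waiting.discard(user_id)
            admitted.append(user_id)
        return admitted
```

With `WaitingRoom(admits_per_tick=2)` and users a, b, c joining in that order, the first tick admits a and b, and c moves to position 1. Rejoining does not move a user to the back or create a second place in line, which keeps the queue fair under refresh-happy clients.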
Payment processors retry webhooks; users double-click “Purchase.” Make order creation idempotent with a key (the bookingId or a client token) so a duplicate confirmation never produces a second charge or a second ticket. Non-negotiable wherever money moves.
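A minimal sketch of idempotent confirmation, assuming the key is the bookingId or a client token as described above. `OrderService`, `confirm`, and the `charges_made` counter are hypothetical; the counter stands in for the real charge and ticket issuance. The sketch is single-threaded: in a real system the lookup and insert would need to be atomic, for example via a unique constraint on the key.

```python
import itertools
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class Order:
    order_id: int
    booking_id: str
    amount_cents: int


class OrderService:
    """Creates at most one order (and one charge) per idempotency key."""

    def __init__(self) -> None:
        self._orders_by_key: Dict[str, Order] = {}
        self._ids = itertools.count(1)
        self.charges_made = 0  # stand-in for calls to a payment provider

    def confirm(self, idempotency_key: str, booking_id: str,
                amount_cents: int) -> Tuple[Order, bool]:
        """Return (order, created). A repeated key returns the first order."""
        existing = self._orders_by_key.get(idempotency_key)
        if existing is not None:
            return existing, False  # duplicate webhook or double-click: no-op
        self.charges_made += 1
        order = Order(next(self._ids), booking_id, amount_cents)
        self._orders_by_key[idempotency_key] = order
        return order, True
```

Calling `confirm` twice with the same key returns the same `Order` both times and leaves `charges_made` at 1, which is the property the duplicate-webhook case needs.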
If the lock store is unavailable, you must fail closed on booking — fall back to a conditional update in the strongly-consistent DB (UPDATE … WHERE status = 'available', an OCC check), and if even that’s uncertain, reject the purchase. A rejected purchase is recoverable; a double-sold seat is a refund, an angry fan, and a support nightmare. As a final backstop, re-verify availability with an OCC check at the moment of confirmation.
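The conditional-update fallback can be sketched with SQLite from the Python standard library. The `seats` table, its columns, and the function names are assumptions for illustration; the point is that the `WHERE status = 'available'` clause makes the check and the write one atomic statement, and the affected-row count tells the caller whether it won.

```python
import sqlite3


def create_schema(conn: sqlite3.Connection) -> None:
    """Hypothetical minimal schema for the source-of-truth seat table."""
    conn.execute(
        "CREATE TABLE seats ("
        "seat_id TEXT PRIMARY KEY, "
        "status TEXT NOT NULL, "
        "held_by TEXT)"
    )


def try_hold_in_db(conn: sqlite3.Connection, seat_id: str, user_id: str) -> bool:
    """Move the seat to 'held' only if it is currently 'available'."""
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            "UPDATE seats SET status = 'held', held_by = ? "
            "WHERE seat_id = ? AND status = 'available'",
            (user_id, seat_id),
        )
    # Exactly one row changes for the winner; zero rows for everyone else.
    return cur.rowcount == 1
```

If two callers race for the same seat, only the first sees a row count of 1; the second matches no rows because the status is no longer available, and should be rejected. Unlike the Redis hold, this path has no automatic expiry, so a hold timestamp and a cleanup step would be needed while it is in use.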
Be explicit: strong for the seat transition and the order; eventual for search, the seat map, and analytics. Stating which subsystem gets which model is a senior signal in itself.
“The waiting room is load-shedding by design: I’d rather admit users at a rate I’ve provisioned for and queue the rest fairly than let the full stampede hit the booking path and take everyone down.”
“We’ll fail open if the lock store dies so users can still buy.” Failing open on seat inventory is the oversell. For this system, failing closed is the only defensible call.
Two users click the same seat in the same millisecond. Trace it.
Both attempt an atomic set-if-absent on seat:{id} (SETNX semantics, with the hold TTL set in the same command). Redis serializes them, so exactly one succeeds and gets the hold plus an in-progress booking. The other’s attempt fails, so it knows the seat is held and the client is told to pick another seat; over SSE the seat already shows red in near real time. No transaction held open, no double-hold.
The Redis lock layer goes down mid-sale.
I fail closed on booking. New holds fall back to a conditional update in the source-of-truth DB — update the seat to held only where it’s currently available, which is an atomic OCC check. If that path is degraded too, I’d rather reject purchases briefly than risk overselling. The invariant wins over availability here, every time — and I’d be alerting on the fail-closed state immediately.
Load-test against synthetic on-sales before real ones; canary changes to the locking/waiting-room logic on smaller events first. Keep the OCC backstop independently deployable.
“With more time I’d detail the payments and fraud paths and dynamic pricing. I scoped them out deliberately — payments need the same strong consistency I built for seats, just on money instead of inventory.”
Interviewers push on the locking model and the failure modes. Commit to CP, name the trade-off, protect the invariant.
Atomic SETNX (or equivalent) on the seat key. Redis serializes the attempts, exactly one wins the hold, the loser sees the seat is held and picks another. No held transaction, no race — the atomicity is the whole guarantee.
The hold is a lock with a TTL, so it auto-releases when the key expires — the seat returns to available with no sweeper job. That automatic expiry is exactly why I use Redis rather than a DB row lock.
I need a temporary reservation that expires on its own. Relational DBs have no native row TTL, so I'd bolt on cron-based cleanup; Redis gives automatic key expiry and sub-millisecond acquire/release under the heavy concurrency of an on-sale. The DB stays the source of truth for the final booked state.
A virtual waiting room in front of the booking path. It admits users at a rate I've provisioned for and holds the rest in a fair FIFO queue with a visible position. I size the backend for the admission rate, not the raw stampede — it's deliberate load-shedding.
Idempotent order creation keyed on the bookingId (or a client idempotency token). The second webhook is recognized as a duplicate and is a no-op — no second charge, no second ticket. Mandatory anywhere money moves.
Fail closed on booking. Holds fall back to a conditional update in the source-of-truth DB — set the seat to held only where it's available, an atomic OCC check — and if that's uncertain, reject the purchase. A rejected sale is recoverable; an oversold seat isn't. And I alert the instant we enter that fallback.
A clean design with one of these undercurrents still scores below the bar at senior+. None are about getting an answer wrong — they're about how you operate.
Jumping to architecture without bounding the problem or confirming scale. Reads as template-matching.
"It depends" with no decision behind it. Name the trade-off, then pick.
Choosing eventual consistency for seat inventory ‘for speed.’ It's an oversell waiting to happen and an instant flag.
Pinning a transaction open while a human pays for minutes — it collapses the connection pool the moment the on-sale starts.
Allowing bookings through when the lock store is down. For seat inventory, failing open is the double-sale.
No observability, no rollout, no failure-mode plan. In 2026 this reads as "has never carried a pager."
Confident wrong answers when pushed. Far worse than an honest "here's what I'd verify."
Waiting to be asked the next question. At staff you own the 45 minutes.
Run a mock and score yourself honestly against the dimensions the interviewer uses. If you can't hit "strong" on depth and operability, that's your signal on where to drill.
| Dimension | Weak (downlevel) | Strong (at level) |
|---|---|---|
| Scoping | Picked eventual consistency or skipped the invariant. | Named no-double-booking as a correctness invariant; chose CP for booking, eventual for browse. |
| Hold mechanism | Held a DB transaction or had no expiry. | Atomic Redis lock with TTL; abandoned holds self-heal; DB is source of truth for booked state. |
| Spike handling | Let the stampede hit the backend. | Virtual waiting room admitting at a provisioned rate with a fair queue. |
| Idempotency | Forgot duplicate webhooks / double-clicks. | Idempotent order creation keyed on bookingId; no double charge or double ticket. |
| Failure fallback | Failed open or had no plan. | Failed closed; OCC conditional-update backstop; alert on entering fallback. |
| Operability | Never mentioned it. | Oversell-rate-zero invariant metric, lock contention, conversion, fail-closed alerting. |