The system design interview, calibrated to the FAANG bar

Voice-first mock interviews with a written hire/no-hire verdict. Built for senior+ engineers preparing for the loop that actually decides the offer.

Now live · 15-min session, free →

01 Depth

Hand-wave any box. Lose the dimension.

A real bar-raiser can probe any layer — CDN, lock primitive, retry semantics, blob path. SystemDesign.so holds you to the same bar across every subsystem in the stack.

CLIENT interviewer · systemdesign stage 01 / 16

if they wave →

↳probe

↳numbers

↳failure mode

CLIENT

User's browser issues a request

02 Voice · the medium

This is what your interview actually sounds like.

Spoken in your headphones during the session. Not typed in a chat. Three turns from a senior candidate's real session — the push-back, the math demand, the missing primitive.

T6 23:14

“You’re avoiding scale. Numbers, please. Reads per second. Then we’ll talk caching.”

T9 31:02

“No idempotency on /book. Retry storm → double-charge. How do you guarantee at-most-once?”

T11 38:47

“You skipped seat-locking. Two users, one seat, race condition. Walk me through the lock.”
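
The last two push-backs (no idempotency on /book, no seat lock) can be made concrete. Below is a minimal sketch in Python, assuming a hypothetical in-memory `BookingService` where the client sends an `idempotency_key` with every /book request. None of these names come from SystemDesign.so, and a real system would keep keys and seat ownership in a durable store with a uniqueness constraint rather than in process memory.

```python
import threading
from dataclasses import dataclass, field


@dataclass
class BookingService:
    """Hypothetical in-memory stand-in for a /book handler (illustration only)."""
    charges: list = field(default_factory=list)        # every charge actually made
    _responses: dict = field(default_factory=dict)     # idempotency_key -> stored response
    _seat_owner: dict = field(default_factory=dict)    # seat -> user holding it
    _lock: threading.Lock = field(
        default_factory=threading.Lock, repr=False, compare=False
    )

    def book(self, idempotency_key: str, user: str, seat: str, amount_cents: int) -> dict:
        # The lock makes "check, then write" atomic within this one process.
        with self._lock:
            # A retry carries the same key: replay the stored response, no new charge.
            if idempotency_key in self._responses:
                return self._responses[idempotency_key]
            if seat in self._seat_owner:
                response = {"status": "rejected", "reason": "seat taken"}
            else:
                self._seat_owner[seat] = user
                self.charges.append((user, seat, amount_cents))
                response = {"status": "booked", "user": user, "seat": seat}
            self._responses[idempotency_key] = response
            return response


svc = BookingService()
first = svc.book("key-123", "alice", "A1", 5000)
retry = svc.book("key-123", "alice", "A1", 5000)  # simulated client retry
```

The retry returns the stored response and the charge list stays at one entry; a second user asking for the same seat under a different key is rejected instead of double-booked.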

03 How it works

Your AI bar-raiser, on demand.

Four steps. 15, 30, 45, or 60 minutes. Every trade-off you missed, flagged.

01 / PICK

Pick a design

22 designs spanning feeds, messaging, storage, payments, maps, search, ride-share, ad-tech, and trading. Pick a length and a level.

02 / TALK

Talk under pressure

You speak your design out loud. A real drawing canvas tracks the diagram. The AI listens, pushes back on every claim, demands the math, and never lets you move on from a hand-wave.

03 / SCORE

Scored every turn

Six dimensions, 100 points. Phase budgets, capture discipline, unverified-number flags — all live, while you draw.

04 / REVIEW

Verdict & debrief

A hire/no-hire verdict, every red flag the judge caught, the must-haves you missed, and which of four behavioral buckets you actually demonstrated.

04 The difference

ChatGPT gives answers.
SystemDesign.so runs the interview.

One is a search box. The other is a bar-raiser with a clipboard. The same prompt, side-by-side.

× Generic LLM

“Design a URL shortener.”

you: Design a URL shortener.

gpt: Sure! Here are common components…

• Write API → generator → KV store

• Base62, 7 chars = 3.5T keys

• Cache hot URLs in Redis

• Anything else I can help with?

you: Sounds good, thanks!

No probe. No pressure. No signal.

You get a dump of textbook pieces. Not an interview.
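
The "Base62, 7 chars = 3.5T keys" figure in the generic answer is the kind of number an interviewer would ask you to derive. A small sketch that checks it, assuming a 62-symbol alphabet of digits, lowercase, then uppercase (the ordering is an illustrative choice, not something the page specifies):

```python
import string

# 10 digits + 26 lowercase + 26 uppercase = 62 symbols (ordering is an assumption)
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase


def keyspace(length: int) -> int:
    """Number of distinct keys of exactly `length` Base62 characters."""
    return len(ALPHABET) ** length


def encode(n: int) -> str:
    """Encode a non-negative integer as a Base62 string."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return ALPHABET[0]
    digits = []
    while n:
        n, rem = divmod(n, len(ALPHABET))
        digits.append(ALPHABET[rem])
    return "".join(reversed(digits))


print(keyspace(7))  # 3521614606208, roughly 3.5 trillion
```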

◆ SystemDesign.so

“Design a URL shortener for 200M DAU.”

ai: First, what's the R:W ratio?

you: 100:1, read-heavy.

ai: Good. Your DB choice?

you: Cassandra.

ai: Why Cassandra over Postgres for 100:1 reads?

you: Hmm…

ai: You picked an AP store for a read path. Walk me through eviction on a viral link.

Architecture −3 · Engineering −2 · Reliability +1

Caught the mistake a bar-raiser would catch. Scored it. Taught you.

05 What makes it work

Nine instruments.
One dashboard.

Inside the engine, module by module. Each one runs its own live readout — the signal, the canvas, the rubric, the verdict, the clock. Click a module to pop the hood. ↑↓ to scrub through them.

MOD-001 · INPUT LATENCY · 78ms · BARGE-IN · READY

Voice or text, on demand.

Real-time voice with smart barge-in — talk over the AI when you're ready to pivot, just like a human loop. Or stick to text when you'd rather think.

LIVE INPUT CHANNEL · WSS

VAD SPEAKING · WPM 148 · FILLERS · BARGE-IN ≤80ms

CAND I think the bookings table will get, a few thousand writes per second, so I'd —

MOD-002 · INPUT COHERENCE · 62% · BOXES · 7

If you draw the box, defend it.

An Excalidraw-powered whiteboard with a 28-box systems library. The AI sees what you draw and grades the coherence between your words and your boxes.

CANVAS · SESSION 1A37 · 7/28 BOXES

CLIENT

GATEWAY

BOOK SVC

QUEUE

POSTGRES

REDIS · PRIMARY?

CDN

WORD↔BOX COHERENCE 62% · REDIS LABELLED, POSTGRES SPOKEN

MOD-003 · ENGINE DEPTH · 3 · NEXT-Q · QUEUED

No two sessions ask the same questions.

Every answer routes through the knowledge graph and spawns the next probe — depth escalation, dependency gaps, prerequisite redirects. You can't coast on boilerplate.

PROBE TREE · T7 · 24:21 · 3-DEEP

→ CAND: "I'd use Redis as a write-through cache"

↳ What's the eviction policy? [DEPTH]

↳ Cache invalidation on write? [PREREQ]

↳ What about cross-region consistency? [DEP]

▸ NEXT QUEUED: If primary goes down, what fails first? [NEXT]

MOD-004 · SCORING RECALC · EVERY TURN

Not "good job." A breakdown.

Six dimensions. Sums to 100. Updated every turn, with a quality multiplier that punishes hand-waves. The breakdown your verdict will hang on.

RUBRIC · LIVE · ×1.0 MULTIPLIER

Requirements 12 / 15
Quantitative 7 / 15
API Design 11 / 15
Architecture 20 / 25
Engineering 7 / 15
Reliability 4 / 15

TOTAL · LIVE 61 / 100
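
As a rough illustration of how the readout above adds up, here is a sketch that sums the six dimensions. The scores and maxima are taken from the readout; how the quality multiplier is actually applied is not stated on this page, so scaling the raw total by it is purely an assumption.

```python
# dimension -> (earned, maximum), values copied from the live rubric readout
RUBRIC = {
    "Requirements": (12, 15),
    "Quantitative": (7, 15),
    "API Design": (11, 15),
    "Architecture": (20, 25),
    "Engineering": (7, 15),
    "Reliability": (4, 15),
}


def rubric_total(rubric: dict, multiplier: float = 1.0) -> int:
    """Sum earned points; ASSUMPTION: the multiplier scales the raw total."""
    maximum = sum(cap for _, cap in rubric.values())
    if maximum != 100:
        raise ValueError(f"rubric maxima sum to {maximum}, expected 100")
    raw = sum(earned for earned, _ in rubric.values())
    return round(raw * multiplier)


print(rubric_total(RUBRIC))  # 61 at the ×1.0 multiplier shown
```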

MOD-005 · LIVE WATCH CLASSES · 6 · FLAGS · 4

Caught the moment you say it.

Six classes of mistake, including unverified numbers, math past 2× tolerance, missing failure modes, wrong storage, and missing idempotency, flagged inside 200 ms.

MISTAKE FEED · S-12 · ≤200ms TO FLAG

23:11 · UNVERIFIED · "a few thousand": no anchor. Demanded RPS.
24:21 · WRONG STORE · Redis as primary writer. Redirected to durability.
25:58 · MISSING IDEMP · No idempotency on /book. Retry-storm risk.
27:14 · MATH 2×+ · "~500 MB/day" off by 4.2×. Math demanded.

UNVERIFIED 1 · WRONG STORE 1 · MATH 1 · MISSING IDEMP 1
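
The "math past 2× tolerance" class can be read as a ratio check between a candidate's estimate and a reference value. A minimal sketch under that reading follows; the function names are hypothetical, and the 2,100 MB/day reference figure is invented purely so that the ratio matches the 4.2× shown in the feed.

```python
def off_by_factor(claimed: float, expected: float) -> float:
    """How many times larger the bigger value is than the smaller one."""
    if claimed <= 0 or expected <= 0:
        raise ValueError("both values must be positive")
    return max(claimed, expected) / min(claimed, expected)


def exceeds_tolerance(claimed: float, expected: float, tolerance: float = 2.0) -> bool:
    """True when the estimate is off by more than `tolerance` in either direction."""
    return off_by_factor(claimed, expected) > tolerance


# Hypothetical reference value chosen to reproduce the 4.2× example.
claimed_mb_per_day = 500
reference_mb_per_day = 2100
flagged = exceeds_tolerance(claimed_mb_per_day, reference_mb_per_day)
```

Using a ratio rather than an absolute difference treats under-estimates and over-estimates symmetrically, which suits back-of-envelope numbers that span orders of magnitude.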

MOD-006 · CLOCK SESSION · 45-MIN · ELAPSED · 14:08

Same clock as the real loop.

Each phase has a time budget. Live Rushed / On-pace / Slow / Over-budget pills as you go. Learn to manage the clock — not just the answer.

PHASE LEDGER · 5 PHASES · 30:52 LEFT

Requirements 2:01 / 2:00 · OVER
Estimation 0:35 / 1:00 · ON-PACE
Entities & API 1:28 / 2:00 · SLOW
Architecture 9:18 / 15:00 · ON-PACE
Deep dives 0:46 / 10:00 · QUEUED

MOD-007 · TRACKING SESSIONS · S-04 vs S-12

Weak spots don't stay weak quietly.

Diff two sessions side-by-side. Watch your six dimensions evolve across attempts. Δ per dimension, per attempt — the same metric your hiring committee would diff.

DELTA · S-04 → S-12 · +9.4 NET

S-04 · 3 WEEKS AGO

Reliability 4 / 15
Quantitative 7 / 15
Architecture 14 / 25
Verdict: No Hire
Flags raised: 6

→

S-12 · TODAY

Reliability 11 / 15
Quantitative 12 / 15
Architecture 21 / 25
Verdict: Lean Hire
Flags raised: 2

MOD-008 · OUTPUT CALL · LEAN NO-HIRE

Strong Hire to Strong No-Hire.

An end-of-session frontier-model judge re-reads the whole transcript and produces a verdict — with must-haves covered, missed, and the red flags it'd raise in a real debrief.

VERDICT · S-12 · TICKET-BOOKING

CALL: Lean No-Hire

GAP TO LEAN HIRE +9 pts · RED FLAGS 3 · MUST-HAVES 3 / 5

× No idempotency key on /book: retry-storm risk.

× Wrong storage tier: Redis as primary writer.

× Skipped capacity math until interrupted.

MOD-009 · ENGINE NODES · 1,247 · ACTIVE · 12

The dependency tree behind every choice.

1,247 trigger nodes per design with depends-on / implies / required-before edges. When you say "Redis," the graph already knows what you owe it.

GRAPH · /book SUBTREE · 3 PREREQS UNMET

ACTIVE 12 · DORMANT 1,235 · EDGES 4,189
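
To illustrate what "required-before" edges and "prereqs unmet" could mean in practice, here is a small sketch that walks a prerequisite graph and lists what a candidate has not yet covered. The node names, edges, and the `unmet_prereqs` helper are all hypothetical; the product's actual graph schema is not described on this page.

```python
from collections import deque

# Hypothetical edges: node -> prerequisites that should be covered first
REQUIRED_BEFORE = {
    "/book": ["idempotency key", "seat lock", "redis cache"],
    "redis cache": ["eviction policy", "invalidation on write"],
}


def unmet_prereqs(node: str, covered: set) -> list:
    """Breadth-first walk over prerequisite edges, collecting uncovered nodes."""
    unmet = []
    seen = {node}
    queue = deque([node])
    while queue:
        current = queue.popleft()
        for prereq in REQUIRED_BEFORE.get(current, []):
            if prereq in seen:
                continue
            seen.add(prereq)
            if prereq not in covered:
                unmet.append(prereq)
            queue.append(prereq)  # keep walking: prerequisites can have their own
    return unmet


# The candidate mentioned Redis and its eviction policy, nothing else.
gaps = unmet_prereqs("/book", covered={"redis cache", "eviction policy"})
```

With these made-up edges, saying "Redis" pulls in the cache's own prerequisites, so the walk reports three gaps: the idempotency key, the seat lock, and invalidation on write.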

06 The report

What you walk away with.

Every session ends in a debrief that names a verdict, a level, and the exact rubric points you missed. Not vibes. Receipts.

Design a ticket booking system · Senior target

Builds a working system

×× No Hire · Below Senior bar · 31 pts to the next bucket

Surface-level on every deep-dive. Three push-backs — scale, idempotency, locking — three pivots back to architecture diagrams. No invalidation policy. No idempotency key. No seat-lock primitive. Pattern, not a slip.

Must-haves covered

· Core read-vs-write split named early
· Hot-key (contested seat) flagged as a risk

Red flags triggered

× Unverified scale claim ("10M users")
× Missing idempotency on /book
× Three latency / SLA prompts skipped
× No seat-lock primitive

Phase timeline · 47:12 total

Requirements 2:01 / 2:00 · over budget
Estimation 0:35 / 1:00
Entities + API 1:28 / 2:00
HLD 0:48 / 5:00 · rushed
Deep dive 0:55 / 5:00 · cut short

Capture discipline

Verbal vs captured

You verbally mentioned 13 items. You captured 3 in your notes panel.

Functional reqs 3 / 1
Non-functional 0 / 2
Back-of-envelope 0 / 4
Entities 0 / 4

Plus a searchable transcript, a delta against your last attempt, and a next question calibrated to your weakest dimension.

07 The math

One mock.
Or five reps a month.

The system-design loop is often what stands between you and a senior offer. Below is what those reps actually cost — line by line, not packaged as tiers.

Human mock · per-session

Pay each time the room opens.

Senior FAANG mentor · 60 min: $300
Reschedule fee · 1 of 4: $45
Same prompt, second take: unavailable
Written feedback: ~5 bullet points
Repeat-of-the-same-thing: awkward

5 reps over 4 weeks: $1,500+

Schedule-locked · 1 opinion · no two graded the same

SystemDesign.so Pro · monthly · Launch −50%

One bill. Five 45-min rooms.

Pro plan · 225 min budget: $49 (was $99)
Effective per session: $9.80 (was $19.80)
Same rubric, every run: 100 pts · 6 dims
Re-do the same prompt: until it clicks
Verdict + transcript: every time

5 reps over 4 weeks: $49 (was $99)

−$50 / mo · limited-period launch offer

Voice-first · 24/7 · designs library 22 / 22
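
The per-session figures above follow from simple division. A quick sketch that reproduces them from the listed prices, with the session count and lengths taken from the comparison and the helper name chosen for illustration:

```python
def per_session(total_usd: float, sessions: int) -> float:
    """Effective cost of one session, rounded to cents."""
    return round(total_usd / sessions, 2)


SESSIONS = 5
human_total = SESSIONS * 300             # five 60-min human mocks at $300 each
pro_minutes = SESSIONS * 45              # five 45-min rooms
pro_list = per_session(99, SESSIONS)     # list price
pro_launch = per_session(49, SESSIONS)   # launch price

print(human_total, pro_minutes, pro_list, pro_launch)
```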

08 Level call

Not "good job." A precise behavior.

SystemDesign.so doesn't grade on a curve, and it doesn't hand out titles. It tells you which of four behaviors you actually demonstrated — and the exact gap to the next bucket.

Names primitives

You can list the building blocks. Can't yet defend a trade-off under push-back.

0–25 pts

you · 24

Builds a working system

End-to-end design that runs. Stalls when pushed into unfamiliar territory.

26–50 pts

Owns the deep-dive

Idempotency, locking, consistency, failure modes — unprompted, with the math.

51–75 pts

Drives the room

Names the trade-off the interviewer hadn't asked yet. Sets the agenda.

76–100 pts
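
The four point ranges above amount to a lookup from score to behavior, plus a gap to the next range. A minimal sketch of that mapping, with the function names chosen for illustration:

```python
# (inclusive upper bound, behavior), taken from the four buckets listed above
BUCKETS = [
    (25, "Names primitives"),
    (50, "Builds a working system"),
    (75, "Owns the deep-dive"),
    (100, "Drives the room"),
]


def bucket_for(score: int) -> str:
    """Return the behavioral bucket for a 0-100 score."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    for upper, name in BUCKETS:
        if score <= upper:
            return name
    raise AssertionError("unreachable: the last bucket covers 100")


def gap_to_next(score: int) -> int:
    """Points needed to enter the next bucket; 0 when already in the top one."""
    for upper, _ in BUCKETS[:-1]:
        if score <= upper:
            return upper + 1 - score
    return 0


print(bucket_for(24), gap_to_next(24))  # Names primitives 2
```

A score of 24, as in the "you · 24" marker, lands in the first bucket and sits two points short of the second.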

10 FAQ

The questions everyone actually asks.

How is this different from asking a general-purpose AI chatbot for system design help?

A chatbot answers your question. SystemDesign.so asks the questions — the way a FAANG bar-raiser would. It pushes back when you wave your hands, demands numbers when you say "a lot," and writes the verdict at the end. You can't get that from a search box.

Is this just a chatbot with a clever prompt?

No. The judge runs a 12-stage pipeline on top of a frontier model: turn classification, rubric scoring, contradiction-aware memory (so it remembers you said 100:1 reads twelve turns ago), and trigger detection over ~1,200 hand-built failure patterns. A wrapper can chat. It cannot raise the right push-back at the right moment and weigh it against a hand-built rubric.

How is the rubric built and tuned?

Six dimensions, 100 points, four behavioral buckets — built with engineers who have run system-design loops at top-tier companies. Each design has a hand-authored grading sheet (must-haves, common traps, scale tells) plus auto-detected red flags. The rubric ships as v3.2; every session it scores feeds the next calibration.

Voice or text? Do I need to talk out loud?

Voice is the default — the whole product is built around it. You'll get the most out of it talking out loud, exactly as you would in the real room. Text mode exists as a fallback for quiet places or accessibility. The drawing canvas works the same in either.

What happens if I just... bomb the first one?

That's the point. The verdict tells you which of the four behavioral buckets you actually showed, the must-haves you missed, and the next question calibrated to your weakest dimension. Bombing the mirror is the cheapest way to find out what your loop will reveal.

Can I bring my own design prompts (e.g. from a take-home or my company)?

On the MAX plan, yes. Drop a one-line prompt and SystemDesign.so builds the rubric for it on the fly. On Free and Pro you pick from the 22 designs in the library.

Is my transcript private?

Yes. Sessions are tied to your account, never shared, never used to train external models. You can delete any session permanently.

The verdict machine · now live

The interview is free.
The mirror is not.

Talk through one system design. Get an honest hire/no-hire-style verdict on your design — while you still have time to fix it.

Fifteen minutes. No card. You speak, it judges, then you walk away with audio coaching pulled from your own voice and the exact gap to the next bucket.

Start free interview → · Re-read the verbatim

15-min taste · 6 dimensions · 4 behavioral buckets · 1 verdict