Numbers to Memorize

The handful of reference figures that make capacity estimation fast in any interview.

Capacity estimation becomes much less mysterious when you memorize a small set of reference numbers. You do not need perfect precision in a system design interview or early architecture review. You need order-of-magnitude anchors that help you convert users into QPS, payloads into storage, and latency budgets into realistic component choices.

🔭Think of it like…

These numbers are like knowing that a mile is about 1.6 kilometers and a gallon is about 4 liters. You can still use a calculator later, but the memorized landmarks keep you from confidently designing a bridge that is off by a factor of 100.

Latency numbers every estimate needs

Latency numbers tell you what can fit inside a request path. A few CPU cache reads are invisible. A disk seek is enormous by comparison. A cross-region round trip can consume an entire user-facing budget before your application code runs.

~0.5 ns       L1 cache reference
~4 ns         L2 cache reference
~100 ns       Main memory reference
~10 µs        Fast NVMe random read, best case
~100 µs       SSD random read, typical order
~5-10 ms      Spinning disk seek
~0.2-1 ms     Same-AZ network round trip
~1-3 ms       Same-region service call
~30-80 ms     Cross-country round trip
~150-250 ms   Intercontinental round trip

The mental model

Memory is nanoseconds, SSDs and local networks are microseconds to low milliseconds, spinning disks are milliseconds, and wide-area networks are tens to hundreds of milliseconds. Count the slow things in your critical path.
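Counting the slow things can be sketched as a simple sum over sequential hops. The following is a minimal illustration, not a profiler: the `LATENCY_MS` table uses rough midpoints of the ranges above, and the example request (its hop names and counts) is hypothetical.

```python
# Rough per-hop latencies in milliseconds, taken from the anchor ranges above.
# These are order-of-magnitude assumptions, not measurements.
LATENCY_MS = {
    "memory_read": 0.0001,                 # ~100 ns
    "ssd_random_read": 0.1,                # ~100 µs
    "same_az_round_trip": 0.5,             # ~0.2-1 ms
    "same_region_call": 2.0,               # ~1-3 ms
    "cross_country_round_trip": 50.0,      # ~30-80 ms
    "intercontinental_round_trip": 200.0,  # ~150-250 ms
}


def critical_path_ms(hops):
    """Sum latency for sequential hops given as (name, count) pairs."""
    return sum(LATENCY_MS[name] * count for name, count in hops)


# Hypothetical request: 3 same-region calls, 5 SSD reads, and 1 cross-country
# round trip, all executed one after another.
request = [
    ("same_region_call", 3),
    ("ssd_random_read", 5),
    ("cross_country_round_trip", 1),
]
total = critical_path_ms(request)
print(f"critical path ≈ {total:.1f} ms")
```

In this sketch the single cross-country hop accounts for most of the total, which is the pattern the mental model is meant to surface.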

Powers of two and storage units

Storage estimates are easier if you can move between bytes, KB, MB, GB, TB, and PB without pausing. Engineers often use powers of two while product estimates use round powers of ten. For back-of-the-envelope work, the rounded decimal values are usually close enough.

1 KB         ~1 thousand bytes
1 MB         ~1 million bytes
1 GB         ~1 billion bytes
1 TB         ~1 trillion bytes
1 PB         ~1 quadrillion bytes
1 KB × 1M    ~1 GB
1 KB × 1B    ~1 TB
1 MB × 1M    ~1 TB

fast unit conversions

1 KiB = 1,024 bytes          ≈ 1 KB
1 MiB = 1,024 KiB            ≈ 1 MB
1 GiB = 1,024 MiB            ≈ 1 GB
1 TiB = 1,024 GiB            ≈ 1 TB

Back-of-envelope shortcuts:
  1 KB * 1 million items  ≈ 1 GB
  1 KB * 1 billion items  ≈ 1 TB
  1 MB * 1 million items  ≈ 1 TB
  1 MB * 1 billion items  ≈ 1 PB

The exact binary values matter for billing and filesystem limits. The rounded values matter for fast reasoning: if every message is 1 KB and you store 1 billion messages, you are in terabyte territory before replication, indexes, or backups.
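The shortcuts are easy to check in code. This is a small sketch using rounded decimal units; the helper names `human_size` and `logical_storage_bytes` are made up for illustration.

```python
UNITS = ["B", "KB", "MB", "GB", "TB", "PB"]


def human_size(num_bytes):
    """Format a byte count using rounded decimal units (1 KB = 1,000 bytes)."""
    size = float(num_bytes)
    for unit in UNITS:
        # Stop at the first unit where the value drops below 1,000,
        # or at the largest unit we know about.
        if size < 1000 or unit == UNITS[-1]:
            return f"{size:.1f} {unit}"
        size /= 1000


def logical_storage_bytes(item_bytes, item_count):
    """Logical size only: no replication, indexes, or backups."""
    return item_bytes * item_count


print(human_size(logical_storage_bytes(1_000, 1_000_000)))          # 1 KB x 1M
print(human_size(logical_storage_bytes(1_000, 1_000_000_000)))      # 1 KB x 1B
print(human_size(logical_storage_bytes(1_000_000, 1_000_000_000)))  # 1 MB x 1B

# How far the binary unit drifts from the decimal one at terabyte scale.
tib_over_tb = 1024**4 / 1000**4
print(f"1 TiB is about {tib_over_tb:.2f} TB")
```

The last line shows why the distinction matters for billing: at terabyte scale the binary unit is roughly 10% larger than the decimal one.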

Traffic anchors: QPS from daily volume

Most capacity chains start with daily active users and actions per day. Convert daily actions into average QPS by dividing by 86,400 seconds per day. Then multiply by a peak factor because real traffic is not flat.

86,400       Seconds in one day
~12 QPS      1M requests/day
~120 QPS     10M requests/day
~1.2K QPS    100M requests/day
~11.6K QPS   1B requests/day
3×           Steady enterprise peak factor
5×           Consumer/social peak factor
10×+         Launches, sports, drops, flash sales

Average QPS: total daily actions divided by 86,400.
Peak QPS: average QPS multiplied by a peak factor. Design for peak unless the workload can queue safely.
Read/write split: many products have far more reads than writes. Estimate them separately when cache, database, and fanout requirements differ.
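The three steps above can be sketched as two small functions. The user counts, per-user action rates, and the 5× peak factor in the example are assumptions chosen only to show the arithmetic.

```python
SECONDS_PER_DAY = 86_400


def average_qps(daily_actions):
    """Average queries per second from a daily action count."""
    return daily_actions / SECONDS_PER_DAY


def peak_qps(daily_actions, peak_factor):
    """Average QPS scaled by a peak factor (for example 3, 5, or 10)."""
    return average_qps(daily_actions) * peak_factor


# Hypothetical consumer product: 20M daily active users,
# 10 reads and 1 write per user per day, 5x peak factor.
dau = 20_000_000
read_peak = peak_qps(dau * 10, peak_factor=5)
write_peak = peak_qps(dau * 1, peak_factor=5)

print(f"1M/day  ≈ {average_qps(1_000_000):.1f} QPS average")
print(f"reads   ≈ {read_peak:,.0f} QPS at peak")
print(f"writes  ≈ {write_peak:,.0f} QPS at peak")
```

Estimating reads and writes separately, as here, keeps the tenfold difference visible when sizing caches and the database primary.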

Round 86,400 to 100,000 for first-pass math

Dividing by 100,000 makes mental math easy and lands about 14% below the exact answer, so treat the result as a slight underestimate. Refine later after the architecture shape is clear.
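A quick check of how far the rounded divisor drifts from the exact one, using 1B requests/day as an example input:

```python
SECONDS_PER_DAY = 86_400
ROUNDED_SECONDS_PER_DAY = 100_000  # mental-math shortcut


def qps_exact(daily_requests):
    return daily_requests / SECONDS_PER_DAY


def qps_rough(daily_requests):
    return daily_requests / ROUNDED_SECONDS_PER_DAY


daily = 1_000_000_000
# Fraction by which the shortcut undershoots the exact value.
shortfall = 1 - qps_rough(daily) / qps_exact(daily)

print(f"exact ≈ {qps_exact(daily):,.0f} QPS, rough = {qps_rough(daily):,.0f} QPS")
print(f"rough estimate is {shortfall:.1%} below exact")
```

Because the shortcut always undershoots, any peak factor applied afterward absorbs the difference in a first-pass estimate.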

Rough single-node throughput anchors

Per-node throughput depends on hardware, payload size, indexes, replication, consistency, and code quality. These numbers are not promises; they are sanity checks. If your estimate requires one database node to handle 2 million complex writes per second, something is wrong.

1K-10K QPS        Typical app server per node for nontrivial APIs
10K-100K QPS      Simple cached reads per service node
1K-10K writes/s   Single relational DB primary, workload dependent
10K-100K ops/s    Redis node for small simple commands
10-100 MB/s       Kafka partition order-of-magnitude throughput
100-500 MB/s      SATA SSD sequential throughput per device (NVMe drives reach several GB/s)
1-10 Gbps         Common server NIC capacity range
50-200 MB/s       Sustained object upload/download per busy client or worker

The point is to divide peak load by plausible capacity. If you need 600K peak API QPS and one app node safely handles 5K QPS with headroom, you are in the neighborhood of 120 app nodes before redundancy and regional distribution.
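The division above can be written as a one-line helper. The `spare_fraction` parameter is an assumption added here to show how redundancy might be layered on; it is not a standard formula.

```python
import math


def nodes_needed(peak_qps, per_node_qps, spare_fraction=0.0):
    """Nodes required to serve peak_qps.

    per_node_qps should already include headroom (the load a node handles
    safely, not its benchmark maximum). spare_fraction adds extra nodes for
    redundancy, e.g. 0.25 for 25% spare capacity.
    """
    base = peak_qps / per_node_qps
    return math.ceil(base * (1 + spare_fraction))


print(nodes_needed(600_000, 5_000))                       # baseline fleet
print(nodes_needed(600_000, 5_000, spare_fraction=0.25))  # with 25% spare
```

Rounding up with `math.ceil` matters at small scale: a result of 2.3 nodes means three machines, not two.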

Typical object sizes and hidden multipliers

Object size estimates drive storage, bandwidth, cache, and database design. Include metadata, indexes, replication, compression, retention, and fanout copies. The user-visible payload is rarely the whole cost.

100 B       Tiny event metadata or counter update
1 KB        Chat message, notification, small JSON row
10 KB       Rich post, comment thread item, log line with context
100 KB      Thumbnail or small document
1-5 MB      Phone photo after compression
10-100 MB   Short video clip
2-4×        Common overhead after indexes, replicas, backups
3×          Typical replication factor for durable storage

Hidden multipliers to remember

Replication: three copies turn 100 TB of logical data into roughly 300 TB of raw storage.
Indexes: secondary indexes can be as large as, or larger than, the base data on heavily indexed tables.
Retention: keeping data for 365 days instead of 7 multiplies storage by about 52×.
Fanout: a single post may be stored once in the origin table but referenced or copied into millions of timelines.
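The replication, index, and retention multipliers above compose by simple multiplication. The sketch below assumes indexes are replicated along with the base data and leaves backups and fanout out entirely; the function names and default values are illustrative.

```python
def raw_storage_tb(logical_tb, replication=3, index_overhead=0.0):
    """Raw storage after indexes and replicas.

    index_overhead is index size as a fraction of base data
    (1.0 means the indexes are as large as the data itself).
    """
    return logical_tb * (1 + index_overhead) * replication


def retained_tb(daily_tb, retention_days):
    """Logical data kept when each day's ingest is stored for retention_days."""
    return daily_tb * retention_days


print(raw_storage_tb(100))                      # three replicas, no indexes
print(raw_storage_tb(100, index_overhead=1.0))  # indexes as large as the data

# Ratio between one-year and one-week retention for the same daily ingest.
retention_ratio = retained_tb(1, 365) / retained_tb(1, 7)
print(f"365 days vs 7 days ≈ {retention_ratio:.0f}x")
```

With indexes as large as the data, the same 100 TB of logical data doubles again on top of replication, which is how the 2-4× overhead range arises.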

How to use these numbers in a design

Start with the memorized anchors, then run the capacity chain: users → actions → QPS → storage → bandwidth → node count. The goal is not to be perfectly correct; it is to expose which part of the system is large enough to shape the architecture.

At 100 QPS, architecture is dominated by correctness and simplicity.
At 100K QPS, caches, load balancing, partitioning, and observability become central.
At petabytes, retention, compaction, lifecycle policies, and storage tiering become product features, not cleanup details.
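The capacity chain (users → actions → QPS → storage → bandwidth → node count) can be run end to end in a few lines. This is a rough sketch under stated assumptions: the `Workload` fields and every number in the example are hypothetical inputs, and the model ignores indexes, compression, and read/write differences.

```python
import math
from dataclasses import dataclass

SECONDS_PER_DAY = 86_400


@dataclass
class Workload:
    daily_active_users: int
    actions_per_user_per_day: int
    bytes_per_action: int
    peak_factor: float
    retention_days: int
    replication: int
    per_node_qps: float  # safe per-node capacity, headroom included


def capacity_chain(w):
    """Walk users -> actions -> QPS -> storage -> bandwidth -> nodes."""
    daily_actions = w.daily_active_users * w.actions_per_user_per_day
    avg_qps = daily_actions / SECONDS_PER_DAY
    peak = avg_qps * w.peak_factor
    logical_bytes = daily_actions * w.bytes_per_action * w.retention_days
    return {
        "avg_qps": avg_qps,
        "peak_qps": peak,
        "logical_storage_bytes": logical_bytes,
        "raw_storage_bytes": logical_bytes * w.replication,
        "peak_bandwidth_bytes_per_s": peak * w.bytes_per_action,
        "nodes": math.ceil(peak / w.per_node_qps),
    }


# Hypothetical product: 10M DAU, 20 actions each, 1 KB per action,
# 5x peak, one year of retention, 3 replicas, 5K QPS per app node.
example = Workload(10_000_000, 20, 1_000, 5, 365, 3, 5_000)
result = capacity_chain(example)
for name, value in result.items():
    print(f"{name}: {value:,.0f}")
```

In this example the request rate needs only a handful of app nodes, while a year of retained, replicated data reaches hundreds of terabytes, so storage rather than compute is the part large enough to shape the architecture.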

Numbers are ranges, not laws

Real benchmarks beat memorized numbers. Use these anchors to pick a plausible design, then validate hot paths with load tests, production telemetry, and vendor limits.

Key takeaways

Latency anchors: memory is nanoseconds, SSDs and local networks are microseconds to milliseconds, spinning disks are milliseconds, and cross-region calls are tens to hundreds of milliseconds.
Traffic anchors: divide daily actions by 86,400; 1M/day is about 12 QPS and 1B/day is about 11.6K QPS.
Storage anchors: 1 KB times 1M items is about 1 GB; 1 KB times 1B items is about 1 TB before replicas and indexes.
Peak factors matter: use roughly 3× for steady enterprise, 5× for consumer/social, and higher for launches or flash events.
Single-node throughput numbers are sanity checks; divide peak load by conservative per-node capacity and leave headroom.

A cross-region round trip can be tens to hundreds of milliseconds by itself. If it sits on the critical path, it can consume the whole budget before database queries, application work, or retries happen.

Storing 1 billion messages of 1 KB each takes about 1 TB of logical data. The shortcut is 1 KB × 1B ≈ 1 TB. With three-way replication, indexes, and backups, the raw footprint can be several TB.

Traffic is not flat. Users arrive during daily peaks, launches, and events. A system sized only for average load may fail exactly when the most users are present.
