Back-of-the-Envelope Estimation

Turn 'a billion users' into QPS, storage, and server counts in under five minutes.

Back-of-the-envelope estimation turns vague scale into useful engineering numbers: requests per second, storage growth, bandwidth, and server count. The goal is not perfect precision. The goal is to land in the right order of magnitude so the architecture fits the problem before you draw boxes.

🔭Think of it like…

Estimation is ordering pizza for a large party. You do not know exactly how hungry every guest will be, but you can say 80 guests, about 3 slices each, 8 slices per pizza, plus a safety margin. That is enough to avoid showing up with 5 pizzas or 200 pizzas.

Why estimate before designing

The numbers tell you which designs are plausible. A service doing 50 requests per second can run on a small fleet. A service doing 5 million writes per second needs partitioning, queues, careful storage choices, and failure planning. If you skip the math, you may design a bicycle for highway traffic.

estimates drive architecture

small scale:
  10 QPS, 5 GB storage
  -> one app server and one database may be fine

large scale:
  500,000 QPS, 3 PB/year
  -> load balancing, caching, partitioning, queues, and distributed storage

Say assumptions out loud

Estimation is a chain of assumptions. State them clearly, round numbers aggressively, and keep units attached. A reasonable method is more important than pretending the inputs are exact.

Powers of two and storage units

Computers measure memory in binary units (1 KiB is 1,024 bytes), but system design interviews usually accept rounded decimal math. Memorize the ladder so you can move from item counts to bytes quickly.

Unit	Rough size	Useful mental anchor
1 KB	1 thousand bytes	A small text message or JSON object
1 MB	1 thousand KB	A compressed image or small bundle
1 GB	1 thousand MB	1 KB multiplied by 1 million items
1 TB	1 thousand GB	1 KB multiplied by 1 billion items
1 PB	1 thousand TB	1 KB multiplied by 1 trillion items

binary powers worth knowing

2^10  = 1,024        ~ 1 thousand
2^20  = 1,048,576    ~ 1 million
2^30  = 1,073,741,824 ~ 1 billion

1 KiB -> 1 MiB -> 1 GiB -> 1 TiB -> 1 PiB
for quick estimates, KB -> MB -> GB -> TB -> PB by 1000x
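The two ladders can be compared with a small helper. This is a minimal sketch; `humanize` and the unit lists are hypothetical names introduced for illustration, not part of any library.

```python
# Decimal (1000x) and binary (1024x) unit ladders, smallest to largest.
DECIMAL_UNITS = ["B", "KB", "MB", "GB", "TB", "PB"]
BINARY_UNITS = ["B", "KiB", "MiB", "GiB", "TiB", "PiB"]


def humanize(num_bytes: float, binary: bool = False) -> str:
    """Format a byte count by walking up the 1000x or 1024x ladder."""
    step = 1024 if binary else 1000
    units = BINARY_UNITS if binary else DECIMAL_UNITS
    value = float(num_bytes)
    for unit in units[:-1]:
        if value < step:
            return f"{value:.1f} {unit}"
        value /= step
    # Anything larger than the last step stays in the largest unit.
    return f"{value:.1f} {units[-1]}"


one_kb_times_one_billion = 1_000 * 1_000_000_000
print(humanize(one_kb_times_one_billion))               # 1.0 TB
print(humanize(one_kb_times_one_billion, binary=True))  # 931.3 GiB
```

The gap between the two ladders is under 10% at the terabyte tier, which is why decimal rounding is usually acceptable for order-of-magnitude work.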

Latency numbers to keep in your head

Exact numbers vary by hardware, cloud provider, language, and workload, but the ordering is stable. Memory is much faster than disk; local calls are much faster than cross-region calls; network distance matters.

Rough value	What it measures
~100 ns	memory access
~1-10 us	local SSD read
~0.5-2 ms	same-zone service call
~1-5 ms	cache or DB round trip in-region
~50-150 ms	cross-continent round trip
86,400	seconds per day

These numbers shape design choices. A browser request that calls five services serially pays network latency five times. A cache hit can save a database round trip. A cross-region synchronous write can dominate the whole request budget.
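A rough latency budget shows the difference between serial and concurrent calls. The sketch below assumes midpoint values picked from the ranges above; the constants and function names are illustrative, not measurements.

```python
# Assumed midpoints of the rough ranges above, in milliseconds.
SAME_ZONE_CALL_MS = 1.0
IN_REGION_DB_MS = 3.0
CROSS_CONTINENT_MS = 100.0


def serial_latency_ms(hops_ms: list[float]) -> float:
    """Calls made one after another pay the sum of their latencies."""
    return sum(hops_ms)


def parallel_latency_ms(hops_ms: list[float]) -> float:
    """Calls issued concurrently are bounded by the slowest one."""
    return max(hops_ms, default=0.0)


five_calls = [SAME_ZONE_CALL_MS] * 5
print(serial_latency_ms(five_calls))    # 5.0
print(parallel_latency_ms(five_calls))  # 1.0

# One in-region DB round trip followed by a synchronous cross-continent write.
print(serial_latency_ms([IN_REGION_DB_MS, CROSS_CONTINENT_MS]))  # 103.0
```

In the last line the cross-continent hop accounts for almost the entire budget, which is the usual argument for making such writes asynchronous where correctness allows.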

The core formulas

Most capacity estimates are built from a few reusable formulas. Keep the units visible and convert one step at a time.

request rate

requests/day = DAU * actions per user per day
average QPS  = requests/day / 86,400
peak QPS     = average QPS * peak factor

common peak factor: 3x to 5x for consumer systems
higher for spiky events such as sports, sales, or breaking news
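The request-rate formulas translate directly into code. This is a minimal sketch; the function names and the example inputs (1 million DAU, 10 actions per day, 3x peak) are assumptions chosen for illustration.

```python
SECONDS_PER_DAY = 86_400


def average_qps(dau: int, actions_per_user_per_day: float) -> float:
    """requests/day divided by seconds/day."""
    requests_per_day = dau * actions_per_user_per_day
    return requests_per_day / SECONDS_PER_DAY


def peak_qps(avg_qps: float, peak_factor: float) -> float:
    """Scale the average by an assumed peak factor."""
    return avg_qps * peak_factor


# Hypothetical inputs: 1M DAU, 10 actions each, 3x peak factor.
avg = average_qps(1_000_000, 10)
print(round(avg))                 # 116
print(round(peak_qps(avg, 3.0)))  # 347
```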

storage and bandwidth

storage = items per day * bytes per item * replication factor * retention in days

write bandwidth = writes/second * bytes per write
read bandwidth  = reads/second * bytes per read

server count = peak QPS / safe per-node throughput
then add headroom for failures, deploys, and uneven traffic
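The storage, bandwidth, and server-count formulas can be sketched the same way. The code below assumes that items are counted per day and retention is given in days, and it models headroom as a simple fractional multiplier; all names and the 30% default are illustrative assumptions.

```python
import math


def storage_bytes(items_per_day: int, bytes_per_item: int,
                  replication_factor: int, retention_days: int) -> int:
    """Physical bytes held after retention_days of writes."""
    return items_per_day * bytes_per_item * replication_factor * retention_days


def bandwidth_bytes_per_sec(ops_per_sec: float, bytes_per_op: float) -> float:
    """Works for reads or writes: operations/second times bytes/operation."""
    return ops_per_sec * bytes_per_op


def server_count(peak_qps: float, safe_qps_per_node: float,
                 headroom: float = 0.3) -> int:
    """Nodes needed at peak, padded by an assumed headroom fraction."""
    return math.ceil(peak_qps / safe_qps_per_node * (1 + headroom))


# Hypothetical: 1M items/day of 1 KB each, 3x replication, kept for one year.
print(storage_bytes(1_000_000, 1_000, 3, 365))     # 1095000000000 (about 1.1 TB)
print(server_count(40_000, 1_000, headroom=0.25))  # 50
```

Rounding up with `math.ceil` matters at small fleet sizes, where a fractional node is a meaningful share of capacity.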

Estimate	Formula	Why it matters
QPS	DAU x actions/day / 86,400	Sizes app servers, caches, queues, and DB reads
Peak QPS	average QPS x peak factor	Systems must survive peaks, not averages
Storage	items/day x size x replication x retention days	Sizes databases, object stores, and backups
Bandwidth	QPS x response size	Sizes network links, CDN, and egress cost
Server count	peak QPS / per-node throughput	Turns demand into fleet size

Related patterns

For more practice, see Capacity Chain and Capacity Numbers.

Fully worked example: photo sharing feed

Suppose you are designing a photo-sharing feed. Use round numbers and state assumptions before calculating.

assumptions

DAU = 20 million users
feed opens = 12 per user per day
photos uploaded = 2 per user per day
average feed response = 60 KB
average stored photo after compression = 500 KB
metadata per photo = 2 KB
replication factor = 3
retention = 5 years
peak factor = 4x
one app server safely handles 800 QPS

Read QPS

Feed reads/day: 20M users x 12 opens = 240M feed reads per day.
Average read QPS: 240M / 86,400 is about 2,800 QPS.
Peak read QPS: 2,800 x 4 is about 11,200 QPS.
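The read-side arithmetic can be checked in a few lines, using the stated assumptions. The variable names are illustrative.

```python
SECONDS_PER_DAY = 86_400

dau = 20_000_000
feed_opens_per_user = 12
peak_factor = 4

feed_reads_per_day = dau * feed_opens_per_user         # 240,000,000
avg_read_qps = feed_reads_per_day / SECONDS_PER_DAY    # about 2,778
peak_read_qps = avg_read_qps * peak_factor             # about 11,111

print(feed_reads_per_day)
print(round(avg_read_qps))
print(round(peak_read_qps))
```

The unrounded peak is about 11,100 rather than 11,200 because the text rounds the average up to 2,800 before multiplying. Both land in the same order of magnitude, which is all the estimate needs.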

Write QPS and storage

Uploads/day: 20M users x 2 photos = 40M photos per day.
Average upload QPS: 40M / 86,400 is about 460 uploads per second; peak is about 1,850 uploads per second.
Raw photo storage/day: 40M x 500 KB = 20 TB per day.
Replicated photo storage/day: 20 TB x 3 = 60 TB per day.
Five-year replicated photo storage: 60 TB/day x 365 x 5 is about 110 PB.
Metadata/day: 40M x 2 KB = 80 GB raw, or 240 GB with 3x replication. Over five years that is about 440 TB replicated, so metadata is much smaller than media but still large enough to require partitioning over time.
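Keeping unit constants in the code makes the storage chain easy to verify. This sketch uses decimal units and the assumptions stated earlier; the names are illustrative.

```python
# Decimal unit constants, in bytes.
KB, GB, TB, PB = 10**3, 10**9, 10**12, 10**15

photos_per_day = 20_000_000 * 2   # DAU x uploads per user
photo_bytes = 500 * KB
metadata_bytes = 2 * KB
replication = 3
retention_days = 365 * 5

raw_photo_per_day = photos_per_day * photo_bytes
replicated_photo_per_day = raw_photo_per_day * replication
five_year_photo = replicated_photo_per_day * retention_days
raw_metadata_per_day = photos_per_day * metadata_bytes

print(raw_photo_per_day / TB)         # 20.0 TB per day
print(replicated_photo_per_day / TB)  # 60.0 TB per day
print(five_year_photo / PB)           # 109.5 PB, about 110 PB
print(raw_metadata_per_day / GB)      # 80.0 GB per day
```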

Bandwidth and servers

Peak feed bandwidth: 11,200 QPS x 60 KB is about 672 MB/s before compression and CDN effects.
Peak upload bandwidth: 1,850 uploads/s x 500 KB is about 925 MB/s entering object storage.
App server count: 11,200 peak QPS / 800 safe QPS per node = 14 nodes. Add headroom for deploys and failures, so start with roughly 20 to 25 app servers.
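The bandwidth and fleet numbers follow the same pattern. The sketch below starts from the rounded peak figures in the text; the 1.5x headroom multiplier is an assumption picked to land inside the suggested 20 to 25 range, not a rule.

```python
import math

KB, MB = 10**3, 10**6

peak_read_qps = 11_200
peak_uploads_per_sec = 1_850
safe_qps_per_node = 800
headroom_multiplier = 1.5  # assumed padding for deploys and failures

peak_feed_mb_per_sec = peak_read_qps * 60 * KB / MB
peak_upload_mb_per_sec = peak_uploads_per_sec * 500 * KB / MB
base_nodes = math.ceil(peak_read_qps / safe_qps_per_node)
padded_nodes = math.ceil(base_nodes * headroom_multiplier)

print(peak_feed_mb_per_sec)    # 672.0
print(peak_upload_mb_per_sec)  # 925.0
print(base_nodes)              # 14
print(padded_nodes)            # 21
```

Note that the node count here covers feed reads only. If uploads pass through the same app servers, their peak rate would be added to the numerator.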

What the example tells you

The math points to a CDN for feed media, object storage for photos, stateless app servers behind a load balancer, a partitioned metadata store, and background processing for thumbnails. The estimate produced design constraints, not just trivia.

Gotchas and practical habits

Average hides peaks: traffic follows time zones, notifications, launches, and special events. Always multiply by a peak factor.
Replication and backups count: a 1 TB logical dataset may consume 3 TB replicated plus backup, index, and log overhead.
Per-node throughput is a safe number: use measured sustainable throughput, not a perfect benchmark from an empty lab.
Reads and writes differ: reads may be cacheable; writes often require durability, ordering, validation, and replication.
Units prevent mistakes: write KB, MB, seconds, days, and years in every line so you do not multiply incompatible quantities.

Key takeaways

Back-of-the-envelope estimation converts vague scale into QPS, storage, bandwidth, and server counts.
Memorize unit ladders and latency anchors: KB to PB, 86,400 seconds per day, and the rough cost of memory, disk, network, and cross-region calls.
QPS comes from DAU x actions per day divided by 86,400, then multiplied by a peak factor.
Storage comes from items written per day x item size x replication x retention period, with extra room for indexes, backups, logs, and growth.
Server count is peak QPS divided by safe per-node throughput, plus headroom for failures, deploys, and uneven load.

Example: with 10M daily active users and 20 actions per user per day, requests per day are 10M x 20 = 200M. Average QPS is 200M / 86,400, which is roughly 2,300 QPS. With a 5x peak factor, plan for about 11,500 QPS before caching or batching.

The system stores more physical bytes than the logical dataset. A 10 TB logical dataset with 3x replication needs about 30 TB before indexes, backups, logs, compaction overhead, and growth margin.

Divide peak QPS by the safe sustained throughput of one node. If peak is 40,000 QPS and one node safely handles 1,000 QPS, the math says 40 nodes. Add headroom so the fleet survives deploys, failures, and uneven load.
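One way to see why headroom matters is to compute the load on surviving nodes after a failure. This is a sketch under the numbers above; the function names and the choice of five spare nodes are hypothetical.

```python
import math


def load_per_surviving_node(peak_qps: float, fleet_size: int,
                            failed_nodes: int) -> float:
    """QPS each remaining node must absorb when some nodes are out of rotation."""
    surviving = fleet_size - failed_nodes
    if surviving <= 0:
        raise ValueError("no nodes left to serve traffic")
    return peak_qps / surviving


def fleet_with_spares(peak_qps: float, safe_qps_per_node: float,
                      spare_nodes: int) -> int:
    """Minimum nodes for peak load plus a fixed number of spares."""
    return math.ceil(peak_qps / safe_qps_per_node) + spare_nodes


SAFE_QPS = 1_000

# Exactly 40 nodes: losing two pushes every survivor past the safe limit.
print(load_per_surviving_node(40_000, 40, 2) > SAFE_QPS)      # True

# 40 nodes plus 5 spares: two failures still leave room.
fleet = fleet_with_spares(40_000, SAFE_QPS, 5)
print(fleet)                                                  # 45
print(load_per_surviving_node(40_000, fleet, 2) <= SAFE_QPS)  # True
```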
