
Back-of-the-Envelope Estimation

Turn 'a billion users' into QPS, storage, and server counts in under five minutes.

Back-of-the-envelope estimation turns vague scale into useful engineering numbers: requests per second, storage growth, bandwidth, and server count. The goal is not perfect precision. The goal is to land in the right order of magnitude so the architecture fits the problem before you draw boxes.

🔭Think of it like…
Estimation is ordering pizza for a large party. You do not know exactly how hungry every guest will be, but you can say 80 guests, about 3 slices each, 8 slices per pizza, plus a safety margin. That is enough to avoid showing up with 5 pizzas or 200 pizzas.

Why estimate before designing

The numbers tell you which designs are plausible. A service doing 50 requests per second can run on a small fleet. A service doing 5 million writes per second needs partitioning, queues, careful storage choices, and failure planning. If you skip the math, you may design a bicycle for highway traffic.

estimates drive architecture
small scale:
  10 QPS, 5 GB storage
  -> one app server and one database may be fine

large scale:
  500,000 QPS, 3 PB/year
  -> load balancing, caching, partitioning, queues, and distributed storage

Say assumptions out loud
Estimation is a chain of assumptions. State them clearly, round numbers aggressively, and keep units attached. A reasonable method is more important than pretending the inputs are exact.

Powers of two and storage units

Hardware and operating systems often count in binary units (1 KiB = 1,024 bytes), but system design interviews usually accept rounded decimal math (1 KB = 1,000 bytes). Memorize the ladder so you can move from item counts to bytes quickly.

Unit    Rough size         Useful mental anchor
1 KB    1 thousand bytes   A small text message or JSON object
1 MB    1 thousand KB      A compressed image or small bundle
1 GB    1 thousand MB      1 KB multiplied by 1 million items
1 TB    1 thousand GB      1 KB multiplied by 1 billion items
1 PB    1 thousand TB      1 KB multiplied by 1 trillion items

binary powers worth knowing
2^10  = 1,024          ~ 1 thousand
2^20  = 1,048,576      ~ 1 million
2^30  = 1,073,741,824  ~ 1 billion

1 KiB -> 1 MiB -> 1 GiB -> 1 TiB -> 1 PiB
for quick estimates, KB -> MB -> GB -> TB -> PB by 1000x
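
To make the ladder concrete, here is a minimal sketch of a helper that walks a byte count up the decimal units. The name `humanize_bytes` is hypothetical and chosen for illustration; it assumes the 1000x decimal ladder used for quick estimates, not the 1024x binary one.

```python
# Decimal unit ladder used for quick estimates (each step is 1000x).
UNITS = ["B", "KB", "MB", "GB", "TB", "PB"]


def humanize_bytes(num_bytes: float) -> str:
    """Return a rounded decimal size such as '1.0 GB' (hypothetical helper)."""
    value = float(num_bytes)
    for unit in UNITS:
        # Stop at the first unit where the value drops below 1000,
        # or at PB, the top of the ladder.
        if value < 1000 or unit == UNITS[-1]:
            return f"{value:.1f} {unit}"
        value /= 1000


print(humanize_bytes(1_000 * 1_000_000))      # 1 KB x 1 million items -> 1.0 GB
print(humanize_bytes(1_000 * 1_000_000_000))  # 1 KB x 1 billion items -> 1.0 TB
print(2**30 / 10**9)                          # 1 GiB is about 7% larger than 1 GB
```

The last line shows why decimal rounding is usually acceptable: the binary and decimal units differ by a few percent per step, which is well inside the error of the input assumptions.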

Latency numbers to keep in your head

Exact numbers vary by hardware, cloud provider, language, and workload, but the ordering is stable. Memory is much faster than disk; local calls are much faster than cross-region calls; network distance matters.

~100 ns       memory access
~10-100 us    local SSD read
~0.5-2 ms     same-zone service call
~1-5 ms       cache or DB round trip in-region
~50-150 ms    cross-continent round trip
86,400        seconds per day (not a latency, but worth memorizing)

These numbers shape design choices. A browser request that calls five services serially pays network latency five times. A cache hit can save a database round trip. A cross-region synchronous write can dominate the whole request budget.
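
The effect of serial calls can be sketched with a few lines of arithmetic. The constants below are illustrative values picked from inside the ranges above, not measurements, and the comparison with concurrent calls is an added illustration that assumes the calls are independent of each other.

```python
# Rough latency anchors in milliseconds (illustrative, not measured).
SAME_ZONE_CALL_MS = 1.0
IN_REGION_DB_MS = 3.0
CROSS_CONTINENT_MS = 100.0


def serial_latency_ms(hops_ms):
    """Calls made one after another: the latencies add up."""
    return sum(hops_ms)


def parallel_latency_ms(hops_ms):
    """Independent calls made concurrently: the slowest call dominates."""
    return max(hops_ms)


five_calls = [SAME_ZONE_CALL_MS] * 5
print(serial_latency_ms(five_calls))    # 5.0 ms: latency paid five times
print(parallel_latency_ms(five_calls))  # 1.0 ms if the calls can overlap

# One cross-continent hop dwarfs everything else in the request.
print(serial_latency_ms([SAME_ZONE_CALL_MS, IN_REGION_DB_MS, CROSS_CONTINENT_MS]))  # 104.0
```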

The core formulas

Most capacity estimates are built from a few reusable formulas. Keep the units visible and convert one step at a time.

request rate
requests/day = DAU * actions per user per day
average QPS  = requests/day / 86,400
peak QPS     = average QPS * peak factor

common peak factor: 3x to 5x for consumer systems
higher for spiky events such as sports, sales, or breaking news

storage and bandwidth
storage = items per day * bytes per item * replication factor * retention in days

write bandwidth = writes/second * bytes per write
read bandwidth  = reads/second * bytes per read

server count = peak QPS / safe per-node throughput
then add headroom for failures, deploys, and uneven traffic

Estimate       Formula                                  Why it matters
QPS            DAU x actions/day / 86,400               Sizes app servers, caches, queues, and DB reads
Peak QPS       average QPS x peak factor                Systems must survive peaks, not averages
Storage        items x size x replication x retention   Sizes databases, object stores, and backups
Bandwidth      QPS x response size                      Sizes network links, CDN, and egress cost
Server count   peak QPS / per-node throughput           Turns demand into fleet size

Related patterns
For more practice, see Capacity Chain and Capacity Numbers.
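
The formulas in this section translate almost directly into small functions. This is a sketch under stated assumptions: the function names are hypothetical, storage is computed from a daily item rate and a retention period in days, and `headroom` is an assumed fractional margin (0.5 means 50% extra nodes).

```python
import math

SECONDS_PER_DAY = 86_400


def average_qps(dau, actions_per_user_per_day):
    """requests/day divided by seconds/day."""
    return dau * actions_per_user_per_day / SECONDS_PER_DAY


def peak_qps(avg_qps, peak_factor):
    """Scale the average by a peak factor such as 3 to 5."""
    return avg_qps * peak_factor


def storage_bytes(items_per_day, bytes_per_item, replication_factor, retention_days):
    """Physical bytes stored over the whole retention period."""
    return items_per_day * bytes_per_item * replication_factor * retention_days


def server_count(peak, safe_qps_per_node, headroom=0.0):
    """Nodes needed at peak, rounded up after applying fractional headroom."""
    return math.ceil(peak / safe_qps_per_node * (1 + headroom))


# Hypothetical inputs: 10M DAU, 20 actions per user per day, 5x peak factor.
avg = average_qps(10_000_000, 20)
peak = peak_qps(avg, 5)
print(round(avg), round(peak))                  # 2315 11574
print(server_count(peak, 1_000))                # 12 nodes with no headroom
print(server_count(peak, 1_000, headroom=0.5))  # 18 nodes with 50% headroom
```

Keeping each formula as its own function makes the units visible in the parameter names, which is the same habit the text recommends for doing the math by hand.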

Fully worked example: photo sharing feed

Suppose you are designing a photo-sharing feed. Use round numbers and state assumptions before calculating.

assumptions
DAU = 20 million users
feed opens = 12 per user per day
photos uploaded = 2 per user per day
average feed response = 60 KB
average stored photo after compression = 500 KB
metadata per photo = 2 KB
replication factor = 3
retention = 5 years
peak factor = 4x
one app server safely handles 800 QPS

Read QPS

  • Feed reads/day: 20M users x 12 opens = 240M feed reads per day.
  • Average read QPS: 240M / 86,400 is about 2,800 QPS.
  • Peak read QPS: 2,800 x 4 is about 11,200 QPS.
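
The read-rate steps above can be checked with unrounded arithmetic. This is a small sketch using only the values from the assumptions list; the variable names are chosen for illustration.

```python
SECONDS_PER_DAY = 86_400

dau = 20_000_000          # daily active users
feed_opens_per_user = 12  # feed opens per user per day
peak_factor = 4

feed_reads_per_day = dau * feed_opens_per_user
avg_read_qps = feed_reads_per_day / SECONDS_PER_DAY
peak_read_qps = avg_read_qps * peak_factor

print(f"{feed_reads_per_day:,} reads/day")    # 240,000,000 reads/day
print(f"{avg_read_qps:,.0f} average QPS")     # 2,778 average QPS
print(f"{peak_read_qps:,.0f} peak QPS")       # 11,111 peak QPS
```

The exact peak is about 11,100 QPS, while rounding the average up to 2,800 first gives 11,200. The difference is under 1%, which is why aggressive rounding is safe at this stage.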

Write QPS and storage

  • Uploads/day: 20M users x 2 photos = 40M photos per day.
  • Average upload QPS: 40M / 86,400 is about 460 uploads per second; peak is about 1,850 uploads per second.
  • Raw photo storage/day: 40M x 500 KB = 20 TB per day.
  • Replicated photo storage/day: 20 TB x 3 = 60 TB per day.
  • Five-year replicated photo storage: 60 TB/day x 365 x 5 is about 110 PB.
  • Metadata/day: 40M x 2 KB = 80 GB raw, or 240 GB with 3x replication. Metadata is much smaller than media but still large enough to require partitioning over time.
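
The storage chain is easy to get wrong by a factor of 1,000, so it helps to keep the units as named constants. The sketch below reuses the example's assumptions with decimal units; the variable names are illustrative.

```python
KB, GB, TB, PB = 10**3, 10**9, 10**12, 10**15

photos_per_day = 20_000_000 * 2  # DAU x uploads per user per day
photo_bytes = 500 * KB
metadata_bytes = 2 * KB
replication = 3
retention_days = 365 * 5

raw_photo_per_day = photos_per_day * photo_bytes
replicated_photo_per_day = raw_photo_per_day * replication
five_year_photo = replicated_photo_per_day * retention_days

replicated_metadata_per_day = photos_per_day * metadata_bytes * replication
five_year_metadata = replicated_metadata_per_day * retention_days

print(raw_photo_per_day / TB)            # 20.0 TB/day raw
print(replicated_photo_per_day / TB)     # 60.0 TB/day replicated
print(five_year_photo / PB)              # 109.5 PB over five years
print(replicated_metadata_per_day / GB)  # 240.0 GB/day replicated
print(five_year_metadata / TB)           # 438.0 TB over five years
```

The last figure backs up the partitioning remark: replicated metadata alone reaches hundreds of terabytes over the retention window.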

Bandwidth and servers

  • Peak feed bandwidth: 11,200 QPS x 60 KB is about 672 MB/s before compression and CDN effects.
  • Peak upload bandwidth: 1,850 uploads/s x 500 KB is about 925 MB/s entering object storage.
  • App server count: 11,200 peak QPS / 800 safe QPS per node = 14 nodes. Add headroom for deploys and failures, so start with roughly 20 to 25 app servers.
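
The bandwidth and fleet-size steps can be sketched the same way. The inputs are the rounded figures from the earlier steps; the headroom multipliers of 1.5 and 1.75 are assumptions chosen to show how a 20 to 25 node range can arise, not values from the example.

```python
import math

KB, MB = 10**3, 10**6

peak_read_qps = 11_200      # rounded figure from the Read QPS step
peak_uploads_per_s = 1_850  # rounded figure from the upload step
feed_response_bytes = 60 * KB
photo_bytes = 500 * KB
safe_qps_per_node = 800

feed_mb_per_s = peak_read_qps * feed_response_bytes / MB
upload_mb_per_s = peak_uploads_per_s * photo_bytes / MB
base_nodes = math.ceil(peak_read_qps / safe_qps_per_node)

# Assumed headroom multipliers for deploys, failures, and uneven traffic.
low_nodes = math.ceil(base_nodes * 1.5)
high_nodes = math.ceil(base_nodes * 1.75)

print(feed_mb_per_s)          # 672.0 MB/s of feed responses at peak
print(upload_mb_per_s)        # 925.0 MB/s of uploads at peak
print(base_nodes)             # 14 nodes with no headroom
print(low_nodes, high_nodes)  # 21 25
```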

What the example tells you
The math points to a CDN for feed media, object storage for photos, stateless app servers behind a load balancer, a partitioned metadata store, and background processing for thumbnails. The estimate produced design constraints, not just trivia.

Gotchas and practical habits

  • Average hides peaks: traffic follows time zones, notifications, launches, and special events. Always multiply by a peak factor.
  • Replication and backups count: a 1 TB logical dataset may consume 3 TB replicated plus backup, index, and log overhead.
  • Per-node throughput is a safe number: use measured sustainable throughput, not a perfect benchmark from an empty lab.
  • Reads and writes differ: reads may be cacheable; writes often require durability, ordering, validation, and replication.
  • Units prevent mistakes: write KB, MB, seconds, days, and years in every line so you do not multiply incompatible quantities.

Key takeaways
  • Back-of-the-envelope estimation converts vague scale into QPS, storage, bandwidth, and server counts.
  • Memorize unit ladders and latency anchors: KB to PB, 86,400 seconds per day, and the rough cost of memory, disk, network, and cross-region calls.
  • QPS comes from DAU x actions per day divided by 86,400, then multiplied by a peak factor.
  • Storage comes from item count x item size x replication x retention, with extra room for indexes, backups, logs, and growth.
  • Server count is peak QPS divided by safe per-node throughput, plus headroom for failures, deploys, and uneven load.

If 10M daily users each perform 20 actions, requests per day are 10M x 20 = 200M. Average QPS is 200M / 86,400, which is roughly 2,300 QPS. With a 5x peak factor, plan for about 11,500 QPS before caching or batching.

Replication means the system stores more physical bytes than the logical dataset. A 10 TB logical dataset with 3x replication needs about 30 TB before indexes, backups, logs, compaction overhead, and growth margin.

To size a fleet, divide peak QPS by the safe sustained throughput of one node. If peak is 40,000 QPS and one node safely handles 1,000 QPS, the math says 40 nodes. Add headroom so the fleet survives deploys, failures, and uneven load.