PasteBin — system design by AgileViper46

Reviewed by 6 specialized AI reviewers. Explore the diagram and the full per-section feedback below.

Hire Signal: Lean No Hire

The candidate has the right broad building blocks and some good instincts, but the design is too incomplete in critical correctness and operational areas for a senior-level system design round. The missing end-to-end flows, unclear failure handling, and weak connection between assumptions and infrastructure choices make it hard to trust the design in production.

✅ Good

Correctly prioritizes availability for reads

Calling out that read availability matters more than strict consistency is a sensible NFR choice for a paste-sharing system. It shows awareness that temporary staleness is usually less harmful than making shared pastes unavailable.

✅ Good

Identifies uniqueness as a consistency requirement

Stating that each paste URL must map to exactly one unique paste captures the key correctness property in this system. That is the main place where stronger consistency matters, and it is good that the candidate separated it from the read path.

warning

NFRs are qualitative but not measurable

Have you considered what happens when the team needs to decide whether the design actually meets the goals? 'Availability >> consistency' and 'scalability for 1M DAU' are directionally correct, but without concrete targets like read latency, write latency, or availability objectives, these NFRs do not drive clear design trade-offs.

warning

Scalability target is not connected to the stated assumptions

Have you considered translating 1M DAU, 1 paste per user per day, 10:1 read/write, and 5MB max paste size into expected request rates and storage growth? Without tying the NFRs to those assumptions, the scale requirement stays too abstract to justify whether the system needs simple horizontal scaling or more aggressive partitioning and caching.
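
As a rough illustration of that translation, the sketch below turns the stated assumptions into average request rates and worst-case daily ingest. The names (Workload, average_rates) are hypothetical, and it assumes decimal units (1 MB = 10^6 bytes) and that every paste is at the 5MB cap.

```python
from dataclasses import dataclass

SECONDS_PER_DAY = 86_400


@dataclass(frozen=True)
class Workload:
    dau: int                      # daily active users
    pastes_per_user_per_day: int  # writes per user per day
    reads_per_write: int          # read/write ratio
    max_paste_bytes: int          # upper bound on paste size


def average_rates(w: Workload) -> dict:
    """Average (not peak) request rates and worst-case daily ingest."""
    writes_per_day = w.dau * w.pastes_per_user_per_day
    reads_per_day = writes_per_day * w.reads_per_write
    return {
        "write_qps": writes_per_day / SECONDS_PER_DAY,
        "read_qps": reads_per_day / SECONDS_PER_DAY,
        "worst_case_ingest_bytes_per_day": writes_per_day * w.max_paste_bytes,
    }


# Assumptions taken from the design: 1M DAU, 1 paste/user/day, 10:1 reads, 5MB cap.
stated = Workload(dau=1_000_000, pastes_per_user_per_day=1,
                  reads_per_write=10, max_paste_bytes=5_000_000)
rates = average_rates(stated)
print(rates)
```

Under these assumptions the averages come out to roughly 12 writes/s and 116 reads/s, with up to 5 TB of new content per day if every paste is full size.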

warning

Consistency model is only partially defined

What happens after a paste is created or deleted if different replicas disagree temporarily? You identified strong uniqueness for URL generation, but the read path consistency expectations are still vague. It would be stronger to explicitly say whether reads can be eventually consistent after create/delete/expiry and what user-visible behavior is acceptable.

info

Expiry behavior should be reflected in the NFRs

You could improve this by stating the expected freshness of expiry enforcement. For example, if a paste expires, is it acceptable for some readers to see it for a short time due to eventual consistency or cache lag? That would make the availability-versus-consistency trade-off much clearer.

✅ Good

Core nouns for the main flow are present

The design identifies the two primary domain entities the system revolves around: User and Paste. For the stated requirements, those are the essential nouns needed to model create/read/delete and ownership of pastes.

✅ Good

Ownership relationship is implied in the model

Including userId on Paste shows the intended 1:N relationship from User to Paste, which is the key relationship for tracking who created a paste and supporting user-scoped operations.

warning

Shareable URL mapping is underspecified

Have you considered what happens when a paste is accessed via its shared URL? The requirements say pastes are shared via URLs, but the entity model does not make it clear whether pasteId itself is the public token or whether there is a separate share key/slug. Without being explicit about that relationship, the read path by URL is ambiguous.

warning

Deletion and expiry lifecycle are not clearly modeled

Have you considered what happens when a paste is expired or deleted but still referenced by a URL? You have status and expiration on Paste, which is a good start, but the entity relationships around lifecycle are not really defined. As an interviewer, I would push on whether deleted and expired are just states on Paste or whether there is any separate tombstone/retention concept needed to make reads behave consistently.

info

Blob relationship could be made more explicit

You could improve this by stating the relationship between Paste and the stored content more clearly. blobUri implies the paste metadata points to external content, but calling out that this is a 1:1 relationship between a Paste record and its content object would make the domain model easier to reason about for the core read path.

✅ Good

Basic traffic and bandwidth estimates are present

The candidate does translate the stated assumptions into rough daily writes, reads, peak RPS, and network throughput. That shows the right instinct to ground the design in numbers rather than discussing infrastructure abstractly.

warning

Storage sizing stops at daily ingest

Have you considered what happens as pastes accumulate over time? The calculation gives 5TB/day, but without converting expiry assumptions into retained data volume, there is no way to size persistent storage. If many pastes live for weeks or months, the system footprint grows very quickly and could overwhelm the chosen storage layer.
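
A minimal sketch of that conversion, assuming steady state (retained bytes equal daily ingest times the average lifetime in days). The TTL mix below is purely hypothetical, since the design does not state one, and replication overhead is ignored.

```python
from typing import Mapping


def retained_bytes(daily_ingest_bytes: float, ttl_mix: Mapping[int, float]) -> float:
    """Steady-state bytes held in storage.

    ttl_mix maps a retention period in days to the share of pastes using it.
    """
    if abs(sum(ttl_mix.values()) - 1.0) > 1e-9:
        raise ValueError("TTL shares must sum to 1")
    return sum(daily_ingest_bytes * share * days for days, share in ttl_mix.items())


# Hypothetical mix: 50% expire in 1 day, 30% in 7 days, 20% in 30 days.
hypothetical_mix = {1: 0.5, 7: 0.3, 30: 0.2}
total = retained_bytes(5e12, hypothetical_mix)  # 5 TB/day worst-case ingest
print(f"{total / 1e12:.1f} TB retained")
```

With this illustrative mix the average lifetime is 8.6 days, so 5TB/day of ingest settles at about 43 TB retained. A different mix changes the result linearly, which is why the expiry assumptions need to be stated.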

warning

Peak request math is not tied cleanly to the workload

Have you considered separating write and read peaks explicitly? From 1M writes/day and 10M reads/day, the average rates are much lower than the stated peak, and it is not clear whether the 1200 RPS figure covers total traffic or reads only, or how it was derived. At senior level, I would expect a clearer chain from DAU to write QPS, read QPS, and then peak multipliers so downstream sizing is easier to trust.
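
One way to make the gap visible is to back out the peak multiplier that the 1200 RPS figure implies under each reading. This is only a sketch; implied_peak_factor is a hypothetical helper, and the daily volumes are the ones quoted above.

```python
SECONDS_PER_DAY = 86_400


def implied_peak_factor(peak_rps: float, requests_per_day: float) -> float:
    """Ratio of a stated peak rate to the average rate for a daily volume."""
    average_rps = requests_per_day / SECONDS_PER_DAY
    return peak_rps / average_rps


writes_per_day = 1_000_000
reads_per_day = 10_000_000
stated_peak_rps = 1200

# Reading 1: the peak covers reads and writes together.
factor_if_total = implied_peak_factor(stated_peak_rps, writes_per_day + reads_per_day)
# Reading 2: the peak covers reads only.
factor_if_reads_only = implied_peak_factor(stated_peak_rps, reads_per_day)

print(round(factor_if_total, 2), round(factor_if_reads_only, 2))
```

Either reading implies a peak-to-average factor of roughly 9x to 10x, which is a defensible assumption but one that should be stated explicitly rather than left implicit.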

warning

Server count is not justified by request cost

What happens when requests are not uniform? '20 servers' is derived from a 1 second response assumption, but that does not explain how much CPU, memory, disk, or network each server can actually handle. For a paste service with objects up to 5MB, throughput is often constrained by I/O and bandwidth more than raw request count, so server sizing needs to be tied to per-node capacity rather than latency alone.

warning

Read and write bandwidth assume every paste is 5MB

Have you considered the impact of average object size versus max object size? Using the 5MB limit for every paste is acceptable as a worst-case upper bound, but then the rest of the architecture must be sized for that worst case. If this is meant as a realistic estimate, it likely overstates steady-state bandwidth and storage by a large margin. A senior-level answer should call out whether this is worst-case planning or expected average load.

info

Missing capacity chain beyond traffic numbers

You could improve this by extending the calculation from DAU and RPS into retained storage, metadata volume, and any cache hit assumptions. Right now there are isolated numbers, but not a full chain from users to infrastructure sizing.

info

Component choices are not justified by the calculated scale

You could improve this by connecting the numbers to infrastructure decisions: for example, whether this scale needs object storage for paste bodies, how much metadata a database must hold, and whether caching meaningfully reduces the 10:1 read load. The calculations should help explain why each major storage choice fits the workload.

✅ Good

Two-step upload/download flow keeps API lightweight

Returning metadata plus a pre-signed URL for the actual paste content is a sensible API choice for up-to-5MB payloads. It avoids forcing the application API to proxy file bytes and gives clients a clear separation between metadata operations and content transfer.

✅ Good

Core create, read, and delete operations are covered

The routes do map to the stated functional requirements at a basic level: create a paste, fetch a paste, and delete a paste. The read path also supports shareable URLs through a stable paste identifier.

warning

Create flow is underspecified and may leave orphaned objects

What happens if the client calls POST /pastes, receives a pre-signed upload URL, but never uploads the content or the upload only partially succeeds? As written, the API returns PasteMetadata before the paste exists as a usable resource, which can leave dangling metadata or empty objects. You could improve this by making the flow explicit: either create a pending paste and require a finalize call after upload, or document that the object store upload itself creates the authoritative content and metadata is only committed after successful upload.

warning

GET semantics are unclear for expired or missing pastes

Have you considered what the client sees when a paste has expired versus when the pasteId never existed? Right now only DELETE mentions 404, but the read path needs clear behavior too. Without explicit status codes, clients cannot distinguish 'not found', 'expired', and 'temporarily unavailable'. Define the response contract for GET /pastes/{pasteId}, for example 200 with metadata, 404 for unknown/expired if you want to hide existence, or a structured error body if you want clients to handle expiry differently.
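
A possible shape for that contract is sketched below. The function name, the dict-based paste record, and the hide_existence flag are all hypothetical; using 410 Gone for the "reveal expiry" case is one option among several, not something the design specifies.

```python
from datetime import datetime, timedelta, timezone
from enum import Enum
from typing import Optional


class PasteStatus(Enum):
    ACTIVE = "active"
    DELETED = "deleted"


def get_paste_status_code(paste: Optional[dict], now: datetime,
                          hide_existence: bool = True) -> int:
    """Map a metadata lookup result to an HTTP status for GET /pastes/{pasteId}."""
    if paste is None:
        return 404  # the id never existed
    gone = paste["status"] is PasteStatus.DELETED or paste["expires_at"] <= now
    if gone:
        # 404 hides whether the paste ever existed; 410 tells clients it is gone.
        return 404 if hide_existence else 410
    return 200


now = datetime(2024, 1, 1, tzinfo=timezone.utc)
live = {"status": PasteStatus.ACTIVE, "expires_at": now + timedelta(hours=1)}
expired = {"status": PasteStatus.ACTIVE, "expires_at": now - timedelta(hours=1)}
print(get_paste_status_code(live, now), get_paste_status_code(expired, now))
```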

warning

Error model and retry guidance are missing

What happens when pre-signed URL generation fails, the object upload/download is rejected, or the client retries a timed-out POST /pastes request? Without status codes and a consistent error shape, clients will struggle to implement safe retries and user-facing errors. You could improve this by defining standard responses such as 400 for invalid TTL, 413 for payload too large, 429 for rate limiting, 5xx for transient failures, and an error body with code/message/retryable fields.
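
For illustration, a consistent error shape could look like the sketch below. The error codes, messages, and the choice of 503 for transient failures are assumptions made for the example; only the status classes and the code/message/retryable fields come from the feedback above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ApiError:
    status: int      # HTTP status code
    code: str        # stable machine-readable identifier
    message: str     # human-readable explanation
    retryable: bool  # whether the client may safely retry


# Hypothetical catalogue of errors for the paste API.
ERRORS = {
    "invalid_ttl": ApiError(400, "INVALID_TTL", "TTL is out of range.", False),
    "payload_too_large": ApiError(413, "PAYLOAD_TOO_LARGE", "Paste exceeds 5MB.", False),
    "rate_limited": ApiError(429, "RATE_LIMITED", "Too many requests.", True),
    "transient_failure": ApiError(503, "TRANSIENT_FAILURE", "Try again shortly.", True),
}


def error_response(key: str) -> tuple:
    """Return (status, body) with the same body shape for every error."""
    err = ERRORS[key]
    body = {"code": err.code, "message": err.message, "retryable": err.retryable}
    return err.status, body


print(error_response("rate_limited"))
```

Keeping retryable in the body lets clients decide on retries without hard-coding a list of status codes.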

info

Size limit is a requirement but not reflected in the API contract

You state a 5MB paste limit, but what does the client see if it tries to upload 8MB to the pre-signed URL? Relying only on storage-layer rejection makes the API harder to use. You could improve this by returning maxSizeBytes in the create response and documenting that oversized uploads fail with 413 or equivalent storage error semantics.

info

Resource design around preSignedUrl is not very clean

Have you considered making the upload/download URL generation part of the paste resource rather than separate generic endpoints like GET /preSignedUrl and POST /preSignedUrl? As written, those routes are ambiguous because they are not scoped to a specific paste or operation. A cleaner contract would be something like POST /pastes to create an upload session and GET /pastes/{pasteId} to return a download URL, or nested routes such as /pastes/{pasteId}/content.

info

Response payloads need more concrete fields

What exactly is in PasteMetadata besides preSignedUrl? Clients usually need at least pasteId, expiration timestamp, and possibly content type or size to use the system reliably. Without those fields, even basic flows like sharing a URL or showing expiry are underspecified. You could improve this by defining explicit response schemas for create and read.
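
As a sketch of what explicit schemas might contain, the dataclasses below list one plausible set of fields. All field names, the example URL, and the example id are hypothetical; the design itself only names PasteMetadata and preSignedUrl.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class CreatePasteResponse:
    paste_id: str        # public identifier used in the shareable URL
    upload_url: str      # pre-signed URL for uploading the content
    expires_at: str      # ISO 8601 timestamp
    max_size_bytes: int  # upload limit the client should enforce up front


@dataclass(frozen=True)
class ReadPasteResponse:
    paste_id: str
    download_url: str    # pre-signed URL for fetching the content
    expires_at: str
    content_type: str
    size_bytes: int


created = CreatePasteResponse(
    paste_id="aZ3kQ9",
    upload_url="https://storage.example.com/upload/aZ3kQ9",
    expires_at="2030-01-01T00:00:00Z",
    max_size_bytes=5_000_000,
)
payload = json.dumps(asdict(created))
print(payload)
```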

✅ Good

Direct blob upload/download keeps app tier light

Using pre-signed URLs so clients transfer paste content directly with object storage is a solid choice for the stated 5MB payloads and 1M DAU scale. It avoids turning the Paste service into a bandwidth bottleneck and lets the application tier focus on metadata, auth, and lifecycle management.

✅ Good

Async cleanup for expiry and incomplete uploads

Separating blob cleanup into a worker and using an outbox/CDC-style deletion flow is a thoughtful design choice. It acknowledges that metadata deletion and blob deletion are different failure domains and avoids blocking user-facing requests on storage cleanup.

✅ Good

Short ID generation is handled centrally in the service layer

Generating unique paste IDs in the service and encoding them to short Base62 identifiers fits the URL-sharing requirement well and avoids depending on the database for ID generation on the hot path.
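
A minimal Base62 encoder and decoder is sketched below for reference. The alphabet ordering (digits, then uppercase, then lowercase) is an assumption; the design does not say which ordering or which numeric ID source it uses.

```python
import string

# 62 symbols: 0-9, A-Z, a-z (ordering is an assumption for this sketch).
ALPHABET = string.digits + string.ascii_uppercase + string.ascii_lowercase
BASE = len(ALPHABET)


def base62_encode(n: int) -> str:
    """Encode a non-negative integer as a Base62 string."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n:
        n, rem = divmod(n, BASE)
        chars.append(ALPHABET[rem])
    return "".join(reversed(chars))


def base62_decode(s: str) -> int:
    """Decode a Base62 string back to the integer it represents."""
    n = 0
    for ch in s:
        n = n * BASE + ALPHABET.index(ch)
    return n


print(base62_encode(123456789))
```

Seven Base62 characters cover 62^7 (about 3.5 trillion) identifiers, which is far more than the stated 1M pastes per day would consume for many years.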

warning

Read path is not fully connected end-to-end

What happens on a cache miss for GET /pastes/{id}? The diagram shows Paste service reading Redis, but there is no explicit flow from the service to Postgres or replicas for metadata lookup on misses. Without a clear fallback path, the read flow does not fully complete for uncached pastes.
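
The missing fallback is usually a cache-aside read. The sketch below shows the idea with plain dicts standing in for Redis and a Postgres read replica; the PasteReader class and its counters are hypothetical and exist only to make the flow testable.

```python
from typing import Optional


class PasteReader:
    """Cache-aside metadata lookup: try the cache, fall back to the database."""

    def __init__(self, cache: dict, replica: dict):
        self.cache = cache      # stand-in for Redis
        self.replica = replica  # stand-in for a Postgres read replica
        self.replica_reads = 0

    def get_metadata(self, paste_id: str) -> Optional[dict]:
        hit = self.cache.get(paste_id)
        if hit is not None:
            return hit
        # Cache miss: the service (not the cache) queries the database.
        self.replica_reads += 1
        row = self.replica.get(paste_id)
        if row is not None:
            self.cache[paste_id] = row  # populate for subsequent reads
        return row


reader = PasteReader(cache={}, replica={"aZ3kQ9": {"blob_uri": "blobs/aZ3kQ9"}})
first = reader.get_metadata("aZ3kQ9")   # miss, served from the replica
second = reader.get_metadata("aZ3kQ9")  # hit, served from the cache
print(first == second, reader.replica_reads)
```

A real implementation would also set a cache TTL no longer than the paste's remaining lifetime so that expired pastes do not linger in the cache.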

warning

Replica and Redis relationships are architecturally unclear

Have you considered what happens when Redis is cold or unavailable? The arrows show Redis reading from replicas, which is backwards from a normal application flow and makes it unclear whether replicas are actually serving application reads. This ambiguity matters because the first bottleneck at the stated read-heavy workload is likely metadata lookup, and the design should clearly show whether reads come from cache, primary DB, or replicas.

warning

Create flow can leave dangling metadata without a clear completion state

What happens when the client gets a pre-signed upload URL, metadata is inserted, but the upload never completes or partially uploads? You mention 'stuck pastes' and cleanup, which is good, but the HLD does not show a clear state transition such as PENDING -> ACTIVE after successful upload. Without that, reads may return metadata for blobs that do not exist yet, and cleanup logic becomes the only correctness mechanism.
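
A small state machine makes the completion state explicit. The sketch below is an assumption about how PENDING -> ACTIVE could be enforced; the DELETED state and the helper names are hypothetical additions for illustration.

```python
from enum import Enum


class UploadState(Enum):
    PENDING = "PENDING"  # metadata exists, upload not confirmed
    ACTIVE = "ACTIVE"    # upload confirmed, paste is readable
    DELETED = "DELETED"  # removed by the user, expiry, or cleanup


# Allowed transitions; anything else is rejected.
ALLOWED = {
    UploadState.PENDING: {UploadState.ACTIVE, UploadState.DELETED},
    UploadState.ACTIVE: {UploadState.DELETED},
    UploadState.DELETED: set(),
}


def transition(current: UploadState, target: UploadState) -> UploadState:
    """Return the new state, or raise if the transition is not allowed."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition {current.name} -> {target.name}")
    return target


def is_readable(state: UploadState) -> bool:
    """Only ACTIVE pastes are served, so reads never see a missing blob."""
    return state is UploadState.ACTIVE


state = transition(UploadState.PENDING, UploadState.ACTIVE)
print(state.name, is_readable(state))
```

With this in place, cleanup of stuck uploads becomes a matter of moving old PENDING rows to DELETED, and correctness no longer depends on the cleanup job running promptly.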

warning

Expiry handling is pushed onto the synchronous read path

Have you considered what happens when many expired pastes are accessed repeatedly? The design says GET checks expiration, deletes DB state, and marks blob deletion during the request. That makes a user read trigger write-side work and background coordination, which can increase latency and create contention on hot expired keys. A periodic expiry sweeper plus lightweight read-time validation would keep the read path simpler and more available.
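
The split between read-time validation and a background sweeper can be sketched as follows. A dict stands in for the metadata store, and the function names and batch size are hypothetical.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def is_expired(paste: dict, now: datetime) -> bool:
    return paste["expires_at"] <= now


def read_paste(store: dict, paste_id: str, now: datetime) -> Optional[dict]:
    """Read path: a pure check, no writes or deletes during the request."""
    paste = store.get(paste_id)
    if paste is None or is_expired(paste, now):
        return None
    return paste


def sweep_expired(store: dict, now: datetime, batch_size: int = 100) -> list:
    """Background job: remove up to batch_size expired pastes per run."""
    expired_ids = [pid for pid, p in store.items() if is_expired(p, now)][:batch_size]
    for pid in expired_ids:
        del store[pid]
    return expired_ids


now = datetime(2024, 1, 1, tzinfo=timezone.utc)
store = {
    "old": {"expires_at": now - timedelta(days=1)},
    "new": {"expires_at": now + timedelta(days=1)},
}
hidden = read_paste(store, "old", now)  # None, but the row is left untouched
swept = sweep_expired(store, now)       # the sweeper removes it later
print(hidden, swept)
```

Repeated reads of a hot expired key stay cheap because they only compare timestamps; the deletion work happens once, off the request path.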

info

CDN usage is not well justified for private pre-signed object access

You could improve this by clarifying whether the CDN is actually serving public shared pastes or just proxying object storage. With pre-signed URLs, CDN cacheability can be limited depending on URL uniqueness and headers. If the intent is to accelerate hot shared pastes, explain the cache key and TTL strategy; otherwise the CDN may be an extra component without much benefit.

warning

Primary database still looks like a single failure and scaling point

What happens when Postgres primary is down? The design includes replicas, but writes for create/delete and likely cache-miss reads still depend on the primary, and there is no failover story shown. At this scale, a single primary is a reasonable starting point, but the HLD should acknowledge promotion/failover and how the service behaves during primary loss.

info

Delete semantics between metadata and blob could be made more explicit

You could improve this by showing whether DELETE is soft delete in metadata first and asynchronous blob removal later, or a hard delete attempt inline. Since reads prioritize availability, explicit tombstoning would make failure handling clearer: if blob deletion fails, the paste should still remain logically deleted and not reappear.
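
One way to express the soft-delete option is sketched below: DELETE tombstones the metadata row and queues the blob for removal, and a worker drains the queue with retries. The dict and list stand in for the database and the outbox, and the function names and use of OSError for storage failures are assumptions for illustration.

```python
from typing import Callable


def delete_paste(metadata: dict, outbox: list, paste_id: str) -> bool:
    """Tombstone the row and enqueue blob removal; returns False if unknown."""
    row = metadata.get(paste_id)
    if row is None:
        return False
    if not row["deleted"]:
        row["deleted"] = True           # logical delete takes effect immediately
        outbox.append(row["blob_uri"])  # blob removal happens asynchronously
    return True


def drain_outbox(outbox: list, delete_blob: Callable[[str], None]) -> list:
    """Try each queued blob deletion; return the ones that must be retried."""
    remaining = []
    for uri in outbox:
        try:
            delete_blob(uri)
        except OSError:
            remaining.append(uri)  # keep for the next run
    return remaining


def failing_delete(uri: str) -> None:
    raise OSError("storage unavailable")


metadata = {"aZ3kQ9": {"blob_uri": "blobs/aZ3kQ9", "deleted": False}}
outbox: list = []
delete_paste(metadata, outbox, "aZ3kQ9")
outbox = drain_outbox(outbox, failing_delete)
print(metadata["aZ3kQ9"]["deleted"], outbox)
```

Even though the blob deletion failed in this run, the paste stays logically deleted and the blob remains queued, so it cannot reappear on the read path.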
