DrawLint.ai

Dropbox / File Storage — system design by AgileViper46

Lean No Hire

Reviewed by 6 specialized AI reviewers. The full per-section feedback follows.

Hire Signal: Lean No Hire

There are meaningful correct instincts in the architecture, especially around storage separation and direct blob upload, which is a positive mid-level signal. But the missing end-to-end flow details for upload/download, weak API contracts for large files, and lack of credible scaling treatment against the stated 100M DAU assumption leave too many gaps in correctness and completeness.

✅ Good

Consistency tradeoff is explicitly stated

The design correctly calls out that the system can relax consistency in favor of availability, which is a reasonable NFR direction for cross-device file sync at large scale where temporary staleness is often acceptable.

✅ Good

Resumability is tied to fault tolerance

Mentioning restart/resume behavior for uploads and downloads shows awareness of a key reliability requirement for large files and unstable networks, which is important for user experience and operational robustness.

warning

NFRs are too vague and not measurable

Terms like 'available most of the times', 'highly scalable', and 'low latency' are directionally correct but not actionable. Add concrete targets such as availability SLA (for example 99.9% or 99.99%), sync propagation latency (for example p95 under a few seconds), and upload/download success targets so the design can be evaluated against the stated 100M DAU assumption.
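
As a rough illustration of how such targets become checkable, the sketch below converts an availability SLA into a yearly downtime budget and computes a latency percentile. The function names and the nearest-rank percentile definition are assumptions chosen for illustration, not part of the reviewed design.

```python
import math

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def downtime_budget_minutes(availability_pct: float) -> float:
    """Minutes of allowed downtime per year for a given availability SLA."""
    return MINUTES_PER_YEAR * (100 - availability_pct) / 100


def percentile(samples: list[float], p: float) -> float:
    """Nearest-rank percentile (one common definition; others interpolate)."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]
```

Under this arithmetic, 99.9% allows about 525.6 minutes of downtime per year and 99.99% about 52.6 minutes, which is the kind of difference a measurable NFR makes visible.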

warning

Consistency model is underspecified

Saying the system can trade some consistency for availability is a good start, but it should define what consistency users should expect. For example, specify eventual consistency for cross-device sync, read-after-write behavior for the uploading device, and conflict handling expectations when multiple devices update the same file.
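
One way to make conflict handling concrete is optimistic versioning: a device states which version it edited, and the server rejects the write if that version is stale. The sketch below is a minimal illustration under that assumption; FileVersion and try_update are hypothetical names, and the candidate's design may intend a different policy (for example last-writer-wins).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FileVersion:
    version: int   # server-assigned, incremented on every accepted write
    checksum: str  # content hash of this version


def try_update(current: FileVersion, base_version: int, new_checksum: str):
    """Accept the write only if the client edited the latest version.

    Returns ("ok", new_state) or ("conflict", current_state).
    """
    if base_version != current.version:
        # Another device wrote first; the client must merge or keep both copies.
        return "conflict", current
    return "ok", FileVersion(current.version + 1, new_checksum)
```

On a conflict result, a client could save its edit as a separate "conflicted copy" rather than silently overwriting the other device's change.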

info

Availability scope should be broken down by operation

Uploads, downloads, metadata operations, and sync notifications often have different availability expectations. Separating these helps make the NFRs more precise and easier to validate, especially for a large-scale storage system.

✅ Good

Covers the main storage nouns

The design lists the core domain entities needed for a file storage and sync system: Client, File, File Metadata, and Directory. These are relevant to upload, download, and organizing synced content.

warning

User/account entity is missing

A user-facing remote storage system needs a User or Account entity as a primary domain noun, since uploads, downloads, and cross-device sync all happen in the context of ownership and identity. Add a User/Account entity and relate files/directories/clients to it.

warning

Sync-specific entity is not represented

Automatic sync across devices usually requires a domain concept for tracking synchronization state, such as Device, Sync Session, or Change/Version. Client is close, but as written it is ambiguous and does not clearly capture the sync relationship between a user's devices and stored files. Add an explicit sync-relevant noun, or rename Client to Device if that is the intended entity.

warning

Basic traffic numbers are incomplete

You provided DAU and a daily ingest estimate, which is a useful start, but the section is missing rough QPS/throughput calculations. For this problem, convert the daily numbers into writes per second, read/download rate assumptions, and network bandwidth. For example, 100M uploads/day is about 1.2K uploads/sec on average before peak factors, and 10PB/day implies roughly 116 GB/s (about 0.93 Tbps) of sustained average ingress. Adding average-to-peak multipliers would make the sizing much more methodical.
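
The conversion from daily totals to per-second rates can be sketched as below. The daily figures are the ones quoted in this review; the peak factor of 3 and the use of decimal petabytes are assumptions made only for illustration.

```python
SECONDS_PER_DAY = 86_400

UPLOADS_PER_DAY = 100_000_000       # from the review's example
INGEST_BYTES_PER_DAY = 10 * 10**15  # 10 PB/day, decimal petabytes (assumption)
PEAK_FACTOR = 3                     # assumption: peak is 3x the daily average

# Average and peak write rate.
avg_uploads_per_sec = UPLOADS_PER_DAY / SECONDS_PER_DAY
peak_uploads_per_sec = avg_uploads_per_sec * PEAK_FACTOR

# Implied average file size, in megabytes.
avg_file_size_mb = INGEST_BYTES_PER_DAY / UPLOADS_PER_DAY / 10**6

# Sustained ingress bandwidth, in gigabits per second.
avg_ingress_gbit_per_sec = INGEST_BYTES_PER_DAY * 8 / SECONDS_PER_DAY / 10**9
peak_ingress_gbit_per_sec = avg_ingress_gbit_per_sec * PEAK_FACTOR
```

With these inputs the average works out to about 1,157 uploads/sec and about 926 Gbit/s of ingress, with an implied average file size of 100 MB.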

warning

Read capacity is too vague for auto-sync workload

Saying 'more read compared to write' is directionally reasonable, but at this scale it is not enough to validate capacity. Automatic sync across devices can multiply reads/downloads per uploaded file, so you should estimate a read:write ratio or average number of synced devices per user and derive download QPS and egress bandwidth. Without that, the design cannot be sized confidently against the stated 100M DAU assumption.
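
A minimal sketch of that derivation follows. It assumes every upload is pulled once by each of the user's other devices, with no sharing and no deduplication; download_load and the devices-per-user value are hypothetical and should be replaced with the candidate's own assumptions.

```python
def download_load(upload_qps: float, ingress_gbps: float, devices_per_user: float):
    """Estimate download QPS and egress bandwidth caused by sync fan-out.

    Each uploaded file is assumed to be fetched by every *other* device
    of the same user, so the fan-out is devices_per_user - 1.
    """
    fanout = max(devices_per_user - 1, 0)
    return upload_qps * fanout, ingress_gbps * fanout
```

For example, with 3 devices per user the fan-out is 2, so download QPS and egress bandwidth are each about twice the upload-side figures.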

warning

Storage growth is only partially quantified

The 10PB/day figure is in the right order of magnitude given your assumptions, but the calculation stops too early. You should also project retained storage over time, including replication/erasure-coding overhead and metadata growth. For example, at 10PB/day with no deletions, logical storage grows by about 300PB per month and about 3.65EB per year before any replication or erasure-coding overhead, so showing monthly or yearly storage growth would better demonstrate that the numbers are realistic.
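
The projection can be sketched as below. It assumes no deletions and no deduplication; the overhead multipliers (3.0 for triple replication, 1.5 as an example erasure-coding overhead) and the 1 KB metadata row size are illustrative assumptions, not figures from the design.

```python
PB_PER_DAY = 10  # logical ingest, from the design's estimate


def retained_storage_pb(days: int, overhead: float = 1.0) -> float:
    """Physical storage after `days` of ingest, scaled by a redundancy overhead."""
    return PB_PER_DAY * days * overhead


def metadata_gb_per_day(files_per_day: int, bytes_per_row: int) -> float:
    """Daily metadata growth in decimal gigabytes."""
    return files_per_day * bytes_per_row / 10**9
```

Under these assumptions, one year at triple replication is 10,950 PB of physical storage, and 100M new files per day at 1 KB of metadata each adds about 100 GB of metadata daily.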

warning

Upload API is too underspecified for large files

A single POST /files with an inline File payload is not a solid fit for files up to 50 GB. Large uploads need an explicit protocol such as multipart/chunked upload or a create-upload-session flow (for example: POST /files to create metadata and upload session, PUT /files/{id}/parts/{partNumber} for chunks, then POST /files/{id}/complete). Without this, retries, resumability, and partial failure handling are unclear.
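
The session flow described above can be sketched with an in-memory model. UploadSession and its methods are hypothetical names that map onto the three example routes; a real service would persist this state and delegate part storage to the blob store.

```python
import hashlib
import math


class UploadSession:
    """Tracks which parts of one large file have arrived (in-memory sketch)."""

    def __init__(self, total_size: int, part_size: int):
        # Corresponds to POST /files: create metadata and an upload session.
        self.total_size = total_size
        self.part_size = part_size
        self.num_parts = math.ceil(total_size / part_size)
        self.parts: dict[int, str] = {}  # part number -> SHA-256 hex digest

    def put_part(self, part_number: int, data: bytes) -> None:
        # Corresponds to PUT /files/{id}/parts/{partNumber}.
        if not 1 <= part_number <= self.num_parts:
            raise ValueError("part number out of range")
        expected = min(self.part_size,
                       self.total_size - (part_number - 1) * self.part_size)
        if len(data) != expected:
            raise ValueError("unexpected part size")
        # Re-sending a part overwrites it, so client retries are safe.
        self.parts[part_number] = hashlib.sha256(data).hexdigest()

    def missing_parts(self) -> list[int]:
        # Lets a client resume by uploading only what is still missing.
        return [n for n in range(1, self.num_parts + 1) if n not in self.parts]

    def complete(self) -> list[tuple[int, str]]:
        # Corresponds to POST /files/{id}/complete.
        missing = self.missing_parts()
        if missing:
            raise RuntimeError(f"cannot complete, missing parts: {missing}")
        return sorted(self.parts.items())
```

After an interruption, a client would ask for the missing parts and upload only those, which is what makes resumability and partial-failure handling explicit in the contract.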

warning

Download endpoint does not define how large file transfer works

GET /files/:id -> File is too vague for large downloads. For big files and multi-device sync, the API should clearly support streaming or ranged downloads, such as GET /files/{id}/content with HTTP Range support, or returning a signed download URL. This makes resume-after-interruption and partial reads possible.
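
To illustrate what Range support involves on the server side, here is a minimal parser for a single byte range. parse_range is a hypothetical helper; it handles only the single-range forms ("bytes=a-b", "bytes=a-", "bytes=-n") and deliberately rejects multi-range requests.

```python
from typing import Optional, Tuple


def parse_range(header: str, size: int) -> Optional[Tuple[int, int]]:
    """Parse a single-range HTTP Range header for a resource of `size` bytes.

    Returns an inclusive (start, end) pair, or None when the range cannot be
    satisfied (a server would answer 416). Raises ValueError if malformed.
    """
    unit, _, spec = header.partition("=")
    if unit.strip() != "bytes" or "," in spec:
        raise ValueError("only a single bytes range is supported")
    first, sep, last = spec.strip().partition("-")
    if not sep:
        raise ValueError("malformed range")
    if first == "":
        # Suffix form: the last n bytes of the file.
        if not last.isdigit():
            raise ValueError("malformed suffix range")
        n = int(last)
        if n == 0 or size == 0:
            return None
        return max(size - n, 0), size - 1
    if not first.isdigit() or (last and not last.isdigit()):
        raise ValueError("malformed range")
    start = int(first)
    if start >= size:
        return None
    end = int(last) if last else size - 1
    if end < start:
        raise ValueError("range end precedes start")
    return start, min(end, size - 1)
```

A client resuming an interrupted download would send "bytes=N-", where N is the number of bytes it already holds.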

warning

Sync changes endpoint does not cover practical sync behavior

GET /files?changed_since=<timestamp> is a reasonable starting point, but returning File[] suggests full file contents rather than change metadata. For sync, the endpoint should return file metadata and change state only (id, version, modified_at, deleted/tombstone status, size, checksum), so clients can decide what to upload or download. This is especially important at the stated scale and with large files.
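
A metadata-only change record, and the decision a client makes from it, might look like the sketch below. The fields follow the list in this comment; ChangeEntry and plan_action are hypothetical names, and the comparison by checksum is an assumption about how the client detects differing content.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ChangeEntry:
    id: str
    version: int
    modified_at: str  # ISO 8601 timestamp
    deleted: bool     # True means this entry is a tombstone
    size: int
    checksum: str


def plan_action(local_checksum: Optional[str], change: ChangeEntry) -> str:
    """Decide what the client does for one change.

    local_checksum is None when the client has no local copy of the file.
    """
    if change.deleted:
        return "delete_local" if local_checksum is not None else "skip"
    if local_checksum == change.checksum:
        return "skip"  # content already matches, nothing to transfer
    return "download"
```

Because the record carries no file bytes, a client with thousands of changed files can plan its work cheaply and transfer only what differs.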

warning

Missing update/delete operations for the primary file entity

The routes cover create and read, but not the rest of CRUD for files. Since sync requires tracking remote changes across devices, the API should include at least metadata update/versioning semantics and delete support (for example DELETE /files/{id}, possibly PATCH /files/{id} for metadata). Without delete/tombstone behavior, clients cannot fully converge state during sync.
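
A tombstone-style delete and a metadata patch can be sketched as below. The in-memory dict stands in for the metadata table, and the row fields (name, version, deleted) are assumptions made for illustration.

```python
def delete_file(files: dict, file_id: str) -> dict:
    """DELETE /files/{id}: keep the row as a tombstone and bump the version."""
    row = files[file_id]
    if not row["deleted"]:  # idempotent: repeating the delete changes nothing
        row["deleted"] = True
        row["version"] += 1
    return row


def patch_file(files: dict, file_id: str, name: str) -> dict:
    """PATCH /files/{id}: update metadata and bump the version."""
    row = files[file_id]
    if row["deleted"]:
        raise KeyError(f"{file_id} has been deleted")
    row["name"] = name
    row["version"] += 1
    return row
```

Keeping the tombstone row, rather than removing it, is what allows a device that was offline during the delete to learn about it later and remove its local copy.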

info

Route structure mixes listing changes with file listing semantics

Using GET /files?changed_since=... is acceptable, but it overloads the base collection route without clearly separating 'list files' from 'list changes'. A cleaner structure would be something like GET /changes?since=token or GET /files/changes?since=token. This makes the sync contract easier to understand and evolve.
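
A token-based changes route could be backed by something like the sketch below. It assumes the server assigns each change a monotonically increasing sequence number and that the log is ordered by it; the token encoding and the function names are hypothetical.

```python
import base64
import json
from typing import Optional


def encode_cursor(seq: int) -> str:
    """Wrap a sequence number in an opaque, URL-safe token."""
    raw = json.dumps({"seq": seq}).encode()
    return base64.urlsafe_b64encode(raw).decode()


def decode_cursor(token: str) -> int:
    return json.loads(base64.urlsafe_b64decode(token.encode()))["seq"]


def list_changes(log: list, since_token: Optional[str], limit: int = 100):
    """Return one page of changes after the cursor, plus the next cursor.

    `log` is a list of dicts ordered by their "seq" field.
    """
    since = decode_cursor(since_token) if since_token else 0
    page = [entry for entry in log if entry["seq"] > since][:limit]
    next_seq = page[-1]["seq"] if page else since
    return page, encode_cursor(next_seq)
```

A server-issued token sidesteps two weaknesses of a raw timestamp: client clock skew, and several changes sharing the same timestamp. It also lets the server change the cursor's contents later without breaking clients.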

✅ Good

Direct upload path to blob storage

Having the upload agent send file contents directly to S3/Azure Blob is a solid high-level choice for large files. It avoids routing multi-GB payloads through the application tier, which improves scalability and reduces pressure on the upload service.

✅ Good

Metadata separated from file content

Storing file metadata in PostgreSQL while keeping file bytes in object storage is a correct architectural split. This keeps transactional metadata operations separate from bulk blob storage and supports upload/download flows cleanly.

✅ Good

Dedicated sync path is identified

Introducing a sync agent on the client side and a sync service on the backend shows awareness that automatic file sync is a distinct workflow from simple upload/download and needs its own components.

critical

Download path does not reach blob storage

The file reader service reads metadata from PostgreSQL, but there is no connection from the reader service or client to S3/Azure Blob for actually fetching file contents. As drawn, users can discover file metadata but cannot complete file downloads end-to-end. Fix by adding a path where the file reader service returns a signed blob URL or streams the object from blob storage to the client.
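
The signed-URL option can be illustrated with a simple HMAC scheme. This is not the S3 or Azure Blob signing format; a real implementation would use the provider's SDK to generate URLs. The secret, URL layout, and function names below are hypothetical.

```python
import hashlib
import hmac

SECRET = b"demo-secret"  # hypothetical key shared by reader service and blob gateway


def _signature(file_id: str, expires_at: int, secret: bytes) -> str:
    msg = f"{file_id}:{expires_at}".encode()
    return hmac.new(secret, msg, hashlib.sha256).hexdigest()


def sign_download(file_id: str, expires_at: int, secret: bytes = SECRET) -> str:
    """Reader service: issue a time-limited URL the client can fetch directly."""
    sig = _signature(file_id, expires_at, secret)
    return f"/blobs/{file_id}?expires={expires_at}&sig={sig}"


def verify_download(file_id: str, expires_at: int, sig: str, now: int,
                    secret: bytes = SECRET) -> bool:
    """Blob gateway: accept only unexpired URLs with a valid signature."""
    if now > expires_at:
        return False
    expected = _signature(file_id, expires_at, secret)
    return hmac.compare_digest(expected, sig)
```

With this shape, the reader service checks permissions against metadata once, and the file bytes flow from blob storage to the client without passing through the application tier.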

critical

Upload flow is not connected end-to-end

The upload service writes metadata to PostgreSQL, and the upload agent uploads directly to blob storage, but there is no connection showing how the client obtains the upload URI or how the upload service coordinates with the upload agent. Without that linkage, the upload flow is incomplete. Fix by connecting client -> SSL termination -> upload service, then upload service -> blob storage presign/URI generation, followed by client upload agent -> blob storage and a completion callback/metadata update.
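
The linkage described in this fix can be sketched as three steps: create, direct upload, complete. FakeBlobStore and UploadService are hypothetical stand-ins, and the "blob://" string only represents a presigned URI.

```python
class FakeBlobStore:
    """Stands in for S3/Azure Blob in this sketch."""

    def __init__(self):
        self.objects: dict[str, bytes] = {}

    def put(self, key: str, data: bytes) -> None:
        self.objects[key] = data


class UploadService:
    def __init__(self, blobs: FakeBlobStore):
        self.blobs = blobs
        self.metadata: dict[str, dict] = {}  # stands in for the PostgreSQL table

    def create_upload(self, file_id: str, name: str) -> str:
        # Step 1: record pending metadata and hand the client an upload URI.
        self.metadata[file_id] = {"name": name, "status": "pending"}
        return f"blob://{file_id}"

    def complete_upload(self, file_id: str) -> dict:
        # Step 3: confirm the object exists before exposing the file.
        if file_id not in self.blobs.objects:
            raise RuntimeError("blob not found; upload did not finish")
        self.metadata[file_id]["status"] = "available"
        return self.metadata[file_id]
```

Step 2 is the client's upload agent writing bytes straight to the blob store using the returned URI. Keeping the file in a pending state until completion prevents other devices from seeing a file whose content has not arrived.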

warning

Several client components are orphaned from the backend

Mobile App, Website, and Desktop are shown, but only the sync agent is connected to SSL termination. There is no logical path from these user-facing clients to upload or download APIs, so the main file operations are not fully represented. Fix by explicitly connecting each client entry point, or a shared client layer, to the load balancer/API path for upload, download, and sync.

warning

Self-loop on sync service does not explain sync propagation

The sync service points to itself, which does not clarify how device changes are detected, stored, and delivered across devices. For automatic sync, the design should show a real flow between client agents, metadata/state storage, and the sync backend. Replace the self-loop with explicit interactions such as sync service reading/writing file state in PostgreSQL and notifying or polling clients for changes.
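
One possible shape for that flow is a per-user change log with a cursor per device, sketched below in memory. SyncService and its methods are hypothetical; in the design under review, the log and cursors would live in the metadata store rather than in process memory.

```python
from collections import defaultdict


class SyncService:
    def __init__(self):
        # user_id -> ordered list of (seq, origin_device_id, file_id)
        self.log = defaultdict(list)
        # (user_id, device_id) -> highest seq already delivered to that device
        self.cursors = defaultdict(int)

    def record_change(self, user_id: str, device_id: str, file_id: str) -> int:
        """Called after a device's upload, update, or delete is committed."""
        seq = len(self.log[user_id]) + 1
        self.log[user_id].append((seq, device_id, file_id))
        return seq

    def poll(self, user_id: str, device_id: str) -> list:
        """Return file ids changed by *other* devices since the last poll."""
        last = self.cursors[(user_id, device_id)]
        # seq is the 1-based position in the log, so entries after `last`
        # start at list index `last`.
        entries = self.log[user_id][last:]
        if entries:
            self.cursors[(user_id, device_id)] = entries[-1][0]
        return [fid for _, origin, fid in entries if origin != device_id]
```

Polling is shown for simplicity; the same log could drive push notifications so that idle devices do not need to poll frequently.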

warning

No redundancy or horizontal scaling is shown for critical backend components

Given the stated assumption of 100M DAU, single instances of Upload service, File reader service, sync service, and PostgreSQL are a major operational risk. Even at high level, the design should indicate replicated stateless services behind the load balancer and a highly available data/storage layer. Fix by showing service fleets, multi-AZ deployment, and a replicated or sharded metadata store appropriate for the expected scale.

info

Metadata store choice may become a bottleneck at stated scale

A single PostgreSQL box is unlikely to comfortably handle metadata and sync state for 100M DAU, especially with frequent file change tracking across devices. The separation of metadata is correct, but the design should at least acknowledge partitioning, replicas, or a more horizontally scalable metadata architecture if this assumption is to be met.
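
As a minimal illustration of partitioning, metadata could be routed to a shard by hashing the owning user's id, which keeps one user's files and sync state together. shard_for and the choice of user id as the shard key are assumptions, not something the design specifies.

```python
import hashlib


def shard_for(user_id: str, num_shards: int) -> int:
    """Map a user to a metadata shard using a stable hash.

    A cryptographic hash is used instead of Python's built-in hash(), which
    is randomized per process for strings and so is not stable across hosts.
    """
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    digest = hashlib.sha256(user_id.encode()).digest()
    return int.from_bytes(digest[:8], "big") % num_shards
```

Plain modulo sharding moves most keys when the shard count changes, so a design at this scale would likely pair it with consistent hashing or a shard directory, plus read replicas per shard.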
