Reviewed by 6 specialized AI reviewers. Explore the diagram and the full per-section feedback below.
The architecture shows several sound instincts, especially separating metadata from file content and uploading directly to blob storage, which is a positive mid-level signal. But the missing end-to-end flow details for upload and download, the weak API contracts for large files, and the lack of a credible scaling treatment against the stated 100M DAU assumption leave too many gaps in correctness and completeness.
Consistency tradeoff is explicitly stated
The design correctly calls out that the system can relax consistency in favor of availability, which is a reasonable NFR direction for cross-device file sync at large scale where temporary staleness is often acceptable.
Resumability is tied to fault tolerance
Mentioning restart/resume behavior for uploads and downloads shows awareness of a key reliability requirement for large files and unstable networks, which is important for user experience and operational robustness.
NFRs are too vague and not measurable
Terms like 'available most of the times', 'highly scalable', and 'low latency' are directionally correct but not actionable. Add concrete targets such as availability SLA (for example 99.9% or 99.99%), sync propagation latency (for example p95 under a few seconds), and upload/download success targets so the design can be evaluated against the stated 100M DAU assumption.
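One way to make an availability target concrete is to convert it into an allowed-downtime budget. The sketch below does that arithmetic for the two example SLAs mentioned above; the 30-day month and the function name are assumptions made for illustration, not part of the reviewed design.

```python
# Sketch: convert an availability target into an allowed-downtime budget.
# Assumption: a 30-day month is used as the measurement window.

MINUTES_PER_30_DAY_MONTH = 30 * 24 * 60  # 43,200 minutes


def downtime_budget_minutes(availability_pct: float) -> float:
    """Minutes of downtime allowed per 30-day month at the given target."""
    return MINUTES_PER_30_DAY_MONTH * (1 - availability_pct / 100)


for target in (99.9, 99.99):
    budget = downtime_budget_minutes(target)
    print(f"{target}% availability -> {budget:.2f} min/month of downtime")
```

At 99.9% the budget is about 43 minutes per month, and at 99.99% it is about 4 minutes, which is why the choice between them has real design consequences.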
Consistency model is underspecified
Saying the system can trade some consistency for availability is a good start, but it should define what consistency users should expect. For example, specify eventual consistency for cross-device sync, read-after-write behavior for the uploading device, and conflict handling expectations when multiple devices update the same file.
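As one possible way to state the conflict-handling expectation, the sketch below uses a version check: a device submits the version it last saw, and the server rejects the write as a conflict if another device has updated the file since. The `FileState` type and `apply_update` function are hypothetical names for illustration, and other policies (for example keeping both copies) are equally valid.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class FileState:
    """Hypothetical server-side state for one file."""
    version: int
    checksum: str


def apply_update(
    current: FileState, base_version: int, new_checksum: str
) -> Tuple[FileState, bool]:
    """Apply an update only if the device saw the latest version.

    Returns (resulting_state, conflict). On conflict the state is unchanged
    and the client is expected to resolve it, e.g. by saving a conflicted copy.
    """
    if base_version != current.version:
        return current, True
    return FileState(current.version + 1, new_checksum), False


state = FileState(version=1, checksum="aaa")
state, conflict_a = apply_update(state, base_version=1, new_checksum="bbb")
state, conflict_b = apply_update(state, base_version=1, new_checksum="ccc")
print(state, conflict_a, conflict_b)
```

In this run the first device succeeds and moves the file to version 2, while the second device, still based on version 1, is told it has a conflict.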
Availability scope should be broken down by operation
Uploads, downloads, metadata operations, and sync notifications often have different availability expectations. Separating these helps make the NFRs more precise and easier to validate, especially for a large-scale storage system.
Covers the main storage nouns
The design lists the core domain entities needed for a file storage and sync system: Client, File, File Metadata, and Directory. These are relevant to upload, download, and organizing synced content.
User/account entity is missing
A user-facing remote storage system needs a User or Account entity as a primary domain noun, since uploads, downloads, and cross-device sync all happen in the context of ownership and identity. Add a User/Account entity and relate files/directories/clients to it.
Sync-specific entity is not represented
Automatic sync across devices usually requires a domain concept for tracking synchronization state, such as Device, Sync Session, or Change/Version. Client is close, but as written it is ambiguous and does not clearly capture the sync relationship between a user's devices and stored files. Add an explicit sync-relevant noun, or rename Client to Device if that is the intended entity.
Basic traffic numbers are incomplete
You provided DAU and a daily ingest estimate, which is a useful start, but the section is missing rough QPS/throughput calculations. For this problem, convert the daily numbers into writes per second, read/download rate assumptions, and network bandwidth. For example, 100M uploads/day is about 1.2K uploads/sec on average before peak factors, and 10PB/day implies roughly 116 GB/s (about 0.9 Tbps) of sustained average ingress, taking 1PB as 10^15 bytes. Adding average-to-peak multipliers would make the sizing much more methodical.
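The conversion from daily totals to per-second rates can be sketched as below. The upload count and ingest volume come from the example above; the peak factor of 3 is an assumption chosen only to illustrate the average-to-peak step, and a petabyte is taken as 10^15 bytes.

```python
# Sketch: turn daily totals into average and peak per-second rates.

SECONDS_PER_DAY = 86_400
PB = 10**15  # decimal petabyte, in bytes

uploads_per_day = 100_000_000   # from the example: 100M uploads/day
ingest_bytes_per_day = 10 * PB  # from the example: 10PB/day
peak_factor = 3                 # assumption, not stated in the design

avg_upload_qps = uploads_per_day / SECONDS_PER_DAY
avg_ingress_gb_per_s = ingest_bytes_per_day / SECONDS_PER_DAY / 10**9
avg_ingress_tbps = avg_ingress_gb_per_s * 8 / 1000

print(f"average uploads/sec: {avg_upload_qps:,.0f}")
print(f"peak uploads/sec:    {avg_upload_qps * peak_factor:,.0f}")
print(f"average ingress:     {avg_ingress_gb_per_s:,.1f} GB/s "
      f"({avg_ingress_tbps:.2f} Tbps)")
print(f"peak ingress:        {avg_ingress_gb_per_s * peak_factor:,.1f} GB/s")
```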
Read capacity is too vague for auto-sync workload
Saying 'more read compared to write' is directionally reasonable, but at this scale it is not enough to validate capacity. Automatic sync across devices can multiply reads/downloads per uploaded file, so you should estimate a read:write ratio or average number of synced devices per user and derive download QPS and egress bandwidth. Without that, the design cannot be sized confidently against the stated 100M DAU assumption.
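A rough way to derive download load from upload load is to assume every upload is fetched once by each of the user's other devices. The sketch below applies that model; the three-devices-per-user figure and the function name are assumptions for illustration, and the model ignores shared folders and manual downloads.

```python
from typing import Tuple


def sync_fanout(
    upload_qps: float, ingress_gb_per_s: float, devices_per_user: int
) -> Tuple[float, float]:
    """Estimate download QPS and egress caused purely by auto-sync.

    Assumption: each uploaded file is downloaded once by every other
    device belonging to the same user.
    """
    other_devices = max(devices_per_user - 1, 0)
    return upload_qps * other_devices, ingress_gb_per_s * other_devices


download_qps, egress_gb_per_s = sync_fanout(1_200, 116, devices_per_user=3)
print(f"sync downloads/sec: {download_qps:,.0f}")
print(f"sync egress:        {egress_gb_per_s:,.0f} GB/s")
```

Under these assumptions sync alone doubles the upload rate on the read side, which gives a defensible lower bound for the read:write ratio.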
Storage growth is only partially quantified
The 10PB/day figure is in the right order of magnitude given your assumptions, but the calculation stops too early. You should also project retained storage over time, including replication/erasure-coding overhead and metadata growth. For example, at 10PB/day, raw ingest alone is about 300PB per month and about 3.65EB per year before any replication overhead, so showing monthly or yearly storage growth would better demonstrate that the numbers are realistic.
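The retained-storage projection can be sketched as a simple multiplication. The overhead factors below (1.5x for an erasure-coding scheme, 3x for triple replication) are illustrative assumptions, and the model assumes nothing is ever deleted or deduplicated.

```python
def retained_storage_pb(
    days: int, ingest_pb_per_day: float = 10, overhead: float = 1.5
) -> float:
    """Physical storage in PB after `days` of ingest with no deletion.

    `overhead` is the physical-to-logical ratio (assumed values:
    about 1.5 for erasure coding, 3.0 for triple replication).
    """
    return days * ingest_pb_per_day * overhead


for label, days in (("1 month", 30), ("1 year", 365)):
    ec = retained_storage_pb(days, overhead=1.5)
    rep = retained_storage_pb(days, overhead=3.0)
    print(f"{label}: {ec:,.0f} PB erasure-coded, {rep:,.0f} PB triple-replicated")
```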
Upload API is too underspecified for large files
A single POST /files with an inline File payload is not a solid fit for files up to 50 GB. Large uploads need an explicit protocol such as multipart/chunked upload or a create-upload-session flow (for example: POST /files to create metadata and upload session, PUT /files/{id}/parts/{partNumber} for chunks, then POST /files/{id}/complete). Without this, retries, resumability, and partial failure handling are unclear.
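The create-session, upload-parts, complete flow described above can be sketched as server-side session state. This is a minimal in-memory illustration, not a real storage API: the `UploadSession` class and its methods are hypothetical, and a real service would persist the state and write the bytes to blob storage.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class UploadSession:
    """Hypothetical state behind POST /files and the part/complete routes."""
    file_id: str
    total_parts: int
    parts: Dict[int, str] = field(default_factory=dict)  # part number -> sha256

    def put_part(self, part_number: int, data: bytes) -> str:
        """Record one chunk. Re-sending a part overwrites it, so retries are safe."""
        if not 1 <= part_number <= self.total_parts:
            raise ValueError(f"part {part_number} out of range")
        digest = hashlib.sha256(data).hexdigest()
        self.parts[part_number] = digest
        return digest

    def missing_parts(self) -> List[int]:
        """Parts the client still has to send; this is what makes resume possible."""
        return [n for n in range(1, self.total_parts + 1) if n not in self.parts]

    def complete(self) -> List[str]:
        """Finalize only when every part has arrived; return checksums in order."""
        missing = self.missing_parts()
        if missing:
            raise RuntimeError(f"cannot complete, missing parts: {missing}")
        return [self.parts[n] for n in range(1, self.total_parts + 1)]


session = UploadSession(file_id="file-1", total_parts=3)
session.put_part(1, b"chunk one")
session.put_part(3, b"chunk three")
print("after interruption, still missing:", session.missing_parts())
session.put_part(2, b"chunk two")
print("completed with", len(session.complete()), "parts")
```

After an interruption the client asks which parts are missing and sends only those, instead of restarting a 50 GB transfer from byte zero.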
Download endpoint does not define how large file transfer works
GET /files/:id -> File is too vague for large downloads. For big files and multi-device sync, the API should clearly support streaming or ranged downloads, such as GET /files/{id}/content with HTTP Range support, or returning a signed download URL. This makes resume-after-interruption and partial reads possible.
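To illustrate how ranged downloads enable resume, the sketch below parses a single-range HTTP `Range` header into inclusive byte offsets. It is a simplified illustration: multi-range requests are deliberately rejected, and the function name is hypothetical.

```python
from typing import Optional, Tuple


def parse_byte_range(header: str, size: int) -> Optional[Tuple[int, int]]:
    """Parse 'bytes=start-end' into inclusive (start, end) offsets.

    Returns None when the header is malformed, uses multiple ranges,
    or cannot be satisfied for an object of `size` bytes.
    """
    if not header.startswith("bytes=") or "," in header:
        return None
    start_s, sep, end_s = header[len("bytes="):].partition("-")
    if not sep:
        return None
    try:
        if start_s == "":
            # Suffix form 'bytes=-N': the last N bytes of the object.
            length = int(end_s)
            if length <= 0:
                return None
            start, end = max(size - length, 0), size - 1
        else:
            start = int(start_s)
            end = int(end_s) if end_s else size - 1
    except ValueError:
        return None
    end = min(end, size - 1)  # clamp to the end of the object
    if start < 0 or start > end:
        return None
    return start, end


# A client that already holds the first 500 bytes of a 1000-byte file
# resumes by asking for everything from offset 500 onward.
print(parse_byte_range("bytes=500-", size=1000))
```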
Sync changes endpoint does not cover practical sync behavior
GET /files?changed_since=<timestamp> is a reasonable starting point, but returning File[] suggests full file contents rather than change metadata. For sync, the endpoint should return file metadata and change state only (id, version, modified_at, deleted/tombstone status, size, checksum), so clients can decide what to upload or download. This is especially important at the stated scale and with large files.
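A metadata-only change feed along these lines could look like the sketch below. The field names follow the list above; the `Change` type, the monotonically increasing `seq` cursor (used here instead of a timestamp), and the `changes_since` function are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Change:
    """One change record: metadata only, never file contents."""
    seq: int            # assumed monotonically increasing change number
    file_id: str
    version: int
    size: int
    checksum: str
    deleted: bool = False  # tombstone marker


def changes_since(
    log: List[Change], cursor: int, limit: int = 100
) -> Tuple[List[Change], int]:
    """Return up to `limit` changes after `cursor`, plus the next cursor.

    Assumes `log` is ordered by `seq`.
    """
    page = [c for c in log if c.seq > cursor][:limit]
    next_cursor = page[-1].seq if page else cursor
    return page, next_cursor


log = [
    Change(1, "a.txt", version=1, size=10, checksum="c1"),
    Change(2, "b.txt", version=1, size=20, checksum="c2"),
    Change(3, "a.txt", version=2, size=0, checksum="", deleted=True),
]
page, cursor = changes_since(log, cursor=1)
for change in page:
    action = "delete locally" if change.deleted else "download if checksum differs"
    print(change.file_id, "v", change.version, "->", action)
print("next cursor:", cursor)
```

Because the tombstone for `a.txt` is delivered like any other change, a device that was offline during the delete can still converge.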
Missing update/delete operations for the primary file entity
The routes cover create and read, but not the rest of CRUD for files. Since sync requires tracking remote changes across devices, the API should include at least metadata update/versioning semantics and delete support (for example DELETE /files/{id}, possibly PATCH /files/{id} for metadata). Without delete/tombstone behavior, clients cannot fully converge state during sync.
Route structure mixes listing changes with file listing semantics
Using GET /files?changed_since=... is acceptable, but it overloads the base collection route without clearly separating 'list files' from 'list changes'. A cleaner structure would be something like GET /changes?since=token or GET /files/changes?since=token. This makes the sync contract easier to understand and evolve.
Direct upload path to blob storage
Having the upload agent send file contents directly to S3/Azure Blob is a solid high-level choice for large files. It avoids routing multi-GB payloads through the application tier, which improves scalability and reduces pressure on the upload service.
Metadata separated from file content
Storing file metadata in PostgreSQL while keeping file bytes in object storage is a correct architectural split. This keeps transactional metadata operations separate from bulk blob storage and supports upload/download flows cleanly.
Dedicated sync path is identified
Introducing a sync agent on the client side and a sync service on the backend shows awareness that automatic file sync is a distinct workflow from simple upload/download and needs its own components.
Download path does not reach blob storage
The file reader service reads metadata from PostgreSQL, but there is no connection from the reader service or client to S3/Azure Blob for actually fetching file contents. As drawn, users can discover file metadata but cannot complete file downloads end-to-end. Fix by adding a path where the file reader service returns a signed blob URL or streams the object from blob storage to the client.
Upload flow is not connected end-to-end
The upload service writes metadata to PostgreSQL, and the upload agent uploads directly to blob storage, but there is no connection showing how the client obtains the upload URI or how the upload service coordinates with the upload agent. Without that linkage, the upload flow is incomplete. Fix by connecting client -> SSL termination -> upload service, then upload service -> blob storage presign/URI generation, followed by client upload agent -> blob storage and a completion callback/metadata update.
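The missing linkage, where the upload service hands the client a time-limited upload URI that the blob store can verify, can be sketched with a simple HMAC signature. This is only an illustration of the idea: real providers such as S3 and Azure Blob use their own signing schemes, and the secret, host name, and function names here are hypothetical.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlencode, urlparse

SECRET = b"demo-secret"  # hypothetical key shared by upload service and blob store


def _signature(object_key: str, expires_at: int) -> str:
    message = f"PUT\n{object_key}\n{expires_at}".encode()
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()


def sign_upload_url(object_key: str, expires_at: int) -> str:
    """Upload service side: issue a URL the client can PUT to directly."""
    query = urlencode({"expires": expires_at, "sig": _signature(object_key, expires_at)})
    return f"https://blob.example.com/{object_key}?{query}"


def verify_upload_url(url: str, now: int) -> bool:
    """Blob store side: accept only unexpired, untampered URLs."""
    parsed = urlparse(url)
    query = parse_qs(parsed.query)
    try:
        expires_at = int(query["expires"][0])
        sig = query["sig"][0]
    except (KeyError, ValueError):
        return False
    object_key = parsed.path.lstrip("/")
    expected = _signature(object_key, expires_at)
    return now <= expires_at and hmac.compare_digest(sig, expected)


url = sign_upload_url("users/42/file-1", expires_at=1_000)
print(verify_upload_url(url, now=999))                                # valid
print(verify_upload_url(url, now=1_001))                              # expired
print(verify_upload_url(url.replace("file-1", "file-2"), now=999))    # tampered
```

The same pattern applies to downloads: the reader service can return a signed URL so that file bytes never pass through the application tier.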
Several client components are orphaned from the backend
Mobile App, Website, and Desktop are shown, but only the sync agent is connected to SSL termination. There is no logical path from these user-facing clients to upload or download APIs, so the main file operations are not fully represented. Fix by explicitly connecting each client entry point, or a shared client layer, to the load balancer/API path for upload, download, and sync.
Self-loop on sync service does not explain sync propagation
The sync service points to itself, which does not clarify how device changes are detected, stored, and delivered across devices. For automatic sync, the design should show a real flow between client agents, metadata/state storage, and the sync backend. Replace the self-loop with explicit interactions such as sync service reading/writing file state in PostgreSQL and notifying or polling clients for changes.
No redundancy or horizontal scaling is shown for critical backend components
Given the stated assumption of 100M DAU, single instances of Upload service, File reader service, sync service, and PostgreSQL are a major operational risk. Even at high level, the design should indicate replicated stateless services behind the load balancer and a highly available data/storage layer. Fix by showing service fleets, multi-AZ deployment, and a replicated or sharded metadata store appropriate for the expected scale.
Metadata store choice may become a bottleneck at stated scale
A single PostgreSQL instance is unlikely to handle metadata and sync state for 100M DAU, especially with frequent file change tracking across devices. The separation of metadata is correct, but the design should at least acknowledge partitioning, read replicas, or a more horizontally scalable metadata architecture if this assumption is to be met.
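One common partitioning approach is to shard metadata by user, so that all of a user's files and sync state live on the same shard. The sketch below shows hash-based shard selection; the shard count and function name are assumptions for illustration, and plain modulo hashing moves most keys when the shard count changes, so a production design would likely use consistent hashing or a lookup directory instead.

```python
import hashlib


def shard_for_user(user_id: str, num_shards: int) -> int:
    """Map a user to a metadata shard using a stable hash.

    Uses SHA-256 rather than Python's built-in hash(), which is
    randomized per process and therefore not stable across servers.
    """
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


NUM_SHARDS = 64  # assumption for illustration
for user in ("user-1", "user-2", "user-3"):
    print(user, "-> shard", shard_for_user(user, NUM_SHARDS))
```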