Blob Storage + Presigned URLs
Let clients upload and download large files directly to and from object storage, bypassing your servers.
Large binary files — images, videos, PDFs, backups — do not belong in your database or in your application server's memory. Store them in object storage (S3, Azure Blob, GCS) and let clients upload and download the bytes directly using short-lived presigned URLs. Your servers stay on the thin coordination path; the heavy data never touches them.
The problem: your server becomes the bottleneck
Imagine the "obvious" design where a user uploads a 2 GB video by POSTing it to your API, and your API forwards it to S3:
client ──(2 GB)──▶ app server ──(2 GB)──▶ S3
                       │
                       └─ holds the whole stream while it relays
Even though it "works", this design quietly destroys your fleet:
- Bandwidth doubling: every uploaded byte is received and re-sent by your server, so a 2 GB upload moves ~4 GB through the server's network interface (2 GB in, 2 GB out). NICs saturate and bandwidth costs climb fast.
- Memory & connection pressure: a slow client on a phone can hold a server thread/connection open for minutes. A few thousand concurrent uploads exhaust your connection pool and RAM, and requests for other endpoints start timing out.
- Head-of-line blocking: the box is busy shovelling bytes instead of serving cheap API calls. One viral upload event can choke an entire service.
- It doesn't scale horizontally for free: you'd have to over-provision app servers purely to act as a dumb pipe, which is work S3 already does, globally, for cheap.
Presigned URLs: how the offload works
A presigned URL is a normal object-store URL with a cryptographic signature baked into the query string. Your server holds the secret credentials and signs a request on the client's behalf; the signature encodes exactly one operation (e.g. PUT this one key), an expiry, and often constraints like max size or content type. The client then talks straight to S3 — S3 verifies the signature and serves the request as if your server made it.
PUT https://bucket.s3.amazonaws.com/uploads/u_42/clip.mp4
?X-Amz-Algorithm=AWS4-HMAC-SHA256
&X-Amz-Credential=... // which key signed it
&X-Amz-Date=20260610T070000Z
&X-Amz-Expires=300 // valid for 5 minutes
&X-Amz-SignedHeaders=host;content-type
&X-Amz-Signature=9f86d0... // HMAC over the above
- Least privilege: the URL grants exactly one verb on one key. It cannot list the bucket or touch other objects.
- Short TTL: minutes, not hours. A leaked URL expires quickly, and it can only do the one thing it was signed for.
- No client credentials: the browser/app never sees your S3 keys — only a disposable signature.
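To make the signing idea concrete, here is a minimal sketch of an HMAC-signed URL using only the Python standard library. It is deliberately a simplified scheme, not AWS Signature Version 4: the host `storage.example`, the `expires`/`sig` parameter names, and the `SECRET` value are all hypothetical. What it does share with a real presigned URL is the shape of the guarantee: the signature binds one verb, one key, and one expiry, and the storage side can verify it without the client ever holding the secret.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlsplit

SECRET = b"server-side-secret"  # hypothetical; stays on the server, never sent to clients


def _signature(method: str, path: str, expires_at: int) -> str:
    # The signature covers the verb, the exact object key, and the expiry.
    message = f"{method}\n{path}\n{expires_at}".encode()
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()


def presign(method: str, path: str, ttl_seconds: int, now: Optional[int] = None) -> str:
    """Return a URL that authorizes one verb on one key until it expires."""
    now = int(time.time()) if now is None else now
    expires_at = now + ttl_seconds
    query = urlencode({"expires": expires_at, "sig": _signature(method, path, expires_at)})
    return f"https://storage.example{path}?{query}"


def verify(method: str, url: str, now: Optional[int] = None) -> bool:
    """Storage-side check: recompute the signature, then check the expiry."""
    now = int(time.time()) if now is None else now
    parts = urlsplit(url)
    params = parse_qs(parts.query)
    try:
        expires_at = int(params["expires"][0])
        supplied = params["sig"][0]
    except (KeyError, ValueError):
        return False
    expected = _signature(method, parts.path, expires_at)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(supplied.encode(), expected.encode()) and now <= expires_at
```

A URL signed for `PUT` on `/uploads/u_42/clip.mp4` fails verification when replayed as a `GET`, when pointed at a different key, or after the TTL has passed, which is the least-privilege and short-TTL behavior listed above.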
The upload lifecycle: create → upload → finalize
A single direct upload creates a classic consistency gap: the file lands in S3, but your database doesn't automatically know about it — and a client can always drop off mid-upload. The fix is a three-step flow backed by an explicit state machine in your database.
1. POST /uploads                → server inserts row {status: "pending"},
                                  returns presigned PUT url (5-min TTL)
2. PUT bytes ─────────────▶ S3    (direct, bypasses app servers)
3. POST /uploads/:id/finalize   → server verifies the object exists,
                                  flips row to {status: "ready"}
The database state machine
Model the row's lifecycle explicitly so you never serve a half-written file and never leak storage:
| Status | Meaning | Set when |
|---|---|---|
| pending | Row reserved, key chosen, URL issued | On POST /uploads |
| uploaded | Bytes are in S3, not yet validated | On finalize / S3 event |
| ready | Validated & safe to serve | After size/type/scan checks pass |
| failed / orphaned | Upload never completed or failed checks | By a sweeper job after TTL |
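The table can be enforced in code so that no path skips validation. The sketch below is one possible encoding, assuming a single `failed` terminal state stands in for both failed and orphaned rows; the `UploadStatus` and `transition` names are illustrative, not part of any library.

```python
from enum import Enum


class UploadStatus(Enum):
    PENDING = "pending"
    UPLOADED = "uploaded"
    READY = "ready"
    FAILED = "failed"


# Allowed moves, following the table above; READY and FAILED are terminal.
ALLOWED = {
    UploadStatus.PENDING: {UploadStatus.UPLOADED, UploadStatus.FAILED},
    UploadStatus.UPLOADED: {UploadStatus.READY, UploadStatus.FAILED},
    UploadStatus.READY: set(),
    UploadStatus.FAILED: set(),
}


def transition(current: UploadStatus, target: UploadStatus) -> UploadStatus:
    """Return the new status, or raise if the move is not in the table."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target
```

In a real database the same guard is usually expressed as a conditional update (change the status only where it currently equals the expected value), so two concurrent workers cannot both advance the same row.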
Verifying completion — two ways
- Client-driven finalize: the client calls finalize; the server issues a HEAD on the object to confirm it exists and to read its real size/ETag before flipping to ready. Simple, but relies on the client making the call.
- Event-driven (more robust): configure an S3 event notification (→ SNS/SQS/Lambda) that fires on s3:ObjectCreated:*. A worker consumes the event and marks the row ready, so completion is recorded even if the client crashes right after the upload. This is the event-driven cousin of finalize.
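Both paths end in the same server-side step, which might look like the sketch below. It assumes a hypothetical `head_object` callable that wraps a HEAD request and returns `None` when the object is missing; `UploadRow` and `ObjectInfo` are illustrative types. Following the state table, it stops at `uploaded` and leaves the move to `ready` for the validation step.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ObjectInfo:
    size: int
    etag: str


@dataclass
class UploadRow:
    key: str
    status: str = "pending"
    size: Optional[int] = None
    etag: Optional[str] = None


def finalize(row: UploadRow, head_object: Callable[[str], Optional[ObjectInfo]]) -> UploadRow:
    """Confirm the object exists and record what storage reports about it."""
    if row.status != "pending":
        return row  # idempotent: client finalize and the storage event may both arrive
    info = head_object(row.key)  # hypothetical wrapper around a HEAD request
    if info is None:
        raise LookupError(f"no object at {row.key}; the upload did not complete")
    # Store the size and ETag storage reports, not values the client claimed.
    row.size, row.etag = info.size, info.etag
    row.status = "uploaded"
    return row
```

Making the function a no-op for rows that have already advanced lets the client-driven and event-driven paths coexist without racing each other.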
Don't treat what the client declared in POST /uploads as truth. Read the real size/ETag from S3 on finalize, re-check the content-type, and run a virus/content scan before marking ready. The presign can also be constrained: a presigned POST policy accepts a content-length-range condition and a Content-Type condition, so S3 rejects oversized or wrong-type uploads at the door.
Cleaning up orphans
Clients disappear: closed tabs, dead batteries, cancelled uploads. Rows stuck in pending past their TTL — and the stray S3 objects or incomplete multipart uploads behind them — are garbage. Two janitors keep storage clean:
- A periodic sweeper job deletes pending rows older than, say, 24h and removes any partial object.
- An S3 lifecycle rule auto-aborts incomplete multipart uploads after N days, so half-uploaded parts don't silently accrue cost.
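The sweeper's core logic is small. The sketch below assumes rows are plain dicts with `key`, `status`, and `created_at` fields and that `delete_object` is a hypothetical storage call that tolerates a missing key. It marks rows as `orphaned` to match the state table; deleting the row outright, as the bullet above describes, is an equally valid choice.

```python
from datetime import datetime, timedelta
from typing import Callable, Dict, List


def sweep_orphans(
    rows: List[Dict],
    now: datetime,
    delete_object: Callable[[str], None],
    max_age: timedelta = timedelta(hours=24),
) -> List[str]:
    """Mark stale pending rows as orphaned and delete any bytes they left behind."""
    cutoff = now - max_age
    swept = []
    for row in rows:
        if row["status"] == "pending" and row["created_at"] < cutoff:
            delete_object(row["key"])  # hypothetical storage call
            row["status"] = "orphaned"
            swept.append(row["key"])
    return swept
```

Rows that are still within the age limit, or that already reached another status, are left untouched, so the job is safe to run repeatedly.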
Multipart upload: big files, in parallel, resumable
A single PUT is fine up to a few GB, but it's all-or-nothing: lose the connection at 99% and you start over. Multipart upload splits one object into many independent parts that upload in parallel and can be retried or resumed individually. S3 stitches them back into one object at the end.
The multipart API surface
| Operation | What it does | Returns |
|---|---|---|
| CreateMultipartUpload | Opens a session for one key | An UploadId |
| UploadPart | Uploads one chunk (PartNumber + UploadId) | An ETag for that part |
| ListParts | Lists parts already received for an UploadId | Parts + their ETags |
| CompleteMultipartUpload | Assembles parts (ordered list of {PartNumber, ETag}) | The final object |
| AbortMultipartUpload | Discards the session and all its parts | — |
1. POST /uploads/multipart → server: CreateMultipartUpload
returns { uploadId, key }
2. For each 8 MB chunk:
GET /uploads/:id/part-url?n=3 → server presigns UploadPart url for part 3
PUT chunk ─────────────────▶ S3 → responds with ETag "a1b2..."
(parts 1..N upload in parallel; client collects each ETag)
3. POST /uploads/:id/complete
body: [{n:1,etag},{n:2,etag},...] → server: CompleteMultipartUpload
→ object assembled, row → ready
The mechanics that matter:
- Part size & count limits: each part is 5 MiB minimum (except the last) up to 5 GiB, with at most 10,000 parts per object (so up to a 5 TiB object). Pick a part size that keeps you under 10,000 parts for the largest file you support.
- ETags are the receipt: each UploadPart returns an ETag (typically the part's MD5). You must send the exact {PartNumber, ETag} list back in CompleteMultipartUpload; S3 validates it and rejects mismatches, guaranteeing integrity.
- Parallelism = throughput: uploading 6–10 parts at once saturates the client's uplink far better than one serial stream, and a single flaky part retries without restarting the whole file.
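The part-size rule is simple arithmetic, sketched below with the limits quoted above (5 MiB minimum, 5 GiB maximum, 10,000 parts). The function names and the 8 MiB default are illustrative choices, not part of any SDK.

```python
from typing import List, Tuple

MIB = 1024 ** 2
GIB = 1024 ** 3
MIN_PART = 5 * MIB
MAX_PART = 5 * GIB
MAX_PARTS = 10_000


def choose_part_size(file_size: int, preferred: int = 8 * MIB) -> int:
    """Smallest part size >= preferred that fits the file into MAX_PARTS parts."""
    needed = -(-file_size // MAX_PARTS)  # integer ceiling division
    part_size = max(preferred, needed, MIN_PART)
    if part_size > MAX_PART:
        raise ValueError("file is too large for 10,000 parts of at most 5 GiB each")
    return part_size


def plan_parts(file_size: int, part_size: int) -> List[Tuple[int, int, int]]:
    """Return (part_number, offset, length) tuples; part numbers start at 1."""
    count = -(-file_size // part_size)
    return [
        (n + 1, n * part_size, min(part_size, file_size - n * part_size))
        for n in range(count)
    ]
```

For a 2 GiB file the default 8 MiB part size already fits (256 parts). For a 1 TiB file the function raises the part size to roughly 105 MiB so the count stays at or under 10,000. Only the final part may be shorter than the chosen size.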
Resumable & reliable transfers
Connections die — Wi-Fi drops, trains enter tunnels, laptops sleep. The whole point of multipart (up) and Range requests (down) is that a dropped connection costs you one chunk, not the whole file.
Resuming an upload
The UploadId is the resume token. Persist it (client-side and/or in the pending row). After a disconnect, the client reconnects and asks the server to ListParts for that UploadId; S3 reports which parts already arrived (with their ETags). The client simply uploads the missing parts and then completes, as sketched in the flow below.
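The bookkeeping behind a resume is a set difference plus an ordered list. The sketch below assumes the ListParts result has been reduced to a list of dicts with `PartNumber` and `ETag` fields, matching the names used in this section; the helper names are hypothetical.

```python
from typing import Dict, List


def missing_parts(total_parts: int, listed: List[Dict]) -> List[int]:
    """Part numbers (1-based) that storage has not received yet."""
    present = {part["PartNumber"] for part in listed}
    return [n for n in range(1, total_parts + 1) if n not in present]


def completion_list(etags_by_part: Dict[int, str]) -> List[Dict]:
    """Ordered {PartNumber, ETag} list to send when completing the upload."""
    return [
        {"PartNumber": n, "ETag": etags_by_part[n]}
        for n in sorted(etags_by_part)
    ]
```

After re-uploading whatever `missing_parts` returns, the client merges the new ETags with the ones reported by ListParts and sends the sorted list to complete the upload.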
reconnect
→ ListParts(uploadId) // S3: parts 1,2,3,5 present
→ re-upload only part 4
→ CompleteMultipartUpload([1,2,3,4,5]) // done, no re-sending 4 GB
Resuming a download
Downloads resume with the HTTP Range header — no special API needed. The client tracks how many bytes it has written to disk; on reconnect it asks only for the rest:
GET /clip.mp4
Range: bytes=1048576- // "send me from 1 MiB onward"
206 Partial Content
Content-Range: bytes 1048576-2097151/2097152
Accept-Ranges: bytes // S3 advertises range support
The same Range mechanism powers video seeking and adaptive streaming: the player fetches only the byte ranges it needs to start playback.
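On the client, resuming a download comes down to building the Range header from the bytes already on disk and checking that the response really continues from that offset. The helpers below are a minimal sketch with hypothetical names; they handle only the single-range `bytes first-last/total` form shown above.

```python
import re
from typing import Dict, Optional, Tuple

_CONTENT_RANGE = re.compile(r"^bytes (\d+)-(\d+)/(\d+|\*)$")


def resume_range_header(bytes_on_disk: int) -> Dict[str, str]:
    """Ask for everything from the first byte we do not have yet."""
    return {"Range": f"bytes={bytes_on_disk}-"}


def parse_content_range(value: str) -> Tuple[int, int, Optional[int]]:
    """Parse 'bytes first-last/total'; total is None when the server sends '*'."""
    match = _CONTENT_RANGE.match(value.strip())
    if match is None:
        raise ValueError(f"unsupported Content-Range: {value!r}")
    first, last = int(match.group(1)), int(match.group(2))
    total = None if match.group(3) == "*" else int(match.group(3))
    return first, last, total


def can_append(status: int, content_range: Optional[str], bytes_on_disk: int) -> bool:
    """True only if the server honored the range and it starts where we stopped."""
    if status != 206 or content_range is None:
        return False  # a 200 means the full body is coming, so restart from zero
    first, _, _ = parse_content_range(content_range)
    return first == bytes_on_disk
```

Checking the status and the starting offset before appending matters because a server that ignores the Range header answers with a full 200 response, and appending that to a partial file would corrupt it.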
Serving downloads: presigned vs. CDN
Reads have the same "keep the server off the data path" goal, but caching changes the calculus:
| Mechanism | Best for | Caching |
|---|---|---|
| Presigned GET | Private, per-user files (invoices, your own backups) | Poor — the unique signature makes each URL uncacheable |
| CDN + signed cookies / signed URLs | Large or popular content (videos, public images) | Great — edge caches the object; signature gates access |
| Stable public URL behind CDN | Truly public assets (logos, CSS, thumbnails) | Best — fully cacheable, no signing overhead |
Trade-offs & gotchas
- CDN cacheability vs. privacy: a unique presigned URL per request can't be shared by a cache. Use signed cookies or stable URLs when you want edge caching.
- CORS: browser direct-to-S3 uploads need a CORS policy on the bucket allowing your origin and the PUT method and headers, a classic first-time stumbling block.
- Clock & TTL: signatures are time-bound. S3 checks the expiry when a request starts, so a transfer already in flight can finish, but a late start or a retry after expiry is rejected. Size the expiry to cover realistic delays and retries (and prefer multipart, where each part URL is short).
- Cost discipline: incomplete multipart uploads cost money until aborted — always set a lifecycle rule to reap them.
Key takeaways
- Proxying bytes through your app server doubles bandwidth and ties up memory/connections — take the server off the data path.
- Presigned URLs grant one short-lived, least-privilege operation so clients transfer bytes straight to/from object storage.
- Back uploads with a DB state machine (pending → uploaded → ready) and verify completion via finalize or an S3 event; sweep orphans.
- Multipart upload (CreateMultipartUpload → UploadPart → CompleteMultipartUpload, with AbortMultipartUpload/ListParts) gives parallelism, retries, and resume; ETags guarantee integrity.
- Resume uploads via ListParts + the UploadId; resume downloads via HTTP Range requests. A dropped connection costs one chunk, not the whole file.