Reviewed by 6 specialized AI reviewers. Explore the diagram and the full per-section feedback below.
The candidate shows strong architectural intuition and correctly identifies the primary bottlenecks of a social media feed. The design is logically sound, and the identified issues are common growth-stage challenges that the candidate is well-positioned to address with further refinement.
Prioritization of Availability over Consistency
You correctly identified that eventual consistency is acceptable for a social feed, which lets the system stay highly available and resilient under heavy load.
Clear Latency Targets
Setting specific p95 latency targets for both feed generation and media rendering provides a clear benchmark for evaluating the performance of the proposed architecture.
Missing Consistency Strategy for 'Read-Your-Writes'
While you prioritized availability, you also specified a requirement for 'read latest write for the person writing the data'. You need to define how you will achieve this (e.g., session affinity or read-from-primary for the author) without violating your availability-first constraint.
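One way to reason about the read-from-primary option is a small router that pins only recent writers to the primary for a short window. This is a minimal sketch, not the candidate's design: `ReadRouter`, the 5-second window, and the `"primary"`/`"replica"` labels are all hypothetical names chosen for illustration.

```python
import time

# Hypothetical window during which an author's reads are pinned to the primary.
READ_YOUR_WRITES_WINDOW_S = 5.0


class ReadRouter:
    """Routes reads to the primary only for users who wrote recently."""

    def __init__(self, window_s=READ_YOUR_WRITES_WINDOW_S, clock=time.monotonic):
        self._window_s = window_s
        self._clock = clock
        self._last_write_at = {}  # user_id -> time of that user's latest write

    def record_write(self, user_id):
        """Remember when this user last wrote."""
        self._last_write_at[user_id] = self._clock()

    def choose_target(self, reader_id):
        """Return 'primary' for a recent writer, 'replica' for everyone else."""
        last = self._last_write_at.get(reader_id)
        if last is not None and self._clock() - last < self._window_s:
            return "primary"
        return "replica"
```

Only the author pays the stricter read path, and only briefly, so the availability-first behavior stays in place for all other readers. If the primary is unreachable, falling back to a replica would keep the feed available at the cost of a possibly stale view for the author.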
Lack of Scalability Strategy for Celebrity Accounts
With 500M DAU and celebrity accounts that may have millions of followers, a standard push-based feed model suffers from heavy fan-out: a single post from such an account triggers one feed write per follower. You should specify how you plan to handle this write amplification for high-follower accounts while keeping your <500ms latency target.
Core Domain Modeling
The identified entities (Users, Posts, Followers, Media) correctly capture the fundamental nouns required to support the functional requirements of posting, following, and feed generation.
Entity naming consistency
While the entities are correct, consider using singular nouns (e.g., User, Post, Follower, Medium) to align with standard domain modeling practices for entity definitions.
Logical Fanout Estimation
You correctly identified fanout write amplification (500 followers per post) as the primary driver of write throughput, which is a solid way to locate the system's bottleneck.
Missing Celebrity/Hotspot Handling
The calculation assumes a uniform 500 followers per user. With millions of followers for celebrities, a simple fanout will cause massive write amplification spikes. You should calculate the impact of 'celebrity' accounts separately to avoid system failure during high-profile posts.
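The separate celebrity calculation can be sketched as follows. The 100M posts/day and 500 followers per post come from the design; the 10M-follower account is a hypothetical figure used only to show the shape of the calculation, and the function names are illustrative.

```python
SECONDS_PER_DAY = 86_400


def baseline_fanout_per_sec(posts_per_day, avg_followers):
    """Average feed writes per second under the uniform-follower assumption."""
    return posts_per_day / SECONDS_PER_DAY * avg_followers


def burst_in_baseline_seconds(celebrity_followers, baseline_per_sec):
    """How many seconds of average fan-out load a single celebrity post equals."""
    return celebrity_followers / baseline_per_sec


# Figures from the design: 100M posts/day, 500 followers per post.
baseline = baseline_fanout_per_sec(100_000_000, 500)

# Hypothetical celebrity with 10M followers (not a figure from the design).
burst = burst_in_baseline_seconds(10_000_000, baseline)
```

Under these assumptions, one such post adds roughly 17 seconds' worth of the system's entire average fan-out load in a single burst, which is why it deserves its own line in the estimate.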
Inconsistent QPS Calculation
You calculated 100M posts/day, which equates to ~1,157 writes/sec (100M / 86,400), but used 1.2K in your formula. The figures are close, but the rounding overstates throughput by about 3.7%, and that error carries into every downstream storage and network bandwidth estimate. Keep the base throughput math precise.
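The effect of the rounding can be checked with a few lines of arithmetic. This sketch assumes the fan-out estimate multiplies posts per second by the 500 followers per post stated in the design; the variable names are illustrative.

```python
SECONDS_PER_DAY = 86_400
POSTS_PER_DAY = 100_000_000
AVG_FOLLOWERS = 500

exact_posts_per_sec = POSTS_PER_DAY / SECONDS_PER_DAY   # about 1,157.4
rounded_posts_per_sec = 1_200                           # the figure used in the design

exact_fanout = exact_posts_per_sec * AVG_FOLLOWERS      # about 578,704 feed writes/sec
rounded_fanout = rounded_posts_per_sec * AVG_FOLLOWERS  # 600,000 feed writes/sec

# Relative overestimate introduced by the rounding; any estimate that
# multiplies the base rate inherits the same relative error.
overestimate = (rounded_fanout - exact_fanout) / exact_fanout
```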
Storage Calculation Transparency
The 190 PB/year figure is provided without a breakdown of average media size (photo vs. video). Explicitly stating the assumed average size per media type would make the capacity projection more verifiable.
Use of Pre-signed URLs
Offloading media uploads directly to object storage via pre-signed URLs is a standard, efficient pattern for high-scale media applications to reduce load on the application servers.
Pagination implementation
Including offset and limit parameters in the feed endpoint demonstrates an understanding of the necessity for pagination when handling large datasets.
Inconsistent REST resource naming
The endpoint 'POST /followers/:followerId' is semantically confusing. In REST, the resource being acted upon should be the target. It should be 'POST /users/:userId/follow' or 'POST /follows' with the target user ID in the body.
Missing authentication context
The API design does not account for an authenticated user context (e.g., via headers or tokens). The feed and follow endpoints must implicitly know 'who' is performing the action.
Offset-based pagination limitations
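For a feed with 500M DAU, offset-based pagination degrades as offsets grow, because the store must skip past every earlier row on each request. It also returns duplicate items when new posts arrive between page requests and shift the offsets. Consider cursor-based pagination (e.g., using a timestamp or post ID) for better scalability.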
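A minimal in-memory sketch of cursor-based pagination is shown here. It assumes post IDs increase with creation time; the `Post` type and `feed_page` function are hypothetical and stand in for whatever the feed endpoint would actually query.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Post:
    post_id: int    # assumed to increase with creation time
    author_id: str


def feed_page(posts_newest_first, limit, cursor=None):
    """Return (page, next_cursor); cursor is the last post_id the client has seen."""
    if cursor is None:
        candidates = posts_newest_first
    else:
        # Only posts strictly older than the cursor are eligible, so posts
        # created after the first page was served cannot shift the window.
        candidates = [p for p in posts_newest_first if p.post_id < cursor]
    page = candidates[:limit]
    # A short page means the feed is exhausted; a full page yields a cursor.
    next_cursor = page[-1].post_id if len(page) == limit else None
    return page, next_cursor
```

Because the client sends back the last ID it saw rather than a position, a post published between two requests appears only on a refresh from the top and never causes an already-seen item to repeat.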
Incomplete POST /posts response
The API definition should specify the structure of the returned 'Post' object to ensure clients can immediately render the created content without an additional GET request.
Efficient Media Handling
Using SAS URLs (Azure's equivalent of pre-signed URLs) for direct client-to-storage uploads is an excellent choice for a system with 100M posts/day, as it offloads heavy traffic from your application servers.
Fan-out Architecture
Utilizing a message queue (Kafka) to decouple post creation from feed generation is the correct pattern for handling the fan-out requirements at this scale.
Celebrity Fan-out Bottleneck
The design uses a push-based fan-out for all users. With 500M DAU and celebrity accounts having millions of followers, this will cause 'hot partitions' and massive write amplification. You need a hybrid approach: push for normal users and pull-on-demand for celebrities.
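The hybrid approach can be sketched in memory as follows. This is an illustration under stated assumptions, not the candidate's implementation: the `HybridFeed` class, the follower-count threshold of 10,000, and the use of increasing post IDs as a recency order are all hypothetical. In the real design, Kafka, Redis, and Cassandra would replace the dictionaries.

```python
from collections import defaultdict

CELEBRITY_THRESHOLD = 10_000  # hypothetical follower-count cutoff


class HybridFeed:
    """Push posts to followers of normal accounts; pull celebrity posts at read time."""

    def __init__(self, followers, threshold=CELEBRITY_THRESHOLD):
        self._followers = followers                 # author_id -> set of follower ids
        self._threshold = threshold
        self._pushed = defaultdict(list)            # user_id -> precomputed post ids
        self._celebrity_posts = defaultdict(list)   # author_id -> post ids

    def _is_celebrity(self, author_id):
        return len(self._followers.get(author_id, ())) >= self._threshold

    def publish(self, author_id, post_id):
        if self._is_celebrity(author_id):
            # One write, regardless of follower count.
            self._celebrity_posts[author_id].append(post_id)
        else:
            # Fan-out on write: one feed entry per follower.
            for follower_id in self._followers.get(author_id, ()):
                self._pushed[follower_id].append(post_id)

    def read_feed(self, user_id, following, limit=20):
        """Merge the precomputed feed with posts pulled from followed celebrities."""
        merged = list(self._pushed.get(user_id, []))
        for author_id in following:
            if self._is_celebrity(author_id):
                merged.extend(self._celebrity_posts.get(author_id, []))
        # Post ids are assumed to increase with time, so sort newest first.
        return sorted(merged, reverse=True)[:limit]
```

The trade-off is that reads for users who follow celebrities do a small merge at request time, in exchange for removing the per-follower write burst on every celebrity post. An account that crosses the threshold over time would need a migration rule, which this sketch leaves out.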
Inconsistent Data Flow
The connection 'S3/Azure Blob -> CassandraDB [Upload complete]' is problematic. S3 should not trigger database writes directly. The 'Media upload service' or a dedicated 'Post completion' worker should handle the database record creation once the upload is verified.
Feed Cache Strategy
While you mentioned Redis, ensure the Feed Service implements a 'pre-computed feed' strategy for active users to ensure low-latency reads, rather than querying Cassandra directly for every feed request.