Spotify / Music Streaming: system design by AgileViper46
Hire
Reviewed by 6 specialized AI reviewers.
This is my high-level diagram. Users interact with my load balancer/API gateway, which performs load balancing, rate limiting, and auth. Requests are then load balanced across my song service and the playlist service. Both services talk to PostgreSQL, which stores the song metadata. The actual song data, that is, the blob chunks, is stored in Azure Blob Storage. The song service can hand the user a presigned URL (a SAS URL), and with that URL the user can download or upload songs directly from or to the blob store.
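The idea behind a presigned URL can be sketched in a few lines. This is a minimal illustration of the general technique (an expiry plus an HMAC signature in the query string), not Azure's actual SAS format. The key, the `perm`/`exp`/`sig` parameter names, and the example host are all assumptions made for the sketch.

```python
import hashlib
import hmac
import time
from typing import Optional
from urllib.parse import parse_qs, urlencode, urlparse

SECRET_KEY = b"demo-secret"  # hypothetical signing key held by the song service


def _signature(path: str, permission: str, expires: int) -> str:
    # Sign the path, the granted permission, and the expiry together.
    payload = f"{path}\n{permission}\n{expires}".encode()
    return hmac.new(SECRET_KEY, payload, hashlib.sha256).hexdigest()


def sign_url(base_url: str, permission: str, ttl_seconds: int,
             now: Optional[float] = None) -> str:
    """Return base_url with permission, expiry, and signature appended."""
    current = time.time() if now is None else now
    expires = int(current + ttl_seconds)
    sig = _signature(urlparse(base_url).path, permission, expires)
    query = urlencode({"perm": permission, "exp": expires, "sig": sig})
    return f"{base_url}?{query}"


def verify_url(url: str, now: Optional[float] = None) -> bool:
    """What the blob store would check before serving or accepting bytes."""
    parsed = urlparse(url)
    params = parse_qs(parsed.query)
    try:
        permission = params["perm"][0]
        expires = int(params["exp"][0])
        sig = params["sig"][0]
    except (KeyError, ValueError):
        return False
    current = time.time() if now is None else now
    if current > expires:
        return False  # link has expired
    expected = _signature(parsed.path, permission, expires)
    return hmac.compare_digest(sig, expected)
```

The point of the design is that the song service only signs; the bytes themselves never pass through it, so it stays off the data path.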
Deep Dive
Now coming to the deep dive. For adaptive bitrate streaming, whenever a user wants to play a song, the request first goes to my CDN servers. Here is how it works: when you want to play a song, let's say song 123, the client requests the master file for it, which has the information about all the segments in that song, and then downloads the segments one after another. The adaptive part works like this: the client monitors network conditions on the user's mobile device or web app and works out the best bitrate for the music, for example 128 kbps or 256 kbps. It can request higher or lower quality depending on bandwidth, round-trip time, latency, and so on. Once it has the master file, it starts asking for the next set of chunks. If those chunks are not available in the CDN, the CDN pulls them from Azure Blob Storage and then serves the user. This ensures the data is cached in the CDN on the first request, and with multiple CDN servers the song information and the chunks are kept as close to the users as possible.

For uploads, the user talks to the song service. The song service creates an entry in SQL for the song's metadata, marks its status as created, creates a presigned URL for Azure Blob Storage, and returns it to the user. The user then uploads the song into the blob store part by part. I'm not going into detail because this is essentially a blob store design where you track part-by-part progress, and once every part is completed we say the song is completely uploaded. This enables resumable uploads and downloads. Once that is done, the client calls a final endpoint saying the upload has been completed, and the song service marks the song as uploaded.

After the song has been uploaded, the parts are stitched together in Azure Blob Storage into a complete song. Then a transcoding and encoding layer compresses it, breaks it into chunks for the different bitrates, and stores those chunks in the blob store. Once transcoding and encoding are complete, the song is ready to be streamed. For this we can have a queue: whenever we get an upload-complete status, CDC workers put an event into a Kafka queue, and the encoding servers take the event and do the encoding. Once the encoding is complete, they put another event into Kafka saying encoding is complete, and the song service can mark the status as completed.
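The client-side bitrate choice from the streaming part of this deep dive can be sketched roughly as follows. This is a simplified throughput-based heuristic, not a description of any real player. The bitrate ladder, the 0.8 safety factor, and the function names are assumptions for illustration.

```python
from typing import Sequence, Tuple

# Hypothetical bitrate ladder (kbps) that the transcoding layer would produce.
BITRATES_KBPS = (96, 128, 256, 320)


def estimate_throughput_kbps(samples: Sequence[Tuple[int, float]]) -> float:
    """Estimate throughput from recent (bytes_downloaded, seconds_taken) pairs."""
    total_bytes = sum(size for size, _ in samples)
    total_seconds = sum(seconds for _, seconds in samples)
    if total_seconds <= 0:
        return 0.0
    return total_bytes * 8 / 1000 / total_seconds


def choose_bitrate(throughput_kbps: float,
                   ladder: Sequence[int] = BITRATES_KBPS,
                   safety_factor: float = 0.8) -> int:
    """Pick the highest bitrate that fits within a fraction of the throughput."""
    budget = throughput_kbps * safety_factor
    affordable = [rate for rate in sorted(ladder) if rate <= budget]
    # Fall back to the lowest rung when even that exceeds the budget.
    return affordable[-1] if affordable else min(ladder)
```

The client would re-run `choose_bitrate` after each segment download, so quality steps up or down as the measured throughput changes. A real player would also account for buffer level and round-trip time, which this sketch leaves out.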
Now coming to the playlist side. We would have a playlist service that writes playlists directly into PostgreSQL as metadata: the name, references to the songs it contains, and related information. The playlist service is responsible for creating, updating, and deleting playlists.

Now, to scale this system on the search side, I would introduce an Elasticsearch cluster. The cluster is kept in sync with our database using CDC workers: whenever song data or playlist information changes, the change is captured and applied to Elasticsearch. Our search service handles the search routes for playlists and songs. The one thing to call out explicitly is that this is eventually consistent, which is fine for search because we want low latency and high availability over strictly consistent results. If a song is uploaded and becomes visible only after 3 to 5 seconds, that is not an issue for this system. Similarly, if a playlist has been created but is not searchable for 5 to 10 seconds, that is not a big deal for now.

Another bottleneck here is that a single PostgreSQL instance cannot handle 1 billion songs along with multiple playlists and everything else.
To handle this, the Postgres primary needs to be split, that is, sharded, so that the writes can be distributed. For every Postgres primary I am going to add two Postgres replicas for backup, and reads can be served from the replicas. This way we can scale our reads as well. I am also going to introduce a Redis cache cluster, which helps serve song metadata and playlist metadata for frequently requested information.

A couple of things need care here. The song data itself generally does not change: once a song is created and uploaded, it stays as it is for its lifetime, so we can use a long TTL for it. But for the metadata, or for any hot song that is being streamed heavily, if the entry expires in the CDN and all the requests then try to pull the same data from our blob store, we get a thundering herd problem. So I am going to use request coalescing, so that only one request goes to the back end while the others wait. We can also use jitter-based TTL expiry, so that every chunk and every song does not expire at the same time but at a randomized interval. That way, fetches do not arrive at the back-end system as bursty traffic.
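Request coalescing and jittered TTLs can be sketched as below. This is an in-process illustration using threads; in the design above the CDN or cache layer would play this role. The class and function names and the 10% jitter fraction are assumptions for the sketch.

```python
import random
import threading
from typing import Any, Callable, Dict, Optional


class _Call:
    """Shared state for one in-flight fetch."""

    def __init__(self) -> None:
        self.done = threading.Event()
        self.result: Any = None
        self.error: Optional[BaseException] = None


class SingleFlight:
    """Let only one caller per key hit the back end; the rest wait for it."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._inflight: Dict[str, _Call] = {}

    def do(self, key: str, fetch: Callable[[], Any]) -> Any:
        with self._lock:
            call = self._inflight.get(key)
            leader = call is None
            if leader:
                call = _Call()
                self._inflight[key] = call
        if leader:
            try:
                call.result = fetch()  # the single request to the back end
            except Exception as exc:
                call.error = exc
            finally:
                with self._lock:
                    del self._inflight[key]
                call.done.set()  # wake every waiting follower
        else:
            call.done.wait()
        if call.error is not None:
            raise call.error
        return call.result


def jittered_ttl(base_seconds: float, jitter_fraction: float = 0.1,
                 rng: Any = random) -> float:
    """Spread expiry times by +/- jitter_fraction around the base TTL."""
    spread = base_seconds * jitter_fraction
    return base_seconds + rng.uniform(-spread, spread)
```

With a base TTL of one hour and 10% jitter, entries cached at the same moment expire anywhere within a twelve-minute window rather than all at once, and any misses that do coincide collapse into one back-end fetch per key.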
Now coming to offline download. The download happens directly from the CDN, so it is the same as streaming the music: we do it in parts, chunk by chunk. The difference is that when it is requested as a download, we can fetch the highest-bitrate version of the song. The song service, or the hot path, is not going to be responsible for downloading the assets. Everything works just like streaming: the client fetches all the chunks, and the client-side application stores all the chunks along with the information from the header/segment file. Instead of temporary storage, it stores them in the phone's or the client's file system, and the song can then be played from there.
For PostgreSQL we would shard the data. Let's say we have a songs metadata table as well as a playlist metadata table. We would shard based on the song IDs: we hash the song ID and place the data on a consistent hashing ring. We can use devisium [sic] or other software like that to handle this. When a particular primary goes down, writes are affected only for the song and playlist IDs that live on that shard, and it does not bring the complete system down. We already have two replicas, which would take over through failover during this time. We might see a small amount of interruption on the write path, because writes are rejected during that window. This is a read-heavy system with very few writes, and I am allowing those failures on the write path in order to maintain consistency. Once a secondary gets promoted, everything flows again.
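The hash-ring placement described above can be sketched like this. It is a minimal consistent hashing ring with virtual nodes, not the behavior of any particular sharding product. The shard names, the 64 virtual nodes per shard, and the use of SHA-256 are assumptions for illustration.

```python
import bisect
import hashlib
from typing import List, Sequence, Tuple


class HashRing:
    """Map song IDs to shards using a consistent hashing ring."""

    def __init__(self, shards: Sequence[str], vnodes: int = 64) -> None:
        ring: List[Tuple[int, str]] = []
        for shard in shards:
            # Each shard owns several points so load spreads more evenly.
            for i in range(vnodes):
                ring.append((self._hash(f"{shard}#{i}"), shard))
        ring.sort()
        self._ring = ring
        self._points = [point for point, _ in ring]

    @staticmethod
    def _hash(key: str) -> int:
        digest = hashlib.sha256(key.encode()).digest()
        return int.from_bytes(digest[:8], "big")

    def shard_for(self, song_id: str) -> str:
        """Return the first shard clockwise from the hash of song_id."""
        idx = bisect.bisect_right(self._points, self._hash(song_id))
        return self._ring[idx % len(self._ring)][1]


ring = HashRing(["shard-a", "shard-b", "shard-c"])
print(ring.shard_for("song-123"))
```

The property that matters for this design is that adding or removing a shard only remaps the keys adjacent to that shard's points on the ring, so a failed or newly added primary does not reshuffle every song ID.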