The Future is Video: Why Understanding YouTube’s Architecture Matters for Devs.
Learn its Design & Build the Next Big Video Platform.
YouTube is one of the most popular video-streaming applications.
Billions of users worldwide use YouTube, which demands high availability, performance, scalability, reliability, resiliency, etc.
YouTube’s storage and bandwidth requirements are very high, so it needs a large fleet of servers and must ensure videos consume as little bandwidth as possible without hurting speed, performance, or clarity.
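To make the estimation step concrete, here is a minimal back-of-envelope sketch. Every input below (uploads per day, average length, size per minute, number of renditions, concurrent viewers, average bitrate) is a made-up placeholder, not a real YouTube figure, and the function names are hypothetical.

```python
def estimate_daily_storage_tb(uploads_per_day, avg_minutes, mb_per_minute, renditions):
    """Storage added per day in TB (decimal units), across all encoded renditions."""
    total_mb = uploads_per_day * avg_minutes * mb_per_minute * renditions
    return total_mb / 1_000_000


def estimate_egress_gbps(concurrent_viewers, avg_bitrate_mbps):
    """Aggregate outbound bandwidth in Gbps needed to serve all concurrent viewers."""
    return concurrent_viewers * avg_bitrate_mbps / 1000


# Hypothetical inputs chosen only to show the arithmetic.
storage_tb = estimate_daily_storage_tb(
    uploads_per_day=1_000_000, avg_minutes=5, mb_per_minute=50, renditions=4
)
egress_gbps = estimate_egress_gbps(concurrent_viewers=2_000_000, avg_bitrate_mbps=3)

print(f"New storage per day: {storage_tb:,.0f} TB")
print(f"Peak egress: {egress_gbps:,.0f} Gbps")
```

Swapping in your own traffic assumptions gives a quick first estimate of how many storage nodes and how much network capacity a design needs.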
When an end user tries to access YouTube, Google Cloud DNS first directs the request to an appropriate location, where it hits the web server (YouTube currently uses Google ESF).
Load balancers distribute the load across servers, and API gateway services route requests to the individual microservices. Separating web servers from application servers and running them as individual microservices helps with scalability and resiliency.
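One of the simplest ways to spread load is round-robin. The sketch below is only an illustration of that idea; the class name and server names are hypothetical, and production load balancers also account for health checks, weights, and connection counts.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hands out backend servers in a fixed rotating order."""

    def __init__(self, servers):
        servers = list(servers)
        if not servers:
            raise ValueError("at least one server is required")
        self._rotation = cycle(servers)

    def next_server(self):
        # Each call returns the next server, wrapping around at the end.
        return next(self._rotation)


balancer = RoundRobinBalancer(["app-1", "app-2", "app-3"])
for _ in range(4):
    print(balancer.next_server())
```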
When a user uploads a video, the upload microservice writes it to temporary storage. Encoder services then pick up the video and convert it to various formats and resolutions so it can be played back on multiple devices.
In parallel, the video metadata is stored in the database. The converted video is stored in object storage such as Google Cloud Storage.
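The upload flow can be sketched with in-memory dictionaries standing in for temporary storage, object storage, and the metadata database. Everything here is an assumption for illustration: the rendition list, the `fake_transcode` helper, and the class itself are hypothetical, and the steps run sequentially rather than in parallel to keep the example short.

```python
from dataclasses import dataclass, field

# Hypothetical output ladder; a real encoder would produce many more variants.
RENDITIONS = ["1080p", "720p", "480p", "360p"]


def fake_transcode(raw_bytes, rendition):
    """Stand-in for a real encoder: tags the payload with the target rendition."""
    return rendition.encode() + b":" + raw_bytes


@dataclass
class UploadPipeline:
    temp_storage: dict = field(default_factory=dict)
    object_storage: dict = field(default_factory=dict)
    metadata_db: dict = field(default_factory=dict)

    def upload(self, video_id, raw_bytes, title):
        # The raw file lands in temporary storage and metadata is recorded.
        self.temp_storage[video_id] = raw_bytes
        self.metadata_db[video_id] = {
            "title": title,
            "status": "processing",
            "renditions": [],
        }

    def encode(self, video_id):
        # The encoder picks the file up from temporary storage and writes
        # one object per rendition to object storage.
        raw = self.temp_storage.pop(video_id)
        for rendition in RENDITIONS:
            self.object_storage[f"{video_id}/{rendition}"] = fake_transcode(raw, rendition)
            self.metadata_db[video_id]["renditions"].append(rendition)
        self.metadata_db[video_id]["status"] = "ready"


pipeline = UploadPipeline()
pipeline.upload("vid-123", b"raw-video-bytes", "My first upload")
pipeline.encode("vid-123")
print(pipeline.metadata_db["vid-123"])
```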
The most-viewed videos are cached in the CDN at edge locations close to the end user, which improves the overall performance and experience of watching videos.
Caching is implemented at various levels of the architecture, including the web and app servers. A distributed cache is critical so that the cache itself does not become a single point of failure.
Thumbnails are generated and stored in Google File System.
ISP colocation sites are used to store some of the most popular videos for faster delivery.
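A single cache node, such as one edge location, has limited space and needs an eviction policy. The sketch below uses least-recently-used (LRU) eviction as an assumed example; the article does not say which policy YouTube uses, and the class name is hypothetical.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used entry."""

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None  # cache miss: the caller would fetch from the origin
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the oldest entry


cache = LRUCache(capacity=2)
cache.put("video-a", "chunk-a")
cache.put("video-b", "chunk-b")
cache.get("video-a")             # video-a becomes most recently used
cache.put("video-c", "chunk-c")  # evicts video-b
print(cache.get("video-b"))
```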
NoSQL and SQL databases are used together, with metadata stored in the NoSQL database. Sharding and replication are adopted to support growth, scalability, and disaster recovery.
Search uses keywords and a ranking algorithm to fetch the appropriate videos.
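As a toy illustration of keyword search with ranking, the sketch below builds an inverted index over video titles and ranks matches by the number of matching keywords, then by view count. This scoring rule is an assumption made for the example, not YouTube’s actual ranking algorithm, and all names are hypothetical.

```python
from collections import defaultdict


class VideoIndex:
    """Tiny inverted index: word -> set of video IDs whose title contains it."""

    def __init__(self):
        self._postings = defaultdict(set)
        self._views = {}

    def add(self, video_id, title, views):
        self._views[video_id] = views
        for word in title.lower().split():
            self._postings[word].add(video_id)

    def search(self, query, limit=10):
        # Score each video by how many query words appear in its title.
        scores = defaultdict(int)
        for word in query.lower().split():
            for video_id in self._postings.get(word, ()):
                scores[video_id] += 1
        # Higher score first, then more views, then ID for a stable order.
        ranked = sorted(scores, key=lambda v: (-scores[v], -self._views[v], v))
        return ranked[:limit]


index = VideoIndex()
index.add("v1", "Python tutorial for beginners", views=100)
index.add("v2", "Advanced Python tutorial", views=500)
index.add("v3", "Cooking pasta", views=900)
print(index.search("python tutorial"))
```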
A robust observability system is in place to ensure the system meets the SLAs and provides service continuity.
According to the CAP theorem, a distributed system cannot guarantee both availability and strong consistency during a network partition, so choosing high availability means strong consistency takes a hit. For YouTube, eventual consistency is fine, since a slight, temporary variation in total likes or comments has no considerable impact.
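One well-known way to get an eventually consistent counter is a grow-only counter (G-Counter): each replica counts its own likes and replicas periodically merge. The sketch below is an illustration of that general technique, not a description of how YouTube stores likes; the class and replica names are hypothetical.

```python
class LikeCounter:
    """Grow-only counter: one slot per replica, merged by taking the max."""

    def __init__(self, replica_id):
        self.replica_id = replica_id
        self.counts = {}

    def like(self):
        # A replica only ever increments its own slot.
        self.counts[self.replica_id] = self.counts.get(self.replica_id, 0) + 1

    def value(self):
        return sum(self.counts.values())

    def merge(self, other):
        # Taking the per-replica maximum makes merging safe to repeat
        # and independent of the order in which replicas sync.
        for replica_id, count in other.counts.items():
            self.counts[replica_id] = max(self.counts.get(replica_id, 0), count)


us, eu = LikeCounter("us"), LikeCounter("eu")
for _ in range(3):
    us.like()
for _ in range(2):
    eu.like()

print(us.value(), eu.value())  # replicas temporarily disagree
us.merge(eu)
eu.merge(us)
print(us.value(), eu.value())  # replicas converge after syncing
```

Between syncs, viewers in different regions may see slightly different totals, which is exactly the temporary variation the design accepts.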
YouTube’s recommendation engine uses machine learning to recommend videos to users based on their profile, likes, content ranking, and other factors.
An adaptive streaming algorithm serves videos to users based on their bandwidth: the quality of each chunk varies with the measured connection speed.
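A throughput-based player can be sketched as follows: before requesting each chunk, pick the highest rendition whose bitrate fits within a safety margin of the measured bandwidth. The bitrate ladder and the 0.8 safety factor below are assumed values for illustration, and real players also consider buffer level and other signals.

```python
# Hypothetical bitrate ladder in kbps, ordered from highest to lowest quality.
LADDER = [
    ("1080p", 5000),
    ("720p", 2500),
    ("480p", 1000),
    ("360p", 600),
    ("240p", 300),
]


def pick_rendition(throughput_kbps, safety=0.8):
    """Return the best rendition that fits within a fraction of the throughput."""
    budget = throughput_kbps * safety
    for name, bitrate_kbps in LADDER:
        if bitrate_kbps <= budget:
            return name
    # Nothing fits: fall back to the lowest quality rather than stalling.
    return LADDER[-1][0]


# Simulated throughput measurements for successive chunks.
for measured in [8000, 3000, 900, 100]:
    print(measured, "kbps ->", pick_rendition(measured))
```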
Vitess is used to overcome the challenges of scaling MySQL across multiple servers. It applies a shard routing algorithm and encapsulates the sharding logic, so app servers don’t have to deal with that overhead.
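The general idea of shard routing can be sketched by hashing a key into a 64-bit space and mapping equal-width ranges of that space to shards. This is a simplified illustration of hash-range routing, not Vitess’s actual implementation or API; the class name and shard names are hypothetical.

```python
import hashlib


class ShardRouter:
    """Maps a key to a shard by hashing it into equal-width ranges."""

    def __init__(self, shard_names):
        self.shard_names = list(shard_names)
        if not self.shard_names:
            raise ValueError("at least one shard is required")

    def shard_for(self, key):
        # Hash the key to a 64-bit integer so keys spread evenly.
        digest = hashlib.sha256(str(key).encode()).digest()
        hashed = int.from_bytes(digest[:8], "big")
        # Split the 64-bit space into one contiguous range per shard.
        width = 2**64 // len(self.shard_names)
        index = min(hashed // width, len(self.shard_names) - 1)
        return self.shard_names[index]


router = ShardRouter(["shard-0", "shard-1", "shard-2", "shard-3"])
for user_id in [42, 1001, 987654]:
    print(user_id, "->", router.shard_for(user_id))
```

The application only calls `shard_for`; which server holds the row stays hidden behind the router, which is the kind of encapsulation described above.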