Deep Neural Networks for YouTube Recommendations
Part 1
YouTube represents one of the largest-scale and most sophisticated industrial recommendation systems in existence. In this paper, we describe the system at a high level and focus on the dramatic performance improvements brought by deep learning. The paper is split according to the classic two-stage information retrieval dichotomy: first, we detail a deep candidate generation model and then describe a separate deep ranking model. We also provide practical lessons and insights derived from designing, iterating, and maintaining a massive recommendation system with enormous user-facing impact.
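The two-stage split can be illustrated with a minimal sketch. It is not the production system: the function names, the dot-product candidate scorer, the toy random corpus, and the `freshness` feature are all assumptions made for illustration. The point it shows is only the shape of the funnel, in which a cheap scorer narrows the full corpus to a short candidate list and a more expensive scorer then orders that list.

```python
import random


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def generate_candidates(user_vec, corpus, n):
    """Stage 1 (assumed form): score every video cheaply, keep the top n ids."""
    ordered = sorted(corpus, key=lambda vid: dot(user_vec, corpus[vid]), reverse=True)
    return ordered[:n]


def rank(candidates, score_fn, k):
    """Stage 2: apply a richer scorer to the short candidate list only."""
    return sorted(candidates, key=score_fn, reverse=True)[:k]


def recommend(user_vec, corpus, freshness, n=50, k=5):
    """Run both stages; `freshness` is a hypothetical extra ranking feature."""
    candidates = generate_candidates(user_vec, corpus, n)

    def score(vid):
        # Hypothetical ranking score: relevance plus a small freshness bonus.
        return dot(user_vec, corpus[vid]) + 0.1 * freshness[vid]

    return rank(candidates, score, k)


def build_toy_corpus(num_videos=1000, dim=8, seed=0):
    """Random embeddings standing in for learned video vectors."""
    rng = random.Random(seed)
    return {
        f"video_{i}": [rng.uniform(-1, 1) for _ in range(dim)]
        for i in range(num_videos)
    }


if __name__ == "__main__":
    corpus = build_toy_corpus()
    rng = random.Random(42)
    freshness = {vid: rng.random() for vid in corpus}
    user_vec = [rng.uniform(-1, 1) for _ in range(8)]
    print(recommend(user_vec, corpus, freshness))
```

In this sketch the ranking scorer runs on only `n` videos rather than the whole corpus, which is what makes it affordable to use more features in the second stage.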
YouTube is the world’s largest platform for creating, sharing, and discovering video content. YouTube recommendations are responsible for helping more than a billion users discover personalized content from an ever-growing corpus of videos. In this paper we will focus on the immense impact deep learning has recently had on the YouTube video recommendations system.
Recommending YouTube videos is extremely challenging from three major perspectives:
Scale: Many existing recommendation algorithms proven to work well on small problems fail to operate at our scale. Highly specialized distributed learning algorithms and efficient serving systems are essential for handling YouTube’s massive user base and corpus.
Freshness: YouTube has a very dynamic corpus, with many hours of video uploaded per second. The recommendation system should be responsive enough to model newly uploaded content as well as the latest actions taken by the user. Balancing new content with well-established videos can be understood from an exploration/exploitation perspective.
Noise: Historical user behavior on YouTube is inherently difficult to predict due to sparsity and a variety of unobservable external factors. We rarely obtain the ground truth of user satisfaction and instead model noisy implicit feedback signals. Furthermore, metadata associated with the content is poorly structured without a well-defined ontology. Our algorithms need to be robust to these particular characteristics of our training data.