
#ai, #enthusiasts, #how-it-works

How does Netflix know what you want to watch before you do?

Length: 8 min

Published: April 29, 2025

Did you know that Netflix employs a large team of researchers, and that up to 80% of what you watch on Netflix is driven by its recommendation system? Have you ever wondered how that system works?

The recommendations you see are the result of powerful recommendation models. Originally, each section, such as "Continue Watching" or "Don't miss next time", had its own model that pulled data from the same sources as the others, but each model was trained separately. Maintaining and improving all these individual models was becoming increasingly difficult and expensive.

This year, Netflix has started moving towards a unified, comprehensive system: a single powerful foundation model that understands user behavior and preferences and can share that knowledge across all of its recommendation systems.

From many models, one supermodel

Originally, Netflix had a bunch of smaller models, each trained separately. For example, one remembered what you liked among action movies, another recommended shows that were currently popular. But the models didn't talk to each other, which caused problems whenever they needed to be updated or upgraded.

Netflix's new approach is inspired by how large language models (LLMs) work. Instead of building lots of small models, it now builds one big one that understands your viewing habits as a whole. This model can then help other systems by sharing what it has learned, either directly or through reusable embeddings.

Tokenization: turning viewing habits into tokens

Netflix is a professional stalker. It watches your every interaction: what you watch, for how long, what you skip, even which device and language you use. But raw data alone is not enough. Netflix therefore converts these interactions into tokens, units of behavior such as "watched Stranger Things for 40 minutes on a phone tonight".

The model is fed these tokens to learn how users behave over time. This is where the next challenge comes in: users do a lot of things, so Netflix has to decide how much detail to retain while still keeping the data fast to process.
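Netflix has not published its exact event schema, but the idea of turning interactions into tokens can be sketched in a few lines of Python. Everything below (the field names, the duration buckets, the token format) is our own illustrative assumption, not Netflix's actual pipeline:

```python
from dataclasses import dataclass

@dataclass
class Interaction:
    # Hypothetical fields - Netflix's real event schema is not public.
    title_id: str          # e.g. "stranger_things"
    device: str            # e.g. "phone", "tv"
    minutes_watched: float
    action: str            # e.g. "play", "skip"

def duration_bucket(minutes: float) -> str:
    """Coarsen watch time so the token vocabulary stays small."""
    if minutes < 5:
        return "glance"
    if minutes < 30:
        return "partial"
    return "full_session"

def tokenize(event: Interaction) -> str:
    """Collapse one raw interaction into a single behavioral token."""
    return f"{event.action}:{event.title_id}:{event.device}:{duration_bucket(event.minutes_watched)}"

# A user's recent history becomes a sequence of tokens the model can read,
# much like a sentence of words for an LLM.
history = [
    Interaction("stranger_things", "phone", 40, "play"),
    Interaction("new_thriller", "tv", 2, "skip"),
]
print([tokenize(e) for e in history])
# ['play:stranger_things:phone:full_session', 'skip:new_thriller:tv:glance']
```

The bucketing step is exactly the trade-off mentioned above: coarser buckets mean less detail, but a smaller vocabulary that is faster to process.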

The model learns like a person, not just a machine

As mentioned above, Netflix took inspiration from LLMs, which predict the next word or token. Netflix, however, wants to predict the next action a user might take. There are many kinds of actions, so they carry different weights: finishing a full movie, for example, says more than watching a three-minute trailer. The model learns to recognize what matters, which lets it better recommend shows you might like.
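The exact loss function Netflix uses is not public, but the idea of weighting interactions by how strong a signal they carry can be sketched like this. The action names and weight values are made up for illustration:

```python
import numpy as np

# Hypothetical importance weights: finishing a movie should teach the model
# more than a quick trailer view. The values are illustrative assumptions.
ACTION_WEIGHTS = {
    "finished_movie": 1.0,
    "watched_half": 0.6,
    "watched_trailer": 0.2,
}

def weighted_next_action_loss(predicted_probs, true_actions):
    """Cross-entropy over next-action predictions, scaled by how much each
    kind of interaction reflects genuine interest."""
    losses = []
    for probs, action in zip(predicted_probs, true_actions):
        cross_entropy = -np.log(probs[action] + 1e-9)          # standard next-token-style term
        losses.append(ACTION_WEIGHTS[action] * cross_entropy)  # down-weight weak signals
    return float(np.mean(losses))

# Toy example: the model's predicted probabilities for the next action of two users.
predictions = [
    {"finished_movie": 0.7, "watched_half": 0.2, "watched_trailer": 0.1},
    {"finished_movie": 0.1, "watched_half": 0.2, "watched_trailer": 0.7},
]
print(weighted_next_action_loss(predictions, ["finished_movie", "watched_trailer"]))
```

A misjudged trailer view costs the model far less than a misjudged full watch, so over time it focuses on the signals that matter.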

Solving the "new show" problem

When a new movie or series comes out and no one has seen it yet, how can Netflix start recommending it?

Netflix tries to address this in two ways:

  1. Incremental training - new titles are assigned initial embeddings based on similar existing titles in the catalog, and their ranking is then gradually refined by real user interactions (a rough sketch of this idea follows below).
  2. Metadata - even though no one has seen the show yet, the model knows its genre, language, and overall tone, and can use that information to judge where it fits best.

That way, brand-new shows can appear in your recommendations on day one. From then on, they are ranked according to how users actually interact with them.
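As a rough illustration of the first idea, a new title's embedding can start out as an average of titles that share its metadata and then be nudged by real interactions as they arrive. The numbers and the update rule below are toy assumptions, not Netflix's actual method:

```python
import numpy as np

# Toy 4-dimensional embeddings of existing titles (real ones are much larger).
catalog = {
    "dark":            np.array([0.9, 0.1, 0.8, 0.2]),
    "stranger_things": np.array([0.8, 0.2, 0.9, 0.1]),
    "bridgerton":      np.array([0.1, 0.9, 0.2, 0.8]),
}

def cold_start_embedding(similar_titles):
    """Give a brand-new title a starting point by averaging the embeddings of
    catalog titles that share its metadata (genre, language, tone)."""
    return np.mean([catalog[t] for t in similar_titles], axis=0)

def incremental_update(embedding, gradient, lr=0.05):
    """Once real viewing data arrives, nudge the embedding step by step
    instead of retraining everything from scratch."""
    return embedding - lr * gradient

# A new sci-fi thriller starts near other sci-fi thrillers on day one...
new_show = cold_start_embedding(["dark", "stranger_things"])
print(new_show)  # roughly [0.85, 0.15, 0.85, 0.15]

# ...and drifts toward its own place as users actually watch (or skip) it.
new_show = incremental_update(new_show, gradient=np.array([0.0, -0.1, 0.05, 0.0]))
```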

Embeddings: the secret ingredient

Embeddings are like digital fingerprints of each show, user, or genre. They capture subtle patterns of behavior and preference. These vectors are then shared with other Netflix tools, for example to find similar shows, predict what you'll watch next, or personalize your homepage.
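One common way to use such fingerprints, and an assumption on our part about how "find similar shows" might work in practice, is cosine similarity between embedding vectors:

```python
import numpy as np

def cosine_similarity(a, b):
    """How closely two embedding 'fingerprints' point in the same direction."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

show_a = np.array([0.9, 0.1, 0.8])  # toy embedding of a sci-fi thriller
show_b = np.array([0.8, 0.2, 0.9])  # another sci-fi thriller
show_c = np.array([0.1, 0.9, 0.2])  # a period romance

print(cosine_similarity(show_a, show_b))  # high -> good "More like this" candidate
print(cosine_similarity(show_a, show_c))  # low  -> probably not shown together
```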

But there's a catch: the embeddings change every time the model is retrained. Netflix therefore applies mathematical transformations that map old embeddings onto new ones, so the vectors stay as stable as possible and other systems can keep working with them.
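Netflix's exact transformations aren't public, but a standard technique for aligning two embedding spaces is the orthogonal Procrustes problem: find the rotation that maps the retrained embeddings back onto the old ones as closely as possible. Here is a minimal NumPy sketch, assuming we have old and new vectors for the same set of titles:

```python
import numpy as np

def stabilizing_rotation(old_emb, new_emb):
    """Orthogonal Procrustes: find the rotation R that maps new-model embeddings
    back into the old embedding space, so downstream systems keep working."""
    # old_emb, new_emb: (n_titles, dim) matrices for the same titles.
    u, _, vt = np.linalg.svd(new_emb.T @ old_emb)
    return u @ vt

# Toy example: pretend the retrained model produced vectors that are a rotated
# (and slightly noisy) version of the old ones.
rng = np.random.default_rng(0)
old = rng.normal(size=(100, 8))
true_rotation = np.linalg.qr(rng.normal(size=(8, 8)))[0]
new = old @ true_rotation.T + 0.01 * rng.normal(size=(100, 8))

R = stabilizing_rotation(old, new)
aligned = new @ R
print(np.abs(aligned - old).max())  # small -> the vectors stay stable across retrains
```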

Conclusion

Netflix's goal is that, ideally, you don't have to search for anything at all. It is constantly trying to discover things for you, taking into account preferences formed by how you behave on Netflix, but also by how users with similar histories behave.

Their "foundation model" represents a significant step towards creating a single system instead of many small tools. It is based on data centralization, inspired by the principles of LLMs and the use of embeddings.

The model learns better, adapts faster, and provides better recommendations. Just as large language models have changed how we work with text, this approach can transform how recommender systems work. What does this mean for us? More accurate recommendations and more shows we actually want to watch, without having to search for them.

