People move constantly between platforms and screens – watching, zapping and scrolling throughout the day – but how can we measure the number of individuals who are in front of these different screens?
When we talk about cross-media measurement, there is one concept that gets to the heart of this challenge: deduplication of audiences.
In simple terms, deduplication means identifying how many unique people were reached, rather than simply counting impressions across different platforms. Think of it like counting guests at a party: if some guests visit multiple rooms, you want to know how many unique people were there, not just how many times someone appeared in a room.
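A minimal sketch of this distinction, using made-up exposure records. Each record stands for one impression; the `person_id` values and platform names are hypothetical.

```python
# Hypothetical exposure log: one (person_id, platform) pair per impression.
impressions = [
    ("anna", "linear_tv"),
    ("anna", "youtube"),
    ("ben", "youtube"),
    ("ben", "youtube"),
    ("carla", "linear_tv"),
]

# Counting impressions counts every appearance "in a room".
total_impressions = len(impressions)

# Deduplicated reach counts each person once, however often they appear.
unique_people = {person for person, _platform in impressions}
deduplicated_reach = len(unique_people)

print(total_impressions, deduplicated_reach)  # 5 3
```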
Typically, deduplication is considered at a platform level: how can we calculate overlap between YouTube and linear TV, or between Netflix and Prime Video, or Disney+ and linear TV?
To fully capture today’s complex media landscape, however, deduplication must occur at multiple levels:
- Between platforms
This is the most often discussed: measuring the overlap between platforms such as linear TV, YouTube, Meta, Prime Video, Disney+ or TikTok. Understanding this overlap helps marketers allocate budgets and manage frequency across platforms.
- Within platforms
Many global platforms include multiple user-facing environments or sub-brands. For example, Amazon includes Prime Video and Twitch, while Meta includes Facebook, Instagram and WhatsApp.
A single person may use several of these environments – or only one. Due to privacy safeguards, platforms cannot always identify the same person across these endpoints unless users explicitly opt in.
As a result, deduplicating audiences within a platform is not straightforward, yet it can significantly impact reach estimates.
- Between devices, accounts and people
Consider shared devices and accounts. Some platforms have a direct one-to-one relationship between people and accounts – social media platforms are a good example. Others, like streaming services, often operate in a one-to-many environment, where multiple people share the same account or device.
This is often called co-viewing, but it is simply another form of deduplication: figuring out who saw the content or ad behind a single account or device.
Together, these three layers create a complex measurement problem. Yet many cross-media solutions focus primarily on the first layer, platform-level deduplication, leaving gaps in understanding real audience reach.
Why this presents a challenge
- If I’m seeing the same campaign across Facebook, YouTube and linear TV, how can the advertiser be sure I’m not counted three times?
- If I’m a streaming service, how can I understand how many people are watching beyond the account holder?
- If I’m a broadcaster with multiple linear and digital channels, how can I understand my true audience, rather than simply adding up reach across channels?
Often, these relationships cannot be measured directly. Platforms may instead rely on modelling, but this can only go so far.
How to calculate deduplication
There are broadly two approaches:
Deterministic measurement: This involves measuring through direct observation, typically via panels. Because the same individuals are observed across platforms and devices, we can directly calculate the number of unique individuals. Panel members are continuously measured, meaning their devices, exposure and viewing behaviour can be tracked over time within a single, consistent framework.
Probabilistic modelling: Here, duplication is estimated using statistical models. One of the most widely used is based on conditional independence, often called the Sainsbury model. There are two variations of this: naïve models, where the probability of overlap is set by assumption; and trained models, where real data informs the model.
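As an illustration of the naïve variant, the Sainsbury formula treats exposure on each platform as independent, so the chance that a person is missed by every platform is the product of the individual miss probabilities. The sketch below assumes each reach figure is a fraction of the same target population; the function name and the example figures are hypothetical.

```python
from math import prod

def sainsbury_reach(platform_reaches):
    """Combined reach under the independence assumption.

    Each value is a platform's reach as a fraction (0 to 1) of the same
    population. The probability of being missed by all platforms is the
    product of (1 - reach), so combined reach is one minus that product.
    """
    for r in platform_reaches:
        if not 0.0 <= r <= 1.0:
            raise ValueError("reach must be a fraction between 0 and 1")
    return 1.0 - prod(1.0 - r for r in platform_reaches)

# Hypothetical figures: 40% reach on linear TV, 30% on a video platform.
combined = sainsbury_reach([0.40, 0.30])
implied_overlap = 0.40 + 0.30 - combined  # share counted on both platforms

print(round(combined, 2), round(implied_overlap, 2))  # 0.58 0.12
```

In practice, viewers of one platform may be more or less likely than average to use another, so the true overlap can differ from the 12% implied here. That gap is what a trained model would try to correct using observed data.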
Most cross-media solutions rely heavily on probabilistic modelling, especially for platform-level deduplication. But modelling without strong real-world data has limitations. This is why panels remain essential – providing a ground truth based on real people and real behaviour.
Panels and deduplication
Panels play a critical role in deduplication. They capture real media consumption at individual, household and device levels – measuring what real people watch in their everyday lives – in a privacy-safe way.
This allows us to:
- Directly measure overlap across platforms
- Understand relationships between people, devices, accounts and households
- Attribute media consumption to individuals within the home
- Validate demographic attribution
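The first point, directly measuring overlap, can be sketched as follows. This assumes a small hypothetical panel in which each member carries a weight (the number of people in the population they represent) and a set of platforms on which exposure was observed. The field names, weights and platforms are illustrative and not taken from any real panel.

```python
from dataclasses import dataclass, field

@dataclass
class Panelist:
    person_id: str
    weight: float  # people in the population this panelist represents (assumed)
    platforms: set = field(default_factory=set)  # platforms with observed exposure

panel = [
    Panelist("p1", 1000.0, {"linear_tv", "youtube"}),
    Panelist("p2", 1500.0, {"youtube"}),
    Panelist("p3", 1200.0, {"linear_tv"}),
    Panelist("p4", 800.0, set()),  # not exposed anywhere
]

def weighted_reach(panel, platform):
    """Projected number of people exposed on one platform."""
    return sum(p.weight for p in panel if platform in p.platforms)

def weighted_overlap(panel, a, b):
    """Projected number of people exposed on both platforms."""
    return sum(p.weight for p in panel if a in p.platforms and b in p.platforms)

def deduplicated_reach(panel, platforms):
    """Projected number of people exposed on at least one of the platforms."""
    wanted = set(platforms)
    return sum(p.weight for p in panel if p.platforms & wanted)

tv = weighted_reach(panel, "linear_tv")                       # 2200.0
yt = weighted_reach(panel, "youtube")                         # 2500.0
both = weighted_overlap(panel, "linear_tv", "youtube")        # 1000.0
unique = deduplicated_reach(panel, ["linear_tv", "youtube"])  # 3700.0

# Inclusion-exclusion holds because each person is observed directly.
assert unique == tv + yt - both
```

Because the same individuals are observed on every platform, the overlap is counted rather than estimated; no independence assumption is needed.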
The key challenge, however, is the complexity and fragmentation of today’s media landscape. We all recognise it in our own behaviour – browsing, watching and scrolling across platforms and screens at our own pace.
No panel on its own can capture the full picture. Panels provide the people-based ground truth, while digital data – direct from platforms – adds the scale needed to reflect how people consume media.
These datasets can be securely combined using privacy-enhancing technologies (PETs), improving the accuracy of deduplicated reach and frequency.
Bringing both together is essential to create a complete view of audiences. Advances in data science and AI are making this increasingly possible, helping to better capture real viewing behaviour at scale and deliver more accurate deduplication of audiences across platforms, devices and environments.
Why deduplication matters
For advertisers, media owners and platforms, the impact is simple: greater clarity.
Without a combination of panels, first-party digital data and privacy-enhancing technologies, most solutions rely heavily on modelling for deduplication. Modelling alone does not reflect real-world behaviour and can overstate reach or miscalculate frequency.
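The link between reach and frequency can be illustrated with simple arithmetic, assuming average frequency is defined as impressions divided by reach. All figures below are hypothetical.

```python
def average_frequency(impressions, reach):
    """Average number of exposures per person reached."""
    if reach <= 0:
        raise ValueError("reach must be positive")
    return impressions / reach

impressions = 12_000_000        # hypothetical campaign total
overstated_reach = 6_000_000    # hypothetical: overlap underestimated
deduplicated_reach = 4_000_000  # hypothetical: duplicated people removed

freq_overstated = average_frequency(impressions, overstated_reach)
freq_deduplicated = average_frequency(impressions, deduplicated_reach)

print(freq_overstated, freq_deduplicated)  # 2.0 3.0
```

In this made-up case, the overstated reach hides the fact that each person saw the campaign three times on average rather than twice.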
By addressing duplication between and within platforms, and across shared devices, it becomes possible to obtain a more holistic view of who was truly reached.
In a complex media world, understanding audiences is not just about counting impressions – it’s about understanding people. And that’s where clarity begins.