Most of the debate around AI and information manipulation focuses on deepfakes of real people: a fabricated video of a politician, a cloned voice delivering a statement they never made. These are real threats, and they are already operational, as Operation Overload demonstrated.

But I think the more significant long-term risk is not synthetic versions of real people. It is synthetic people who never existed at all.

The avatar that cannot be reverse-searched

Current fact-checking and OSINT investigation rely heavily on traceability. You find a suspicious article and look for the author. You find a suspicious video and run a reverse image search. You find a suspicious social media account and map its network. These methods work because every real person leaves traces: photographs, employment histories, social connections, writing patterns, metadata. The investigative work is in finding and connecting those traces.
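To make the traceability assumption concrete, here is a toy version of the reverse-image step, assuming the Pillow and imagehash libraries, with a local corpus of known photographs standing in for a web-scale index. This is a sketch of the technique, not any investigator's actual tooling.

    # Toy reverse image matching via perceptual hashing.
    from pathlib import Path

    import imagehash
    from PIL import Image

    def build_index(corpus_dir: str) -> dict:
        """Map perceptual hashes of known photos to their file paths."""
        index = {}
        for path in Path(corpus_dir).glob("*.jpg"):
            index[imagehash.phash(Image.open(path))] = path
        return index

    def find_matches(query_path: str, index: dict, max_distance: int = 8):
        """Return corpus images whose pHash is within max_distance bits."""
        query = imagehash.phash(Image.open(query_path))
        return [p for h, p in index.items() if query - h <= max_distance]

    # A real person's headshot usually matches something, somewhere.
    # A freshly generated synthetic face matches nothing: find_matches
    # returns an empty list, and the investigation dead-ends.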

AI avatars break this model at the root.

A synthetic journalist generated by a current text-to-video model has no original photograph to match against. There is no LinkedIn profile from ten years ago, no university graduation photo, no tagged image from a conference. Reverse image search returns nothing, because the face was generated and has never appeared anywhere before. The author cannot be found because the author does not exist.

This is categorically different from a human operator running a fake account. Fake accounts built on real people's stolen identities can be traced back to the original. Entirely synthetic personas cannot. There is no thread to pull.

Running a coordinated network of fake journalists using real humans is expensive. You need people, time, language skills, and the constant operational risk that one of them will be identified and traced back to the operation. This is why large-scale FIMI (foreign information manipulation and interference) operations have tended to rely on automated content distribution rather than individually crafted personas. The human bottleneck constrains the operation.

A single operator can now generate and manage hundreds of synthetic video personas, each with a distinct face, voice, name, and apparent biography. Each can be deployed to a different niche, region, or platform. Each produces video content that, at casual viewing, is indistinguishable from a real local journalist or regional commentator. The cost per persona has collapsed. This changes what is operationally feasible. The CopyCop operation built 141 confirmed fake .fr domains in the first half of 2025. Imagine each of those domains having a synthetic named journalist posting regular video updates. That is not a future scenario. It is a straightforward combination of capabilities that already exist separately.

Why video is the critical vector

Text-based fake content has weaknesses that video currently does not.

Automated text detection like VIGINUM's copypasta algorithm, while imperfect, has become a standard part of the fact-checking toolkit. Writing style analysis, plagiarism detection, linguistic fingerprinting: these are all reasonably mature. AI-generated text leaves detectable traces, and the research community has built tools to find them.
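As a rough illustration of the copypasta idea (this is not VIGINUM's actual algorithm, just a minimal stand-in for the underlying technique), a near-duplicate detector can shingle posts into word n-grams and compare accounts pairwise:

    # Minimal copypasta-style detection: word 5-gram fingerprints plus
    # pairwise Jaccard similarity. Threshold is an illustrative assumption.
    from itertools import combinations

    def shingles(text: str, n: int = 5) -> set:
        """Set of word n-grams used as the text's fingerprint."""
        words = text.lower().split()
        return {" ".join(words[i:i + n]) for i in range(len(words) - n + 1)}

    def jaccard(a: set, b: set) -> float:
        return len(a & b) / len(a | b) if a | b else 0.0

    def copypasta_pairs(posts: dict, threshold: float = 0.7):
        """Yield pairs of account IDs posting near-identical text."""
        prints = {acct: shingles(text) for acct, text in posts.items()}
        for (a, fa), (b, fb) in combinations(prints.items(), 2):
            if jaccard(fa, fb) >= threshold:
                yield a, b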

Video is different. Platform moderation is still primarily designed around text and static images. Automated video analysis at the scale required to monitor a large influence operation is not routine. And crucially: the signals that make text-based influence operations detectable (domain registration patterns, DNS infrastructure, linked accounts) do not necessarily apply to avatar-based video content, which can be distributed through entirely different channels: YouTube, TikTok, Telegram, and increasingly embedded in the kind of local news site domains that operations like CopyCop already build.

There is also an authority asymmetry. A text article claiming to be from a named journalist gives the reader nothing to evaluate beyond the words themselves. A video of a person speaking to camera, with a name and a plausible face, triggers a very different cognitive response. We are wired to evaluate people, and the avatar is designed to exploit that.

What makes this particularly hard to address is that each component of the threat already exists independently. Synthetic outlets (domains, websites, plausible names) are operational at scale. AI-generated text content is standard. Video avatar technology is commercially available. Social media distribution infrastructure is well understood by the operations that use it.

The convergence of all four, which I think is the real threat, produces something that is orders of magnitude harder to detect than any one component alone.

A synthetic local outlet covering the 2027 French presidential election. A synthetic named journalist who appears in regular video segments. AI-generated articles under that byline. Distributed across social networks through accounts that have been building credibility. Each layer individually detectable with effort. All four together: a synthetic news ecosystem that requires simultaneous investigation of domain infrastructure, video content, text generation, and social graph analysis, coordinated in real time.

Current research and detection capacity is not built for this. We are reasonably good at finding the domains. We are reasonably good at analysing text. We are not good at integrating all of it into a detection workflow that operates at the speed these operations move.
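To make "integrating all of it" concrete, here is one hypothetical shape for a fused detection record. The layer names, weights, and numbers are my own illustrative assumptions, not a description of any existing system:

    # Each layer (domain, text, video, social graph) contributes a
    # confidence score; the operation is flagged on the fused score,
    # not on any single layer crossing its own threshold.
    from dataclasses import dataclass

    @dataclass
    class LayerScores:
        domain_infra: float     # e.g. CT-log and naming-pattern score
        text_generation: float  # e.g. copypasta / generated-text score
        video_synthesis: float  # e.g. rendering-artefact score
        social_graph: float     # e.g. coordinated-amplification score

    WEIGHTS = {"domain_infra": 0.3, "text_generation": 0.2,
               "video_synthesis": 0.3, "social_graph": 0.2}

    def fused_score(s: LayerScores) -> float:
        return sum(WEIGHTS[k] * getattr(s, k) for k in WEIGHTS)

    # Every layer sits below a plausible per-layer alert level of 0.6,
    # but the combination is still worth an analyst's attention.
    ecosystem = LayerScores(0.55, 0.48, 0.50, 0.60)
    print(round(fused_score(ecosystem), 2))  # 0.53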

What detection looks like now and what it needs to become

The domain-level detection I have been working with (tracking Certificate Transparency logs, scoring naming patterns, fingerprinting infrastructure) catches operations at the infrastructure layer. It is upstream of content. That is its value: you find the domain before the operation publishes its first article.
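As a sketch of what the naming-pattern layer looks like: the token lists and weights below are illustrative assumptions, and in practice the candidate domains would stream in from Certificate Transparency logs rather than a hard-coded string.

    # Heuristic naming-pattern scorer for news-looking .fr domains.
    import re

    NEWS_TOKENS = ("info", "actu", "news", "gazette", "journal", "presse")
    REGION_TOKENS = ("paris", "lyon", "marseille", "provence", "alsace")

    def naming_score(domain: str) -> float:
        """Heuristic score in [0, 1]; higher means more worth a look."""
        score = 0.0
        name = domain.lower()
        if name.endswith(".fr"):
            score += 0.3
        if any(t in name for t in NEWS_TOKENS):
            score += 0.4
        if any(t in name for t in REGION_TOKENS):
            score += 0.2
        if re.search(r"\d{2,}", name):  # numeric padding is a weak signal
            score += 0.1
        return min(score, 1.0)

    print(naming_score("actu-provence24.fr"))  # 1.0: flag for review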

But avatar-based operations may have a very different infrastructure profile. A synthetic journalist running a YouTube channel does not need a suspicious .fr domain. They need a Google account, a video generation pipeline, and a distribution strategy. None of that leaves traces in the data sources that domain-level detection monitors.

This means the detection toolkit needs to expand in several directions at once.

  • Behavioural analysis of video content: posting patterns, background consistency, subtle rendering artefacts, audio-visual sync anomalies. These are detectable today, with effort, at small scale. They are not easily detectable at the scale at which an industrialised avatar operation would run. (A minimal sketch of the posting-cadence signal follows this list.)
  • Identity network analysis: mapping the relationships between synthetic personas, looking for shared infrastructure signals even when the content layer is clean. An operation running 200 synthetic journalists probably uses shared tooling, shared hosting, shared posting schedules. Those patterns might be visible if you look at the right level; the second sketch after this list shows the idea.
  • Provenance tracking for video: understanding where video content originates before it spreads, equivalent to what domain registration monitoring does for web infrastructure. This is technically hard and does not currently exist in any robust form.
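A minimal sketch of the posting-cadence signal from the first bullet. The cutoff is an illustrative assumption, not a tested value:

    # Automated pipelines tend to post on suspiciously regular schedules.
    from statistics import mean, stdev

    def cadence_cv(timestamps: list[float]) -> float:
        """Coefficient of variation of the gaps between posts.
        Humans are bursty (high CV); schedulers are metronomic (low CV)."""
        gaps = [b - a for a, b in zip(timestamps, timestamps[1:])]
        if len(gaps) < 2 or mean(gaps) == 0:
            return float("inf")  # not enough data to say anything
        return stdev(gaps) / mean(gaps)

    def looks_scheduled(timestamps: list[float], cutoff: float = 0.1) -> bool:
        return cadence_cv(timestamps) < cutoff

    # Hypothetical channel posting every six hours, within a few seconds.
    posts = [0.0, 21600.5, 43198.0, 64801.2, 86400.9, 108001.4]
    print(looks_scheduled(posts))  # True: a stronger lead than any one frame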
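And a sketch of the identity-network idea from the second bullet, assuming the networkx library; the personas and signals are hypothetical:

    # Link personas that share any infrastructure signal (hosting IP,
    # tooling fingerprint, posting slot) and pull out the clusters.
    from itertools import combinations

    import networkx as nx

    personas = {
        "claire_martin":  {"ip:203.0.113.7",  "tool:render-v2", "slot:06h"},
        "jean_dupont":    {"ip:203.0.113.7",  "tool:render-v2", "slot:18h"},
        "lena_moreau":    {"ip:198.51.100.4", "tool:render-v2", "slot:06h"},
        "unrelated_blog": {"ip:192.0.2.99",   "tool:unknown",   "slot:varied"},
    }

    G = nx.Graph()
    G.add_nodes_from(personas)
    for a, b in combinations(personas, 2):
        if personas[a] & personas[b]:  # any shared signal links two personas
            G.add_edge(a, b)

    # Clusters of nominally independent journalists sharing infrastructure:
    for cluster in nx.connected_components(G):
        if len(cluster) > 1:
            print(sorted(cluster))
            # ['claire_martin', 'jean_dupont', 'lena_moreau']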

Operations do not usually deploy their most sophisticated capabilities first. They build and test. The avatar operations that are currently visible (the manufactured experts, the synthetic commentators already circulating) are likely early iterations. The technology will improve faster than the detection capacity if the research community does not treat this as an urgent problem now. The 2027 French presidential election is close. The infrastructure for the text-based operations targeting it is already being built; the video layer will follow.

The window to develop detection methods before the capability is fully deployed is short. Not years. Months.
This is not a reason to panic. It is a reason to start now.